2015-03-19

Text = Data + Style

We used to consider the Web as a hypertext, a smart and wonderful extension of the writing space. It is now viewed and used rather as a huge connected and distributed database. Search engines tend to become smart query interfaces for direct question-answering, rather than guides to the Web landscape. Writing, reading and browsing the hypertext, which was the main activity on the first Web, is increasingly replaced by quick questions asking for quick answers in the form of data, if possible fitting the screen size of a mobile interface, and better still encapsulated in applications. Is this the slow death of the Web of Text, killed by the Web of Data?
For a data miner, text is just a primitive and cumbersome way to wrap data, from which the precious content has to be painfully extracted, like a gem from dumb bedrock. But if you are a writer, you might see it the other way round: data is just what you are left with when you have stripped the text of its rhythm, its flavor, the writer's eagerness to get in touch with the reader; in one word, its style. Why would one bother about style? +Theodora Karamanlis puts it nicely in her blog Scripta Manum under the title "Writing: Where and How to begin".
You want readers to be able to differentiate you from amongst a group of other writers simply by looking at your style: the “this-is-me-and-this-is-what-I-think” medium of writing.
Writing on the Web is weaving, as we have seen in the previous post, and your style in this space is the specific texture you give to it locally, in both the modern graphical sense and the old sense of a way of weaving. The Web is indeed a unified (hyper)text space where anything can be woven to anything else, but this is achieved through many different local styles or textures. It would be a pity to see this diversity and wealth drowned in the flood of data.
We have learnt recently that Google is working on a new kind of ranking, based on the quality of data (facts, statements, claims) contained in pages. But do search engines, or will they, include style in their ranking algorithms? Can they measure it, and take it into account in search results and personal recommendations, based on your style or the styles you seem to like? Some algorithms are able to identify writing styles the same way others identify people and cats in images, or music performers. If I am to believe I Write Like, which I just tried on some posts of this blog, I am supposed to write like I. Asimov or H.P. Lovecraft. Not sure how I should take that. But such technologies, applied to comparing blogs' styles, could yield interesting results and maybe create new links that would not be discovered otherwise.
The bottom line for our data fanatics here could be that, after all, style is just another data layer. I'm not ready to buy that yet. I prefer the metaphor of style as a texture. Data is so boring.

2015-03-11

... something borrowed, something blue

I already mentioned +Teodora Petkova in a recent post. Reading her blog, you may have, as I did several times, this "exactly ... that!" feeling you get when stumbling on words that look as if they had been stolen from the tip of your tongue or pen. In particular, don't miss this piece, with its lovely bride's rhyme metaphor, to be applied to every text we write in order to weave it into the web of all texts.
Something old, something new, something borrowed, something blue
Something old ... how can one write without using something old, since what is older than the very words and language we use to write? And one should use them with due respect and full knowledge of their long history. Let's look at some of those venerable words. Children of the Northern European languages, web and weaving seem to come from the same ancient root, hence Weaving the Web is a kind of pleonasm. And text comes from the Latin texo, texere, textus, also meaning to weave, and cognate with the ancient Greek τέχνη, the ancestor of all our techniques, technologies and architectures. In the Web technologies the Northern Germanic warp of words has been interwoven with the southern Latin woof, and each new text on the Web is a knot in this amazing tapestry. Our Web of texts is not as bad as I wrote a few years ago, and with its patchy, fuzzy, furry and never-finished look, we love it and want to keep it that way.

Something new ... Text seems to be old-fashioned stuff these days; it's data and multimedia and applications all over the place. Even the Semantic Web has been redubbed Web of Data by the W3C. And what if, after Linked Open Data (2007) and Linked Open Vocabularies (2011), 2015 were to open the year of Linked Open Text?

Something borrowed ... Teodora encapsulates all the above with the concept of intertextuality. And that one I definitely borrow and adopt (just added it to the left menu), as well as the following from another great piece.
As every text starts and ends in and with another text and we are never-ending stories reaching out to find possible continuations…
Something blue ... The blue of links indeed, but to make the Linked Open Text happen and deliver its potential, we certainly need more than a shade of blue. As Jean-Michel Maulpoix writes in his Histoire du bleu:
All this blue is not of the same ink.
In it one can vaguely make out floors and kinds of apartments, with their numbers, their families of varied circumstances, their wallpapers, their photographs, their holidays in the Alps and their terraces over the Atlantic, the ordinary satisfactions and the complications of their lives. The condition of blue is not the same according to the place it occupies in the scale of beings, hues and beliefs. The humblest make do with the lower floors, with their greasy papers and their graffiti: they hardly climb higher than the rooftops bristling with antennas. The happiest sometimes fly in an impeccable azure and cast upon the human cities that fine panoramic gaze which once entertained the gods.
To fly that high, we do indeed need to invent and use new shades of blue to paint the links between our texts, and the words where those links are anchored.

2015-03-02

Could computers invent language?

Artificial intelligence is something about which not a line has been written in these pages, in nearly two hundred posts over more than ten years. But I feel today like I should drop a couple of thoughts about it, after exchanges on Google+ around this post by +Gideon Rosenblatt and that one by +Bill Slawski, not to mention recent fears expressed by more famous people.
There are many definitions of artificial intelligence, and I will not quote or choose any. Likewise, I prefer to leave alone popular issues such as whether computers are able to deal only with data and algorithms, or whether they can produce information or even knowledge, or whether they think and can individually or collectively attain consciousness or even wisdom. All those terms are fuzzy enough to allow anyone to write anything and its contrary on such issues. Let's rather look at some concrete applications.
Pattern recognition is one of the great and most popular achievements of artificial intelligence. Programs are now able, with quite good performance, to transcribe speech into written language, identify music tracks, cluster similar news, identify people and cats in photographs, etc.
Automatic translation is also quite popular. It works reasonably well for simple factual texts, but still has a hard time dealing with context to resolve ambiguity, and understanding puns and implicit references, all things generally associated with intelligent understanding of a text.
Question-answering is also making great progress, based on increasingly rich and complex knowledge graphs and on the translation of natural-language questions into formal queries.
No doubt algorithms will continue to improve in those domains, with many useful applications and some related and important issues regarding privacy and delegation of decision to algorithms.

All the above tasks deal more or less with the ability of computers to process our languages successfully. But, and this is where I have been heading from the start, there is a fundamental capacity of human intelligence which, as far as I know, has not even begun to be mimicked by algorithms. It's the capacity to invent language. It has been widely discussed since Wittgenstein whether a private language is possible or not, but there is no dispute that language has been and still is built collectively, through a process of continuous collective invention. Anyone can invent a new word or a new linguistic form; whether it will be integrated into the language commons depends on many criteria akin to the ones that determine whether a new species expands and survives or disappears. This is the way our languages constantly evolve and adapt to the changing world of our communication and discourse needs. Could computers mimic such a process, take part in it, and even expand it further than humans? Could algorithms produce new and relevant words, smoothly integrated into the existing language, to name concepts not yet discovered or named? In short, are computers able to take part in the continuous invention of language, and not only make smart use of the existing one?
Such a perspective would indeed be fascinating and certainly scary, insofar as machines collectively inventing such language extensions would not necessarily share them with humans, and even if they did, humans would not necessarily be able to understand them.

Whether such an evolution is possible at all, or in a foreseeable future, is a good question. Whether we should hope for it and work to let it happen, or should fear and prevent it, is an even more interesting one. But at the very least, those are questions we can technically specify, which makes them much more valuable for the assessment and definition of artificial intelligence than vague digressions on whether computers can think, have knowledge or can become conscious. We don't even really know what the latter means for humans, our shared language being the closest proxy we have for whatever is going on in our brainware. So let's assess the progress of artificial intelligence by the same criterion we generally use to assess human intelligence: its ability to deal with language, from the plain naming of things to the invention of new concepts.