2015-04-14

Weaving beyond the Web

More on this story of names (including URIs) and text (including the Web), as promised to all those who provided much-appreciated feedback on the previous post. I'm still a bit amazed by the feedback coming from the SEO community, because I really did not have SEO in mind. But I must admit I'm totally naive in this domain, and tend to stick to simple principles: do what you have to do, say what you have to say, make it clear and explicit, and let search engines do their job; quality content will float towards the top. And explicit semantic markup is certainly part of content quality. Very well ... but that was not my point at all. That said, any text is likely to be read and interpreted in many ways, and there is often more in it than its author was aware of. And actually, this is akin to what I am about today: the meaning of a text beyond its original context of production.

Language is an efficient and resilient distributed memory, where names and statements can live as long as they are used. And even if not used any more, they can nevertheless live forever if they are part of some story we keep telling, reading, commenting on and translating, some text we are still able to decipher. We still use, or at least are able to make sense of, texts forged in ancient languages thousands of years ago, even if the things they used to name and speak about do not exist any more. Dead people, buildings and cities returned to the ground centuries ago, obsolete tools and ways of life, forgotten deities, concepts whose usage has faded away: the names of all those we nevertheless keep in the memory of languages - the texts. Some of us still read and make sense of ancient Greek and Latin, or even ancient Egyptian hieroglyphs. The physical support of this memory has changed over time, from oral transmission to bamboo, clay tablets, papyrus, manuscripts and printed books, analog and digital media of all kinds, today the cloud and what else tomorrow. Insofar as such migrations have been possible at all, we trust the resilience of our language.

How do URIs fit into this story? URIs are a very recent kind of name, and RDF triples a new and peculiar way of weaving sentences. The people who forged the first of them are still around, and both were developed for a very specific technical context, the current architecture of the Web. Will they survive and mean something centuries from now? Do and will the billions of triples-statements-sentences we have written since the turn of the century make sense beyond the current context of the Web? Like Euclid's Elements, are they likely to live forever and keep their meaning over the long term?

Let's make a thought experiment to figure it out. We are in 2115. The current Web architecture was superseded in 2070 by some new technological infrastructure we can barely imagine in 2015, no more than our grandmothers in 1915 could imagine the current Web architecture. HTTP is obsolete, and data is exchanged through whatever new protocol. Good old HTTP URIs have not dereferenced to anything for half a century. Do they still name something? Do the triples still make sense? Imagine you have saved all or part of the 2015 RDF content, and you still have software able to read it - just a text reader will do. Can you still make sense of it? Certainly, if you have a significant corpus. If you have the full download of 2015 DBpedia or WorldCat, most of its content should be understandable, provided the natural language has not changed too much. Hopefully this will be the case: in 2015 we read without problem texts written in 1915. And if you have saved a triple store infrastructure and software, you might still be able to query those data in SPARQL in 2115. Triples are triples, either on the Web or outside it.
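To make the point concrete, here is a minimal sketch in Python of what "just a text reader will do" could look like. It assumes the archive was saved in a simple N-Triples-like form, one sentence per line. The example.org URIs, the `author_of` predicate and the sample sentences are hypothetical, made up for illustration, and the pattern only covers URIs and plain string literals, not the full syntax. Nothing here touches the network: the URIs are treated purely as names.

```python
import re

# Hypothetical sample of archived text in N-Triples-like form.
# The example.org names are illustrative, not real 2015 data.
ARCHIVE = """\
<http://example.org/Euclid> <http://example.org/author_of> <http://example.org/Elements> .
<http://example.org/Euclid> <http://www.w3.org/2000/01/rdf-schema#label> "Euclid" .
<http://example.org/Elements> <http://www.w3.org/2000/01/rdf-schema#label> "Elements" .
"""

# Simplified pattern: URI subject, URI predicate, then a URI or a plain literal.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.\s*$')


def parse(text):
    """Yield (subject, predicate, object) tuples from the archived text."""
    for line in text.splitlines():
        m = TRIPLE.match(line.strip())
        if m:
            s, p, o_uri, o_lit = m.groups()
            yield (s, p, o_uri if o_uri is not None else o_lit)


def match(triples, s=None, p=None, o=None):
    """Return the triples fitting a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


triples = list(parse(ARCHIVE))

# "What do we know about Euclid?" asked without dereferencing anything.
about_euclid = match(triples, s="http://example.org/Euclid")
for s, p, o in about_euclid:
    print(s, p, o)
```

A real triple store and SPARQL would of course do much more, but even this toy reader can answer simple questions from the saved text alone.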

What lesson do we bring home from this journey to the future? Like any text, URIs and triples can survive and be meaningful well beyond the current Web infrastructure; they belong to the unfolding history of language and text. Of course, today the Web infrastructure allows easy navigation and querying, and building services on top of them. But when forging URIs and weaving triples, consider that beyond the current Web what you write can live forever if it's worth it. Your text is likely to be translated into formats and languages, and read through supports and infrastructures, that you just can't imagine today. That is worth thinking about before publishing. Text never dies.

2015-04-11

From names to sentences, the Web language story.

The conversation about text and names, and how they are interwoven within the Web architexture, is going on here and there. The more it goes on, the more I feel we need more non-technical narratives and metaphors to help people get what the (Semantic) Web is all about. We have drowned them under technical talks and schemas of layers of architecture and protocols and data structures and ontologies and applications ... and the net result is that too many of them, smart people included, think only experts, engineers and geeks can grok it. So let me try one such - hopefully simple - narrative.

The story of the Web is just the story of language, continued by other means: forging names to call things, and weaving those names into sentences and texts. On the Web, things have those weird names called URIs, but they are names all the same. As we have seen in a previous post, a name is, to begin with, a way to shout and identify people and things in the night. On the Web, to call a thing by its URI-name you will use some interface, a browser, a service, an application, and at this call something will come through the interface. Well, the thing you have called does not actually come itself to you through the network, but you get something which is hopefully a good enough representation of the thing. The deep ontological question of the relationship between the name and what is named has been discussed for ages and will continue forever. The Web does not change that issue and does not solve it; it just provides new use cases and occasions to wonder about it. But this is not my point today.

In the first ages of the Web, calling things was all you could do with those URI-names. You had the language ability of a two-year-old kid. You could say "orange" or "milk" when you were thirsty, and "dog" and "cat" and "car" and "sea" and "plane" when you saw or wanted one, and cry for everything else you could not express or the dumb Web would not understand. With no more sophisticated language constructs, you could nevertheless discover the wealth of the Web, through iterative serendipitous calls. Because the courtesy of the Web is such that when you call for a thing, the answer often comes back with a bunch of other names you can call in turn (a hyperlink does just that, enabling you to call another name with a single click). You would bring back home things whose very existence you had not the faintest idea of a minute before. Remember this jubilation, the magic of your first Web navigation, twenty years ago? Like a kid laughing aloud when discovering the tremendous power of names to call things.

Today, in many (most) of our interactions with the Web, we are no longer aware of using names. We act with our fingertips, barely guessing that under the hood this is transformed into a client calling a server, or something on this server, by some name, and that many calls are made on the network to bring back what our fingers asked for. Only geeks and engineers know that. The youngest generations, who have not known the first ages of the Web and interact only through such interfaces, are simply unaware of this whole affair of names. Did you say URL, Dad? What's that? It sounds so 90's ...

Now when you grow older than two, you go beyond using names just for shouting them in the face of the world; you begin to understand sentences and build them yourself. That's a completely new experience, a new dimension of language unfolding. You link names together, you discover the texture, the power to understand and invent stories and to ask and answer questions. You still use the same names, you are still interested in oranges, cats, dogs and cars, and all the thousands of things which are the children of naming. But you are now able to weave them together using verbs (predicates), qualifiers, quantifiers and logical coordination. You have become a language weaver.

And that's exactly and simply what the Semantic Web is about, and how it extends the previous Web. Just growing and learning to weave sentences, telling stories, asking questions. But using the same URI-names as before. Any URI-name of the good old Web can become a part of the Semantic Web. Just write a sentence, publish a triple using it as subject or object, and here you are.
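As an illustration, here is a minimal sketch in Python of weaving such sentences around an old URI-name. The page and person URIs under example.org are hypothetical. The two predicates come from the Dublin Core and FOAF vocabularies, and the output follows the N-Triples convention of one sentence per line. This is one possible way of writing it down, not the only one.

```python
def uri(name):
    """Write a URI-name the way N-Triples does, between angle brackets."""
    return f"<{name}>"


def literal(text):
    """Write a plain string value, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def sentence(subject, predicate, obj):
    """Weave three terms into one triple: subject, predicate, object, full stop."""
    return f"{subject} {predicate} {obj} ."


# A URI-name from the good old Web (hypothetical page and person).
page = "http://example.org/my-old-page"
me = "http://example.org/me"

# Predicates borrowed from existing vocabularies.
DC_TITLE = "http://purl.org/dc/terms/title"
FOAF_MADE = "http://xmlns.com/foaf/0.1/made"

# The same name used once as subject, once as object.
as_subject = sentence(uri(page), uri(DC_TITLE), literal("My old page"))
as_object = sentence(uri(me), uri(FOAF_MADE), uri(page))

print(as_subject)
print(as_object)
```

The page itself does not change at all; only the sentences written about it are new.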