2015-04-14

Weaving beyond the Web

More on this story of names (including URIs) and text (including the Web), as promised to all those who provided much-appreciated feedback on the previous post. I'm still a bit amazed by the feedback coming from the SEO community, because I really did not have SEO in mind. But I must admit I'm totally naive in this domain, and tend to stick to principles such as: do what you have to do, say what you have to say, make it clear and explicit, and let search engines do their job; quality content will float towards the top. And explicit semantic markup is certainly part of content quality. Very well ... but that was not my point at all. That said, any text is likely to be read and interpreted in many ways, and there is often more in it than its author was aware of. And actually, this is akin to what I am about today: the meaning of a text beyond its original context of production.

Language is an efficient and resilient distributed memory, where names and statements can live as long as they are used. And even if not used any more, they can nevertheless live forever if they are part of some story we keep telling, reading, commenting on and translating, some text we are still able to decipher. We still use, or at least are able to make sense of, texts forged in ancient languages thousands of years ago, even if the things they used to name and speak about do not exist any more. Dead people, buildings and cities returned to the ground centuries ago, obsolete tools and ways of life, forgotten deities, concepts whose usage has faded away: the names of all those we nevertheless keep in the memory of languages, the texts. Some of us still read and make sense of ancient Greek and Latin, or even ancient Egyptian hieroglyphs. The physical support of this memory has changed over time, from oral transmission to bamboo, clay tablets, papyrus, manuscripts and printed books, analog and digital media of all kinds, today the cloud and who knows what tomorrow. Insofar as such migrations were possible at all, we trust the resilience of our language.

How do URIs fit in this story? URIs are a very recent kind of name, and RDF triples a new and peculiar way of weaving sentences. The people who forged the first of them are still around, and both were developed for a very specific technical context, the current architecture of the Web. Will they survive and mean something centuries from now? Do and will the billions of triples-statements-sentences we have written since the turn of the century make sense beyond the current context of the Web? Like Euclid's Elements, are they likely to live forever in long meaning?

Let's make a thought experiment to figure it out. We are in 2115. The current Web architecture was superseded in 2070 by some new technological infrastructure we can barely imagine in 2015, no more than our grandmothers in 1915 could imagine the current Web architecture. HTTP is obsolete, and data is exchanged through whatever new protocol. Good old HTTP URIs have not dereferenced to anything for half a century. Do they still name something? Do the triples still make sense? Imagine you have saved all or part of the 2015 RDF content, and you still have software able to read it; just a text reader will do. Can you still make sense of it? Certainly, if you have a significant corpus. If you have the full download of 2015 DBpedia or WorldCat, most of its content should be understandable if the natural language has not changed too much. Hopefully this should be the case: in 2015 we read texts written in 1915 without difficulty. And if you have saved a triple store infrastructure and software, you might still be able to query those data in SPARQL in 2115. Triples are triples, either on the Web or outside it.
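To make "just a text reader will do" a little more concrete, here is a minimal sketch in Python of reading saved triples with no Web access at all. Everything in it is an assumption made for illustration: the three sample lines use made-up example.org URIs rather than anything from a real DBpedia or WorldCat dump, the names read_triples and match are hypothetical, and the pattern only covers a simplified subset of the N-Triples syntax (no blank nodes, no unescaping of literals). The point is only that URIs are handled as plain names, never dereferenced.

```python
import re

# Made-up sample in N-Triples syntax, standing in for a saved 2015 dump.
SAMPLE = """\
<http://example.org/Euclid> <http://example.org/author_of> <http://example.org/Elements> .
<http://example.org/Euclid> <http://example.org/name> "Euclid" .
<http://example.org/Elements> <http://example.org/name> "Elements"@en .
"""

# Simplified pattern: URI subject, URI predicate, then a URI or a quoted
# literal (any language tag or datatype suffix is matched and ignored).
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"\S*)\s*\.\s*$'
)


def read_triples(text):
    """Parse text line by line into (subject, predicate, object) tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = TRIPLE.match(line)
        if m is None:
            continue  # outside the simplified subset: ignore the line
        s, p, o_uri, o_lit = m.groups()
        triples.append((s, p, o_uri if o_uri is not None else o_lit))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


triples = read_triples(SAMPLE)

# "What did Euclid write?" asked without any network access:
# the URIs are compared as strings, nothing is fetched.
for _, _, work in match(triples,
                        s="http://example.org/Euclid",
                        p="http://example.org/author_of"):
    print(work)
```

A real triple store and SPARQL would of course do far more, but the wildcard matching above is the same basic idea as a triple pattern, and it works on a text file sitting on whatever support has survived.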

What lesson do we bring home from this trip to the future? Like any text, URIs and triples can survive and remain meaningful well beyond the current Web infrastructure; they belong to the unfolding history of language and text. Of course, today the Web infrastructure allows easy navigation and querying, and makes it easy to build services on top of them. But when forging URIs and weaving triples, consider that what you write can live forever beyond the current Web, if it's worth it. Your text is likely to be translated into formats and languages, and read through media and infrastructures, that you just can't imagine today. That is worth thinking about before publishing. Text never dies.