Artificial intelligence and the invention of language

Artificial intelligence is a subject about which not a single line has been written in these pages, over close to two hundred posts and more than ten years. But today I feel like dropping a couple of thoughts about it, after exchanges on Google+ around this post by +Gideon Rosenblatt and that one by +Bill Slawski, not to mention fears recently expressed by more famous people.
There are many definitions of artificial intelligence, and I will not quote or choose any. Likewise, I prefer to leave aside popular questions such as whether computers can deal only with data and algorithms, or can produce information or even knowledge, whether they think, and whether they can individually or collectively accede to consciousness or even wisdom. All those terms are fuzzy enough to let anyone write anything and its contrary on such issues. Let's rather look at some concrete applications.
Pattern recognition is one of the great and most popular achievements of artificial intelligence. Programs can now, with quite good accuracy, transcribe speech into written language, identify music tracks, cluster similar news stories, spot people and cats in photographs, and so on.
Automatic translation is also quite popular, and works not that badly for simple factual texts, but it still has a hard time using context to resolve ambiguity and to catch puns and implicit references, all things generally associated with an intelligent understanding of a text.
Question answering is also making great progress, based on ever richer and more complex knowledge graphs, and on the translation of natural-language questions into formal queries.
No doubt algorithms will continue to improve in those domains, with many useful applications, and with important related issues regarding privacy and the delegation of decisions to algorithms.

All the above tasks deal more or less with the ability of computers to process our languages successfully. But, and this is where I have been heading from the start, there is a fundamental capacity of human intelligence which, as far as I know, has not even begun to be mimicked by algorithms: the capacity to invent language. Whether a private language is possible has been much discussed since Wittgenstein, but there is no question that language has been, and still is, built collectively through a continuous process of collective invention. Anyone can invent a new word or a new linguistic form; whether it gets integrated into the language commons depends on many criteria, akin to the ones which let a new species expand and survive, or disappear. This is how our languages constantly evolve and adapt to the changing world of our communication and discourse needs. Could computers mimic such a process, take part in it, and even push it further than humans? Could algorithms produce new and relevant words, smoothly integrated into the existing language, to name concepts not yet discovered or named? In short, can computers take part in the continuous invention of language, and not only make smart use of the existing one?
Such a perspective would indeed be fascinating, and certainly scary, insofar as machines collectively inventing such language extensions would not necessarily share them with humans; and even if they did, humans would not necessarily be able to understand them.

Whether such an evolution is possible at all, or in a foreseeable future, is a good question. Whether we should hope for it and work to let it happen, or fear it and prevent it, is a yet more interesting one. But at the very least, those questions can be technically specified, which makes them much more valuable for the assessment and definition of artificial intelligence than vague digressions on whether computers can think, have knowledge or become conscious. We don't even really know what the latter means for humans, our shared language being the closest proxy we have for whatever is going on in our brainware. So let's assess the progress of artificial intelligence by the same criteria we generally use to assess human intelligence: its ability to deal with language, from the plain naming of things to the invention of new concepts.


Statements are only statements

A few days ago, in the comments of this post by +Teodora Petkova on Google+, I promised +Aaron Bradley a post explaining why I am uneasy with the reference to things in Tim Berners-Lee's reference document defining (in 2006) Linked Data. The challenge was to make it readable by seven-year-old kids or marketers, but I'm not sure the following meets this requirement.

When Google launched its Knowledge Graph (in 2012) with the tagline things, not strings, it was not much more than the principles of Linked Data as set out in the above-mentioned document six years earlier, but implemented as a Google enclosure of mostly public source data, with neither an API nor even public, reusable URIs. I ranted here about that, and nothing seems to have changed since.
But something important I missed at the time is a subtle drift between TBL's prose and Google's. The former speaks about things and information about those things. The latter also starts out using the term information, but rapidly switches to objects and facts.
[The Knowledge Graph] currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects.
The document uses "thing", "entity" and "object" in various places as apparent broad synonyms, conveying (maybe unwillingly) the (very naive) notion that the Knowledge Graph is a neat projection into data of well-defined "real-world" things-entities-objects and of proven (true) facts about them. This impression is reinforced by the use of expressions such as "Find the right thing". And actually, that's how most people are ready to buy it: "Don't be evil" implies "Don't lie, just facts". In a nutshell, if you want to know (true, proven, quality-checked) facts about things, just ask Google. It used to be just ask Wikipedia, but since the Knowledge Graph taps into Wikipedia, it inherits the trust placed in its source. Similarly naive presentations can be found here and there, uttered by enthusiastic Linked Data supporters. Granted, TBL's discourse avoids any reference to "facts", but it does not close the door, and through this opening a pervasive neo-Platonic view of the world has rushed in: there are things and facts out there, just represent them on the Web using URIs and RDF, et voilà. The DBpedia Knowledge Base description contains typical sentences blurring the ontological status of what is described.
All these [DBpedia] versions together describe 38.3 million things, out of which 23.8 million are localized descriptions of things that also exist in the English version of DBpedia.
It is left to everyone's guess to figure out what "existence in the English version" can mean for a thing. What should such documents say instead of "things" and "facts" to avoid such confusion? Simply what they are: databases of statements using names (URIs) and sentences (RDF triples) which just copy, translate, adapt, in one word re-present on the Web, statements already present in documents and data, in a variety of more or less natural, structured, formal, shared, idiomatic languages. As often stressed here (for five years at least), this representation is just another translation.
And, as for any kind of statement in any language, to figure out whether you can trust them or not, you should be able to track their provenance, and the context and time of their utterance. That's, for example, how Wikidata is intended to work. Look at the image below: nothing like a real-world thing or fact is mentioned, but a statement with its claim and context.
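In code, the difference between a bare "fact" and a contextualized statement can be sketched as follows. This is a minimal illustration, not the actual Wikidata data model: the field names and values are invented for the example (only Q90 and P1082 are real Wikidata identifiers, for Paris and population).

```python
# A statement in the Wikidata spirit: the claim travels with its context
# and provenance, instead of being asserted as a bare "fact".
# Field names and values are illustrative, not the actual Wikidata schema.
statement = {
    "subject": "Q90",                                   # item the claim is about
    "claim": {"property": "P1082", "value": 2240000},   # e.g. a population figure
    "qualifiers": {"point_in_time": "2012"},            # context of the claim
    "references": [{"stated_in": "a census dataset"}],  # provenance to assess trust
}

def is_unsourced(stmt):
    """Without references, a statement is just an unsourced claim."""
    return not stmt.get("references")

print(is_unsourced(statement))  # False: this claim can be traced to its source
```

Whether to trust the claim is then a judgment about its references and context, not a property of the "thing" itself.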
The question of the relationship of names and statements to any real-world referents is a deep one, which philosophers have kept open for ages, and which should certainly remain open. In any case, the Web, Linked Data and the Knowledge Graph do not, will not, and should not, insidiously or even with no evil in mind, pretend to close it. Those technologies just provide incredibly efficient ways to exchange, link, access and share statements, based on Web architecture and a minimalist standard grammar. Which is indeed a great achievement, no less, but no more. At the end of the day, data are only data, statements are only statements.


Common names, proper usage

What follows might be, as previous posts, relevant to the raging debate in and around the W3C Shapes Working Group. If you don't care too much about Latin, Greek, French, German, etymology, translation and languages at large, you can go straight to the last paragraph. But I trust my faithful readers (whoever they are) to follow me through the long preliminary linguistic meanders.

A while ago I pointed at the enclosure of common names as trademarks. Maybe I should have written common nouns. But in French (my native language), there is a single word, nom, to translate both noun and name, all of them cognates of Latin nomen, Greek ὄνομα, and many more avatars of the same Indo-European root. In French grammar you say "nom commun" for "common noun" and "nom propre" for "proper noun", and a native French speaker is likely to translate these into English as "common name" and "proper name", both ambiguous out of context. And my purpose today is indeed to look at what it can mean for names to be common or proper, beyond what it means for grammatical nouns.
Let's look into Latin again, where communis and proprius, as well as their ancient Greek equivalents κοινός and ἴδιος, have roughly the semantic scope they have kept in French and English. Together they split the world into what belongs to the commons and what is proprietary or private. Beyond and before their use in grammar to denote universals and particulars, further meanings have been built upon the good or bad characteristics associated with each term. Typically, "common" is used as a derogatory qualifier for whatever belongs to the vulgum pecus, those common people who do not behave, think or speak properly. The French "propre" goes even further down this path, coming to mean "clean", with disambiguation by position ("c'est ma propre maison" = "it's my own house" vs "sa maison est propre" = "her house is clean"). Such extensions seem characteristic of a language controlled by some aristocracy. It's worth noticing that the English "own" and its German cognate "eigen" do not seem to have suffered similar semantic drifts.
Sticking to the original meaning and forgetting the interpretations of either grammar or aristocracy, common names would be simply names belonging to the commons. Which is true, if you think about it, for just any name. A name with no community (or communality) would be useless, and actually barely a name, just a string with no shared usage and agreed-upon denotation. Under such a definition, even proper nouns are common names. From a grammatical viewpoint, "Roma" is a proper noun, but it's common to all people using it to denote the capital of Italy. To make it short, all names belong to the commons, otherwise they don't name anything at all.
The above analysis does not apply only to natural-language names (aka nouns), but also to all those technical names handled in the internal languages of our information systems, the names used by machines to call each other in the dark (see previous post) and take actions. URIs, addresses, object and class names ... if those were not common names, we would have no open Web, and no open source code with reusable libraries.
But those common names, when used and interpreted by software, behave internally at run time as proper names, in every sense of "proper". Each of them calls a well-defined individual object, method, or other piece of executable code. A URI sent through the HTTP protocol ends up calling, by their internal names, specific pieces of data on one or more servers, each of them running its own, proper, often proprietary code with its idiosyncratic functional semantics.
In other words, if the declarative semantics of a technical name (the description of what it denotes) belongs to the commons, its performative semantics (what it does when called) is proper to the system in which it is used, and to conditions at run time.

How is that relevant to the W3C Shapes debate? What this group is (maybe) seeking, or should seek, is actually a (standard) way to describe proper performative semantics for systems using RDF data. On the DC-Architecture list, +Holger Knublauch was complaining a few days ago:
Yet, there used to be a notion of a Semantic Web, in which people were able to publish ontologies together with shared semantics. On this list and also the WG it seems that this has come out of fashion, and everyone seems "obsessed" with the ability to violate the published semantics.
Violate the published semantics? Well, no, it's just about describing how the common semantics behave properly in my system. But whether that can be achieved through yet another declarative language, or through some interpretation of existing ones, without blurring the RDF landscape a bit more, is another story.


You need names on the Web, it's dark in there.

The Chinese character 名 (name), which we saw in the previous post as the mother of all things, has an interesting origin. It is composed of the characters 夕 (night, symbolized by a crescent moon) and 口 (an open mouth). The clue to this mysterious association is that you need a name either to call someone, or to identify yourself, in the dark of night. In daylight, you don't really need to know the name of your interlocutor to recognize each other and engage in conversation. You don't need the names of things to find and handle them.

Interaction through information systems, and singularly on the Web, is a conversation in the darkest of nights. You can't see your interlocutors, you can't wave or bow at them, you don't see what you are looking for, and the system does not see you. So you need names everywhere. You need names to enter the system, to log in, to send messages. You need to know names to connect to people on the social Web. You need to know a name for what you seek to ask a search engine. One can argue that all of this is rapidly changing, with identification by fingerprint or eyeprint, and connection to stuff or people through icons and various fancy non-textual interfaces. But under the hood, the system will still exchange ids, keys, addresses, all those avatars of names used by machines. If our online experience gets closer and closer to daylight conversation, poor machines will keep shooting names at each other across the dark of the Web for a long time to come.



My conversation with good old 老子 is a never-ending story, and I had to revisit him with the untranslatables paradigm in mind. I discovered long ago the extreme difficulty of translating Chinese characters, singularly in ancient writings, through an excellent introduction I already mentioned here some years ago, "L'Idiot chinois" by Kyril Ryjik. The book had long been out of print, my copy was lost in a former life, and a few years ago, on some obscure blog, I fortunately stumbled on a PDF copy which I had been preciously keeping safe ... but I can now forget about all that. After thirty years of dark ages, L'Idiot chinois has been republished, and the new edition should land on my bookshelves any time soon ...
The infamous and cryptic first chapter of the 道德經 would easily be shortlisted in any contest for the best untranslatables ever. It is an example Ryjik presents because it is both too well known and too much translated, and certainly deeply misunderstood by most Western translators.
Here goes the first part which, even if you don't read Chinese, will strike you by the rhythm and sheer graphical refinement of its 24 characters. Note that the character 名 (míng, "name") is repeated five times, a hint that this story is mainly about names and naming.


Ryjik holds that all but a few Western translations and interpretations project onto 道 a transcendental interpretation which does not make sense in the historical, political and cultural context in which this text was produced. This is still the case for many available translations, in which the Dao has too much of the look and feel of our Western monotheist God. If nothing else, the initial capitals everywhere are suspicious; there is no upper case in Chinese. 道 should certainly be taken in a more mundane sense: the way the world is going, which human beings should try to follow, individually and collectively, in order to live in harmony with the general flow. Only physics, no metaphysics.
With this in mind, Ryjik posits that the negative 非 in the first sentence should certainly be read as a determinant of 常 (constant, unchanging, regular, in one word steady), rather than of the whole group 常道.
In other words, where most translators read 非(常道), not (steady way), one should rather read (非常)道, (not steady) way. This makes the whole sentence read something like: (a) way really way is not a steady way. In other words: if you want to conform your way to the way (of the world at large), you have to adapt and change (as the world does). In the historical context, Ryjik holds that this is a moral and political recommendation not to stick to a rigid application of ancient rules when the situation is ever-changing. But this is a general consideration, just put there to introduce the main point of the story: the role of names.
Reading 名可名,非常名 in the same spirit yields: name really name is not a steady name. Since things in the world's flow are ever-changing, the names you give to things are also bound to change to keep their accuracy. And in this spirit I just changed the title of this blog ...
As for the following two sentences, which seem more mysterious, I've not been fully convinced by any translation so far, not even Ryjik's. I'm pushed towards proposing my own translation by a beautiful edition entitled "La Danse de l'Encre", illustrated by Lassaâd Metoui, a Tunisian calligrapher. Thomas Golsenne writes in the introduction (in French, my translation):
"To read the Tao Te King against the grain, out of context is not only a right granted to the reader, it's a sort of duty  ... Understanding or translating [it] "faithfully" does not make any sense, because there is nothing to be faithful to, nothing but emptiness"
So be it, here goes my own unfaithful version of the two following sentences

無名天地之始  : there is no name at the origin of the universe
有名萬物之母  : having a name is the mother of all things

Which I read as: the world as a whole, 天地 (sky and earth), exists before and beyond any name, and does not need any name to exist; but with names comes the separation into things, this and not-this, one, two, and the ten thousand beings, as said further on in chapter 42: 道生一,一生二,二生三,三生萬物. Dao is father of one, one is father of two, two is father of three, three is father of the multitude of beings.
I'm not sure we need any subject other than 無名 and 有名 in those two sentences, a subject which would implicitly be 道, as most translations have it, like "Without name the Dao is the origin of the Universe", etc. ... here come the Holy Ghost, the Logos and the heavy monotheist capitalization. But the dao has nothing to do with the Holy Ghost. There is no metaphysics in the dao, only physics.
This is actually somewhat akin to the (too noisy) recent thesis of Markus Gabriel, "Warum es die Welt nicht gibt" ("Why the World Does Not Exist"). Things exist insofar as they are named, but the world cannot be named as a separate entity, because there is nothing from which it could be separated.

Amazingly enough, there is no entry for name in the Dictionary of Untranslatables. Not even a small entry in the index. This is certainly food for thought to expand in a future post.


Data Patterns, continued

A follow-up to the previous post, still trying to make sense of this pack of untranslatables: pattern vs schema vs structure vs model, and in particular how to draw the fine line between their descriptive and prescriptive aspects ... without spamming the DC-Architecture list any further with this discussion with +Holger Knublauch, which has somehow gone astray ...
Looking up pattern in the Wiktionary yields a lot of definitions, among others the following ones, broad enough to fit our purpose.
  • A naturally-occurring or random arrangement of shapes, colours etc. which have a regular or decorative effect. 
  • A particular sequence of events, facts etc. which can be understood, used to predict the future, or seen to have a mathematical, geometric, statistical etc. relationship. 
Further on in the same source, I discover that pattern can also be used as a verb (to pattern)
  • To make or design (anything) by, from, or after, something that serves as a pattern; to copy; to model; to imitate.

To discover, recognize, classify and name patterns in the world is a basic activity of our brain, and the very basis of our knowledge. Do those patterns emerge in our brains and get projected onto reality? Or does the world really signify something to us (in the sense of the French faire signe) with those patterns, pointing to some internal logic and maybe meaning? I will stay agnostic here on this deep question, and rather look at an example that will bring us back to the question of patterns in data.
What do we see in this image? Objects of various shapes, sizes and colors, connected by edges which are apparently not oriented. Some would call it a graph. Can you see any pattern? A casual look might miss it, and conclude that those shapes, colors and sizes are rather random: their distribution is not really regular, although there are some vertical and horizontal alignments, groups of objects of the same color, and other groups of the same shape. A mix of order and randomness, as in the real world. Looking more closely, you will notice that connected objects share either a common color, or a common shape, or both (like the two red rectangles). This is what I will call a pattern.
We can now try to describe those objects in RDF data, using three predicates ex:shape, ex:color and ex:connected, and check whether the pattern is general.

:blueMoon1
    ex:shape  "moon" ;
    ex:color  "blue" ;
    ex:connected  :blueTriangle1 .

:blueTriangle1
    ex:shape  "triangle" ;
    ex:color  "blue" ;
    ex:connected  :blueMoon1, :blueEllipse1, :redTriangle1 .


The pattern can be checked over the above data using this query:

  SELECT ?x ?y
  WHERE {
    ?x  ex:shape  ?xShape .
    ?x  ex:color  ?xColor .
    ?y  ex:shape  ?yShape .
    ?y  ex:color  ?yColor .
    ?x  ex:connected  ?y .
    FILTER (?xShape = ?yShape || ?xColor = ?yColor)
  }

This query should yield all objects in the graph. If there is a handful of exceptions out of thousands of objects, I will certainly consider this a general pattern, with some exceptions worth a closer look and further investigation. If the pattern is observed for, say, 60% of nodes, I will consider it a frequent pattern. If the result is less than 10%, I will tend to consider it a random structure rather than a pattern. All this activity is descriptive, with possible predictive purposes: I might have queried only a part of this graph, because it has billions of objects, and assume that the pattern extends to the rest.
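The same descriptive check can be sketched outside of SPARQL. Below is a minimal, self-contained illustration in Python: the node names and attributes are made up to mirror the figure, and the check simply measures how often connected objects share a shape or a color.

```python
# Toy graph mirroring the figure: nodes carry a shape and a color,
# edges are unoriented connections. All names are illustrative.
nodes = {
    "blueMoon1":     {"shape": "moon",      "color": "blue"},
    "blueTriangle1": {"shape": "triangle",  "color": "blue"},
    "blueEllipse1":  {"shape": "ellipse",   "color": "blue"},
    "redTriangle1":  {"shape": "triangle",  "color": "red"},
    "redRectangle1": {"shape": "rectangle", "color": "red"},
    "redRectangle2": {"shape": "rectangle", "color": "red"},
}
edges = [
    ("blueMoon1", "blueTriangle1"),      # common color
    ("blueTriangle1", "blueEllipse1"),   # common color
    ("blueTriangle1", "redTriangle1"),   # common shape
    ("redRectangle1", "redRectangle2"),  # common shape and color
]

def shares_shape_or_color(x, y):
    """The pattern: connected objects share a shape or a color (or both)."""
    return (nodes[x]["shape"] == nodes[y]["shape"]
            or nodes[x]["color"] == nodes[y]["color"])

ratio = sum(shares_shape_or_color(x, y) for x, y in edges) / len(edges)
print(ratio)  # 1.0: the pattern holds on every edge of this toy graph
```

On a large graph, a ratio close to 1 would suggest a general pattern, a middling value a frequent one, and a low value mere randomness: exactly the descriptive gradation discussed above.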

Can I turn this pattern into a prescriptive rule? Sure enough. If I want to create a new object connected to the yellow triangle at the bottom right, it has to be either a triangle (free color), or a yellow whatever (free shape), or both. But ... may I introduce new colors and new shapes, such as a yellow star or a purple triangle? In an open world, this is not forbidden by my pattern. But my closed system can be more restrictive, and limit the shapes and colors to those already known. 

I'm pretty sure that people asked to extend this graph, even after discovering the underlying pattern, will wonder for a while whether or not they are allowed to introduce a yellow star or a purple triangle, because neither star nor purple appears in the current picture. It's likely that the most conformist among us will interpret the open pattern as a closed-world schema, in which objects can only have the shapes and colors already present. Not to mention the size, which has not been discussed, and is not represented in the data. Imaginative people, certainly many children among them, will take the open-world assumption and freely invent new shapes in new colors, maybe joyfully breaking the pattern in many places. Logicians will be stuck wondering which logic to use, and are likely to do nothing but argue about it at length with each other.
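The two readings can be made concrete with a small sketch (names and values are illustrative, following the toy example above): under the open-world reading, a new neighbor of the yellow triangle only needs to share its shape or color; under the closed-world reading, its shape and color must also already occur in the graph.

```python
# Shapes and colors already present in the picture (illustrative sets).
known_shapes = {"moon", "triangle", "ellipse", "rectangle"}
known_colors = {"blue", "red", "yellow"}

# The node we want to connect the newcomer to.
anchor = {"shape": "triangle", "color": "yellow"}

def open_world_ok(new):
    """Open pattern: the newcomer must share a shape or a color with the anchor."""
    return new["shape"] == anchor["shape"] or new["color"] == anchor["color"]

def closed_world_ok(new):
    """Closed schema: same rule, plus only already-known shapes and colors."""
    return (open_world_ok(new)
            and new["shape"] in known_shapes
            and new["color"] in known_colors)

yellow_star = {"shape": "star", "color": "yellow"}
print(open_world_ok(yellow_star), closed_world_ok(yellow_star))  # True False
```

The same observed pattern thus yields two different prescriptive systems, depending on which world assumption you silently adopt.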

What lessons do we bring home from this example?
  • Patterns can be discovered in data, or checked over data. 
  • The same observed pattern can be turned into an open world rule or included in a closed world schema, and there is not generally a single way to do either of those.
  • We should have a way to represent and expose patterns in data, independently of their further use. The current RDF pile of standards has nothing explicitly designed for such representations, but SPARQL would be a good basis.
  • Patterns are not necessarily linked to types or classes of objects. In our example, no rdf:type is either declared in the data or used in the SPARQL query.
For those who read French, see also this post on Mondeca's blog, Leçons de Choses: Le toro bravo et le top model, dated April 2010, showing that those ruminations are not really new.


The case for Data Patterns

The W3C RDF Data Shapes Working Group is having a hard time naming the technology it is chartered to deliver. A proposal by +Holger Knublauch for Linked Data Object Model has triggered a lively discussion, even outside the W3C group forum, on the Dublin Core list, where +Thomas Baker has supported and pushed further my suggestion to use data pattern instead of shape, model or schema in various combinations with linked and object. Since this terminological proposal has made its way over the weekend to the official proposal list, maybe it's time to justify and explain a bit more such a terminological choice, and what I put technically under this notion of pattern.
I must admit I've not gone thoroughly through the Shapes WG's long threads wondering, among other tricky questions, about resources and resource shapes, or whether shapes are classes, and maybe the view I expose below is naive; but the overall impression I get is that all those efforts to ground the work on RDFS or OWL just bring about more confusion on the meaning of already overloaded terms. A parallel discussion started a few days ago from a falsely naive question by +Juan Sequeda on the Semantic Web list about how to explore a SPARQL endpoint. In this exchange with +Pavel Klinov, I take the position that exploring RDF data means looking for patterns, not for a schema.
The terminological distinction is important. The notion of schema, or for that matter the alternative proposal model, is heavily overloaded in the minds of people with a database background, and it is, on the other hand, totally abused in the RDF world. Its use in the very name of RDFS was a big source of confusion, not to mention the more recent http://schema.org, which defines anything but a schema, even in its RDF expression. RDFS vocabularies and OWL ontologies are neither schemas nor models as understood in the closed worlds of databases or XML, namely global structures which precede and control the creation and/or validation of data. Using the term schema in the RDF landscape in fact prevents people from grokking that RDF data by design has no need for a schema. No schema in an RDF dataset is not a bug, it's a feature. And the current raging debates only show that people put so many different meanings on schema when trying to use it about RDF data that you had better forget about using it at all.
Patterns, on the other hand, can be present in data whether or not they have been defined a priori in a global schema or model. They can be observed over a whole dataset or only in parts of the data. They can be used for querying, validation, and even for making inferences. But they are agnostic about the various interpretations implied by such usages; they don't abide a priori by any closed-world or open-world assumption.
Technically speaking, how can a data pattern be expressed? To anyone a bit familiar with SPARQL, it is formally equivalent to the content of a WHERE clause in a SPARQL query. Such content, by the way, is indeed called a graph pattern by the SPARQL specification itself. Let me take a simple example which will hopefully address en passant an issue raised by +Karen Coyle: the fact that people (in the Shapes WG) have a hard time thinking about data without types (classes).

Let P1 be the following pattern (prefixes defined as per Linked Open Vocabularies).
?x   dcterms:creator      ?y .
?y   person:placeOfBirth  ?z .
?z   dbpedia-owl:country  dbpedia:France .
This pattern does not contain any rdf:type declaration, hence it does not seem to be a shape under any of the current definitions proposed by the Shapes WG. It is not attached to, even less defined as, an explicit class. It does not rely on any RDFS or OWL construct.
What is the possible use of such a pattern? A basic level of use would be to declare that it is present, or even frequent, in the dataset (the description of the use of a pattern in a dataset could include a COUNT giving the number of its occurrences), which means that if you use it as a WHERE clause in a SPARQL query over the dataset, the result will not be empty and will represent a significant part of the data.
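What "declaring the pattern is present" amounts to can be sketched with a toy triple store (all names abbreviated and the data invented for the example): counting the matches of P1 is just a join of its three triple patterns on the shared variables.

```python
# Toy triple store; URIs are abbreviated to plain strings for readability,
# and the data is invented for the sake of the example.
triples = {
    ("work1",   "dcterms:creator",     "person1"),
    ("work2",   "dcterms:creator",     "person2"),
    ("person1", "person:placeOfBirth", "Lyon"),
    ("person2", "person:placeOfBirth", "Berlin"),
    ("Lyon",    "dbpedia-owl:country", "dbpedia:France"),
    ("Berlin",  "dbpedia-owl:country", "dbpedia:Germany"),
}

def matches_p1():
    """Join the three triple patterns of P1 on the shared variables ?y and ?z."""
    for x, p, y in triples:
        if p != "dcterms:creator":
            continue
        for y2, p2, z in triples:
            if p2 == "person:placeOfBirth" and y2 == y:
                if (z, "dbpedia-owl:country", "dbpedia:France") in triples:
                    yield x

print(sorted(matches_p1()))  # ['work1']: one occurrence of the pattern
```

A real SPARQL engine does exactly this kind of join (far more efficiently); the point is that "the pattern occurs n times in the data" is a purely descriptive claim, with no schema involved.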
Another level would be to associate P1 by some logical connector to another pattern, for example let P2 be the following one.
?x   dcterms:title  ?title .
FILTER (lang(?title) = "fr")
One can now constrain the dataset by the rule P1 => P2 (supposing here that the variable ?x is defined globally over P1 and P2). Said in natural language: if the creator of some thing was born in France, then this thing has a title in French (which might be a silly assumption in general, but can make sense in my dataset about French works). Note again that there is no assumption on the type or class of ?x or ?y. Of course, one can fetch the predicates in their respective ontologies using their URIs and look up their rdfs:domain to infer some types, but you don't need to do that to make sense of the above constraint. Practically, this constraint would be validated over all or part of the dataset by the following query yielding an empty result.
SELECT ?x
WHERE {
  ?x      dcterms:creator      ?p .
  ?p      person:placeOfBirth  ?place .
  ?place  dbpedia-owl:country  dbpedia:France .
  FILTER NOT EXISTS {
    ?x  dcterms:title  ?title .
    FILTER (lang(?title) = "fr")
  }
}
I'm not sure how P1 => P2 would be interpreted as an open-world subsumption. Supposing you could interpret each of the patterns as some constructed OWL class for the common variable ?x, and write a subsumption axiom between those classes, I'm not sure such an interpretation would be unique. Deriving types from patterns is something natural language and knowledge do all the time, but I'm not sure OWL, for example, handles that kind of induction. There is certainly work on this subject I don't know of, but it's clearly not "basic" OWL.
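The intended reading of the constraint, independently of any OWL interpretation, can be sketched as follows (invented data, flattened to one record per work): validation just means that the set of P1 matches which fail P2 is empty.

```python
# Invented, flattened data: each work records the birth country of its
# creator and the languages of its titles. All names are illustrative.
works = {
    "work1": {"creator_country": "France",  "title_langs": {"fr", "en"}},
    "work2": {"creator_country": "France",  "title_langs": {"en"}},
    "work3": {"creator_country": "Germany", "title_langs": {"de"}},
}

# P1 => P2: anything created by someone born in France (P1)
# must have a title in French (P2). Violations are P1-and-not-P2.
violations = [w for w, d in works.items()
              if d["creator_country"] == "France" and "fr" not in d["title_langs"]]

print(violations)  # ['work2']: an empty list would mean the dataset is valid
```

This is exactly what the SPARQL validation query above computes over the triples: P1 minus P2, with an empty result meaning the constraint holds.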
In conclusion, I am not claiming that patterns and SPARQL cover all the needs and requirements of the Data Shapes charter, but I hope the above shows at least that searching and validating data based on patterns can be achieved independently of RDFS or OWL constructs, and even of any rdf:type declaration.
Follow-up of the conversation on DC-Architecture list.

[EDITED 2015-01-27] After feedback from Holger and further reading of LDOM, it seems that the above P1 => P2 can be expressed as an LDOM Global Constraint encapsulating the SPARQL query, thus:
[] a ldom:GlobalConstraint ;
   ldom:message "Things created by someone born in France must have a title in French" ;
   ldom:level ldom:Warning ;
   ldom:sparql """
     SELECT ?x
     WHERE {
       ?x      dcterms:creator      ?p .
       ?p      person:placeOfBirth  ?place .
       ?place  dbpedia-owl:country  dbpedia:France .
       FILTER NOT EXISTS {
         ?x  dcterms:title  ?title .
         FILTER (lang(?title) = "fr")
       }
     }
   """ .