Making Sense of Ambiguity

Paper presented by David Booth at the Semantic Technology Conference in San Francisco, 25 June 2010. Do read it and try to make sense of it. It is the best analysis of the issue I've read so far.


Coreference using substitution rules

Note: This is mostly copied and adapted from a message I posted last week, in yet another conversation about the identity issue, on the internal mailing list of the W3C Library Linked Data Incubator Group.


What 'mean' means

I've been working for a couple of months now with Gerard de Melo at Lexvo.org. The first objective was to set an example of good Linked Data practice, both social and technical. If you have published a set of URIs, and find out afterwards that another set for the same resources has better quality, and moreover you do not have the bandwidth or resources to maintain your dataset, what should you do? The example at hand was to redirect the work I had been doing at lingvoj.org towards the data at Lexvo.org, which are far more complete and, moreover, integrated in a general approach which I found extremely interesting.
The neat result of this work so far is that URIs for languages at lingvoj.org now redirect seamlessly to the matching lexvo.org URIs; see e.g. http://www.lingvoj.org/lang/fr.


Looking for the stranger next door

Back in 2002 I was involved in building a knowledge model for drug discovery, intended to be used by the knowledge portal of a major pharmaceutical group. I'm not sure it was ever implemented, but the work was great food for thought. When I asked a leading scientist there what his main functional requirements for a knowledge portal were, I was stunned by the obvious simplicity of his answer. In short:
I want the system to stop pushing to me things I already know, such as my own publications, or those of my students and colleagues. What is of interest to me lies just behind this, one click away over the edge of my current knowledge. What I want to be pushed to me by the system should be different enough to question my current knowledge and make it move forward, but close enough to be easily connected to it.
I've met this requirement over and over since, made more or less explicit by all kinds of users. In a nutshell, the interesting knowledge is both close to mine and different. It's the stranger living next door. But I have not yet seen any application meeting this requirement.
Indeed, many applications push stuff based on user profiles, social recommendations, etc. But most of the time what they push to the user is something (or someone, in the case of social network recommendations) possibly unknown, but close and similar. The basic mechanism is Amazon's "if you like this, you should probably like that", or LinkedIn's "meet a friend of your friends". Very often the recommended stuff or person is not that unknown, and when it is, most of the time it just adds a layer to your current knowledge or social cocoon. To find something or someone both new and challenging, the best way to date is still random browsing and serendipity. That's basically how I found out about the PDF 2010 conference, through an excellent report by Marcia Stepanek and Ethan Zuckerman's post about Eli Pariser and Filter Bubbles, both providing excellent background reading for what I'm pushing here.

But how does one spot the stranger next door? Well, she's somehow different. Maybe the emergent social-semantic web tools can help find this out. Imagine an interface where users would pick data and people that together make up a comfort zone representative of their current knowledge and network. First the system would check whether this choice is globally consistent and, if so, search the edge of this comfort zone with any convenient follow-your-nose algorithm, and discover assertions related to, but not consistent with, the user's current view of the world. So instead of like-minded folks and similar readings comforting my knowledge cocoon, I would see popping up on my dashboard "John Bar, whom you might know, has a different view about topic Foo. Do you want to discuss this now?", along with a cool visualization based on the inconsistent triples.
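To make the idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: claims are modelled as (source, subject, predicate, value) tuples, the "edge" is taken to be the people followed by someone in the comfort zone, and two claims count as inconsistent when they give different values for the same subject and predicate. A real system would need proper RDF data and a real consistency check.

```python
# Hypothetical model: a claim is (source, subject, predicate, value).
# "Inconsistent" is simplified to: same subject and predicate, different value.

def strangers_next_door(my_claims, known_people, follows, all_claims):
    """Return claims made just outside the comfort zone that disagree with it."""
    # The user's current view of the world, keyed by (subject, predicate).
    my_view = {(s, p): v for (_, s, p, v) in my_claims}

    # The edge: people followed by someone I know, whom I do not know myself.
    edge = set()
    for person in known_people:
        edge.update(follows.get(person, ()))
    edge -= set(known_people)

    hits = []
    for (source, s, p, v) in all_claims:
        if source in edge and (s, p) in my_view and my_view[(s, p)] != v:
            hits.append((source, s, p, v))
    return sorted(hits)


# Made-up example data.
my_claims = [("me", "Foo", "status", "settled")]
known_people = {"me", "alice"}
follows = {"alice": {"me", "john_bar"}}
all_claims = [
    ("alice", "Foo", "status", "settled"),   # known and agreeing: cocoon
    ("john_bar", "Foo", "status", "open"),   # one click away and disagreeing
    ("zed", "Foo", "status", "open"),        # disagreeing but not connected
]

found = strangers_next_door(my_claims, known_people, follows, all_claims)
```

With this data only John Bar's claim is returned: Alice is already in the cocoon, and Zed is not reachable from it.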

Now that would be an exciting way to explore the social-semantic edges, avoiding the pitfalls of both cocooning and random serendipity. Did you say killer app?


Coreference as a Service

Yahoo! has released Concordance as part of the GeoPlanet API. The aim of this service is to provide equivalences between identifiers for geo entities defined in different namespaces. Quoting Gary Gale on the Yahoo! Geo Technologies Blog:

We’ve collected these identifiers and namespaces as a single object, a concordance, which empowers a user to reference each source. You can think of it as a mapping of an identifier in a namespace to its equivalent in another namespace. But it’s not a joining of information; we’re only enumerating the identifiers, not the back-end data or attributes that they describe.
The last sentence is important. The service is agnostic about the data models or ontologies used by the various identifier publishers. This ontological emptiness is what makes the service useful.
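As a rough illustration of what "only enumerating the identifiers" can look like, here is a small Python sketch. It is not the GeoPlanet API: the class, the method names, and the namespaces and identifiers are all made up. The point is that the structure stores nothing but (namespace, identifier) pairs grouped per entity, with no attributes or data model attached.

```python
class Concordance:
    """Hypothetical sketch: groups of equivalent identifiers, nothing else."""

    def __init__(self):
        self._groups = []     # one dict {namespace: identifier} per entity
        self._group_of = {}   # (namespace, identifier) -> index in _groups

    def add(self, identifiers):
        """Register one entity as a dict of namespace -> identifier."""
        group = dict(identifiers)
        index = len(self._groups)
        self._groups.append(group)
        for key in group.items():
            self._group_of[key] = index

    def translate(self, namespace, identifier, target_namespace):
        """Return the equivalent identifier in another namespace, or None."""
        index = self._group_of.get((namespace, identifier))
        if index is None:
            return None
        return self._groups[index].get(target_namespace)


# Made-up namespaces and identifiers.
concordance = Concordance()
concordance.add({"ns_a": "a-1", "ns_b": "b-77"})
equivalent = concordance.translate("ns_a", "a-1", "ns_b")
```

Because nothing is stored about the entities themselves, publishers with quite different ontologies can all be mapped through the same structure.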

Another striking example is provided by Ellerdale, reconciling Wikipedia or Freebase topics with Twitter hashtags to build amazing dynamic pages.

Let's guess that many more of the same will emerge in the months to come.


societas hominum et societas rerum

Danny Ayers has recently posted a "call to arms" to try and speed up the adoption of semantic web technologies. And of course he has triggered the usual bunch of complaints: tools are too technical, stuff is presented by geeks for geeks, data are boring, we need better user interfaces, etc. Among many smart but technical proposals, which basically add to the general complexity issue they are supposed to solve, I will pick this very simple one by Karl Dubost.
ACTION : Tell a story to people
I've thought about it, and here comes the best story I've come to imagine, although I'm neither a good story teller, nor good at building user interfaces. I'm just good at metaphors.
The Web is a social technology. What have been the killer Web applications so far? E-mail, blogs, Facebook, Twitter... all social stuff, whichever version of the Web you call it. People understand what social entities and social links are about. So, let's tell them the story of the societas rerum (society of things), interconnected the same way as, and interconnected with, the societas hominum (society of people). Individuals connected by (meaningful) links. Yes, data are boring. Instead of the technical linked data cloud, let's show a living Web of people and things. What's in there, what it's all about: people and organisations (FOAF), places (Geonames), books (DBLP), products and services (GoodRelations), events, etc. In a nutshell, the story of the Semantic Web is the story of the Social Web extended to things. And it's already there, in many ways, even if not (yet) implemented in the RDF technology stack. Look at every web resource you reach, and ask: is this resource intended to represent and describe one definite thing? Does it have a focus (see previous post)? Is it socially linked to other similar resources? If the answer is yes, then this resource participates in the societas rerum. If moreover it's linked to resources representing people, it's also participating in the societas hominum.

About the title, some will ask: why Latin? Simple answer: I've been through seven years of Latin classes in high school, so I have to use it somehow and show it off a little. More complex answer: open a Latin dictionary, figure out the original scope of "societas" and "res", and see whether it's properly translated by "society of things". In fact I found out after forging this title that those concepts (in Latin) seem to have been introduced by Antonio Gramsci. Orthodox Marxists will forgive me for using them out of their original context, a bit of which is copied below. More to be found here.
One must conceive of man as a series of active relationships (a process) in which individuality, though perhaps the most important, is not, however, the only element to be taken into account. . . . The humanity which is reflected in each individuality is composed of various elements: 1. the individual; 2. other men; 3. the natural world. . . . Each one of us changes himself . . . to the extent that he changes . . . the complex relations of which he is the hub. . . . If one's own individuality is the ensemble of these relations, to create one's own personality means to acquire consciousness of them, and to modify one's own personality means to modify the ensemble of these relations. But these relations, as we have said, are not simple. Some are necessary, others are voluntary. . . . It will be said that what each individual can change is very little, considering his strength. This is true up to a point. But when the individual can associate himself with all the other individuals who want the same changes, and if the changes wanted are rational, the individual can be multiplied an impressive number of times, and can obtain a change which is far more radical than at first sight seemed possible. . . . Up to now the significance attributed to these supra-individual organisms [that the individual is related to] (both the societas hominum and the societas rerum) has been mechanistic and determinist; hence the reaction against it. It is necessary to elaborate a doctrine in which these relations are seen as active and in movement, establishing quite clearly that the source of this activity is the consciousness of the individual man who knows, wishes, admires, creates . . . and conceives of himself not as isolated but rich in the possibilities offered to him by other men and by the society of things of which he cannot help having a certain knowledge.


FOAF focus

I rather like the new property Dan Brickley has introduced as the next addition to FOAF. For what it does, see Dan's explanations posted yesterday on the LOD forum. The rationale is clearly set out:
Because conceptualisations of things as SKOS concepts are distinct from the things themselves. If this weren't the case, we couldn't have diverse treatment of common people/places/artifacts in multiple SKOS thesauri.
I'll let you enjoy the rest of the post, and will simply add that it indeed addresses, in the specific context of interoperability between SKOS and FOAF, the very issue we've been speaking about here for years. Things are distinct from their conceptualisations, and we need a way for various representations to focus on the same thing they are about. There is still a step further, though. The thing itself is present in the information system only through representations. The URI for the thing itself is the best proxy we can get for it in the system, but let's assume with Dan and the FOAF'ers that it's somehow closer to the thing itself than concepts in thesauri, and therefore allows the latter to focus on the former. And certainly I'm delighted with the metaphor of the focus, which is yet another avatar of convergence, like the spokes of the wheel I try to keep rolling here. In French, both the spokes of a wheel and converging rays of light are called rayons.
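To picture how several conceptualisations can focus on the same thing, here is a minimal Python sketch. It assumes the new property is the one in the FOAF namespace, http://xmlns.com/foaf/0.1/focus. All the example.org URIs are invented, and triples are plain tuples rather than a real RDF store.

```python
FOAF_FOCUS = "http://xmlns.com/foaf/0.1/focus"

# Invented data: concepts from two thesauri, each pointing at the thing it is about.
triples = [
    ("http://example.org/thesaurusA/paris", FOAF_FOCUS, "http://example.org/thing/Paris"),
    ("http://example.org/thesaurusB/c123", FOAF_FOCUS, "http://example.org/thing/Paris"),
    ("http://example.org/thesaurusB/c456", FOAF_FOCUS, "http://example.org/thing/Lyon"),
]


def concepts_by_focus(triples):
    """Group concept URIs by the URI of the thing they focus on."""
    grouped = {}
    for subject, predicate, obj in triples:
        if predicate == FOAF_FOCUS:
            grouped.setdefault(obj, []).append(subject)
    return grouped


rays = concepts_by_focus(triples)
```

Here the two concepts about Paris stay distinct, each in its own thesaurus, yet both converge on the same URI, like rayons on a hub.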