2007-01-02

Every subject is a blank node

Two discussion threads have moved me a step closer to a general theory of blank nodes. One, which I have already mentioned, is about languages. The other was started here by John Black on the Semantic Web list, and concerns the representation of concepts with contextual semantics, such as "I", "You", "Here", etc. The solution I propose here today is to use a blank node to represent the context.

Where do I go from here? In RDF, URIs are good at defining unambiguous property values, in other words objects, including types. But very often, and maybe most of the time, the individual subject (in both senses: the subject of an RDF triple and the topic maps subject of conversation) is best represented as a blank node bearing all kinds of identified properties, none of which confers absolute identity. This leaves it to applications to define identification rules, in other words to decide which property or boolean combination of properties they want to treat as identifying. Under one set of application rules two subjects can be considered the same, whereas under another they are considered different. This is actually how it works in real life and in natural language, where many subjects are inherently ambiguous. To deal with this ambiguity, no absolute identifying property should be asserted for such subjects.
This may provide a way out of the debate on URI ambiguity. URIs should not be ambiguous, so they should be used for unambiguous subjects. But as long as we deal with real-world subjects that are inherently ambiguous, such as persons, places, contexts, languages ..., they should not be assigned identifying URIs.
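
To make the idea of application-defined identification rules a little more concrete, here is a minimal sketch in plain Python (no RDF library). Everything in it is an assumption made for illustration: the BlankNode class, the same_subject function and the example.org property URIs are hypothetical, and a rule is simplified to a conjunction of properties rather than an arbitrary boolean combination.

```python
from itertools import count

_labels = count()


class BlankNode:
    """A subject without a URI: a local, meaningless label plus a property bag."""

    def __init__(self, props):
        self.label = f"_:b{next(_labels)}"  # local label only, carries no identity
        self.props = dict(props)            # property URI -> value


def same_subject(a, b, identifying):
    """Return True if a and b agree on every property the application
    has chosen to treat as identifying (a simple conjunction)."""
    if not identifying:
        return False  # an empty rule identifies nothing
    return all(
        p in a.props and p in b.props and a.props[p] == b.props[p]
        for p in identifying
    )


# Hypothetical property URIs, for illustration only.
NAME = "http://example.org/name"
BIRTHPLACE = "http://example.org/birthPlace"
MBOX = "http://example.org/mbox"

p1 = BlankNode({
    NAME: "John Smith",
    BIRTHPLACE: "http://example.org/place/London",
    MBOX: "mailto:john@example.org",
})
p2 = BlankNode({
    NAME: "John Smith",
    BIRTHPLACE: "http://example.org/place/London",
    MBOX: "mailto:jsmith@example.net",
})

# Two applications, two identification rules.
rule_a = (NAME, BIRTHPLACE)  # application A: name and birthplace identify
rule_b = (MBOX,)             # application B: only the mailbox identifies

same_for_a = same_subject(p1, p2, rule_a)
same_for_b = same_subject(p1, p2, rule_b)
```

Under rule_a the two blank nodes count as the same subject, while under rule_b they count as different ones. Neither node asserts an absolute identity of its own; the decision belongs entirely to the rule each application chooses.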