2009-07-30

URI species

The debate about the proliferation of URIs representing the same thing keeps rolling on various Semantic Web lists, returning again and again to the same questions. How does one discover existing URIs for a thing, if any? Is it good or bad practice to mint a new URI for a thing which already has one? How does one link URIs identifying the same thing? Many smart and conflicting answers have been given, largely depending on their authors' view of Web architecture and on the main use of URIs they have in mind. Web pragmatists and linked data evangelists tend to consider that proliferation of URIs is not necessarily a good idea, but something we are bound to live with, whereas experts in knowledge representation tend to consider that it should be avoided by all means. Trust, persistence, quality of resource descriptions, and the use and abuse of owl:sameAs have been discussed over and over, with no obvious technical answer.

Since life provides the oldest, proven, efficient ways to store, maintain, replicate and use information, I've tried to figure out whether we could learn from biology. Interestingly enough, biologists are no more able to reach a consensus about what a species is than Semantic Web gurus are to agree on what is behind a URI. The two issues are in fact very similar: both deal with the persistence of information over time. With the disclaimer that I am not a biologist, let me assume here the definition of a species as the set of individual expressions of some common genetic pool. Protection and persistence of the species' genetic pool is the main occupation of any form of life. Strategies to achieve this goal present an awesome diversity, but in this variety one can find some constants. Among those is the basic fact that individuals are bound to a short life span, so the protection of the genetic pool is best achieved by assuming the mortality of individuals and ensuring duplication and replication of the information in as many individuals as possible. Not by defending a single representation behind firewalls.

How does that apply to the Semantic Web? A URI, along with the resource description it provides, can be seen as an individual expression of a species concept. Like any human artefact, any living individual, or any physical manifestation in this world, this expression is bound to be transient. The agent who created and maintains the URI is bound to disappear, among other things. It will be less costly, as life tells us, to have copies of the information in as many expressions as possible all over the place than to protect this specific one. Consider a URI not as the unique representation of a thing, but as an individual expression of a species.