2012-04-12

schema.org Tree and the Vocabulary Forest

My feelings about schema.org have been so mixed from the very beginning that I read a lot about it and wrote nothing, until a while ago at the end of the first of the LOV stories. Following Dan Brickley's introduction of schema.org to the BBC about two weeks ago, Phil Archer is wondering if we should follow Danbri, which means moving towards enthusiastic adoption of schema.org as the first general vocabulary of the Semantic Web, covering 50% of our needs, and treating more specific vocabularies as a long tail for more specialized use. I am more inclined to follow Kerstin Forsberg's cautious approach, and started to wonder aloud about it on Google+, attracting a quick answer from Danbri. It was a sensible and pragmatic one (as usual), but it did not completely convince me to follow him (at least not in the sense proposed by Phil Archer).
That does not mean I will ignore schema.org, or was ever willing to. On the contrary. Six months ago I went to the Dublin Core Conference 2011 with, in my bag, a draft mapping proposal from Dublin Core terms to schema.org elements, an initiative from which a formal Task Group eventually emerged. This group really kicked off last month, and a first point of discussion has been to clarify which semantics of schema.org URIs such a mapping should rely on. There again, I found Danbri's answer quite puzzling: the semantics of schema.org types and properties are not those of RDFS classes and properties, which is how they are declared, for example, in Michael Hausenblas's transformation at schema.rdfs.org. Hence Danbri recommends (and Michael apparently agrees) that any mapping should rely on the semantics declared by the source, schema.org, not on the formal RDFS interpretation.
Great ... except that the said semantics are actually declared nowhere formally at schema.org, and Danbri's interpretation as expressed on the DC list does not seem amenable to any trustworthy species of formal logic that I know (I would be happy to be proven wrong on this). Moreover, so far the URIs of schema.org properties are not dereferenceable; only those of classes are. This is a major issue. If we believe Danbri (and we have good reasons to do so), schema.org semantics do not share the same underlying model as the rest of the linked data cloud vocabularies, which are expressed using RDF, RDFS and OWL. This potentially hinders the semantic integration of data declared using schema.org with linked data using other vocabularies.
Nevertheless, mappings from other vocabularies to schema.org are developing, and they are written using RDFS or OWL. Since schema.org itself does not provide any formal definition for its URIs, other parties have started to provide formal descriptions (see the distinction in a recent httpRange-14 discussion), presumably trusting the first one published at schema.rdfs.org.
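To make the issue concrete, here is a minimal sketch of what such a mapping statement looks like once it is written with RDFS. The two property pairs are hypothetical examples chosen for illustration, not the content of the Task Group's draft, and the helper names are invented for this sketch. Note that each statement only makes sense if the schema.org URI on the right is read as an RDF property, which is precisely the interpretation under discussion.

```python
# Illustrative only: the property pairs below are hypothetical examples,
# not the mapping drafted by the Dublin Core / schema.org Task Group.
DCTERMS = "http://purl.org/dc/terms/"
SCHEMA = "http://schema.org/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# (Dublin Core term, schema.org property) pairs, chosen for illustration.
CANDIDATE_PAIRS = [
    ("title", "name"),
    ("description", "description"),
]


def mapping_triples(pairs):
    """Return each pair as an rdfs:subPropertyOf triple of full URIs."""
    return [(DCTERMS + dc, RDFS + "subPropertyOf", SCHEMA + s) for dc, s in pairs]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(mapping_triples(CANDIDATE_PAIRS)))
```

An RDFS reasoner given the first triple would infer a schema.org name for every resource that has a Dublin Core title. Whether that inference is legitimate depends on which semantics one accepts for the schema.org URI.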
Hopefully the discussion now starting around Dublin Core to schema.org mappings will help to clarify those issues, and eventually I'll make up my mind about following Danbri or not.