2012-02-10

Is your linked data vocabulary 5-star?

You have created and published linked data on the Web following the best practices, and you think they are now 5-star data. But are the vocabularies, that is, the RDFS or OWL classes and properties your dataset is using, also 5-star? At the above link one can read what now looks like wishful thinking from Tim Berners-Lee:
The third rule, that one should serve information on the web against a URI, is, in 2006, well followed for most ontologies … One can, in general, look up the properties and classes one finds in data, and get information from the RDF, RDFS, and OWL ontologies including the relationships between the terms in the ontology. 
Unfortunately, even in 2012, this is still far from true in general.
Looking at the vocabulary namespaces listed by LODStats, one discovers that about half of those URIs either are not dereferenceable or do not lead to any formal specification of the vocabulary. And when such a specification is accessible, it too often lacks the metadata and element definitions needed to make it fully reliable and re-usable. That's too bad, because a good vocabulary is critical to the interoperability of your data. Data using ill-defined, obscure, undocumented, unpublished or badly published vocabularies are de facto locked into semantic silos and miss the main point of semantic added value. Such data share the common RDF data model, but no one can reliably make sense of their semantics. That's why you should make sure that your 5-star data are backed by findable, understandable and re-usable 5-star vocabularies.
So, how do you make a 5-star vocabulary? Here are some suggestions, inspired by the 5-star linked data scale. Comments are welcome here or, preferably, on the matching Google+ post.

★ Publish your vocabulary on the Web at a stable URI
★★ Provide human-readable documentation and basic metadata such as creator, publisher, date of creation, last modification, version number
★★★ Provide labels and descriptions, if possible in several languages, to make your vocabulary usable in multiple linguistic scopes
★★★★ Make your vocabulary available via its namespace URI, both as a formal file and human-readable documentation, using content negotiation
★★★★★ Link to other vocabularies by re-using elements rather than re-inventing
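
The content negotiation criterion is the easiest one to test by hand. Here is a minimal sketch in Python, using only the standard library, of how one might check that a namespace URI serves both a formal file and human-readable documentation. The function names, the lists of accepted media types and the example.org URI are assumptions made for illustration, not part of any standard tool.

```python
from urllib.request import Request, urlopen

# Media types treated here as "formal file" and "human-readable documentation".
# These sets are an assumption; adjust them to the formats you actually serve.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "application/n-triples"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def build_request(namespace_uri, accept):
    """Build a GET request asking the server for a given representation."""
    return Request(namespace_uri, headers={"Accept": accept})


def classify(content_type):
    """Classify a Content-Type header value as 'rdf', 'html' or 'other'."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in RDF_TYPES:
        return "rdf"
    if media_type in HTML_TYPES:
        return "html"
    return "other"


def fetch_content_type(namespace_uri, accept, timeout=10):
    """Dereference the URI (following redirects) and return its Content-Type."""
    with urlopen(build_request(namespace_uri, accept), timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


def negotiates(namespace_uri):
    """True if the URI yields RDF to an RDF client and HTML to a browser."""
    as_rdf = classify(fetch_content_type(namespace_uri, "text/turtle, application/rdf+xml"))
    as_html = classify(fetch_content_type(namespace_uri, "text/html"))
    return as_rdf == "rdf" and as_html == "html"
```

Calling negotiates("http://example.org/vocab/") on a hypothetical namespace would raise an error if the URI cannot be dereferenced at all, and return False if the server ignores the Accept header. It says nothing about the quality of what is returned, so it only covers the mechanical part of the fourth star.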

[2012-03-05] On the Google+ follow-up, Adrian Pohl suggests adding "Publish your vocabulary using an open license" to those requirements. Indeed, this should be added to the 1-star requirement. If the vocabulary is not free to use, the other criteria are pointless.