2013-02-22

A small step for Google ...

... a giant leap for the Semantic Web?

For French readers, I explained two days ago on Mondeca's blog, Leçons de Choses, why I think the Google Knowledge Graph, despite all its value, is so far neither part of the Semantic Web nor even properly interfaced with it.
It's really frustrating, because a very small step for Google could indeed represent a giant leap both for the visibility of Linked Data at large and for the general understanding of the role of URIs in identifying things on the Web. This small step would be as simple for Google as doing the following.

- Acknowledge that the Web of things was not invented by Google, but is the basis of the Semantic Web stack of languages, vocabularies and linked data (which did not invent the concept either, but provided the technical infrastructure enabling it).

- Acknowledge that the natural way to identify things on the Web is through dereferenceable URIs, and that billions of such URIs already exist, provided by scores of trustworthy sources.

- Acknowledge that the things stored by Google and displayed in the Knowledge Graph have, most of the time, already been identified by at least one of those URIs (on DBpedia, Freebase, VIAF, Geonames ...).

- Hence, logically, include such URIs in the Knowledge Graph descriptions, as every other regular linked dataset does.

This should be very simple and really cheap, since Google certainly already holds such information, if only through Freebase. It does not even need to coin its own URIs for the Knowledge Graph things or provide an API for them (that would be a complementary move, but one that certainly raises more technical, legal and business issues).
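
As a rough illustration of the last point, here is a minimal Turtle sketch of what such a description could look like. It is only an assumption about one possible shape, not anything Google publishes: the use of schema.org terms and of owl:sameAs as the linking property are illustrative choices. The blank node stands for the Knowledge Graph item, so no new URI is coined, and the only identifier used is the DBpedia URI for Victor Hugo.

```turtle
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix schema: <http://schema.org/> .

# Hypothetical description of a Knowledge Graph item.
# The blank node ([]) means no new URI is coined for the thing itself;
# the item is tied to an existing, dereferenceable URI instead.
[] a schema:Person ;
    schema:name "Victor Hugo" ;
    owl:sameAs <http://dbpedia.org/resource/Victor_Hugo> .
```

Any other trusted URI for the same thing could be added as a further owl:sameAs value in the same way.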

And as a search engine, Google could (and should) indeed go a step further, by ranking the URIs of things as it has done for the URLs of pages, images, videos, places, books, recipes ... As long as we have to go to a specific Semantic Web engine like Swoogle or Sindice to search for the URI of some-thing, the Semantic Web will not be a natural part of the Web. Getting URIs of things as part of a regular search on the major search engine would be a significant milestone.

Added (2013-02-25): follow-up to a discussion with +Luca Matteis on Google+
Google results are currently URLs of pages in which the name of the thing I'm searching for is present, plus a proxy for the thing in the form of the Knowledge Graph item. Among results at https://www.google.com/search?q=Victor+Hugo I would like to find a little box with URIs identifying the thing-Victor-Hugo-himself, such as the following.
And looking more closely at this, even adding DBpedia to the query https://www.google.com/search?q=Victor+Hugo+DBpedia does not really improve the results, but actually shows that the URI explicitly declared by DBpedia as representing the thing I'm looking for, http://dbpedia.org/resource/Victor_Hugo, is plainly ignored. 

2013-02-08

Ontology Loose Coupling Annotation

I've slowly changed my mind since last year about schema.org semantics. The RDFa version of the data model, even if it was still "experimental", has clarified that the schema.org notions of "domain" and "range" differ from their RDFS homonyms. I was pleased to see, later on, the proposal to rename them "domainIncludes" and "rangeIncludes" respectively to avoid further ambiguity, even if this proposal did not address all my questions.
And actually I now look forward to seeing those properties explicitly published and usable for linked vocabularies, because they offer a new way to link classes to properties, an alternative both to the hard semantic constraints of rdfs:domain and rdfs:range (often abused for lack of such an alternative) and to local constraints using OWL restrictions, which are more difficult to understand and use. The semantics of domainIncludes and rangeIncludes are indeed a better fit for the needs of loose semantic coupling in the linked data universe. They make it possible to suggest, without enforcing any logical constraint, the classes that are expected or intended for the subjects and objects of a given predicate. They offer good guidelines to linked data publishers and consumers using the predicate. They are also a nice workaround that avoids the cumbersome construction of complex domains using owl:unionOf.
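
To make the comparison concrete, here is a small Turtle sketch of the two declaration styles side by side. Everything in the ex: namespace is hypothetical and used only for illustration, including ex:domainIncludes, which stands in for the property discussed above wherever it ends up being published.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/vocab#> .   # placeholder namespace

# Hard constraint: a complex domain built with owl:unionOf.
# Every subject of ex:strictTopic is entailed to be a Book or a Film.
ex:strictTopic a rdf:Property ;
    rdfs:domain [ a owl:Class ;
                  owl:unionOf ( ex:Book ex:Film ) ] .

# Loose coupling: the expected classes are simply listed.
# Nothing is entailed about the subjects of ex:looseTopic.
ex:looseTopic a rdf:Property ;
    ex:domainIncludes ex:Book , ex:Film .
```

The second form is shorter, needs no anonymous class, and can be extended by anyone adding one more triple.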
Since the publication of those properties in the schema.org namespace does not seem to be a top priority, I decided to take a step forward and define them in a namespace of mine, and to use them in the latest version of the lingvoj.org ontology. This small vocabulary is (so far) called Ontology Loose Coupling Annotation. I'm not sure this is the most appealing title; maybe Loose Ontology Coupling (LoCo) would fly better.
Anyway ... OLCA, as it stands, defines, among other properties, olca:domainIncludes and olca:rangeIncludes as instances of owl:AnnotationProperty, so that they can be included in OWL ontologies without interfering with the core semantics.
I hope it will provide a way for lightly constrained popular vocabularies such as Dublin Core, SKOS and FOAF to play nicely together, through loose coupling declarations such as the following example.

dcterms:subject a rdf:Property ;
    olca:domainIncludes   foaf:Document ;
    olca:rangeIncludes    skos:Concept .

This proposal seems to address the open range issue for DC terms. One can expect the subject (yes, I know, the vocabularies clash here) of dcterms:subject to be a foaf:Document, and it's OK, and the object to be a skos:Concept, and it's OK, but neither of those is enforced. The same property can link other classes as well, and it's OK.
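
Here is a hedged sketch of what this means for actual data. The resources and classes in the ex: namespace are invented for the example; only the dcterms:, foaf: and skos: terms are real.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:      <http://example.org/data#> .   # placeholder namespace

# The expected case: a document about a concept.
ex:report42 a foaf:Document ;
    dcterms:subject ex:semanticWeb .
ex:semanticWeb a skos:Concept .

# An unexpected but perfectly acceptable case: a painting about a place.
ex:painting7 a ex:Painting ;
    dcterms:subject ex:paris .
ex:paris a ex:Place .

# With the loose annotations, nothing more is entailed about ex:painting7
# or ex:paris. Had the property been declared with
#   rdfs:domain foaf:Document ; rdfs:range skos:Concept
# an RDFS reasoner would infer that ex:painting7 is a foaf:Document
# and that ex:paris is a skos:Concept.
```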

2013-02-04

From 'Long Data' to 'Long Meaning'

I was attracted, and actually misled, this morning by the title of this article published last week in Wired. I put comments both on the original article and in a Google+ post. But the question certainly deserves more than those quick reactions. Long Data is definitely climbing the hype curve, and that's a good thing as long as the meaning of the concept does not get blurred by the buzz, a common pitfall accurately pointed out by several comments on the said Wired article.
The Long Now Foundation coined the concept quite a while ago, and two recent Long Now Blog entries are indeed about Long Data. But Long Data is not only about gathering data from the past in consistent series despite all difficulties, but also (to paraphrase the tagline of the next Dublin Core Conference) about linking to the future, which means having the data of today still available 10,000 years from now.