Meme tracker

Before coining the term "hubject", I made sure it was brand new, at least on the Web. All I got ten days ago with this query was some background noise due mainly to misspellings of "subject". Today you get a few hits, and judging by the very positive feedback I have received so far, the term should spread. Before the public launch, I had pushed the term to Jack, and he sent me this page by Jon Udell on the O'Reilly Network, pointing to the final comment.
One of the advantages of coining a word is that you can track the progress of its associated meme. Last fall, in collaboration with readers of my blog, I settled on the word screencast. A couple of months ago it drew 200 Google hits, today the number is 60,000. Screencasting may never have the mainstream appeal of podcasting, a word coined not long before that now draws 8 million Google hits. But the meme is spreading and I can't wait to see where it goes next.
I can't wait either to see where hubjects go next, so I have set up this post as a permanent meme tracker. So far, though, I can easily track the spread of the meme myself. The latest mention to date was freshly posted by Danny Ayers as Stuff of the Day, just after a quick mail "intercourse" triggered by Jack. Before that, I had a fruitful exchange with Patrick Durusau and Steve Newcomb, from which it appears that hubjects could be considered as no more and no less than possible technical implementations of the "subject proxies" defined by the TMRM. Steve is even considering introducing hubjects in his Versavant implementation. I also got very positive feedback from Phil Tetlow, coordinator of the W3C SWBPD Software Engineering Task Force, already mentioned in these pages. More to come ...


The Wheel and the Hub

Jack was asking for graphics. This is the best I could find to illustrate the metaphor in this latest version of my thoughts about hubjects. I like this image both for its sheer graphical quality, and for the fact that only the hub and spokes are visible. The wheel itself you can only guess.

Note : The Wheel and the Hub is now published under the Mondeca namespace, including logo and copyright. The new URL is http://www.mondeca.com/content/download/455/3434/file/hubjects.pdf


Introducing Hubjects

Hub + Subject = Hubject.

This paper is a rough first cut of an introduction
inspired by Chapter 11 of the Tao-te-King
Thirty spokes share the wheel's hub;
It is the center hole that makes it useful.
More to come, including examples and graphics.


httpRange-14 issue "Resolved"

The W3C Technical Architecture Group announced on Saturday that the issue httpRange-14 was "resolved" (sic). Very weird resolution indeed, which links the kind of resource an http URI identifies ("information resource" or "any resource") to the type of answer to a GET request (2xx, 303 or 4xx). On the TAG list and on his blog Jan Algermissen wonders about the impact of such a decision, listing a few examples of application, and concluding with a good question indeed.
Question: Who is going to mint and maintain all the URIs to talk about dogs?
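
To make the resolution concrete, here is a minimal sketch of the mapping it draws between the status of a GET response and the kind of resource the URI identifies. The function name and the returned labels are my own shorthand for illustration, not wording taken from the TAG.

```python
def resource_kind(status: int) -> str:
    """Map the HTTP status of a GET on an http URI to what the URI may identify.

    Hypothetical helper: only the three cases named by the resolution are
    covered; anything else is reported as outside its scope.
    """
    if 200 <= status < 300:
        # 2xx: the URI identifies an "information resource"
        return "information resource"
    if status == 303:
        # 303 See Other: the URI could identify any resource (a dog, say)
        return "any resource"
    if 400 <= status < 500:
        # 4xx: nothing can be concluded about the resource
        return "unknown"
    return "not covered by the resolution"


if __name__ == "__main__":
    for code in (200, 303, 404, 500):
        print(code, "->", resource_kind(code))
```

Under this reading, anyone who wants a URI for a dog rather than for a page about a dog has to arrange for a 303 redirect, which is where Jan's question about minting and maintenance bites.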


Ontology Definition Metamodel

Certainly a major step towards semantic interoperability of Topic Maps, RDF, OWL, UML, Common Logic ... Really worth downloading and reading at length.


Blank Nodes continued

Steve Newcomb took the time to make an excellent comment on my previous post. Actually, this comment deserved a full post, and I remind my old friend that he has a permanent invitation to appear here as a contributor, and I would be very honored if he could join. That said, Steve's viewpoint is much closer to mine than he seems to think.
SRN : I insist that subjects do have identity, but only within contexts -- within universes of discourse.
I did not write something very different when I wrote that only representations can have identity. Maybe I should have put it slightly differently, and I'm sure Steve will agree with this other way to put it : Whatever the subject, it has neither absolute identity, nor absolute definition, nor absolute property of any kind, that would be valid in any context. Identity, and all properties bound to this identity, is always conferred through a representation, itself defined inside the context of some representation scheme, and making sense only in the framework of this scheme.
SRN: But I don't see how it's meaningful to say that a proxy is not a proxy for something in particular. By its very nature, a proxy is always a proxy for something in particular.
Well, here I think I disagree with Steve, if "something" is to be understood as "some thing". My view on this has always been that, for all practical purposes, it's the first act of "proxyfication" which brings the subject into existence, as a subject of conversation. But this debate is not really important, and we can proceed from here to the notion of subjects as blank nodes, with or without agreement on the separate existence of subjects. The original point of the debate set by Alistair and Dan was to know how to express in an efficient and meaningful way the fact that two or more representations in different schemes are somehow proxies for the same subject. My point was that this could be captured by something quite similar to an RDF blank node, let's call it a Subject Blank Node, bearing no absolute identity, and whose only properties would be : represented this way here, and that way there.

The way I see it, a Subject Blank Node would have no logical property per se. It would not be part of any representation scheme, but would provide a hub between various schemes. Such a hub would allow applications able to make sense of several representation schemes, each with its specific structure and logical rules, to aggregate information from different schemes, such as a Topic in a TM application, a class in an OWL ontology, a concept in a SKOS scheme, a category in dmoz, a page in Wikipedia, a term in WordNet, or a picture by Van Gogh, or a Nocturne by Chopin.
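
A minimal sketch of such a hub, as an application might hold it in memory, could look like the following. The class names, the `counterparts` method and the example.org identifiers are all assumptions made for illustration; nothing here is part of any Topic Maps or RDF specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Representation:
    scheme: str      # e.g. "OWL", "SKOS", "Wikipedia"
    identifier: str  # identifier inside that scheme (hypothetical values below)


@dataclass
class SubjectHub:
    """A hub with no name, type or property of its own: only spokes."""
    spokes: set = field(default_factory=set)

    def attach(self, rep: Representation) -> None:
        """Declare that `rep` is one representation of this hub's subject."""
        self.spokes.add(rep)

    def counterparts(self, rep: Representation) -> set:
        """Return the representations of the same subject in other schemes."""
        if rep not in self.spokes:
            return set()
        return {r for r in self.spokes if r.scheme != rep.scheme}


if __name__ == "__main__":
    hub = SubjectHub()
    owl = Representation("OWL", "http://example.org/onto#Nocturne")
    skos = Representation("SKOS", "http://example.org/concepts/nocturne")
    wiki = Representation("Wikipedia", "http://example.org/wiki/Nocturne")
    for rep in (owl, skos, wiki):
        hub.attach(rep)
    # Starting from the OWL class, reach the SKOS concept and the wiki page.
    for other in sorted(hub.counterparts(owl), key=lambda r: r.scheme):
        print(other.scheme, other.identifier)
```

Note that the hub carries no logic of its own: each scheme keeps its structure and rules, and the hub only says "represented this way here, and that way there".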


Ontology Mapping, Ineffable Subjects and Blank Nodes

In this thread on the SWAD forum, Alistair Miles and Dan Brickley re-activate an old issue : How do I express that resource X in representation scheme A (e.g. a SKOS concept scheme) and resource Y in representation scheme B (e.g. an OWL ontology) are somehow representations of the same (----). After suggesting a suboptimal Topic Map solution, yesterday I suddenly came up with the idea that in RDF, blank nodes could be a killer solution. Actually one can use blank nodes to aggregate various representations of whatever, staying agnostic on what this whatever is. Using blank nodes to represent "ineffable subjects" is cool: since they have no URI, nobody is able to say anything directly about them (asserting name, type or any other property). Put this together with the recent debate on the ISO SC34 mailing list about subject locators, and consider this provocative conclusion : RDF blank nodes are better than TM topics at representing subjects, since, and this is my last thought, subjects have no identity, only representations have one. Subjects have no identity, read no type, no property at all. Resources have identity (URIs), so the best attempt to indicate a subject is to gather various resources in a blank node, as so many fingers pointing towards the moon.
Remember in the Topic Maps book, I wrote about an empty subject indicator ...
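
As a rough sketch of what "gathering various resources in a blank node" could look like at the triple level, the snippet below writes N-Triples lines by hand. The predicate `http://example.org/vocab#representedBy`, the function name and the resource URIs are hypothetical; neither SKOS nor OWL defines such a property.

```python
from itertools import count

# Counter used to give each blank node a distinct local label.
_labels = count(1)

# Hypothetical predicate, invented for this illustration only.
REPRESENTED_BY = "http://example.org/vocab#representedBy"


def gather(resource_uris, predicate=REPRESENTED_BY):
    """Return N-Triples lines linking one fresh blank node to each resource URI.

    The blank node has no URI of its own; its label is only meaningful
    inside the document in which it is written.
    """
    node = f"_:subject{next(_labels)}"
    return [f"{node} <{predicate}> <{uri}> ." for uri in resource_uris]


if __name__ == "__main__":
    lines = gather([
        "http://example.org/skos/X",  # resource X in scheme A
        "http://example.org/owl#Y",   # resource Y in scheme B
    ])
    print("\n".join(lines))
```

The only thing the graph says about the node is which resources point to it, which is the whole idea.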


Maybe ontologies aren't overrated after all...

When Clay Shirky posted his now famous "Ontologies are overrated" paper (see also here), a relatively new conversation started. On the sidelines, however, massive creativity continues. For instance, look at this page at del.icio.us where a small ontology of file types is used to refine the del.icio.us tags.

A comment to that post points to here, a page which enumerates things going on either for, around, or inspired by del.icio.us. One such link points to sid.vicio.us where a list of OWL ontologies exists, one of which is liberal.owl. rdf:about and rdf:resource attributes are links into del.icio.us content.


Bloom filters

From Wikipedia:
The Bloom filter, conceived by Burton H. Bloom, is a space-efficient probabilistic data structure that is used to test whether or not an element is a member of a set. False positives are possible, but false negatives are not. Elements can be added to the set, but not removed (though this can be addressed with a counting filter). The more elements that are added to the set, the larger the probability of false positives.
A good list of papers about applications at the end of the article, including P2P networks. Certainly could be applied to automatic aggregation of Topic Maps too. To be compared with methods of Subject Identity Measure already mentioned.
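
For readers who want to see the structure described in the excerpt, here is a minimal sketch. The sizes, the use of SHA-256 with an index prefix to derive several hash positions, and the idea of storing subject identifier URIs in it are my own assumptions for illustration, not taken from the Wikipedia article or from any Topic Maps tool.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: membership tests may give false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        # One byte per bit position, to keep the sketch readable.
        self.bits = bytearray(size_bits)

    def _positions(self, item: str):
        """Yield num_hashes positions derived from salted SHA-256 digests."""
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        """Set every position for the item; elements cannot be removed afterwards."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        """True if all positions are set, i.e. the item is *possibly* in the set."""
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter()
    seen.add("http://example.org/psi/hubject")  # hypothetical subject identifier
    print("http://example.org/psi/hubject" in seen)
```

A peer could publish such a filter of the subject identifiers it knows about, so that another peer can cheaply rule out topics that certainly have no match before attempting a full merge.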


Stumble Upon

Stumbled upon this community tool yesterday. Quite amazing mix of FOAF and Bookmark sharing. I think Jack will love it, and become a stumbler too. The link on the side bar is to my personal stumble node.

Are "subject locators" bogus?

Patrick Durusau, in a post with this title on the Topic Maps ISO/IEC SC34 list, questions the notion of "subject locators" as defined by TMDM. His point is that through the network you never retrieve a resource, only some representation of it, depending on many things, including the global state of the client-server system at retrieval time, the state of the resource itself etc. Patrick quotes excerpts from Roy Thomas Fielding's dissertation supporting such a view:
The early Web architecture defined URI as document identifiers. Authors were instructed to define identifiers in terms of a document's location on the network. Web protocols could then be used to retrieve that document. However, this definition proved to be unsatisfactory for a number of reasons ...
I tend more and more to agree with Patrick that the distinction Topic Maps make between "subject identifiers" and "subject locators", in other words between "subject indicator references" and "resource references", is certainly something to revisit. More on the thread ...


Planet Identity

I did a quick google on this site to ensure someone didn't already mention this aggregation of blogs.
Planet Identity is an aggregation of public weblogs related to Identity Management. The opinions expressed in those weblogs and hence this aggregation are those of the original authors.

If you're not sure whether something is ambiguous, it is

Found that while browsing the ESW Wiki, in a page called GoodURIs. Food for thought is to be found around that one, about topics such as UniversalNames, or DefineYourTerms.


Semantics@ IVOA Forum

I've already mentioned IVOA here a while ago. A few days ago I jumped into a lively debate on the IVOA semantics forum, including this thread, where the ontological status of observation, event, object, etc. is discussed in depth. As already mentioned, the definition, identification and classification of an unbounded quantity of "objects", among the tremendous flow of data carried by all wavelengths of light over distances ranging from a few kilometers to billions of light-years, captured by so many instruments and stored in so many databases ... is one of the widest, oldest and most fascinating challenges in Knowledge Management. Not to mention the task of making sense of all that stuff put together ...


Semantic Elephant

"Part 1 : The Elephant is Real". The first of a series by Jeff Pollock of Network Inference about the still unknown, but certainly happening, "Semantic Convergence". To be continued ... comments when I've read the whole series.