2006-12-29
Classifying is hard, tagging is worse
Sound familiar? I can already hear the folksonomy people crying: "Hey, of course, that's why tagging is so cool". As far as I am concerned, tagging is worse. It means more arbitrary decisions, because not only do I have to choose a category, I can choose more than one, or none at all, and I have to figure them out myself. Way too many decisions ... That's why my browser bookmarks and email folders are a mess, why I have no del.icio.us account, why my Technorati profile is so low, etc ...
Beyond my own decision difficulties, there is something to be added to this now long discussion about ontologies vs tagging. What I've learnt in science is that a good theory is a falsifiable one. What you assert using an ontology, in whatever language or framework with declared formal semantics, is falsifiable. No formal semantics, no notion of true and false, hence no falsifiability. In other words, and to make it simple, an RDF assertion can be declared or inferred true or false against a given ontology, an OWL class can be proven unsatisfiable, etc. Nothing of the like with tags. The assignment of a tag cannot be proven true or false, or inconsistent. Tags are not falsifiable.
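To make the contrast concrete, here is a minimal sketch of what "falsifiable" means in practice. It is a hand-rolled toy check, not an OWL reasoner, and every name in it (the ex: classes, the disjointness axiom, the sample data) is an assumption made up for the illustration. With a declared axiom, a set of type assertions can be shown inconsistent; with bare tags there is nothing to check against.

```python
from itertools import combinations

# Hypothetical mini-ontology: a single axiom saying two classes are disjoint.
DISJOINT = {frozenset({"ex:Person", "ex:Document"})}


def inconsistent_subjects(triples):
    """Return the subjects typed with two classes declared disjoint."""
    types = {}
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            types.setdefault(subject, set()).add(obj)
    return sorted(
        subject
        for subject, classes in types.items()
        if any(frozenset(pair) in DISJOINT for pair in combinations(classes, 2))
    )


# Assertions with formal semantics: the first subject contradicts the axiom.
assertions = [
    ("ex:item1", "rdf:type", "ex:Person"),
    ("ex:item1", "rdf:type", "ex:Document"),
    ("ex:item2", "rdf:type", "ex:Document"),
]

# The same information as tags: no axiom applies, so nothing can be refuted.
tags = {"ex:item1": {"person", "document"}, "ex:item2": {"document"}}

print(inconsistent_subjects(assertions))
```

The tag dictionary is deliberately left unused: there is no function to write for it, which is the whole point.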
By the way, the same distinction is to be made for RDF vs Topic Maps. Topic Maps are not falsifiable, because they have no formal semantics. Now the question is whether falsifiability, which has proven to be critical in science, is also critical in information technologies.
That said, since the new Blogger version enables easy tagging (maybe the older version did also, but I was never aware of it), and since there is now quite a bunch of posts on univers immedia, I decided to be brave and start tagging them, as thoughtlessly as possible. Starting with the most recent ones, I then shifted to the oldest, a good occasion to revisit them if nothing else. You can see the result on the left under "What". The first impression is of course that there are too many tags, but I will try to keep going that way throughout the blog just to see how it flies, then maybe keep only the most frequent ones if I end up with too long a list.
2006-12-27
OWL ontology for identity on the web
The definitions of resource that can be found in literature show ambiguity, making the issue of handling the identification of a web resource very problematic.
Our approach restricts the nature of the web resource to that of a computational object. This choice is motivated by the fact that a resource is something that has to be addressable, and things like cars and people are not addressable for their nature. Hence, it is wrong in principle to use the same mechanism of addressing for entities that have such different sorts.
Migration to new Blogger version
The list of contributors does not show anymore. Contributors have to do something about their Blogger account to be able to post again.
A couple of things I've been about lately
I've been silent here for over two months now, my blogging time being devoted to the Mondeca blog in French, Leçons de Choses. But there are a couple of things I've been working on that are worth mentioning.
I've exchanged with Michel Biezunski on his Data Projection Model, and found that its genericity and simplicity made it easy and straightforward to express the structure of Mondeca ITM, without the borderline hacking needed when using either OWL-RDF or XTM for the same task. Now the open questions: What will happen with that model? Who will see the benefits over languages already in this space, and singularly over RDF? Who will build tools supporting it?
I've been wondering if a semiotic approach could shed some light on our thoughts on referents, and came up with an RDF semiotic triangle. The URI is the signifier, and the RDF description is the formalisation of the signified concept associated with the URI. The referent is outside the realm of language and signs, and should stay there. In this approach, attempting to achieve a representation of the referent, even using tricks such as blank resources or hubjects of any kind, is therefore a recursive trap and actually nonsense. So any declaration of sameness or identity of referents should be avoided. Only concepts bear identity, not their referents. From that point on, I came to the idea that linking different concepts/signs (URI + RDF description) which humans consider to have more or less similar referents will take the form of processing rules, more than declarative semantics.
Thanks to Jakob Voss for this post in a long thread on the public-esw-thes list, which really triggered a kind of illumination about this. As an example, trying to say that my SKOS concept a:Restaurant has the same referent as your OWL class b:Restaurant through any RDF declarative relation between those two resources should be avoided. But I can set in my system a functional rule expressing that any document whose subject is an instance of your b:Restaurant class will be indexed against my a:Restaurant concept. The referent is represented nowhere, but it is acting at the core of this rule.
Actually we have this very indexing rule mechanism working in some Mondeca applications, and I have submitted a paper to XTech 2007 about it. More to come if the paper is ever selected.
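The a:Restaurant / b:Restaurant rule described above can be sketched in a few lines. This is only an illustration under assumed data structures, not the Mondeca implementation: the instance names, the lookup tables and the function name are all hypothetical. Note that nothing in it declares the two resources to be the same; the link exists only as a processing rule.

```python
# Hypothetical facts from the other vocabulary: instance -> its classes.
INSTANCE_OF = {
    "b:ChezPaul": {"b:Restaurant"},
    "b:Louvre": {"b:Museum"},
}

# Processing rules: if a subject is an instance of the class on the left,
# index the document against my concept on the right.
INDEXING_RULES = {"b:Restaurant": "a:Restaurant"}


def index_document(subjects):
    """Return the local concepts a document should be indexed against."""
    concepts = set()
    for subject in subjects:
        for cls in INSTANCE_OF.get(subject, ()):
            if cls in INDEXING_RULES:
                concepts.add(INDEXING_RULES[cls])
    return concepts


# A document about a restaurant is indexed against a:Restaurant;
# a document about a museum matches no rule and gets no concept.
print(index_document(["b:ChezPaul"]))
print(index_document(["b:Louvre"]))
```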
Lately, I got interested again in triggering some process to have languages available not only as tags to use in XML, but as proper RDF resources. This is an old story going back to the OASIS Published Subjects Technical Committees, and singularly PSI for languages. Track this topic on ESW Wiki, and see here for the ongoing thread and more explanations. There again, my proposal is to forget absolute identification of a language by a URI. The concepts identified by URI are the properties and property values that can be declared for a language, and applications are left to decide which properties are useful to them. No absolute rule saying that two descriptions refer to the same language.
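A rough sketch of what "let applications decide" could look like: each description is just a set of property values, and each application picks the properties it cares about when comparing two descriptions. The property names (ex:iso639-1, ex:name, ex:script) and the sample values are assumptions made up for the illustration, not an existing vocabulary.

```python
def same_for(app_properties, desc1, desc2):
    """True if both descriptions carry, and agree on, every chosen property."""
    return all(
        prop in desc1 and prop in desc2 and desc1[prop] == desc2[prop]
        for prop in app_properties
    )


# Two hypothetical descriptions, published by two different sources.
description_a = {"ex:iso639-1": "fr", "ex:name": "French"}
description_b = {"ex:iso639-1": "fr", "ex:name": "français", "ex:script": "Latn"}

# An application keyed on the two-letter code treats them as matching.
print(same_for(["ex:iso639-1"], description_a, description_b))

# An application keyed on the display name does not.
print(same_for(["ex:name"], description_a, description_b))
```

Neither answer is the right one in an absolute sense; each is right for the application that chose the properties.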