New deliverables

The new version of the Topic Maps Reference Model definitely puts subject identification at the core, as the common feature of Topic Maps Applications. It acknowledges in the informative Annex A:
The problem of "subject identity" has recently been recognized as more difficult than previously thought by proponents of the Semantic Web.
This Annex also cites several interesting papers already mentioned here in previous posts. Meanwhile, the SWBP RDFTM Task Force has delivered a rich Survey of Interoperability Proposals with a fairly exhaustive presentation of the identity issue.

I've been playing lately with SWRL, wondering how it could be used to express subject identification rules. We have poked around enough lately at the notions of context and protocol of identification to think about moving from those qualitative, general considerations to something more effective, like a "Subject Identification Rule Language" able to capture complex rules of identification, including declared or computed properties of subjects as well as context elements.
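
To make the idea a little more concrete, here is a minimal Python sketch of what such rules might look like. It is not SWRL and not any existing rule language: the `Proxy` class, the property keys ("identifier", "name", "domain") and both rule functions are hypothetical names invented for illustration. One rule uses a declared property, the other a computed comparison that only fires within a given context.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Proxy:
    """Hypothetical stand-in for a subject: just a bag of declared properties."""
    properties: Dict[str, str] = field(default_factory=dict)


# A rule looks at two proxies plus a context and decides
# whether they identify the same subject.
Context = Dict[str, str]
Rule = Callable[[Proxy, Proxy, Context], bool]


def same_identifier(a: Proxy, b: Proxy, context: Context) -> bool:
    """Declared property: both proxies carry the same identifier."""
    ident: Optional[str] = a.properties.get("identifier")
    return ident is not None and ident == b.properties.get("identifier")


def same_name_in_same_domain(a: Proxy, b: Proxy, context: Context) -> bool:
    """Computed property plus context: names match case-insensitively,
    but only when the context names a domain that both proxies share."""
    domain = context.get("domain")
    if domain is None:
        return False
    if a.properties.get("domain") != domain or b.properties.get("domain") != domain:
        return False
    name_a, name_b = a.properties.get("name"), b.properties.get("name")
    return name_a is not None and name_b is not None and name_a.casefold() == name_b.casefold()


def identifies_same_subject(a: Proxy, b: Proxy, context: Context, rules: List[Rule]) -> bool:
    """Two proxies are taken to identify the same subject if any rule says so."""
    return any(rule(a, b, context) for rule in rules)
```

The design choice here is simply that a rule is a function of two proxies and a context, so the same pair of proxies can be identified in one context and kept apart in another.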


Categorization before identification

Over at sciencedaily.com today, I noticed an article on how the brain identifies objects which come into view (visual recognition begins with categorization). I am tossing this out for general consideration. Here's the requisite quote:
"There are two main processing stages in object recognition: categorization and identification, with identification following categorization," the authors wrote. "Overall, these findings provide important constraints for theories of object recognition."


Is Identity Contextualized?

In a recent post on the Ceryle blog I commented on an article by Katharine Mieszkowski of Salon.com called Steal this Bookmark!, which is basically about the emergence of grass roots ontologies online as used on websites like 43 Things. It might seem somewhat of a stretch for me to describe 43 Things thusly, but I think it's accurate, probably more accurate than much of the use of the word "ontology" within the Knowledge Representation field. 43 Things becomes a map of what people know about a series of subjects as expressed in common language. It's not perfect, the language is muddy, but there's no pretended formality either. As someone says in Mieszkowski's article, "It's more the simplest thing that could possibly work, that shouldn't work, but happens to."

My blog entry was mostly about grass roots or informal ontologies, which I think will succeed where the "Semantic Web" will fail, not so much in delivering the goods to its paying benefactors (such as DARPA and other large government and corporate entities), but in actually having any real impact on the Web as used by the Rest of Us.

The Web community has always developed its own technologies, almost in spite of the W3C, and this is only reinforced by the open source movement. There's no particular reason why RSS or Atom or other new Web technologies need to be based in RDF, it's just a convenient (cough) XML graph syntax. GXL or XTM would do just as well, maybe even better. I've long believed people's enthusiasm for RDF is simply a misplaced enthusiasm over graphs. To those long bound to hierarchies and tree structures, graphs seem very cool, like the Che Guevara of mathematical structures. They're more like the way of the world inside and outside our heads. Some people get very passionate about such things. Others like to watch golf on TV, so go figure?

Anyway, apart from my normal ranting I closed with mention of two issues near and dear to my own research: identity and context. I note that univers immedia has a reference to Chris Welty and Nicola Guarino, both of whom have done some excellent work on the former. Patrick Brézillon has for a number of years been leading conferences that focus on the latter, and maintains a web page about his work on context. While much of the Semantic Web stuff I find almost nonsensical in its almost complete absence of issues of epistemology, identity and context, these guys have been doing some very important work for many years. I don't think we could overestimate the importance of Brézillon's conferences in pushing the issue of context into the mainstream.

One of the things that I'm pretty convinced of is that everything is contextualized, even identity (I won't quote the first two chapters of the Tao Te Ching). So where in the Topic Maps models we always talked about the notion of some kind of fixed identity point around which we hung Topic characteristics, if that identity is itself fluid (i.e., contextualized by any of a myriad of factors, human and not), it doesn't exactly break the model, but it makes it a lot more complex, perhaps more capable of modeling real life. For those of you who speak XTM natively, we'd just need to add an optional <scope> element to the content model of <subjectIdentity>. But there's probably a way to do this without mucking with the XTM syntax.
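
For those who don't speak XTM natively, the same idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the Topic Maps data model: `ScopedIdentity`, `Topic`, `indicators_in` and `same_subject` are hypothetical names, the example.org URL is a placeholder, and I am assuming that an identity applies whenever all of its scope themes are present in the current context (an empty scope means it always applies).

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class ScopedIdentity:
    """A subject indicator plus the (hypothetical) scope in which it holds."""
    indicator: str
    scope: FrozenSet[str] = frozenset()  # empty scope: holds in every context


@dataclass
class Topic:
    name: str
    identities: List[ScopedIdentity] = field(default_factory=list)

    def indicators_in(self, context: Set[str]) -> Set[str]:
        """Indicators whose scope themes are all present in the context."""
        return {i.indicator for i in self.identities if i.scope <= context}


def same_subject(a: Topic, b: Topic, context: Set[str]) -> bool:
    """Within a context, two topics share a subject if they share an indicator."""
    return bool(a.indicators_in(context) & b.indicators_in(context))


paris_geo = Topic("Paris", [
    ScopedIdentity("http://example.org/paris-france", frozenset({"geography"})),
])
paris_city = Topic("Paris (city)", [
    ScopedIdentity("http://example.org/paris-france"),
])

in_geography = same_subject(paris_geo, paris_city, {"geography"})
in_mythology = same_subject(paris_geo, paris_city, {"mythology"})
```

With these assumptions the two topics would merge when the context includes "geography" and stay apart otherwise, which is the sense in which the identity point stops being fixed.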

I've been digging around in the philosophical/epistemological literature (e.g., [1], [2/3], [4/5], [6]), trying to find that Copernicus-in-the-bathtub experience (no, not that one, the other one) on how identity and context mesh. It seems sometimes the more I dig the more complicated the issue becomes, and unfortunately my research domain isn't theoretically in philosophy (at least that's what my advisors keep advising me — they hope I'll actually finish my dissertation one day). The pile of books keeps getting higher.
The Penumbra said to the Umbra, "At one moment you move: at another you are at rest. At one moment you sit down: at another you get up. Why this instability of purpose?"

"Perhaps I depend," replied the Umbra, "upon something which causes me to do as I do; and perhaps that something depends in turn upon something else which causes it to do as it does. Or perhaps my dependence is like (the unconscious movements) of a snake's scales or of a cicada's wings. How can I tell why I do one thing, or why I do not do another?"
-- Chuang Tse, (trans. Lin Yutang)
which kinda sums up my own experience lately...


The Concept of Subject in a Semiotic Light

The linked paper is by Jens-Erik Mai, whose publications can be found here. Personally, I recommend studying his dissertation. At various times in the past, I have connected a C.S. Peirce scholar, Mary Keeler, to Steven Newcomb, one of the founders of the topic maps paradigm (among other important contributions). What we get from that coupling is the realization that there is, indeed, a semiotic aspect to the nature of subjects. From the linked paper:
One of the key functions of library and information services is to provide access to information based on users' requests for knowledge. Knowledge can be stored in a wide range of information bearing objects such as text, image, sound, multimedia, and as technology develops more people gain access to the objects, through different media. We will here analyze the processes and problems associated with determining the subject matter of an information bearing object.


A Formal Ontology of Properties

Christopher Welty just sent me the pointer to an excellent paper he presented with Nicola Guarino at the 12th International Conference on Knowledge Engineering and Knowledge Management EKAW-2000.

Clean and clear introduction to difficult issues, very formal but at the same time providing a very practical framework for sound ontology engineering.


Identity, Context, and everything...

The link is to a lively thread, jumping into the middle, where Peter P. Jones opens the floodgates on a discussion about context. No point in doing all the obligatory quotes here. Just go read the thread. But, hey, two quotes do stand out, following Peter's thesis on context as geometry.

Dennis E. Hamilton:
There's a book called "Metaphor and Reality" by Phillip Wheelwright that has a tangential bearing on this topic. The phrase that sticks in my mind is this: "machines have contexts, people have perspectives."
and Murray Altheim:
I'm not sure if you're aware, but there's a whole subdomain within the Knowledge Representation/Structures community devoted to issues of context, which seems to be headed up by Patrick Brezillon.

Cybernetics and Conversation

Jack's previous post makes me revisit old tracks about subject identity dynamically emerging and changing through conversation. In the last section of my Published Subject Indicators chapter in the Topic Maps Book, I wrote a few years ago:
The best PSI is the one that is most likely to change its content because it is maintained at the core of the community questioning the subject, and most subjects are moving targets.
Seems that Wikipedia pages are exactly that: places where living subjects are continually emerging through conversation. Googling conversation + identity, I stumbled on the title page written back in '96 by Paul Pangaro:
The piece attempts to capture, in every-day language, the breadth of Conversation Theory as purveyed by Gordon Pask. Although it was not explicit in the publication, a sub-title could be, "Conversation Theory in Two Pages."
The opening line reads: "Without conversation, there is nothing (no thing)"


Wikipedia URLs as Subject Codes

The link is to David Megginson's blog. This struck me as terribly interesting.
Over in my aviation weblog, I find myself more and more linking to Wikipedia whenever I’m discussing a concept, person, place, or anything else that doesn’t have its own, canonical home page. If, as I suspect, lots of other bloggers are doing the same, then links to Wikipedia articles may soon be the blogsphere’s answer to subject codes.
The idea follows something similar from James Tauber, who points to a tagging scheme from Technorati. In the end, it seems that subject identity lies in the realm of consensus or agreement; Wikipedia appears popular enough that its URLs might serve at least one important aspect of the subject identity issue.