2005-12-27
The Wheel and the Hub (six months later)
Following this logic, SPEK vocabulary has also been simplified to the extreme ... I don't need "views" and "aspects" any more. Any RDF description is a view and provides a specific aspect. Note also that 'hubject' is back in the SPEK vocabulary, no longer as a class, but as the property linking a description to the binding blank node.
I guess this is now as simple as possible ...
2005-12-09
Forging URI schemes : best or bad practice?
The year of the unique ID
Work (e.g., Hamlet)
Expression (e.g., the Folger's Hamlet with annotations and introduction)
Manifestation (a particular print run of Folger's Hamlet)
Item (a copy of Folger's Hamlet sitting on a shelf)
2005-11-25
Tom Gruber on Tag Identity
Section 5, just before the conclusion, raises a bunch of questions about Tag Identity, showing that both Tags and URIs meet similar identification issues, in fact common to any naming mechanism, and that neither provides killer answers.
2005-11-21
Identity, Reference and the Web
URIs are the primary mechanism for reference and identity on the Web. To be useful, a URI must provide access to information which is sufficient to enable someone or something to uniquely identify a particular thing and the thing identified might vary between contexts. There is no doubt that as mechanisms for identifying web pages the URI has been wildly successful. Currently, URIs can also be used to identify namespaces, ontologies, and almost anything. However, important questions are the interpretation and use and meaning of URIs have been left unquestioned ...
Exactly indeed, what we are about here ... Interesting to see also Pat Hayes in the co-chairs list. I remember that quite a while ago in a private communication, Pat had stressed the fact that identity issues had been "sadly overlooked" so far by current Semantic Web technologies.
Compact URIs : The CURIE syntax
New SPEK release
Hubject is no longer an explicit class in this version, because I've figured out that just about any resource could be used as a hub. I've kept "spoke" as the property linking a view to the resource it describes, but changed the direction : the spoke is directed inward to the hub (the hubject resource), not outward from the hub.
As an example, I picked the "Air Pollution" hubject as defined by Wikipedia, and four different views : a term in a glossary, a descriptor in a thesaurus, a category in a taxonomy, and a class in an ontology.
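The inward direction of the spoke can be sketched with plain triples. This is only an illustration: the `spek:spoke` name, the prefixed view identifiers and the Wikipedia URL used for the hub are assumptions made for the example, not the actual SPEK vocabulary or data.

```python
# Hypothetical names: the real SPEK namespace and the four view URIs are not given here.
SPOKE = "spek:spoke"
HUB = "http://en.wikipedia.org/wiki/Air_pollution"  # assumed URL for the hub resource

# Each triple is (subject, property, object); every spoke points inward, from a view to the hub.
triples = [
    ("glossary:air-pollution", SPOKE, HUB),    # a term in a glossary
    ("thesaurus:air-pollution", SPOKE, HUB),   # a descriptor in a thesaurus
    ("taxonomy:air-pollution", SPOKE, HUB),    # a category in a taxonomy
    ("ontology:AirPollution", SPOKE, HUB),     # a class in an ontology
]

def views_of(hub, statements):
    """Return every view whose spoke is directed at the given hub."""
    return [s for s, p, o in statements if p == SPOKE and o == hub]

print(views_of(HUB, triples))
```

Since the hub carries no outgoing spoke, a new view can be added without touching the description of the hub itself.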
2005-10-25
SKOS in Topic Maps
Now the crucial question is maybe not how you can do it (something Lars Marius shows quite neatly as usual), but why one would want to do that. Adding a real world use case would be cool ...
As simple as possible ...
Please, W3C, create a standard RDF serialization that elevates RDF as a first class citizen of XML. Everyone else has a schema, why can't we?
Having spent (too much) time these days struggling with the yet-another-serialization syndrome in the latest versions of SWOOP and Protégé, I could not agree more. But while waiting for such a (most unlikely) W3C delivery, alternative solutions pop up and are worth looking at.
Phil Jones pushes the notion of SynWeb, which he defines as a web which doesn't need "key identifiers".
The difference is that the knowledge needed to give semantics to the data resides in the programs which do the combining, rather than in a schema which has been prepared earlier.
No absolute meaning of data, no absolute identifiers, semantics in the application context? Certainly close to our current ramblings on perspectives and aspects.
The simplest and most radical alternative to-date is certainly Phil Dawes' tagtriples, a simple text format for triple statements. Forget URIs, namespaces, XML and the like. Identification is local to a graph (an ordered collection of statements), as indicated in the Tagtriples Model and Semantics (don't run away, that is really as simple as can be).
All occurances [sic] of a particular symbol in a graph must denote the same meaning. [...] The same symbol used in different graphs may or may not denote the same meaning - it is up to the consumer of the information to interpret how the symbol/meanings correspond.
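The scoping rule quoted above can be sketched in a few lines. This does not reproduce the tagtriples text format; the graphs, symbols and the `qualify` helper are all made up for the illustration, and only show how a consumer might decide which symbols correspond across graphs.

```python
# Two graphs using the same symbol "jaguar" with different meanings (made-up data).
graph_a = [("jaguar", "type", "animal"), ("jaguar", "habitat", "rainforest")]
graph_b = [("jaguar", "type", "car"), ("jaguar", "maker", "coventry")]

def qualify(graph, name, shared):
    """Prefix every symbol with the graph name, except those the consumer declares shared."""
    def q(symbol):
        return symbol if symbol in shared else f"{name}:{symbol}"
    return [(q(s), q(p), q(v)) for s, p, v in graph]

def describe(graph, symbol):
    """Collect the property/value pairs stated about one symbol in one graph."""
    return {p: v for s, p, v in graph if s == symbol}

# The consumer decides that only "type" means the same thing in both graphs.
shared = {"type"}
merged = qualify(graph_a, "a", shared) + qualify(graph_b, "b", shared)

print(describe(merged, "a:jaguar"))
print(describe(merged, "b:jaguar"))
```

Within each graph a symbol keeps a single meaning; across graphs nothing is merged unless the consumer says so.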
2005-10-21
Introducing SPEK
More to come ...
2005-10-20
Perspectives and SKOS
Subject classification with DITA and SKOS
A topic is a unit of information that describes a single task, concept, or reference item.
The new publication, really worth reading, comes with a challenging academic subtitle : "Managing formal subjects", hiding in fact a very pragmatic approach:
In a topic-oriented architecture such as DITA, content is authored in small, independent units that are assembled to provide help systems, books, courses, and other deliverables. Each unit of information answers a single question for a specific purpose. That is, each topic has specific, independent subject matter -- the very reason that these units of information are called topics.
The paper then expands very neatly on how SKOS can be used to declare what the subject of a topic is, claiming that "subject" here is to be understood in the same sense as in "Published Subject Indicator".
2005-10-17
Placeopedia = Google Earth + Wikipedia
2005-10-10
The Search for the Perfect Language
I take that such a language is possible, and that the science on which it depends can be found, by mean of which farmers could best grasp the truth of things than philosophers do today. But don't hope to ever see it in use; that would suppose great changes in the order of things, and would need the world to be a heaven on earth, something worth to propose only in the world of novels.
2005-10-06
Topic Maps for Libraries Wiki
Elaine Svenonius in her book The Intellectual Foundations of Information Organization states that the purpose of information organization is "to bring essentially like information together and to differentiate what is not exactly alike".
Suellen has also established a Topic Maps Interest Group within LITA (Library & Information Technology Association). I hope she will take the time to comment a little more about it here.
Lars Marius is alive and blogging at TMRA'05
What about "Beer, Topic Maps and Everything."?
2005-10-03
Anti-SPAM measures for comments
Sorry for the extra inconvenience in posting comments.
2005-09-28
Revisiting Content Negotiation
2005-09-21
Axioms of Identity
In my research into digital identity, I created a set of 'axioms' that have molded my perspective of the subject. I developed these axioms as the foundation for how I would create a digital identity solution ... a software solution to accumulate identity, and provide controlled dissemination of that information.
The First Axiom of Identity
I posit that we humans do not have any inherent identity.
The Second Axiom of Identity
I posit that identity does not exist outside the context of a community.
The Third Axiom of Identity
I posit that identity is exchanged in transactions that occur within a context of trust and authentication.
nota bene: given the last update on these (4-3-2005), I'm guessing that Bernard didn't already mention them here earlier :)
2005-09-16
Thinking about RDF and Topic Maps
I like to think about it this way: the core of the topic maps inquiry is to
satisfy a couple of important use cases: finding and reminding. In
those two use cases lie two primitive notions: subject identity and
names for things. Those are the two primitives that topic maps place
front and center, whereas, it seems to me, OWL emphasizes inferencing in
subsumption hierarchies, relegating subject identity to "proper use of URIs". I like to think about subject identity in the same terms a lawyer might do so in a court case. There, properties of the subject, more so than some URI, become all important. A trial might turn on something as trivial as shoes worn on some particular day. As topic maps are evolving, particularly in the case of the TMRM (topic maps reference model), we are seeing more emphasis placed on comparable subject properties than on precise URIs, which, in many cases, do not (yet) exist. We are seeing the evolution of the ability to "confer" identity on a subject according to circumstances. I think this line of inquiry can map directly into rdf work.
Topic maps (indeed, "subject maps") add one important consideration
outside subject identity and names for things: a guarantee that any
proxy for any subject (aka Topic), is the one place you need to go (in
*this* map) to find all that is knowable about that subject.
The knife cutting in the other direction suggests that, at the implementation level, topic maps could evolve along lines suggested from rdf work. Indeed, some of my own work involves the use of Jena coupled with JDBM for a backside.
2005-09-14
GeoRSS
GeoRSS is simple proposal for RSS feeds to also be described by location or Geotagged. It standardizes the way in which "where" is encoded with enough simplicity and descriptive power to satisfy most needs to describe the location of Web content.
This article further suggests combining this geo-tagging with folkso-tagging, to provide pragmatic but efficient "what-where" identification.
2005-09-06
Technorati Blog Finder
Semapedia
2005-09-02
Simile Tools
What people say about what they do is also interesting stuff. I had mentioned Stefano's Lynotype before. Posts are not frequent, but always thoughtful. See e.g. Data First vs. Structure first.
2005-08-31
Vocabulary, taxonomy, thesaurus, ontology ...
Seems that "Taxonomy" is the trendiest word these days, but if you take the time to do a bit of shopping at the Taxonomy Warehouse you will find all kinds of resources belonging to any of those types, and many more : subject headings, classification schemes, indexing schemes, reference models, dictionaries, glossaries ...
Google Sets
Try {thing, subject, resource} or {Mondeca}.
No hierarchy revisited
2005-08-30
Blogos, the essence of your blog
Grafting, crossbreeding and other taxonomy breaches
Most fears linked to bio-technologies are indeed to be considered at the same level. People are both fascinated and scared about hybrids and GM organisms, as they have always been about monsters and chimaeras of any kind, more for the breaches they make in their world representation than any objective danger they bring about.
2005-08-18
Strange fossil defies grouping
The trouble is the animal, named Vetustodermis planus, did not possess a set of features, or characters, which placed it clearly within any known group.
I am interpreting the word "characters" to mean characteristics. This creature identity issue is telling in the sense that it suggests open issues for topic maps subject identification processing. How does ISO 13250 address subject identification? Section 5.2.1 "Topic Link Architectural Form" of ISO 13250 suggests this:
The optional subject identity attribute refers to one or more indications ("subject descriptors") of the identity of the subject (the organizing principle) of the topic link.
There exist numerous interpretations of 5.2.1, which are manifest in XTM, TMDM, and TMRM. Is it appropriate to revisit the assumptions inherent in those interpretations?
I am indebted to Patrick Durusau for long and productive discussions centered around the subject identity issues related to topic maps implementations. I'd like to see such discussions in greater depth, in public.
2005-08-16
Subject Identity: Now more than ever...
2005-08-10
More on Quantum Semantics
- "Identity and Individuality in Quantum Theory"
- "Relativization of the Principle of Identity"
- "Quantum Objects are Vague Objects"
We have suggested here that quantum objects are vague objects and, further, that how that vagueness is understood depends on the metaphysical package adopted with regard to their individuality. If quantum objects are taken to be individuals, as Lowe considers them, then the vagueness arises because of the existence of relations which do not supervene on monadic properties of the relata; it is because of such relations that we cannot tell which particle is which in an entangled state [...] The alternative package characterises quanta as non-individuals, where this is understood in terms of a lack of identity. [...] There are still some interesting questions to be addressed here, such as how it is that one can refer to objects for which one cannot even say that identity holds.
2005-08-08
Deep Web Research
DeepWebResearch.info is a Subject Tracer™ Information Blog developed and created by the Virtual Private Library™. It is designed to bring together the latest resources and sources on an ongoing basis from the Internet for deep web research which are listed below.
On the surface, it sounds like they are doing topic mapping of one sort or another. What is more interesting (to me) is how I landed on that site: mostly by way of a search for everything that is knowable about UIMA, IBM's Unstructured Information Management Architecture, which is being announced this week at LinuxWorld to go open source. It is already an Eclipse plugin. One of the search hits suggested that DeepWebResearch might be using UIMA in its technology.
Whether quantum mechanics, or category theory, or plain old propositional logic is at work, some form of information resource harvesting will be necessary. It seems a bit of great news that we can start pulling together a large array of available open source products to assemble ever more powerful harvesting tools.
Schrödinger's Web
It strikes me that if inconsistency is fundamental then it should be treated as such, not something to be avoided.
2005-08-03
Perfect or sloppy - RDF, Shirky and Wittgenstein
It essential[ly] hinges on this, do you believe two people have ever in the history of humanity shared the same (i.e identical) concept. Do you believe that concepts exist as perfect entities that we share or infact do we say a concept is shared when we see a number of people using words in a similar enough way. i.e is the world fuzzz, sloppy and uncertain or is it perfect? Are concepts A Priori or derived?
Quoting further:
This is the essential error that Wittgenstein points out in his later work. There is no single shared meaning that we all can describe in our different ways. To believe so is to believe that a meaning exists A Priori and that language is just our means of describing it. Instead Wittgenstein turns it on its head and says, meaning is nothing more than the way a word is actually used by people.
The post then goes on to describe ways in which his comments are reflected in applications of RDF. Danny Ayers adds a comment to the post which says:
...the vast majority of software in use today is based on similar conceptual approximations, yet somehow manages to be useful.
2005-08-02
What is a planet ?
The claim Friday that a 10th planet has been discovered in our solar system has set off a fresh round of debate and international talks aimed at defining the most vexing term in astronomy: the word planet.
2005-07-29
Fun with Hubjects
A hubject is the result of a phonetic accident when two memes, subject and hub, have a translocation error performed on them. This accident is part of what we now call directed evolution. By contrast, the Philadelphia chromosome, known to be behind several cancers, the most prominent being chronic myelogenous leukemia, is a translocation error, not thought to be directed, between chromosomes 9 and 22. That error splices part of 9 with part of 22 into the famous BCR-ABL splice, characteristically referred to as "ph+" (because it was the first cancer gene discovered -- in Philadelphia -- following Watson&Crick). But, there remain the other parts which did not become famous, but which also get together. A dissertation at UCLA showed that object to be benign.
So, what's that got to do with hubjects? That's what hits when you've got soap in your eyes. In America, we have this whole thing about suburbs. Stay with me here; don't try to guess where this is going. We speak of living in the 'burbs. Could we say that a SubjectProxy (aka: Topic) living in a topic map is, um..., living in the 'bubs? Only if we had a different phonetic accident on the same memes and came up with, brace yourself, sububs.
You know, we can do that. Language is the longest running open source project in the entire universe. We can make up names for things till the cows come home, and beyond. At the core, however, the identity of the subject remains the same. Go figure.
We Are the Web
What will most surprise us is how dependent we will be on what the Machine knows - about us and about what we want to know. We already find it easier to Google something a second or third time rather than remember it ourselves. The more we teach this megacomputer, the more it will assume responsibility for our knowing. It will become our memory. Then it will become our identity. In 2015 many people, when divorced from the Machine, won't feel like themselves - as if they'd had a lobotomy.
2005-07-26
Seeking sustainable IT (not yet desperately, but still ...)
Let's face it. We're building these things for ourselves, and they're proliferating because we have fun doing it.
And any software vendor or consultant around could have added : " ... and because we hope to sell more technology built on top of it."
It made me wonder about the relevancy of some of the implicit assumptions which pushed me into Knowledge Engineering quite a while ago, and which I spelled out at some point in a nutshell as : "Knowledge is sustainable information". Information is consumable and volatile, will be tomorrow at best redundant, at worst obsolete. By contrast, knowledge is supposed to be sustainable and building up with time. The more knowledge you have already gathered, the more you are likely to transform new information into more knowledge. So I envisioned "Knowledge Technologies" (KT) as another name for "Sustainable IT", along the lines of the European IST program spirit : "From Information Society to Knowledge Society". All of it was supported by my background, which made me consider Maths as the most impressive accumulation of (sustainable) knowledge ever, after natural languages of course.
In contrast to this cumulative and patient build-up of knowledge we see in Maths and Science, proliferation of technology for the sake of it is clearly anything but sustainable. And actually, the current trend in which languages, software and hardware are all tied up in technological packages leads to this annoying conclusion that languages and specifications are bound to follow the same kind of "product life cycle" logic as their supporting software and hardware. If Knowledge Technologies keep following such a track, they clearly belong more to the technology-for-the-sake-of-it market logic than to sustainable knowledge building.
A very pernicious trend indeed, for many obvious reasons. Beyond the sheer issues of managing semantic interoperability between current, past and future languages and formats, and the loss of critical knowledge and data embedded in obsolete formats (see e.g. the Pioneer Anomaly), there are the human aspects of it : playing with a language is always more or less formatting your way of thinking. Dealing with too many different languages is difficult and confusing. Concepts are embedded in their representations, so considering language product life cycle means also considering concept life cycle. Not a very pleasant perspective.
I'm not sure what the requirements for sustainable IT would be. But surely one of them would be considering concepts the same way as life forms, with their needed diversity, fragility, and need for care.
2005-07-22
Mission 2007
From the mission page :
What is now needed is the launching of a self-propelling, self-replicating and self-sustaining model of ICT for rural regeneration and prosperity [...] The term “knowledge center” was chosen because at the village level there is need for value addition to generic information by converting it into local-specific knowledge [...] The Mission will be top-down in its approach to technological connectivity, but bottom-up in relation to content and knowledge management.
Watch this space ...
2005-07-08
News Metadata Framework Technical Specification
What is a document?
"What is a document" is a quite long and thoughtful entry (this blog is very verbose, many entries look like full papers), certainly relevant to recent debates about "information resource" vs "other resource", and more subtle than the recent resolution of httpRange-14 issue.
2005-07-01
Ambiguity and imprecision
I think there are two ways to consider ambiguity :
Way 1. Subjects are ill-defined, everything is fuzzy, nothing can be asserted for sure.
Way 2. Subjects are well defined, but in many ways, as so many views in/from different frameworks/perspectives.
Way 1 is good for informal and cheerful conversation, like the one you used to have in forums, and now blogs, RSS, tagging and the like. But it is IMO pernicious : people either think they agree, though they speak past each other, or the other way round think they disagree because they have no way to figure out if they actually have different viewpoints, or if they speak about different things. Billions of examples available every day.
Way 2 is what TMRM and hubjects are about : subjects are ambiguous, contradictory, fuzzy, moving targets, OK. But each view on a subject had better be well defined, and the rules for this definition explicit (perspective disclosed). You know your view is not exhausting the subject, you can explore different views, see if their logics are compatible, if they can play nicely with each other or are too orthogonal for that etc ... So you can agree that you agree or disagree on clear grounds, and go to war if needed, but with crystal clear reasons.
2005-06-30
Meme tracker
One of the advantages of coining a word is that you can track the progress of its associated meme. Last fall, in collaboration with readers of my blog, I settled on the word screencast. A couple of months ago it drew 200 Google hits, today the number is 60,000. Screencasting may never have the mainstream appeal of podcasting, a word coined not long before that now draws 8 million Google hits. But the meme is spreading and I can't wait to see where it goes next.
I can't wait either to see where hubjects go next, so I set this post as a permanent meme tracker. But so far, I can easily track the meme expansion myself. The last one to date is freshly posted by Danny Ayers as Stuff of the Day, just after a quick mail "intercourse" triggered by Jack. Before that, I had a fruitful exchange with Patrick Durusau and Steve Newcomb, from which it appears that hubjects could be considered as no more no less than possible technical implementations of the "subject proxies" defined by the TMRM. Steve is even considering the introduction of hubjects in his Versavant implementation. Got also very positive feedback from Phil Tetlow, coordinator of the W3C SWBPD Software Engineering Task Force, already mentioned in those pages. More to come ...
2005-06-28
The Wheel and the Hub
Note : The Wheel and the Hub is now published under Mondeca namespace, including logo and copyright. The new URL is http://www.mondeca.com/content/download/455/3434/file/hubjects.pdf
2005-06-21
Introducing Hubjects
This paper is a rough first cut of an introduction
inspired by Chapter 11 of the Tao-te-King
Thirty spokes share the wheel's hub;
It is the center hole that makes it useful.
More to come, including examples and graphics.
2005-06-20
httpRange-14 issue "Resolved"
Question: Who is going to mint and maintain all the URIs to talk about dogs?
2005-06-17
Ontology Definition Metamodel
2005-06-16
Blank Nodes continued
SRN : I insist that subjects do have identity, but only within contexts -- within universes of discourse.
I did not write something very different when I wrote that only representations can have identity. Maybe I should have put it slightly differently, and I'm sure Steve will agree with this other way to put it : Whatever the subject, it has neither absolute identity, nor absolute definition, nor absolute property of any kind, that would be valid in any context. Identity, and all properties bound to this identity, is always conferred through a representation, itself defined inside the context of some representation scheme, and making sense only in the framework of this scheme.
SRN: But I don't see how it's meaningful to say that a proxy is not a proxy for something in particular. By its very nature, a proxy is always a proxy for something in particular.
Well here I think I disagree with Steve, if "something" is to be understood as "some thing". My view on this has always been that, for all practical reasons, it's the first act of "proxyfication" which brings the subject into existence, as a subject of conversation. But this debate is not really important, and we can proceed from here to the notion of subjects as blank nodes, with or without agreement on the separate existence of subjects. The original point of the debate set by Alistair and Dan was to know how to express in an efficient and meaningful way the fact that two or more representations in different schemes are somehow proxies for the same subject. My point was that this could be captured by something quite similar to an RDF blank node, let's call it a Subject Blank Node, bearing no absolute identity, and whose only properties would be : represented this way here, and that way there.
The way I see it, a Subject Blank Node would have no logical property per se. It would not be part of any representation scheme, but would provide a hub between various schemes. Such a hub would allow applications able to make sense of several representation schemes, each with their specific structure and logical rules, to aggregate information from different schemes, such as a Topic in a TM application, a class in an OWL ontology, a concept in a SKOS scheme, a category in dmoz, a page in Wikipedia, a term in Wordnet, or a picture by Van Gogh, or a Nocturne of Chopin.
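A minimal sketch of such a hub, under assumptions: the `SubjectBlankNode` class, the `aggregate` function, the scheme names and the sample statements are all invented for the illustration, and no existing Topic Maps or RDF library is implied.

```python
import itertools

_labels = itertools.count(1)

class SubjectBlankNode:
    """A hub with no logical properties of its own, only bindings to representations."""

    def __init__(self):
        # A purely local label, as for an RDF blank node: no absolute identity.
        self.label = f"_:s{next(_labels)}"
        self.bindings = {}  # scheme name -> identifier inside that scheme

    def bind(self, scheme, identifier):
        """Record: 'represented this way here'."""
        self.bindings[scheme] = identifier

def aggregate(node, sources):
    """Gather what each scheme says about its own representation of the subject."""
    gathered = {}
    for scheme, identifier in node.bindings.items():
        gathered[scheme] = sources.get(scheme, {}).get(identifier, {})
    return gathered

# Made-up content of two schemes; a third scheme is bound but not loaded.
sources = {
    "skos": {"ex:airPollution": {"prefLabel": "Air pollution"}},
    "owl": {"ex:AirPollution": {"subClassOf": "ex:Pollution"}},
}

hub = SubjectBlankNode()
hub.bind("skos", "ex:airPollution")
hub.bind("owl", "ex:AirPollution")
hub.bind("wikipedia", "Air_pollution")

view = aggregate(hub, sources)
print(view)
```

The statements stay inside their own scheme, each with its own rules; the hub only says which representations stand for the same subject.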
2005-06-15
Ontology Mapping, Ineffable Subjects and Blank Nodes
2005-06-14
Maybe ontologies aren't overrated after all...
A comment to that post points to here, a page which enumerates things going on either for, around, or inspired by del.icio.us. One such link points to sid.vicio.us where a list of OWL ontologies exists, one of which is liberal.owl. rdf:about and rdf:resource attributes are links into del.icio.us content.
2005-06-13
Bloom filters
The Bloom filter, conceived by Burton H. Bloom, is a space-efficient probabilistic data structure that is used to test whether or not an element is a member of a set. False positives are possible, but false negatives are not. Elements can be added to the set, but not removed (though this can be addressed with a counting filter). The more elements that are added to the set, the larger the probability of false positives.
A good list of papers about applications at the end of the article, including P2P networks. Certainly could be applied to automatic aggregation of Topic Maps too. To be compared with methods of Subject Identity Measure already mentioned.
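For readers who want to see the mechanics, here is a minimal Bloom filter sketch. The sizes, the use of SHA-256 with a salt per hash function, and the example subject identifier are choices made for the illustration, not taken from the article.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: add() and a membership test that may give false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, kept simple for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting one cryptographic hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly in the set"; False means "definitely not in the set".
        return all(self.bits[pos] for pos in self._positions(item))

# Hypothetical use: remember which subject identifiers a topic map already holds.
seen = BloomFilter()
seen.add("http://example.org/subject/air-pollution")
print(seen.might_contain("http://example.org/subject/air-pollution"))
```

Two maps could exchange such filters before merging, and compare full identifiers only for the topics that test positive.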
2005-06-10
Stumble Upon
Are "subject locators" bogus?
The early Web architecture defined URI as document identifiers. Authors were instructed to define identifiers in terms of a document's location on the network. Web protocols could then be used to retrieve that document. However, this definition proved to be unsatisfactory for a number of reasons ...
I tend more and more to agree with Patrick that the distinction Topic Maps make between "subject identifiers" and "subject locators", IOW between "subject indicator references" and "resource references", is certainly something to revisit. More on the thread ...
2005-06-07
Planet Identity
Planet Identity is an aggregation of public weblogs related to Identity Management. The opinions expressed in those weblogs and hence this aggregation are those of the original authors.
If you're not sure whether something is ambiguous, it is
2005-06-06
Semantics@ IVOA Forum
2005-06-02
Semantic Elephant
2005-05-25
Open ID
An OpenID identity is just a URL. You can have multiple identities in the same way you can have multiple URLs. All OpenID does is provide a way to prove that you own a URL (identity). And it does this without passing around your password, your email address, or anything you don't want it to. There's no profile exchange component at all: your profile is your identity URL, but recipients of your identity can then learn more about you from any public, semantically interesting documents linked thereunder (FOAF, RSS, Atom, vCARD, etc.).
2005-05-24
To Tag or Not to Tag, That Is the Question
... and so on.
Enter yet another more baffling attempt at tagging. This one is fascinating since it's been gussied up with a new name, and for some unknown reason been given the blessing of a bunch of brain-dead bloggers. This is because a few of the favorite sites that the bloggers love have tacitly approved of the so-called—get this—"folksonomy tags." Oh, a new term! This one is a laugh riot, since there is nothing new here except the new name: Folksonomy. I mean even in HTML there was the "metatag."
No, no. This is different because, uh well, uh, lemme think. It just is!
The current fave sites amongst the cognoscenti have adopted the idea of public tags, and a number of influential bloggers have jumped on board pumping up the concept and re-promoting that old rusty saw, "the semantic Web." The semantic Web is a dead duck, let me assure you.
Piggy Bank
Piggy Bank is an extension to the Firefox web browser that turns it into a “Semantic Web browser”, letting you make use of existing information on the Web in more useful and flexible ways.
Not tried it yet, but sounds interesting. Based on so-called folksologies, as explained in the accompanying blog, Stefano's Lynotype.
2005-05-23
Identification in Big Science
An IVOA Identifier is a globally unique name for a resource. This name can be used to retrieve a unique description of the resource from an IVOA-compliant registry. This document describes the syntax for IVOA identifiers as well as how they are created. An IVOA identifier has two separable components that can appear in two equivalent formats: an XML-tagged form and a URI-compliant form. The syntax has been defined to encourage global-uniqueness naturally and to maximize the freedom of resource providers to control the character content of an identifier.
This specification addresses the need for a standardized naming schema for biological entities in the Life Sciences domains, the need for a service assigning unique identifiers complying with such naming schema, and the need for a resolving service that specifies how to retrieve the entities identified by such naming schema from repositories.
Social Bookmarking Tools
Because, to paraphrase a pop music lyric from a certain rock and roll band of yesterday, "the Web is old, the Web is new, the Web is all, the Web is you", it seems like we might have to face up to some of these stark realities [n1]. With the introduction of new social software applications such as blogs, wikis, newsfeeds, social networks, and bookmarking tools (the subject of this paper), the claim that Shelley Powers makes in a Burningbird blog entry [1] seems apposite: "This is the user's web now, which means it's my web and I can make the rules." Reinvention is revolution – it brings us always back to beginnings.
Jon Udell; delicious; language evolution
Consider this: language is the longest-running open source project on this planet.
2005-05-18
Ontology is Overrated: Categories, Links, and Tags
What I get from this very clear and intelligent paper is the notion that, in the open Web, efficient semantics are likely to emerge from free tagging, more efficient indeed than those built in pre-defined well-thought ontologies. It goes with the experience of my few past years of development of ontologies and constrained topic maps : very efficient for intranet and corporate environments, they will give poor results on the Web at large.
2005-04-29
Source Codes for Subjects
The Library of Congress' Network Development and MARC Standards Office, with interested experts, has developed the Metadata Authority Description Schema (MADS), an XML schema for an authority element set that may be used to provide metadata about agents (people, organizations), events, and terms (topics, geographics, genres, etc.).
MADS points to the Source Codes for Subjects (link below the title here). Some of the codes are:
aass
"Asian American Studies Library subject headings" in A Guide for establishing Asian American core collections. (Berkeley, CA: Asian American Studies Library, University of California, Berkeley)
aat
AAT: Art & architecture thesaurus (New York, NY: Oxford University Press)
abne
Autoridades de la Biblioteca Nacional de España (Madrid: Biblioteca Nacional de España)
agrifors
AGRIFOREST-sanasto (Helsinki: Helsingin Yliopisto)
agrovoc
AGROVOC multilingual agricultural thesaurus. (Rome: APIMONDIA)
2005-04-13
Identity as a "Pattern of Information"
"According to Greek legend, Poseidon's son Theseus sailed to Crete to slay the monster Minotaur. After his triumphant return to Athens, his ship was preserved as a memorial. As the vessel aged, decaying planks were replaced with new ones; eventually, all the original timber was replaced. Philosophers know the story of Theseus's ship as a classic example of the problem of identity. What was the true identity of the ship, the shape or the wood? A more contemporary example may be found in the form of my first car, a 1966 Ford Mustang with a 289-cubic-inch engine and a speedometer that pegged at 140 m.p.h. As a young man high in testosterone but low in self-control, by the time I sold the car 15 years later there was hardly an original part on it. Nevertheless, my '1966' Mustang was now considered a classic, and I netted a tidy profit. Like Theseus's ship, its essence, its Mustangness, was intact. The analogy holds for human identity. The atoms in my brain and body today are not the same ones I had when I was born. Nevertheless, the patterns of information coded in my DNA and in my neural memories are still those of Michael Shermer. The human essence, the soul, is more than a pile of parts; it is a pattern of information." – Michael Shermer
The idea of a pattern language has been revisited once more by Christopher Alexander in his fourth book, The Luminous Ground: The Nature of Order, which is described as presenting "a new cosmology that arises from the careful study of architecture and art, and above all from the practice of the arts. It is a cosmology which places the I, our experience of self, as the linking stem that unites each individual with the whole, connecting consciousness and matter," and suggests that it is human interpretation (a necessarily contextualized process) that provides us with a sense of identity, not anything inherent within the ever-changing cosmos.
2005-03-05
Situation and Identity
This paper examines the notions of situation and identity on the semantic web. The authors define how identity and situation apply to the semantic web, and present methods for using Inverse Functional Properties to utilise these definitions. We present the notion of a Composite Inverse Functional Property in order to exploit the structure of data for identification, and show how these can be used to apply context specific identification.
That, to me, sounds like conferred identity, using terms from the RM. It also reminds me of the seeing as post I did here much earlier. I'm not making any value judgements on the content of the paper; rather, I think it to be grist for a lot of group think.
2005-02-28
New deliverables
The problem of "subject identity" has recently been recognized as more difficult than previously thought by proponents of the Semantic Web.
This Annex also mentions several interesting papers already mentioned here in various previous posts. Meanwhile, the SWBP RDFTM Task Force has delivered a rich Survey of Interoperability Proposals with a quite exhaustive presentation of the identity issue.
I've been playing lately with SWRL, wondering how it could be used to express subject identification rules. We have poked around the notions of context and protocol of identification enough lately to think about moving from those general, qualitative considerations to something more effective, such as a "Subject Identification Rule Language" able to capture complex identification rules, including declared or computed properties of subjects as well as context elements.
2005-02-16
Categorization before identification
"There are two main processing stages in object recognition: categorization and identification, with identification following categorization," the authors wrote. "Overall, these findings provide important constraints for theories of object recognition."
2005-02-12
Is Identity Contextualized?
My blog entry was mostly about grass roots or informal ontologies, which I think will succeed where the "Semantic Web" will fail, not so much in delivering the goods to its paying benefactors (such as DARPA and other large government and corporate entities), but in actually having any real impact on the Web as used by the Rest of Us.
The Web community has always developed its own technologies, almost in spite of the W3C, and this is only reinforced by the open source movement. There's no particular reason why RSS or Atom or other new Web technologies need to be based in RDF, it's just a convenient (cough) XML graph syntax. GXL or XTM would do just as well, maybe even better. I've long believed people's enthusiasm for RDF is simply a misplaced enthusiasm over graphs. To those long bound to hierarchies and tree structures, graphs seem very cool, like the Che Guevara of mathematical structures. They're more like the way of the world inside and outside our heads. Some people get very passionate about such things. Others like to watch golf on TV, so go figure?
Anyway, apart from my normal ranting I closed with mention of two issues near and dear to my own research: identity and context. I note that univers immedia has a reference to Chris Welty and Nicola Guarino, both of whom have done some excellent work on the former. Patrick Brézillon has for a number of years been leading conferences that focus on the latter, and maintains a web page about his work on context. While much of the Semantic Web stuff I find almost nonsensical in its almost complete absence of issues of epistemology, identity and context, these guys have been doing some very important work for many years. I don't think we could overestimate the importance of Brézillon's conferences in pushing the issue of context into the mainstream.
One of the things that I'm pretty convinced of is that everything is contextualized, even identity (I won't quote the first two chapters of the Tao Te Ching). So where in the Topic Maps models we always talked about the notion of some kind of fixed identity point around which we hung Topic characteristics, if that identity is itself fluid (i.e., contextualized by any of a myriad of factors, human and not), it doesn't exactly break the model, but it makes it a lot more complex, perhaps more capable of modeling real life. For those of you who speak XTM natively, we'd just need to add an optional <scope> element to the content model of <subjectIdentity>. But there's probably a way to do this without mucking with the XTM syntax.
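To make the idea concrete, here is a minimal Python sketch of what a scoped identity could look like. It assumes the XTM 1.0 namespace and element names; the `<scope>` child inside `<subjectIdentity>` is the hypothetical extension discussed above and is not valid XTM 1.0. The function name, the example PSI and the topic id are made up for illustration.

```python
import xml.etree.ElementTree as ET

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def scoped_subject_identity(indicator_uri, scope_topic_ids=()):
    """Build a <subjectIdentity> element with an optional <scope> child.

    The <scope> child is a hypothetical extension: XTM 1.0 does not
    allow it inside <subjectIdentity>.
    """
    identity = ET.Element(f"{{{XTM_NS}}}subjectIdentity")
    if scope_topic_ids:
        # Hypothetical: the contexts in which this identity holds.
        scope = ET.SubElement(identity, f"{{{XTM_NS}}}scope")
        for topic_id in scope_topic_ids:
            ET.SubElement(
                scope,
                f"{{{XTM_NS}}}topicRef",
                {f"{{{XLINK_NS}}}href": f"#{topic_id}"},
            )
    # Standard XTM 1.0: point at the subject indicator.
    ET.SubElement(
        identity,
        f"{{{XTM_NS}}}subjectIndicatorRef",
        {f"{{{XLINK_NS}}}href": indicator_uri},
    )
    return identity


identity = scoped_subject_identity(
    "http://example.org/psi/hamlet", ["library-context"]
)
print(ET.tostring(identity, encoding="unicode"))
```

An application that ignores the extra element still sees an ordinary subject indicator, which is one way of not "mucking with" existing processors too much.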
I've been digging around in the philosophical/epistemological literature (e.g., [1], [2/3], [4/5], [6]), trying to find that Copernicus-in-the-bathtub experience (no, not that one, the other one) on how identity and context mesh. It seems sometimes the more I dig the more complicated the issue becomes, and unfortunately my research domain isn't theoretically in philosophy (at least that's what my advisors keep advising me — they hope I'll actually finish my dissertation one day). The pile of books keeps getting higher.
The Penumbra said to the Umbra, "At one moment you move: at another you are at rest. At one moment you sit down: at another you get up. Why this instability of purpose?"
"Perhaps I depend," replied the Umbra, "upon something which causes me to do as I do; and perhaps that something depends in turn upon something else which causes it to do as it does. Or perhaps my dependence is like (the unconscious movements) of a snake's scales or of a cicada's wings. How can I tell why I do one thing, or why I do not do another?" -- Chuang Tse, (trans. Lin Yutang)
which kinda sums up my own experience lately...
2005-02-10
The Concept of Subject in a Semiotic Light
One of the key functions of library and information services is to provide access to information based on users' requests for knowledge. Knowledge can be stored in a wide range of information bearing objects such as text, image, sound, multimedia, and as technology develops more people gain access to the objects, through different media. We will here analyze the processes and problems associated with determining the subject matter of an information bearing object.
2005-02-09
A Formal Ontology of Properties
Clean and clear introduction to difficult issues, very formal but at the same time providing a very practical framework for sound ontology engineering.
2005-02-08
Identity, Context, and everything...
Dennis E. Hamilton:
There's a book called "Metaphor and Reality" by Philip Wheelwright that has a tangential bearing on this topic. The phrase that sticks in my mind is this: "machines have contexts, people have perspectives."
and Murray Altheim:
I'm not sure if you're aware, but there's a whole subdomain within the Knowledge Representation/Structures community devoted to issues of context, which seems to be headed up by Patrick Brezillon.
Cybernetics and Conversation
The best PSI is the one that is most likely to change its content because it is maintained at the core of the community questioning the subject, and most subjects are moving targets.
The piece attempts to capture, in every-day language, the breadth of Conversation Theory as purveyed by Gordon Pask. Although it was not explicit in the publication, a sub-title could be, "Conversation Theory in Two Pages."
2005-02-03
Wikipedia URLs as Subject Codes
Over in my aviation weblog, I find myself more and more linking to Wikipedia whenever I’m discussing a concept, person, place, or anything else that doesn’t have its own, canonical home page. If, as I suspect, lots of other bloggers are doing the same, then links to Wikipedia articles may soon be the blogsphere’s answer to subject codes.
2005-01-18
Situation and Identity
"A Generalisation of Inverse Functional Properties" does not say much to people unaware of the arcane details of OWL terminology. "Inverse Functional Properties" would be better named "Identifying Properties", since, according to OWL semantics, two individuals sharing the same value for such a property must be considered the same one. The abstract says:
This paper examines the notions of situation and identity on the semantic web. The authors define how identity and situation apply to the semantic web, and present methods for using Inverse Functional Properties to utilise these definitions. We present the notion of a Composite Inverse Functional Property in order to exploit the structure of data for identification, and show how these can be used to apply context specific identification.
Context specific identification is really the concept to highlight here.
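A rough Python sketch of how this could work, under my own reading of the abstract and not the authors' method: a single identifying property merges records that share its value, while a composite one only merges records that agree on every property in the combination. The record data, the property names and the `co_identify` helper are all made up for illustration, and the sketch skips the transitive merging a real OWL reasoner would perform.

```python
from collections import defaultdict


def co_identify(records, key_properties):
    """Group record ids that share the same values for all key_properties.

    With one property this mimics an inverse functional property; with
    several it sketches a composite one. Records missing any of the key
    properties are never merged.
    """
    groups = defaultdict(list)
    for record_id, props in records.items():
        if all(p in props for p in key_properties):
            key = tuple(props[p] for p in key_properties)
            groups[key].append(record_id)
    # Keep only the groups where at least two records were co-identified.
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


# Hypothetical descriptions of people gathered from different sources.
records = {
    "a": {"mbox": "mailto:jane@example.org", "name": "Jane Doe", "employer": "Acme"},
    "b": {"mbox": "mailto:jane@example.org", "name": "J. Doe", "employer": "Acme"},
    "c": {"name": "Jane Doe", "employer": "Acme"},
    "d": {"name": "Jane Doe", "employer": "Initech"},
}

same_by_mbox = co_identify(records, ["mbox"])
same_by_name_and_employer = co_identify(records, ["name", "employer"])
print(same_by_mbox)               # [['a', 'b']]
print(same_by_name_and_employer)  # [['a', 'c']]
```

The two calls give different answers for the same data, which is the point: which descriptions count as "the same person" depends on the identification rule chosen for the context.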
Object Co-identification on the Semantic Web
How many "person" concepts in the Semantic Web?
But there are about 400 other SW resources, classes and properties using the same name. Food for thought ... Why so many already? What about re-usability? What about aggregation of data and federation of knowledge? (see previous post) How many of those resources declare equivalence with other ones? And how many are actually used?
2005-01-13
Semantic Web Ontologies: What Works and What Doesn't
Glossary of terms relating to thesauri and other forms of structured vocabulary for information retrieval
Confusion often arises when different people use terms to mean different things. This list is based on definitions drawn up by a group of four consultants who specialise in the development and use of thesauri and other forms of structured vocabulary for information retrieval, Stella Dextre Clarke, Alan Gilchrist, Ron Davies and Leonard Will. The authors do not claim that these definitions are "correct" and that other meanings are "wrong", but recommend these definitions as being a consistent and well-defined set which will aid communication by encouraging everyone to use the same words with the same meaning.
2005-01-07
a) It is distributed and gives personal control over access
b) It is available now
This should be compared to the Identity Commons I-Names that I mentioned here.
2005-01-06
Identity Management Webcast
In this webcast you will learn how identity management is helping organizations to meet their key business initiatives by driving new online revenue opportunities, enabling a company to securely extend business beyond its four walls, and helping corporations to mitigate risk while complying with new regulations such as Sarbanes-Oxley.
Topics to be Covered:
> Identity Management defined and the stages of a successful deployment
> The business drivers and benefits of Identity Management
> Provisioning and Access Management technologies
> Selecting the right Identity Management vendor
> Do's and Don'ts - Deploying an Identity Management solution