That is, every thing is a sign. The first and main function of any language is to allow division of the world into "this" and "not this", based on some interpretation of data received from the world. Such an interpretation of data as signs is the basic form of semiosis, a process performed for quite a while by humans, and for many more ages by animals before them. It can now be performed by machines or information systems (roughly, computers connected to data acquisition devices). The aspects of this process can be defined as follows.
- SALIENCE : Capacity to separate out as meaningful (significant or salient) a certain data set from the continuous data flow we get from the world through our perceptive experience, be it direct, through our biological senses, or indirect, through one or several levels of mediation such as reading data gathered by instrumental devices, compiling such data over time, or reading texts that interpret those data.
- SIGNIFICATION : Capacity to consider the salient data set as a signifier conveying a particular meaning (signified), based on some characteristics such as spatial connectivity, permanence in time, regularity of patterns, similarity with other data sets previously interpreted and stored as signs, or anything the interpreter sees fit by its own rules and general view of the world. The core and essential meaning assigned is generally permanence, that is, the existence of a "thing" underlying the "sign". The thing is the signified associated with the signifier, which is the data set.
- REPRESENTATION : Capacity to translate this sign/thing (both signifier and signified) into some proxy in a representation language allowing storage and retrieval for further use. Typical forms of representation include assignment of identifiers (symbols, icons, names, code numbers), description of the signified, and its connection to pre-existing ones through classification, typing, or any other kind of association or linking.
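For a machine interpreter, the three aspects above can be sketched as a tiny pipeline. This is only an illustrative toy, not a reference implementation: the threshold rule for salience, the "lasts long enough" rule for signification, and all names (`Sign`, `salience`, `signification`, `representation`, the `thing-N` identifiers) are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Sign:
    identifier: str   # representation: the proxy name assigned to the sign
    signifier: list   # the salient data set itself
    signified: str    # the "thing" assumed to underlie the data


def salience(stream, threshold):
    """Cut the continuous flow into runs of consecutive values above a threshold.

    Assumption: 'salient' here simply means 'stands out from the background'.
    """
    runs, current = [], []
    for value in stream:
        if value > threshold:
            current.append(value)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def signification(run, min_length):
    """Read a run as the sign of a persistent 'thing' only if it lasts long enough.

    Assumption: permanence in time is the interpreter's sole rule.
    """
    return "thing" if len(run) >= min_length else None


def representation(runs, min_length):
    """Store each accepted sign under a fresh identifier for later retrieval."""
    store = {}
    for run in runs:
        signified = signification(run, min_length)
        if signified is not None:
            identifier = f"thing-{len(store) + 1}"
            store[identifier] = Sign(identifier, run, signified)
    return store


stream = [0.1, 0.9, 0.8, 0.95, 0.2, 0.7, 0.1, 0.6, 0.9]
runs = salience(stream, threshold=0.5)
signs = representation(runs, min_length=2)
```

With this input, three runs are salient, but the single-value run is too short-lived to be read as a thing, so only two signs end up stored. A different interpreter, with different rules, would divide the same flow into different things.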
The above analysis can serve as the basis for a general semiosis framework applicable to natural languages (human or otherwise), formal languages used in our information systems, and scientific languages (theories in physics, biology). This framework, while remaining agnostic at the metaphysical level about the ontological status of things, will hopefully help to provide a solid theoretical foundation for the emerging semiosphere, the network of human knowledge, languages, and information systems.
For use of this approach in the Semantic Web area, see a first-cut ontology here.
[Note 2013-02-05] : This post has been for years, and still is, the most viewed on this blog, and I really don't understand why. Passer-by, if you care to tell me how you came here, please comment below. Thanks!