2005-06-13

Bloom filters

From Wikipedia:
The Bloom filter, conceived by Burton H. Bloom, is a space-efficient probabilistic data structure that is used to test whether or not an element is a member of a set. False positives are possible, but false negatives are not. Elements can be added to the set, but not removed (though this can be addressed with a counting filter). The more elements that are added to the set, the larger the probability of false positives.
A good list of papers about applications at the end of the article, including P2P networks. Certainly could be applied to automatic aggregation of Topic Maps too. To be compared with methods of Subject Identity Measure already mentioned.