2007
Doctoral thesis  Open Access

A Content-Addressable Network for Similarity Search in Metric Spaces

Falchi F.

metric spaces  MCAN  Grid  distributed index  dht  content-addressable network  CAN  similarity search  SDDS  Peer-to-Peer  p2p 

Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for contentbased retrieval in huge and ever growing collections of data produced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types is fast going towards creating a database utilized by people, the computer systems must be able to model human fundamental reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recognition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying the Peer-to-Peer communication paradigms, we build a structured network of computers able to process similarity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer-to-Peer system for similarity search in metric spaces called Metric Content-Addressable Network (MCAN) which is an extension of the well known Content-Addressable Network (CAN) used for hash lookup. A prototype implementation of MCAN was tested on real-life datasets of image features, protein symbols, and text -- observed results are reported. We also compared the performance of MCAN with three other, recently proposed, distributed data structures for similarity search in metric spaces.



Back to previous page
BibTeX entry
@phdthesis{oai:it.cnr:prodotti:425172,
	title = {A Content-Addressable Network for Similarity Search in Metric Spaces},
	author = {Falchi F.},
	year = {2007}
}
CNR ExploRA

Bibliographic record

ISTI Repository

Deposited version Open Access

Also available from

etd.adm.unipi.itOpen Access

SAPIR
Search on Audio-visual content using peer-to-peer Information Retrieval