2010
Report  Unknown

Use of permutation prefixes for efficient and scalable approximate similarity search

Esuli A.

Content Analysis and Indexing; Information Search and Retrieval; Approximate similarity search; Metric spaces; Scalability

We present the Permutation Prefix Index (PP-Index), an index data structure that supports efficient approximate similarity search. The PP-Index belongs to the family of permutation-based indexes, which represent each indexed object by "its view of the surrounding world", i.e., the list of the elements of a set of reference objects, sorted by their distance from the indexed object. In its basic formulation, the PP-Index is strongly biased toward efficiency. We show how effectiveness can easily reach optimal levels by adopting two "boosting" strategies, multiple index search and multiple query search, both of which parallelize well. We study both the efficiency and the effectiveness of the PP-Index, experimenting with collections of up to one hundred million objects represented in a very high-dimensional similarity space.
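
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the function names, the use of Euclidean distance, and the flat dictionary that groups objects by identical prefix are all assumptions made for illustration. The sketch computes, for each object, the prefix of its permutation of reference objects (nearest first), retrieves the objects that share the query's prefix as candidates, and reranks them by true distance. The last function shows one plausible reading of "multiple index search": candidates from several indexes, each built on a different reference set, are merged before reranking.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Assumed distance function; any metric could be plugged in instead."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def permutation_prefix(obj, references, length, dist=euclidean):
    """Positions of the `length` reference objects nearest to `obj`, nearest first."""
    order = sorted(range(len(references)), key=lambda i: dist(obj, references[i]))
    return tuple(order[:length])


def build_index(objects, references, length, dist=euclidean):
    """Group object ids by permutation prefix (a flat stand-in for the real index)."""
    index = defaultdict(list)
    for obj_id, obj in enumerate(objects):
        index[permutation_prefix(obj, references, length, dist)].append(obj_id)
    return index


def search(query, objects, references, index, length, k, dist=euclidean):
    """Approximate k-NN: candidates share the query's prefix, then are reranked."""
    prefix = permutation_prefix(query, references, length, dist)
    candidates = index.get(prefix, [])
    return sorted(candidates, key=lambda i: dist(query, objects[i]))[:k]


def multi_index_search(query, objects, indexes, length, k, dist=euclidean):
    """Merge candidates from several (references, index) pairs, then rerank.

    Hypothetical illustration of the 'multiple index search' boosting strategy.
    """
    candidates = set()
    for references, index in indexes:
        prefix = permutation_prefix(query, references, length, dist)
        candidates.update(index.get(prefix, []))
    return sorted(candidates, key=lambda i: dist(query, objects[i]))[:k]
```

In this sketch a longer prefix gives smaller candidate sets (faster, but more likely to miss true neighbours), while merging candidates from several indexes trades extra work for better recall. Because each index is queried independently, the loop in `multi_index_search` could be run in parallel.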

Source: ISTI Technical reports, 2010



BibTeX entry
@techreport{oai:it.cnr:prodotti:161210,
	title = {Use of permutation prefixes for efficient and scalable approximate similarity search},
	author = {Esuli, A.},
	institution = {ISTI Technical reports, 2010},
	year = {2010}
}