2017
Conference article  Open Access

High-dimensional simplexes for supermetric search

Connor R., Vadicamo L., Rabitti F.

Dimensionality reduction  Information Retrieval (cs.IR)  Computer Science - Information Retrieval  FOS: Computer and information sciences  Metric search  Metric embedding  Supermetric space  H.3.3 

In a metric space, triangle inequality implies that, for any three objects, a triangle with edge lengths corresponding to their pairwise distances can be formed. The n-point property is a generalisation of this where, for any (n+1) objects in the space, there exists an n-dimensional simplex whose edge lengths correspond to the distances among the objects. In general, metric spaces do not have this property; however in 1953, Blumenthal showed that any semi-metric space which is isometrically embeddable in a Hilbert space also has the n-point property. We have previously called such spaces supermetric spaces, and have shown that many metric spaces are also supermetric, including Euclidean, Cosine, Jensen-Shannon and Triangular spaces of any dimension. Here we show how such simplexes can be constructed from only their edge lengths, and we show how the geometry of the simplexes can be used to determine lower and upper bounds on unknown distances within the original space. By increasing the number of dimensions, these bounds converge to the true distance. Finally we show that for any Hilbert-embeddable space, it is possible to construct Euclidean spaces of arbitrary dimensions, from which these lower and upper bounds of the original space can be determined. These spaces may be much cheaper to query than the original. For similarity search, the engineering tradeoffs are good: we show significant reductions in data size and metric cost with little loss of accuracy, leading to a significant overall improvement in exact search performance.

Source: SISAP 2017 - Similarity Search and Applications. 10th International Conference, pp. 96–109, Munich, Germany, 4-6 October 2017


1. L. M. Blumenthal. A note on the four-point property. Bull. Amer. Math. Soc., 39(6):423{426, 1933.
2. L. M. Blumenthal. Theory and applications of distance geometry. Clarendon Press, 1953.
3. E. Chavez and G. Navarro. Metric databases. In Laura C. Rivero, Jorge Horacio Doorn, and Viviana E. Ferraggine, editors, Encyclopedia of Database Technologies and Applications, pages 366{371. Idea Group, 2005.
4. R. Connor, F. A. Cardillo, L. Vadicamo, and F. Rabitti. Hilbert Exclusion: Improved metric search through nite isometric embeddings. ACM Trans. Inform. Syst., 35(3):17:1{17:27, December 2016.
5. R. Connor, L. Vadicamo, F. A. Cardillo, and F. Rabitti. Supermetric Search with the Four-Point Property, pages 51{64. Springer International Publishing, 2016.
6. R. Connor, L. Vadicamo, F. A. Cardillo, and Fausto Rabitti. Supermetric search.
7. Richard Connor and Franco Alberto Cardillo. Quantifying the speci city of nearduplicate image classi cation functions. In VISAPP 2016, 2016.
8. M. A. A. Cox and Trevor F. Cox. Multidimensional Scaling, pages 315{347. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008.
9. V. De Silva and J. B. Tenenbaum. Sparse multidimensional scaling using landmark points. Technical report, 2004.
10. K. Figueroa, G. Navarro, and E. Chavez. Metric spaces library. Online http://www. sisap. org, 2007.
11. I. K. Fodor. A survey of dimension reduction techniques. Technical report, Center for Applied Scienti c Computing, Lawrence Livermore National Laboratory, 2002.
12. I. Jolli e. Principal Component Analysis. John Wiley & Sons, Ltd, 2014.
13. Rui Mao, Willard L. Miranker, and Daniel P. Miranker. Dimension reduction for distance-based indexing. In SISAP 2010, pages 25{32. ACM, 2010.
14. J. Matousek. Lectures on Discrete Geometry. Graduate Texts in Mathematics. Springer New York, 2013.
15. K. Menger. Untersuchungen ber allgemeine metrik. Math. Ann., 100:75{163, 1928.
16. M. L. Mico, J. Oncina, and E. Vidal. A new version of the nearest-neighbour approximating and eliminating search algorithm (aesa) with linear preprocessing time and memory requirements. Pattern Recogn. Lett., 15(1):9{17, January 1994.
17. W. A Wilson. A relation between metric and euclidean spaces. American J. Math., 54(3):505{517, 1932.
18. L. Yang. Distance metric learning: A comprehensive survey. 2006.
19. P. Zezula, G. Amato, V. Dohnal, and M. Batko. Similarity search: the metric space approach, volume 32 of Advances in Database Systems. Springer, 2006.

Metrics



Back to previous page
BibTeX entry
@inproceedings{oai:it.cnr:prodotti:378356,
	title = {High-dimensional simplexes for supermetric search},
	author = {Connor R. and Vadicamo L. and Rabitti F.},
	doi = {10.1007/978-3-319-68474-1_7 and 10.48550/arxiv.1707.08370},
	booktitle = {SISAP 2017 - Similarity Search and Applications. 10th International Conference, pp. 96–109, Munich, Germany, 4-6 October 2017},
	year = {2017}
}