Connor R., Dearle A., Vadicamo L.
Exclusion power, Metric indexing, Metric search, Similarity search
It is generally understood that, as dimensionality increases, the minimum cost of a metric query tends from O(log n) to O(n) in both space and time, where n is the size of the data set. With low dimensionality, the former is easy to achieve; with very high dimensionality, the latter is inevitable. We previously described BitPart as a novel mechanism suitable for performing exact metric search in "high(er)" dimensions. The essential tradeoff of BitPart is that its space cost is linear with respect to the size of the data, but the actual space required for each object may be as small as log₂ n bits, which allows even very large data sets to be queried using only main memory. Potentially, the time cost still scales with O(log n). Together, these attributes give exact search that outperforms indexing structures if dimensionality is within a certain range. In this article, we reiterate the design of BitPart in this context. The novel contribution is an in-depth examination of what the notion of "high(er)" means in practical terms. To do this, we introduce the notion of exclusion power and show its application to some generated data sets across different dimensions.
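The idea of storing a few bits per object and excluding data through binary partitions can be illustrated with a minimal sketch. This is not the authors' implementation: the ball-partition rule, the integer-bitmap representation, and all names (`build_bitmaps`, `range_search`, `pivots`, `radii`) are assumptions made for illustration. Each partition contributes one bit per object, and the triangle inequality decides which side of a partition can be excluded for a given query.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_bitmaps(data, pivots, radii, dist=euclidean):
    """Build one bitmap per ball partition (hypothetical layout).

    Bit i of a bitmap is set when data[i] lies inside the ball
    centred on the pivot with the given radius.
    """
    bitmaps = []
    for p, r in zip(pivots, radii):
        bits = 0
        for i, x in enumerate(data):
            if dist(x, p) <= r:
                bits |= 1 << i
        bitmaps.append(bits)
    return bitmaps


def range_search(query, t, data, pivots, radii, bitmaps, dist=euclidean):
    """Return indices of all objects within distance t of the query.

    Partitions that cannot exclude anything are skipped; the rest are
    combined with bitwise AND, and surviving candidates are verified
    against the original data so the result is exact.
    """
    all_bits = (1 << len(data)) - 1
    candidates = all_bits
    for p, r, inside in zip(pivots, radii, bitmaps):
        d = dist(query, p)
        if d + t <= r:
            # Every solution must lie inside the ball: drop the outside.
            candidates &= inside
        elif d - t > r:
            # Every solution must lie outside the ball: drop the inside.
            candidates &= all_bits & ~inside
        # Otherwise the query ball straddles the boundary: no exclusion.
    return [
        i for i, x in enumerate(data)
        if (candidates >> i) & 1 and dist(query, x) <= t
    ]
```

In this sketch, a partition is useful only when the query ball falls wholly on one side of its boundary. How often that happens, and how much data it removes, is the kind of quantity the notion of exclusion power is meant to capture.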
Source: SEBD 2022 - 30th Italian Symposium on Advanced Database Systems, pp. 415–426, Tirrenia (PI), Italia, 19-22/06/2022
@inproceedings{oai:it.cnr:prodotti:472059, title = {Investigating binary partition power in metric query}, author = {Connor R. and Dearle A. and Vadicamo L.}, booktitle = {SEBD 2022 - 30th Italian Symposium on Advanced Database Systems, pp. 415–426, Tirrenia (PI), Italia, 19-22/06/2022}, year = {2022} }