2021 Journal article Open Access

The VISIONE video search system: exploiting off-the-shelf text search engines for large-scale video retrieval
Amato G., Bolettieri P., Carrara F., Debole F., Falchi F., Gennaro C., Vadicamo L., Vairo C.
This paper describes in detail VISIONE, a video search system that allows users to search for videos using textual keywords, the occurrence of objects and their spatial relationships, the occurrence of colors and their spatial relationships, and image similarity. These modalities can be combined to express complex queries and meet users' needs. The peculiarity of our approach is that we encode all information extracted from the keyframes, such as visual deep features, tags, color and object locations, using a convenient textual encoding that is indexed in a single text retrieval engine. This offers great flexibility when results corresponding to various parts of the query (visual, text and locations) need to be merged. In addition, we report an extensive analysis of the retrieval performance of the system, using the query logs generated during the Video Browser Showdown (VBS) 2019 competition. This allowed us to fine-tune the system by choosing the optimal parameters and strategies from those we tested.
Source: Journal of Imaging 7 (2021). doi:10.3390/jimaging7050076
DOI: 10.3390/jimaging7050076

See at: ISTI Repository Open Access | CNR ExploRA Open Access | www.mdpi.com Open Access


2020 Conference article Open Access

Scalar Quantization-Based Text Encoding for Large Scale Image Retrieval
Amato G., Carrara F., Falchi F., Gennaro C., Rabitti F., Vadicamo L.
The great success of visual features learned from deep neural networks has led to a significant effort to develop efficient and scalable technologies for image retrieval. This paper presents an approach to transform neural network features into text codes suitable for being indexed by a standard full-text retrieval engine such as Elasticsearch. The basic idea is providing a transformation of neural network features with the twofold aim of promoting the sparsity without the need of unsupervised pre-training. We validate our approach on a recent convolutional neural network feature, namely Regional Maximum Activations of Convolutions (R-MAC), which is a state-of-the-art descriptor for image retrieval. An extensive experimental evaluation conducted on standard benchmarks shows the effectiveness and efficiency of the proposed approach and how it compares to state-of-the-art main-memory indexes.
Source: 28th Italian Symposium on Advanced Database Systems, pp. 258–265, Virtual (online) due to COVID-19, 21-24/06/2020

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access
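
The encoding idea in the abstract above can be illustrated with a minimal sketch. It assumes a simple surrogate-text scheme in which each feature component is mapped to a synthetic term that is repeated in proportion to its quantized value, so that the term frequencies seen by a full-text engine approximate the feature magnitudes. The function name `surrogate_text`, the `scale` factor, and the `f<i>` term naming are hypothetical choices for illustration; the transformation actually used in the paper may differ.

```python
import math

def surrogate_text(features, scale=10, prefix="f"):
    """Encode a feature vector as a space-separated string of repeated terms.

    Illustrative assumption: component i becomes the term '<prefix><i>',
    repeated floor(value * scale) times. Negative values are clipped to zero.
    """
    terms = []
    for i, value in enumerate(features):
        # Scalar quantization of one component into a repetition count
        count = max(0, math.floor(value * scale))
        terms.extend([f"{prefix}{i}"] * count)
    return " ".join(terms)

# Zero components produce no terms, so sparse vectors yield short documents
encoded = surrogate_text([0.25, 0.0, 0.5], scale=4)
print(encoded)
```

The resulting string could then be indexed like any other text field, with larger `scale` values trading index size for a finer approximation of the original feature values.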


2020 Journal article Open Access

Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection
Vadicamo L., Gennaro C., Falchi F., Chavez E., Connor R., Amato G.
Approximate Nearest Neighbor (ANN) search is a prevalent paradigm for searching intrinsically high dimensional objects in large-scale data sets. Recently, the permutation-based approach for ANN has attracted a lot of interest due to its versatility in being used in the more general class of metric spaces. In this approach, the entire database is ranked by a permutation distance to the query. Typically, permutations allow the efficient selection of a candidate set of results, but to achieve high recall or precision this set has to be reviewed using the original metric and data. This can lead to a sizeable percentage of the database being recalled, along with many expensive distance calculations. To reduce the number of metric computations and the number of database elements accessed, we propose here a re-ranking based on a local embedding using the nSimplex projection. The nSimplex projection produces Euclidean vectors from objects in metric spaces which possess the n-point property. The mapping is obtained from the distances to a set of reference objects, and the original metric can be lower bounded and upper bounded by the Euclidean distance of objects sharing the same set of references. Our approach is particularly advantageous for extensive databases or expensive metric functions. We reuse the distances computed in the permutations in the first stage, and hence the memory footprint of the index is not increased. An extensive experimental evaluation of our approach is presented, demonstrating excellent results even on a set of hundreds of millions of objects.
Source: Information systems (Oxf.) (2020). doi:10.1016/j.is.2020.101506
DOI: 10.1016/j.is.2020.101506
Project(s): AI4EU via OpenAIRE

See at: ISTI Repository Open Access | Information Systems Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2020 Contribution to book Open Access

Preface - SISAP 2020
Satoh S., Vadicamo L., Zimek A., Carrara F., Bartolini I., Aumüller M., Jónsson B. Þór, Pagh R.
Preface of Volume 12440 of LNCS (2020), pages v-vi, 13th International Conference on Similarity Search and Applications, SISAP 2020.
Source: Similarity Search and Applications, pp. v–vi. New York: Springer Science and Business Media, 2020
DOI: 10.1007/978-3-030-60936-8

See at: link.springer.com Open Access | CNR ExploRA Open Access | link.springer.com Restricted


2020 Report Open Access

AIMH research activities 2020
Aloia N., Amato G., Bartalesi V., Benedetti F., Bolettieri P., Carrara F., Casarosa V., Ciampi L., Concordia C., Corbara S., Esuli A., Falchi F., Gennaro C., Lagani G., Massoli F. V., Meghini C., Messina N., Metilli D., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Thanos C., Trupiano L., Vadicamo L., Vairo C.
Annual Report of the Artificial Intelligence for Media and Humanities laboratory (AIMH) research activities in 2020.

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2019 Conference article Open Access

Modelling string structure in vector spaces
Connor R., Dearle A., Vadicamo L.
Searching for similar strings is an important and frequent database task both in terms of human interactions and in absolute worldwide CPU utilisation. A wealth of metric functions for string comparison exists. However, with respect to the wide range of classification and other techniques known within vector spaces, such metrics allow only a very restricted range of techniques. To counter this restriction, various strategies have been used for mapping string spaces into vector spaces, approximating the string distances within the mapped space and therefore allowing vector space techniques to be used. In previous work we have developed a novel technique for mapping metric spaces into vector spaces, which can therefore be applied for this purpose. In this paper we evaluate this technique in the context of string spaces, and compare it to other published techniques for mapping strings to vectors. We use a publicly available English lexicon as our experimental data set, and test two different string metrics over it for each vector mapping. We find that our novel technique considerably outperforms previously used techniques in preserving the actual distance.
Source: 27th Italian Symposium on Advanced Database Systems, Castiglione della Pescaia, Italy, 16-19/06/2019

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2019 Conference article Open Access

Query Filtering with Low-Dimensional Local Embeddings
Chavez E., Connor R., Vadicamo L.
The concept of local pivoting is to partition a metric space so that each element in the space is associated with precisely one of a fixed set of reference objects or pivots. The idea is that each object of the data set is associated with the reference object that is best suited to filter that particular object if it is not relevant to a query, maximising the probability of excluding it from a search. The notion does not in itself lead to a scalable search mechanism, but instead gives a good chance of exclusion based on a tiny memory footprint and a fast calculation. It is therefore most useful in contexts where main memory is at a premium, or in conjunction with another, scalable, mechanism. In this paper we apply similar reasoning to metric spaces which possess the four-point property, which notably include Euclidean, Cosine, Triangular, Jensen-Shannon, and Quadratic Form. In this case, each element of the space can be associated with two reference objects, and a four-point lower-bound property is used instead of the simple triangle inequality. The probability of exclusion is strictly greater than with simple local pivoting; the space required per object and the calculation are again tiny in relative terms. We show that the resulting mechanism can be very effective. A consequence of using the four-point property is that, for m reference points, there are m(m-1)/2 pivot pairs to choose from, giving a very good chance of a good selection being available from a small number of distance calculations. Finding the best pair has a quadratic cost with the number of references; however, we provide experimental evidence that good heuristics exist. Finally, we show how the resulting mechanism can be integrated with a more scalable technique to provide a very significant performance improvement, for a very small overhead in build-time and memory cost.
Source: International Conference on Similarity Search and Applications, pp. 233–246, Newark, NJ, USA, 2-4/10/2019
DOI: 10.1007/978-3-030-32047-8_21

See at: dspace.stir.ac.uk Open Access | ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dspace.stir.ac.uk Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted | www.storre.stir.ac.uk Restricted
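
As background for the abstract above, the following minimal sketch shows the simple single-pivot exclusion test based on the triangle inequality, which is the baseline the paper improves upon. It is not the four-point, two-pivot bound proposed in the paper. The function name `can_exclude` and the sample points are hypothetical, and Euclidean distance is assumed only for illustration.

```python
import math

def can_exclude(d_query_pivot, d_object_pivot, threshold):
    """Return True if the triangle inequality proves d(query, object) > threshold.

    |d(q, p) - d(o, p)| is a lower bound on d(q, o), so when that bound already
    exceeds the query threshold the object cannot be a result.
    """
    return abs(d_query_pivot - d_object_pivot) > threshold

pivot = (10.0, 0.0)
query = (0.0, 0.0)
far_object = (10.0, 1.0)   # close to the pivot, far from the query
near_object = (1.0, 0.0)   # close to the query

# d(object, pivot) would be stored at build time; only d(query, pivot)
# has to be computed when the query arrives
d_qp = math.dist(query, pivot)
far_excluded = can_exclude(d_qp, math.dist(far_object, pivot), threshold=2.0)
near_excluded = can_exclude(d_qp, math.dist(near_object, pivot), threshold=2.0)
print(far_excluded, near_excluded)
```

Only one stored distance per object is needed for this test, which is why the memory footprint of local pivoting is so small.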


2019 Conference article Open Access

SPLX-Perm: A Novel Permutation-Based Representation for Approximate Metric Search
Vadicamo L., Connor R., Falchi F., Gennaro C., Rabitti F.
Many approaches for approximate metric search rely on a permutation-based representation of the original data objects. The main advantage of transforming metric objects into permutations is that the latter can be efficiently indexed and searched using data structures such as inverted files and prefix trees. Typically, the permutation is obtained by ordering the identifiers of a set of pivots according to their distances to the object to be represented. In this paper, we present a novel approach to transform metric objects into permutations. It uses the object-pivot distances in combination with a metric transformation, called n-Simplex projection. The resulting permutation-based representation, named SPLX-Perm, is suitable only for the large class of metric spaces satisfying the n-point property. We tested the proposed approach on two benchmarks for similarity search. Our preliminary results are encouraging and open new perspectives for further investigations on the use of the n-Simplex projection for supporting permutation-based indexing.
Source: International Conference on Similarity Search and Applications, pp. 40–48, Newark, NJ, USA, 2-4/10/2019
DOI: 10.1007/978-3-030-32047-8_4

See at: dspace.stir.ac.uk Open Access | ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dspace.stir.ac.uk Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted | www.storre.stir.ac.uk Restricted
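
The traditional permutation construction mentioned in the abstract above (ordering pivot identifiers by their distance to the object) can be sketched as follows. This shows only the conventional representation, not SPLX-Perm itself. The function name `pivot_permutation`, the sample pivots, and the use of Euclidean distance are assumptions made for illustration.

```python
import math

def pivot_permutation(obj, pivots, dist):
    """Return pivot identifiers (indices) ordered from closest to farthest."""
    # Python's sort is stable, so ties keep the lower pivot index first
    return sorted(range(len(pivots)), key=lambda i: dist(obj, pivots[i]))

pivots = [(0.0, 0.0), (5.0, 0.0), (2.0, 0.0)]
permutation = pivot_permutation((1.5, 0.0), pivots, math.dist)
print(permutation)
```

Objects that are close to each other tend to see the pivots in a similar order, which is what makes the permutation usable as a compact proxy for the original object.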


2019 Conference article Open Access

Metric Embedding into the Hamming Space with the n-Simplex Projection
Vadicamo L., Mic V., Falchi F., Zezula P.
Transformations of data objects into the Hamming space are often exploited to speed up similarity search in metric spaces. Techniques applicable in generic metric spaces require expensive learning, e.g., selection of pivoting objects. However, when searching in common Euclidean space, the best performance is usually achieved by transformations specifically designed for this space. We propose a novel transformation technique that provides a good trade-off between the applicability and the quality of the space approximation. It uses the n-Simplex projection to transform metric objects into a low-dimensional Euclidean space, and then transforms this space to the Hamming space. We compare our approach theoretically and experimentally with several techniques of the metric embedding into the Hamming space. We focus on the applicability, learning cost, and the quality of search space approximation.
Source: International Conference on Similarity Search and Applications, pp. 265–272, Newark, NJ, USA, 2-4/10/2019
DOI: 10.1007/978-3-030-32047-8_23

See at: Univerzitní repozitář Masarykovy univerzity Open Access | is.muni.cz Open Access | ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | is.muni.cz Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted | repozitar.cz Restricted


2019 Conference article Open Access

An Image Retrieval System for Video
Bolettieri P., Carrara F., Debole F., Falchi F., Gennaro C., Vadicamo L., Vairo C.
Since the 1970s, Content-Based Image Indexing and Retrieval (CBIR) has been an active research area. Nowadays, the rapid increase of video data has paved the way for advances in Content-Based Video Indexing and Retrieval (CBVIR) technologies across many different communities. However, greater attention needs to be devoted to the development of effective tools for video search and browsing. In this paper, we present Visione, a system for large-scale video retrieval. The system integrates several content-based analysis and retrieval modules, including a keyword search, a spatial object-based search, and a visual similarity search. From the tests carried out by users when they needed to find as many correct examples as possible, the similarity search proved to be the most promising option. Our implementation is based on state-of-the-art deep learning approaches for content analysis and leverages highly efficient indexing techniques to ensure scalability. Specifically, we encode all the visual and textual descriptors extracted from the videos into (surrogate) textual representations that are then efficiently indexed and searched using an off-the-shelf text search engine using similarity functions.
Source: International Conference on Similarity Search and Applications (SISAP), pp. 332–339, Newark, NJ, USA, 2-4/10/2019
DOI: 10.1007/978-3-030-32047-8_29

See at: ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted


2019 Software Unknown

VISIONE Content-Based Video Retrieval System, VBS 2019
Amato G., Bolettieri P., Carrara F., Debole F., Falchi F., Gennaro C., Vadicamo L., Vairo C.
VISIONE is a content-based video retrieval system that participated in VBS for the very first time in 2019. It is mainly based on state-of-the-art deep learning approaches for visual content analysis and exploits highly efficient indexing techniques to ensure scalability. The system supports query by scene tag, query by object location, query by color sketch, and visual similarity search.

See at: bilioso.isti.cnr.it | CNR ExploRA


2019 Journal article Open Access

Supermetric search
Connor R., Vadicamo L., Cardillo F. A., Rabitti F.
Metric search is concerned with the efficient evaluation of queries in metric spaces. In general, a large space of objects is arranged in such a way that, when a further object is presented as a query, those objects most similar to the query can be efficiently found. Most mechanisms rely upon the triangle inequality property of the metric governing the space. The triangle inequality property is equivalent to a finite embedding property, which states that any three points of the space can be isometrically embedded in two-dimensional Euclidean space. In this paper, we examine a class of semimetric space which is finitely four-embeddable in three-dimensional Euclidean space. In mathematics this property has been extensively studied and is generally known as the four-point property. All spaces with the four-point property are metric spaces, but they also have some stronger geometric guarantees. We coin the term supermetric space as, in terms of metric search, they are significantly more tractable. Supermetric spaces include all those governed by Euclidean, Cosine, Jensen-Shannon and Triangular distances, and are thus commonly used within many domains. In previous work we have given a generic mathematical basis for the supermetric property and shown how it can improve indexing performance for a given exact search structure. Here we present a full investigation into its use within a variety of different hyperplane partition indexing structures, and go on to show some more of its flexibility by examining a search structure whose partition and exclusion conditions are tailored, at each node, to suit the individual reference points and data set present there. Among the results given, we show a new best performance for exact search using a well-known benchmark.
Source: Information systems (Oxf.) 80 (2019): 108–123. doi:10.1016/j.is.2018.01.002
DOI: 10.1016/j.is.2018.01.002

See at: arXiv.org e-Print Archive Open Access | Information Systems Open Access | ISTI Repository Open Access | Strathprints Open Access | Information Systems Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2019 Conference article Open Access

VISIONE at VBS2019
Amato G., Bolettieri P., Carrara F., Debole F., Falchi F., Gennaro C., Vadicamo L., Vairo C.
This paper presents VISIONE, a tool for large-scale video search. The tool can be used for both known-item and ad-hoc video search tasks since it integrates several content-based analysis and retrieval modules, including a keyword search, a spatial object-based search, and a visual similarity search. Our implementation is based on state-of-the-art deep learning approaches for the content analysis and leverages highly efficient indexing techniques to ensure scalability. Specifically, we encode all the visual and textual descriptors extracted from the videos into (surrogate) textual representations that are then efficiently indexed and searched using an off-the-shelf text search engine.
Source: MMM 2019 - 25th International Conference on Multimedia Modeling, pp. 591–596, Thessaloniki, Greece, 08-11/01/2019
DOI: 10.1007/978-3-030-05716-9_51

See at: ISTI Repository Open Access | link.springer.com Restricted | CNR ExploRA Restricted


2019 Conference article Open Access

Intelligenza Artificiale, Retrieval e Beni Culturali [Artificial Intelligence, Retrieval and Cultural Heritage]
Vadicamo L., Amato G., Bolettieri P., Falchi F., Gennaro C., Rabitti F.
Visits to museums or places of interest in cities of art can be completely reinvented through modern, dynamic modes of experience based on visual recognition and localization technologies, image search, and augmented reality visualizations. For years the AIMIR research group has carried out research on these topics, also holding positions of responsibility in national and international projects. This contribution summarizes some of the research activities carried out and the technologies used, as well as the participation in projects that have used artificial intelligence technologies for the enhancement and enjoyment of cultural heritage.
Source: Ital-IA, Rome, 18-19/03/2019

See at: ISTI Repository Open Access | CNR ExploRA Open Access | www.ital-ia.it Open Access


2019 Journal article Open Access

Large-scale instance-level image retrieval
Amato G., Carrara F., Falchi F., Gennaro C., Vadicamo L.
The great success of visual features learned from deep neural networks has led to a significant effort to develop efficient and scalable technologies for image retrieval. Nevertheless, their usage in large-scale Web applications of content-based retrieval is still challenged by their high dimensionality. To overcome this issue, some image retrieval systems employ the product quantization method to learn a large-scale visual dictionary from a training set of global neural network features. These approaches are implemented in main memory, preventing their usage in big-data applications. The contribution of the work is mainly devoted to investigating some approaches to transform neural network features into text forms suitable for being indexed by a standard full-text retrieval engine such as Elasticsearch. The basic idea of our approaches relies on a transformation of neural network features with the twofold aim of promoting the sparsity without the need of unsupervised pre-training. We validate our approach on a recent convolutional neural network feature, namely Regional Maximum Activations of Convolutions (R-MAC), which is a state-of-the-art descriptor for image retrieval. Its effectiveness has been proved through several instance-level retrieval benchmarks. An extensive experimental evaluation conducted on the standard benchmarks shows the effectiveness and efficiency of the proposed approach and how it compares to state-of-the-art main-memory indexes.
Source: Information processing & management 57 (2019). doi:10.1016/j.ipm.2019.102100
DOI: 10.1016/j.ipm.2019.102100
Project(s): AI4EU via OpenAIRE

See at: ISTI Repository Open Access | Information Processing & Management Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2019 Report Open Access

AIMIR 2019 Research Activities
Amato G., Bolettieri P., Carrara F., Ciampi L., Di Benedetto M., Debole F., Falchi F., Gennaro C., Lagani G., Massoli F. V., Messina N., Rabitti F., Savino P., Vadicamo L., Vairo C.
The Artificial Intelligence for Multimedia Information Retrieval (AIMIR) research group is part of the NeMIS laboratory of the Information Science and Technologies Institute "A. Faedo" (ISTI) of the Italian National Research Council (CNR). The AIMIR group has long experience in topics related to Artificial Intelligence, Multimedia Information Retrieval, Computer Vision, and Similarity Search on a large scale. We aim at investigating the use of Artificial Intelligence and Deep Learning for Multimedia Information Retrieval, addressing both effectiveness and efficiency. Multimedia information retrieval techniques should be able to provide users with pertinent results, fast, on huge amounts of multimedia data. Application areas of our research results range from cultural heritage to smart tourism, from security to smart cities, from mobile visual search to augmented reality. This report summarizes the 2019 activities of the research group.
Source: AIMIR Annual Report, 2019

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2018 Journal article Open Access

Aggregating binary local descriptors for image retrieval
Amato G., Falchi F., Vadicamo L.
Content-Based Image Retrieval based on local features is computationally expensive because of the complexity of both extraction and matching of local features. On one hand, the cost for extracting, representing, and comparing local visual descriptors has been dramatically reduced by recently proposed binary local features. On the other hand, aggregation techniques provide a meaningful summarization of all the extracted features of an image into a single descriptor, allowing us to speed up and scale up the image search. Only a few works have recently mixed together these two research directions, defining aggregation methods for binary local features, in order to leverage the advantages of both approaches. In this paper, we report an extensive comparison among state-of-the-art aggregation methods applied to binary features. Then, we mathematically formalize the application of Fisher Kernels to Bernoulli Mixture Models. Finally, we investigate the combination of the aggregated binary features with the emerging Convolutional Neural Network (CNN) features. Our results show that aggregation methods on binary features are effective and represent a worthwhile alternative to the direct matching. Moreover, the combination of the CNN with the Fisher Vector (FV) built upon binary features allowed us to obtain a relative improvement over the CNN results that is in line with that recently obtained using the combination of the CNN with the FV built upon SIFTs. The advantage of using the FV built upon binary features is that the extraction process of binary features is about two orders of magnitude faster than SIFTs.
Source: Multimedia tools and applications 77 (2018): 5385–5415. doi:10.1007/s11042-017-4450-2
DOI: 10.1007/s11042-017-4450-2
Project(s): EAGLE

See at: arXiv.org e-Print Archive Open Access | Multimedia Tools and Applications Open Access | ISTI Repository Open Access | Multimedia Tools and Applications Restricted | CNR ExploRA Restricted


2018 Report Open Access

SMART NEWS - Visual Content Mining
Amato G., Carrara F., Falchi F., Gennaro C., Vadicamo L.
Deliverable D3.3 "Visual Content Mining" describes and documents the visual content mining activities carried out as part of operational objective 3, "Social Media Analysis/Mining", of the project "Smart News: Social Sensing for Breaking News". In particular, this document describes the state of the art and the techniques adopted or developed in SmartNews for the automatic analysis of images, in order to extract information that enables their automatic description, classification, and retrieval. These analyses will be integrated into the News Management tool to analyze the images collected by the system (activity 3.1, "Data Collection"), providing platform users with innovative tools for data analysis and for enriching the information collected on a monitored news item.
Source: Project report, SMART NEWS, Deliverable D3.3, 2018

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2018 Conference article Open Access

Re-ranking Permutation-Based Candidate Sets with the n-Simplex Projection
Amato G., Chavez E., Connor R., Falchi F., Gennaro C., Vadicamo L.
In the realm of metric search, the permutation-based approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results to a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutation-based result set is refined by directly comparing each candidate object to the query one. Therefore, one drawback of these approaches is that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called n-Simplex projection, that can be used on metric spaces meeting the n-point property. The n-Simplex projection provides upper and lower bounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds and we show how to use them to improve the permutation-based results. Our approach is particularly advantageous for all the cases in which the traditional refining step is too costly, e.g., a very large dataset or a very expensive metric function.
Source: Similarity Search and Applications. SISAP 2018, pp. 3–17, Lima, Peru, 7-9 October 2018
DOI: 10.1007/978-3-030-02224-2_1

See at: dspace.stir.ac.uk Open Access | ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dspace.stir.ac.uk Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted


2018 Conference article Open Access

Selecting sketches for similarity search
Mic V., Novak D., Vadicamo L., Zezula P.
Hamming embedding techniques, which produce bit-string sketches, have recently been applied successfully to speed up similarity search. Sketches are usually compared by the Hamming distance, and applied to filter out non-relevant objects during the query evaluation. As several sketching techniques exist and each can produce sketches with different lengths, it is hard to select a proper configuration for a particular dataset. We assume that the (dis)similarity of objects is expressed by an arbitrary metric function, and we propose a way to efficiently estimate the quality of sketches using just a small sample set of data. Our approach is based on a probabilistic analysis of sketches which describes how well separated objects are after projection to the Hamming space.
Source: ADBIS 2018 - 22nd European Conference on Advances in Databases and Information Systems, pp. 127–141, Budapest, Hungary, 2-5 September 2018
DOI: 10.1007/978-3-319-98398-1_9

See at: ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | link.springer.com Restricted | CNR ExploRA Restricted | rd.springer.com Restricted | www.muni.cz Restricted
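
The filtering step described in the abstract above (comparing bit-string sketches by Hamming distance to discard non-relevant objects) can be sketched as follows. This is an illustrative sketch only: the 8-bit sketches, the function names `hamming` and `filter_by_sketch`, and the fixed `radius` are hypothetical, and the code does not reproduce the paper's probabilistic analysis of sketch quality.

```python
def hamming(a, b):
    """Hamming distance between two sketches stored as Python integers."""
    # XOR leaves a 1 bit wherever the sketches differ; count those bits
    return bin(a ^ b).count("1")

def filter_by_sketch(query_sketch, sketches, radius):
    """Keep only identifiers whose sketch is within `radius` bits of the query."""
    return [obj_id for obj_id, sketch in sketches.items()
            if hamming(query_sketch, sketch) <= radius]

sketches = {"a": 0b10110010, "b": 0b10110011, "c": 0b01001101}
candidates = filter_by_sketch(0b10110010, sketches, radius=1)
print(candidates)
```

Only the surviving candidates would then be compared to the query with the original, more expensive metric function.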