2020 Software Unknown
VISIONE II
Amato G., Bolettieri P., Carrara F., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
VISIONE II is a content-based video retrieval system for effective video search on large-scale collections. It allows users to search for videos using textual descriptions, keywords, the occurrence of objects and colors together with their spatial relationships, and image similarity.

See at: bilioso.isti.cnr.it | CNR ExploRA
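The record above describes a multi-modal search engine; VISIONE's actual index is not reproduced here. Purely as an illustrative sketch of the late-fusion idea behind combining several query modalities (text, object/color occurrence, image similarity), here is a minimal numpy example. All array names, dimensions, random contents, and the equal weighting are assumptions, not the system's design:

```python
import numpy as np

# Illustrative setup: per-keyframe feature matrices, one per modality,
# L2-normalized so that a dot product equals cosine similarity.
# Names, sizes, and random contents are placeholders.
rng = np.random.default_rng(0)
n_frames = 1000
visual = rng.normal(size=(n_frames, 512))
visual /= np.linalg.norm(visual, axis=1, keepdims=True)
textual = rng.normal(size=(n_frames, 256))
textual /= np.linalg.norm(textual, axis=1, keepdims=True)

def fused_search(queries, weights, k=10):
    """Late fusion: rank keyframes by a weighted sum of per-modality
    cosine similarities. `queries` is a list of (query_vector, index)."""
    total = sum(w * (index @ q) for (q, index), w in zip(queries, weights))
    top = np.argsort(-total)[:k]
    return top, total[top]

# Combine a query-by-example (an existing keyframe) with a text query.
q_vis = visual[42]            # image-similarity query
q_txt = textual[7]            # stand-in for an encoded textual description
frames, scores = fused_search([(q_vis, visual), (q_txt, textual)],
                              weights=[0.5, 0.5])
```

A real engine would replace the random matrices with features from trained encoders and serve them from an approximate-nearest-neighbor index rather than a dense scan.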


2020 Report Open Access
AIMH research activities 2020
Aloia N., Amato G., Bartalesi V., Benedetti F., Bolettieri P., Carrara F., Casarosa V., Ciampi L., Concordia C., Corbara S., Esuli A., Falchi F., Gennaro C., Lagani G., Massoli F. V., Meghini C., Messina N., Metilli D., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Thanos C., Trupiano L., Vadicamo L., Vairo C.
Annual Report of the Artificial Intelligence for Media and Humanities laboratory (AIMH) research activities in 2020.
Source: ISTI Annual Report, ISTI-2020-AR/001, 2020
DOI: 10.32079/isti-ar-2020/001


See at: ISTI Repository Open Access | CNR ExploRA


2020 Conference article Closed Access
Re-implementing and Extending Relation Network for R-CBIR
Messina N., Amato G., Falchi F.
Relational reasoning is an emerging theme in Machine Learning in general and in Computer Vision in particular. DeepMind has recently proposed a module called Relation Network (RN) that has shown impressive results on visual question answering tasks. Unfortunately, the implementation of the proposed approach was not public. To reproduce their experiments and extend their approach in the context of Information Retrieval, we had to re-implement everything, testing many parameters and conducting many experiments. Our implementation is now public on GitHub and is already used by a large community of researchers. Furthermore, we recently presented a variant of the Relation Network module that we called Aggregated Visual Features RN (AVF-RN). This network can produce and aggregate, at inference time, compact visual relationship-aware features for the Relational-CBIR (R-CBIR) task. R-CBIR consists of retrieving images with given relationships among objects. In this paper, we discuss the details of our Relation Network implementation and report more experimental results than the original paper. Relational reasoning is a very promising topic for better understanding and retrieving inter-object relationships, especially in digital libraries.
Source: 16th Italian Research Conference on Digital Libraries, IRCDL 2020, pp. 82–92, Bari, Italy, 30-31/01/2020
DOI: 10.1007/978-3-030-39905-4_9


See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA
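For readers unfamiliar with the module being re-implemented, the Relation Network's core computation is RN(O) = f_phi(Σ_{i,j} g_theta(o_i, o_j)): an MLP g scores every ordered pair of object features, the pair scores are summed, and a second MLP f maps the aggregate to the output. The PyTorch sketch below illustrates only this core; the layer sizes are arbitrary, and the question-embedding conditioning used for VQA (as well as the AVF-RN aggregation) is omitted:

```python
import torch
import torch.nn as nn

class RelationNetwork(nn.Module):
    """Minimal sketch of the Relation Network core:
    RN(O) = f_phi( sum_{i,j} g_theta(o_i, o_j) ).
    Dimensions and MLP depths are illustrative, not the paper's."""
    def __init__(self, obj_dim=64, hidden=256, out_dim=10):
        super().__init__()
        self.g = nn.Sequential(                 # pairwise relation function
            nn.Linear(2 * obj_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.f = nn.Sequential(                 # readout over the aggregate
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, objects):                 # (batch, n_objects, obj_dim)
        b, n, d = objects.shape
        # Build all ordered pairs (o_i, o_j): (batch, n*n, 2*obj_dim).
        o_i = objects.unsqueeze(2).expand(b, n, n, d)
        o_j = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([o_i, o_j], dim=-1).reshape(b, n * n, 2 * d)
        relations = self.g(pairs).sum(dim=1)    # aggregate pairwise relations
        return self.f(relations)

rn = RelationNetwork()
out = rn(torch.randn(8, 6, 64))                 # 8 scenes, 6 objects each
```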


2020 Conference article Open Access
Relational visual-textual information retrieval
Messina N.
With the advent of deep learning, multimedia information processing gained a huge boost, and astonishing results have been observed on a multitude of interesting visual-textual tasks. Relation networks paved the way towards an attentive processing methodology that considers images and texts as sets of basic interconnected elements (regions and words). These winning ideas recently helped to reach the state of the art on the image-text matching task. Cross-media information retrieval has been proposed as a benchmark to test the capability of the proposed networks to match complex multi-modal concepts in the same common space. Modern deep-learning-powered networks are complex, and almost none of them can provide the concise multi-modal descriptions needed by fast multi-modal search engines. In fact, the latest image-sentence matching networks use cross-attention and early-fusion approaches, which force all the elements of the database to be considered at query time. In this work, I will try to lay down some ideas to bridge the gap between the effectiveness of modern deep-learning multi-modal matching architectures and their efficiency, as far as fast and scalable visual-textual information retrieval is concerned.
Source: SISAP 2020 - 13th International Conference on Similarity Search and Applications, pp. 405–411, Copenhagen, Denmark, September 30 - October 2, 2020
DOI: 10.1007/978-3-030-60936-8_33


See at: ISTI Repository Open Access | doi.org Restricted | link.springer.com Restricted | CNR ExploRA
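The efficiency argument in the abstract above is easy to see in code. With a dual-encoder (late-fusion) design, the whole collection can be embedded offline once, and a query costs one encoder pass plus a single matrix-vector product; a cross-attention/early-fusion model instead needs a full joint forward pass for every (query, image) pair. A minimal PyTorch sketch, with plain linear layers standing in for real visual and text backbones:

```python
import torch

# Assumed setup: two independent encoders mapping images and sentences
# into the same space (a dual-encoder design). The linear layers and all
# dimensions are illustrative stand-ins, not any published architecture.
img_encoder = torch.nn.Linear(2048, 256)   # stand-in for a visual backbone
txt_encoder = torch.nn.Linear(768, 256)    # stand-in for a text backbone

with torch.no_grad():
    # Offline: embed the whole collection ONCE.
    db = torch.nn.functional.normalize(
        img_encoder(torch.randn(10_000, 2048)), dim=1)

    # Online: one forward pass for the query, then a single mat-vec.
    q = torch.nn.functional.normalize(txt_encoder(torch.randn(768)), dim=0)
    scores = db @ q                        # cosine similarities, O(N * d)
    topk = scores.topk(10).indices

# A cross-attention / early-fusion model, by contrast, must re-run the
# full joint network for EVERY (query, image) pair at query time: N model
# evaluations per query instead of one. This is the effectiveness vs
# efficiency gap the paper above sets out to bridge.
```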


2020 Journal article Open Access
Virtual to real adaptation of pedestrian detectors
Ciampi L., Messina N., Falchi F., Gennaro C., Amato G.
Pedestrian detection through Computer Vision is a building block for a multitude of applications. Recently, there has been an increasing interest in convolutional neural network-based architectures to execute such a task. A critical goal of these supervised networks is to generalize the knowledge learned during the training phase to new scenarios with different characteristics. A suitably labeled dataset is essential to achieve this purpose, but manually annotating one usually requires substantial human effort and is costly. To this end, we introduce ViPeD (Virtual Pedestrian Dataset), a new synthetically generated set of images collected with the highly photo-realistic graphical engine of the video game GTA V (Grand Theft Auto V), where annotations are automatically acquired. However, when training solely on the synthetic dataset, the model experiences a Synthetic2Real domain shift, leading to a performance drop when applied to real-world images. To mitigate this gap, we propose two different domain adaptation techniques suitable for the pedestrian detection task, but possibly applicable to general object detection. Experiments show that the network trained with ViPeD can generalize over unseen real-world scenarios better than the detector trained over real-world data, exploiting the variety of our synthetic dataset. Furthermore, we demonstrate that with our domain adaptation techniques, we can reduce the Synthetic2Real domain shift, making the two domains closer and obtaining a performance improvement when testing the network over real-world images.
Source: Sensors (Basel) 20 (2020)
DOI: 10.3390/s20185250
DOI: 10.48550/arxiv.2001.03032
Project(s): AI4EU via OpenAIRE


See at: arXiv.org e-Print Archive Open Access | Sensors Open Access | ISTI Repository Open Access | doi.org Restricted | CNR ExploRA
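The two domain adaptation techniques proposed in the paper are not reproduced here. As a generic illustration of the kind of recipe used to mitigate a Synthetic2Real shift, the sketch below mixes synthetic and real samples in every training batch so that both domains contribute to each gradient step. The datasets and the tiny classifier are placeholders; a real setup would use an actual detector and the ViPeD images:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder datasets: a large auto-annotated synthetic set (ViPeD-like)
# and a smaller real-world set. Random tensors stand in for (image, label).
synthetic = TensorDataset(torch.randn(512, 3, 64, 64),
                          torch.randint(0, 2, (512,)))
real = TensorDataset(torch.randn(128, 3, 64, 64),
                     torch.randint(0, 2, (128,)))

syn_loader = DataLoader(synthetic, batch_size=8, shuffle=True)
real_loader = DataLoader(real, batch_size=8, shuffle=True)

# Toy classifier standing in for a pedestrian detector.
model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 64 * 64, 2))
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# Mixed-batch training: every optimization step sees both domains, so the
# gradient balances synthetic variety against real-world appearance.
# (zip stops when the smaller real loader is exhausted.)
for (xs, ys), (xr, yr) in zip(syn_loader, real_loader):
    x = torch.cat([xs, xr])
    y = torch.cat([ys, yr])
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```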