2021 Journal article Open Access OPEN

Solving the same-different task with convolutional neural networks
Messina N., Amato G., Carrara F., Gennaro C., Falchi F.
Deep learning has demonstrated major abilities in solving many kinds of real-world problems in the computer vision literature. However, deep models are still strained by simple reasoning tasks that humans consider easy to solve. In this work, we probe current state-of-the-art convolutional neural networks on a difficult set of tasks known as the same-different problems. All the problems require the same prerequisite to be solved correctly: understanding if two random shapes inside the same image are the same or not. With the experiments carried out in this work, we demonstrate that residual connections, and more generally the skip connections, seem to have only a marginal impact on the learning of the proposed problems. In particular, we experiment with DenseNets, and we examine the contribution of residual and recurrent connections in already tested architectures, ResNet-18 and CorNet-S respectively. Our experiments show that older feed-forward networks, AlexNet and VGG, are almost unable to learn the proposed problems, except in some specific scenarios. We show that recently introduced architectures can converge even in the cases where the important parts of their architecture are removed. We finally carry out some zero-shot generalization tests, and we discover that in these scenarios residual and recurrent connections can have a stronger impact on the overall test accuracy. On four difficult problems from the SVRT dataset, we can reach state-of-the-art results with respect to the previous approaches, obtaining super-human performance on three of the four problems.
Source: Pattern recognition letters 143 (2021): 75–80. doi:10.1016/j.patrec.2020.12.019
DOI: 10.1016/j.patrec.2020.12.019
Project(s): AI4EU via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | Pattern Recognition Letters Open Access | ISTI Repository Open Access | Pattern Recognition Letters Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2021 Conference article Open Access OPEN

Domain adaptation for traffic density estimation
Ciampi L., Santiago C., Costeira J. P., Gennaro C., Amato G.
Convolutional Neural Networks have produced state-of-the-art results for a multitude of computer vision tasks under supervised learning. However, the crux of these methods is the need for a massive amount of labeled data to guarantee that they generalize well to diverse testing scenarios. In many real-world applications, there is indeed a large domain shift between the distributions of the train (source) and test (target) domains, leading to a significant drop in performance at inference time. Unsupervised Domain Adaptation (UDA) is a class of techniques that aims to mitigate this drawback without the need for labeled data in the target domain. This makes it particularly useful for the tasks in which acquiring new labeled data is very expensive, such as for semantic and instance segmentation. In this work, we propose an end-to-end CNN-based UDA algorithm for traffic density estimation and counting, based on adversarial learning in the output space. The density estimation is one of those tasks requiring per-pixel annotated labels and, therefore, needs a lot of human effort. We conduct experiments considering different types of domain shifts, and we make publicly available two new datasets for the vehicle counting task that were also used for our tests. One of them, the Grand Traffic Auto dataset, is a synthetic collection of images, obtained using the graphical engine of the Grand Theft Auto video game, automatically annotated with precise per-pixel labels. Experiments show a significant improvement using our UDA algorithm compared to the model's performance without domain adaptation.
Source: VISIGRAPP 2021 - 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp. 185–195, Online Conference, 08-10 February, 2021
DOI: 10.5220/0010303401850195
Project(s): AI4EU via OpenAIRE, AI4Media via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA Open Access | www.scitepress.org Open Access


2021 Conference article Open Access OPEN

Defending Neural ODE Image Classifiers from Adversarial Attacks with Tolerance Randomization
Carrara F., Caldelli R., Falchi F., Amato G.
Deep learned models are now largely adopted in different fields, and they generally provide superior performance with respect to classical signal-based approaches. Notwithstanding this, their actual reliability when working in an unprotected environment is far from being proven. In this work, we consider a novel deep neural network architecture, named Neural Ordinary Differential Equations (N-ODE), that is getting particular attention due to an attractive property: a test-time tunable trade-off between accuracy and efficiency. This paper analyzes the robustness of N-ODE image classifiers when faced with a strong adversarial attack and how its effectiveness changes when varying such a tunable trade-off. We show that adversarial robustness is increased when the networks operate in different tolerance regimes during test time and training time. On this basis, we propose a novel adversarial detection strategy for N-ODE nets based on the randomization of the adaptive ODE solver tolerance. Our evaluation performed on standard image classification benchmarks shows that our detection technique provides high rejection of adversarial examples while maintaining most of the original samples under white-box attacks and zero-knowledge adversaries.
Source: International Conference on Pattern Recognition ICPR 2021, pp. 425–438, Milan (Virtual), 10-15/01/2021
DOI: 10.1007/978-3-030-68780-9_35
Project(s): AI4EU via OpenAIRE, AI4Media via OpenAIRE

See at: ISTI Repository Open Access | link.springer.com Restricted | CNR ExploRA Restricted


2021 Journal article Open Access OPEN

The VISIONE video search system: exploiting off-the-shelf text search engines for large-scale video retrieval
Amato G., Bolettieri P., Carrara F., Debole F., Falchi F., Gennaro C., Vadicamo L., Vairo C.
This paper describes in detail VISIONE, a video search system that allows users to search for videos using textual keywords, the occurrence of objects and their spatial relationships, the occurrence of colors and their spatial relationships, and image similarity. These modalities can be combined to express complex queries and meet users' needs. The peculiarity of our approach is that we encode all information extracted from the keyframes, such as visual deep features, tags, color and object locations, using a convenient textual encoding that is indexed in a single text retrieval engine. This offers great flexibility when results corresponding to various parts of the query (visual, text and locations) need to be merged. In addition, we report an extensive analysis of the retrieval performance of the system, using the query logs generated during the Video Browser Showdown (VBS) 2019 competition. This allowed us to fine-tune the system by choosing the optimal parameters and strategies from those we tested.
Source: Journal of Imaging 7 (2021). doi:10.3390/jimaging7050076
DOI: 10.3390/jimaging7050076

See at: ISTI Repository Open Access | CNR ExploRA Open Access | www.mdpi.com Open Access


2020 Conference article Open Access OPEN

Edge-Based Video Surveillance with Embedded Devices
Kavalionak H., Gennaro C., Amato G., Vairo C., Perciante C., Meghini C., Falchi F., Rabitti F.
Video surveillance systems have become indispensable tools for the security and organization of public and private areas. In this work, we propose a novel distributed protocol for an edge-based face recognition system that takes advantage of the computational capabilities of the surveillance devices (i.e., cameras) to perform person recognition. The cameras fall back to a centralized server if their hardware capabilities are not enough to perform the recognition. We evaluate the proposed algorithm via extensive experiments on a freely available dataset. As a prototype of surveillance embedded devices, we have considered a Raspberry Pi with the camera module. Using simulations, we show that our algorithm can reduce the load of the server by up to 50% with no negative impact on the quality of the surveillance service.
Source: 28th Symposium on Advanced Database Systems (SEBD), pp. 278–285, Villasimius, Sardinia, Italy, 21-24/06/2020

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2020 Conference article Open Access OPEN

Multi-Resolution Face Recognition with Drones
Amato G., Falchi F., Gennaro C., Massoli F. V., Vairo C.
Smart cameras have recently seen a large diffusion and represent a low-cost solution for improving public security in many scenarios. Moreover, they are light enough to be lifted by a drone. Face recognition enabled by drones equipped with smart cameras has already been reported in the literature. However, the use of the drone generally imposes tighter constraints than other facial recognition scenarios. First, weather conditions, such as the presence of wind, pose a severe limit on image stability. Moreover, the distance at which drones fly is typically much greater than that of fixed ground cameras, which inevitably translates into a degraded resolution of the face images. Furthermore, the drones' operational altitudes usually require the use of optical zoom, thus amplifying the harmful effects of their movements. For all these reasons, in drone scenarios, image degradation strongly affects the behavior of face detection and recognition systems. In this work, we studied the performance of deep neural networks for face re-identification specifically designed for low-quality images and applied them to a drone scenario using a publicly available dataset known as DroneSURF.
Source: 3rd International Conference on Sensors, Signal and Image Processing, pp. 13–18, Prague, Czech Republic (Virtual), 23-25/10/2020
DOI: 10.1145/3441233.3441237

See at: ISTI Repository Open Access | dl.acm.org Restricted | CNR ExploRA Restricted


2020 Conference article Open Access OPEN

Scalar Quantization-Based Text Encoding for Large Scale Image Retrieval
Amato G., Carrara F., Falchi F., Gennaro C., Rabitti F., Vadicamo L.
The great success of visual features learned from deep neural networks has led to a significant effort to develop efficient and scalable technologies for image retrieval. This paper presents an approach to transform neural network features into text codes suitable for being indexed by a standard full-text retrieval engine such as Elasticsearch. The basic idea is providing a transformation of neural network features with the twofold aim of promoting the sparsity without the need of unsupervised pre-training. We validate our approach on a recent convolutional neural network feature, namely Regional Maximum Activations of Convolutions (R-MAC), which is a state-of-the-art descriptor for image retrieval. An extensive experimental evaluation conducted on standard benchmarks shows the effectiveness and efficiency of the proposed approach and how it compares to state-of-the-art main-memory indexes.
Source: 28th Italian Symposium on Advanced Database Systems, pp. 258–265, Virtual (online) due to COVID-19, 21-24/06/2020

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2020 Journal article Open Access OPEN

Cross-resolution learning for face recognition
Massoli F. V., Amato G., Falchi F.
Convolutional Neural Network models have reached extremely high performance on the Face Recognition task. The most widely used datasets, such as VGGFace2, focus on gender, pose, and age variations, in the attempt of balancing them to empower models to better generalize to unseen data. Nevertheless, image resolution variability is not usually discussed, which may lead to a resizing of 256 pixels. While specific datasets for very low-resolution faces have been proposed, less attention has been paid to the task of cross-resolution matching. Hence, the discrimination power of a neural network might seriously degrade in such a scenario. Surveillance systems and forensic applications are particularly susceptible to this problem since, in these cases, it is common that a low-resolution query has to be matched against higher-resolution galleries. Although it is always possible to either increase the resolution of the query image or to reduce the size of the gallery (less frequently), to the best of our knowledge, extensive experimentation of cross-resolution matching was missing in the recent deep learning-based literature. In the context of low- and cross-resolution Face Recognition, the contribution of our work is fourfold: i) we proposed a training procedure to fine-tune a state-of-the-art model to empower it to extract resolution-robust deep features; ii) we conducted an extensive test campaign by using high-resolution datasets (IJB-B and IJB-C) and surveillance-camera-quality datasets (QMUL-SurvFace, TinyFace, and SCface) showing the effectiveness of our algorithm to train a resolution-robust model; iii) even though our main focus was the cross-resolution Face Recognition, by using our training algorithm we also improved upon state-of-the-art model performances considering low-resolution matches; iv) we showed that our approach could be more effective than preprocessing faces with super-resolution techniques. The Python code of the proposed method will be available at https://github.com/fvmassoli/cross-resolution-face-recognition.
Source: Image and vision computing 99 (2020). doi:10.1016/j.imavis.2020.103927
DOI: 10.1016/j.imavis.2020.103927
Project(s): AI4EU via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | Image and Vision Computing Open Access | ISTI Repository Open Access | Image and Vision Computing Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2020 Conference article Restricted

Re-implementing and Extending Relation Network for R-CBIR
Messina N., Amato G., Falchi F.
Relational reasoning is an emerging theme in Machine Learning in general and in Computer Vision in particular. DeepMind has recently proposed a module called Relation Network (RN) that has shown impressive results on visual question answering tasks. Unfortunately, the implementation of the proposed approach was not public. To reproduce their experiments and extend their approach in the context of Information Retrieval, we had to re-implement everything, testing many parameters and conducting many experiments. Our implementation is now public on GitHub and it is already used by a large community of researchers. Furthermore, we recently presented a variant of the relation network module that we called Aggregated Visual Features RN (AVF-RN). This network can produce and aggregate at inference time compact visual relationship-aware features for the Relational-CBIR (R-CBIR) task. R-CBIR consists in retrieving images with given relationships among objects. In this paper, we discuss the details of our Relation Network implementation and present more experimental results than the original paper. Relational reasoning is a very promising topic for better understanding and retrieving inter-object relationships, especially in digital libraries.
Source: 16th Italian Research Conference on Digital Libraries, IRCDL 2020, pp. 82–92, Bari, Italy, 30-31/01/2020
DOI: 10.1007/978-3-030-39905-4_9

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | link.springer.com Restricted | CNR ExploRA Restricted


2020 Journal article Open Access OPEN

Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection
Vadicamo L., Gennaro C., Falchi F., Chavez E., Connor R., Amato G.
Approximate Nearest Neighbor (ANN) search is a prevalent paradigm for searching intrinsically high dimensional objects in large-scale data sets. Recently, the permutation-based approach for ANN has attracted a lot of interest due to its versatility in being used in the more general class of metric spaces. In this approach, the entire database is ranked by a permutation distance to the query. Typically, permutations allow the efficient selection of a candidate set of results, but to achieve high recall or precision this set has to be reviewed using the original metric and data. This can lead to a sizeable percentage of the database being recalled, along with many expensive distance calculations. To reduce the number of metric computations and the number of database elements accessed, we propose here a re-ranking based on a local embedding using the nSimplex projection. The nSimplex projection produces Euclidean vectors from objects in metric spaces which possess the n-point property. The mapping is obtained from the distances to a set of reference objects, and the original metric can be lower bounded and upper bounded by the Euclidean distance of objects sharing the same set of references. Our approach is particularly advantageous for extensive databases or expensive metric functions. We reuse the distances computed in the permutations in the first stage, and hence the memory footprint of the index is not increased. An extensive experimental evaluation of our approach is presented, demonstrating excellent results even on a set of hundreds of millions of objects.
Source: Information systems (Oxf.) (2020). doi:10.1016/j.is.2020.101506
DOI: 10.1016/j.is.2020.101506
Project(s): AI4EU via OpenAIRE

See at: ISTI Repository Open Access | Information Systems Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2020 Conference article Open Access OPEN

Continuous ODE-defined image features for adaptive retrieval
Carrara F., Amato G., Falchi F., Gennaro C.
In recent years, content-based image retrieval has largely benefited from representations extracted from deeper and more complex convolutional neural networks, which became more effective but also more computationally demanding. Despite existing hardware acceleration, query processing times may be easily saturated by deep feature extraction in high-throughput or real-time embedded scenarios, and usually, a trade-off between efficiency and effectiveness has to be accepted. In this work, we experiment with the recently proposed continuous neural networks defined by parametric ordinary differential equations, dubbed ODE-Nets, for adaptive extraction of image representations. Given the continuous evolution of the network hidden state, we propose to approximate the exact feature extraction by taking a previous "near-in-time" hidden state as features with a reduced computational cost. To understand the potential and the limits of this approach, we also evaluate an ODE-only architecture in which we minimize the number of classical layers in order to delegate most of the representation learning process (and thus the feature extraction process) to the continuous part of the model. Preliminary experiments on standard benchmarks show that we are able to dynamically control the trade-off between efficiency and effectiveness of feature extraction at inference-time by controlling the evolution of the continuous hidden state. Although ODE-only networks provide the best fine-grained control on the effectiveness-efficiency trade-off, we observed that mixed architectures perform better or comparably to standard residual nets in both the image classification and retrieval setups while using fewer parameters and retaining the controllability of the trade-off.
Source: ICMR '20 - International Conference on Multimedia Retrieval, pp. 198–206, Dublin, Ireland, 8-11 June, 2020
DOI: 10.1145/3372278.3390690
Project(s): AI4EU via OpenAIRE

See at: ISTI Repository Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | dl.acm.org Restricted | CNR ExploRA Restricted


2020 Conference article Open Access OPEN

Unsupervised vehicle counting via multiple camera domain adaptation
Ciampi L., Santiago C., Costeira J. P., Gennaro C., Amato G.
Monitoring vehicle flow in cities is a crucial issue to improve the urban environment and quality of life of citizens. Images are the best sensing modality to perceive and assess the flow of vehicles in large areas. Current technologies for vehicle counting in images hinge on large quantities of annotated data, preventing their scalability to city-scale as new cameras are added to the system. This is a recurrent problem when dealing with physical systems and a key research area in Machine Learning and AI. We propose and discuss a new methodology to design image-based vehicle density estimators with few labeled data via multiple camera domain adaptations.
Source: ECAI-2020 - 1st International Workshop on New Foundations for Human-Centered AI (NeHuAI), pp. 1–4, Online Conference, 04 September, 2020
Project(s): AI4EU via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2020 Journal article Open Access OPEN

Virtual to real adaptation of pedestrian detectors
Ciampi L., Messina N., Falchi F., Gennaro C., Amato G.
Pedestrian detection through Computer Vision is a building block for a multitude of applications. Recently, there has been an increasing interest in convolutional neural network-based architectures to execute such a task. One of these supervised networks' critical goals is to generalize the knowledge learned during the training phase to new scenarios with different characteristics. A suitably labeled dataset is essential to achieve this purpose. The main problem is that manually annotating a dataset usually requires a lot of human effort, and it is costly. To this end, we introduce ViPeD (Virtual Pedestrian Dataset), a new synthetically generated set of images collected with the highly photo-realistic graphical engine of the video game GTA V (Grand Theft Auto V), where annotations are automatically acquired. However, when training solely on the synthetic dataset, the model experiences a Synthetic2Real domain shift leading to a performance drop when applied to real-world images. To mitigate this gap, we propose two different domain adaptation techniques suitable for the pedestrian detection task, but possibly applicable to general object detection. Experiments show that the network trained with ViPeD can generalize over unseen real-world scenarios better than the detector trained over real-world data, exploiting the variety of our synthetic dataset. Furthermore, we demonstrate that with our domain adaptation techniques, we can reduce the Synthetic2Real domain shift, making the two domains closer and obtaining a performance improvement when testing the network over the real-world images.
Source: Sensors (Basel) 20 (2020). doi:10.3390/s20185250
DOI: 10.3390/s20185250

See at: Sensors Open Access | arXiv.org e-Print Archive Open Access | Europe PubMed Central Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2020 Journal article Embargo

Cross-resolution face recognition adversarial attacks
Massoli F. V., Falchi F., Amato G.
Face Recognition is among the best examples of computer vision problems where the supremacy of deep learning techniques compared to standard ones is undeniable. Unfortunately, it has been shown that they are vulnerable to adversarial examples: input images to which a human-imperceptible perturbation is added to lead a learning model to output a wrong prediction. Moreover, in applications such as biometric systems and forensics, cross-resolution scenarios are easily met with a non-negligible impact on the recognition performance and adversary's success. Although the existence of such vulnerabilities sets a harsh limit to the spread of deep learning-based face recognition systems to real-world applications, a comprehensive analysis of their behavior when threatened in a cross-resolution setting is missing in the literature. In this context, we posit our study, where we harness several of the strongest adversarial attacks against deep learning-based face recognition systems considering the cross-resolution domain. To craft adversarial instances, we exploit attacks based on three different metrics, i.e., L1, L2, and L∞, and we study the resilience of the models across resolutions. We then evaluate the performance of the systems against the face identification protocol, open- and closed-set. In our study, we find that deep representation attacks represent a much more dangerous menace to a face recognition system than those based on the classification output, independently of the metric used. Furthermore, we notice that the input image's resolution has a non-negligible impact on an adversary's success in deceiving a learning model. Finally, by comparing the performance of the threatened networks under analysis, we show how they can benefit from a cross-resolution training approach in terms of resilience to adversarial attacks.
Source: Pattern recognition letters 140 (2020): 222–229. doi:10.1016/j.patrec.2020.10.008
DOI: 10.1016/j.patrec.2020.10.008
Project(s): AI4EU via OpenAIRE

See at: Pattern Recognition Letters Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2020 Journal article Open Access OPEN

Detection of Face Recognition Adversarial Attacks
Massoli F. V., Carrara F., Amato G., Falchi F.
Deep Learning methods have become state-of-the-art for solving tasks such as Face Recognition (FR). Unfortunately, despite their success, it has been pointed out that these learning models are exposed to adversarial inputs (images to which an amount of noise imperceptible to humans is added to maliciously fool a neural network), thus limiting their adoption in sensitive real-world applications. While it is true that an enormous effort has been spent to train robust models against this type of threat, adversarial detection techniques have recently started to draw attention within the scientific community. The advantage of using a detection approach is that it does not require re-training any model; thus, it can be added to any system. In this context, we present our work on adversarial detection in forensics mainly focused on detecting attacks against FR systems in which the learning model is typically used only as a feature extractor. Thus, training a more robust classifier might not be enough to counteract the adversarial threats. In this frame, the contribution of our work is four-fold: (i) we test our proposed adversarial detection approach against classification attacks, i.e., adversarial samples crafted to fool an FR neural network acting as a classifier; (ii) using a k-Nearest Neighbor (k-NN) algorithm as a guide, we generate deep features attacks against an FR system based on a neural network acting as a feature extractor, followed by a similarity-based procedure which returns the query identity; (iii) we use the deep features attacks to fool an FR system on the 1:1 face verification task, and we show their superior effectiveness with respect to classification attacks in evading such type of system; (iv) we use the detectors trained on the classification attacks to detect the deep features attacks, thus showing that such approach is generalizable to different classes of offensives.
Source: Computer vision and image understanding (Print) 202 (2020). doi:10.1016/j.cviu.2020.103103
DOI: 10.1016/j.cviu.2020.103103
Project(s): AI4EU via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | Computer Vision and Image Understanding Open Access | ISTI Repository Open Access | Computer Vision and Image Understanding Restricted | CNR ExploRA Restricted | www.sciencedirect.com Restricted


2020 Journal article Open Access OPEN

Learning accurate personal protective equipment detection from virtual worlds
Di Benedetto M., Carrara F., Meloni E., Amato G., Falchi F., Gennaro C.
Deep learning has achieved impressive results in many machine learning tasks such as image recognition and computer vision. Its applicability to supervised problems is however constrained by the availability of high-quality training data consisting of large numbers of human-annotated examples (e.g. millions). To overcome this problem, recently, the AI world is increasingly exploiting artificially generated images or video sequences using realistic photo rendering engines such as those used in entertainment applications. In this way, large sets of training images can be easily created to train deep learning algorithms. In this paper, we generated photo-realistic synthetic image sets to train deep learning models to recognize the correct use of personal safety equipment (e.g., worker safety helmets, high visibility vests, ear protection devices) during at-risk work activities. Then, we performed the adaptation of the domain to real-world images using a very small set of real-world images. We demonstrated that training with the synthetic training set generated and the use of the domain adaptation phase is an effective solution for applications where no training set is available.
Source: Multimedia tools and applications (2020). doi:10.1007/s11042-020-09597-9
DOI: 10.1007/s11042-020-09597-9
Project(s): AI4EU via OpenAIRE

See at: ISTI Repository Open Access | Multimedia Tools and Applications Restricted | CNR ExploRA Restricted


2020 Conference article Open Access OPEN

Learning distance estimators from pivoted embeddings of metric objects
Carrara F., Gennaro C., Falchi F., Amato G.
Efficient indexing and retrieval in generic metric spaces often translate into the search for approximate methods that can retrieve relevant samples to a query performing the least amount of distance computations. To this end, when indexing and fulfilling queries, distances are computed and stored only against a small set of reference points (also referred to as pivots) and then adopted in geometrical rules to estimate real distances and include or exclude elements from the result set. In this paper, we propose to learn a regression model that estimates the distance between a pair of metric objects starting from their distances to a set of reference objects. We explore architectural hyper-parameters and compare with the state-of-the-art geometrical method based on the n-simplex projection. Preliminary results show that our model provides a comparable or slightly degraded performance while being more efficient and applicable to generic metric spaces.
Source: SISAP 2020: the 13th International Conference on Similarity Search and Applications, pp. 361–368, Copenhagen, Denmark (Virtual), 30/09/2020 - 02/10/2020
DOI: 10.1007/978-3-030-60936-8_28
Project(s): AI4EU via OpenAIRE

See at: ISTI Repository Open Access | academic.microsoft.com Restricted | link.springer.com Restricted | CNR ExploRA Restricted
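
As an illustration of the pivot-based "geometrical rules" mentioned in the abstract above, the following minimal sketch computes the classic triangle-inequality bounds on the distance between two objects from their distances to a set of pivots. It is not the regression model or the n-simplex projection from the paper; the Euclidean metric, the random data, and all function names are assumptions chosen only for this example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance; stands in for any metric (an assumption of this sketch)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pivot_embedding(obj, pivots, dist=euclidean):
    """Represent an object by its distances to the pivots."""
    return [dist(obj, p) for p in pivots]


def distance_bounds(emb_a, emb_b):
    """Triangle-inequality bounds on d(a, b) computed from pivot distances only.

    lower = max_i |d(a, p_i) - d(b, p_i)|
    upper = min_i (d(a, p_i) + d(b, p_i))
    """
    lower = max(abs(x - y) for x, y in zip(emb_a, emb_b))
    upper = min(x + y for x, y in zip(emb_a, emb_b))
    return lower, upper


# Hypothetical data: random points in an 8-dimensional unit cube, 16 pivots.
random.seed(0)
dim, n_pivots = 8, 16
pivots = [[random.random() for _ in range(dim)] for _ in range(n_pivots)]
a = [random.random() for _ in range(dim)]
b = [random.random() for _ in range(dim)]

lower, upper = distance_bounds(pivot_embedding(a, pivots), pivot_embedding(b, pivots))
true_distance = euclidean(a, b)
print(f"lower={lower:.3f} true={true_distance:.3f} upper={upper:.3f}")
```

The bounds hold for any metric because they follow directly from the triangle inequality. A learned estimator of the kind the abstract describes would instead take the two pivot embeddings as input and predict the distance itself.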


2020 Report Open Access

AIMH research activities 2020
Aloia N., Amato G., Bartalesi V., Benedetti F., Bolettieri P., Carrara F., Casarosa V., Ciampi L., Concordia C., Corbara S., Esuli A., Falchi F., Gennaro C., Lagani G., Massoli F. V., Meghini C., Messina N., Metilli D., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Thanos C., Trupiano L., Vadicamo L., Vairo C.
Annual Report of the Artificial Intelligence for Media and Humanities laboratory (AIMH) research activities in 2020.

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2020 Conference article Open Access

Cross-resolution deep features based image search
Massoli F. V., Falchi F., Gennaro C., Amato G.
Deep Learning models proved to be able to generate highly discriminative image descriptors, named deep features, suitable for similarity search tasks such as Person Re-Identification and Image Retrieval. Typically, these models are trained on high-resolution datasets, which reduces the reliability of the produced representations when low-resolution images are involved. The similarity search task becomes even more challenging in cross-resolution scenarios, i.e., when a low-resolution query image has to be matched against a database containing descriptors generated from images at different, and usually high, resolutions. To solve this issue, we proposed a deep learning-based approach by which we empowered a ResNet-like architecture to generate resolution-robust deep features. Once trained, our models were able to generate image descriptors less brittle to resolution variations, thus being useful to fulfill a similarity search task in cross-resolution scenarios. To assess their performance, we used synthetic as well as natural low-resolution images. An immediate advantage of our approach is that there is no need for Super-Resolution techniques, thus avoiding the need to synthesize queries at higher resolutions.
Source: Similarity Search and Applications, pp. 352–360, Copenhagen, Denmark, 20/09/2020, 2/10/2020
DOI: 10.1007/978-3-030-60936-8_27

See at: link.springer.com Open Access | ISTI Repository Open Access | CNR ExploRA Open Access | academic.microsoft.com Restricted | link.springer.com Restricted


2020 Conference article Open Access

KNN-guided Adversarial Attacks
Massoli F. V., Falchi F., Amato G.
In the last decade, we have witnessed a renaissance of Deep Learning models. Nowadays, they are widely used in industrial as well as scientific fields, and notably, these models have reached super-human performances on specific tasks such as image classification. Unfortunately, despite their great success, it has been shown that they are vulnerable to adversarial attacks: images to which a specific amount of noise, imperceptible to human eyes, has been added to lead the model to a wrong decision. Typically, these malicious images are forged pursuing a misclassification goal. However, when considering the task of Face Recognition (FR), this principle might not be enough to fool the system. Indeed, in the context of FR, the deep models are generally used merely as feature extractors, while the final task of recognition is accomplished, for example, by similarity measurements. Thus, crafting adversarial examples to fool the classifier might not be sufficient to fool the overall FR pipeline. Starting from this observation, we proposed to use a k-Nearest Neighbour algorithm as guidance to craft adversarial attacks against an FR system. In our study, we showed how this kind of attack could be more threatening for an FR system than misclassification-based ones, considering both the targeted and untargeted attack strategies.
Source: SEBD 2020. Italian Symposium on Advanced Database Systems, pp. 302–309, Villasimius, Sud Sardegna, Italia, 21-24/6/2020

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access