303 result(s)
2026 Conference article Open Access
ViSketch-GPT: collaborative multi-scale feature extraction for hand-drawn sketch retrieval
Federico Giulio, Carrara Fabio, Gennaro Claudio, Di Benedetto Marco
Understanding the nature of hand-drawn sketches is challenging due to the wide variation in their creation. Federico et al. [10] demonstrated that recognizing complex structural patterns enhances both sketch recognition and generation. Building on this foundation, we explore how the extracted features can also be leveraged for hand-drawn sketch retrieval. In this work, we extend ViSketch-GPT, a multi-scale context extraction model originally designed for classification and generation, to the task of retrieval. The model’s ability to capture intricate details at multiple scales allows it to learn highly discriminative representations, making it well-suited for retrieval applications. Through extensive experiments on the QuickDraw and TU-Berlin datasets, we show that ViSketch-GPT surpasses state-of-the-art methods in sketch retrieval, achieving substantial improvements across multiple evaluation metrics. Our results show that the extracted feature representations, originally designed for classification and generation, are also highly effective for retrieval tasks. This highlights ViSketch-GPT as a versatile and powerful framework for various applications in computer vision and sketch analysis.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 16134, pp. 3-13. Reykjavik, Iceland, 1–3 October 2025
DOI: 10.1007/978-3-032-06069-3_1
Project(s): Italian Strengthening of ESFRI RI RESILIENCE, SUN via OpenAIRE
See at: CNR IRIS Open Access | link.springer.com Open Access | CNR IRIS Restricted
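
At query time, the retrieval task described in this entry reduces to nearest-neighbor search over the learned embeddings. A minimal sketch of that final stage, assuming features have already been extracted by the model (the arrays below are hypothetical placeholders, not the authors' code):

    import numpy as np

    def retrieve(query_emb, gallery_embs, k=10):
        """Return indices of the k most similar gallery sketches (cosine similarity)."""
        q = query_emb / np.linalg.norm(query_emb)
        g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
        sims = g @ q                      # cosine similarity against every gallery item
        return np.argsort(-sims)[:k]      # best-first

    # Hypothetical usage: 10k gallery sketches, 512-d features from the model.
    gallery = np.random.randn(10_000, 512).astype(np.float32)
    query = np.random.randn(512).astype(np.float32)
    print(retrieve(query, gallery, k=5))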


2026 Journal article Open Access
ViSketch-GPT: a novel multi-scale and context-aware representation for sketch generation and classification
Federico Giulio, Amato Giuseppe, Carrara Fabio, Gennaro Claudio, Di Benedetto Marco
Human sketches exhibit substantial variability across individuals in terms of line style, abstraction level and drawing conventions. Unlike realistic images, they provide limited contextual information and rely on highly simplified concept representations. Recognizing and generating sketches therefore requires efficient use of the available information, identification of the most informative local features, interpretation of their meaning within a minimal context, and understanding of the spatial relationships that define the overall structure. In this study, we introduce ViSketch-GPT, a representation and model that can extract these local features, contextualize them within the sketch and encode spatial relationships, thereby enabling a deeper understanding of the sketch structure. Guided by the intuition of the void as information, we leverage Signed Distance Functions (SDF) to reveal this potentially hidden information, organizing it via quadtree decomposition and processing it with a hierarchical Transformer to capture multi-scale dependencies. This structured representation allows the model to support both high-fidelity generation and accurate classification. Experiments on the QuickDraw and TU-Berlin datasets demonstrated that the model classifies sketches with high accuracy while generating outputs that preserve structural coherence, respect part relationships, and capture essential conceptual patterns despite the scarcity of information in the original sketches.
Source: IEEE ACCESS
DOI: 10.1109/access.2026.3659732
Project(s): Italian Strengthening of ESFRI RI RESILIENCE, SUN via OpenAIRE
See at: CNR IRIS Open Access | CNR IRIS Restricted
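
The two preprocessing steps named in this abstract, an SDF to expose the "void as information" and a quadtree decomposition to organize it, can be illustrated on a toy sketch. A minimal sketch under those assumptions (the split criterion and the tokenization feeding the hierarchical Transformer are omitted; this is not the authors' implementation):

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def signed_distance(binary_sketch):
        """SDF of a binary stroke mask: negative on strokes, positive in empty space."""
        outside = distance_transform_edt(binary_sketch == 0)
        inside = distance_transform_edt(binary_sketch == 1)
        return outside - inside

    def quadtree(sdf, x, y, size, thresh, min_size, cells):
        """Recursively split cells whose SDF varies too much, collecting leaf cells."""
        patch = sdf[y:y + size, x:x + size]
        if size <= min_size or patch.std() < thresh:
            cells.append((x, y, size, patch.mean()))
            return
        h = size // 2
        for dx, dy in [(0, 0), (h, 0), (0, h), (h, h)]:
            quadtree(sdf, x + dx, y + dy, h, thresh, min_size, cells)

    sketch = np.zeros((64, 64), dtype=np.uint8)
    sketch[20, 10:50] = 1                      # a single horizontal stroke
    cells = []
    quadtree(signed_distance(sketch), 0, 0, 64, thresh=2.0, min_size=4, cells=cells)
    print(len(cells), "leaf cells")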


2026 Journal article Open Access
Decentralized edge learning: a comparative study of distillation strategies and dissimilarity measures
Molo Mbasa J., Vadicamo Lucia, Gennaro Claudio, Carlini Emanuele
Decentralized learning is emerging as a scalable and privacy-preserving alternative to centralized machine learning, particularly in distributed systems where data cannot be centrally shared among multiple nodes or clients. While Federated Learning is widely adopted in this context, Knowledge Distillation (KD) is emerging as a flexible and scalable alternative where model outputs are used to share knowledge among distributed clients. However, existing studies often overlook the efficiency and effectiveness of various knowledge transfer strategies in KD, especially in decentralized environments where data is non-IID. This study provides key insights by examining the impact of network topology and distillation strategies in KD-based decentralized learning approaches. Our evaluation spans several dissimilarity measures, including Cross-Entropy, Kullback-Leibler divergence, Triangular Divergence, Jensen-Shannon divergence, Structural Entropic Distance, and Multi-way SED, assessed under both pairwise and holistic distillation schemes. In the pairwise approach, distillation is performed by summing the client-wise dissimilarities between a client's output and each neighbor's prediction individually, while the holistic approach computes dissimilarity with respect to the average of the output predictions received from neighboring clients. We also analyze performance across client connectivity levels to explore the trade-off between convergence speed and model accuracy. The results indicate that the holistic distillation approach, which averages client predictions, outperforms the pairwise-sum approach, especially when employing alternative measures like TD, SED, and JS. These measures offer improved performance over conventional metrics such as CE and KL divergence.
Source: FUTURE GENERATION COMPUTER SYSTEMS, vol. 176
DOI: 10.1016/j.future.2025.108171
Project(s): National Centre for HPC, Big Data and Quantum Computing, Sustainable Mobility Center
See at: CNR IRIS Open Access | www.sciencedirect.com Open Access | Future Generation Computer Systems Restricted | CNR IRIS Restricted
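
The pairwise and holistic schemes compared in this entry differ only in where the aggregation happens. A minimal sketch with KL divergence standing in for the dissimilarity (the paper also evaluates CE, TD, JS, SED, and Multi-way SED; the probability vectors below are hypothetical client outputs):

    import numpy as np

    def kl(p, q, eps=1e-12):
        return float(np.sum(p * np.log((p + eps) / (q + eps))))

    def pairwise_loss(student_probs, neighbor_probs):
        """Sum of dissimilarities to each neighbor's prediction, taken individually."""
        return sum(kl(student_probs, n) for n in neighbor_probs)

    def holistic_loss(student_probs, neighbor_probs):
        """Dissimilarity to the average of the neighbors' predictions."""
        avg = np.mean(neighbor_probs, axis=0)
        return kl(student_probs, avg)

    student = np.array([0.7, 0.2, 0.1])
    neighbors = [np.array([0.6, 0.3, 0.1]), np.array([0.1, 0.8, 0.1])]
    print(pairwise_loss(student, neighbors), holistic_loss(student, neighbors))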


2025 Conference article Restricted
Towards identity-aware cross-modal retrieval: a dataset and a baseline
Messina N., Vadicamo L., Maltese L., Gennaro C.
Recent advancements in deep learning have significantly enhanced content-based retrieval methods, notably through models like CLIP that map images and texts into a shared embedding space. However, these methods often struggle with domain-specific entities and long-tail concepts absent from their training data, particularly in identifying specific individuals. In this paper, we explore the task of identity-aware cross-modal retrieval, which aims to retrieve images of persons in specific contexts based on natural language queries. This task is critical in various scenarios, such as for searching and browsing personalized video collections or large audio-visual archives maintained by national broadcasters. We introduce a novel dataset, COCO Person FaceSwap (COCO-PFS), derived from the widely used COCO dataset and enriched with deepfake-generated faces from VGGFace2. This dataset addresses the lack of large-scale datasets needed for training and evaluating models for this task. Our experiments assess the performance of different CLIP variations repurposed for this task, including our architecture, Identity-aware CLIP (Id-CLIP), which achieves competitive retrieval performance through targeted fine-tuning. Our contributions lay the groundwork for more robust cross-modal retrieval systems capable of recognizing long-tail identities and contextual nuances. Data and code are available at .
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15572, pp. 437-452. Lucca, Italy, April 6–10, 2025
DOI: 10.1007/978-3-031-88708-6_28
Project(s): Future Artificial Intelligence Research, a MUltimedia platform for Content Enrichment and Search in audiovisual archives
See at: CNR IRIS Restricted
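
One illustrative way to read the identity-aware retrieval task is as late fusion of two matching signals: caption-to-image similarity and query-identity-to-face similarity. The sketch below is a hypothetical baseline under that reading, not the Id-CLIP architecture, which instead fine-tunes CLIP itself:

    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def identity_aware_score(text_emb, img_emb, query_id_emb, img_face_emb, alpha=0.5):
        """Blend cross-modal similarity with face-identity similarity."""
        return alpha * cosine(text_emb, img_emb) + (1 - alpha) * cosine(query_id_emb, img_face_emb)

    # Hypothetical embeddings: CLIP-style text/image (512-d) and face descriptors (128-d).
    rng = np.random.default_rng(0)
    print(identity_aware_score(rng.normal(size=512), rng.normal(size=512),
                               rng.normal(size=128), rng.normal(size=128)))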


2025 Other Open Access
ISTI-day 2025 Proceedings
Del Corso G., Pedrotti A., Federico G., Gennaro C., Carrara F., Amato G., Di Benedetto M., Gabrielli E., Belli D., Matrullo Z., Miori V., Tolomei G., Waheed T., Marchetti E., Calabrò A., Rossetti G., Stella M., Cazabet R., Abramski K., Cau E., Citraro S., Failla A., Mesina V., Morini V., Pansanella V., Colantonio S., Germanese D., Pascali M. A., Bianchi L., Messina N., Falchi F., Barsellotti L., Pacini G., Cassese M., Puccetti G., Esuli A., Volpi L., Moreo A., Sebastiani F., Sperduti G., Nguyen D., Broccia G., Ter Beek M. H., Ferrari A., Massink M., Belmonte G., Ciancia V., Papini O., Canapa G., Catricalà B., Manca M., Paternò F., Santoro C., Zedda E., Gallo S., Maenza S., Mattioli A., Simeoli L., Rucci D., Carlini E., Dazzi P., Kavalionak H., Mordacchini M., Rulli C., Muntean Cristina Ioana, Nardini F. M., Perego R., Rocchietti G., Lettich F., Renso C., Pugliese C., Casini G., Haldimann J., Meyer T., Assante M., Candela L., Dell'Amico A., Frosini L., Mangiacrapa F., Oliviero A., Pagano P., Panichi G., Peccerillo B., Procaccini M., Mannocci A., Manghi P., Lonetti F., Kang D., Di Giandomenico F., Jee E., Lazzini G., Conti F., Scopigno R., D'Acunto M., Moroni D., Cafiso M., Paradisi P., Callieri M., Pavoni G., Corsini M., De Falco A., Sala F., Saraceni Q., Gattiglia G.
ISTI-Day is an annual information and networking event organized by the Institute of Information Science and Technologies "A. Faedo" (ISTI) of the Italian National Research Council (CNR). This event features an opening talk by the Director of the Dept. DIITET (Emilio F. Campana) as well as an overview of the Institute's activities presented by the ISTI Director (Roberto Scopigno). Those institutional segments are complemented by dedicated presentations and round tables featuring former staff members, as well as internal and external collaborators. To foster a network of knowledge and collaboration among newcomers, the 2025 ISTI-Day edition also includes a large poster session that provides a comprehensive overview of current research activities. Each of the 13 laboratories contributes 1–3 posters, highlighting the most innovative work and offering early-career researchers a platform for discussion. Thus, these proceedings include the posters selected for ISTI-Day 2025, reflecting the diverse and innovative nature of the Institute's research.

See at: CNR IRIS Open Access | www.isti.cnr.it Open Access | CNR IRIS Restricted


2025 Contribution to book Open Access
Adversarial magnification to deceive deepfake detection through super resolution
Coccomini D. A., Caldelli R., Amato G., Falchi F., Gennaro C.
Deepfake technology is rapidly advancing, posing significant challenges to the detection of manipulated media content. Parallel to that, some adversarial attack techniques have been developed to fool the deepfake detectors and make deepfakes even more difficult to detect. This paper explores the application of super resolution techniques as a possible adversarial attack in deepfake detection. Through our experiments, we demonstrate that minimal changes made by these methods in the visual appearance of images can have a profound impact on the performance of deepfake detection systems. We propose a novel attack using super resolution as a quick, black-box and effective method to camouflage fake images and/or generate false alarms on pristine images. Our results indicate that the usage of super resolution can significantly impair the accuracy of deepfake detectors, thereby highlighting the vulnerability of such systems to adversarial attacks. The code to reproduce our experiments is available at: https://github.com/davide-coccomini/Adversarial-Magnification-to-Deceive-Deepfake-Detection-through-Super-Resolution.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE, vol. 2134, pp. 491-501
DOI: 10.1007/978-3-031-74627-7_41
Project(s): AI4Media via OpenAIRE
See at: CNR IRIS Open Access | link.springer.com Open Access | doi.org Restricted | CNR IRIS Restricted
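
Operationally, the attack is a perceptually small resampling step applied before detection. A minimal stand-in that uses bicubic interpolation in place of a learned super-resolution network (the paper uses actual SR models; detector below is a hypothetical classifier):

    from PIL import Image

    def sr_attack(img, scale=2):
        """Magnify then restore size; pixel statistics change, content barely does."""
        w, h = img.size
        up = img.resize((w * scale, h * scale), Image.BICUBIC)
        return up.resize((w, h), Image.BICUBIC)

    # Hypothetical usage:
    # attacked = sr_attack(Image.open("suspect_frame.png").convert("RGB"))
    # score = detector(attacked)   # placeholder deepfake detector; expect a score shift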


2025 Conference article Restricted
Cross-modal distillation by additive importance measure in HITL autonomous driving
Bano S., Cassarà P., Gennaro C., Gotta A.
With the advent of Advanced Driver Assistance Systems (ADAS) and intelligent transport system applications, recognizing driver emotions has become essential for a decision support system (DSS) with humans in the loop (HITL). Multimodal approaches using visual cues, speech, physiological signals, and driving patterns improve emotion recognition but are challenging in resource-constrained environments where only a subset of modalities is available. This work addresses these challenges by combining multi-modal benefits with single-modality inference for emotion recognition using unlabeled external road condition data. Unlike traditional methods that average the teachers' contributions, the proposed cross-modal distillation (CMD) weights each teacher with the aid of Shapley additive global explanations (SAGE), which improves the student model's accuracy and provides an interpretation of it. Experimental evaluations on the PPBEmo dataset show that XA-CMD improves emotion recognition accuracy over other baselines and provides deeper insights into decision-making.
Source: IEEE VTS ... VEHICULAR TECHNOLOGY CONFERENCE, pp. 1-5. Oslo, Norway, 17–20 June 2025
DOI: 10.1109/vtc2025-spring65109.2025.11174460
See at: doi.org Restricted | CNR IRIS Restricted | ieeexplore.ieee.org Restricted
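
The departure from standard multi-teacher distillation here is that teachers are combined by importance rather than averaged uniformly. A minimal sketch under the assumption that SAGE-style importances have already been computed (the weights values are placeholders for those importances):

    import numpy as np

    def weighted_teacher_target(teacher_probs, weights):
        """Importance-weighted combination of teacher predictions."""
        w = np.asarray(weights, dtype=float)
        return np.average(teacher_probs, axis=0, weights=w / w.sum())

    def distill_loss(student_probs, teacher_probs, weights, eps=1e-12):
        """Cross-entropy of the student against the weighted teacher target."""
        target = weighted_teacher_target(teacher_probs, weights)
        return float(-np.sum(target * np.log(student_probs + eps)))

    teachers = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4]])
    print(distill_loss(np.array([0.5, 0.4, 0.1]), teachers, weights=[0.5, 0.3, 0.2]))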


2025 Conference article Open Access
Exploring strengths and weaknesses of super-resolution attack in deepfake detection
Coccomini D. A., Caldelli R., Falchi F., Gennaro C., Amato G.
Image manipulation is rapidly evolving, allowing the creation of credible content that can be used to bend reality. Although the results of deepfake detectors are promising, deepfakes can be made even more complicated to detect through adversarial attacks. They aim to further manipulate the image to camouflage deepfakes’ artifacts or to insert signals making the image appear pristine. In this paper, we further explore the potential of super-resolution attacks based on different super-resolution techniques and scales, which can impact the performance of deepfake detectors with varying intensity. We also evaluated the impact of the attack on more diverse datasets, discovering that the super-resolution process is effective in hiding the artifacts introduced by deepfake generation models but fails in hiding the traces contained in fully synthetic images. Finally, we propose some changes to the detectors’ training process to improve their robustness to this kind of attack.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15643, pp. 351-362. Milan, Italy, 29/09-04/10/2024
DOI: 10.1007/978-3-031-92648-8_21
DOI: 10.48550/arxiv.2410.04205
See at: arXiv.org e-Print Archive Open Access | doi.org Restricted | CNR IRIS Restricted | link.springer.com Restricted


2025 Journal article Open Access
Training-free sparse representations of dense vectors for scalable information retrieval
Carrara F., Vadicamo L., Amato G., Gennaro C.
In this paper, we propose and analyze Vec2Doc, a novel training-free method to transform dense vectors into sparse integer vectors, facilitating the use of inverted indexes for information retrieval (IR). The exponential growth of deep learning and artificial intelligence has revolutionized scientific problem-solving in areas such as computer vision, natural language processing, and automatic content generation. These advances have also significantly impacted IR, with a better understanding of natural language and multimodal content analysis leading to more accurate information retrieval. Despite these developments, modern IR relies primarily on the similarity evaluation of dense vectors from the latent spaces of deep neural networks. This dependence introduces substantial challenges in performing similarity searches on large collections containing billions of vectors. Traditional IR methods, which employ inverted indexes and vector space models, are adept at handling sparse vectors but do not work well with dense ones. Vec2Doc attempts to fill this gap by converting dense vectors into a format compatible with conventional inverted index techniques. Our preliminary experimental evaluations show that Vec2Doc is a promising solution to overcome the scalability problems inherent in vector-based IR, offering an alternative method for efficient and accurate large-scale information retrieval.
Source: INFORMATION SYSTEMS, vol. 133 (issue 102567)
DOI: 10.1016/j.is.2025.102567
Project(s): Empowering Knowledge Extraction to Empower Learners, National Centre for HPC, Big Data and Quantum Computing, SUN via OpenAIRE, a MUltimedia platform for Content Enrichment and Search in audiovisual archives
See at: CNR IRIS Open Access | www.sciencedirect.com Open Access | CNR IRIS Restricted
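
The core transformation, from a dense real vector to a sparse non-negative integer vector that an inverted index can treat as term counts, can be pictured as follows. This is one plausible, training-free reading of such a mapping, not necessarily the exact Vec2Doc construction:

    import numpy as np

    def dense_to_sparse_terms(v, scale=10, keep=32):
        """Map a dense vector to {term_id: count} with non-negative integer weights.

        Negative components become separate "terms" (id offset by the dimension),
        only the `keep` largest magnitudes are retained, and values are quantized
        to integer counts usable as term frequencies in an inverted index.
        """
        expanded = np.concatenate([np.maximum(v, 0), np.maximum(-v, 0)])
        top = np.argsort(-expanded)[:keep]
        counts = np.rint(expanded[top] * scale).astype(int)
        return {int(t): int(c) for t, c in zip(top, counts) if c > 0}

    vec = np.random.randn(512).astype(np.float32)
    terms = dense_to_sparse_terms(vec)
    print(len(terms), "postings, e.g.", list(terms.items())[:3])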


2025 Conference article Open Access
CA3D: Convolutional-Attentional 3D nets for efficient video activity recognition on the edge
Lagani G., Falchi F., Gennaro C., Amato G.
In this paper, we introduce a deep learning solution for video activity recognition that leverages an innovative combination of convolutional layers with a linear-complexity attention mechanism. Moreover, we introduce a novel quantization mechanism to further improve the efficiency of our model during both training and inference. Our model maintains a reduced computational cost, while preserving robust learning and generalization capabilities. Our approach addresses the issues related to the high computing requirements of current models, with the goal of achieving competitive accuracy on consumer and edge devices, enabling smart home and smart healthcare applications where efficiency and privacy issues are of concern. We experimentally validate our model on different established and publicly available video activity recognition benchmarks, improving accuracy over alternative models at a competitive computing cost.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15633, pp. 235-251. Milan, Italy, 29/09/2024
DOI: 10.1007/978-3-031-91979-4_18
DOI: 10.48550/arxiv.2505.19928
Project(s): AI4Media via OpenAIRE, SUN via OpenAIRE
See at: arXiv.org e-Print Archive Open Access | CNR IRIS Open Access | link.springer.com Open Access | doi.org Restricted | CNR IRIS Restricted
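
A common way to obtain the linear-complexity attention mentioned in this abstract is to replace the softmax with a kernel feature map so that keys and values are aggregated once, in O(n). The sketch below shows that generic construction (in the style of Katharopoulos et al.); it is not necessarily the exact mechanism used in CA3D:

    import torch

    def linear_attention(q, k, v):
        """O(n) attention: phi(q) @ (phi(k)^T v), with phi(x) = elu(x) + 1."""
        phi = lambda x: torch.nn.functional.elu(x) + 1
        q, k = phi(q), phi(k)
        kv = torch.einsum("bnd,bne->bde", k, v)              # aggregate keys/values once
        z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + 1e-6)
        return torch.einsum("bnd,bde,bn->bne", q, kv, z)     # normalized outputs

    q = k = v = torch.randn(2, 196, 64)        # 2 clips, 196 tokens, 64-d heads
    print(linear_attention(q, k, v).shape)     # torch.Size([2, 196, 64])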


2024 Conference article Open Access
Information dissimilarity measures in decentralized knowledge distillation: a comparative analysis
Molo M. B., Vadicamo L., Carlini E., Gennaro C., Connor R.
Knowledge distillation (KD) is a key technique for transferring knowledge from a large, complex “teacher” model to a smaller, more efficient “student” model. Although initially developed for model compression, it has found applications across various domains due to the benefits of its knowledge transfer mechanism. While Cross Entropy (CE) and Kullback-Leibler (KL) are commonly used in KD, this work investigates the applicability of loss functions based on underexplored information dissimilarity measures, such as Triangular Divergence (TD), Structural Entropic Distance (SED), and Jensen-Shannon Divergence (JS), for both independent and identically distributed (iid) and non-iid data distributions. The primary contributions of this study include an empirical evaluation of these dissimilarity measures within a decentralized learning context, i.e., where independent clients collaborate without a central server coordinating the learning process. Additionally, the paper assesses the performance of clients by comparing pairwise distillation averaging among clients to conventional peer-to-peer pairwise distillation. Results indicate that while dissimilarity measures perform comparably in iid settings, non-iid distributions favor SED and JS, which also demonstrated consistent performance across clients.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15268, pp. 140-154. Providence, USA, 4-6/11/2024
DOI: 10.1007/978-3-031-75823-2_12
Project(s): National Centre for HPC, Big Data and Quantum Computing, SUN via OpenAIRE
See at: IRIS Cnr Open Access | doi.org Restricted | CNR IRIS Restricted
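
Two of the less familiar measures compared in this entry have compact closed forms for discrete probability vectors. A minimal sketch of Jensen-Shannon and Triangular Divergence (SED is omitted here rather than risk misstating its entropy-based formulation):

    import numpy as np

    def kl(p, q, eps=1e-12):
        return float(np.sum(p * np.log((p + eps) / (q + eps))))

    def jensen_shannon(p, q):
        """JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m the midpoint."""
        m = 0.5 * (p + q)
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def triangular(p, q, eps=1e-12):
        """Triangular discrimination: sum of (p_i - q_i)^2 / (p_i + q_i)."""
        return float(np.sum((p - q) ** 2 / (p + q + eps)))

    p, q = np.array([0.7, 0.2, 0.1]), np.array([0.3, 0.4, 0.3])
    print(jensen_shannon(p, q), triangular(p, q))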


2024 Conference article Open Access
Teacher-student models for AI vision at the edge: a car parking case study
Molo M. J., Carlini E., Ciampi L., Gennaro C., Vadicamo L.
The surge of the Internet of Things has sparked a multitude of deep learning-based computer vision applications that extract relevant information from the deluge of data coming from Edge devices, such as smart cameras. Nevertheless, this promising approach introduces new obstacles, including the constraints posed by the limited computational resources on these devices and the challenges associated with the generalization capabilities of the AI-based models against novel scenarios never seen during the supervised training, a situation frequently encountered in this context. This work proposes an efficient approach for detecting vehicles in parking lot scenarios monitored by multiple smart cameras that train their underlying AI-based models by exploiting knowledge distillation. Specifically, we consider an architectural scheme comprising a powerful and large detector used as a teacher and several shallow models acting as students, more appropriate for computationally bounded devices and designed to run onboard the smart cameras. The teacher is pre-trained over general-context data and behaves like an oracle, transferring its knowledge to the smaller nodes; on the other hand, the students learn to localize cars in new specific scenarios without using further labeled data, relying solely on the distilled loss coming from the oracle. Preliminary results show that student models trained only with distillation loss increase their performance, sometimes even outperforming the results achieved by the same models supervised with the ground truth.
DOI: 10.5220/0012376900003660
Project(s): AI4Media via OpenAIRE, National Centre for HPC, Big Data and Quantum Computing, Sustainable Mobility Center
See at: CNR IRIS Open Access | www.scitepress.org Open Access | CNR IRIS Restricted


2024 Conference article Open Access
Will VISIONE remain competitive in lifelog image search?
Amato G., Bolettieri P., Carrara F., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
VISIONE is a versatile video retrieval system supporting diverse search functionalities, including free-text, similarity, and temporal searches. Its recent success in securing first place in the 2024 Video Browser Showdown (VBS) highlights its effectiveness. Originally designed for analyzing, indexing, and searching diverse video content, VISIONE can also be adapted to images from lifelog cameras thanks to its reliance on frame-based representations and retrieval mechanisms. In this paper, we present an overview of VISIONE's core characteristics and the adjustments made to accommodate lifelog images. These adjustments primarily focus on enhancing result visualization within the GUI, such as grouping images by date or hour to align with lifelog dataset imagery. It's important to note that while the GUI has been updated, the core search engine and visual content analysis components remain unchanged from the version presented at VBS 2024. Specifically, metadata such as local time, GPS coordinates, and concepts associated with images are not indexed or utilized in the system. Instead, the system relies solely on the visual content of the images, with date and time information extracted from their filenames, which are utilized exclusively within the GUI for visualization purposes. Our objective is to evaluate the system's performance within the Lifelog Search Challenge, emphasizing reliance on visual content analysis without additional metadata.
DOI: 10.1145/3643489.3661122
Project(s): AI4Media via OpenAIRE
See at: IRIS Cnr Open Access | doi.org Restricted | CNR IRIS Restricted
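
Since date and time are extracted from filenames and used only for visualization, the GUI grouping reduces to parsing and bucketing. A minimal sketch of that step (the filename pattern below is hypothetical; lifelog datasets vary):

    from collections import defaultdict
    from datetime import datetime

    def group_by_hour(filenames, pattern="%Y%m%d_%H%M%S"):
        """Bucket lifelog images by (date, hour) parsed from their filenames."""
        groups = defaultdict(list)
        for name in filenames:
            stamp = name.rsplit(".", 1)[0]          # strip the extension
            t = datetime.strptime(stamp, pattern)
            groups[(t.date(), t.hour)].append(name)
        return groups

    files = ["20240101_091502.jpg", "20240101_093010.jpg", "20240101_101500.jpg"]
    for key, names in group_by_hour(files).items():
        print(key, names)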


2024 Journal article Open Access
Detecting images generated by diffusers
Coccomini D. A., Esuli A., Falchi F., Gennaro C., Amato G.
In recent years, the field of artificial intelligence has witnessed a remarkable surge in the generation of synthetic images, driven by advancements in deep learning techniques. These synthetic images, often created through complex algorithms, closely mimic real photographs, blurring the lines between reality and artificiality. This proliferation of synthetic visuals presents a pressing challenge: how to accurately and reliably distinguish between genuine and generated images. This article, in particular, explores the task of detecting images generated by text-to-image diffusion models, highlighting the challenges and peculiarities of this field. To evaluate this, we consider images generated from captions in the MSCOCO and Wikimedia datasets using two state-of-the-art models: Stable Diffusion and GLIDE. Our experiments show that it is possible to detect the generated images using simple multi-layer perceptrons (MLPs), starting from features extracted by CLIP or RoBERTa, or using traditional convolutional neural networks (CNNs). The latter models achieve remarkable performance, in particular when pretrained on large datasets. We also observe that models trained on images generated by Stable Diffusion can occasionally detect images generated by GLIDE, but only on the MSCOCO dataset. However, the reverse is not true. Lastly, we find that incorporating the associated textual information with the images in some cases can lead to a better generalization capability, especially if textual features are closely related to visual ones. We also discovered that the type of subject depicted in the image can significantly impact performance. This work provides insights into the feasibility of detecting generated images and has implications for security and privacy concerns in real-world applications.
Source: PEERJ COMPUTER SCIENCE, vol. 10
DOI: 10.7717/peerj-cs.2127
Project(s): AI4Media via OpenAIRE
See at: CNR IRIS Open Access | peerj.com Open Access | CNR IRIS Restricted
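
The simplest detector evaluated in this entry is an MLP over frozen CLIP features. A minimal sketch of such a classifier head in PyTorch (feature extraction is stubbed with random tensors, and the 512-dimensional feature size is an assumption):

    import torch
    import torch.nn as nn

    # Binary real-vs-generated head on top of frozen 512-d image features.
    mlp = nn.Sequential(
        nn.Linear(512, 256), nn.ReLU(),
        nn.Linear(256, 1),            # logit: > 0 means "generated"
    )

    feats = torch.randn(8, 512)       # stand-in for CLIP embeddings of 8 images
    labels = torch.randint(0, 2, (8, 1)).float()
    loss = nn.BCEWithLogitsLoss()(mlp(feats), labels)
    loss.backward()                   # one training step's gradient
    print(float(loss))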


2024 Conference article Open Access
Visione 5.0: toward evaluation with novice users
Amato G., Bolettieri P., Carrara F., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
VISIONE is a video search system that integrates multiple search functionalities, allowing users to search for video segments using textual and visual queries, complemented by temporal search capabilities. It exploits state-of-the-art Artificial Intelligence approaches for visual content analysis and highly efficient indexing techniques to ensure fast response and scalability. In the recently concluded Video Browser Showdown (VBS2024) - a well-established international competition in interactive video retrieval - VISIONE ranked first and scored as the best interactive video search system in four out of seven tasks carried out in the competition. This paper provides an overview of the VISIONE system, emphasizing the improvements made to the system in the last year to improve its usability for novice users. A demonstration video showcasing the system's capabilities across 2,300 hours of diverse video content is available online, as well as a simplified demo of VISIONE.
DOI: 10.1109/cbmi62980.2024.10859203
Project(s): AI4Media via OpenAIRE, National Centre for HPC, Big Data and Quantum Computing, a MUltimedia platform for Content Enrichment and Search in audiovisual archives
See at: CNR IRIS Open Access | ieeexplore.ieee.org Open Access | CNR IRIS Restricted


2024 Conference article Open Access
The devil is in the fine-grained details: evaluating open-vocabulary object detectors for fine-grained understanding
Bianchi L., Carrara F., Messina N., Gennaro C., Falchi F.
Recent advancements in large vision-language models enabled visual object detection in open-vocabulary scenarios, where object classes are defined in free-text formats during inference. In this paper, we aim to probe the state-of-the-art methods for open-vocabulary object detection to determine to what extent they understand fine-grained properties of objects and their parts. To this end, we introduce an evaluation protocol based on dynamic vocabulary generation to test whether models detect, discern, and assign the correct fine-grained description to objects in the presence of hard-negative classes. We contribute a benchmark suite of increasing difficulty, probing different properties like color, pattern, and material. We further enhance our investigation by evaluating several state-of-the-art open-vocabulary object detectors using the proposed protocol and find that most existing solutions, which shine in standard open-vocabulary benchmarks, struggle to accurately capture and distinguish finer object details. We conclude the paper by highlighting the limitations of current methodologies and exploring promising research directions to overcome the discovered drawbacks. Data and code are available at https://lorebianchi98.github.io/FG-OVD/.
Source: PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, pp. 22520-22529. Seattle (USA), 17-21/06/2024
DOI: 10.1109/cvpr52733.2024.02125
DOI: 10.48550/arxiv.2311.17518
Project(s): SUN via OpenAIRE, a MUltimedia platform for Content Enrichment and Search in audiovisual archives
See at: arXiv.org e-Print Archive Open Access | IRIS Cnr Open Access | ieeexplore.ieee.org Open Access | doi.org Restricted | CNR IRIS Restricted
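
The dynamic vocabulary protocol can be pictured as attribute substitution: keep the object, swap one fine-grained property to mint hard negatives. A toy generator under that reading (the attribute pool below is made up, not the benchmark's):

    COLORS = ["red", "blue", "green", "striped"]

    def hard_negatives(description, attribute, pool=COLORS):
        """Mint hard-negative class names by swapping one attribute in the positive."""
        return [description.replace(attribute, alt) for alt in pool if alt != attribute]

    positive = "a red wooden chair"
    vocabulary = [positive] + hard_negatives(positive, "red")
    print(vocabulary)
    # ['a red wooden chair', 'a blue wooden chair',
    #  'a green wooden chair', 'a striped wooden chair']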


2024 Other Open Access
AIMH Research Activities 2024
Aloia N., Amato G., Bartalesi Lenzi V., Bianchi L., Bolettieri P., Bosio C., Carraglia M., Carrara F., Casarosa V., Cassese M., Ciampi L., Coccomini D. A., Concordia C., Connor R., Corbara S., De Martino C., Di Benedetto M., Esuli A., Falchi F., Fazzari E., Gennaro C., Iannello L., Negi K., Lagani G., Lenzi E., Leocata M., Malvaldi M., Meghini C., Messina N., Moreo Fernandez A., Nardi A., Pacini G., Pedrotti A., Pratelli N., Puccetti G., Rabitti F., Savino P., Scotti F., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C., Versienti L., Volpi L.
The AIMH (Artificial Intelligence for Media and Humanities) laboratory is committed to advancing the field of Artificial Intelligence, with a special emphasis on its applications in digital media and the humanities. The lab aims to improve AI technologies, particularly in areas such as deep learning, text analysis, computer vision, multimedia information retrieval, content analysis, recognition, and retrieval. This report summarizes the laboratory’s achievements and activities over the course of 2024.
DOI: 10.32079/isti-ar-2024/001
See at: CNR IRIS Open Access | CNR IRIS Restricted


2024 Journal article Open Access
In the wild video violence detection: an unsupervised domain adaptation approach
Ciampi L., Santiago C., Falchi F., Gennaro C., Amato G.
This work addresses the challenge of video violence detection in data-scarce scenarios, focusing on bridging the domain gap that often hinders the performance of deep learning models when applied to unseen domains. We present a novel unsupervised domain adaptation (UDA) scheme designed to effectively mitigate this gap by combining supervised learning in the train (source) domain with unlabeled test (target) data. We employ single-image classification and multiple instance learning (MIL) to select frames with the highest classification scores, and, upon this, we exploit UDA techniques to adapt the model to unlabeled target domains. We perform an extensive experimental evaluation, using general-context data as the source domain and target domain datasets collected in specific environments, such as violent/non-violent actions in hockey matches and public transport. The results demonstrate that our UDA pipeline substantially enhances model performances, improving their generalization capabilities in novel scenarios without requiring additional labeled data.
Source: SN COMPUTER SCIENCE, vol. 5 (issue 7)
DOI: 10.1007/s42979-024-03126-3
Project(s): "FAIR - Future Artificial Intelligence Research" - Spoke 1 "Human-centered AI", AI4Media via OpenAIRE, SUN via OpenAIRE
See at: CNR IRIS Open Access | link.springer.com Open Access | CNR IRIS Restricted
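
The MIL step described here treats each video as a bag of frames and keeps only the highest-scoring ones for adaptation. A minimal sketch of that selection (the per-frame scores would come from the single-image classifier; here they are random placeholders):

    import numpy as np

    def select_top_frames(frame_scores, k=4):
        """Pick the k frames with the highest classification scores from one video (bag)."""
        idx = np.argsort(-frame_scores)[:k]
        return idx, frame_scores[idx]

    scores = np.random.rand(120)          # stand-in per-frame scores for one clip
    idx, top = select_top_frames(scores)
    print(idx, top)                       # these frames feed the UDA adaptation step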


2024 Conference article Open Access
Robustness and generalization of synthetic images detectors
Coccomini D. A., Caldelli R., Gennaro C., Fiameni G., Amato G., Falchi F.
In recent times, the increasing spread of synthetic media, known as deepfakes, has been made possible by the rapid progress in artificial intelligence technologies, especially deep learning algorithms. Growing worries about the increasing availability and believability of deepfakes have spurred researchers to concentrate on developing methods to detect them. In this field, researchers at ISTI CNR’s AIMH Lab, in collaboration with researchers from other organizations, have conducted research, investigations, and projects to contribute to combating this trend, exploring new solutions and threats. This article summarizes the most recent efforts made in this area by our researchers and in collaboration with other institutions and experts.

See at: ceur-ws.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2024 Journal article Open Access
Scalable bio-inspired training of Deep Neural Networks with FastHebb
Lagani G., Falchi F., Gennaro C., Fassold H., Amato G.
Recent work on sample efficient training of Deep Neural Networks (DNNs) proposed a semi-supervised methodology based on biologically inspired Hebbian learning, combined with traditional backprop-based training. Promising results were achieved on various computer vision benchmarks, in scenarios of scarce labeled data availability. However, current Hebbian learning solutions can hardly address large-scale scenarios due to their demanding computational cost. In order to tackle this limitation, in this contribution, we investigate a novel solution, named FastHebb (FH), based on the reformulation of Hebbian learning rules in terms of matrix multiplications, which can be executed more efficiently on GPU. Starting from Soft-Winner-Takes-All (SWTA) and Hebbian Principal Component Analysis (HPCA) learning rules, we formulate their improved FH versions: SWTA-FH and HPCA-FH. We experimentally show that the proposed approach accelerates training speed up to 70 times, allowing us to gracefully scale Hebbian learning experiments on large datasets and network architectures such as ImageNet and VGG.
Source: NEUROCOMPUTING, vol. 595
DOI: 10.1016/j.neucom.2024.127867
See at: CNR IRIS Open Access | www.sciencedirect.com Open Access | CNR IRIS Restricted
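
The reformulation behind the reported speedup, writing a batched Hebbian update as matrix multiplications, can be sketched for a generic soft-winner-takes-all rule. This is an illustrative instar-style update, not the paper's exact SWTA-FH/HPCA-FH formulations:

    import numpy as np

    def swta_hebbian_step(W, X, lr=0.01, temp=0.1):
        """Batched soft-WTA Hebbian update as two matrix multiplications.

        Per sample the instar rule is dw_i = y_i * (x - w_i); summed over a batch
        this collapses to Y^T X - diag(sum_b Y) W, which maps well to GPU matmuls.
        """
        S = X @ W.T / temp                                   # similarities (batch, units)
        Y = np.exp(S - S.max(axis=1, keepdims=True))
        Y /= Y.sum(axis=1, keepdims=True)                    # soft winner activations
        dW = Y.T @ X - Y.sum(axis=0)[:, None] * W
        return W + lr * dW

    W = np.random.randn(16, 64) * 0.1                        # 16 units, 64-d inputs
    X = np.random.randn(32, 64)                              # batch of 32 samples
    W = swta_hebbian_step(W, X)
    print(W.shape)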