2024
Conference article
Open Access
Beyond human imagination: the art of creating prompt-driven 3D scenes with Generative AI
Federico G., Carrara F., Amato G., Di Benedetto M.
Reconstructing large-scale outdoor environments is essential for advancing XR applications but is hindered by the high cost and limitations of traditional methods such as LiDAR, depth sensors, and photogrammetry. We propose generative neural architectures to address these issues. Our initial Spatio-Temporal Diffusion model combines temporal image sequences and coarse spatial data with a novel SDF_MIP representation for efficient training. Building on this, we introduce Neural-Clipmap, a scalable framework that uses an enhanced octree structure and Triplane representations to refine 3D reconstructions iteratively. Additionally, we leverage monocular RGB image sequences with 2D diffusion priors via Score Distillation Sampling (SDS) to reconstruct missing data, addressing challenges such as initialization coherence and color accuracy through a multi-phase inpainting process. These approaches reduce resource requirements while enabling efficient, high-quality reconstructions.
Source: VTT TECHNOLOGY, vol. 432, pp. 201-206. Athens, Greece, 27-29 November 2024
Project(s): Italian Strengthening of ESFRI RI RESILIENCE, SUN
See at:
cris.vtt.fi
| CNR IRIS
2024
Conference article
Open Access
Spatio-temporal 3D reconstruction from frame sequences and feature points
Federico G., Carrara F., Amato G., Di Benedetto M.
Reconstructing a large real environment is a fundamental task to promote eXtended Reality adoption in industrial and entertainment fields. However, the short range of depth cameras, the sparsity of LiDAR sensors, and the huge computational cost of Structure-from-Motion pipelines prevent scene replication in near real time. To overcome these limitations, we introduce a spatio-temporal diffusion neural architecture, a generative AI technique that fuses temporal information (i.e., a short temporally ordered list of color photographs, like sparse frames of a video stream) with an approximate spatial resemblance of the explored environment. Our aim is to modify an existing 3D diffusion neural model to produce a Signed Distance Field volume from which a 3D mesh representation can be extracted. Our results show that the hallucination approach of diffusion models is an effective methodology where fast reconstruction is a crucial target.
DOI: 10.1145/3672406.3672415
Project(s): Italian Strengthening of ESFRI RI RESILIENCE, SUN
See at:
dl.acm.org
| CNR IRIS
2023
Conference article
Open Access
Social and hUman ceNtered XR
Vairo C, Callieri M, Carrara F, Cignoni P, Di Benedetto M, Gennaro C, Giorgi D, Palma G, Vadicamo L, Amato G
The Social and hUman ceNtered XR (SUN) project is focused on developing eXtended Reality (XR) solutions that integrate the physical and virtual world in a way that is convincing from a human and social perspective. In this paper, we outline the limitations that the SUN project aims to overcome, including the lack of scalable and cost-effective solutions for developing XR applications, limited solutions for mixing the virtual and physical environment, and barriers related to resource limitations of end-user devices. We also propose solutions to these limitations, including using artificial intelligence, computer vision, and sensor analysis to incrementally learn the visual and physical properties of real objects and generate convincing digital twins in the virtual environment. Additionally, the SUN project aims to provide wearable sensors and haptic interfaces to enhance natural interaction with the virtual environment, along with advanced solutions for user interaction. Finally, we describe three real-life scenarios in which we aim to demonstrate the proposed solutions.
Source: CEUR WORKSHOP PROCEEDINGS. Pisa, Italy, 29-31/05/2023
See at:
ceur-ws.org
| CNR IRIS
| ISTI Repository
2023
Conference article
Open Access
AIMH Lab 2022 activities for Healthcare
Carrara F, Ciampi L, Di Benedetto M, Falchi F, Gennaro C, Amato G
The application of Artificial Intelligence technologies in healthcare can enhance and optimize medical diagnosis, treatment, and patient care. Medical imaging, which involves Computer Vision to interpret and understand visual data, is one area of healthcare that shows great promise for AI, and it can lead to faster and more accurate diagnoses, such as detecting early signs of cancer or identifying abnormalities in the brain. This short paper provides an introduction to some of the activities of the Artificial Intelligence for Media and Humanities Laboratory of the ISTI-CNR that integrate AI and medical image analysis in healthcare. Specifically, the paper presents approaches that utilize 3D medical images to detect the behavioral variant of frontotemporal dementia, a neurodegenerative syndrome that can be diagnosed by analyzing brain scans. Furthermore, it illustrates some Deep Learning-based techniques for localizing and counting biological structures in microscopy images, such as cells and perineuronal nets. Lastly, the paper presents a practical and cost-effective AI-based tool for multi-species pupillometry (mice and humans), which has been validated in various scenarios.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 128-133. Pisa, Italy, 29-31/05/2023
See at:
ceur-ws.org
| CNR IRIS
| ISTI Repository
2023
Conference article
Open Access
AIMH Lab 2022 activities for Vision
Ciampi L, Amato G, Bolettieri P, Carrara F, Di Benedetto M, Falchi F, Gennaro C, Messina N, Vadicamo L, Vairo C
The explosion of smartphones and cameras has led to a vast production of multimedia data. Consequently, Artificial Intelligence-based tools for automatically understanding and exploring these data have recently gained much attention. In this short paper, we report some activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR, tackling some challenges in the field of Computer Vision for the automatic understanding of visual data and for novel interactive tools aimed at multimedia data exploration. Specifically, we provide innovative solutions based on Deep Learning techniques carrying out typical vision tasks such as object detection and visual counting, with particular emphasis on scenarios characterized by scarcity of the labeled data needed for supervised training and on environments with limited power resources imposing miniaturization of the models. Furthermore, we describe VISIONE, our large-scale video search system designed to search extensive multimedia databases in an interactive and user-friendly manner.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 538-543. Pisa, Italy, 29-31/05/2023
Project(s): AI4Media, Future Artificial Intelligence Research
See at:
ceur-ws.org
| CNR IRIS
| ISTI Repository
2023
Other
Open Access
AIMH Research Activities 2023
Aloia N., Amato G., Bartalesi Lenzi V., Bianchi L., Bolettieri P., Bosio C., Carraglia M., Carrara F., Casarosa V., Ciampi L., Coccomini D. A., Concordia C., Corbara S., De Martino C., Di Benedetto M., Esuli A., Falchi F., Fazzari E., Gennaro C., Lagani G., Lenzi E., Meghini C., Messina N., Molinari A., Moreo Fernandez A., Nardi A., Pedrotti A., Pratelli N., Puccetti G., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C., Versienti L.
The AIMH (Artificial Intelligence for Media and Humanities) laboratory is dedicated to exploring and pushing the boundaries in the field of Artificial Intelligence, with a particular focus on its application in digital media and humanities. The lab's objective is to enhance the current state of AI technology, particularly in deep learning, text analysis, computer vision, multimedia information retrieval, multimedia content analysis, recognition, and retrieval. This report encapsulates the laboratory's progress and activities throughout the year 2023.
DOI: 10.32079/isti-ar-2023/001
See at:
CNR IRIS
| ISTI Repository
2022
Conference article
Open Access
AIMH Lab for the Industry
Carrara F, Ciampi L, Di Benedetto M, Falchi F, Gennaro C, Massoli Fv, Amato G
In this short paper, we report the activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR related to Industry. The massive digitalization affecting all the stages of product design, production, and control calls for data-driven algorithms helping in the coordination of humans, machines, and digital resources in Industry 4.0. In this context, we developed AI-based Computer Vision technologies of general interest in the emergent digital paradigm of the fourth industrial revolution, focusing on anomaly detection and object counting for computer-assisted testing and quality control. Moreover, in the automotive sector, we explore the use of virtual worlds to develop AI systems in otherwise practically unfeasible scenarios, showing an application for accident avoidance in self-driving car AI agents.
See at:
CNR IRIS
| ISTI Repository
| www.ital-ia2022.it
2022
Conference article
Open Access
AIMH Lab: Smart Cameras for Public Administration
Ciampi L, Cafarelli D, Carrara F, Di Benedetto M, Falchi F, Gennaro C, Massoli Fv, Messina N, Amato G
In this short paper, we report the activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR related to Public Administration. In particular, we present some AI-based public services for citizens that help achieve common goals beneficial to society, putting humans at the center. Through the automatic analysis of images gathered from city cameras, we provide AI applications ranging from smart parking and smart mobility to human activity monitoring.
See at:
CNR IRIS
| ISTI Repository
| www.ital-ia2022.it
2022
Journal article
Open Access
An embedded toolset for human activity monitoring in critical environments
Di Benedetto M, Carrara F, Ciampi L, Falchi F, Gennaro C, Amato G
In many working and recreational activities, there are scenarios where both individual and collective safety have to be constantly checked and properly signaled, as occurs in dangerous workplaces or during pandemic events like the recent COVID-19 disease. From wearing personal protective equipment to filling physical spaces with an adequate number of people, it is clear that a possibly automatic solution would help to check compliance with the established rules. Based on compact, low-cost, off-the-shelf hardware, we present a deployed real use-case embedded system capable of perceiving people's behavior and aggregations and supervising the application of a set of rules through a configurable plug-in framework. Working in indoor and outdoor environments, we show that our implementation of counting people aggregations, measuring their reciprocal physical distances, and checking the proper usage of protective equipment is an effective yet open framework for monitoring human activities in critical conditions.
Source: EXPERT SYSTEMS WITH APPLICATIONS, vol. 199
DOI: 10.1016/j.eswa.2022.117125
Project(s): AI4EU, AI4Media
See at:
CNR IRIS
| ISTI Repository
2022
Other
Open Access
AI and computer vision for smart cities
Amato G, Carrara F, Ciampi L, Di Benedetto M, Gennaro C, Falchi F, Messina N, Vairo C
Artificial Intelligence (AI) is increasingly employed to develop public services that make life easier for citizens. In this abstract, we present some research topics and applications carried out by the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR of Pisa concerning the study and development of AI-based services for Smart Cities dedicated to interaction with the physical world through the analysis of images gathered from city cameras. Like no other sensing mechanism, networks of city cameras can 'observe' the world and simultaneously provide visual data to AI systems to extract relevant information and make or suggest decisions, helping to solve many real-world problems. Specifically, we discuss some solutions in the context of smart mobility, parking monitoring, infrastructure management, and surveillance systems.
Project(s): AI4Media
See at:
CNR IRIS
| icities2022.unicam.it
| ISTI Repository
2022
Other
Open Access
CrowdVisor: an embedded toolset for human activity monitoring in critical environments
Di Benedetto M, Carrara F, Ciampi L, Falchi F, Gennaro C, Amato G
As evidenced during the recent COVID-19 pandemic, there are scenarios in which ensuring compliance with a set of guidelines (such as wearing medical masks and keeping a certain physical distance among people) becomes crucial to secure a safe living environment. However, human supervision cannot always guarantee this, especially in crowded scenes. This abstract presents CrowdVisor, an embedded, modular, Computer Vision-based and AI-assisted system that can carry out several tasks to help monitor individual and collective human safety rules. We strive for a real-time but low-cost system, thus complying with the limited compute and storage resources typical of off-the-shelf embedded devices, where images are captured and processed directly onboard. Our solution consists of multiple modules relying on well-researched neural network components, each responsible for specific functionalities that the user can easily enable and configure. In particular, by exploiting one of these modules or combining several of them, our framework makes many capabilities available. They range from estimating the so-called social distance to estimating the number of people present in the monitored scene, as well as localizing and classifying Personal Protective Equipment (PPE) worn by people (such as helmets and face masks). To validate our solution, we test all the functionalities that our framework makes available on two novel datasets that we collected and annotated on purpose. Experiments show that our system provides a valuable asset for automatically monitoring compliance with safety rules.
Project(s): AI4EU, AI4Media
See at:
CNR IRIS
| icities2022.unicam.it
| ISTI Repository
2022
Journal article
Open Access
Deep networks for behavioral variant frontotemporal dementia identification from multiple acquisition sources
Di Benedetto M, Carrara F, Tafuri B, Nigro S, De Blasi R, Falchi F, Gennaro C, Gigli G, Logroscino G, Amato G
Behavioral variant frontotemporal dementia (bvFTD) is a neurodegenerative syndrome whose clinical diagnosis remains a challenging task, especially in the early stage of the disease. Currently, the presence of frontal and anterior temporal lobe atrophies on magnetic resonance imaging (MRI) is part of the diagnostic criteria for bvFTD. However, MRI data processing is usually dependent on the acquisition device and mostly requires human-assisted crafting of feature extraction. Following the impressive improvements of deep architectures, in this study we report on bvFTD identification using various classes of artificial neural networks, and present the results we achieved on classification accuracy and obliviousness to acquisition devices using extensive hyperparameter search. In particular, we demonstrate the stability and generalization of different deep networks based on the attention mechanism, where data intra-mixing confers on models the ability to identify the disorder even in inter-device settings, i.e., on data produced by different acquisition devices and without model fine-tuning, as shown by very encouraging performance evaluations that reach and exceed 90% on the AuROC and balanced accuracy metrics.
Source: COMPUTERS IN BIOLOGY AND MEDICINE, vol. 148
DOI: 10.1016/j.compbiomed.2022.105937
Project(s): AI4Media
See at:
CNR IRIS
| ISTI Repository
| www.sciencedirect.com
2022
Other
Open Access
AIMH research activities 2022
Aloia N., Amato G., Bartalesi Lenzi V., Benedetti F., Bolettieri P., Cafarelli D., Carrara F., Casarosa V., Ciampi L., Coccomini D. A., Concordia C., Corbara S., Di Benedetto M., Esuli A., Falchi F., Gennaro C., Lagani G., Lenzi E., Meghini C., Messina N., Metilli D., Molinari A., Moreo Fernandez A. D., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C.
The Artificial Intelligence for Media and Humanities laboratory (AIMH) has the mission to investigate and advance the state of the art in the Artificial Intelligence field, specifically addressing applications to digital media and digital humanities, while also taking into account issues related to scalability. This report summarizes the 2022 activities of the research group.
DOI: 10.32079/isti-ar-2022/002
See at:
CNR IRIS
| ISTI Repository
2021
Other
Open Access
AIMH research activities 2021
Aloia N., Amato G., Bartalesi Lenzi V., Benedetti F., Bolettieri P., Cafarelli D., Carrara F., Casarosa V., Coccomini D., Ciampi L., Concordia C., Corbara S., Di Benedetto M., Esuli A., Falchi F., Gennaro C., Lagani G., Massoli F. V., Meghini C., Messina N., Metilli D., Molinari A., Moreo Fernandez A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C.
The Artificial Intelligence for Media and Humanities laboratory (AIMH) has the mission to investigate and advance the state of the art in the Artificial Intelligence field, specifically addressing applications to digital media and digital humanities, while also taking into account issues related to scalability. This report summarizes the 2021 activities of the research group.
DOI: 10.32079/isti-ar-2021/003
See at:
CNR IRIS
| ISTI Repository
2020
Journal article
Open Access
Learning accurate personal protective equipment detection from virtual worlds
Di Benedetto M., Carrara F., Meloni E., Amato G., Falchi F., Gennaro C.
Deep learning has achieved impressive results in many machine learning tasks such as image recognition and computer vision. Its applicability to supervised problems is, however, constrained by the availability of high-quality training data consisting of large numbers of human-annotated examples (e.g., millions). To overcome this problem, the AI world has recently been exploiting artificially generated images or video sequences, produced with realistic photo-rendering engines such as those used in entertainment applications. In this way, large sets of training images can be easily created to train deep learning algorithms. In this paper, we generated photo-realistic synthetic image sets to train deep learning models to recognize the correct use of personal safety equipment (e.g., worker safety helmets, high-visibility vests, ear protection devices) during at-risk work activities. Then, we performed domain adaptation to real-world images using a very small set of real photographs. We demonstrated that training with the generated synthetic set, combined with the domain adaptation phase, is an effective solution for applications where no training set is available.
Source: MULTIMEDIA TOOLS AND APPLICATIONS
DOI: 10.1007/s11042-020-09597-9
Project(s): AI4EU
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
| Multimedia Tools and Applications
2019
Conference article
Open Access
Learning Safety Equipment Detection using Virtual Worlds
Di Benedetto M, Meloni E, Amato G, Falchi F, Gennaro C
Nowadays, the possibilities offered by state-of-the-art deep neural networks allow the creation of systems capable of recognizing and indexing visual content with very high accuracy. The performance of these systems relies on the availability of high-quality training sets containing a large number of examples (e.g., millions), in addition to the machine learning tools themselves. For several applications, very good training sets can be obtained, for example, by crawling (noisily) annotated images from the internet, or by analyzing user interaction (e.g., on social networks). However, there are several applications for which high-quality training sets are not easy to obtain or create. Consider, as an example, a security scenario where one wants to automatically detect rarely occurring threatening events. In this respect, researchers have recently investigated the possibility of using a visual virtual environment, capable of artificially generating controllable and photo-realistic content, to create training sets for applications with few available training images. We explored this idea to generate synthetic photo-realistic training sets to train classifiers to recognize the proper use of individual safety equipment (e.g., worker protection helmets, high-visibility vests, ear protection devices) during risky human activities. Then, we performed domain adaptation to real images by using a very small data set of real-world photographs. We show that training with the generated synthetic training set and using the domain adaptation step is an effective solution for applications for which no training sets exist.
Source: PROCEEDINGS INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA, vol. 2019-September. Dublin, Ireland, 4-6/9/2019
DOI: 10.1109/cbmi.2019.8877466
Project(s): AI4EU
See at:
CNR IRIS
| ieeexplore.ieee.org
| ISTI Repository
| doi.org
2019
Other
Open Access
AIMIR 2019 Research Activities
Amato G, Bolettieri P, Carrara F, Ciampi L, Di Benedetto M, Debole F, Falchi F, Gennaro C, Lagani G, Massoli Fv, Messina N, Rabitti F, Savino P, Vadicamo L, Vairo C
The Artificial Intelligence and Multimedia Information Retrieval (AIMIR) research group is part of the NeMIS laboratory of the Information Science and Technologies Institute "A. Faedo" (ISTI) of the Italian National Research Council (CNR). The AIMIR group has long experience in topics related to Artificial Intelligence, Multimedia Information Retrieval, Computer Vision, and large-scale similarity search.
We aim at investigating the use of Artificial Intelligence and Deep Learning for Multimedia Information Retrieval, addressing both effectiveness and efficiency. Multimedia information retrieval techniques should be able to provide users with pertinent results, fast, on huge amounts of multimedia data.
Application areas of our research results range from cultural heritage to smart tourism, from security to smart cities, from mobile visual search to augmented reality.
This report summarizes the 2019 activities of the research group.
See at:
CNR IRIS
| ISTI Repository