326 result(s)
2025 Conference article Open Access
Wordnet and word ladders: climbing the abstraction taxonomy with LLMs
Puccetti G., Bolognesi M., Esuli A.
WordNet has long served as a benchmark for approximating the mechanisms of semantic categorization in the human mind, particularly through its hierarchical structure of word synsets, most notably the IS-A relation. However, these semantic relations have traditionally been curated manually by expert lexicographers, relying on external resources like dictionaries and corpora. In this paper, we explore whether large language models (LLMs) can be leveraged to approximate these hierarchical semantic relations, potentially offering a scalable and more dynamic alternative for maintaining and updating the WordNet taxonomy. This investigation addresses the feasibility and implications of automating this process with LLMs by testing a set of prompts encoding different sociodemographic traits. It finds that adding age and job information to the prompt affects the model's ability to generate text in agreement with hierarchical semantic relations, while gender does not have a statistically significant impact.

See at: CNR IRIS Open Access | unipv-larl.github.io Open Access | CNR IRIS Restricted


2025 Journal article Restricted
Criticality in neural cultures: insights into memory and connectivity in entorhinal-hippocampal networks
Iannello L., Tonelli F., Cremisi F., Calcagnile L. Maria, Mannella R., Amato G., Di Garbo A.
The brain is a complex system of interconnected regions that underlie memory, cognition, and perception. Today, our understanding of the brain's dynamic processes remains incomplete, particularly regarding differences in electrophysiological activity and inter-regional connectivity among specific areas. To explore this, we investigated the electrical activity, functional connectivity, and interactions of neural cultures differentiated into hippocampal, isocortical, and entorhinal networks using multi-electrode arrays (MEAs) to record extracellular local field potentials. Our results showed that collective synchronization events, or network bursts, were present in all cultures except for the hippocampal networks. Interestingly, introducing entorhinal neuron spheroids onto hippocampal cultures induced synchronized activity. Furthermore, self-organized criticality analysis confirmed that all networks, except hippocampal cultures, were in a critical regime. Moreover, we found that entorhinal-hippocampal coupling facilitated criticality, promoting recurrent synchronized activity patterns. The consistent scaling exponents across configurations underscore the universality of criticality in biological networks. Finally, power spectrum analysis revealed a theta band peak in connected entorhinal-hippocampal cultures, consistent with in vivo studies, highlighting the role of theta oscillations in memory consolidation. Our findings provide more insights into brain functioning and offer an in vitro model for studying learning and memory.
Source: CHAOS, SOLITONS AND FRACTALS, vol. 194
DOI: 10.1016/j.chaos.2025.116184
Project(s): AICult: Artificial Intelligence with Cultured Neuronal Networks, Tuscany Health Ecosystem

See at: Chaos Solitons & Fractals Restricted | Archivio istituzionale della Ricerca - Scuola Normale Superiore Restricted | CNR IRIS Restricted | CNR IRIS Restricted | CNR IRIS Restricted | www.sciencedirect.com Restricted


2025 Conference article Open Access
Stress-testing machine generated text detection: shifting language models writing style to fool detectors
Pedrotti A., Papucci M., Ciaccio C., Miaschi A., Puccetti G., Dell'Orletta F., Esuli A.
Recent advancements in Generative AI and Large Language Models (LLMs) have enabled the creation of highly realistic synthetic content, raising concerns about the potential for malicious use, such as misinformation and manipulation. Moreover, detecting Machine-Generated Text (MGT) remains challenging due to the lack of robust benchmarks that assess generalization to real-world scenarios. In this work, we evaluate the resilience of state-of-the-art MGT detectors (e.g., Mage, Radar, LLM-DetectAIve) to linguistically informed adversarial attacks. We develop a pipeline that fine-tunes language models using Direct Preference Optimization (DPO) to shift the MGT style toward human-written text (HWT), obtaining generations that are more challenging for current models to detect. Additionally, we analyze the linguistic shifts induced by the alignment and how detectors rely on “linguistic shortcuts” to detect texts. Our results show that detectors can be easily fooled with relatively few examples, resulting in a significant drop in detection performance. This highlights the importance of improving detection methods and making them robust to unseen in-domain texts. We release code, models, and data to support future research on more robust MGT detection benchmarks.
DOI: 10.18653/v1/2025.findings-acl.156
Project(s): SoBigData via OpenAIRE

See at: aclanthology.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2025 Conference article Open Access
Automatic annotation of legal references (Allegationes) in the Liber Extra's Ordinary Gloss
Esuli A., Imperia V. R., Puccetti G.
The study of normative corpora of the past is a key activity in the fields of Religious Studies and Legal History. The development of intelligent software tools that support this activity is of paramount importance to the digital transformation of the community. We present an interdisciplinary activity that led to an accurate automatic annotation of legal references in the Liber Extra’s Ordinary Gloss. An index of legal references has been derived from the annotations, enabling the creation of novel navigation and data analysis tools. The contribution of this work is twofold: the index is by itself already a valuable resource for the discipline, and we detail the process that led to its production, showing that an effective result can be delivered by a small team with limited resources. Both the index and the code are made publicly available.

See at: ceur-ws.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2025 Conference article Open Access
DIACU: a dataset for the DIAchronic analysis of Church Slavonic
Cassese M., Puccetti G., Napolitano M., Esuli A.
The Church Slavonic language has evolved over time without being formalized into a precise grammar. Therefore, there is currently no clearly outlined history of this language tracing its evolution. However, in recent years, there has been a greater effort to digitize these resources, partly motivated by increased sensitivity with respect to the need to preserve multilingual knowledge. To exploit them, we propose DIACU (DIAchronic Analysis of Church Slavonic), a comprehensive collection of several existing corpora in Church Slavonic. In this work, we thoroughly describe the collection of this novel dataset and test its effectiveness as a training set for attributing Slavonic texts to specific periods. The dataset and the code of the experiments are available at https://github.com/MariaCassese/DIACU.
DOI: 10.18653/v1/2025.bsnlp-1.12

See at: aclanthology.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2025 Journal article Open Access
ARTEMIS: animal recognition through enhanced multimodal integration system
Fazzari E., Romano D., Falchi F., Stefanini C.
This paper introduces Animal Recognition Through Enhanced Multimodal Integration System (ARTEMIS), a transformer-based framework designed for multilabel animal action recognition by fusing video, image, and textual modalities. ARTEMIS utilizes state-of-the-art captioning and language models, such as BLIP2 and Llama 3, to generate textual descriptions from video frames, which are input to the model, significantly enhancing its performance over previous approaches that do not consider this modality. Through comprehensive ablation studies, we explore the contribution of various model components and propose optimization strategies, including genetic algorithms and reinforcement learning, to dynamically adjust ensemble weights. Our feature alignment techniques, using contrastive and cosine similarity losses, further improve multimodal integration. Evaluations on the Animal Kingdom dataset, which includes 30,100 clips across 140 action classes, demonstrate that ARTEMIS achieves a new state-of-the-art mAP of 79.82, outperforming existing methods. The combination of multimodal fusion and ensemble strategies makes ARTEMIS a robust solution for complex animal action recognition tasks. The code of our fusion method is available at https://github.com/edofazza/ARTEMIS.
Source: INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
DOI: 10.1007/s13042-025-02602-3
Project(s): Robocoenosis via OpenAIRE

See at: Archivio della ricerca della Scuola Superiore Sant'Anna Open Access | Archivio della ricerca della Scuola Superiore Sant'Anna Open Access | CNR IRIS Open Access | link.springer.com Open Access | GitHub Restricted | CNR IRIS Restricted


2025 Journal article Open Access
Animal behavior analysis methods using deep learning: a survey
Fazzari E., Romano D., Falchi F., Stefanini C.
Animal behavior serves as a reliable indicator of the adaptation of organisms to their environment and their overall well-being. Through rigorous observation of animal actions and interactions, researchers and observers can glean valuable insights into diverse facets of their lives, encompassing health, social dynamics, ecological relationships, and neuroethological dimensions. Although state-of-the-art deep learning models have demonstrated remarkable accuracy in classifying various forms of animal data, their adoption in animal behavior studies remains limited. This survey article endeavors to comprehensively explore deep learning architectures and strategies applied to the identification of animal behavior, spanning auditory, visual, and audiovisual methodologies. The survey categorizes techniques into pose estimation-based and non-pose estimation-based methods, analyzing their applications, effectiveness, and limitations. Furthermore, the manuscript scrutinizes extant animal behavior datasets, offering a detailed examination of the principal challenges confronting this research domain. The article culminates in a comprehensive discussion of key research directions within deep learning that hold potential for advancing the field of animal behavior studies.
Source: EXPERT SYSTEMS WITH APPLICATIONS, vol. 289 (issue 128330)
DOI: 10.1016/j.eswa.2025.128330
DOI: 10.48550/arxiv.2405.14002

See at: arXiv.org e-Print Archive Open Access | Expert Systems with Applications Open Access | Archivio della ricerca della Scuola Superiore Sant'Anna Open Access | CNR IRIS Open Access | www.sciencedirect.com Open Access | doi.org Restricted | CNR IRIS Restricted


2025 Book Restricted
SUN: Social and hUman ceNtered XR - A Horizon Europe Project Paving the Way for the Widespread Adoption of Extended and Virtual Worlds
Vairo C., Caracciolo G., Giorgi D., Leonardis D., Vadicamo L.
Extended Reality (XR) is a rapidly growing technology that bridges physical and virtual worlds, opening up new possibilities in healthcare, communications, and security. The European project SUN – Social and hUman ceNtered XR, funded by the Horizon Europe program, addresses the ongoing challenges of making XR more accessible, usable, and realistic. SUN develops technologies and models that enhance social interaction and immersive perception, while keeping an ethical and human-centered design, by introducing new wearable sensors, haptic interfaces, and high-performance streaming solutions. Through new 3D acquisition techniques and the use of artificial intelligence, SUN explores innovative ways to connect physical objects and digital counterparts, creating coherent and immersive environments. The project’s innovations were validated in three real-world piloting scenarios: rehabilitation therapy, workplace safety and social interaction, and assistive technologies for individuals with severe mobility or communication impairments. This volume presents the results of three years of research and development, offering a solid vision of how XR can evolve in a sustainable, ethical, and human-centered way.
DOI: 10.32079/isti-book-2025/001

See at: CNR IRIS Restricted | CNR IRIS Restricted


2025 Other Open Access
Linking Dante UI: manuale d’uso
Trupiano L., Concordia C., Aloia N., Tomazzoli G., Meghini C.
This document describes the graphical interface for accessing the data of the Linking Dante (LiDa) knowledge graph. The graphical user interface (GUI), available at https://lida.dantenetwork.it, was designed and developed to give users with varying levels of experience simplified access to the data in the LiDa knowledge graph.

See at: CNR IRIS Open Access | lida.dantenetwork.it Open Access | CNR IRIS Restricted


2025 Conference article Restricted
Towards identity-aware cross-modal retrieval: a dataset and a baseline
Messina N., Vadicamo L., Maltese L., Gennaro C.
Recent advancements in deep learning have significantly enhanced content-based retrieval methods, notably through models like CLIP that map images and texts into a shared embedding space. However, these methods often struggle with domain-specific entities and long-tail concepts absent from their training data, particularly in identifying specific individuals. In this paper, we explore the task of identity-aware cross-modal retrieval, which aims to retrieve images of persons in specific contexts based on natural language queries. This task is critical in various scenarios, such as for searching and browsing personalized video collections or large audio-visual archives maintained by national broadcasters. We introduce a novel dataset, COCO Person FaceSwap (COCO-PFS), derived from the widely used COCO dataset and enriched with deepfake-generated faces from VGGFace2. This dataset addresses the lack of large-scale datasets needed for training and evaluating models for this task. Our experiments assess the performance of different CLIP variations repurposed for this task, including our architecture, Identity-aware CLIP (Id-CLIP), which achieves competitive retrieval performance through targeted fine-tuning. Our contributions lay the groundwork for more robust cross-modal retrieval systems capable of recognizing long-tail identities and contextual nuances. Data and code are available at .
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15572, pp. 437-452. Lucca, Italy, April 6–10, 2025
DOI: 10.1007/978-3-031-88708-6_28
Project(s): Future Artificial Intelligence Research, a MUltimedia platform for Content Enrichment and Search in audiovisual archives

See at: CNR IRIS Restricted | CNR IRIS Restricted | CNR IRIS Restricted


2025 Journal article Open Access
QuAcc: using quantification to predict classifier accuracy under prior probability shift
Volpi L., Moreo Fernandez A., Sebastiani F.
Using cross-validation to predict the accuracy of a classifier on unseen data can be done reliably only in the absence of dataset shift, i.e., when the training data and the unseen data are IID. In this work we deal instead with the problem of predicting classifier accuracy on unseen data affected by prior probability shift (PPS), an important type of dataset shift. We propose QuAcc, a method built on top of “quantification” algorithms robust to PPS, i.e., algorithms devised for estimating the prevalence values of the classes in unseen data affected by PPS. QuAcc is based on the idea of viewing the cells of the contingency table (on which classifier accuracy is computed) as classes, and of estimating, via a quantification algorithm, their prevalence values on the unseen data labelled by the classifier. We perform systematic experiments in which we compare the prediction error incurred by QuAcc with that of state-of-the-art classifier accuracy prediction (CAP) methods.
Source: INTELLIGENZA ARTIFICIALE, vol. 19 (issue 2), pp. 141-157
DOI: 10.1177/17248035251338347
Project(s): Future Artificial Intelligence Research, Italian Strengthening of ESFRI RI RESILIENCE, Quantification in the Context of Dataset Shift, Strengthening the Italian RI for Social Mining and Big Data Analytics

See at: CNR IRIS Open Access | Intelligenza Artificiale Restricted | CNR IRIS Restricted


2025 Conference article Open Access
Optimizing LLMs for Italian: reducing token fertility and enhancing efficiency through vocabulary adaptation
Moroni L., Puccetti G., Huguet Cabot P. -L., Bejgu A. S., Barba E., Miaschi A., Dell'Orletta F., Esuli A., Navigli R.
The number of pretrained Large Language Models (LLMs) is increasing steadily, though the majority are designed predominantly for the English language. While state-of-the-art LLMs can handle other languages, due to language contamination or some degree of multilingual pretraining data, they are not optimized for non-English languages, leading to inefficient encoding (high token “fertility”) and slower inference speed. In this work, we thoroughly compare a variety of vocabulary adaptation techniques for optimizing English LLMs for the Italian language, and put forward Semantic Alignment Vocabulary Adaptation (SAVA), a novel method that leverages neural mapping for vocabulary substitution. SAVA achieves competitive performance across multiple downstream tasks, enhancing grounded alignment strategies. We adapt two LLMs: Mistral-7B-v0.1, reducing token fertility by 25%, and Llama-3.1-8B, optimizing the vocabulary and reducing the number of parameters by 1 billion. We show that, following the adaptation of the vocabulary, these models can recover their performance with a relatively limited stage of continual training on the target language. Finally, we test the capabilities of the adapted models on various multi-choice and generative tasks.
DOI: 10.18653/v1/2025.findings-naacl.371
DOI: 10.48550/arxiv.2504.17025

See at: aclanthology.org Open Access | arXiv.org e-Print Archive Open Access | CNR IRIS Open Access | doi.org Restricted | doi.org Restricted | Archivio della ricerca- Università di Roma La Sapienza Restricted | CNR IRIS Restricted


2025 Book Open Access
Proceedings of the 5th International Workshop on Learning to Quantify (LQ 2025)
Bunse M., González P., Moreo Fernandez A., Sebastiani F.
The 5th International Workshop on Learning to Quantify (LQ 2025 – https://lq-2025.github.io/) was held in Porto, PT, on September 15, 2025, as a satellite workshop of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2025). While the 1st edition of the workshop (LQ 2021 – https://cikmlq2021.github.io/) had to be an entirely online event due to the COVID-19 pandemic, the 2nd edition (LQ 2022 – https://lq-2022.github.io/), 3rd edition (LQ 2023 – https://lq-2023.github.io/), 4th edition (LQ 2024 – https://lq-2024.github.io/), and this 5th edition have been hybrid events, with presentations given in person and with both in-person and remote attendees. The LQ 2025 workshop consisted of the presentations of seven contributed papers, each of which had gone through a rigorous peer-review process by three reviewers, and a final collective discussion on the open problems of learning to quantify and on future initiatives. The present volume contains the text of five of the seven presentations given at the workshop (for the other two presentations the authors asked for their papers not to be in the proceedings). We hope that the availability of the present volume will increase the interest in the subject of quantification on the part of researchers and practitioners alike, and will contribute to making quantification better known to potential users of this technology and to researchers interested in advancing the field.
Project(s): SoBigData via OpenAIRE

See at: CNR IRIS Open Access | lq-2025.github.io Open Access | CNR IRIS Restricted


2025 Conference article Open Access
An efficient method for deriving confidence intervals in aggregative quantification
Moreo Fernandez A., Salvati N.
This paper explores efficient methods for deriving confidence intervals in quantification, the area of machine learning concerned with estimating class prevalence values. By focusing on computationally efficient strategies, we propose a robust framework for quantifying uncertainty. The key idea is to disentangle the two main phases of current aggregative quantifiers (classification followed by aggregation) and apply bootstrap only to the second phase. We investigate different methods for constructing confidence regions, including confidence intervals, confidence ellipses in the simplex, and confidence regions in the transformed Centered Log-Ratio space. Additionally, we examine various bootstrap strategies, including model-based, population-based, and a combined approach. Our results demonstrate the effectiveness of combining model-based and population-based bootstrap approaches, particularly when used with traditional confidence intervals, while also achieving significant efficiency gains compared to a naive application of bootstrap.
Project(s): Quantification in the Context of Dataset Shift

See at: CNR IRIS Open Access | lq-2025.github.io Open Access | CNR IRIS Restricted


2025 Journal article Open Access
Automatic extraction of regesta for medieval latin text summarization
Puccetti G., Righi L., Sabbatini I., Esuli A.
We produced a novel dataset of 4,533 medieval Latin regesta (summaries) paired with full texts, extracted through a meticulous pipeline involving manual annotation, custom model training, text extraction, and post-processing to ensure high-quality, structured data for AI-driven summarization tasks.
Source: ERCIM NEWS, vol. 141, pp. 31-32

See at: ercim-news.ercim.eu Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2025 Conference article Restricted
A simple method for classifier accuracy prediction under prior probability shift
Volpi L., Moreo Fernandez A., Sebastiani F.
The standard technique for predicting the accuracy that a classifier will have on unseen data (classifier accuracy prediction – CAP) is cross-validation (CV). However, CV relies on the assumption that the training data and the test data are sampled from the same distribution, an assumption that is often violated in real-world scenarios. When such violations occur (i.e., in the presence of dataset shift), the estimates returned by CV are unreliable. In this paper we propose a CAP method specifically designed to address prior probability shift (PPS), an instance of dataset shift in which the training and test distributions are characterized by different class priors. By solving a system of independent linear equations, with n the number of classes, our method estimates the entries of the contingency table of the test data, and thus allows estimating any specific evaluation measure. Since a key step in this method involves predicting the class priors of the test data, we further observe a connection between our method and the field of “learning to quantify”. Our experiments show that, when combined with state-of-the-art quantification techniques, under PPS our method tends to outperform existing CAP methods.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15244, pp. 267-283. Pisa, Italy, 14-16/10/2024
DOI: 10.1007/978-3-031-78980-9_17
Project(s): Quantification in the Context of Dataset Shift

See at: CNR IRIS Restricted | CNR IRIS Restricted | CNR IRIS Restricted | link.springer.com Restricted


2025 Conference article Open Access
GenAI content detection Task 1: English and multilingual machine-generated text detection: AI vs. Human
Wang Y., Shelmanov A., Mansurov J., Tsvigun A., Mikhailov V., Xing R., Xie Z., Geng J., Puccetti G., Artemova E., Su J., Ta M. N., Abassy M., Elozeiri K., El Dine Ahmed S., Goloburda M., Mahmoud T., Tomar R. V., Aziz A., Laiyk N., Afzal O. M., Koike R., Kaneko M., Aji A. F., Habash N., Gurevych I., Nakov P.
We present the GenAI Content Detection Task 1, a shared task on binary machine-generated text detection, conducted as part of the GenAI workshop at COLING 2025. The task consists of two subtasks: Monolingual (English) and Multilingual. The shared task attracted many participants: 36 teams made official submissions to the Monolingual subtask during the test phase and 27 teams to the Multilingual subtask. We provide a comprehensive overview of the data, a summary of the results (including system rankings and performance scores), detailed descriptions of the participating systems, and an in-depth analysis of submissions.

See at: aclanthology.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2025 Journal article Open Access
Structural monitoring of heritage buildings via deep learning algorithms
Girardi M., Gurioli G., Messina N.
Monitoring systems constitute a significant, non-invasive tool for verifying the structural health of buildings and infrastructure over time. Deep learning neural networks can be used to analyse data from long-term monitoring systems, such as time series of velocity/acceleration measured at specific points and environmental parameters, and to predict the main features of the buildings’ structural behaviour with respect to ambient stresses. Potential anomalies of the structure’s vibrational features related to damage or unexpected events, such as earthquakes or exceptional loads, can also be detected. The paper focuses on the application of a Temporal Fusion Transformer (TFT) network to data from the dynamic monitoring of a medieval tower in the historic centre of Lucca (Tuscany, Italy).
Source: ERCIM NEWS, vol. 141, pp. 13-14

See at: ercim-news.ercim.eu Open Access | CNR IRIS Open Access | CNR IRIS Restricted


2025 Conference article Open Access
Is CLIP the main roadblock for fine-grained open-world perception?
Bianchi L., Carrara F., Messina N., Falchi F.
Modern applications increasingly demand flexible computer vision models that adapt to novel concepts not encountered during training. This necessity is pivotal in emerging domains like extended reality, robotics, and autonomous driving, which require the ability to respond to open-world stimuli. A key ingredient is the ability to identify objects based on free-form textual queries defined at inference time – a task known as open-vocabulary object detection. Multimodal backbones like CLIP are the main enabling technology for current open-world perception solutions. Despite performing well on generic queries, recent studies highlighted limitations on the fine-grained recognition capabilities in open-vocabulary settings – i.e., for distinguishing subtle object features like color, shape, and material. In this paper, we perform a detailed examination of these open-vocabulary object recognition limitations to find the root cause. We evaluate the performance of CLIP, the most commonly used vision-language backbone, against a fine-grained object-matching benchmark, revealing interesting analogies between the limitations of open-vocabulary object detectors and their backbones. Experiments suggest that the lack of fine-grained understanding is caused by the poor separability of object characteristics in the CLIP latent space. Therefore, we try to understand whether fine-grained knowledge is present in CLIP embeddings but not exploited at inference time due, for example, to the unsuitability of the cosine similarity matching function, which may discard important object characteristics. Our preliminary experiments show that simple CLIP latent-space re-projections help separate fine-grained concepts, paving the way towards the development of backbones inherently able to process fine-grained details. The code for reproducing these experiments is available at https://github.com/lorebianchi98/FG-CLIP.
DOI: 10.1109/cbmi62980.2024.10859215
Project(s): Future Artificial Intelligence Research, Italian Strengthening of ESFRI RI RESILIENCE, SUN via OpenAIRE, a MUltimedia platform for Content Enrichment and Search in audiovisual archives

See at: CNR IRIS Open Access | ieeexplore.ieee.org Open Access | CNR IRIS Restricted | CNR IRIS Restricted


2025 Contribution to book Open Access
Adversarial magnification to deceive deepfake detection through super resolution
Coccomini D. A., Caldelli R., Amato G., Falchi F., Gennaro C.
Deepfake technology is rapidly advancing, posing significant challenges to the detection of manipulated media content. In parallel, some adversarial attack techniques have been developed to fool deepfake detectors and make deepfakes even more difficult to detect. This paper explores the application of super resolution techniques as a possible adversarial attack in deepfake detection. Through our experiments, we demonstrate that minimal changes made by these methods in the visual appearance of images can have a profound impact on the performance of deepfake detection systems. We propose a novel attack using super resolution as a quick, black-box and effective method to camouflage fake images and/or generate false alarms on pristine images. Our results indicate that the usage of super resolution can significantly impair the accuracy of deepfake detectors, thereby highlighting the vulnerability of such systems to adversarial attacks. The code to reproduce our experiments is available at: https://github.com/davide-coccomini/Adversarial-Magnification-to-Deceive-Deepfake-Detection-through-Super-Resolution.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE, vol. 2134, pp. 491-501
DOI: 10.1007/978-3-031-74627-7_41
Project(s): AI4Media via OpenAIRE

See at: CNR IRIS Open Access | link.springer.com Open Access | doi.org Restricted | CNR IRIS Restricted