2024
Conference article
Open Access
Operationalizing the fundamental rights impact assessment for AI systems: the FRIA project
Savella R., Pratesi F., Trasarti R., Gatt L., Gaeta M. C., Caggiano I. A., Aulino L., Troisi E., Izzo L.
This paper presents the FRIA Project, a multidisciplinary research study that connects the legal and ethical aspects of the impact of Artificial Intelligence systems on fundamental rights with the technical issues that arise in creating an automated tool for the Fundamental Rights Impact Assessment, which is the ultimate objective of this work.
Project(s): SoBigData-PlusPlus
See at:
CNR IRIS
| ital-ia2024.it
2023
Conference article
Open Access
Trustworthy AI at KDD Lab
Giannotti F, Guidotti R, Monreale A, Pappalardo L, Pedreschi D, Pellungrini R, Pratesi F, Rinzivillo S, Ruggieri S, Setzu M, Deluca R
This document summarizes the activities regarding the development of Responsible AI (Responsible Artificial Intelligence) conducted by the Knowledge Discovery and Data Mining group (KDD-Lab), a joint research group of the Institute of Information Science and Technologies "Alessandro Faedo" (ISTI) of the National Research Council of Italy (CNR), the Department of Computer Science of the University of Pisa, and the Scuola Normale Superiore of Pisa.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 388-393. Pisa, Italy, 29-30/05/2023
Project(s): SoBigData-PlusPlus 
See at:
ceur-ws.org
| CNR IRIS
| ISTI Repository
2022
Other
Metadata Only Access
The TAILOR Handbook of Trustworthy AI
Albertoni R, Allard T, Alves G, Bringas Colmenarejo A, Buijsman S, Casares P A M, Colantonio S, Couceiro M, Escobar S, Gonzalezcastañé G, Guidotti R, Heintz F, Hernandez Orallo J, Kuilman S, Makhlouf K, Martinez Plumed F, Monreale A, Pellungrini R, Pratesi F, Ramachandran Pillai R, Rossi A, Rousset Mc, Ruggieri S, Siebert Lc, Skrzypczyski P, Stefanowski J, Straccia U, O'Sullivan B, Visentin A, Zgonnikov A, Zhioua S
The main goal of the Handbook of Trustworthy AI is to provide non-experts, especially researchers and students, with an overview of the problems related to developing ethical and trustworthy AI systems. In particular, we want to provide an overview of the main dimensions of trustworthiness, starting with an understandable explanation of the dimensions themselves, and then presenting the characterization of the problems (starting with a brief summary and an explanation of the importance of the dimension, then presenting a taxonomy and some guidelines, if they are available and consolidated), and summarizing the major challenges and solutions in the field, as well as the latest research developments.
Project(s): TAILOR
See at:
CNR IRIS
| tailor.isti.cnr.it
2022
Journal article
Open Access
Where do migrants and natives belong in a community: a Twitter case study and privacy risk analysis
Kim J, Pratesi F, Rossetti G, Sîrbu A, Giannotti F
Today, many users actively use Twitter to express their opinions and to share information. Thanks to the availability of the data, researchers have studied the behaviours and social networks of these users. International migration studies have also benefited from this social media platform to improve migration statistics. Although diverse types of social networks have been studied so far on Twitter, the social networks of migrants and natives have not been studied before. This paper aims to fill this gap by studying the characteristics and behaviours of migrants and natives on Twitter. To do so, we perform a general assessment of features, including profiles and tweets, and an extensive analysis of the network. We find that migrants have more followers than friends. They have also tweeted more, even though both groups have similar account ages.
More interestingly, the assortativity scores showed that users tend to connect based on nationality more than country of residence, and this is more the case for migrants than natives. Furthermore, both natives and migrants tend to connect mostly with natives. The homophilic behaviours of users are also well reflected in the communities that we detected. Our additional privacy risk analysis showed that Twitter data can be used safely without exposing users' sensitive information, minimising the risk of re-identification while respecting the GDPR.
Source: SOCIAL NETWORK ANALYSIS AND MINING, vol. 13 (issue 15)
DOI: 10.1007/s13278-022-01017-0
Project(s): SoBigData-PlusPlus
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
2022
Book
Open Access
IAIL 2022 - Imagining the AI Landscape after the AI Act
Dushi D, Naretto F, Panigutti C, Pratesi F
We summarize the first Workshop on Imagining the AI Landscape after the AI Act (IAIL 2022), co-located with the 1st International Conference on Hybrid Human-Artificial Intelligence (HHAI 2022), held on June 13, 2022 in Amsterdam, Netherlands.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3221
Project(s): CoHuBiCoL, TAILOR, HumanE-AI-Net, SoBigData-PlusPlus
See at:
ceur-ws.org
| CNR IRIS
| ISTI Repository
2022
Contribution to book
Open Access
Ethics in smart information systems
Pratesi F, Trasarti R, Giannotti F
This chapter analyses some of the ethical implications of recent developments in artificial intelligence (AI), data mining, machine learning and robotics. In particular, we start by summarising the more consolidated issues and solutions related to privacy in data management systems, moving towards the novel concept of explainability. The chapter reviews the development of the right to privacy and the right to explanation, which culminated in the General Data Protection Regulation. However, the new kinds of big data (such as internet logs or GPS tracking) require a different approach to managing privacy requirements. Several solutions have been developed and will be reviewed here. Our view is that, generally, data protection must be considered from the beginning as novel AI solutions are developed, using the Privacy-by-Design paradigm. This involves a shift in perspective away from remedying problems and towards trying to prevent them. We conclude by covering the main requirements necessary to achieve a trustworthy scenario, as advised also by the European Commission. A step in the direction of Trustworthy AI was achieved in the Ethics Guidelines for Trustworthy Artificial Intelligence produced by an expert group for the European Commission. The key elements in these guidelines will be reviewed in this chapter. To ensure European independence and leadership, we must invest wisely by bundling, connecting and opening our AI resources while also keeping in mind ethical priorities, such as transparency and fairness.
DOI: 10.51952/9781447363972.ch009
DOI: 10.56687/9781447363972-012
DOI: 10.2307/j.ctv2tbwqd5.14
Project(s): TAILOR, PRO-RES, SoBigData-PlusPlus
See at:
bristoluniversitypressdigital.com
| doi.org
| doi.org
| CNR IRIS
| ISTI Repository
| doi.org
2021
Journal article
Open Access
Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Nanni M, Andrienko G, Barabasi Al, Boldrini C, Bonchi F, Cattuto C, Chiaromonte F, Comande G, Conti M, Cote M, Dignum F, Dignum V, Domingoferrer J, Ferragina P, Giannotti F, Guidotti R, Helbing D, Kaski K, Kertesz J, Lehmann S, Lepri B, Lukowicz P, Matwin S, Jimenez Dm, Monreale A, Morik K, Oliver N, Passarella A, Passerini A, Pedreschi D, Pentland A, Pianesi F, Pratesi F, Rinzivillo S, Ruggieri S, Siebes A, Torra V, Trasarti R, Hoven J, Vespignani A
The rapid dynamics of COVID-19 call for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in "phase 2" of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion and, in turn, this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale.
The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates (if and when they want, and for specific aims) with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to the collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Source: ETHICS AND INFORMATION TECHNOLOGY, vol. 23 (issue 3)
DOI: 10.1007/s10676-020-09572-w
Project(s): SoBigData-PlusPlus
See at:
Aaltodoc Publication Archive
| Ethics and Information Technology
| Recolector de Ciencia Abierta, RECOLECTA
| CNR IRIS
| Archivio Istituzionale
| link.springer.com
| City Research Online
| ISTI Repository
| Online Research Database In Technology
| NARCIS
| Digitala Vetenskapliga Arkivet - Academic Archive On-line
| Publikationer från Umeå universitet
| kclpure.kcl.ac.uk
| Fraunhofer-ePrints
| publons.com
| www.scopus.com
2021
Journal article
Open Access
An ethico-legal framework for social data science (International Journal of Data Science and Analytics, (2021), 11, 4, (377-390), 10.1007/s41060-020-00211-7)
Forgo N., Hanold S., Van Den Hoven J., Krugel T., Lishchuk I., Mahieu R., Monreale A., Pedreschi D., Pratesi F., Van Putten D.
This paper presents a framework for research infrastructures enabling ethically sensitive and legally compliant data science in Europe. Our goal is to describe how to design and implement an open platform for big data social science, including, in particular, personal data. To this end, we discuss a number of infrastructural, organizational and methodological principles to be developed for a concrete implementation. These include not only systematic tools and methodologies that effectively enable both the empirical evaluation of the privacy risk and data transformations by using privacy-preserving approaches, but also the development of training materials (a massive open online course) and organizational instruments based on legal and ethical principles. This paper provides, by way of example, the implementation that was adopted within the context of the SoBigData Research Infrastructure.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 12 (issue 1), p. 79
DOI: 10.1007/s41060-021-00261-5
Project(s): SoBigData
See at:
Journal of Data Science
| International Journal of Data Science and Analytics
| CNR IRIS
| link.springer.com
2021
Journal article
Open Access
Correction to: Human migration: the big data perspective
Sîrbu A., Andrienko G., Andrienko N., Boldrini C., Conti M., Giannotti F., Guidotti R., Bertoli S., Kim J., Muntean C. I., Pappalardo L., Passarella A., Pedreschi D., Pollacci L., Pratesi F., Sharma R.
How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 11, pp. 341-360
DOI: 10.1007/s41060-021-00260-6
Project(s): SoBigData
See at:
CNR IRIS
| link.springer.com
| Journal of Data Science
| International Journal of Data Science and Analytics
| Open Access Repository
2020
Journal article
Open Access
Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Nanni M, Andrienko G, Barabasi Al, Boldrini C, Bonchi F, Cattuto C, Chiaromonte F, Comandé G, Conti M, Coté M, Dignum F, Dignum V, Domingoferrer J, Ferragina P, Giannotti F, Guidotti R, Helbing D, Kaski K, Kertesz J, Lehmann S, Lepri B, Lukowicz P, Matwin S, Jimenez D, Monreale A, Morik K, Oliver N, Passarella A, Passerini A, Pedreschi D, Pentland A, Pianesi F, Pratesi F, Rinzivillo S, Ruggieri S, Siebes A, Torra V, Trasarti R, Van Den Hoven J, Vespignani A
The rapid dynamics of COVID-19 call for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in "phase 2" of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion and, in turn, this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale.
The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates (if and when they want, and for specific aims) with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to the collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Source: TRANSACTIONS ON DATA PRIVACY, vol. 13 (issue 1), pp. 61-66
See at:
CNR IRIS
| ISTI Repository
| www.tdp.cat
2020
Journal article
Open Access
(So) Big Data and the transformation of the city
Andrienko G, Andrienko N, Boldrini C, Caldarelli G, Cintia P, Cresci S, Facchini A, Giannotti F, Gionis A, Guidotti R, Mathioudakis M, Muntean Ci, Pappalardo L, Pedreschi D, Pournaras E, Pratesi F, Tesconi M, Trasarti R
The exponential increase in the availability of large-scale mobility data has fueled the vision of smart cities that will transform our lives. The truth is that we have just scratched the surface of the research challenges that should be tackled in order to make this vision a reality. Consequently, there is an increasing interest among different research communities (ranging from civil engineering to computer science) and industrial stakeholders in building knowledge discovery pipelines over such data sources. At the same time, this widespread data availability also raises privacy issues that must be considered by both industrial and academic stakeholders. In this paper, we provide a wide perspective on the role that big data have in reshaping cities. The paper covers the main aspects of urban data analytics, focusing on privacy issues, algorithms, applications and services, and georeferenced data from social media. In discussing these aspects, we leverage, as concrete examples and case studies of urban data science tools, the results obtained in the "City of Citizens" thematic area of the Horizon 2020 SoBigData initiative, which includes a virtual research environment with mobility datasets and urban analytics methods developed by several institutions around Europe. We conclude the paper by outlining the main research challenges that urban data science has yet to address in order to help make the smart city vision a reality.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 1
DOI: 10.1007/s41060-020-00207-3
Project(s): SoBigData
See at:
Aaltodoc Publication Archive
| International Journal of Data Science and Analytics
| White Rose Research Online
| HELDA - Digital Repository of the University of Helsinki
| Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari
| CNR IRIS
| link.springer.com
| City Research Online
| ISTI Repository
| Fraunhofer-ePrints
2020
Journal article
Open Access
Human migration: the big data perspective
Sîrbu A, Andrienko G, Andrienko N, Boldrini C, Conti M, Giannotti F, Guidotti R, Bertoli S, Kim J, Muntean Ci, Pappalardo L, Passarella A, Pedreschi D, Pollacci L, Pratesi F, Sharma R
How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 11, pp. 341-360
DOI: 10.1007/s41060-020-00213-5
Project(s): SoBigData
See at:
International Journal of Data Science and Analytics
| CNR IRIS
| link.springer.com
| ISTI Repository
| HAL Clermont Université
| Fraunhofer-ePrints
2020
Journal article
Open Access
An ethico-legal framework for social data science
Forgó N, Hänold S, Van Den Hoven J, Krügel T, Lishchuk I, Mahieu R, Monreale A, Pedreschi D, Pratesi F, Van Putten D
This paper presents a framework for research infrastructures enabling ethically sensitive and legally compliant data science in Europe. Our goal is to describe how to design and implement an open platform for big data social science, including, in particular, personal data. To this end, we discuss a number of infrastructural, organizational and methodological principles to be developed for a concrete implementation. These include not only systematic tools and methodologies that effectively enable both the empirical evaluation of the privacy risk and data transformations by using privacy-preserving approaches, but also the development of training materials (a massive open online course) and organizational instruments based on legal and ethical principles. This paper provides, by way of example, the implementation that was adopted within the context of the SoBigData Research Infrastructure.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 11, pp. 377-390
DOI: 10.1007/s41060-020-00211-7
Project(s): SoBigData, SoBigData-PlusPlus
See at:
Vrije Universiteit Brussel Research Portal
| CNR IRIS
| ISTI Repository
| NARCIS
2019
Journal article
Open Access
PRIMULE: Privacy risk mitigation for user profiles
Pratesi F, Gabrielli L, Cintia P, Monreale A, Giannotti F
The availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources to complement standard official statistics. However, this particular kind of data is subject to privacy concerns because it can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles, derived from mobile phone data, by proposing PRIMULE, a privacy risk mitigation strategy. This method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky users in a set of data. Extensive experimentation on real-world data shows the effectiveness of the PRIMULE strategy in terms of both the quality of mobile user profiles and the utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for city users classification.
Source: DATA & KNOWLEDGE ENGINEERING, vol. 125 (issue 101786)
DOI: 10.1016/j.datak.2019.101786
Project(s): SoBigData
See at:
CNR IRIS
| ISTI Repository
| Archivio istituzionale della Ricerca - Scuola Normale Superiore
| www.sciencedirect.com
| Data & Knowledge Engineering
2018
Contribution to book
Open Access
How data mining and machine learning evolved from relational data base to data science
Amato G, Candela L, Castelli D, Esuli A, Falchi F, Gennaro C, Giannotti F, Monreale A, Nanni M, Pagano P, Pappalardo L, Pedreschi D, Pratesi F, Rabitti F, Rinzivillo S, Rossetti G, Ruggieri S, Sebastiani F, Tesconi M
During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to the profound pervasiveness of relational databases in every kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today.
Source: STUDIES IN BIG DATA, pp. 287-306
DOI: 10.1007/978-3-319-61893-7_17
See at:
arpi.unipi.it
| CNR IRIS
| link.springer.com
| ISTI Repository
| doi.org
2018
Conference article
Open Access
Privacy Preserving Multidimensional Profiling
Pratesi F, Monreale A, Giannotti F, Pedreschi D
Recently, big data has become central in the analysis of human behavior and the development of innovative services. In particular, a new class of services is emerging, taking advantage of different sources of data, in order to consider the multiple aspects of human beings. Unfortunately, these data can lead to re-identification problems and other privacy leaks, as widely reported in both scientific literature and media. The risk is even more pressing if multiple sources of data are linked together, since a potential adversary could know information related to each dataset. For this reason, it is necessary to accurately evaluate and mitigate the individual privacy risk before releasing personal data. In this paper, we propose a methodology for the first task, i.e., assessing privacy risk, in a multidimensional scenario, defining some possible privacy attacks and simulating them using real-world datasets.
Source: LECTURE NOTES OF THE INSTITUTE FOR COMPUTER SCIENCES, SOCIAL INFORMATICS AND TELECOMMUNICATIONS ENGINEERING, pp. 142-152. Pisa, Italy, 29-30/11/2017
DOI: 10.1007/978-3-319-76111-4_15
Project(s): SoBigData
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
| Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
| Archivio istituzionale della Ricerca - Scuola Normale Superiore