2023 Journal article Open Access
Dense Hebbian neural networks: a replica symmetric picture of unsupervised learning
Agliari E, Albanese L, Alemanno F, Alessandrelli A, Barra A, Giannotti F, Lotito D, Pedreschi D
We consider dense, associative neural networks trained with no supervision and we investigate their computational capabilities analytically, via statistical-mechanics tools, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of the control parameters (e.g. quality and quantity of the training dataset, network storage, noise) that is valid in the limit of large network size and structureless datasets. Moreover, we establish a bridge between macroscopic observables standardly used in statistical mechanics and loss functions typically used in machine learning. As technical remarks, from the analytical side, we extend Guerra's interpolation to tackle the non-Gaussian distributions involved in the post-synaptic potentials while, from the computational counterpart, we insert Plefka's approximation in the Monte Carlo scheme, to speed up the evaluation of the synaptic tensor, overall obtaining a novel and broad approach to investigate unsupervised learning in neural networks, beyond the shallow limit.
Source: PHYSICA. A, vol. 627
DOI: 10.1016/j.physa.2023.129143
See at: CNR IRIS Open Access | ISTI Repository Open Access | www.sciencedirect.com Open Access | CNR IRIS Restricted


2023 Conference article Restricted
Explaining socio-demographic and behavioral patterns of vaccination against the swine flu (H1N1) pandemic
Punzi C, Maslennikova A, Gezici G, Pellungrini R, Giannotti F
Pandemic vaccination campaigns must account for vaccine skepticism as an obstacle to overcome. Using machine learning to identify behavioral and psychological patterns in public survey datasets can provide valuable insights and inform vaccination campaigns based on empirical evidence. However, we argue that the adoption of local and global explanation methodologies can provide additional support to health practitioners by suggesting personalized communication strategies and revealing potential demographic, social, or structural barriers to vaccination requiring systemic changes. In this paper, we first implement a chain classification model for the adoption of the vaccine during the H1N1 influenza outbreak taking seasonal vaccination information into account, and then compare it with a binary classifier for vaccination to better understand the overall patterns in the data. Following that, we derive and compare global explanations using post-hoc methodologies and interpretable-by-design models. Our findings indicate that socio-demographic factors play a distinct role in the H1N1 vaccination as compared to the general vaccination. Nevertheless, medical recommendation and health insurance remain significant factors for both vaccinations. Then, we concentrated on the subpopulation of individuals who did not receive an H1N1 vaccination despite being at risk of developing severe symptoms. In an effort to assist practitioners in providing effective recommendations to patients, we present rules and counterfactuals for the selected instances based on local explanations. Finally, we raise concerns regarding gender and racial disparities in healthcare access by analysing the interaction effects of sensitive attributes on the model's output.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT), pp. 621-635. Lisbon, Portugal, 26-28/07/2023
DOI: 10.1007/978-3-031-44067-0_31
Project(s): HumanE-AI-Net via OpenAIRE, SoBigData-PlusPlus via OpenAIRE
See at: doi.org Restricted | CNR IRIS Restricted | link.springer.com Restricted


2023 Conference article Restricted
Handling missing values in local post-hoc explainability
Cinquini M., Giannotti F., Guidotti R., Mattei A.
Missing data are quite common in real scenarios when using Artificial Intelligence (AI) systems for decision-making with tabular data and effectively handling them poses a significant challenge for such systems. While some machine learning models used by AI systems can tackle this problem, the existing literature lacks post-hoc explainability approaches able to deal with predictors that encounter missing data. In this paper, we extend a widely used local model-agnostic post-hoc explanation approach that enables explainability in the presence of missing values by incorporating state-of-the-art imputation methods within the explanation process. Since our proposal returns explanations in the form of feature importance, the user will be aware also of the importance of a missing value in a given record for a particular prediction. Extensive experiments show the effectiveness of the proposed method with respect to some baseline solutions relying on traditional data imputation.
Source: xAI 2023 - World Conference on Explainable Artificial Intelligence, pp. 256–278, Lisbon, Portugal, 26-28/07/2023
DOI: 10.1007/978-3-031-44067-0_14
Project(s): TAILOR via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE, Humane AI via OpenAIRE
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2023 Journal article Open Access
Co-design of human-centered, explainable AI for clinical decision support
Panigutti C., Beretta A., Fadda D., Giannotti F., Pedreschi D., Perotti A., Rinzivillo S.
eXplainable AI (XAI) involves two intertwined but separate challenges: the development of techniques to extract explanations from black-box AI models and the way such explanations are presented to users, i.e., the explanation user interface. Despite its importance, the second aspect has received limited attention so far in the literature. Effective AI explanation interfaces are fundamental for allowing human decision-makers to take advantage of and oversee high-risk AI systems effectively. Following an iterative design approach, we present the first cycle of prototyping-testing-redesigning of an explainable AI technique and its explanation user interface for clinical Decision Support Systems (DSS). We first present an XAI technique that meets the technical requirements of the healthcare domain: sequential, ontology-linked patient data, and multi-label classification tasks. We demonstrate its applicability to explain a clinical DSS, and we design a first prototype of an explanation user interface. Next, we test such a prototype with healthcare providers and collect their feedback with a two-fold outcome: First, we obtain evidence that explanations increase users' trust in the XAI system, and second, we obtain useful insights on the perceived deficiencies of their interaction with the system, so we can re-design a better, more human-centered explanation interface.
Source: ACM transactions on interactive intelligent systems (Online) 13 (2023)
DOI: 10.1145/3587271
Project(s): HumanE-AI-Net via OpenAIRE, XAI via OpenAIRE
See at: dl.acm.org Open Access | Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | ISTI Repository Open Access | ACM Transactions on Interactive Intelligent Systems Restricted | CNR ExploRA


2022 Other Open Access
The complexity of Artificial Learning - CNR Foresight report
Bacco F. M., Colantonio S., Lepri S., Bartolucci C., Cinti C., Falchi F., Giannotti F.
Artificial learning will significantly affect human life and science in the next decades. However, its basic principles of functioning are still theoretically not well understood. The aim of the workshop was to examine the relations and perspectives of the interaction between Data and Computer Sciences and Complexity theory on this matter.

See at: CNR IRIS Open Access | ISTI Repository Open Access | www.foresight.cnr.it Open Access | CNR IRIS Restricted


2022 Journal article Open Access
Understanding peace through the world news
Voukelatou V, Miliou I, Giannotti F, Pappalardo L
Peace is a principal dimension of well-being and is the way out of inequity and violence. Thus, its measurement has drawn the attention of researchers, policymakers, and peacekeepers. During the last years, novel digital data streams have drastically changed the research in this field. The current study exploits information extracted from a new digital database called Global Data on Events, Location, and Tone (GDELT) to capture peace through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use explainable AI techniques to obtain the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by researchers, policymakers, and peacekeepers, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peace.
Source: EPJ DATA SCIENCE, vol. 11 (issue 1)
DOI: 10.1140/epjds/s13688-022-00315-z
Project(s): XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE
See at: epjdatascience.springeropen.com Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2022 Journal article Open Access
Origin and destination attachment: study of cultural integration on Twitter
Kim J, Sirbu A, Giannotti F, Rossetti G, Rapoport H
The cultural integration of immigrants conditions their overall socio-economic integration as well as natives' attitudes towards globalisation in general and immigration in particular. At the same time, excessive integration--or assimilation--can be detrimental in that it implies forfeiting one's ties to the origin country and eventually translates into a loss of diversity (from the viewpoint of host countries) and of global connections (from the viewpoint of both host and home countries). Cultural integration can be described using two dimensions: the preservation of links to the origin country and culture, which we call origin attachment, and the creation of new links together with the adoption of cultural traits from the new residence country, which we call destination attachment. In this paper we introduce a means to quantify these two aspects based on Twitter data. We build origin and destination attachment indices and analyse their possible determinants (e.g., language proximity, distance between countries), also in relation to Hofstede's cultural dimension scores. The results stress the importance of language: a common language between origin and destination countries favours origin attachment, as does low proficiency in the host language. Common geographical borders seem to favour both origin and destination attachment. Regarding cultural dimensions, larger differences among origin and destination countries in terms of Individualism, Masculinity and Uncertainty appear to favour destination attachment and lower origin attachment.
Source: EPJ DATA SCIENCE, vol. 11 (issue 1)
DOI: 10.1140/epjds/s13688-022-00363-5
Project(s): HumMingBird via OpenAIRE, SoBigData-PlusPlus via OpenAIRE
See at: EPJ Data Science Open Access | epjdatascience.springeropen.com Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2022 Journal article Open Access
Stable and actionable explanations of black-box models through factual and counterfactual rules
Guidotti R., Monreale A., Ruggieri S., Naretto F., Turini F., Pedreschi D., Giannotti F.
Recent years have witnessed the rise of accurate but obscure classification models that hide the logic of their internal decision processes. Explaining the decision taken by a black-box classifier on a specific input instance is therefore of striking interest. We propose a local rule-based model-agnostic explanation method providing stable and actionable explanations. An explanation consists of a factual logic rule, stating the reasons for the black-box decision, and a set of actionable counterfactual logic rules, proactively suggesting the changes in the instance that lead to a different outcome. Explanations are computed from a decision tree that mimics the behavior of the black-box locally to the instance to explain. The decision tree is obtained through a bagging-like approach that favors stability and fidelity: first, an ensemble of decision trees is learned from neighborhoods of the instance under investigation; then, the ensemble is merged into a single decision tree. Neighbor instances are synthetically generated through a genetic algorithm whose fitness function is driven by the black-box behavior. Experiments show that the proposed method advances the state-of-the-art towards a comprehensive approach that successfully covers stability and actionability of factual and counterfactual explanations.
Source: DATA MINING AND KNOWLEDGE DISCOVERY, vol. 38, pp. 2825-2862
DOI: 10.1007/s10618-022-00878-5
Project(s): NoBIAS via OpenAIRE, TAILOR via OpenAIRE, HumanE-AI-Net via OpenAIRE, SAI: Social Explainable Artificial Intelligence via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE, SAI via OpenAIRE, Social Explainable Artificial Intelligence (SAI) via OpenAIRE
See at: Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | Archivio della Ricerca - Università di Pisa Open Access | Software Heritage Restricted | IRIS Cnr Restricted | GitHub Restricted | CNR IRIS Restricted


2022 Conference article Restricted
Transparent latent space counterfactual explanations for tabular data
Bodria F., Guidotti R., Giannotti F., Pedreschi D.
Artificial Intelligence decision-making systems have dramatically increased their predictive performance in recent years, beating humans in many different specific tasks. However, with increased performance has come an increase in the complexity of the black-box models adopted by the AI systems, making them entirely obscure for the decision process adopted. Explainable AI is a field that seeks to make AI decisions more transparent by producing explanations. In this paper, we propose T-LACE, an approach able to retrieve post-hoc counterfactual explanations for a given pre-trained black-box model. T-LACE exploits the similarity and linearity properties of a custom-created transparent latent space to build reliable counterfactual explanations. We tested T-LACE on several tabular datasets and provided qualitative evaluations of the generated explanations in terms of similarity, robustness, and diversity. Comparative analysis against various state-of-the-art counterfactual explanation methods shows the higher effectiveness of our approach.
DOI: 10.1109/dsaa54385.2022.10032407
Project(s): XAI via OpenAIRE
See at: doi.org Restricted | Archivio istituzionale della Ricerca - Scuola Normale Superiore Restricted | Archivio della Ricerca - Università di Pisa Restricted | IRIS Cnr Restricted | CNR IRIS Restricted


2022 Conference article Open Access
Interpretable latent space to enable counterfactual explanations
Bodria F., Guidotti R., Giannotti F., Pedreschi D.
Many dimensionality reduction methods have been introduced to map a data space into one with fewer features and enhance machine learning models' capabilities. This reduced space, called latent space, holds properties that allow researchers to understand the data better and produce better models. This work proposes an interpretable latent space that preserves the similarity of data points and supports a new way of learning a classification model that allows prediction and explanation through counterfactual examples. We demonstrate with extensive experiments the effectiveness of the latent space with respect to different metrics in comparison with several competitors, as well as the quality of the achieved counterfactual explanations.
Source: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 13601, pp. 525-540. Toulouse, France, 10-12/10/2022
DOI: 10.1007/978-3-031-18840-4_37
Project(s): XAI via OpenAIRE
See at: Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | doi.org Restricted | Archivio della Ricerca - Università di Pisa Restricted | IRIS Cnr Restricted | CNR IRIS Restricted


2021 Conference article Open Access
Measuring immigrants adoption of natives shopping consumption with machine learning
Guidotti R, Nanni M, Giannotti F, Pedreschi D, Bertoli S, Speciale B, Rapoport H
"Tell me what you eat and I will tell you what you are". Jean Anthelme Brillat-Savarin was among the first to recognize the relationship between identity and food consumption. Food adoption choices are much less exposed to external judgment and social pressure than other individual behaviours, and can be observed over a long period. That makes them an interesting basis for, among other applications, studying the integration of immigrants from a food consumption viewpoint. Indeed, in this work we analyze immigrants' food consumption from shopping retail data for understanding if and how it converges towards that of natives. As the core contribution of our proposal, we define a score of adoption of natives' consumption habits by an individual as the probability of being recognized as a native from a machine learning classifier, thus adopting a completely data-driven approach. We measure the immigrant's adoption of natives' consumption behavior over a long time, and we identify different trends. A case study on real data of a large nation-wide supermarket chain reveals that we can distinguish five main different groups of immigrants depending on their trends of native consumption adoption.
DOI: 10.1007/978-3-030-67670-4_23
See at: CNR IRIS Open Access | link.springer.com Open Access | ISTI Repository Open Access | CNR IRIS Restricted | link.springer.com Restricted


2021 Other Open Access
Understanding peacefulness through the world news
Voukelatou V, Miliou I, Giannotti F, Pappalardo L
Peacefulness is a principal dimension of well-being for all humankind and is the way out of inequity and every single form of violence. Thus, its measurement has lately drawn the attention of researchers and policy-makers. During the last years, novel digital data streams have drastically changed the research in this field. In the current study, we exploit information extracted from the Global Data on Events, Location, and Tone (GDELT) digital news database to capture peacefulness through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use the SHAP methodology to obtain the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions overall, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by Social Good researchers, policy-makers, and peace-builders, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peacefulness.
Project(s): SoBigData-PlusPlus via OpenAIRE

See at: arxiv.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2021 Journal article Open Access
Data Science: a game changer for science and innovation
Grossi V, Giannotti F, Pedreschi D, Manghi P, Pagano P, Assante M
This paper shows data science's potential for disruptive innovation in science, industry, policy, and people's lives. We present how data science impacts science and society at large in the coming years, including ethical problems in managing human behavior data and considering the quantitative expectations of data science economic impact. We introduce concepts such as open science and e-infrastructure as useful tools for supporting ethical data science and training new generations of data scientists. Finally, this work outlines SoBigData Research Infrastructure as an easy-to-access platform for executing complex data science processes. The services proposed by SoBigData are aimed at using data science to understand the complexity of our contemporary, globally interconnected society.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 11, pp. 263-278
DOI: 10.1007/s41060-020-00240-2
See at: CNR IRIS Open Access | link.springer.com Open Access | ISTI Repository Open Access | CNR IRIS Restricted | International Journal of Data Science and Analytics Restricted


2021 Journal article Open Access
Predicting seasonal influenza using supermarket retail records
Miliou I, Xiong X, Rinzivillo S, Zhang Q, Rossetti G, Giannotti F, Pedreschi D, Vespignani A
Increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecast systems. In this paper, we propose the use of a novel data source, namely retail market data, to improve seasonal influenza forecasting. Specifically, we consider supermarket retail data as a proxy signal for influenza, through the identification of sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates for influenza incidence in Italy up to 4 weeks ahead. We make use of the Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, acting as a proxy that can be used for the real-time analysis of epidemics.
Source: PLOS COMPUTATIONAL BIOLOGY, vol. 17 (issue 7)
DOI: 10.1371/journal.pcbi.1009087
See at: CNR IRIS Open Access | journals.plos.org Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2021 Journal article Open Access
Ethics of smart cities: towards value-sensitive design and co-evolving city life
Helbing D, Fanitabasi F, Giannotti F, Hanggli R, Hausladen Ci, Van Den Hoven J, Mahajan S, Pedreschi D, Pournaras E
The digital revolution has brought about many societal changes such as the creation of "smart cities". The smart city concept has changed the urban ecosystem by embedding digital technologies in the city fabric to enhance the quality of life of its inhabitants. However, it has also led to some pressing issues and challenges related to data, privacy, ethics, inclusion, and fairness. While the initial concept of smart cities was largely technology- and data-driven, focused on the automation of traffic, logistics and processes, this concept is currently being replaced by technology-enabled, human-centred solutions. However, this is not the end of the development, as there is now a big trend towards "design for values". In this paper, we point out how a value-sensitive design approach could promote a more sustainable pathway of cities that better serves people and nature. Such "value-sensitive design" will have to take ethics, law and culture on board. We discuss how organising the digital world in a participatory way, as well as leveraging the concepts of self-organisation, self-regulation, and self-control, would foster synergy effects and thereby help to leverage a sustainable technological revolution on a global scale. Furthermore, a "democracy by design" approach could also promote resilience.
Source: SUSTAINABILITY (BASEL), vol. 13 (issue 20)
DOI: 10.3390/su132011162
Project(s): SoBigData-PlusPlus via OpenAIRE
See at: CNR IRIS Open Access | ISTI Repository Open Access | www.mdpi.com Open Access | CNR IRIS Restricted


2021 Conference article Restricted
Artificial intelligence for humankind: a panel on how to create truly interactive and human-centered AI for the benefit of individuals and society
Schmidt A, Giannotti F, Mackay W, Shneiderman B, Vaananen K
This panel discusses the role of human-computer interaction (HCI) in the conception, design, and implementation of human-centered artificial intelligence (AI). For us, it is important that AI and machine learning (ML) are ethical and create value for humans - as individuals as well as for society. Our discussion emphasizes the opportunities of using HCI and User Experience Design methods to create advanced AI/ML-based systems that will be widely adopted, reliable, safe, trustworthy, and responsible. The resulting systems will integrate AI and ML algorithms while providing user interfaces and control panels that ensure meaningful human control.
DOI: 10.1007/978-3-030-85607-6_32
See at: CNR IRIS Restricted | link.springer.com Restricted


2021 Journal article Open Access
Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Nanni M, Andrienko G, Barabasi Al, Boldrini C, Bonchi F, Cattuto C, Chiaromonte F, Comande G, Conti M, Cote M, Dignum F, Dignum V, Domingoferrer J, Ferragina P, Giannotti F, Guidotti R, Helbing D, Kaski K, Kertesz J, Lehmann S, Lepri B, Lukowicz P, Matwin S, Jimenez Dm, Monreale A, Morik K, Oliver N, Passarella A, Passerini A, Pedreschi D, Pentland A, Pianesi F, Pratesi F, Rinzivillo S, Ruggieri S, Siebes A, Torra V, Trasarti R, Hoven J, Vespignani A
The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the "phase 2" of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion; and, in turn, this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates--if and when they want and for specific aims--with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Source: ETHICS AND INFORMATION TECHNOLOGY, vol. 23 (issue 3)
DOI: 10.1007/s10676-020-09572-w
Project(s): SoBigData-PlusPlus via OpenAIRE
See at: Aaltodoc Publication Archive Open Access | Ethics and Information Technology Open Access | Recolector de Ciencia Abierta, RECOLECTA Open Access | CNR IRIS Open Access | Archivio Istituzionale Open Access | link.springer.com Open Access | City Research Online Open Access | ISTI Repository Open Access | Online Research Database In Technology Open Access | NARCIS Open Access | Digitala Vetenskapliga Arkivet - Academic Archive On-line Open Access | Publikationer från Umeå universitet Open Access | NARCIS Restricted | CNR IRIS Restricted | kclpure.kcl.ac.uk Restricted | Fraunhofer-ePrints Restricted | publons.com Restricted | www.scopus.com Restricted


2021 Conference article Open Access
Boosting synthetic data generation with effective nonlinear causal discovery
Cinquini M, Giannotti F, Guidotti R
Synthetic data generation has been widely adopted in software testing, data privacy, imbalanced learning, artificial intelligence explanation, etc. In all such contexts, it is important to generate plausible data samples. A common assumption of approaches widely used for data generation is the independence of the features. However, typically, the variables of a dataset depend on one another, and these dependencies are not considered in data generation, leading to the creation of implausible records. The main problem is that dependencies among variables are typically unknown. In this paper, we design a synthetic dataset generator for tabular data that is able to discover nonlinear causalities among the variables and use them at generation time. State-of-the-art methods for nonlinear causal discovery are typically inefficient. We boost them by restricting the causal discovery among the features appearing in the frequent patterns efficiently retrieved by a pattern mining algorithm. To validate our proposal, we design a framework for generating synthetic datasets with known causalities. Wide experimentation on many synthetic datasets and real datasets with known causalities shows the effectiveness of the proposed method.
DOI: 10.1109/cogmi52975.2021.00016
Project(s): TAILOR via OpenAIRE, HumanE-AI-Net via OpenAIRE, SoBigData-PlusPlus via OpenAIRE
See at: CNR IRIS Open Access | ieeexplore.ieee.org Open Access | ISTI Repository Open Access | doi.org Restricted | CNR IRIS Restricted


2021 Journal article Open Access
GLocalX - From local to global explanations of black box AI models
Setzu M., Guidotti R., Monreale A., Turini F., Pedreschi D., Giannotti F.
Artificial Intelligence (AI) has come to prominence as one of the major components of our society, with applications in most aspects of our lives. In this field, complex and highly nonlinear machine learning models such as ensemble models, deep neural networks, and Support Vector Machines have consistently shown remarkable accuracy in solving complex tasks. Although accurate, AI models often are "black boxes" which we are not able to understand. Relying on these models has a multifaceted impact and raises significant concerns about their transparency. Applications in sensitive and critical domains are a strong motivational factor in trying to understand the behavior of black boxes. We propose to address this issue by providing an interpretable layer on top of black box models by aggregating "local" explanations. We present GLOCALX, a "local-first" model agnostic explanation method. Starting from local explanations expressed in form of local decision rules, GLOCALX iteratively generalizes them into global explanations by hierarchically aggregating them. Our goal is to learn accurate yet simple interpretable models to emulate the given black box, and, if possible, replace it entirely. We validate GLOCALX in a set of experiments in standard and constrained settings with limited or no access to either data or local explanations. Experiments show that GLOCALX is able to accurately emulate several models with simple and small models, reaching state-of-the-art performance against natively global solutions. Our findings show how it is often possible to achieve a high level of both accuracy and comprehensibility of classification models, even in complex domains with high-dimensional data, without necessarily trading one property for the other. This is a key requirement for a trustworthy AI, necessary for adoption in high-stakes decision making applications.
Source: ARTIFICIAL INTELLIGENCE, vol. 294
DOI: 10.1016/j.artint.2021.103457
DOI: 10.48550/arxiv.2101.07685
Project(s): TAILOR via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE
See at: arXiv.org e-Print Archive Open Access | Artificial Intelligence Open Access | Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | IRIS Cnr Open Access | Artificial Intelligence Restricted | Archivio della Ricerca - Università di Pisa Restricted | doi.org Restricted | CNR IRIS Restricted


2021 Journal article Open Access
Correction to: Human migration: the big data perspective
Sîrbu A., Andrienko G., Andrienko N., Boldrini C., Conti M., Giannotti F., Guidotti R., Bertoli S., Kim J., Muntean C. I., Pappalardo L., Passarella A., Pedreschi D., Pollacci L., Pratesi F., Sharma R.
How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
Source: INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, vol. 11, pp. 341-360
DOI: 10.1007/s41060-021-00260-6
Project(s): SoBigData via OpenAIRE
See at: CNR IRIS Open Access | link.springer.com Open Access | Journal of Data Science Open Access | International Journal of Data Science and Analytics Open Access | Open Access Repository Open Access | International Journal of Data Science and Analytics Restricted | CNR IRIS Restricted