2022 Journal article Open Access OPEN
Explaining short text classification with diverse synthetic exemplars and counter-exemplars
Lampridis O, State L, Guidotti R, Ruggieri S
We present xspells, a model-agnostic local approach for explaining the decisions of black box models in the classification of short texts. The explanations provided consist of a set of exemplar sentences and a set of counter-exemplar sentences. The former are examples classified by the black box with the same label as the text to explain. The latter are examples classified with a different label (a form of counter-factuals). Both are close in meaning to the text to explain, and both are meaningful sentences, albeit synthetically generated. xspells generates neighbors of the text to explain in a latent space using Variational Autoencoders for encoding text and decoding latent instances. A decision tree is learned from randomly generated neighbors and used to drive the selection of the exemplars and counter-exemplars. Moreover, diversity of counter-exemplars is modeled as an optimization problem, solved by a greedy algorithm with a theoretical guarantee. We report experiments on three datasets showing that xspells outperforms the well-known lime method in terms of quality of explanations, fidelity, diversity, and usefulness, and that it is comparable to it in terms of stability.
Source: MACHINE LEARNING
DOI: 10.1007/s10994-022-06150-7
Project(s): NoBIAS via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: CNR IRIS Open Access | link.springer.com Open Access | ISTI Repository Open Access | CNR IRIS Restricted
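A minimal sketch of the kind of pipeline this abstract describes, assuming a generic tabular black box and identity stand-ins for the VAE encoder/decoder; the selection step below uses plain distances rather than the surrogate tree and the greedy diversity procedure of the paper, so it illustrates the structure only, not the authors' method.

```python
# Hypothetical sketch: local explanation via synthetic latent neighbours.
# encode/decode are identity stand-ins for the paper's Variational Autoencoder.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)  # the opaque model

encode = lambda v: v   # stand-in for the VAE encoder
decode = lambda z: z   # stand-in for the VAE decoder

x = X[0]                                   # instance to explain
z = encode(x)
target = black_box.predict(x.reshape(1, -1))[0]

# 1) generate synthetic neighbours of x in the latent space
Z = z + rng.normal(scale=0.3, size=(300, z.shape[0]))
labels = black_box.predict(decode(Z))

# 2) learn a local surrogate tree that mimics the black box on the neighbourhood
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Z, labels)

# 3) pick exemplars (same label as x) and counter-exemplars (different label),
#    here simply the closest ones; the paper drives this step with the surrogate
#    tree and adds a greedy diversity-maximisation step
order = np.argsort(np.linalg.norm(Z - z, axis=1))
exemplars = [decode(Z[i]) for i in order if labels[i] == target][:3]
counter_exemplars = [decode(Z[i]) for i in order if labels[i] != target][:3]
print(len(exemplars), "exemplars,", len(counter_exemplars), "counter-exemplars")
```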


2022 Conference article Open Access OPEN
Bias discovery within human raters: a case study of the Jigsaw dataset
Marchiori Manerba M., Guidotti R., Passaro L., Ruggieri S.
Understanding and quantifying the bias introduced by human annotation of data is a crucial problem for trustworthy supervised learning. Recently, a perspectivist trend has emerged in the NLP community, focusing on the inadequacy of previous aggregation schemes, which suppose the existence of a single ground truth. This assumption is particularly problematic for sensitive tasks involving subjective human judgments, such as toxicity detection. To address these issues, we propose a preliminary approach for bias discovery within human raters by exploring individual ratings for specific sensitive topics annotated in the texts. Our analysis focuses on the Jigsaw dataset, a collection of comments aimed at challenging online toxicity identification.

See at: aclanthology.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted
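As a companion to the approach sketched in this abstract, here is a toy, hypothetical exploration of individual rater behaviour; the table, the column names (rater, mentions_identity, toxic) and the gap statistic are invented for illustration and do not reproduce the Jigsaw schema or the paper's analysis.

```python
# Hypothetical sketch: per-rater rating behaviour on comments mentioning a sensitive identity.
import pandas as pd

annotations = pd.DataFrame({
    "rater":             ["r1", "r1", "r2", "r2", "r3", "r3", "r3"],
    "mentions_identity": [True, False, True, False, True, True, False],
    "toxic":             [1, 0, 1, 1, 0, 1, 0],   # individual (non-aggregated) judgments
})

# Average toxicity assigned by each rater, split by identity mention: large
# per-rater gaps may flag candidate annotation biases to be inspected further.
per_rater = (annotations
             .groupby(["rater", "mentions_identity"])["toxic"]
             .mean()
             .unstack("mentions_identity"))
per_rater["gap"] = per_rater[True] - per_rater[False]
print(per_rater)
```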


2022 Journal article Open Access OPEN
Stable and actionable explanations of black-box models through factual and counterfactual rules
Guidotti R., Monreale A., Ruggieri S., Naretto F., Turini F., Pedreschi D., Giannotti F.
Recent years have witnessed the rise of accurate but obscure classification models that hide the logic of their internal decision processes. Explaining the decision taken by a black-box classifier on a specific input instance is therefore of striking interest. We propose a local rule-based model-agnostic explanation method providing stable and actionable explanations. An explanation consists of a factual logic rule, stating the reasons for the black-box decision, and a set of actionable counterfactual logic rules, proactively suggesting the changes in the instance that lead to a different outcome. Explanations are computed from a decision tree that mimics the behavior of the black-box locally to the instance to explain. The decision tree is obtained through a bagging-like approach that favors stability and fidelity: first, an ensemble of decision trees is learned from neighborhoods of the instance under investigation; then, the ensemble is merged into a single decision tree. Neighbor instances are synthetically generated through a genetic algorithm whose fitness function is driven by the black-box behavior. Experiments show that the proposed method advances the state-of-the-art towards a comprehensive approach that successfully covers stability and actionability of factual and counterfactual explanations.
Source: DATA MINING AND KNOWLEDGE DISCOVERY, vol. 38, pp. 2825-2862
DOI: 10.1007/s10618-022-00878-5
Project(s): NoBIAS via OpenAIRE, TAILOR via OpenAIRE, HumanE-AI-Net via OpenAIRE, SAI: Social Explainable Artificial Intelligence via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | Archivio della Ricerca - Università di Pisa Open Access | Software Heritage Restricted | IRIS Cnr Restricted | GitHub Restricted | CNR IRIS Restricted
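The following is a simplified, hypothetical sketch of the bagging-like surrogate construction described in the entry above: several local trees are learned on random synthetic neighbourhoods and then distilled into a single tree. The distillation step below (re-labelling the pooled neighbourhood with the ensemble's majority vote) is only a stand-in for the paper's tree-merging procedure, and the genetic neighbourhood generation and counterfactual-rule extraction are omitted.

```python
# Hypothetical sketch: a stable local surrogate from an ensemble of local trees.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
X, y = make_classification(n_samples=600, n_features=4, random_state=1)
black_box = RandomForestClassifier(random_state=1).fit(X, y)
x = X[10]                                     # instance to explain

neighbourhoods, trees = [], []
for seed in range(5):                         # ensemble of local surrogates
    Z = x + rng.normal(scale=0.25, size=(200, X.shape[1]))
    t = DecisionTreeClassifier(max_depth=3, random_state=seed)
    t.fit(Z, black_box.predict(Z))
    neighbourhoods.append(Z)
    trees.append(t)

# distill the ensemble into one tree: pool the neighbourhoods and re-label them
# with the ensemble's majority vote (a stand-in for the paper's merging step)
pooled = np.vstack(neighbourhoods)
votes = np.mean([t.predict(pooled) for t in trees], axis=0) >= 0.5
final_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(pooled, votes)

# the factual rule is the root-to-leaf path followed by x in the final tree;
# counterfactual rules would come from paths reaching the opposite class (not shown)
print(export_text(final_tree))
print("black box label for x:", black_box.predict(x.reshape(1, -1))[0])
```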


2021 Conference article Open Access OPEN
Ensemble of counterfactual explainers
Guidotti R., Ruggieri S.
In eXplainable Artificial Intelligence (XAI), several counterfactual explainers have been proposed, each focusing on some desirable properties of counterfactual instances: minimality, actionability, stability, diversity, plausibility, discriminative power. We propose an ensemble of counterfactual explainers that boosts weak explainers, which provide only a subset of such properties, to a powerful method covering all of them. The ensemble runs weak explainers on a sample of instances and of features, and it combines their results by exploiting a diversity-driven selection function. The method is model-agnostic and, through a wrapping approach based on autoencoders, it is also data-agnostic.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 12986, pp. 358-368. Halifax, Canada, 11-13/10/2021
DOI: 10.1007/978-3-030-88942-5_28
DOI: 10.48550/arxiv.2308.15194
Project(s): TAILOR via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | doi.org Open Access | IRIS Cnr Open Access | dblp.uni-trier.de Restricted | Archivio della Ricerca - Università di Pisa Restricted | doi.org Restricted | GitHub Restricted | CNR IRIS Restricted
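A hypothetical sketch of the ensemble idea described above: each "weak explainer" searches for a counterfactual by perturbing a random subset of features, and a greedy, diversity-driven step selects a small, mutually distant set of candidates. This illustrates the scheme on toy tabular data; it is not the authors' implementation.

```python
# Hypothetical sketch: ensemble of weak counterfactual explainers + diversity selection.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X, y = make_classification(n_samples=500, n_features=6, random_state=2)
bb = RandomForestClassifier(random_state=2).fit(X, y)
x = X[0]
y0 = bb.predict(x.reshape(1, -1))[0]

def weak_explainer(instance, features, n_trials=200):
    """Perturb only `features`; return the first perturbation that flips the prediction."""
    for _ in range(n_trials):
        c = instance.copy()
        c[features] = c[features] + rng.normal(scale=1.0, size=len(features))
        if bb.predict(c.reshape(1, -1))[0] != y0:
            return c
    return None

# each weak explainer works on a random subset of features; pool their candidates
candidates = []
for _ in range(20):
    feats = rng.choice(X.shape[1], size=2, replace=False)
    c = weak_explainer(x, feats)
    if c is not None:
        candidates.append(c)

# greedy, diversity-driven selection: repeatedly add the candidate farthest
# from those already selected
selected = candidates[:1]
while candidates and len(selected) < min(3, len(candidates)):
    gaps = [min(np.linalg.norm(c - s) for s in selected) for c in candidates]
    selected.append(candidates[int(np.argmax(gaps))])
print(len(candidates), "candidate counterfactuals,", len(selected), "selected")
```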


2020 Journal article Open Access OPEN
Causal inference for social discrimination reasoning
Qureshi B, Kamiran F, Karim A, Ruggieri S, Pedreschi D
The discovery of discriminatory bias in human or automated decision making is a task of increasing importance and difficulty, exacerbated by the pervasive use of machine learning and data mining. Currently, discrimination discovery largely relies upon correlation analysis of decision records, disregarding the impact of confounding biases. We present a method for causal discrimination discovery based on propensity score analysis, a statistical tool for filtering out the effect of confounding variables. We introduce causal measures of discrimination which quantify the effect of group membership on the decisions, and highlight causal discrimination/favoritism patterns by learning regression trees over the novel measures. We validate our approach on two real world datasets. Our proposed framework for causal discrimination has the potential to enhance the transparency of machine learning with tools for detecting discriminatory bias both in the training data and in the learning algorithms.
Source: JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, vol. 54 (issue 2), pp. 425-437
DOI: 10.1007/s10844-019-00580-x
DOI: 10.48550/arxiv.1608.03735

See at: arXiv.org e-Print Archive Open Access | Journal of Intelligent Information Systems Open Access | CNR IRIS Open Access | link.springer.com Open Access | ISTI Repository Open Access | Journal of Intelligent Information Systems Restricted | doi.org Restricted | CNR IRIS Restricted
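A hypothetical, toy illustration of the propensity-score idea described above: group membership is regressed on confounders, records are stratified by the estimated propensity score, and positive-decision rates are compared within strata. The synthetic data and the stratified rate difference below are illustrative and are not the measures defined in the paper.

```python
# Hypothetical sketch: propensity-score stratification for discrimination analysis.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 2000
covariates = rng.normal(size=(n, 3))                                 # confounders
group = (covariates[:, 0] + rng.normal(size=n) > 0).astype(int)      # protected group
decision = ((covariates.sum(axis=1) - 0.4 * group + rng.normal(size=n)) > 0).astype(int)

# propensity of belonging to the protected group, given the confounders
ps = LogisticRegression().fit(covariates, group).predict_proba(covariates)[:, 1]

df = pd.DataFrame({"group": group, "decision": decision,
                   "stratum": pd.qcut(ps, q=5, labels=False)})

# within-stratum difference in positive-decision rates, averaged over strata
rates = df.groupby(["stratum", "group"])["decision"].mean().unstack("group")
effect = (rates[1] - rates[0]).mean()
print("stratified rate difference (toy causal-style measure):", round(effect, 3))
```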


2019 Journal article Open Access OPEN
A survey of methods for explaining black box models
Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D
In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help researchers find the proposals most useful for their own work. The proposed classification of approaches to open black box models should also be useful for putting the many open research questions in perspective.
Source: ACM COMPUTING SURVEYS, vol. 51 (issue 5)
DOI: 10.1145/3236009
DOI: 10.48550/arxiv.1802.01933
Project(s): SoBigData via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | dl.acm.org Open Access | ACM Computing Surveys Open Access | Archivio della Ricerca - Università di Pisa Open Access | CNR IRIS Open Access | ISTI Repository Open Access | ACM Computing Surveys Restricted | doi.org Restricted | CNR IRIS Restricted


2019 Conference article Open Access OPEN
On the stability of interpretable models
Guidotti R, Ruggieri S
Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process. Bias in data collection and preparation, or in the model's construction, may severely affect the accountability of the design process. We conduct an experimental study of the stability of interpretable models with respect to feature selection, instance selection, and model selection. Our conclusions should raise the awareness and attention of the scientific community on the need for a stability impact assessment of interpretable models.
DOI: 10.1109/ijcnn.2019.8852158
DOI: 10.48550/arxiv.1810.09352
Project(s): SoBigData via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | arxiv.org Open Access | CNR IRIS Open Access | ieeexplore.ieee.org Open Access | ISTI Repository Open Access | doi.org Restricted | CNR IRIS Restricted
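A toy probe, under stated assumptions, of the kind of stability measurement this abstract refers to: decision trees are refit on bootstrap resamples (an instance-selection perturbation) and the overlap of the features they actually use is quantified. This illustrates the idea only and is not the paper's experimental protocol.

```python
# Hypothetical sketch: measuring interpretable-model stability under instance selection.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X, y = make_classification(n_samples=400, n_features=8, n_informative=3, random_state=4)

feature_sets = []
for _ in range(10):                               # instance-selection perturbation
    idx = rng.choice(len(X), size=len(X), replace=True)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[idx], y[idx])
    used = frozenset(np.flatnonzero(tree.feature_importances_ > 0))
    feature_sets.append(used)

# pairwise Jaccard overlap of the feature sets actually used by the trees
jaccard = [len(a & b) / len(a | b) for a, b in combinations(feature_sets, 2)]
print("mean Jaccard overlap of used features:", round(float(np.mean(jaccard)), 3))
```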


2018 Contribution to book Open Access OPEN
How data mining and machine learning evolved from relational data base to data science
Amato G, Candela L, Castelli D, Esuli A, Falchi F, Gennaro C, Giannotti F, Monreale A, Nanni M, Pagano P, Pappalardo L, Pedreschi D, Pratesi F, Rabitti F, Rinzivillo S, Rossetti G, Ruggieri S, Sebastiani F, Tesconi M
During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to profound pervasiveness of relational databases in any kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today.
Source: STUDIES IN BIG DATA, pp. 287-306
DOI: 10.1007/978-3-319-61893-7_17

See at: arpi.unipi.it Open Access | CNR IRIS Open Access | link.springer.com Open Access | ISTI Repository Open Access | doi.org Restricted | CNR IRIS Restricted


2018 Other Open Access OPEN
Assessing the stability of interpretable models
Guidotti R, Ruggieri S
Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process, which, in particular, comprises data collection and filtering. Selection bias in data collection or in data pre-processing may affect the model learned. Although model induction algorithms are designed to learn to generalize, they pursue optimization of predictive accuracy. It remains unclear how interpretability is instead impacted. We conduct an experimental analysis to investigate whether interpretable models are able to cope with data selection bias as far as interpretability is concerned.
Project(s): SoBigData via OpenAIRE

See at: arxiv.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2018 Other Open Access OPEN
Local rule-based explanations of black box decision systems
Guidotti R, Monreale A, Ruggieri S, Pedreschi D, Turini F, Giannotti F
The recent years have witnessed the rise of accurate but obscure decision systems which hide the logic of their internal decision processes from the users. The lack of explanations for the decisions of black box systems is a key ethical issue, and a limitation to the adoption of machine learning components in socially sensitive and safety-critical contexts. Therefore, we need explanations that reveal the reasons why a predictor takes a certain decision. In this paper we focus on the problem of black box outcome explanation, i.e., explaining the reasons of the decision taken on a specific instance. We propose LORE, an agnostic method able to provide interpretable and faithful explanations. LORE first learns a local interpretable predictor on a synthetic neighborhood generated by a genetic algorithm. Then it derives from the logic of the local interpretable predictor a meaningful explanation consisting of: a decision rule, which explains the reasons of the decision; and a set of counterfactual rules, suggesting the changes in the instance's features that lead to a different outcome. Extensive experiments show that LORE outperforms existing methods and baselines both in the quality of explanations and in the accuracy of mimicking the black box.
Project(s): SoBigData via OpenAIRE

See at: arxiv.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted
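A deliberately simplified, hypothetical sketch of the neighbourhood-generation step described above: a genetic-style search, with fitness driven by the black box, evolves synthetic instances close to the instance to explain, half constrained to its class and half to a different class. Mutation-only evolution on toy tabular data is used here; this is not the authors' LORE implementation.

```python
# Hypothetical sketch: black-box-driven genetic generation of a local neighbourhood.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X, y = make_classification(n_samples=500, n_features=4, random_state=5)
bb = RandomForestClassifier(random_state=5).fit(X, y)
x = X[3]
y0 = bb.predict(x.reshape(1, -1))[0]

def evolve(target_same_class, generations=20, pop_size=100):
    """Evolve instances close to x whose black-box label matches (or differs from) y0."""
    pop = x + rng.normal(scale=0.5, size=(pop_size, x.shape[0]))
    for _ in range(generations):
        labels = bb.predict(pop)
        closeness = -np.linalg.norm(pop - x, axis=1)
        match = (labels == y0) if target_same_class else (labels != y0)
        fitness = closeness + 10.0 * match                        # black-box-driven fitness
        parents = pop[np.argsort(fitness)[-pop_size // 2:]]       # keep the fittest half
        children = parents + rng.normal(scale=0.2, size=parents.shape)  # mutation only
        pop = np.vstack([parents, children])
    return pop

neighbourhood = np.vstack([evolve(True), evolve(False)])          # balanced local dataset
print(neighbourhood.shape, "synthetic neighbours; a local interpretable predictor"
      " (e.g. a decision tree) would now be fit on bb.predict(neighbourhood)")
```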


2018 Other Open Access OPEN
Open the black box data-driven explanation of black box decision systems
Pedreschi D, Giannotti F, Guidotti R, Monreale A, Pappalardo L, Ruggieri S, Turini F
Black box systems for automated decision making, often based on machine learning over (big) data, map a user's features into a class or a score without exposing the reasons why. This is problematic not only for the lack of transparency, but also for possible biases hidden in the algorithms, due to human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. We introduce the local-to-global framework for black box explanation, a novel approach with promising early results, which paves the road for a wide spectrum of future developments along three dimensions: (i) the language for expressing explanations in terms of highly expressive logic-based rules, with a statistical and causal interpretation; (ii) the inference of local explanations aimed at revealing the logic of the decision adopted for a specific instance, by querying and auditing the black box in the vicinity of the target instance; (iii) the bottom-up generalization of the many local explanations into simple global ones, with algorithms that optimize the quality and comprehensibility of explanations.
Project(s): SoBigData via OpenAIRE

See at: arxiv.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2017 Conference article Open Access OPEN
Efficiently clustering very large attributed graphs
Baroni A, Conte A, Patrignani M, Ruggieri S
Attributed graphs model real networks by enriching their nodes with attributes accounting for properties. Several techniques have been proposed for partitioning these graphs into clusters that are homogeneous with respect to both semantic attributes and to the structure of the graph. However, the time and space complexities of state-of-the-art algorithms limit their scalability to medium-sized graphs. We propose SToC (for Semantic-Topological Clustering), a fast and scalable algorithm for partitioning large attributed graphs. The approach is robust, being compatible both with categorical and with quantitative attributes, and it is tailorable, allowing the user to weight the semantic and topological components. Further, the approach does not require the user to guess in advance the number of clusters. SToC relies on well-known approximation techniques such as bottom-k sketches, traditional graph-theoretic concepts, and a new perspective on the composition of heterogeneous distance measures. Experimental results demonstrate its ability to efficiently compute high-quality partitions of large scale attributed graphs.
DOI: 10.1145/3110025.3110030
DOI: 10.48550/arxiv.1703.08590

See at: arXiv.org e-Print Archive Open Access | arxiv.org Open Access | dl.acm.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | doi.org Restricted | CNR IRIS Restricted
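A toy illustration of the tailorable distance this abstract mentions: a user-chosen weight mixes an attribute (semantic) distance with a shortest-path (topological) distance, and the combined matrix feeds an off-the-shelf hierarchical clustering. The graph, the weight, and the clustering choice are illustrative; SToC itself relies on bottom-k sketches and does not require fixing the number of clusters.

```python
# Hypothetical sketch: mixing semantic and topological distances on an attributed graph.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist, squareform

# tiny attributed graph: adjacency matrix plus one numeric attribute per node
adj = np.array([[0, 1, 1, 0, 0, 0],
                [1, 0, 1, 0, 0, 0],
                [1, 1, 0, 1, 0, 0],
                [0, 0, 1, 0, 1, 1],
                [0, 0, 0, 1, 0, 1],
                [0, 0, 0, 1, 1, 0]], dtype=float)
attrs = np.array([[0.10], [0.20], [0.15], [0.90], [1.00], [0.95]])

d_topo = shortest_path(adj, unweighted=True)          # hop distance between nodes
d_sem = cdist(attrs, attrs)                           # attribute (semantic) distance
alpha = 0.5                                           # user-tunable semantic/topological mix
d = alpha * d_sem / d_sem.max() + (1 - alpha) * d_topo / d_topo.max()

# cluster on the combined, precomputed distance matrix
labels = fcluster(linkage(squareform(d), method="average"), t=2, criterion="maxclust")
print("cluster labels:", labels)
```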


2016 Journal article Open Access OPEN
Big data research in Italy: a perspective
Bergamaschi S, Carlini E, Ceci M, Furletti B, Giannotti F, Malerba D, Mezzanzanica M, Monreale A, Pasi G, Pedreschi D, Perego R, Ruggieri S
The aim of this article is to synthetically describe the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article has the objective of offering a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.
Source: ENGINEERING, vol. 2 (issue 2), pp. 163-170
DOI: 10.1016/j.eng.2016.02.011

See at: doi.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | Engineering Open Access | CNR IRIS Restricted


2014 Journal article Open Access OPEN
A multidisciplinary survey on discrimination analysis
Romei A, Ruggieri S
The collection and analysis of observational and experimental data represent the main tools for assessing the presence, the extent, the nature, and the trend of discrimination phenomena. Data analysis techniques have been proposed in the last 50 years in the economic, legal, statistical, and, recently, in the data mining literature. This is not surprising, since discrimination analysis is a multidisciplinary problem, involving sociological causes, legal argumentations, economic models, statistical techniques, and computational issues. The objective of this survey is to provide guidance and a common ground for researchers and anti-discrimination data analysts on concepts, problems, application areas, datasets, methods, and approaches from a multidisciplinary perspective. We organize the approaches according to their method of data collection as observational, quasi-experimental, and experimental studies. A fourth line of recently blooming research on knowledge discovery based methods is also covered. Observational methods are further categorized on the basis of their application context: labor economics, social profiling, consumer markets, and others.
Source: KNOWLEDGE ENGINEERING REVIEW, vol. 29 (issue 5), pp. 582-638
DOI: 10.1017/s0269888913000039

See at: The Knowledge Engineering Review Open Access | The Knowledge Engineering Review Restricted | CNR IRIS Restricted | journals.cambridge.org Restricted


2014 Journal article Open Access OPEN
Decision tree building on multi-core using FastFlow
Aldinucci M, Ruggieri S, Torquati M
The whole computer hardware industry has embraced multi-core. The extreme optimisation of sequential algorithms is then no longer sufficient to squeeze the real machine power, which can only be exploited via thread-level parallelism. Decision tree algorithms exhibit natural concurrency that makes them suitable to be parallelised. This paper presents an in-depth study of the parallelisation of an implementation of the C4.5 algorithm for multi-core architectures. We characterise elapsed-time lower bounds for the forms of parallelisation adopted and achieve close to optimal performance. Our implementation is based on the FastFlow parallel programming environment, and it requires minimal changes to the original sequential code.
Source: CONCURRENCY AND COMPUTATION, vol. 26 (issue 3), pp. 800-820
DOI: 10.1002/cpe.3063

See at: Concurrency and Computation Practice and Experience Open Access | Concurrency and Computation Practice and Experience Restricted | CNR IRIS Restricted | onlinelibrary.wiley.com Restricted
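To make the attribute-level parallelism concrete, here is a schematic, hypothetical sketch in which candidate splits for different attributes of one tree node are evaluated concurrently. Python threads are used only to show the structure of the decomposition; the paper's implementation and its speedups rely on the FastFlow C++ runtime, not on code like this.

```python
# Hypothetical sketch: evaluating candidate split attributes of a node in parallel.
from concurrent.futures import ThreadPoolExecutor
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=8, random_state=6)

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_gain_for_attribute(j):
    """Best information gain over all threshold splits on attribute j (exhaustive scan)."""
    base = entropy(y)
    best = 0.0
    for t in np.unique(X[:, j]):
        left, right = y[X[:, j] <= t], y[X[:, j] > t]
        if len(left) == 0 or len(right) == 0:
            continue
        gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
        best = max(best, gain)
    return j, best

with ThreadPoolExecutor(max_workers=4) as pool:       # attributes evaluated concurrently
    gains = dict(pool.map(best_gain_for_attribute, range(X.shape[1])))

best_attr = max(gains, key=gains.get)
print("best split attribute:", best_attr, "gain:", round(gains[best_attr], 4))
```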


2013 Contribution to book Restricted
Discrimination data analysis: a multi-disciplinary bibliography
Romei A, Ruggieri S
Discrimination data analysis has been investigated for the last fifty years in a large body of social, legal, and economic studies. Recently, discrimination discovery and prevention has become a blooming research topic in the knowledge discovery community. This chapter provides a multi-disciplinary annotated bibliography of the literature on discrimination data analysis, with the intended objective of providing a common basis to researchers from a multi-disciplinary perspective. We cover legal, sociological, economic, and computer science references.
DOI: 10.1007/978-3-642-30487-3_6

See at: doi.org Restricted | CNR IRIS Restricted


2013 Journal article Open Access OPEN
Discrimination discovery in scientific project evaluation: a case study
Romei A, Ruggieri S, Turini F
Discovering contexts of unfair decisions in a dataset of historical decision records is a non-trivial problem. It requires the design of ad hoc methods and techniques of analysis, which have to comply with existing laws and with legal argumentations. While some data mining techniques have been adapted to the purpose, the state of the art of research still needs methodological refinements, the consolidation of a Knowledge Discovery in Databases (KDD) process, and, most of all, experimentation with real data. This paper contributes by presenting a case study on gender discrimination in a dataset of scientific research proposals, and by distilling from the case study a general discrimination discovery process. Gender bias in scientific research is a challenging problem that has been tackled in the social sciences literature by means of statistical regression. However, this approach is limited to testing a hypothesis of discrimination over the whole dataset under analysis. Our methodology couples data mining, for unveiling previously unknown contexts of possible discrimination, with statistical regression, for testing the significance of such contexts, thus obtaining the best of the two worlds.
Source: EXPERT SYSTEMS WITH APPLICATIONS, vol. 40 (issue 15), pp. 6064-6079
DOI: 10.1016/j.eswa.2013.05.016

See at: Expert Systems with Applications Open Access | Expert Systems with Applications Restricted | CNR IRIS Restricted | www.sciencedirect.com Restricted
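A hypothetical, toy rendering of the coupling described above: a candidate context (subgroup) is first singled out, then a logistic regression is fit within it to test whether the gender coefficient is statistically significant. The synthetic data, the column names, and the context are invented; they are not the paper's dataset or discovery procedure.

```python
# Hypothetical sketch: testing a gender effect inside a discovered context.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 3000
df = pd.DataFrame({
    "female": rng.integers(0, 2, n),
    "area":   rng.choice(["bio", "cs", "math"], n),
    "score":  rng.normal(60, 10, n),
})
# inject a gender gap only in one context ("cs"), as a toy discovery target
logits = 0.08 * (df["score"] - 60) - 0.8 * df["female"] * (df["area"] == "cs")
df["funded"] = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

# context hypothetically unveiled by rule mining: proposals with area == "cs"
ctx = df[df["area"] == "cs"]

# statistical-regression step: test the significance of the gender effect in the context
exog = sm.add_constant(ctx[["female", "score"]].astype(float))
fit = sm.Logit(ctx["funded"], exog).fit(disp=0)
print("female coefficient:", round(fit.params["female"], 3),
      "p-value:", round(fit.pvalues["female"], 4))
```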


2013 Conference article Open Access OPEN
Data anonymity meets non-discrimination
Ruggieri S
We investigate the relation between t-closeness, a well-known model of data anonymization, and alpha-protection, a model of data discrimination. We show that t-closeness implies bd(t)-protection, for a bound function bd() depending on the discrimination measure at hand. This allows us to adapt an inference control method, the Mondrian multidimensional generalization technique, to the purpose of non-discrimination data protection. The parallel between the two analytical models raises intriguing issues on the interplay between data anonymization and non-discrimination research in data mining.
DOI: 10.1109/icdmw.2013.56

See at: www.di.unipi.it Open Access | doi.org Restricted | CNR IRIS Restricted | ieeexplore.ieee.org Restricted


2013 Conference article Open Access OPEN
Learning from polyhedral sets
Ruggieri S
Parameterized linear systems allow for modelling and reasoning over classes of polyhedra. Collections of squares, rectangles, polytopes, and so on, can readily be defined by means of linear systems with parameters. In this paper, we investigate the problem of learning a parameterized linear system whose class of polyhedra includes a given set of example polyhedral sets and is minimal.

See at: CNR IRIS Open Access | ijcai.org Open Access | CNR IRIS Restricted


2013 Contribution to book Restricted
The discovery of discrimination
Pedreschi D, Ruggieri S, Turini F
Discrimination discovery from data consists in the extraction of discriminatory situations and practices hidden in a large amount of historical decision records. We discuss the challenging problems in discrimination discovery, and present, in a unified form, a framework based on classification rule extraction and filtering on the basis of legally-grounded interestingness measures. The framework is implemented in the publicly available DCUBE tool. As a running example, we use a public dataset on credit scoring.
Source: STUDIES IN APPLIED PHILOSOPHY, EPISTEMOLOGY AND RATIONAL ETHICS, pp. 91-108
DOI: 10.1007/978-3-642-30487-3_5

See at: doi.org Restricted | CNR IRIS Restricted | link.springer.com Restricted
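To illustrate the kind of legally-grounded interestingness measure used to filter classification rules in this framework, here is a toy computation in the style of extended lift for a rule "protected group AND context -> negative decision". The table, the threshold, and the exact form of the measure are stated as assumptions for illustration, not as the chapter's definitions.

```python
# Hypothetical sketch: an extended-lift-style score for a candidate discriminatory rule.
import pandas as pd

records = pd.DataFrame({
    "protected": [1, 1, 1, 0, 0, 0, 1, 0, 0, 1],   # e.g. membership in a protected group
    "context":   [1, 1, 1, 1, 1, 1, 0, 0, 1, 1],   # e.g. "applied for a small loan"
    "denied":    [1, 1, 0, 0, 1, 0, 0, 0, 0, 1],   # negative decision
})

in_context = records[records["context"] == 1]
conf_context = in_context["denied"].mean()                              # conf(context -> denied)
conf_rule = in_context[in_context["protected"] == 1]["denied"].mean()   # conf(protected & context -> denied)

elift = conf_rule / conf_context
print("extended-lift-style score of the rule:", round(elift, 2))
print("flag as potentially discriminatory (toy threshold 1.25):", elift > 1.25)
```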