2015
Conference article
Restricted
Social or green? A data-driven approach for more enjoyable carpooling
Guidotti R, Sassi A, Berlingerio M, Pascale A
Carpooling, i.e. the sharing of vehicles to reach common destinations, is often performed to reduce costs and pollution. Recent works on carpooling and journey planning take into account, besides mobility match, also social aspects and, more generally, non-monetary rewards. In line with this, we present a data-driven methodology for a more enjoyable carpooling. We introduce a measure of enjoyability based on people's interests, social links, and tendency to connect to people with similar or dissimilar interests. We devise a methodology to compute enjoyability from crowd-sourced data, and we show how this can be used on real-world datasets to optimize for both mobility and enjoyability. Our methodology was tested on real data from Rome and San Francisco. We compare the results of an optimization model minimizing the number of cars and a greedy approach maximizing the enjoyability. We evaluate them in terms of cars saved and average enjoyability of the system. We also present the results of a user study, with more than 200 users reporting an interest of 39% in the enjoyable solution. Moreover, 24% of people declared that sharing the car with interesting people would be the primary motivation for carpooling.
DOI: 10.1109/itsc.2015.142
Project(s): PETRA
See at:
doi.org
| CNR IRIS
| ieeexplore.ieee.org
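The enjoyability measure in this work combines topic similarity between two users with each user's tendency to connect to people with similar or dissimilar interests. A minimal sketch of one way such a combination could look (the formula, weights, and function names below are illustrative assumptions, not the paper's actual definition):

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two interest vectors (dicts topic -> weight)."""
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in set(a) | set(b))
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def enjoyability(u_interests, v_interests, u_homophily, v_homophily):
    """Toy enjoyability: a homophilous user (tendency near 1) enjoys
    similar people; a heterophilous one (near 0) enjoys dissimilar people."""
    sim = cosine_similarity(u_interests, v_interests)
    score_u = u_homophily * sim + (1 - u_homophily) * (1 - sim)
    score_v = v_homophily * sim + (1 - v_homophily) * (1 - sim)
    return (score_u + score_v) / 2

alice = {"music": 1.0, "cycling": 0.5}
bob = {"music": 0.8, "movies": 0.7}
print(round(enjoyability(alice, bob, 0.9, 0.9), 3))
```

Note that under this sketch two identical homophilous users score 1.0, but so do two heterophilous users with disjoint interests: enjoyability is not plain similarity.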
2016
Conference article
Open Access
Where is my next friend? Recommending enjoyable profiles in location based services
Guidotti R, Berlingerio M
How many of your friends, with whom you enjoy spending some time, live close by? How many people are within your reach, with whom you could have a nice conversation? We introduce a measure of enjoyability that may be the basis for a new class of location-based services aimed at maximizing the likelihood that two persons, or a group of people, would enjoy spending time together. Our enjoyability takes into account both the topic similarity between two users and the users' tendency to connect to people with similar or dissimilar interests. We computed the enjoyability on two datasets of geo-located tweets, and we reasoned on the applicability of the obtained results for producing friend recommendations. We aim at suggesting pairs of users who are not friends yet, but who are frequently co-located and maximize our enjoyability measure. By taking into account the spatial dimension, we show how 50% of users may find at least one enjoyable person within 10 km of their two most visited locations. Our results are encouraging, and open the way for a new class of recommender systems based on enjoyability.
Source: STUDIES IN COMPUTATIONAL INTELLIGENCE (PRINT), vol. 644, pp. 65-78. Dijon, France, 23-25 March, 2016
DOI: 10.1007/978-3-319-30569-1_5
Project(s): PETRA
See at:
www.springer.com
| doi.org
| CNR IRIS
| link.springer.com
2017
Journal article
Open Access
The GRAAL of carpooling: GReen And sociAL optimization from crowd-sourced data
Berlingerio M, Ghaddar B, Guidotti R, Pascale A, Sassi A
Carpooling, i.e. the sharing of vehicles to reach common destinations, is often performed to reduce costs and pollution. Recent work on carpooling takes into account, besides mobility matches, also social aspects and, more generally, non-monetary incentives. In line with this, we present GRAAL, a data-driven methodology for GReen And sociAL carpooling. GRAAL optimizes a carpooling system not only by minimizing the number of cars needed at the city level, but also by maximizing the enjoyability of people sharing a trip. We introduce a measure of enjoyability based on people's interests, social links, and tendency to connect to people with similar or dissimilar interests. GRAAL computes the enjoyability within a set of users from crowd-sourced data, and then uses it on real-world datasets to optimize a weighted linear combination of the number of cars and enjoyability. To tune this weight, and to investigate the users' interest in the social aspects of carpooling, we conducted an online survey on potential carpooling users. We present the results of applying GRAAL on real-world crowd-sourced data from the cities of Rome and San Francisco. Computational results are presented from both the city and the user perspective. Using the crowd-sourced weight, GRAAL is able to significantly reduce the number of cars needed, while keeping a high level of enjoyability on the tested dataset. From the user perspective, we show how the entire per-car distribution of enjoyability is increased with respect to the baselines.
Source: TRANSPORTATION RESEARCH. PART C, EMERGING TECHNOLOGIES, vol. 80, pp. 20-36
DOI: 10.1016/j.trc.2017.02.025
Project(s): PETRA
See at:
CNR IRIS
| ISTI Repository
| www.sciencedirect.com
| Transportation Research Part C Emerging Technologies
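GRAAL's optimization target is a weighted linear combination of the number of cars and the enjoyability of shared trips. A hedged sketch of scoring a candidate assignment under that idea (the ILP and greedy machinery are omitted; the variable names and the exact form of the combination are assumptions):

```python
from itertools import combinations

def graal_objective(cars, pair_enjoyability, weight):
    """Score a carpooling assignment to be maximized: `cars` is a list of
    passenger groups, `pair_enjoyability` maps frozenset pairs to scores,
    and `weight` in [0, 1] trades cars saved against enjoyability."""
    n_cars = len(cars)
    total_enjoy = sum(
        pair_enjoyability.get(frozenset(pair), 0.0)
        for car in cars
        for pair in combinations(car, 2)
    )
    # fewer cars and higher enjoyability both increase the score
    return -weight * n_cars + (1 - weight) * total_enjoy

enjoy = {frozenset({"a", "b"}): 0.9, frozenset({"a", "c"}): 0.1}
shared = [["a", "b", "c"]]      # one car for all three passengers
split = [["a", "b"], ["c"]]     # two cars, "c" rides alone
print(graal_objective(shared, enjoy, 0.5), graal_objective(split, enjoy, 0.5))
```

With a balanced weight, the single shared car scores higher than the split: it both saves a car and accumulates the pairwise enjoyability of everyone riding together.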
2017
Other
Restricted
Personal Data Analytics: Capturing Human Behavior to Improve Self-Awareness and Personal Services through Individual and Collective Knowledge
Guidotti R
In the era of Big Data, every single user of our hyper-connected world leaves behind a myriad of digital breadcrumbs while performing her daily activities. It is sufficient to think of a simple smartphone that enables each one of us to browse the Web, listen to music on online music services, post messages on social networks, perform online shopping sessions, acquire images and videos, and record our geographical locations. This enormous amount of personal data could be exploited to improve the lifestyle of each individual by extracting, analyzing and exploiting the user's behavioral patterns, like the items frequently purchased, the routine movements, the favorite sequences of songs listened to, etc. However, even though some user-centric models for data management named Personal Data Stores are emerging, there is currently still a significant lack of algorithms and models specifically designed to extract and capture knowledge from personal data. This thesis proposes an extension to the idea of the Personal Data Store through Personal Data Analytics. In practice, we describe parameter-free algorithms that do not need to be tuned by experts and are able to automatically extract patterns from the user's data. We define personal data models to characterize the user profile, which are able to capture and collect the user's behavioral patterns. In addition, we propose individual and collective services exploiting the knowledge extracted with Personal Data Analytics algorithms and models. The services are provided to users organized in a Personal Data Ecosystem, a peer-to-peer distributed network in which users share part of their own patterns in return for the services provided. We show how sharing with the collectivity enables or improves the services analyzed. Sharing enhances the level of service for individuals, for example by providing the user an invaluable opportunity to gain a better perception of her self-awareness. Moreover, at the same time, knowledge sharing can lead to forms of collective gain, like the reduction of the number of circulating cars. To prove the feasibility of Personal Data Analytics in terms of the algorithms, models and services proposed, we report an extensive experimentation on real-world data.
Project(s): CIMPLEX, PETRA, SoBigData
See at:
CNR IRIS
2018
Conference article
Open Access
On the Equivalence Between Community Discovery and Clustering
Guidotti R, Coscia M
Clustering is the subset of data mining techniques used to agnostically classify entities by looking at their attributes. Clustering algorithms specialized to deal with complex networks are called community discovery. Notwithstanding their common objectives, there are crucial assumptions in community discovery (edge sparsity and a single node type, among others) which make its mapping to clustering non-trivial. In this paper, we propose a community discovery to clustering mapping, by focusing on transactional data clustering. We represent a network as a transactional dataset, and we find communities by grouping nodes with common items (neighbors) in their baskets (neighbor lists). By comparing our results with ground truth communities and state-of-the-art community discovery methods, we show that transactional clustering algorithms are a feasible alternative to community discovery, and that a complete mapping of the two problems is possible.
Source: LECTURE NOTES OF THE INSTITUTE FOR COMPUTER SCIENCES, SOCIAL INFORMATICS AND TELECOMMUNICATIONS ENGINEERING, pp. 342-352. Pisa, Italy, 29-30/11/2017
DOI: 10.1007/978-3-319-76111-4_34
Project(s): SoBigData
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
| Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
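The mapping described above turns each node into a transaction whose items are its neighbors, so that a transactional clustering algorithm can group nodes by shared neighbors. A toy sketch of the representation step plus a naive overlap-based grouping (not the clustering algorithm used in the paper):

```python
def network_to_transactions(edges):
    """Each node becomes a basket containing its neighbor list."""
    baskets = {}
    for u, v in edges:
        baskets.setdefault(u, set()).add(v)
        baskets.setdefault(v, set()).add(u)
    return baskets

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def naive_communities(baskets, threshold=0.3):
    """Greedy single-pass grouping of nodes with overlapping baskets."""
    communities = []
    for node, basket in baskets.items():
        for com in communities:
            if any(jaccard(basket, baskets[m]) >= threshold for m in com):
                com.append(node)
                break
        else:
            communities.append([node])
    return communities

edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("d", "f"), ("e", "f")]
print(naive_communities(network_to_transactions(edges)))
```

On these two triangles the overlap-based grouping recovers the two communities, illustrating why baskets of neighbors are a workable stand-in for network structure.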
2019
Conference article
Restricted
Helping your docker images to spread based on explainable models
Guidotti R, Soldani J, Neri D, Brogi A, Pedreschi D
Docker is on the rise in today's enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image impacts its actual usage, and hence the potential revenues for its developers. In this paper, we present a solution based on interpretable decision trees and regression trees for estimating the popularity of a given Docker image, and for understanding how to improve an image to increase its popularity. The results presented in this work can provide valuable insights to Docker developers, helping them in spreading their images. Code related to this paper is available at: https://github.com/di-unipi-socc/DockerImageMiner.
DOI: 10.1007/978-3-030-10997-4_13
Project(s): SoBigData
See at:
doi.org
| CNR IRIS
| link.springer.com
2020
Conference article
Open Access
Black box explanation by learning image exemplars in the latent feature space
Guidotti R, Monreale A, Matwin S, Pedreschi D
We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and the areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
DOI: 10.1007/978-3-030-46150-8_12
DOI: 10.48550/arxiv.2002.03746
Project(s): AI4EU, Track and Know, PRO-RES, SoBigData
See at:
arXiv.org e-Print Archive
| arxiv.org
| CNR IRIS
| ISTI Repository
| www.springerprofessional.de
| doi.org
2020
Conference article
Restricted
Global explanations with local scoring
Setzu M, Guidotti R, Monreale A, Turini F
Artificial Intelligence systems often adopt machine learning models encoding complex algorithms with potentially unknown behavior. As the application of these "black box" models grows, it is our responsibility to understand their inner workings and formulate them in human-understandable explanations. To this end, we propose a rule-based model-agnostic explanation method that follows a local-to-global schema: it generalizes a global explanation summarizing the decision logic of a black box starting from the local explanations of single predicted instances. We define a scoring system based on a rule relevance score to extract global explanations from a set of local explanations in the form of decision rules. Experiments on several datasets and black boxes show the stability and low complexity of the global explanations provided by the proposed solution in comparison with baselines and state-of-the-art global explainers.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT), pp. 159-171. Würzburg, Germany, 16-20 September, 2019
DOI: 10.1007/978-3-030-43823-4_14
Project(s): AI4EU, Track and Know, PRO-RES, XAI, SoBigData
See at:
Communications in Computer and Information Science
| CNR IRIS
| link.springer.com
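The local-to-global schema relies on a rule relevance score to decide which local decision rules generalize into a global explanation. A minimal sketch of one plausible such score, coverage times accuracy on a validation set (the paper's actual scoring system is richer; the names and rule encoding here are assumptions):

```python
def rule_matches(rule, record):
    """A rule is a dict feature -> (op, value) with op in {'<=', '>'}."""
    for feat, (op, val) in rule.items():
        x = record[feat]
        if (op == "<=" and not x <= val) or (op == ">" and not x > val):
            return False
    return True

def relevance(rule, outcome, dataset, labels):
    """Relevance = coverage * accuracy of the rule on a validation set."""
    covered = [i for i, r in enumerate(dataset) if rule_matches(rule, r)]
    if not covered:
        return 0.0
    coverage = len(covered) / len(dataset)
    accuracy = sum(labels[i] == outcome for i in covered) / len(covered)
    return coverage * accuracy

data = [{"age": 25}, {"age": 40}, {"age": 60}, {"age": 70}]
labels = ["deny", "deny", "grant", "grant"]
print(relevance({"age": (">", 50)}, "grant", data, labels))
```

Ranking local rules by a score of this kind lets a local-to-global method keep only the rules that both cover many instances and predict them correctly.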
2023
Journal article
Open Access
Solving imbalanced learning with outlier detection and features reduction
Lusito S, Pugnana A, Guidotti R
A critical problem for several real-world applications is class imbalance. Indeed, in contexts like fraud detection or medical diagnostics, standard machine learning models fail because they are designed to handle balanced class distributions. Existing solutions typically increase the rare class instances by generating synthetic records to achieve a balanced class distribution. However, these procedures generate implausible data and tend to create unnecessary noise. We propose a change of perspective where, instead of relying on resampling techniques, we depend on unsupervised feature engineering approaches to represent records with a combination of features that will help the classifier capture the differences among classes, even in the presence of imbalanced data. Thus, we combine a large array of outlier detection, feature projection, and feature selection approaches to augment the expressiveness of the dataset population. We show the effectiveness of our proposal in a deep and wide set of benchmarking experiments as well as in real case studies.
Source: MACHINE LEARNING
DOI: 10.1007/s10994-023-06448-0
Project(s): SoBigData-PlusPlus
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
2023
Conference article
Restricted
Explaining black-boxes in federated learning
Corbucci L, Guidotti R, Monreale A
Federated Learning has witnessed increasing popularity in the past few years for its ability to train Machine Learning models in critical contexts, using private data without moving them. Most of the work in the literature proposes algorithms and architectures for training neural networks which, although they achieve high performance on different prediction tasks and are easy to learn with a cooperative mechanism, have obscure predictive reasoning. Therefore, in this paper, we propose a variant of SHAP, one of the most widely used explanation methods, tailored to Horizontal server-based Federated Learning. The basic idea is having the possibility to explain an instance's prediction, performed by the trained Machine Learning model, as an aggregation of the explanations provided by the clients participating in the cooperation. We empirically test our proposal on two different tabular datasets, and we observe interesting and encouraging preliminary results.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT), pp. 151-163. Lisbon, Portugal, 26-28/07/2023
DOI: 10.1007/978-3-031-44067-0_8
Project(s): TAILOR, XAI, SoBigData-PlusPlus, Humane AI
See at:
doi.org
| CNR IRIS
| link.springer.com
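The proposed SHAP variant explains a prediction as an aggregation of the explanations computed locally by the participating clients. A minimal sketch of the aggregation step, assuming each client computes its own SHAP-style attribution vector locally (the weighting scheme and data structures are assumptions, not the paper's exact method):

```python
def aggregate_explanations(client_attributions, client_weights=None):
    """Combine per-client feature attributions into a single explanation.
    `client_attributions`: list of dicts feature -> attribution value.
    Optional weights (e.g. client dataset sizes) default to uniform."""
    if client_weights is None:
        client_weights = [1.0] * len(client_attributions)
    total = sum(client_weights)
    merged = {}
    for attr, w in zip(client_attributions, client_weights):
        for feat, val in attr.items():
            merged[feat] = merged.get(feat, 0.0) + w * val / total
    return merged

clients = [
    {"income": 0.4, "age": -0.1},   # attributions computed on client 1's data
    {"income": 0.2, "age": -0.3},   # attributions computed on client 2's data
]
print(aggregate_explanations(clients))
```

Weighting by client dataset size, rather than uniformly, is one natural design choice when clients hold very different amounts of data.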
2022
Conference article
Restricted
Effect of different encodings and distance functions on quantum instance-based classifiers
Berti A., Bernasconi A., Del Corso G. M., Guidotti R.
In recent years, we have witnessed the increasing usage of machine learning technologies. In parallel, we have observed the rise of quantum computing, a paradigm for computing making use of quantum theory. Quantum computing can empower machine learning with theoretical properties allowing it to overcome the limitations of classical computing. The translation of classical algorithms into their quantum counterparts is not trivial and hides many difficulties. We illustrate and implement alternatives for the quantum nearest neighbor classifier, focusing on the challenges related to data preparation and their effect on performance. We show that, with certain data preparation strategies, quantum algorithms are comparable with the classic version, yet allow for a theoretical reduction of the complexity of distance calculation.
Source: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 13281, pp. 96-108. Chengdu, China, 16-19/05/2022
DOI: 10.1007/978-3-031-05936-0_8
See at:
doi.org
| Archivio della Ricerca - Università di Pisa
| IRIS Cnr
| CNR IRIS
| link.springer.com
2021
Conference article
Restricted
Explaining any time series classifier
Guidotti R., Monreale A., Spinnato F., Pedreschi D., Giannotti F.
We present a method to explain the decisions of black box models for time series classification. The explanation consists of factual and counterfactual shapelet-based rules revealing the reasons for the classification, and of a set of exemplars and counter-exemplars highlighting similarities and differences with the time series under analysis. The proposed method first generates exemplar and counter-exemplar time series in the latent feature space and learns a local latent decision tree classifier. Then, it selects and decodes those respecting the decision rules explaining the decision. Finally, it learns on them a shapelet-tree that reveals the parts of the time series that must, and must not, be contained to get the returned outcome from the black box. A wide experimentation shows that the proposed method provides faithful, meaningful and interpretable explanations.
DOI: 10.1109/cogmi50398.2020.00029
Project(s): AI4EU, TAILOR, XAI, SoBigData-PlusPlus
See at:
dblp.uni-trier.de
| doi.org
| Archivio istituzionale della Ricerca - Scuola Normale Superiore
| Archivio della Ricerca - Università di Pisa
| IRIS Cnr
| CNR IRIS
2020
Journal article
Open Access
Evaluating local explanation methods on ground truth
Guidotti R.
Evaluating local explanation methods is a difficult task due to the lack of a shared and universally accepted definition of explanation. In the literature, one of the most common ways to assess the performance of an explanation method is to measure the fidelity of the explanation with respect to the classification of a black box model adopted by an Artificial Intelligence system for making a decision. However, this kind of evaluation only measures the degree of adherence of the local explainer in reproducing the behavior of the black box classifier with respect to the final decision. Therefore, the explanation provided by the local explainer could be different in content even though it leads to the same decision of the AI system. In this paper, we propose an approach that allows measuring the extent to which the explanations returned by local explanation methods are correct with respect to a synthetic ground truth explanation. Indeed, the proposed methodology enables the generation of synthetic transparent classifiers for which the reason for the decision taken, i.e., a synthetic ground truth explanation, is available by design. Experimental results show how the proposed approach allows one to easily evaluate local explanations on the ground truth and to characterize the quality of local explanation methods.
Source: ARTIFICIAL INTELLIGENCE, vol. 291
DOI: 10.1016/j.artint.2020.103428
Project(s): AI4EU, TAILOR, HumanE-AI-Net, XAI, SoBigData-PlusPlus
See at:
Artificial Intelligence
| IRIS Cnr
| Archivio della Ricerca - Università di Pisa
| CNR IRIS
2020
Conference article
Open Access
Interpretable next basket prediction boosted with representative recipes
Guidotti R., Viotto S.
Food is an essential element of our lives and cultures, and a crucial part of the human experience. The study of food purchases can drive the design of practical services such as next basket predictors and shopping list reminders. Current approaches aimed at realizing these services do not exploit a contextual dimension involving food, i.e., recipes. To this aim, we design a next basket predictor based on representative recipes, able to exploit the interest of customers towards certain ingredients when making the recommendation. The proposed method first identifies the representative recipes of a customer by analyzing her purchases and then estimates the rating of the items for the prediction. The ratings are based on both the purchases and the ingredients of the representative recipes. In addition, through our method, it is easy to justify why a specific set of items is predicted, while such explanations are often not easily available in many other effective but opaque recommenders. Experimentation on a real-world dataset shows that the usage of recipes boosts the performance of existing next basket predictors.
DOI: 10.1109/cogmi50398.2020.00018
Project(s): AI4EU, TAILOR, SoBigData-PlusPlus
See at:
IRIS Cnr
| arpi.unipi.it
| doi.org
| Archivio della Ricerca - Università di Pisa
| CNR IRIS
2020
Conference article
Open Access
Explaining image classifiers generating exemplars and counter-exemplars from latent representations
Guidotti R., Monreale A., Matwin S., Pedreschi D.
We present an approach to explain the decisions of black box image classifiers through synthetic exemplars and counter-exemplars learnt in the latent feature space. Our explanation method exploits the latent representations learned through an adversarial autoencoder for generating a synthetic neighborhood of the image for which an explanation is required. A decision tree is trained on a set of images represented in the latent space, and its decision rules are used to generate exemplar images showing how the original image can be modified to stay within its class. Counterfactual rules are used to generate counter-exemplars showing how the original image can "morph" into another class. The explanation also comprehends a saliency map highlighting the areas that contribute to its classification, and the areas that push it into another class. A wide and deep experimental evaluation proves that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability, besides providing the most useful and interpretable explanations.
DOI: 10.1609/aaai.v34i09.7116
Project(s): AI4EU, PRO-RES, SoBigData, Humane AI
See at:
CNR IRIS
| ojs.aaai.org
2024
Conference article
Open Access
Generative model for decision trees
Guidotti R., Monreale A., Setzu M., Volpi G.
Decision trees are among the most popular supervised models due to their interpretability and knowledge representation resembling human reasoning. Commonly used decision tree induction algorithms are based on greedy top-down strategies. Although these approaches are known to be an efficient heuristic, the resulting trees are only locally optimal and tend to have overly complex structures. On the other hand, optimal decision tree algorithms attempt to create an entire decision tree at once to achieve global optimality. We place our proposal between these approaches by designing a generative model for decision trees. Our method first learns a latent decision tree space through a variational architecture using pre-trained decision tree models. Then, it adopts a genetic procedure to explore such latent space to find a compact decision tree with good predictive performance. We compare our proposal against classical tree induction methods, optimal approaches, and ensemble models. The results show that our proposal can generate accurate and shallow, i.e., interpretable, decision trees.
Source: PROCEEDINGS OF THE ... AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, vol. 38 (issue 19), pp. 21116-21124. Vancouver, Canada, 20-27/02/2024
DOI: 10.1609/aaai.v38i19.30104
Project(s): TAILOR, Future Artificial Intelligence Research, HumanE-AI-Net, TANGO, MIMOSA, XAI, SoBigData-PlusPlus, Strengthening the Italian RI for Social Mining and Big Data Analytics
See at:
Proceedings of the AAAI Conference on Artificial Intelligence
| IRIS Cnr
| Archivio della Ricerca - Università di Pisa
| CNR IRIS