2015 Conference article Restricted
Social or green? A data-driven approach for more enjoyable carpooling
Guidotti R., Sassi A., Berlingerio M., Pascale A.
Carpooling, i.e. the sharing of vehicles to reach common destinations, is often performed to reduce costs and pollution. Recent works on carpooling and journey planning take into account, besides mobility match, also social aspects and, more generally, non-monetary rewards. In line with this, we present a data-driven methodology for a more enjoyable carpooling. We introduce a measure of enjoyability based on people's interests, social links, and tendency to connect to people with similar or dissimilar interests. We devise a methodology to compute enjoyability from crowd-sourced data, and we show how this can be used on real world datasets to optimize for both mobility and enjoyability. Our methodology was tested on real data from Rome and San Francisco. We compare the results of an optimization model minimizing the number of cars, and a greedy approach maximizing the enjoyability. We evaluate them in terms of cars saved, and average enjoyability of the system. We also present the results of a user study, with more than 200 users reporting an interest of 39% in the enjoyable solution. Moreover, 24% of people declared that sharing the car with interesting people would be the primary motivation for carpooling.
Source: 18th IEEE Intelligent Transportation Systems Conference, pp. 842–847, Las Palmas de Gran Canaria, Spain, 15-18/09/2015
DOI: 10.1109/itsc.2015.142
Project(s): PETRA via OpenAIRE
See at: doi.org Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA


2016 Conference article Open Access
Where is my next friend? Recommending enjoyable profiles in location based services
Guidotti R., Berlingerio M.
How many of your friends, with whom you enjoy spending some time, live close by? How many people are at your reach, with whom you could have a nice conversation? We introduce a measure of enjoyability that may be the basis for a new class of location-based services aimed at maximizing the likelihood that two persons, or a group of people, would enjoy spending time together. Our enjoyability takes into account both topic similarity between two users and the users' tendency to connect to people with similar or dissimilar interests. We computed the enjoyability on two datasets of geo-located tweets, and we reasoned on the applicability of the obtained results for producing friend recommendations. We aim at suggesting pairs of users who are not friends yet, but who are frequently co-located and maximize our enjoyability measure. By taking into account the spatial dimension, we show how 50% of users may find at least one enjoyable person within 10km of their two most visited locations. Our results are encouraging, and open the way for a new class of recommender systems based on enjoyability.
Source: CompleNet 2016 - Complex Networks VII. 7th Workshop on Complex Networks, pp. 65–78, Dijon, France, 23-25 March, 2016
DOI: 10.1007/978-3-319-30569-1_5
Project(s): PETRA via OpenAIRE
See at: www.springer.com Open Access | doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2017 Journal article Open Access
The GRAAL of carpooling: GReen And sociAL optimization from crowd-sourced data
Berlingerio M., Ghaddar B., Guidotti R., Pascale A., Sassi A.
Carpooling, i.e. the sharing of vehicles to reach common destinations, is often performed to reduce costs and pollution. Recent work on carpooling takes into account, besides mobility matches, also social aspects and, more generally, non-monetary incentives. In line with this, we present GRAAL, a data-driven methodology for GReen And sociAL carpooling. GRAAL optimizes a carpooling system not only by minimizing the number of cars needed at the city level, but also by maximizing the enjoyability of people sharing a trip. We introduce a measure of enjoyability based on people's interests, social links, and tendency to connect to people with similar or dissimilar interests. GRAAL computes the enjoyability within a set of users from crowd-sourced data, and then uses it on real world datasets to optimize a weighted linear combination of number of cars and enjoyability. To tune this weight, and to investigate the users' interest on the social aspects of carpooling, we conducted an online survey on potential carpooling users. We present the results of applying GRAAL on real world crowd-sourced data from the cities of Rome and San Francisco. Computational results are presented from both the city and the user perspective. Using the crowd-sourced weight, GRAAL is able to significantly reduce the number of cars needed, while keeping a high level of enjoyability on the tested data-set. From the user perspective, we show how the entire per-car distribution of enjoyability is increased with respect to the baselines.
Source: Transportation research. Part C, Emerging technologies 80 (2017): 20–36. doi:10.1016/j.trc.2017.02.025
DOI: 10.1016/j.trc.2017.02.025
Project(s): PETRA via OpenAIRE
See at: ISTI Repository Open Access | Transportation Research Part C Emerging Technologies Restricted | www.sciencedirect.com Restricted | CNR ExploRA


2017 Doctoral thesis Unknown
Personal Data Analytics: Capturing Human Behavior to Improve Self-Awareness and Personal Services through Individual and Collective Knowledge
Guidotti R.
In the era of Big Data, every single user of our hyper-connected world leaves behind a myriad of digital breadcrumbs while performing her daily activities. It is sufficient to think of a simple smartphone that enables each one of us to browse the Web, listen to music on online musical services, post messages on social networks, perform online shopping sessions, acquire images and videos and record our geographical locations. This enormous amount of personal data could be exploited to improve the lifestyle of each individual by extracting, analyzing and exploiting the user's behavioral patterns like the items frequently purchased, the routinary movements, the favorite sequence of songs listened to, etc. However, even though some user-centric models for data management named Personal Data Store are emerging, currently there is still a significant lack in terms of algorithms and models specifically designed to extract and capture knowledge from personal data. This thesis proposes an extension to the idea of Personal Data Store through Personal Data Analytics. In practice, we describe parameter-free algorithms that do not need to be tuned by experts and are able to automatically extract the patterns from the user's data. We define personal data models to characterize the user profile which are able to capture and collect the users' behavioral patterns. In addition, we propose individual and collective services exploiting the knowledge extracted with Personal Data Analytics algorithms and models. The services are provided for the users, who are organized in a Personal Data Ecosystem in the form of a peer distributed network, and are available to share part of their own patterns in return for the service provided. We show how sharing with the collectivity enables, or improves, the services analyzed. The sharing enhances the level of the service for individuals, for example by providing to the user an invaluable opportunity for having a better perception of her self-awareness. Moreover, at the same time, knowledge sharing can lead to forms of collective gain, like the reduction of the number of circulating cars. To prove the feasibility of Personal Data Analytics in terms of the algorithms, models and services proposed, we report an extensive experimentation on real world data.
Project(s): CIMPLEX via OpenAIRE, PETRA via OpenAIRE, SoBigData via OpenAIRE

See at: CNR ExploRA


2018 Conference article Open Access
On the Equivalence Between Community Discovery and Clustering
Guidotti R., Coscia M.
Clustering is the subset of data mining techniques used to agnostically classify entities by looking at their attributes. Clustering algorithms specialized to deal with complex networks are called community discovery. Notwithstanding their common objectives, there are crucial assumptions in community discovery (edge sparsity and only one node type, among others) which make its mapping to clustering non-trivial. In this paper, we propose a community discovery to clustering mapping, by focusing on transactional data clustering. We represent a network as a transactional dataset, and we find communities by grouping nodes with common items (neighbors) in their baskets (neighbor lists). By comparing our results with ground truth communities and state of the art community discovery methods, we show that transactional clustering algorithms are a feasible alternative to community discovery, and that a complete mapping of the two problems is possible.
Source: 3rd EAI International Conference on Smart Objects and Technologies for Social Good, pp. 342–352, Pisa, Italy, 29-30/11/2017
DOI: 10.1007/978-3-319-76111-4_34
Project(s): SoBigData via OpenAIRE
See at: link.springer.com Open Access | ISTI Repository Open Access | Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Restricted | CNR ExploRA


2018 Conference article Open Access
Explaining successful docker images using pattern mining analysis
Guidotti R., Soldani J., Neri D., Brogi A.
Docker is on the rise in today's enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image directly impacts its usage, and hence the potential revenues of its developers. In this paper, we present a frequent pattern mining-based approach for understanding how to improve an image to increase its popularity. The results in this work can provide valuable insights to Docker image providers, helping them to design more competitive software products.
Source: STAF 2018, pp. 98–113, Toulouse, 25/06/2018
DOI: 10.1007/978-3-030-04771-9_9
Project(s): SoBigData via OpenAIRE
See at: Archivio della Ricerca - Università di Pisa Open Access | link.springer.com Open Access | ISTI Repository Open Access | doi.org Restricted | CNR ExploRA


2019 Conference article Open Access
Investigating neighborhood generation methods for explanations of obscure image classifiers
Guidotti R., Monreale A., Cariaggi L.
Given the wide use of machine learning approaches based on opaque prediction models, understanding the reasons behind decisions of black box decision systems is nowadays a crucial topic. We address the problem of providing meaningful explanations in the widely-applied image classification tasks. In particular, we explore the impact of changing the neighborhood generation function for a local interpretable model-agnostic explanator by proposing four different variants. All the proposed methods are based on a grid-based segmentation of the images, but each of them proposes a different strategy for generating the neighborhood of the image for which an explanation is required. A deep experimentation shows both the improvements and the weaknesses of each proposed approach.
Source: PAKDD, pp. 55–68, Macau, 14-17/04/2019
DOI: 10.1007/978-3-030-16148-4_5
Project(s): Track and Know via OpenAIRE, SoBigData via OpenAIRE
See at: arpi.unipi.it Open Access | Lecture Notes in Computer Science Restricted | link.springer.com Restricted | CNR ExploRA


2019 Conference article Open Access
Privacy risk for individual basket patterns
Pellungrini R., Monreale A., Guidotti R.
Retail data are of fundamental importance for businesses and enterprises that want to understand the purchasing behaviour of their customers. Such data are also useful to develop analytical services and for marketing purposes, often based on individual purchasing patterns. However, retail data and extracted models may also provide very sensitive information to possible malicious third parties. Therefore, in this paper we propose a methodology for empirically assessing privacy risk in the release of individual purchasing data. The experiments on real-world retail data show that although individual patterns describe a summary of the customer activity, they may be successfully used for customer re-identification.
Source: PAP 2018, pp. 141–155, Dublin, Ireland, 10/09/2018 - 14/09/2018
DOI: 10.1007/978-3-030-13463-1_11
Project(s): SoBigData via OpenAIRE
See at: arpi.unipi.it Open Access | Lecture Notes in Computer Science Restricted | link.springer.com Restricted | CNR ExploRA


2019 Conference article Restricted
Helping your docker images to spread based on explainable models
Guidotti R., Soldani J., Neri D., Brogi A., Pedreschi D.
Docker is on the rise in today's enterprise IT. It permits shipping applications inside portable containers, which run from so-called Docker images. Docker images are distributed in public registries, which also monitor their popularity. The popularity of an image impacts its actual usage, and hence the potential revenues for its developers. In this paper, we present a solution based on interpretable decision trees and regression trees for estimating the popularity of a given Docker image, and for understanding how to improve an image to increase its popularity. The results presented in this work can provide valuable insights to Docker developers, helping them in spreading their images. Code related to this paper is available at: https://github.com/di-unipi-socc/DockerImageMiner
Source: ECML-PKDD 2018, pp. 205–221, Dublin, Ireland, 10/09/2018 - 14/09/2018
DOI: 10.1007/978-3-030-10997-4_13
Project(s): SoBigData via OpenAIRE
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2020 Conference article Open Access
Black box explanation by learning image exemplars in the latent feature space
Guidotti R., Monreale A., Matwin S., Pedreschi D.
We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
Source: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019, pp. 189–205, Würzburg, Germany, 16-20 September, 2019
DOI: 10.1007/978-3-030-46150-8_12
DOI: 10.48550/arxiv.2002.03746
Project(s): AI4EU via OpenAIRE, Track and Know via OpenAIRE, PRO-RES via OpenAIRE, SoBigData via OpenAIRE
See at: arXiv.org e-Print Archive Open Access | arxiv.org Open Access | ISTI Repository Open Access | doi.org Restricted | doi.org Restricted | www.springerprofessional.de Restricted | CNR ExploRA


2020 Conference article Closed Access
Global explanations with local scoring
Setzu M., Guidotti R., Monreale A., Turini F.
Artificial Intelligence systems often adopt machine learning models encoding complex algorithms with potentially unknown behavior. As the application of these "black box" models grows, it is our responsibility to understand their inner workings and formulate them in human-understandable explanations. To this end, we propose a rule-based model-agnostic explanation method that follows a local-to-global schema: it generalizes a global explanation summarizing the decision logic of a black box starting from the local explanations of single predicted instances. We define a scoring system based on a rule relevance score to extract global explanations from a set of local explanations in the form of decision rules. Experiments on several datasets and black boxes show the stability and low complexity of the global explanations provided by the proposed solution in comparison with baselines and state-of-the-art global explainers.
Source: Joint European Conference on Machine Learning and Knowledge Discovery in Databases - ECML PKDD 2019, pp. 159–171, Würzburg, Germany, 16-20 September, 2019
DOI: 10.1007/978-3-030-43823-4_14
Project(s): AI4EU via OpenAIRE, Track and Know via OpenAIRE, PRO-RES via OpenAIRE, XAI via OpenAIRE, SoBigData via OpenAIRE
See at: Communications in Computer and Information Science Restricted | link.springer.com Restricted | CNR ExploRA


2023 Journal article Open Access
Solving imbalanced learning with outlier detection and features reduction
Lusito S., Pugnana A., Guidotti R.
A critical problem for several real world applications is class imbalance. Indeed, in contexts like fraud detection or medical diagnostics, standard machine learning models fail because they are designed to handle balanced class distributions. Existing solutions typically increase the rare class instances by generating synthetic records to achieve a balanced class distribution. However, these procedures generate implausible data and tend to create unnecessary noise. We propose a change of perspective where instead of relying on resampling techniques, we depend on unsupervised feature engineering approaches to represent records with a combination of features that will help the classifier capture the differences among classes, even in the presence of imbalanced data. Thus, we combine a large array of outlier detection, feature projection, and feature selection approaches to augment the expressiveness of the dataset population. We show the effectiveness of our proposal in a deep and wide set of benchmarking experiments as well as in real case studies.
Source: Machine learning (2023). doi:10.1007/s10994-023-06448-0
DOI: 10.1007/s10994-023-06448-0
Project(s): SoBigData-PlusPlus via OpenAIRE
See at: link.springer.com Open Access | ISTI Repository Open Access | CNR ExploRA


2023 Conference article Restricted
Explaining black-boxes in federated learning
Corbucci L., Guidotti R., Monreale A.
Federated Learning has witnessed increasing popularity in the past few years for its ability to train Machine Learning models in critical contexts, using private data without moving them. Most of the work in the literature proposes algorithms and architectures for training neural networks which, although they achieve high performance in different prediction tasks and are easy to learn with a cooperative mechanism, have obscure predictive reasoning. Therefore, in this paper, we propose a variant of SHAP, one of the most widely used explanation methods, tailored to Horizontal server-based Federated Learning. The basic idea is to explain an instance's prediction performed by the trained Machine Learning model as an aggregation of the explanations provided by the clients participating in the cooperation. We empirically test our proposal on two different tabular datasets, and we observe interesting and encouraging preliminary results.
Source: xAI 2023 - World Conference on Explainable Artificial Intelligence, pp. 151–163, Lisbon, Portugal, 26-28/07/2023
DOI: 10.1007/978-3-031-44067-0_8
Project(s): TAILOR via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE, Humane AI via OpenAIRE
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2015 Conference article Restricted
Find your way back: Mobility profile mining with constraints
Kotthoff L., Nanni M., Guidotti R., O'Sullivan B.
Mobility profile mining is a data mining task that can be formulated as clustering over movement trajectory data. The main challenge is to separate the signal from the noise, i.e. one-off trips. We show that standard data mining approaches suffer the important drawback that they cannot take the symmetry of non-noise trajectories into account. That is, if a trajectory has a symmetric equivalent that covers the same trip in the reverse direction, it should become more likely that neither of them is labelled as noise. We present a constraint model that takes this knowledge into account to produce better clusters. We show the efficacy of our approach on real-world data that was previously processed using standard data mining techniques.
Source: Principles and Practice of Constraint Programming. 21st International Conference, pp. 638–653, Cork, Ireland, 31/08/2015-04/09/2015
DOI: 10.1007/978-3-319-23219-5_44
Project(s): ICON via OpenAIRE
See at: doi.org Restricted | www.scopus.com Restricted | CNR ExploRA


2015 Conference article Restricted
Managing travels with PETRA: The Rome use case
Botea A., Braghin S., Lopes N., Guidotti R., Pratesi F.
The aim of the PETRA project is to provide the basis for a city-wide transportation system that supports policies catering for both individual preferences of users and city-wide travel patterns. The PETRA platform will be initially deployed in the partner city of Rome, and later in Venice and Tel-Aviv.
Source: 31st IEEE International Conference on Data Engineering. Data Mining and Smart Cities Applications Workshop, pp. 110–111, Seoul, Korea, 13-17/04/2015
DOI: 10.1109/icdew.2015.7129558
Project(s): PETRA via OpenAIRE
See at: doi.org Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA


2015 Conference article Restricted
Mobility Mining for Journey Planning in Rome
Berlingerio M., Bicer V., Botea A., Braghin S., Lopes N., Guidotti R., Pratesi F.
We present recent results on integrating private car GPS routines obtained by a Data Mining module into the PETRA (PErsonal TRansport Advisor) platform. The routines are used as additional "bus lines", available to provide a ride to travelers. We present the effects of querying the planner with and without the routines, which show how Data Mining may help Smarter Cities applications.
Source: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. European Conference, pp. 222–226, Porto, Portugal, 07-11/09/2015
DOI: 10.1007/978-3-319-23461-8_18
Project(s): PETRA via OpenAIRE
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2016 Contribution to book Restricted
Audio ergo sum. A personal data model for musical preferences
Guidotti R., Rossetti G., Pedreschi D.
Nobody can state "Rock is my favorite genre" or "David Bowie is my favorite artist". We defined a Personal Listening Data Model able to capture musical preferences through indicators and patterns, and we discovered that we are all characterized by a limited set of musical preferences, but not by a unique predilection. The empowered capacity of mobile devices and their growing adoption in our everyday life is generating an enormous increment in the production of personal data such as calls, positioning, online purchases and even music listening. Musical listening is a type of data that has started receiving more attention from the scientific community as a consequence of the increasing availability of rich and punctual online data sources. Starting from the listening of 30k Last.Fm users, we show how the employment of the Personal Listening Data Models can provide higher levels of self-awareness. In addition, the proposed model will enable the development of a wide range of analyses and musical services both at personal and at collective level.
Source: Software Technologies: Applications and Foundations, edited by Milazzo, Paolo; Varró, Dániel; Wimmer, Manuel, pp. 51–66, 2016
DOI: 10.1007/978-3-319-50230-4_5
Project(s): CIMPLEX via OpenAIRE
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2016 Contribution to book Restricted
ICON loop carpooling show case
Nanni M., Kotthoff L., Guidotti R., O'Sullivan B., Pedreschi D.
In this chapter we describe a proactive carpooling service that combines induction and optimization mechanisms to maximize the impact of carpooling within a community. The approach autonomously infers the mobility demand of the users through the analysis of their mobility traces (i.e. Data Mining of GPS trajectories) and builds the network of all possible ride sharing opportunities among the users. Then, the maximal set of carpooling matches that satisfy some standard requirements (maximal capacity of vehicles, etc.) is computed through Constraint Programming models, and the resulting matches are proactively proposed to the users. Finally, in order to maximize the expected impact of the service, the probability that each carpooling match is accepted by the users involved is inferred through Machine Learning mechanisms and put in the CP model. The whole process is reiterated at regular intervals, thus forming an instance of the general ICON loop.
Source: Data Mining and Constraint Programming - Foundations of a Cross-Disciplinary Approach, edited by Bessiere, C.; De Raedt, L.; Kotthoff, L.; Nijssen, S.; O'Sullivan, B.; Pedreschi, D., pp. 310–324, 2016
DOI: 10.1007/978-3-319-50137-6_13
Project(s): PETRA via OpenAIRE
See at: doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2018 Conference article Open Access
Recognizing Residents and Tourists with Retail Data Using Shopping Profiles
Guidotti R., Gabrielli L.
The huge quantity of personal data stored by service providers registering customers' daily life enables the analysis of individual fingerprints characterizing the customers' behavioral profiles. We propose a framework for recognizing residents, tourists and occasional shoppers among the customers of a retail market chain. We employ our recognition framework on a real massive dataset containing the shopping transactions of more than one million customers, and we identify representative temporal shopping profiles for residents, tourists and occasional customers. Our experiments show that even though residents are about 33% of the customers they are responsible for more than 90% of the expenditure. We statistically validate the number of residents and tourists with national official statistics, enabling in this way the adoption of our recognition framework for the development of novel services and analysis.
Source: 3rd EAI International Conference on Smart Objects and Technologies for Social Good, pp. 353–363, Pisa, Italy, 29-30/11/2017
DOI: 10.1007/978-3-319-76111-4_35
Project(s): SoBigData via OpenAIRE
See at: link.springer.com Open Access | ISTI Repository Open Access | Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Restricted | CNR ExploRA


2018 Report Open Access
Assessing the stability of interpretable models
Guidotti R., Ruggieri S.
Interpretable classification models are built with the purpose of providing a comprehensible description of the decision logic to an external oversight agent. When considered in isolation, a decision tree, a set of classification rules, or a linear model, are widely recognized as human-interpretable. However, such models are generated as part of a larger analytical process, which, in particular, comprises data collection and filtering. Selection bias in data collection or in data pre-processing may affect the model learned. Although model induction algorithms are designed to learn to generalize, they pursue optimization of predictive accuracy. It remains unclear how interpretability is instead impacted. We conduct an experimental analysis to investigate whether interpretable models are able to cope with data selection bias as far as interpretability is concerned.
Source: ISTI Technical reports, 2018
Project(s): SoBigData via OpenAIRE

See at: arxiv.org Open Access | ISTI Repository Open Access | CNR ExploRA