2022 Journal article Open Access

Understanding peace through the world news
Voukelatou V., Miliou I., Giannotti F., Pappalardo L.
Peace is a principal dimension of well-being and is the way out of inequity and violence. Thus, its measurement has drawn the attention of researchers, policymakers, and peacekeepers. During the last years, novel digital data streams have drastically changed the research in this field. The current study exploits information extracted from a new digital database called Global Data on Events, Location, and Tone (GDELT) to capture peace through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use explainable AI techniques to obtain the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by researchers, policymakers, and peacekeepers, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peace.
Source: EPJ 11 (2022). doi:10.1140/epjds/s13688-022-00315-z
DOI: 10.1140/epjds/s13688-022-00315-z
Project(s): XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: epjdatascience.springeropen.com Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2021 Conference article Open Access

Measuring immigrants adoption of natives shopping consumption with machine learning
Guidotti R., Nanni M., Giannotti F., Pedreschi D., Bertoli S., Speciale B., Rapoport H.
"Tell me what you eat and I will tell you what you are". Jean Anthelme Brillat-Savarin was among the first to recognize the relationship between identity and food consumption. Food adoption choices are much less exposed to external judgment and social pressure than other individual behaviours, and can be observed over a long period. That makes them an interesting basis for, among other applications, studying the integration of immigrants from a food consumption viewpoint. Indeed, in this work we analyze immigrants' food consumption from shopping retail data for understanding if and how it converges towards those of natives. As the core contribution of our proposal, we define a score of adoption of natives' consumption habits by an individual as the probability of being recognized as a native from a machine learning classifier, thus adopting a completely data-driven approach. We measure the immigrant's adoption of natives' consumption behavior over a long time, and we identify different trends. A case study on real data of a large nation-wide supermarket chain reveals that we can distinguish five main different groups of immigrants depending on their trends of native consumption adoption.
Source: ECML PKDD 2020 - Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 369–385, Ghent, Belgium, September 14-18, 2020
DOI: 10.1007/978-3-030-67670-4_23

See at: ISTI Repository Open Access | link.springer.com Restricted | CNR ExploRA Restricted


2021 Report Open Access

Understanding peacefulness through the world news
Voukelatou V., Miliou I., Giannotti F., Pappalardo L.
Peacefulness is a principal dimension of well-being for all humankind and is the way out of inequity and every single form of violence. Thus, its measurement has lately drawn the attention of researchers and policy-makers. During the last years, novel digital data streams have drastically changed the research in this field. In the current study, we exploit information extracted from the Global Data on Events, Location, and Tone (GDELT) digital news database to capture peacefulness through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use the SHAP methodology to obtain the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions overall, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by Social Good researchers, policy-makers, and peace-builders, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peacefulness.
Source: ISTI Research Report, SoBigData++, 2021
Project(s): SoBigData-PlusPlus via OpenAIRE

See at: arxiv.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2021 Journal article Open Access

Data Science: a game changer for science and innovation
Grossi V., Giannotti F., Pedreschi D., Manghi P., Pagano P., Assante M.
This paper shows data science's potential for disruptive innovation in science, industry, policy, and people's lives. We present how data science impacts science and society at large in the coming years, including ethical problems in managing human behavior data and considering the quantitative expectations of data science economic impact. We introduce concepts such as open science and e-infrastructure as useful tools for supporting ethical data science and training new generations of data scientists. Finally, this work outlines SoBigData Research Infrastructure as an easy-to-access platform for executing complex data science processes. The services proposed by SoBigData are aimed at using data science to understand the complexity of our contemporary, globally interconnected society.
Source: International Journal of Data Science and Analytics (Print) 11 (2021): 263–278. doi:10.1007/s41060-020-00240-2
DOI: 10.1007/s41060-020-00240-2

See at: link.springer.com Open Access | ISTI Repository Open Access | CNR ExploRA Open Access | International Journal of Data Science and Analytics Restricted


2021 Journal article Open Access

Predicting seasonal influenza using supermarket retail records
Miliou I., Xiong X., Rinzivillo S., Zhang Q., Rossetti G., Giannotti F., Pedreschi D., Vespignani A.
Increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecast systems. In this paper, we propose the use of a novel data source, namely retail market data, to improve seasonal influenza forecasting. Specifically, we consider supermarket retail data as a proxy signal for influenza, through the identification of sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates for influenza incidence in Italy up to 4 weeks ahead. We make use of the Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, acting as a proxy that can be used for the real-time analysis of epidemics.
Source: PLoS computational biology 17 (2021). doi:10.1371/journal.pcbi.1009087
DOI: 10.1371/journal.pcbi.1009087

See at: journals.plos.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2021 Journal article Open Access

Ethics of smart cities: towards value-sensitive design and co-evolving city life
Helbing D., Fanitabasi F., Giannotti F., Hanggli R., Hausladen C. I., Van Den Hoven J., Mahajan S., Pedreschi D., Pournaras E.
The digital revolution has brought about many societal changes such as the creation of "smart cities". The smart city concept has changed the urban ecosystem by embedding digital technologies in the city fabric to enhance the quality of life of its inhabitants. However, it has also led to some pressing issues and challenges related to data, privacy, ethics inclusion, and fairness. While the initial concept of smart cities was largely technology- and data-driven, focused on the automation of traffic, logistics and processes, this concept is currently being replaced by technology-enabled, human-centred solutions. However, this is not the end of the development, as there is now a big trend towards "design for values". In this paper, we point out how a value-sensitive design approach could promote a more sustainable pathway of cities that better serves people and nature. Such "value-sensitive design" will have to take ethics, law and culture on board. We discuss how organising the digital world in a participatory way, as well as leveraging the concepts of self-organisation, self-regulation, and self-control, would foster synergy effects and thereby help to leverage a sustainable technological revolution on a global scale. Furthermore, a "democracy by design" approach could also promote resilience.
Source: Sustainability (Basel) 13 (2021). doi:10.3390/su132011162
DOI: 10.3390/su132011162
Project(s): SoBigData-PlusPlus via OpenAIRE

See at: CNR ExploRA Open Access | www.mdpi.com Open Access


2021 Conference article Closed Access

Artificial intelligence for humankind: a panel on how to create truly interactive and human-centered AI for the benefit of individuals and society
Schmidt A., Giannotti F., Mackay W., Shneiderman B., Vaananen K.
This panel discusses the role of human-computer interaction (HCI) in the conception, design, and implementation of human-centered artificial intelligence (AI). For us, it is important that AI and machine learning (ML) are ethical and create value for humans - as individuals as well as for society. Our discussion emphasizes the opportunities of using HCI and User Experience Design methods to create advanced AI/ML-based systems that will be widely adopted, reliable, safe, trustworthy, and responsible. The resulting systems will integrate AI and ML algorithms while providing user interfaces and control panels that ensure meaningful human control.
Source: INTERACT 2021 - 18th IFIP TC 13 International Conference on Human-Computer Interaction (Part V), pp. 335–339, Bari, Italy, 30/08/2021 - 03/09/2021
DOI: 10.1007/978-3-030-85607-6_32

See at: link.springer.com Restricted | CNR ExploRA Restricted


2021 Journal article Open Access

Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Nanni M., Andrienko G., Barabasi A. -L., Boldrini C., Bonchi F., Cattuto C., Chiaromonte F., Comande G., Conti M., Cote M., Dignum F., Dignum V., Domingo-Ferrer J., Ferragina P., Giannotti F., Guidotti R., Helbing D., Kaski K., Kertesz J., Lehmann S., Lepri B., Lukowicz P., Matwin S., Jimenez D. M., Monreale A., Morik K., Oliver N., Passarella A., Passerini A., Pedreschi D., Pentland A., Pianesi F., Pratesi F., Rinzivillo S., Ruggieri S., Siebes A., Torra V., Trasarti R., Hoven J., Vespignani A.
The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the "phase 2" of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates, if and when they want and for specific aims, with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Source: Ethics and information technology 23 (2021). doi:10.1007/s10676-020-09572-w
DOI: 10.1007/s10676-020-09572-w
Project(s): SoBigData-PlusPlus via OpenAIRE

See at: Aaltodoc Publication Archive Open Access | Ethics and Information Technology Open Access | Archivio Istituzionale Open Access | link.springer.com Open Access | City Research Online Open Access | ISTI Repository Open Access | Fraunhofer-ePrints Open Access | CNR ExploRA Open Access | TU Delft Repository Open Access | Digitala Vetenskapliga Arkivet - Academic Archive On-line Open Access | Ethics and Information Technology Restricted | Archivio della Ricerca - Università di Pisa Restricted | publons.com Restricted | www.scopus.com Restricted


2020 Report Open Access

Mobile phone data analytics against the COVID-19 epidemics in Italy: flow diversity and local job markets during the national lockdown
Bonato P., Cintia P., Fabbri F., Fadda D., Giannotti F., Lopalco P. L., Mazzilli S., Nanni M., Pappalardo L., Pedreschi D., Penone F., Rinzivillo S., Rossetti G., Savarese M., Tavoschi L.
Understanding human mobility patterns is crucial to plan the restart of production and economic activities, which are currently put in "stand-by" to fight the diffusion of the epidemic. A recent analysis shows that, following the national lockdown of March 9th, the mobility fluxes have decreased by 50% or more, everywhere in the country. To this purpose, we use mobile phone data to compute the movements of people between Italian provinces, and we analyze the incoming, outgoing and internal mobility flows before and during the national lockdown (March 9th, 2020) and after the closure of non-necessary productive and economic activities (March 23rd, 2020). The population flows across provinces and municipalities enable the modeling of a risk index tailored for the mobility of each municipality or province. Such an index would be a useful indicator to drive counter-measures in reaction to a sudden reactivation of the epidemic. Mobile phone data, even when aggregated to preserve the privacy of individuals, are a useful data source to track the evolution in time of human mobility, hence allowing for monitoring the effectiveness of control measures such as physical distancing. In this report, we address the following analytical questions: How does the mobility structure of a territory change? Do incoming and outgoing flows become more predictable during the lockdown, and what are the differences between weekdays and weekends? Can we detect proper local job markets based on human mobility flows, to eventually shape the borders of a local outbreak?
Source: ISTI Technical Reports 005/2020, 2020
DOI: 10.32079/isti-tr-2020/005

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2020 Journal article Restricted

Authenticated Outlier Mining for Outsourced Databases
Dong B., Wang H., Monreale A., Pedreschi D., Giannotti F., Guo W.
The Data-Mining-as-a-Service (DMaS) paradigm is becoming the focus of research, as it allows the data owner (client) who lacks expertise and/or computational resources to outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises some issues about result integrity: how could the client verify that the mining results returned by the server are both sound and complete? In this paper, we focus on outlier mining, an important mining task. Previous verification techniques use an authenticated data structure (ADS) for correctness authentication, which may incur much space and communication cost. In this paper, we propose a novel solution that returns a probabilistic result integrity guarantee with much cheaper verification cost. The key idea is to insert a set of artificial records (ARs) into the dataset, from which it constructs a set of artificial outliers (AOs) and artificial non-outliers (ANOs). The AOs and ANOs are used by the client to detect any incomplete and/or incorrect mining results with a probabilistic guarantee. The main challenge that we address is how to construct ARs so that they do not change the (non-)outlierness of original records, while guaranteeing that the client can identify ANOs and AOs without executing mining. Furthermore, we build a strategic game and show that a Nash equilibrium exists only when the server returns correct outliers. Our implementation and experiments demonstrate that our verification solution is efficient and lightweight.
Source: IEEE transactions on dependable and secure computing 17 (2020): 222–235. doi:10.1109/TDSC.2017.2754493
DOI: 10.1109/tdsc.2017.2754493
Project(s): CAREER: Verifiable Outsourcing of Data Mining Computations via OpenAIRE, SaTC-EDU: EAGER: Development and Evaluation of Privacy Education Tools via Open Collaboration via OpenAIRE

See at: IEEE Transactions on Dependable and Secure Computing Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA Restricted


2020 Conference article Open Access

Digital footprints of international migration on Twitter
Kim J., Sirbu A., Giannotti F., Gabrielli L.
Studying migration using traditional data has some limitations. To date, there have been several studies proposing innovative methodologies to measure migration stocks and flows from social big data. Nevertheless, a uniform definition of a migrant is difficult to find as it varies from one work to another depending on the purpose of the study and nature of the dataset used. In this work, a generic methodology is developed to identify migrants within the Twitter population. This describes a migrant as a person who has the current residence different from the nationality. The residence is defined as the location where a user spends most of his/her time in a certain year. The nationality is inferred from linguistic and social connections to a migrant's country of origin. This methodology is validated first with an internal gold standard dataset and second with two official statistics, and shows strong performance scores and correlation coefficients. Our method has the advantage that it can identify both immigrants and emigrants, regardless of the origin/destination countries. The new methodology can be used to study various aspects of migration, including opinions, integration, attachment, stocks and flows, motivations for migration, etc. Here, we exemplify how trending topics across and throughout different migrant communities can be observed.
Source: IDA 2020 - 18th International Conference on Intelligent Data Analysis, pp. 274–286, Konstanz, Germany, 27-29 April, 2020
DOI: 10.1007/978-3-030-44584-3_22
Project(s): HumMingBird via OpenAIRE, SoBigData via OpenAIRE

See at: link.springer.com Open Access | ISTI Repository Open Access | CNR ExploRA Open Access | academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | Archivio della Ricerca - Università di Pisa Restricted | link.springer.com Restricted


2020 Journal article Open Access

Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Nanni M., Andrienko G., Barabasi A. -L., Boldrini C., Bonchi F., Cattuto C., Chiaromonte F., Comandé G., Conti M., Coté M., Dignum F., Dignum V., Domingo-Ferrer J., Ferragina P., Giannotti F., Guidotti R., Helbing D., Kaski K., Kertesz J., Lehmann S., Lepri B., Lukowicz P., Matwin S., Jimenez D., Monreale A., Morik K., Oliver N., Passarella A., Passerini A., Pedreschi D., Pentland A., Pianesi F., Pratesi F., Rinzivillo S., Ruggieri S., Siebes A., Torra V., Trasarti R., Van Den Hoven J., Vespignani A.
The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the "phase 2" of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively (e.g., with a backend system, but possibly also with other citizens), voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale. The decentralized approach is also scalable to large populations, in that only the data of positive patients need be handled at a central level. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates, if and when they want and for specific aims, with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Source: Transactions on data privacy 13 (2020): 61–66.

See at: ISTI Repository Open Access | CNR ExploRA Open Access | www.tdp.cat Open Access


2020 Journal article Restricted

Human migration: the big data perspective
Sîrbu A., Andrienko G., Andrienko N., Boldrini C., Conti M., Giannotti F., Guidotti R., Bertoli S., Kim J., Muntean C. I., Pappalardo L., Passarella A., Pedreschi D., Pollacci L., Pratesi F., Sharma R.
How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants.
Source: International Journal of Data Science and Analytics (Online) (2020). doi:10.1007/s41060-020-00213-5
DOI: 10.1007/s41060-020-00213-5
Project(s): SoBigData via OpenAIRE

See at: International Journal of Data Science and Analytics Restricted | link.springer.com | CNR ExploRA


2020 Report Open Access

The relationship between human mobility and viral transmissibility during the COVID-19 epidemics in Italy
Cintia P., Fadda D., Giannotti F., Pappalardo L., Rossetti G., Pedreschi D., Rinzivillo S., Bonato P., Fabbri F., Penone F., Bavarese M., Checchi D., Chiaromonte F., Vineis P., Gazzetta G., Riccardo F., Marziano V., Poletti P., Trentini F., Bella A., Xanthi A., Del Manso M., Fabiani M., Bellino S., Boros S., Urdiales A. M., Vescia M. F., Brusaferro S., Rezza G., Pezzotti P., Ajelli M., Merler S.
We describe in this report our studies to understand the relationship between human mobility and the spreading of COVID-19, as an aid to manage the restart of the social and economic activities after the lockdown and monitor the epidemic in the coming weeks and months. We compare the evolution (from January to May 2020) of the daily mobility flows in Italy, measured by means of nation-wide mobile phone data, and the evolution of transmissibility, measured by the net reproduction number, i.e., the mean number of secondary infections generated by one primary infector in the presence of control interventions and human behavioural adaptations. We find a striking relationship between the negative variation of mobility flows and the net reproduction number, in all Italian regions, between March 11th and March 18th, when the country entered the lockdown. This observation allows us to quantify the time needed to "switch off" the country mobility (one week) and the time required to bring the net reproduction number below 1 (one week). A reasonably simple regression model provides evidence that the net reproduction number is correlated with a region's incoming, outgoing and internal mobility. We also find a strong relationship between the number of days above the epidemic threshold before the mobility flows reduce significantly as an effect of lockdowns, and the total number of confirmed SARS-CoV-2 infections per 100k inhabitants, thus indirectly showing the effectiveness of the lockdown and the other non-pharmaceutical interventions in the containment of the contagion. Our study demonstrates the value of "big" mobility data to the monitoring of key epidemic indicators to inform choices as the epidemic unfolds in the coming months.
Project(s): SoBigData via OpenAIRE

See at: arxiv.org Open Access | ISTI Repository Open Access | CNR ExploRA Open Access


2020 Conference article Embargo

Estimating countries' peace index through the lens of the world news as monitored by GDELT
Voukelatou V., Pappalardo L., Miliou I., Gabrielli L., Giannotti F.
Peacefulness is a principal dimension of well-being, and its measurement has lately drawn the attention of researchers and policy-makers. During the last years, novel digital data streams have drastically changed research in this field. In the current study, we exploit information extracted from Global Data on Events, Location, and Tone (GDELT) digital news database, to capture peacefulness through the Global Peace Index (GPI). Applying machine learning techniques, we demonstrate that news media attention, sentiment, and social stability from GDELT can be used as proxies for measuring GPI at a monthly level. Additionally, through the variable importance analysis, we show that each country's socio-economic, political, and military profile emerges. This could bring added value to researchers interested in "Data Science for Social Good", to policy-makers, and peacekeeping organizations since they could monitor peacefulness almost real-time, and therefore facilitate timely and more efficient policy-making.
Source: 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), pp. 216–225, 06/10/2020, 09/10/2020
DOI: 10.1109/dsaa49011.2020.00034
Project(s): SoBigData-PlusPlus via OpenAIRE

See at: academic.microsoft.com Restricted | dblp.uni-trier.de Restricted | doi.org Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA Restricted | xplorestaging.ieee.org Restricted


2020 Report Open Access

Predicting seasonal influenza using supermarket retail records
Miliou I., Xiong X., Rinzivillo S., Zhang Q., Rossetti G., Giannotti F., Pedreschi D., Vespignani A.
Increased availability of epidemiological data, novel digital data streams, and the rise of powerful machine learning approaches have generated a surge of research activity on real-time epidemic forecast systems. In this paper, we propose the use of a novel data source, namely retail market data to improve seasonal influenza forecasting. Specifically, we consider supermarket retail data as a proxy signal for influenza, through the identification of sentinel baskets, i.e., products bought together by a population of selected customers. We develop a nowcasting and forecasting framework that provides estimates for influenza incidence in Italy up to 4 weeks ahead. We make use of the Support Vector Regression (SVR) model to produce the predictions of seasonal flu incidence. Our predictions outperform both a baseline autoregressive model and a second baseline based on product purchases. The results show quantitatively the value of incorporating retail market data in forecasting models, acting as a proxy that can be used for the real-time analysis of epidemics.
Source: ISTI Technical Reports 2020/009, 2020
DOI: 10.32079/isti-tr-2020/009
Project(s): SoBigData-PlusPlus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA Open Access


2020 Conference article Restricted

Prediction and explanation of privacy risk on mobility data with neural networks
Naretto F., Pellungrini R., Nardini F. M., Giannotti F.
The analysis of privacy risk for mobility data is a fundamental part of any privacy-aware process based on such data. Mobility data are highly sensitive. Therefore, the correct identification of the privacy risk before releasing the data to the public is of utmost importance. However, existing privacy risk assessment frameworks have high computational complexity. To tackle these issues, some recent work proposed a solution based on classification approaches to predict privacy risk using mobility features extracted from the data. In this paper, we propose an improvement of this approach by applying long short-term memory (LSTM) neural networks to predict the privacy risk directly from original mobility data. We empirically evaluate privacy risk on real data by applying our LSTM-based approach. Results show that our proposed method based on an LSTM network is effective in predicting the privacy risk with results in terms of F1 of up to 0.91. Moreover, to explain the predictions of our model, we employ a state-of-the-art explanation algorithm, SHAP. We explore the resulting explanation, showing how it is possible to provide effective predictions while explaining them to the end-user.
Source: ECML PKDD 2020 Workshops, pp. 501–516, Ghent, Belgium, 14-18/10/2020
DOI: 10.1007/978-3-030-65965-3_34
Project(s): HumanE-AI-Net via OpenAIRE, XAI via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: academic.microsoft.com Restricted | link.springer.com Restricted | CNR ExploRA Restricted


2019 Journal article Open Access

Personalized market basket prediction with temporal annotated recurring sequences
Guidotti R., Rossetti G., Pappalardo L., Giannotti F., Pedreschi D.
Nowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer with a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence, sequentiality, periodicity and recurrency of the purchased items. To this aim, we define a pattern, the Temporal Annotated Recurring Sequence (TARS), able to capture all these factors simultaneously and adaptively. We define the method to extract TARS and develop a next-basket predictor named TBP (TARS Based Predictor) that, on top of TARS, is able to understand the level of the customer's stocks and recommend the set of most necessary items. By adopting TBP, supermarket chains could craft tailored suggestions for each individual customer, which in turn could effectively speed up their shopping sessions. Extensive experiments show that TARS are able to explain customer purchase behavior, and that TBP outperforms the state-of-the-art competitors.
Source: IEEE transactions on knowledge and data engineering (Print) 31 (2019): 2151–2163. doi:10.1109/TKDE.2018.2872587
DOI: 10.1109/tkde.2018.2872587
Project(s): SoBigData via OpenAIRE

See at: Archivio della Ricerca - Università di Pisa Open Access | IEEE Transactions on Knowledge and Data Engineering Open Access | ISTI Repository Open Access | IEEE Transactions on Knowledge and Data Engineering Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA Restricted


2019 Journal article Open Access

A survey of methods for explaining black box models
Guidotti R., Monreale A., Ruggieri S., Turini F., Giannotti F., Pedreschi D.
In recent years, many accurate decision support systems have been constructed as black boxes, that is, as systems that hide their internal logic from the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help researchers find the proposals most useful for their own work. The proposed classification of approaches to open black box models should also be useful for putting the many open research questions in perspective.
Source: ACM computing surveys 51 (2019). doi:10.1145/3236009
DOI: 10.1145/3236009
Project(s): SoBigData via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | dl.acm.org Open Access | ACM Computing Surveys Open Access | Archivio della Ricerca - Università di Pisa Open Access | ISTI Repository Open Access | CNR ExploRA Open Access | ACM Computing Surveys Restricted


2019 Journal article Open Access

The Italian music superdiversity. Geography, emotion and language: one resource to find them, one resource to rule them all
Pollacci L., Guidotti R., Rossetti G., Giannotti F., Pedreschi D.
Globalization can lead to a growing standardization of musical contents. Using a cross-service multi-level dataset, we investigate the actual Italian music scene. The investigation highlights Italian musical superdiversity, both by analyzing the geographical and lexical dimensions individually and by combining them. Using different kinds of features over the geographical dimension leads to two similar, comparable and coherent results, confirming the strong and essential correlation between melodies and lyrics. The profiles identified are markedly distinct from one another with respect to sentiment, lexicon, and melodic features. Through a novel application of a sentiment spreading algorithm and songs' melodic features, we are able to highlight discriminant characteristics that violate the standard regional political boundaries, reconfiguring them following the actual musical communicative practices.
Source: Multimedia tools and applications (Dordrecht. Online) 78 (2019): 3297–3319. doi:10.1007/s11042-018-6511-6
DOI: 10.1007/s11042-018-6511-6
Project(s): SoBigData via OpenAIRE

See at: ISTI Repository Open Access | Multimedia Tools and Applications Restricted | link.springer.com Restricted | CNR ExploRA Restricted