23 result(s)
2020 Journal article Open Access
(So) Big Data and the transformation of the city
Andrienko G., Andrienko N., Boldrini C., Caldarelli G., Cintia P., Cresci S., Facchini A., Giannotti F., Gionis A., Guidotti R., Mathioudakis M., Muntean C. I., Pappalardo L., Pedreschi D., Pournaras E., Pratesi F., Tesconi M., Trasarti R.
The exponential increase in the availability of large-scale mobility data has fueled the vision of smart cities that will transform our lives. The truth is that we have just scratched the surface of the research challenges that should be tackled in order to make this vision a reality. Consequently, there is an increasing interest among different research communities (ranging from civil engineering to computer science) and industrial stakeholders in building knowledge discovery pipelines over such data sources. At the same time, this widespread data availability also raises privacy issues that must be considered by both industrial and academic stakeholders. In this paper, we provide a wide perspective on the role that big data have in reshaping cities. The paper covers the main aspects of urban data analytics, focusing on privacy issues, algorithms, applications and services, and georeferenced data from social media. In discussing these aspects, we leverage, as concrete examples and case studies of urban data science tools, the results obtained in the "City of Citizens" thematic area of the Horizon 2020 SoBigData initiative, which includes a virtual research environment with mobility datasets and urban analytics methods developed by several institutions around Europe. We conclude the paper outlining the main research challenges that urban data science has yet to address in order to help make the smart city vision a reality.
Source: International Journal of Data Science and Analytics (Print) 1 (2020). doi:10.1007/s41060-020-00207-3
DOI: 10.1007/s41060-020-00207-3
Project(s): SoBigData via OpenAIRE


See at: Aaltodoc Publication Archive Open Access | International Journal of Data Science and Analytics Open Access | White Rose Research Online Open Access | HELDA - Digital Repository of the University of Helsinki Open Access | Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari Open Access | link.springer.com Open Access | City Research Online Open Access | ISTI Repository Open Access | Fraunhofer-ePrints Restricted | CNR ExploRA


2019 Journal article Open Access
PRIMULE: Privacy risk mitigation for user profiles
Pratesi F., Gabrielli L., Cintia P., Monreale A., Giannotti F.
The availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources to the standard official statistics. However, this particular kind of data is subject to privacy concerns because it can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles, derived from mobile phone data, by proposing PRIMULE, a privacy risk mitigation strategy. This method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky users in a set of data. Extensive experimentation on real-world data shows the effectiveness of the PRIMULE strategy in terms of both the quality of mobile user profiles and the utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for city user classification.
Source: Data & knowledge engineering 125 (2019). doi:10.1016/j.datak.2019.101786
DOI: 10.1016/j.datak.2019.101786
Project(s): SoBigData via OpenAIRE


See at: ISTI Repository Open Access | Archivio istituzionale della Ricerca - Scuola Normale Superiore Open Access | Data & Knowledge Engineering Restricted | www.sciencedirect.com Restricted | CNR ExploRA


2019 Software Unknown
PlayeRank
Cintia P., Pappalardo L.
PlayeRank is a data-driven algorithm that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. PlayeRank is designed to work with soccer-logs, in which a match consists of a sequence of events, each encoded as a tuple (id, type, position, timestamp), where id is the identifier of the player that the event originates from or refers to, type is the event type (e.g., passes, shots, goals, tackles), and position and timestamp denote the spatio-temporal coordinates of the event on the soccer field. PlayeRank assumes that soccer-logs are stored in a database, which is updated with new events after each soccer match. An exhaustive description of the PlayeRank framework is available in this paper: Pappalardo, Luca, Cintia, Paolo, Ferragina, Paolo, Massucco, Emanuele, Pedreschi, Dino & Giannotti, Fosca (2019) PlayeRank: Data-driven Performance Evaluation and Player Ranking in Soccer via a Machine Learning Approach. ACM Transactions on Intelligent Systems and Technology 10(5), DOI: https://doi.org/10.1145/3343172
Project(s): SoBigData via OpenAIRE

See at: github.com | CNR ExploRA
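
As a rough illustration of the soccer-log event tuple (id, type, position, timestamp) described in the PlayeRank entry above, the sketch below stores a few events and counts them per player and type. It is not taken from the PlayeRank code base: the class name `Event`, the field names, the coordinate units and the helper `count_events_by_player` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Event:
    """One soccer-log event; field names are illustrative assumptions."""
    player_id: int                  # "id" in the tuple
    type: str                       # e.g. "pass", "shot", "goal", "tackle"
    position: Tuple[float, float]   # (x, y) on the field, units assumed
    timestamp: float                # seconds from kick-off, assumed


def count_events_by_player(events: Iterable[Event]) -> Counter:
    """Count how many events of each type every player is involved in."""
    counts: Counter = Counter()
    for event in events:
        counts[(event.player_id, event.type)] += 1
    return counts


# A tiny hand-made match log (not real data).
match_log = [
    Event(7, "pass", (40.0, 30.0), 12.5),
    Event(7, "shot", (88.0, 45.0), 63.0),
    Event(10, "pass", (52.0, 50.0), 70.2),
    Event(7, "pass", (60.0, 35.0), 95.4),
]
counts = count_events_by_player(match_log)
```

Per-player, per-type counts of this kind are only one simple feature that could be derived from such a log; the multi-dimensional, role-aware evaluation itself is described in the cited paper.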


2019 Master thesis Unknown
Injury forecasting in soccer utilizing machine learning and multivariate time series
Guerrini L. Supervisors: Paolo Ferragina, Luca Pappalardo, Paolo Cintia
Injuries have a great impact on professional soccer due to their influence on team performance and the considerable costs of rehabilitation for players. In this thesis, we use injury records and workload data describing the training sessions of players in a professional soccer club, spanning two entire seasons, to train and compare three classes of approaches to injury forecasting, i.e., predicting whether or not a player will get injured in upcoming matches or training sessions. The first class of approaches is based on traditional techniques used in sports science and industry, such as the Acute Chronic Workload Ratio. The second class is based on machine learning tools such as decision trees and k-nearest neighbor classifiers. The third class of approaches extends the second class by fully exploiting the temporal information present in the data through the use of a multivariate time series representation of a player's workload history. We demonstrate that machine learning approaches significantly outperform the traditional techniques still used in the sports industry, raising prediction accuracy from 4% to 50% and paving the way to a more accurate monitoring of the health status of soccer players.
Project(s): SoBigData via OpenAIRE

See at: etd.adm.unipi.it | CNR ExploRA
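
The Acute Chronic Workload Ratio named in the abstract above is commonly computed as a player's recent (acute) average workload divided by the longer-term (chronic) average. The sketch below assumes the frequently used 7-day and 28-day rolling-average windows; the thesis abstract does not state which windows or variant it used, and the function name `acwr` and the sample loads are hypothetical.

```python
def acwr(daily_loads, acute_days=7, chronic_days=28):
    """Acute/chronic workload ratio from daily loads ordered oldest first.

    Window lengths are assumed defaults, not values taken from the thesis.
    """
    if len(daily_loads) < chronic_days:
        raise ValueError("need at least chronic_days of workload history")
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        return None  # ratio is undefined when there was no chronic load
    return acute / chronic


# Made-up workloads: a steady month, and a month ending with a doubled week.
steady = [100.0] * 28
spike = [100.0] * 21 + [200.0] * 7
steady_ratio = acwr(steady)
spike_ratio = acwr(spike)
```

A ratio near 1 means the recent load matches what the player is accustomed to, while a value well above 1 signals a sudden increase in load.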


2019 Master thesis Unknown
Capturing football-teams behavior with a stochastic model
Barbone M. Supervisors: Paolo Ferragina, Luca Pappalardo, Paolo Cintia
This thesis aims to capture soccer teams' behavior using a stochastic approach on a graph built on top of the dataset of Wyscout, a market-leading company in data scouting for soccer. The main contributions of the thesis are twofold. First, it proposes a stochastic representation of a soccer game via a weighted graph properly derived from the Wyscout dataset. Second, it analyses every game through a stochastic model to detect the way teams move the ball, together with the way they move on the field and the performance that they achieve.
Project(s): SoBigData via OpenAIRE

See at: etd.adm.unipi.it | CNR ExploRA


2018 Journal article Open Access
Quantifying the relation between performance and success in soccer
Pappalardo L., Cintia P.
The availability of massive data about sports activities now offers the opportunity to quantify the relation between performance and success. In this study, we analyze more than 6000 games and 10 million events in six European leagues and investigate this relation in soccer competitions. We discover that a team's position in a competition's final ranking is significantly related to its typical performance, as described by a set of technical features extracted from the soccer data. Moreover, we find that, while victories and defeats can be explained by the team's performance during a game, it is difficult to detect draws by using a machine learning approach. We then simulate the outcomes of an entire season of each league, relying only on technical data and exploiting a machine learning model trained on data from past seasons. The simulation produces a team ranking which is similar to the actual ranking, suggesting that a complex systems' view on soccer has the potential of revealing hidden patterns regarding the relation between performance and success.
Source: Advances in Complex Systems 21 (2018). doi:10.1142/S021952591750014X
DOI: 10.1142/s021952591750014x
DOI: 10.48550/arxiv.1705.00885
Project(s): SoBigData via OpenAIRE


See at: arXiv.org e-Print Archive Open Access | Advances in Complex Systems Open Access | ISTI Repository Open Access | Advances in Complex Systems Restricted | doi.org Restricted | www.worldscientific.com Restricted | CNR ExploRA


2018 Journal article Open Access
Effective injury forecasting in soccer with GPS training data and machine learning
Rossi A., Pappalardo L., Cintia P., Iaia F. M., Fernandez J., Medina D.
Injuries have a great impact on professional soccer, due to their large influence on team performance and the considerable costs of rehabilitation for players. Existing studies in the literature provide just a preliminary understanding of which factors most affect injury risk, while an evaluation of the potential of statistical models in forecasting injuries is still missing. In this paper, we propose a multi-dimensional approach to injury forecasting in professional soccer that is based on GPS measurements and machine learning. By using GPS tracking technology, we collect data describing the training workload of players in a professional soccer club during a season. We then construct an injury forecaster and show that it is both accurate and interpretable by providing a set of case studies of interest to soccer practitioners. Our approach opens a novel perspective on injury prevention, providing a set of simple and practical rules for evaluating and interpreting the complex relations between injury risk and training performance in professional soccer.
Source: PloS one 13 (2018): 1–15. doi:10.1371/journal.pone.0201264
DOI: 10.1371/journal.pone.0201264
DOI: 10.48550/arxiv.1705.08079
Project(s): SoBigData via OpenAIRE


See at: arXiv.org e-Print Archive Open Access | PLoS ONE Open Access | ISTI Repository Open Access | doi.org Restricted | CNR ExploRA


2017 Conference article Open Access
Who is going to get hurt? Predicting injuries in professional soccer
Rossi A., Pappalardo L., Cintia P., Fernandez J., Iaia F. M., Medina D.
Injury prevention has a fundamental role in professional soccer due to the high cost of recovery for players and the strong influence of injuries on a club's performance. In this paper we provide a predictive model to prevent injuries of soccer players using a multidimensional approach based on GPS measurements and machine learning. In an evolving scenario, where a soccer club starts collecting the data for the first time and updates the predictive model as the season goes by, our approach can detect around half of the injuries, allowing the soccer club to save 70% of a season's economic costs related to injuries. The proposed approach can be a valuable support for coaches, helping the soccer club to reduce injury incidence, save money and increase team performance.
Source: MLSA'17 - 4th Workshop on Machine Learning and Data Mining for Sports Analytics, pp. 21–30, Skopje, Macedonia, 18 September 2017
Project(s): SoBigData via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA


2017 Journal article Open Access
Discovering and understanding city events with big data: the case of Rome
Furletti B., Trasarti R., Cintia P., Gabrielli L.
The increasing availability of large amounts of data and digital footprints has given rise to ambitious research challenges in many fields, which span from medical research and the financial and commercial world to people and environmental monitoring. Whereas traditional data sources and censuses fail to capture actual and up-to-date behaviors, Big Data supply the missing knowledge, providing useful and hidden information to analysts and decision makers. In this paper, we focus on the identification of city events by analyzing mobile phone data (Call Detail Records), and we study and evaluate the impact of these events on the typical city dynamics. We present an analytical process able to discover, understand and characterize city events from Call Detail Records, designing a distributed computation to implement Sociometer, a profiling tool to categorize phone users. The methodology provides a useful tool for city mobility managers to manage events and make future decisions on specific classes of users, i.e., residents, commuters and tourists.
Source: Information (Basel) 8 (2017). doi:10.3390/info8030074
DOI: 10.3390/info8030074


See at: Information Open Access | ISTI Repository Open Access | www.mdpi.com Open Access | CNR ExploRA


2016 Contribution to conference Open Access
Network-based performance indicators for football teams
Pappalardo L., Cintia P.
Sports analytics has evolved in recent years in an amazing way, thanks to the sensing technologies that provide data streams extracted from every game. Despite the increasing wealth of data, there is not yet a consolidated repertoire of indicators for the various facets of team and player performance. In this poster we propose two data-driven approaches to measure the performance of football teams and football players.
Source: International School and Conference on Network Science (Netsci-x), Wroclaw, Poland, 11-13/01/2016

See at: netsci-x.net Open Access | ISTI Repository Open Access | CNR ExploRA


2016 Report Unknown
ASAP - Telecommunication Data Analytics (TDA) specification and early prototype
Bertoldi R., Cintia P., Trasarti R.
The main objective of this Work Package (WP) is the design and development of an analytics application on WIND Telecommunications customer data, targeted towards tourism and mobility scenarios. The envisaged use cases will be integrated into the ASAP framework and will be evaluated using several measurement methods. At the end of the project's second year (M24), three tasks are involved: the end of task T9.2, task T9.3 and the beginning of task T9.4.
Source: Project report, ASAP, Deliverable D9.3, 2016
Project(s): ASAP via OpenAIRE

See at: CNR ExploRA


2016 Report Open Access
ProgettISTI 2016
Banterle F., Barsocchi P., Candela L., Carlini E., Carrara F., Cassarà P., Ciancia V., Cintia P., Dellepiane M., Esuli A., Gabrielli L., Germanese D., Girardi M., Girolami M., Kavalionak H., Lonetti F., Lulli A., Moreo Fernandez A., Moroni D., Nardini F. M., Monteiro De Lira V. C., Palumbo F., Pappalardo L., Pascali M. A., Reggianini M., Righi M., Rinzivillo S., Russo D., Siotto E., Villa A.
The ProgettISTI research project grant is an award for members of the Institute of Information Science and Technologies (ISTI) to provide support for innovative, original and multidisciplinary projects of high quality and potential. The choice of theme and the design of the research are entirely up to the applicants, yet (i) the theme must fall under the ISTI research topics, (ii) the proposers of each project must come from different laboratories of the Institute and must contribute different expertise to the project idea, and (iii) project proposals should have a duration of 12 months. This report documents the procedure, the proposals and the results of the 2016 edition of the award. In this edition, ten project proposals were submitted and three of them were awarded.
Source: ISTI Technical reports, 2016

See at: ISTI Repository Open Access | CNR ExploRA


2016 Conference article Restricted
The Haka network: Evaluating rugby team performance with dynamic graph analysis
Cintia P., Pappalardo L., Coscia M.
Real world events are intrinsically dynamic and analytic techniques have to take this dynamism into account. This aspect is particularly important in complex network analysis when relations are channels for interaction events between actors. Sensing technologies open the possibility of doing so for sport networks, enabling the analysis of team performance in a standard environment and under standard rules. Useful applications relate directly to improving playing quality, but can also shed light on all forms of team efforts that are relevant for work teams, large firms with coordination and collaboration issues and, as a consequence, economic development. In this paper, we consider dynamics over networks representing the interaction between rugby players during a match. We build a pass network and we introduce the concept of disruption network, building a multilayer structure. We perform both a global and a micro-level analysis on game sequences. When deploying our dynamic graph analysis framework on data from 18 rugby matches, we discover that structural features that make networks resilient to disruptions are a good predictor of a team's performance, both at the global and at the local level. Using our features, we are able to predict the outcome of the match with a precision comparable to state-of-the-art bookmaking.
Source: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 1095–1102, San Francisco, CA, USA, 18-21 August 2016
DOI: 10.1109/asonam.2016.7752377
Project(s): SoBigData via OpenAIRE


See at: doi.org Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA


2015 Report Open Access
An effective time-aware map matching process for low sampling GPS data
Cintia P., Nanni M.
In the era of the proliferation of Geo-Spatial Data, induced by the diffusion of GPS devices, the map matching problem still represents an important and valuable challenge. The process of associating a segment of the underlying road network to a GPS point gives us the chance to enrich raw data with the semantic layer provided by the roadmap, with all contextual information associated to it, e.g. the presence of speed limits, attraction points, changes in elevation, etc. Most state-of-the-art solutions for this classical problem simply look for the shortest or fastest path connecting any pair of consecutive points in a trip. While in some contexts that is reasonable, in this work we argue that the shortest/fastest path assumption can in general be erroneous. Indeed, we show that such approaches can yield travel times that are significantly inconsistent with the real ones, and propose a Time-Aware Map matching process that tries to improve on the state of the art by also taking this temporal aspect into account. Our algorithm proves to be very efficient and effective on low-sampling data, and to outperform existing solutions, as shown by experiments on large datasets of real GPS trajectories. Moreover, our algorithm is parameter-free and does not depend on specific characteristics of the GPS localization error and of the road network (e.g. density of roads, road network topology, etc.).
Source: ISTI Technical reports, 2015

See at: ISTI Repository Open Access | CNR ExploRA


2015 Contribution to book Open Access
Towards a boosted route planner using individual mobility models
Guidotti R., Cintia P.
Route planners generally return routes that minimize either the distance covered or the time traveled. However, these routes are rarely considered by people who move in a certain area systematically. Indeed, due to their expertise, they very often prefer different solutions. In this paper we provide an analytic model to study the deviations of the systematic movements from the paths proposed by a route planner. As a proxy of human mobility we use real GPS traces and we analyze a set of users who travel in the provinces of Pisa and Florence. By using appropriate mobility data mining techniques, we extract the GPS systematic movements and we transform them into sequences of road segments. Finally, we calculate the shortest and fastest path from the origin to the destination of each systematic movement and we compare them with the routes mapped on the road network. Our results show that about 30-35% of the systematic movements follow the shortest paths, while the others follow routes which are on average 7 km longer. In addition, we divided the study area into cells and we analyzed the deviations in the flows of systematic movements. We found that these deviations are not only driven by individual mobility behaviors but are a signal of an existing common sense that could be exploited by a route planner.
Source: Software Engineering and Formal Methods, edited by Domenico Bianculli, Radu Calinescu, Bernhard Rumpe, pp. 108–123. Berlin Heidelberg: Springer, 2015
DOI: 10.1007/978-3-662-49224-6_10
Project(s): PETRA via OpenAIRE


See at: ISTI Repository Open Access | doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2015 Conference article Open Access
A network-based approach to evaluate the performance of football teams
Cintia P., Pappalardo L., Rinzivillo S.
The striking proliferation of sensing technologies that provide high-fidelity data streams extracted from every game has induced an amazing evolution of football statistics. Nowadays professional statistical analysis firms like ProZone and Opta provide data to football clubs, coaches and leagues, who are starting to analyze these data to monitor their players and improve team strategies. Standard approaches in evaluating and predicting team performance are based on history-related factors such as past victories or defeats, record in qualification games and margin of victory in past games. In contrast with traditional models, in this paper we propose a model based on the observation of players' behavior on the pitch. We model the game of a team as a network and extract simple network measures, showing the value of our approach in predicting the outcomes of a long-running tournament such as the Italian major league.
Source: Workshop on Machine Learning and Data Mining for Sports Analytics, pp. 46–54, Porto, Portugal, 11/09/2015

See at: ceur-ws.org Open Access | CNR ExploRA
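
To make the idea of modelling a team's game as a network more concrete, here is a minimal sketch that builds a weighted, directed pass network from (passer, receiver) pairs and derives one simple measure per player. The abstract above does not name the measures the authors extract, so the weighted degree used here is only an illustrative choice, and the function names and sample passes are hypothetical.

```python
from collections import defaultdict


def build_pass_network(passes):
    """Map each directed (passer, receiver) pair to its number of passes."""
    weights = defaultdict(int)
    for passer, receiver in passes:
        weights[(passer, receiver)] += 1
    return dict(weights)


def weighted_degrees(network):
    """Passes made plus passes received for every player in the network."""
    degree = defaultdict(int)
    for (passer, receiver), weight in network.items():
        degree[passer] += weight
        degree[receiver] += weight
    return dict(degree)


# Hand-made passes between three players (not real match data).
passes = [("A", "B"), ("B", "C"), ("A", "B"), ("C", "A")]
network = build_pass_network(passes)
degrees = weighted_degrees(network)
mean_degree = sum(degrees.values()) / len(degrees)
```

Team-level summaries such as the mean or spread of these per-player values are the kind of simple quantities that can then be compared across teams or matches.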


2015 Conference article Restricted
The harsh rule of the goals: Data-driven performance indicators for football teams
Cintia P., Pappalardo L., Pedreschi D., Giannotti F., Malvaldi M.
Sports analytics in general, and football (soccer in the USA) analytics in particular, have evolved in recent years in an amazing way, thanks to automated or semi-automated sensing technologies that provide high-fidelity data streams extracted from every game. In this paper we propose a data-driven approach and show that there is a large potential to boost the understanding of football team performance. From observational data of football games we extract a set of pass-based performance indicators and summarize them in the H indicator. We observe a strong correlation between the proposed indicator and the success of a team, and therefore perform a simulation on the four major European championships (78 teams, almost 1500 games). The outcome of each game in the championship was replaced by a synthetic outcome (win, loss or draw) based on the performance indicators computed for each team. We found that the final rankings in the simulated championships are very close to the actual rankings in the real championships, and show that teams with high ranking error show extreme values of a defense/attack efficiency measure, the Pezzali score. Our results are surprising given the simplicity of the proposed indicators, suggesting that a complex systems' view on football data has the potential of revealing hidden patterns and behavior of superior quality.
Source: IEEE International Conference on Data Science and Advanced Analytics, Paris, France, 19-21/10/2015
DOI: 10.1109/dsaa.2015.7344823
Project(s): CIMPLEX via OpenAIRE


See at: doi.org Restricted | ieeexplore.ieee.org Restricted | CNR ExploRA


2015 Master thesis Open Access
Storia e struttura del data Journalism (History and structure of data journalism)
Locci P.
The aims of this thesis are to analyze the birth and development of data journalism, starting from the journalistic investigations that shaped its evolution and examining the working methods of three Pulitzer Prize winners, Philip Meyer, Bill Dedman and Stephen K. Doig; and to examine the working methods and tools most used by the newsrooms that pay the most attention to data journalism, leading to the creation of an actual data journalism article, "La crisi economica e il declino del calcio italiano" ("The economic crisis and the decline of Italian football"), which relates data on the economic crisis to data on the decline of Italian football, which since 2010 has not lived up to its own footballing tradition. In this period, Italy has recorded a genuine collapse in results, attributable to a drop in investment that was not observed in the other European leagues, where, despite the crisis, investment increased.

See at: etd.adm.unipi.it Open Access | ISTI Repository Open Access | CNR ExploRA


2014 Contribution to book Restricted
Mobility profiling
Nanni M., Trasarti R., Cintia P., Furletti B., Gabrielli L., Rinzivillo S., Giannotti F.
An abstract is not available.
Source: Data Science and Simulation in Transportation Research, edited by Davy Janssens, Ansar-Ul-Haque Yasar, Luk Knapen, pp. 1–29. Hershey: IGI Global, 2014
DOI: 10.4018/978-1-4666-4920-0.ch001


See at: www.igi-global.com Restricted | CNR ExploRA


2014 Conference article Open Access
Mining efficient training patterns of non-professional cyclists (Discussion Paper)
Cintia P., Pappalardo L., Pedreschi D.
The recent emergence of so-called online social fitness opens up new scenarios for fascinating challenges in the field of data science. Through these platforms, users can collect, monitor and share with friends their sport performance, with interesting details about heart rate, watt consumption and calories burned. The availability of this data, collected among a large number of users, gives us the possibility to explore new data mining applications. In the current work, we present the results of a study conducted on a sample of 29,284 cyclists downloaded via APIs from the social fitness platform Strava.com. We defined two basic metrics: a measure of training effort, that is, how much a cyclist struggled during the workout; and a measure of training performance, indicating the results achieved during the training. Although the average effort is weakly correlated with the average performance, a deeper investigation of the time evolution of workouts and of cyclists' training characteristics yielded interesting findings. We found that the athletes who improve their performance the most follow precise training patterns, usually referred to as overcompensation theory, with alternation of stress peaks and rest periods. Studies and experiments related to this theory have, up to now, always been conducted by sports doctors on a few dozen professional athletes. To the best of our knowledge, our study is the first large-scale corroboration of this theory.
Source: SEBD 2014 - 22nd Italian Symposium on Advanced Database Systems, pp. 1–8, Sorrento Coast, Italy, 16-18 June 2014

See at: toc.proceedings.com Open Access | CNR ExploRA