2015 Contribution to book Open Access

Towards a boosted route planner using individual mobility models
Guidotti R, Cintia P
Route planners generally return routes that minimize either the distance covered or the time traveled. However, these routes are rarely considered by people who move in a certain area systematically. Indeed, due to their expertise, they very often prefer different solutions. In this paper we provide an analytic model to study the deviations of the systematic movements from the paths proposed by a route planner. As a proxy of human mobility we use real GPS traces and we analyze a set of users who act in the provinces of Pisa and Florence. By using appropriate mobility data mining techniques, we extract the GPS systematic movements and we transform them into sequences of road segments. Finally, we calculate the shortest and fastest path from the origin to the destination of each systematic movement and we compare them with the routes mapped on the road network. Our results show that about 30-35% of the systematic movements follow the shortest paths, while the others follow routes which are on average 7 km longer. In addition, we divided the study area into cells and we analyzed the deviations in the flows of systematic movements. We found that these deviations are not only driven by individual mobility behaviors but are a signal of an existing common sense that could be exploited by a route planner.
DOI: 10.1007/978-3-662-49224-6_10
Project(s): PETRA via OpenAIRE

2013 Conference article Restricted

"Engine matters": a first large scale data driven study on cyclists' performance
Cintia P, Pappalardo L, Pedreschi D
The recent emergence of so-called online social fitness platforms constitutes a good proxy to study the patterns underlying success in sport. Through these platforms, users can collect, monitor and share with friends their sport performance, diet, and even burned calories, giving an unprecedented opportunity to answer very fascinating questions: What are the main factors that shape sport performance? What are the characteristics that distinguish successful sportsmen? Can we characterize the role of social influence on fitness behavior? In the current work, we present the results of a study conducted on a sample of 29,284 cyclists downloaded via APIs from the social fitness platform Strava.com. We defined two basic metrics: a measure of training effort, that is, how much a cyclist struggled during the workout; and a measure of training performance, indicating the results achieved during the training. Analyzing the relationship between these two metrics, an interesting result immediately emerges: at a global level, there is no correlation between effort and performance. This means that, in general, the performance is not simply a function of training: two athletes with the same level of training have different performance. However, by deeply investigating the time evolution of workouts and cyclists' training characteristics, we found that the athletes who improve their performance the most follow precise training patterns usually referred to as overcompensation theory, with alternation of stress peaks and rest periods. Studies and experiments related to this theory, up to now, have always been conducted by sports doctors on a few dozen professional athletes. To the best of our knowledge, our study is the first large-scale corroboration of this theory, mainly confirming that "engine matters", but tuning is fundamental.
DOI: 10.1109/icdmw.2013.41

See at: doi.org Restricted | CNR IRIS | CNR IRIS | www.dataminingcasestudies.com

2013 Conference article Restricted

Estimating time-dependent speed functions using a gravity model over road network
Cintia P, Trasarti R, Macedo J A, Almada L, Ferreira C
The availability of inexpensive tracking devices, such as GPS-enabled devices, gives the opportunity to collect large amounts of trajectory data from vehicles. In this context, we are interested in the problem of generating traffic information in time-dependent networks using this kind of data. This problem is not trivial, since several works in the literature use strong assumptions on the error distribution. We want to drop these assumptions, proposing a gravitational model method to compute road segment average speed from trajectory data. Furthermore, we show how to generate travel-time functions from the computed average speeds, which are useful for routing systems on time-dependent networks. Our approach allows creating an accurate picture of the traffic conditions in time and space. The method we present in this paper tackles all these aspects, showing its performance over a synthetic dataset and a real case.
Project(s): SEEK via OpenAIRE

See at: CNR IRIS Restricted | CNR IRIS

2015 Other Open Access

An effective time-aware map matching process for low sampling GPS data
Cintia P, Nanni M
In the era of the proliferation of Geo-Spatial Data, induced by the diffusion of GPS devices, the map matching problem still represents an important and valuable challenge. The process of associating a segment of the underlying road network to a GPS point gives us the chance to enrich raw data with the semantic layer provided by the roadmap, with all contextual information associated to it, e.g. the presence of speed limits, attraction points, changes in elevation, etc. Most state-of-the-art solutions for this classical problem simply look for the shortest or fastest path connecting any pair of consecutive points in a trip. While in some contexts that is reasonable, in this work we argue that the shortest/fastest path assumption can be in general erroneous. Indeed, we show that such approaches can yield travel times that are significantly incoherent with the real ones, and propose a Time-Aware Map matching process that tries to improve the state of the art by also taking this temporal aspect into account. Our algorithm proves to be very efficient, effective on low-sampling data and able to outperform existing solutions, as shown by experiments on large datasets of real GPS trajectories. Moreover, our algorithm is parameter-free and does not depend on specific characteristics of the GPS localization error and of the road network (e.g. density of roads, road network topology, etc.).

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2016 Other Open Access

Network-based performance indicators for football teams
Pappalardo L, Cintia P
Sports analytics has evolved in recent years in an amazing way, thanks to the sensing technologies that provide data streams extracted from every game. Despite the increasing wealth of data, there is not yet a consolidated repertoire of indicators for the various facets of team and player performance. In this poster we propose two data-driven approaches to measure the performance of football teams and football players.

See at: CNR IRIS Open Access | netsci-x.net | ISTI Repository | CNR IRIS Restricted

2016 Other Restricted

ASAP - Telecommunication Data Analytics (TDA) specification and early prototype
Bertoldi R, Cintia P, Trasarti R
The main objective of this Work Package (WP) is the design and development of an analytics application on WIND Telecommunications customer data, targeted towards tourism and mobility scenarios. The envisaged use cases will be integrated into the ASAP framework and will be evaluated using several measurement methods. At the end of the project's second year (M24) the tasks involved are three: the end of the task T9.2, the task T9.3 and the beginning of task T9.4.
Project(s): ASAP via OpenAIRE

See at: CNR IRIS Restricted | CNR IRIS

2016 Conference article Restricted

The Haka network: Evaluating rugby team performance with dynamic graph analysis
Cintia P, Pappalardo L, Coscia M
Real world events are intrinsically dynamic and analytic techniques have to take into account this dynamism. This aspect is particularly important in complex network analysis when relations are channels for interaction events between actors. Sensing technologies open the possibility of doing so for sport networks, enabling the analysis of team performance in a standard environment and under standard rules. Useful applications are directly related to improving playing quality, but can also shed light on all forms of team efforts that are relevant for work teams, large firms with coordination and collaboration issues and, as a consequence, economic development. In this paper, we consider dynamics over networks representing the interaction between rugby players during a match. We build a pass network and we introduce the concept of disruption network, building a multilayer structure. We perform both a global and a micro-level analysis on game sequences. When deploying our dynamic graph analysis framework on data from 18 rugby matches, we discover that structural features that make networks resilient to disruptions are a good predictor of a team's performance, both at the global and at the local level. Using our features, we are able to predict the outcome of the match with a precision comparable to state-of-the-art bookmaking.
DOI: 10.1109/asonam.2016.7752377
Project(s): SoBigData via OpenAIRE

See at: doi.org Restricted | CNR IRIS | ieeexplore.ieee.org | CNR IRIS

2018 Journal article Open Access

Quantifying the relation between performance and success in soccer
Pappalardo L, Cintia P
The availability of massive data about sports activities offers nowadays the opportunity to quantify the relation between performance and success. In this study, we analyze more than 6000 games and 10 million events in six European leagues and investigate this relation in soccer competitions. We discover that a team's position in a competition's final ranking is significantly related to its typical performance, as described by a set of technical features extracted from the soccer data. Moreover, we find that, while victories and defeats can be explained by the team's performance during a game, it is difficult to detect draws by using a machine learning approach. We then simulate the outcomes of an entire season of each league relying only on technical data and exploiting a machine learning model trained on data from past seasons. The simulation produces a team ranking which is similar to the actual ranking, suggesting that a complex systems' view on soccer has the potential of revealing hidden patterns regarding the relation between performance and success.
Source: ADVANCES IN COMPLEX SYSTEMS, vol. 21 (issue 3)
DOI: 10.1142/s021952591750014x
DOI: 10.48550/arxiv.1705.00885
Project(s): SoBigData via OpenAIRE

2017 Conference article Open Access

Who is going to get hurt? Predicting injuries in professional soccer
Rossi A, Pappalardo L, Cintia P, Fernandez J, Iaia Fm, Medina D
Injury prevention has a fundamental role in professional soccer due to the high cost of recovery for players and the strong influence of injuries on a club's performance. In this paper we provide a predictive model to prevent injuries of soccer players using a multidimensional approach based on GPS measurements and machine learning. In an evolutive scenario, where a soccer club starts collecting the data for the first time and updates the predictive model as the season goes by, our approach can detect around half of the injuries, allowing the soccer club to save 70% of a season's economic costs related to injuries. The proposed approach can be a valuable support for coaches, helping the soccer club to reduce injury incidence, save money and increase team performance.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 21-30. Skopje, Macedonia, 18 September 2017
Project(s): SoBigData via OpenAIRE

See at: ceur-ws.org Open Access | CNR IRIS | ISTI Repository | CNR IRIS Restricted

2018 Journal article Open Access

Effective injury forecasting in soccer with GPS training data and machine learning
Rossi A, Pappalardo L, Cintia P, Iaia F M, Fernandez J, Medina D
Injuries have a great impact on professional soccer, due to their large influence on team performance and the considerable costs of rehabilitation for players. Existing studies in the literature provide just a preliminary understanding of which factors mostly affect injury risk, while an evaluation of the potential of statistical models in forecasting injuries is still missing. In this paper, we propose a multi-dimensional approach to injury forecasting in professional soccer that is based on GPS measurements and machine learning. By using GPS tracking technology, we collect data describing the training workload of players in a professional soccer club during a season. We then construct an injury forecaster and show that it is both accurate and interpretable by providing a set of case studies of interest to soccer practitioners. Our approach opens a novel perspective on injury prevention, providing a set of simple and practical rules for evaluating and interpreting the complex relations between injury risk and training performance in professional soccer.
Source: PLOS ONE, vol. 13 (issue 7), pp. 1-15
DOI: 10.1371/journal.pone.0201264
DOI: 10.48550/arxiv.1705.08079
Project(s): SoBigData via OpenAIRE

2019 Software Metadata Only Access

PlayeRank
Cintia P, Pappalardo L
PlayeRank is a data-driven algorithm that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. PlayeRank is designed to work with soccer-logs, in which a match consists of a sequence of events, each encoded as a tuple (id, type, position, timestamp), where id is the identifier of the player that originated or is referred to by the event, type is the event type (e.g., pass, shot, goal, tackle), and position and timestamp denote the spatio-temporal coordinates of the event over the soccer field. PlayeRank assumes that soccer-logs are stored in a database, which is updated with new events after each soccer match. An exhaustive description of the PlayeRank framework is available in this paper: Pappalardo, Luca, Cintia, Paolo, Ferragina, Paolo, Massucco, Emanuele, Pedreschi, Dino & Giannotti, Fosca (2019) PlayeRank: Data-driven Performance Evaluation and Player Ranking in Soccer via a Machine Learning Approach. ACM Transactions on Intelligent Systems and Technology 10(5), DOI: https://doi.org/10.1145/3343172
Project(s): SoBigData via OpenAIRE

See at: github.com Restricted | CNR IRIS
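
The event tuple described in the PlayeRank entry above can be illustrated with a minimal Python sketch. This is not PlayeRank's actual data model or API: the class `Event`, its field names, the helper `events_per_player` and the sample values are all hypothetical, chosen only to mirror the (id, type, position, timestamp) layout.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Event(NamedTuple):
    """One soccer-log event; field names are illustrative assumptions."""
    player_id: int                 # "id": player the event refers to
    event_type: str                # "type": e.g. "pass", "shot", "tackle"
    position: Tuple[float, float]  # (x, y) coordinates on the field
    timestamp: float               # seconds from the start of the match


def events_per_player(log: Iterable[Event], wanted: str) -> Counter:
    """Count, for each player, the events of the requested type."""
    return Counter(e.player_id for e in log if e.event_type == wanted)


# Hypothetical four-event log, invented for illustration only.
match_log = [
    Event(7, "pass", (50.0, 30.0), 12.4),
    Event(7, "shot", (88.0, 45.0), 15.1),
    Event(10, "pass", (40.0, 60.0), 20.0),
    Event(7, "pass", (55.0, 35.0), 31.7),
]
pass_counts = events_per_player(match_log, "pass")
```

Keeping each event as an immutable tuple makes it straightforward to append new matches to a log and to aggregate per-player features afterwards.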

2020 Other Metadata Only Access

Predicting soccer game evolution through AI-based tracking data analysis
Quasso E., Pappalardo L., Cintia P.
Nowadays, technology is increasingly used in soccer. An open challenge is how to use the massive data produced by technology to create a framework to simulate different match situations and help trainers understand the dynamics on the field better. This thesis aims to extrapolate logical patterns that describe how the ball moves on the field in different game situations. We use tracking and event data of several matches to extract player and ball positions on the field. Then, we build two machine learning approaches. The first approach involves the use of handmade features passed to a Random Forest classifier. The second approach is a Convolutional Neural Network that automatically highlights valuable features to make a prediction. We show that the Random Forest provides a better understanding of the rules governing the movement of the ball than the Convolutional Neural Network. This result emphasizes that conditional control statements based on the position of the object on the field alongside handmade features work better than an automated feature extraction method based on deep learning.
Project(s): SoBigData via OpenAIRE

See at: etd.adm.unipi.it Restricted | CNR IRIS

2019 Other Metadata Only Access

Injury forecasting in soccer utilizing machine learning and multivariate time series
Guerrini L (candidate). Supervisors: Paolo Ferragina, Luca Pappalardo, Paolo Cintia
Injuries have a great impact on professional soccer due to their influence on team performance and the considerable costs of rehabilitation for players. In this thesis, we use injury records and workload data describing the training sessions of players in a professional soccer club, spanning two entire seasons, to train and compare three classes of approaches to injury forecasting, i.e., predicting whether or not a player will get injured in the next matches or training sessions. The first class of approaches is based on traditional techniques used in sports science and industry, such as the Acute Chronic Workload Ratio. The second class is based on machine learning tools such as decision trees and k-nearest neighbor classifiers. The third class of approaches extends the second class by fully exploiting the temporal information present in the data through the usage of a multivariate time series representation of a player's workload history. We demonstrate that machine learning approaches significantly outperform traditional techniques still used in the sports industry, raising prediction accuracy from 4% up to 50% and paving the way to a more accurate monitoring of the health status of soccer players.
Project(s): SoBigData via OpenAIRE

See at: etd.adm.unipi.it Restricted | CNR IRIS
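
The Acute Chronic Workload Ratio mentioned in the entry above as a traditional baseline can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the common rolling-average formulation with a 7-day acute window and a 28-day chronic window, and the function name `acwr`, its parameters and the sample loads are hypothetical.

```python
from typing import Optional, Sequence


def acwr(daily_loads: Sequence[float],
         acute_days: int = 7,
         chronic_days: int = 28) -> Optional[float]:
    """Ratio of the mean load over the last `acute_days` days to the mean
    load over the last `chronic_days` days (window sizes are assumptions)."""
    if len(daily_loads) < chronic_days:
        raise ValueError("need at least `chronic_days` days of workload data")
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        return None  # ratio is undefined when there was no chronic load
    return acute / chronic


# Hypothetical player: three steady weeks followed by one doubled week.
loads = [100.0] * 21 + [200.0] * 7
ratio = acwr(loads)  # acute mean 200, chronic mean 125
```

A ratio well above 1 signals a recent spike relative to the load the player is accustomed to; any threshold used to flag risk would be a further assumption on top of this sketch.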

2018 Other Metadata Only Access

Capturing football-teams behavior with a stochastic model
Barbone M (candidate). Supervisors: Paolo Ferragina, Luca Pappalardo, Paolo Cintia
This thesis aims to capture the behavior of soccer teams using a stochastic approach on a graph built on top of the dataset provided by Wyscout, a market-leading company in data scouting for soccer. The main contributions of the thesis are twofold. First, it proposes a stochastic representation of a soccer game via a weighted graph properly derived from the Wyscout dataset. Second, it analyzes every game through a stochastic model to detect the way teams move the ball together with the way they move on the field and the performance that they achieve.
Project(s): SoBigData via OpenAIRE

See at: etd.adm.unipi.it Restricted | CNR IRIS

2013 Conference article Restricted

Inferring human activities from GPS tracks
Furletti B., Cintia P., Renso C., Spinsanti L.
The collection of huge amounts of tracking data, made possible by the widespread use of GPS devices, has enabled the analysis of such data for several application domains, ranging from traffic management to advertisement and social studies. However, the raw positioning data, as detected by GPS devices, lacks semantic information, since these data do not natively provide any additional contextual information like the places that people visited or the activities performed. Traditionally, this information is collected through hand-filled questionnaires in which a limited number of users are asked to annotate their tracks with the activities they have done. With the purpose of getting large amounts of semantically rich trajectories, we propose an algorithm for automatically annotating raw trajectories with the activities performed by the users. To do this, we analyse the stop points, trying to infer the Point Of Interest (POI) the user has visited. Based on the category of the POI and a probability law, we infer the activity performed. We experimented with and evaluated the method in a real case study of car trajectories, manually annotated by users with their activities. We exploit the Gravity law and the nearby POIs for inferring the most probable activity performed by a user during a stop. Experimental results are encouraging and will drive our future work.
Source: UrbComp'13 - 2nd ACM SIGKDD International Workshop on Urban Computing, pp. 5–8, Chicago, USA, 11-14 August 2013
DOI: 10.1145/2505821.2505830
Project(s): DATA SIM via OpenAIRE

See at: dl.acm.org Restricted | doi.org | CNR ExploRA
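
The idea of ranking activities by a gravity-style attraction of nearby POIs, described in the entry above, can be sketched as follows. The abstract does not give the formula, so everything here is an assumption made for illustration: each POI has unit mass, its attraction decays with the inverse square of its distance from the stop, and attractions are summed per activity and normalised. The function name and the sample POIs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def activity_probabilities(pois: Iterable[Tuple[str, float]],
                           min_distance: float = 1.0) -> Dict[str, float]:
    """Map each activity to a probability from (activity, distance_m) pairs.

    Assumption: attraction = 1 / distance**2 with unit mass per POI.
    Distances are clamped to `min_distance` to avoid division by zero.
    """
    scores: Dict[str, float] = defaultdict(float)
    for activity, distance_m in pois:
        d = max(distance_m, min_distance)
        scores[activity] += 1.0 / (d * d)
    total = sum(scores.values())
    if total == 0:
        return {}  # no POIs near the stop
    return {activity: s / total for activity, s in scores.items()}


# Hypothetical stop with one shop at 10 m and two restaurants at 20 m.
nearby = [("shopping", 10.0), ("eating", 20.0), ("eating", 20.0)]
probs = activity_probabilities(nearby)
most_probable = max(probs, key=probs.get)
```

Under these assumptions a single close POI can outweigh several farther ones, which is the qualitative behaviour a gravity law is meant to capture.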

2014 Conference article Open Access

Mining efficient training patterns of non-professional cyclists (Discussion Paper)
Cintia P, Pappalardo L, Pedreschi D
The recent emergence of so-called online social fitness platforms opens up new scenarios for fascinating challenges in the field of data science. Through these platforms, users can collect, monitor and share with friends their sport performance, with interesting details about heart rate, watt consumption and calories burned. The availability of this data, collected among a large number of users, gives us the possibility to explore new data mining applications. In the current work, we present the results of a study conducted on a sample of 29,284 cyclists downloaded via APIs from the social fitness platform Strava.com. We defined two basic metrics: a measure of training effort, that is, how much a cyclist struggled during the workout; and a measure of training performance, indicating the results achieved during the training. Although the average effort is weakly correlated with the average performance, interesting findings emerged from a deep investigation of the time evolution of workouts and of cyclists' training characteristics. We found that the athletes who improve their performance the most follow precise training patterns usually referred to as overcompensation theory, with alternation of stress peaks and rest periods. Studies and experiments related to this theory, up to now, have always been conducted by sports doctors on a few dozen professional athletes. To the best of our knowledge, our study is the first large-scale corroboration of this theory.

See at: CNR IRIS Open Access | toc.proceedings.com | CNR IRIS Restricted

2015 Conference article Open Access

A network-based approach to evaluate the performance of football teams
Cintia P, Pappalardo L, Rinzivillo S
The striking proliferation of sensing technologies that provide high-fidelity data streams extracted from every game has induced an amazing evolution of football statistics. Nowadays professional statistical analysis firms like ProZone and Opta provide data to football clubs, coaches and leagues, who are starting to analyze these data to monitor their players and improve team strategies. Standard approaches in evaluating and predicting team performance are based on history-related factors such as past victories or defeats, record in qualification games and margin of victory in past games. In contrast with traditional models, in this paper we propose a model based on the observation of players' behavior on the pitch. We model the game of a team as a network and extract simple network measures, showing the value of our approach in predicting the outcomes of a long-running tournament such as the Italian major league.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 46-54. Porto, Portugal, 11/09/2015

See at: ceur-ws.org Open Access | CNR IRIS | CNR IRIS Restricted
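
Modelling a team's game as a network, as in the entry above, can be illustrated with a minimal sketch. It is not the paper's method: it assumes passes are given as (passer, receiver) pairs, builds a directed weighted graph, and takes the mean and variance of each player's pass volume as examples of "simple network measures". All function names and the sample passes are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def build_pass_network(passes: Iterable[Edge]) -> Dict[Edge, int]:
    """Directed weighted graph: edge (passer, receiver) -> number of passes."""
    weights: Dict[Edge, int] = defaultdict(int)
    for passer, receiver in passes:
        weights[(passer, receiver)] += 1
    return dict(weights)


def player_pass_volume(network: Dict[Edge, int]) -> Dict[str, int]:
    """Passes made plus passes received, for each player (weighted degree)."""
    volume: Dict[str, int] = defaultdict(int)
    for (passer, receiver), weight in network.items():
        volume[passer] += weight
        volume[receiver] += weight
    return dict(volume)


# Hypothetical five-pass sequence among players A, B and C.
passes = [("A", "B"), ("B", "C"), ("A", "B"), ("C", "A"), ("B", "A")]
network = build_pass_network(passes)
volume = player_pass_volume(network)
mean_volume = mean(volume.values())
volume_variance = pvariance(volume.values())
```

In this toy example a low variance would indicate that passing is spread evenly across players, while a high variance would indicate play concentrated on a few of them.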

2015 Conference article Restricted

The harsh rule of the goals: Data-driven performance indicators for football teams
Cintia P, Pappalardo L, Pedreschi D, Giannotti F, Malvaldi M
Sports analytics in general, and football (soccer in USA) analytics in particular, have evolved in recent years in an amazing way, thanks to automated or semi-automated sensing technologies that provide high-fidelity data streams extracted from every game. In this paper we propose a data-driven approach and show that there is a large potential to boost the understanding of football team performance. From observational data of football games we extract a set of pass-based performance indicators and summarize them in the H indicator. We observe a strong correlation between the proposed indicator and the success of a team, and therefore perform a simulation on the four major European championships (78 teams, almost 1500 games). The outcome of each game in the championship was replaced by a synthetic outcome (win, loss or draw) based on the performance indicators computed for each team. We found that the final rankings in the simulated championships are very close to the actual rankings in the real championships, and show that teams with high ranking error show extreme values of a defense/attack efficiency measure, the Pezzali score. Our results are surprising given the simplicity of the proposed indicators, suggesting that a complex systems' view on football data has the potential of revealing hidden patterns and behavior of superior quality.
DOI: 10.1109/dsaa.2015.7344823
Project(s): CIMPLEX via OpenAIRE

See at: doi.org Restricted | CNR IRIS | ieeexplore.ieee.org | CNR IRIS

2017 Journal article Open Access

Discovering and understanding city events with big data: the case of Rome
Furletti B, Trasarti R, Cintia P, Gabrielli L
The increasing availability of large amounts of data and digital footprints has given rise to ambitious research challenges in many fields, which span from medical research and the financial and commercial world to people and environmental monitoring. Whereas traditional data sources and censuses fail to capture actual and up-to-date behaviors, Big Data integrate the missing knowledge, providing useful and hidden information to analysts and decision makers. In this paper, we focus on the identification of city events by analyzing mobile phone data (Call Detail Record), and we study and evaluate the impact of these events on the typical city dynamics. We present an analytical process able to discover, understand and characterize city events from Call Detail Records, designing a distributed computation to implement Sociometer, a profiling tool to categorize phone users. The methodology provides a useful tool for city mobility managers to manage events and make future decisions on specific classes of users, i.e., residents, commuters and tourists.
Source: INFORMATION, vol. 8 (issue 3)
DOI: 10.3390/info8030074

2020 Other Metadata Only Access

A Computer Vision Approach for Pass Detection on Soccer Broadcast Video
Sorano D., Pappalardo L., Cintia P., Carrara F.
The annotation of the events that occur during a soccer match is a primary issue for companies that produce data for analytical purposes. Nowadays, the annotation is mostly manual, i.e., human operators use proprietary software to annotate the events. This thesis aims to automate part of the annotation process with a computer vision approach that can recognize one of the most frequent events in soccer: the pass. To achieve this purpose, we combine soccer broadcast videos and event data. Broadcast videos are the input of the models, while the event data define the labels of the videos. We propose a model that combines the pre-trained model ResNet18, applied to extract features from single frames, with a Bidirectional LSTM model that analyzes the temporal evolution of the extracted features. Moreover, we use the real-time object detection method YOLO to extract the positional information of the ball and the players inside each frame. This information is concatenated to the features extracted by the ResNet18 model and used as input to the Bidirectional LSTM. Our results show a significant improvement in the accuracy of pass detection with respect to baseline classifiers applied to the same task, highlighting that our approach is a first step towards the automation of event annotation in soccer.
Project(s): SoBigData via OpenAIRE

See at: etd.adm.unipi.it Restricted | CNR IRIS