2019
Journal article  Open Access

PlayeRank: data-driven performance evaluation and player ranking in Soccer via a machine learning approach

Pappalardo L., Cintia P., Ferragina P., Massucco E., Pedreschi D., Giannotti F.

Football analytics  Artificial Intelligence  Data science  Sports analytics  Clustering  Ranking  FOS: Computer and information sciences  Artificial Intelligence (cs.AI)  Searching  Statistics - Applications  Theoretical Computer Science  Multi-dimensional analysis  Predictive modelling  Big data  Applications (stat.AP)  Soccer analytics  Computer Science - Artificial Intelligence 

The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this article, we design and implement PlayeRank, a data-driven framework that offers a principled, multi-dimensional, and role-aware evaluation of the performance of soccer players. We build our framework using a massive dataset of soccer-logs consisting of millions of match events from four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players' evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. Finally, we explore some applications of PlayeRank (i.e., searching players and player versatility), showing its flexibility and efficiency, which make it worth using in the design of a scalable platform for soccer analytics.

Source: ACM transactions on intelligent systems and technology (Print) 10 (2019). doi:10.1145/3343172

Publisher: Association for Computing Machinery, New York, NY, United States of America



BibTeX entry
@article{oai:it.cnr:prodotti:412397,
	title = {PlayeRank: data-driven performance evaluation and player ranking in Soccer via a machine learning approach},
	author = {Pappalardo L. and Cintia P. and Ferragina P. and Massucco E. and Pedreschi D. and Giannotti F.},
	publisher = {Association for Computing Machinery, New York, NY, United States of America},
	doi = {10.1145/3343172},
	eprint = {1802.04987},
	archivePrefix = {arXiv},
	journal = {ACM transactions on intelligent systems and technology (Print)},
	volume = {10},
	year = {2019}
}
