2020
Other  Open Access

Analysis of technical attributes of male and female national football teams: a comparison through a statistical machine learning approach

Pontillo G., Pappalardo L.

data science, sports analytics, soccer analytics, sports data, machine learning

Women's football has too often been compared with men's football mainly on the basis of players' physical attributes, which yields an incomplete picture when the characteristics of a football team are studied analytically. Thanks to the availability of an open soccer-logs data set provided by Wyscout, this thesis aims to statistically analyse and compare male and female national football teams based on their technical qualities, measured through the event data collected from the last World Cup championships. An event is an action, such as a pass, a shot, a foul or a save attempt, made by a team's player in a match. First results show, for example, that there are significant differences in key playing events, such as the number of passes, the percentage of accurate passes and the number of free kicks made by the national teams during a match. Using dedicated methods and algorithms, two groups of variables were computed: variables related to the technical characteristics of a team, such as the average time between two passes and the average ball possession recovery time, which can also describe the intensity of a game; and variables that summarize the individual and collective performance of a team's players in a single value, such as the H indicator or the players' ratings aggregated for each team via mean and standard deviation. For example, the higher the standard deviation of the ratings, the more the team was characterized, in that match, by players who individually outperformed their teammates. Finally, all these features were fed into classification algorithms such as Decision Tree, Random Forest and AdaBoost, with the task of classifying a team in a game as male (class 0) or female (class 1).
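As an illustration of the team-level features mentioned above, the following minimal sketch computes an average time between passes and a mean and standard deviation of player ratings. The event tuples, the function names and the use of the population standard deviation are assumptions made for this example; they do not reproduce the Wyscout data format or the exact definitions used in the thesis.

```python
from statistics import mean, pstdev

# Hypothetical, simplified event records: (team, event type, time in seconds).
# The real Wyscout soccer-logs carry many more fields.
events = [
    ("A", "pass", 2.0),
    ("A", "pass", 5.0),
    ("B", "foul", 7.0),
    ("A", "pass", 11.0),
    ("B", "pass", 12.0),
    ("B", "pass", 18.0),
]


def average_time_between_passes(events, team):
    """Mean gap in seconds between consecutive passes made by one team."""
    times = sorted(t for tm, kind, t in events if tm == team and kind == "pass")
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Fewer than two passes means no gap can be measured.
    return mean(gaps) if gaps else None


def rating_summary(ratings):
    """Aggregate per-player ratings into a team-level (mean, std) pair.

    Assumption: population standard deviation; the thesis may use another estimator.
    """
    return mean(ratings), pstdev(ratings)
```

A larger standard deviation returned by `rating_summary` corresponds to a match in which a few players stood out from their teammates.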
All the classifiers were validated through 10-fold cross-validation on a training set, and all showed good predictive performance, indicating that it is possible to distinguish a male football team from a female one (and vice versa) on the basis of technical skills. Moreover, after fitting a Decision Tree on different versions of the training set and inspecting the importance each variable had in the decision path each time, we find that the most important differences lie in variables such as the variability of players' individual performance, pass velocity, ball recovery time and the percentage of accurate passes made by the teams.
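The validation scheme described above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the data are synthetic (generated with `make_classification`), the feature count and hyperparameters are arbitrary, and none of it reproduces the thesis's actual features or results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the team-per-match feature table (assumption):
# each row is one team in one match, y is 0 (male) or 1 (female).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_redundant=0, random_state=0
)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# 10-fold cross-validation; for classifiers, an integer cv uses stratified folds.
results = {
    name: cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    for name, clf in classifiers.items()
}

for name, scores in results.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")

# Impurity-based feature importances of a single fitted tree, analogous in
# spirit to inspecting which variables drive the decision path.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
importances = tree.feature_importances_
```

Refitting the tree on different versions of the training set and comparing `importances` across fits would indicate which variables are consistently influential.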

Publisher: Consiglio Nazionale delle Ricerche



BibTeX entry
@misc{oai:it.cnr:prodotti:425773,
	title = {Analysis of technical attributes of male and female national football teams: a comparison through a statistical machine learning approach},
	author = {Pontillo G. and Pappalardo L.},
	publisher = {Consiglio Nazionale delle Ricerche},
	year = {2020}
}
