
Academic year: 2021


(1)

Soccer Analytics: how Data Science is changing the “Beautiful Game”

L. Pappalardo

@lucpappalard

(4)

Sports Analytics

L. Bornn, D. Cervone, J. Fernandez, Soccer analytics: Unravelling the complexity of “the beautiful game”, Significance, 15: 26-29, 2018.

● Popularized by book and movie Moneyball

● Soccer: first to try, last to adopt

● On-field: performance analysis, soccer scouting

● Off-field: talent scouting, gambling, merchandising

(5)

Sports Analytics

C. Anderson, D. Sally, The numbers game: why everything you know about football is wrong, Penguin, 2013.


(6)
(7)

Charles Reep

1950s

(8)

Charles Reep 1950s

R. Pollard, Charles Reep (1904-2002): pioneer of notational and performance analysis in football, Journal of Sports Sciences 20(10):853-855, 2002.

2,194 matches annotated by hand from the 1950s to the 1990s

(9)

Long-ball theory 1950s

[Figure: frequency (%) by length of the pass chain that ended in a goal]

(10)

Long-ball theory 1950s

C. Reep and B. Benjamin, Skill and Chance in Association Football, Journal of the Royal Statistical Society. Series A (General), 131(4):581-585, 1968.

Not more than three passes. “If a team tries to play football and keeps it down to not more than three passes, it will have a much higher chance of winning matches. Passing for the sake of passing can be disastrous.”


(12)

Long-ball theory 1950s

[Figure: long/short passes and ranking]

(13)

Charles Reep

1950s

(14)

Valeri Lobanovskyi

1970s

(15)

Valeri Lobanovskyi 1970s

A.M. Zelentsov, V.V. Lobanovsky, МЕТОДОЛОГИЧЕСКИЕ ОСНОВЫ РАЗРАБОТКИ МОДЕЛЕЙ ТРЕНИРОВОЧНЫХ ЗАНЯТИЙ (Methodological Foundations for Developing Models of Training Sessions)

(16)

tagger

2010s

(17)

Soccer-logs

{'eventName': 'pass', 'eventSec': 8.221464, 'matchId': 2576132,
 'matchPeriod': '1H', 'playerId': 8306,
 'positions': [{'x': 42, 'y': 14}, {'x': 74, 'y': 33}],
 'subEventName': 'key pass', 'tags': ['accurate'],
 'teamId': 3158}
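A minimal sketch of how one soccer-log event like the one above could be read in Python. The helper names `pass_displacement` and `is_accurate_pass` are hypothetical, and the units of the `x` and `y` coordinates are an assumption (the slide does not state them), so the computed length is only meaningful in those same units.

```python
import math

# Event copied from the slide. The coordinate units are not stated there;
# this sketch simply treats them as plain numbers on a common scale.
event = {
    'eventName': 'pass', 'eventSec': 8.221464, 'matchId': 2576132,
    'matchPeriod': '1H', 'playerId': 8306,
    'positions': [{'x': 42, 'y': 14}, {'x': 74, 'y': 33}],
    'subEventName': 'key pass', 'tags': ['accurate'],
    'teamId': 3158,
}


def pass_displacement(ev):
    """Return (dx, dy, length) between the start and end positions."""
    start, end = ev['positions']
    dx = end['x'] - start['x']
    dy = end['y'] - start['y']
    return dx, dy, math.hypot(dx, dy)


def is_accurate_pass(ev):
    """True if the event is a pass carrying the 'accurate' tag."""
    return ev['eventName'] == 'pass' and 'accurate' in ev['tags']


dx, dy, length = pass_displacement(event)
```

For the event shown, the pass moves 32 units along `x` and 19 units along `y`.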

(18)

1700 events per match (on average)

{'eventName': 8, 'eventSec': 8.221464, 'id': 217097515,
 'matchId': 2576132, 'matchPeriod': '1H', 'playerId': 8306,
 'positions': [{'x': 42, 'y': 14}, {'x': 74, 'y': 33}],
 'subEventName': 83,
 'tags': [{'id': 1801}], 'teamId': 3158}

Identifiers: 8 = pass, 1801 = accurate

Performance vector

passes   xG    pressing   accuracy   ...
...      ...   ...        ...        ...
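As a rough illustration of how a performance vector might be aggregated from events, the sketch below counts passes and computes pass accuracy per player. It assumes the string-valued event schema of the earlier soccer-logs example rather than the numeric identifiers, and the toy events are hypothetical. Features such as xG or pressing need their own models and are left out.

```python
from collections import defaultdict


def performance_vectors(events):
    """Map each playerId to {'passes': count, 'accuracy': share accurate}."""
    counts = defaultdict(lambda: {'passes': 0, 'accurate': 0})
    for ev in events:
        if ev['eventName'] != 'pass':
            continue  # only passes feed these two features
        c = counts[ev['playerId']]
        c['passes'] += 1
        if 'accurate' in ev['tags']:
            c['accurate'] += 1
    return {
        pid: {'passes': c['passes'], 'accuracy': c['accurate'] / c['passes']}
        for pid, c in counts.items()
    }


# Hypothetical toy events, reduced to the fields this sketch needs.
toy_events = [
    {'eventName': 'pass', 'playerId': 8306, 'tags': ['accurate']},
    {'eventName': 'pass', 'playerId': 8306, 'tags': []},
    {'eventName': 'shot', 'playerId': 8306, 'tags': []},
    {'eventName': 'pass', 'playerId': 1234, 'tags': ['accurate']},
]
vectors = performance_vectors(toy_events)
```

Each player's entry is one row of the table above, restricted to the two features that can be read directly from the tags.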

(19)
(20)
(21)

Ranking soccer players

(22)

Evaluate and Rank teams

(23)

Passing network

J. Duch, J.S. Waitzman, L.A.N. Amaral, Quantifying the Performance of Individual Players in a Team Activity, PLoS One, 5(6), 2010.

J.L. Pena, H. Touchette, A network theory analysis of football strategies, arXiv:1206.6904v1, 2012.

(24)

Flow centrality

Flow centrality: a player’s betweenness centrality

J. Duch, J.S. Waitzman, L.A.N. Amaral, Quantifying the Performance of Individual Players in a Team Activity, PLoS One, 5(6), 2010.

J.L. Pena, H. Touchette, A network theory analysis of football strategies, arXiv:1206.6904v1, 2012.
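The idea of betweenness centrality on a passing network can be sketched in plain Python. This is only an illustration under simplifying assumptions: the network is directed and unweighted, the toy passes and player labels are hypothetical, and the score is the standard unnormalized betweenness computed with Brandes' algorithm, not necessarily the exact flow-centrality definition used in the cited papers.

```python
from collections import deque


def passing_network(passes):
    """Build a directed adjacency map from (passer, receiver) pairs."""
    graph = {}
    for src, dst in passes:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes, unweighted graph)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical passes: goalkeeper, defender, midfielder, forward.
toy_passes = [('GK', 'DF'), ('DF', 'MF'), ('MF', 'FW'), ('DF', 'GK')]
centrality = betweenness(passing_network(toy_passes))
```

In this toy network the defender and the midfielder each lie on two shortest passing paths between other players, while the goalkeeper and the forward lie on none.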


(26)

Flow centrality

(27)

Flow centrality

Team flow centrality

(28)

H indicator

(29)

H indicator

European Ranking, 2014

P. Cintia et al., The harsh rule of the goals: data-driven performance indicators for football teams, in Proc. of the 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2015.


(31)

H indicator simulation

P. Cintia et al., The harsh rule of the goals: data-driven performance indicators for football teams, in Proc. of the 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2015.

(32)

H indicator

(33)

H indicator

+11 points

+2 positions

(34)

H indicator

-13 points

-5 positions

(35)

H indicator

-17 points

-6 positions

(36)

Harshness

harshness

(37)

Passing network

Pros

• simple representation

• considers interactions

Cons

• only passes

• all passes are equal

(38)

5 seasons

18 competitions

30M events

20K matches

21K players

(39)

L. Pappalardo et al., PlayeRank: data-driven performance evaluation and player ranking in soccer via a machine learning approach, arXiv:1802.04987, 2018.

(40)

Feature Weighting

[Figure: the weight of each feature is unknown (= ?)]

(41)

Feature Weighting

team performance vector

(42)

Feature Weighting

[Figure: performance vectors (passes, xG, pressing, accuracy, ...) of team1 and team2, with the outcome marked as unknown (?)]

Pappalardo and Cintia (2017), Quantifying the relation between performance and success in soccer, Advances in Complex Systems, doi:10.1142/S021952591750014X
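One way to obtain feature weights from pairs of team performance vectors is to fit a linear classifier that predicts the match outcome and read off its coefficients. The slides do not name the model, so the sketch below is an assumption: a small logistic regression trained by batch gradient descent on the difference between the two teams' vectors. The function `fit_weights`, the two features and all numbers are hypothetical.

```python
import math


def fit_weights(rows, labels, epochs=500, lr=0.1):
    """Logistic regression without intercept, fitted by gradient descent."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in zip(rows, labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted win probability
            for i, xi in enumerate(x):
                grad[i] += (p - y) * xi
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    """Return 1 if team1 is predicted to win, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


# Hypothetical toy data: each row is team1's vector minus team2's vector
# for the features (passes, xG); label 1 means team1 won the match.
diffs = [
    [0.5, 1.0], [-0.3, 0.8], [0.2, -0.9],
    [-0.4, -1.2], [0.1, 0.6], [-0.1, -0.7],
]
outcomes = [1, 1, 0, 0, 1, 0]
weights = fit_weights(diffs, outcomes)
predictions = [predict(weights, x) for x in diffs]
```

In this toy data the xG difference always has the sign of the result while the passes difference does not, so the fitted weight for xG ends up larger in magnitude than the one for passes.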

(43)

Feature Weighting

76 features in total

(44)

Evaluating the weights

● stability across competitions and roles

● evaluation of resulting ranking

(45)

Are these weights “universal”?

[Figure: feature weights compared across leagues]

(50)

Rating Computation

performance rating of u in game g
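The formula on this slide did not survive extraction. As a hedged illustration only, the sketch below assumes the rating of player u in game g is a weighted sum of the player's feature values in that game, using the weights from the feature-weighting step. The function name, the two features and the numbers are hypothetical, and the actual method may add normalization or further terms.

```python
def performance_rating(weights, features):
    """Weighted sum of one player's feature values in one game (assumed form)."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical weights and feature values for player u in game g.
weights = {'passes': 0.5, 'xG': 2.0}
features_u_g = {'passes': 4, 'xG': 0.25}
rating = performance_rating(weights, features_u_g)
```

With these toy numbers the rating is 0.5 * 4 + 2.0 * 0.25 = 2.5.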

(51)
(52)

How to evaluate the evaluation?

algorithm expert 1 expert 2 expert 3

● majority agreement

● unanimity agreement
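One possible reading of the two agreement measures, sketched under assumptions: for each pair of players, the algorithm and every expert name the better player. Majority agreement is taken as the share of pairs where the algorithm matches the choice of most experts, and unanimity agreement as the share of unanimous pairs where it matches. The function name and the data are hypothetical.

```python
from collections import Counter


def agreement_rates(algorithm, experts):
    """Return (majority agreement, unanimity agreement) for the algorithm."""
    majority_hits = 0
    unanimous_pairs = unanimous_hits = 0
    for i, choice in enumerate(algorithm):
        votes = Counter(expert[i] for expert in experts)
        top_choice, top_count = votes.most_common(1)[0]
        # Count a hit only when a strict majority of experts backs the choice.
        if top_count > len(experts) / 2 and top_choice == choice:
            majority_hits += 1
        if top_count == len(experts):
            unanimous_pairs += 1
            if top_choice == choice:
                unanimous_hits += 1
    majority_rate = majority_hits / len(algorithm)
    unanimity_rate = (
        unanimous_hits / unanimous_pairs if unanimous_pairs else None
    )
    return majority_rate, unanimity_rate


# Hypothetical data: for each pair, 'A' or 'B' marks the player judged better.
algorithm = ['A', 'B', 'A', 'A']
experts = [
    ['A', 'B', 'B', 'A'],
    ['A', 'A', 'B', 'A'],
    ['A', 'B', 'A', 'B'],
]
rates = agreement_rates(algorithm, experts)
```

Here the algorithm matches the expert majority on three of the four pairs, and on the single pair where all three experts agree.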

(53)

Evaluation of 211 pairs

(54)

Evolution of players


(56)

Performance patterns

(57)

Versatility of players


(59)

In summary...

1. weights are similar across different leagues

2. World and Euro Cups slightly differ

3. ranking has high agreement with experts…

4. ...when a difference emerges between players

5. Future: exploring new ways of extracting weights that capture non-linearity

(60)

Open challenges

1. forecast the performance of players

2. search for the player(s) who best adapt to a team’s playing style

3. make substitutions during a match which maximize the probability of winning given the opponents

(61)

● L. Pappalardo et al. 2019. PlayeRank: Data-driven Performance Evaluation and Player Ranking in Soccer via a Machine Learning Approach. ACM TIST 10(5).

● L. Pappalardo et al. 2019. An open data set of spatio-temporal match events in soccer competitions. Nature Scientific Data 6:236.

(62)

Flow Centrality (FC)

Duch et al. (2010) Quantifying the Performance of Individual Players in a Team Activity. PLoS ONE 5(6): e10937.

fraction of a player’s accurate shots

Validation: 8 of the 20 players in the list of the competition’s best players

(63)

Pass Shot Value (PSV)

Brooks et al. (2016) Developing a Data-Driven Player Ranking in Soccer using Predictive Model Weights, SIGKDD

each pass is represented as a vector size=360

(64)

Pass Shot Value (PSV)

Brooks et al. (2016) Developing a Data-Driven Player Ranking in Soccer using Predictive Model Weights, SIGKDD

predicting if a possession ends in a shot

Validation: correlation with assists and goals

(65)

Role classification
