Which indicators matter? Using performance indicators to predict in-game success-related events in association football

Steffen Lang; Thomas Wimmer; Alexander Erben; Daniel Link

doi:10.2478/ijcss-2025-0011


International Journal of Computer Science in Sport

Volume 24 (2025): Issue 2 (June 2025)

By: Steffen Lang, Thomas Wimmer, Alexander Erben and Daniel Link

Open Access

Jul 2025

Figures & Tables

Visualization of the In-Play Prediction Masking approach, the rolling window approach, and the utilized PIs, PGs, and IW and PW window sizes. The lower band visualizes the rolling window, in which all windows are moved one step further. The number of configuration feature levels is given in brackets.
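The rolling window idea in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that IW and PW denote an input window and a prediction window, that time is counted in whole minutes, that the windows advance one minute per step, and that the prediction window starts where the input window ends. The name `rolling_windows` and the 5-minute prediction window are hypothetical.

```python
from typing import Iterator, NamedTuple


class Window(NamedTuple):
    iw_start: int  # first minute of the input window (inclusive)
    iw_end: int    # end of the input window (exclusive), i.e. the prediction time
    pw_end: int    # end of the prediction window (exclusive)


def rolling_windows(match_len: int, iw: int, pw: int, step: int = 1) -> Iterator[Window]:
    """Yield every input/prediction window pair that fits inside the match.

    Assumption: the prediction window begins at the minute the input window ends.
    """
    t = iw  # earliest prediction time with a full input window behind it
    while t + pw <= match_len:
        yield Window(t - iw, t, t + pw)
        t += step  # move all windows one step further


# Hypothetical configuration: 90-minute match, 15-minute IW, 5-minute PW.
windows = list(rolling_windows(match_len=90, iw=15, pw=5))
print(windows[0], windows[-1], len(windows))
```

Only data inside the input window would be visible to a model at prediction time; everything after `iw_end` is masked and used solely to derive the label.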

Pearson’s inter-correlation of PIs and PGs with a window size of 15 minutes ordered for readability. Complete results are shown in Table 3 in the Appendix.

MCC results of all experiments split by MLMs and PGs. Each boxplot contains the results of 700 experiments, except for PG_isEntr3rd with 250 experiments. The red dotted line indicates MCC = 0.
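For readers unfamiliar with the metric, the sketch below computes MCC from confusion-matrix counts. It assumes that MCC denotes the Matthews correlation coefficient, the usual meaning in binary classification; the function name `mcc` and the convention of returning 0 for an undefined value are choices made for this illustration, and the counts in the example are made up.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (e.g. the model predicts a
    single class only), which is a common convention, not a fixed rule.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# A classifier that always predicts "no event" on imbalanced data:
# high accuracy (95%), but MCC stays at the chance level of 0.
print(mcc(tp=0, tn=95, fp=0, fn=5))
```

This is why the dotted MCC = 0 line is a natural reference in the boxplots: values above it indicate predictions better than chance even when the predicted event is rare.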

MCC results of all experiments split by PIs. Each boxplot contains the results of 545 experiments. Boxplots are sorted by their mean MCC in descending order. The red dotted line indicates MCC = 0.

a) Top 3 and b) Bottom 3 PIs for each PG. PIs are sorted by their mean MCC results in descending order. Each boxplot contains the results of 125 experiments, except for PG_isEntr3rd with 45 experiments. The red dotted line indicates MCC = 0.

Percentage share, within the Top 10% of PI combinations in the application scenario (Part III), of a) the selected Top 10 individual PIs, b) the number of individual PIs combined, and c) the input window length.

Application of the trained model (rank 3) to an unseen match between FC Bayern Munich (FCB) and SC Paderborn (SCP) in Season 19/20, which ended 3:2. For each team (FCB in the upper half, SCP in the lower half), important events (corner kicks, given cards, goals scored, shots taken), PI_Danger and PI_Entr3rd, and prediction values are illustrated over the course of the match. Dominance by Link et al. (2016) and the goal prediction difference between both teams, our proposed match momentum metric, are also shown. Additionally, eight important sequences (P1-8) are highlighted.

Ranking of PIs for PG_isEntr3rd. PIs are sorted by their mean MCC results in descending order. The red dotted line indicates MCC = 0.

Ranking of PIs for PG_isEntrBox. PIs are sorted by their mean MCC results in descending order. The red dotted line indicates MCC = 0.

Ranking of PIs for PG_isCorner. PIs are sorted by their mean MCC results in descending order. The red dotted line indicates MCC = 0.

Ranking of PIs for PG_isShot. PIs are sorted by their mean MCC results in descending order. The red dotted line indicates MCC = 0.

Ranking of PIs for PG_isGoal. PIs are sorted by their mean MCC results in descending order. The red dotted line indicates MCC = 0.

Application of the 1st ranked model to an unseen match between FC Bayern Munich (FCB) and SC Paderborn (SCP) in Season 19/20, which ended 3:2. For each team (FCB in the upper half, SCP in the lower half), important events (corner kicks, given cards, goals scored, shots taken), Dominance by Link et al. (2016), and the prediction difference, our proposed match momentum metric, are shown.

Application of the 2nd ranked model to an unseen match between FC Bayern Munich (FCB) and SC Paderborn (SCP) in Season 19/20, which ended 3:2. For each team (FCB in the upper half, SCP in the lower half), important events (corner kicks, given cards, goals scored, shots taken), Dominance by Link et al. (2016), and the prediction difference, our proposed match momentum metric, are shown.

An overview of the five utilized machine learning models applied with default configurations in the study and the minor individual changes in configuration that were determined as optimal in hyperparameter tuning experiments on subsets of the data.

Machine Learning Model	Python library	Configuration	URL to documentation
Logistic Regression (LR)	Scikit-learn 1.4.0	class weights in loss: balanced (weighting inversely proportional to class frequencies in the training set); training metric: binary cross entropy	https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression
Gaussian Naive Bayes (NB)	Scikit-learn 1.4.0		https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
Support Vector Machine (SVM)	Scikit-learn 1.4.0	class weights in loss: balanced; training metric: hinge loss	https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
K-Nearest-Neighbors (KNN)	Scikit-learn 1.4.0	n_neighbors: 2; weights: Euclidean distance	https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html#sklearn.neighbors.KNeighborsClassifier
Neural Network (NN)	PyTorch 2.3.0	class weights in loss: balanced; 4 hidden layers with 256, 512, 128, and 16 channels, respectively; learning rate: 0.001 with exponential decay; optimizer: Adam; training until convergence of validation loss (early stopping); dropout probability: 0.4; activation function: LeakyReLU (Sigmoid in the last layer); training metric: binary cross entropy	https://pytorch.org/docs/stable/nn.html
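The four scikit-learn rows of this table could be instantiated roughly as shown below. This is a hedged sketch rather than the authors' code: all parameters not named in the table are left at library defaults, and the KNN entry "weights: Euclidean distance" is read here as distance-weighted voting (`weights="distance"`) with scikit-learn's default Euclidean metric, which is an interpretation. The toy data is made up, and the PyTorch network is omitted.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Settings taken from the table; everything else stays at library defaults.
models = {
    "LR": LogisticRegression(class_weight="balanced"),
    "NB": GaussianNB(),
    "SVM": SVC(class_weight="balanced"),
    # Assumption: distance-weighted votes with the default Euclidean metric.
    "KNN": KNeighborsClassifier(n_neighbors=2, weights="distance"),
}

# Made-up, imbalanced toy data: two features, binary label.
X = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [0.3, 1.0], [1.0, 0.1], [1.1, 0.0]]
y = [0, 0, 0, 0, 1, 1]

for name, model in models.items():
    model.fit(X, y)
    print(name, model.predict([[1.0, 0.0]]))
```

The `class_weight="balanced"` option reweights samples inversely to class frequency, which matters here because most prediction goals, goals in particular, occur in only a small share of windows.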

a) Performance indicators (PIs) and b) Prediction goals (PGs) of a team and their definitions utilized in our study. PIs 1–14 are individual PIs, and PIs 15–28 are the differences between the individual PIs and those of the opponent team. The definition of a PI is always based on the performance of the respective team in an interval and is either an event performed or a metric based on actions of the team. A PG is defined as the event happening at least once in the respective prediction window for the team.

No	Abbreviation	Definition
a) Performance Indicators (PI)
1	PI_Corner	Number of corner kicks
2	PI_EntrBox	Number of entries of a player with ball possession into the opponent box
3	PI_Entr3rd	Number of entries of a player with ball possession into the attacking third
4	PI_Goal	Number of goals scored
5	PI_Shot	Number of shot attempts
6	PI_Cross	Number of crosses
7	PI_TackWon	Number of tackles won
8	PI_PassBox	Number of successful passes in or into the opponent box
9	PI_Pass3rd	Number of successful passes in or into the attacking third
10	PI_BP	Time of ball possession
11	PI_BPBox	Time of ball possession in the box
12	PI_BP3rd	Time of ball possession in the attacking third
13	PI_OutpOpp	Number of outplayed opponent players by successful passes
14	PI_Danger	Goal scoring probability at each moment (Link et al., 2016)
15-28	PI_{PI_diff}	Difference between the two teams' PI values (Team – Opponent)
b) Prediction Goals (PG)
1	PG_isGoal	A goal event for the team occurs
2	PG_isShot	A shot event for the team occurs
3	PG_isCorner	A corner kick event for the team occurs
4	PG_isEntrBox	An entry into the opponent box performed by the team occurs
5	PG_isEntr3rd	An entry into the attacking third performed by the team occurs
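The definitions in this table translate into a few lines of code. The sketch below derives a count-based PI, its difference PI, and a binary PG from event timestamps. The helper `count_in_window`, the minute-based timestamps, the window boundaries, and the shot times are all hypothetical and serve only to illustrate the definitions.

```python
from typing import Sequence


def count_in_window(event_minutes: Sequence[float], start: float, end: float) -> int:
    """Number of events with start <= minute < end."""
    return sum(start <= m < end for m in event_minutes)


# Made-up shot timestamps (match minutes) for a team and its opponent.
team_shots = [3, 12, 14, 17]
opp_shots = [8]

# PI_Shot: number of shot attempts by the team in the input window [0, 15).
pi_shot = count_in_window(team_shots, 0, 15)

# PI_{Shot_diff}: team value minus opponent value over the same window.
pi_shot_diff = pi_shot - count_in_window(opp_shots, 0, 15)

# PG_isShot: 1 if at least one team shot falls in the prediction window [15, 20).
pg_is_shot = int(count_in_window(team_shots, 15, 20) >= 1)

print(pi_shot, pi_shot_diff, pg_is_shot)
```

Time-based PIs such as PI_BP would sum possession durations instead of counting events, but the difference PI and the PG label follow the same pattern.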

Pearson’s inter-correlation results of PIs and PGs for the window length of 15 minutes. Top 3 (green) and Bottom 3 (red) PIs per PG are highlighted. The best and worst results per PG are bold.

	PG_isGoal	PG_isCorner	PG_isShot	PG_isEntrBox	PG_isEntr3rd
PI_{BP_diff}	.048	.225	.333	.443	.573
PI_BP	.053	.201	.319	.414	.535
PI_{Pass3rd_diff}	.127	.216	.340	.403	.529
PI_Pass3rd	.130	.219	.312	.371	.473
PI_{OutpOpp_diff}	.114	.198	.342	.419	.522
PI_OutpOpp	.092	.207	.292	.400	.486
PI_{BP3rd_diff}	.118	.201	.355	.398	.507
PI_BP3rd	.109	.227	.333	.391	.468
PI_{Entr3rd_diff}	.107	.197	.343	.391	.505
PI_Entr3rd	.089	.227	.331	.401	.503
PI_{Danger_diff}	.125	.190	.319	.380	.481
PI_Danger	.124	.207	.303	.387	.459
PI_{Cross_diff}	.104	.190	.251	.324	.387
PI_Cross	.106	.204	.242	.323	.343
PI_{EntrBox_diff}	.106	.166	.255	.312	.368
PI_EntrBox	.077	.191	.228	.324	.336
PI_{Shot_diff}	.071	.112	.183	.286	.322
PI_Shot	.057	.104	.136	.227	.258
PI_{BPBox_diff}	.092	.130	.199	.218	.276
PI_BPBox	.036	.133	.140	.224	.222
PI_{Corner_diff}	.069	.083	.142	.205	.253
PI_Corner	.062	.124	.130	.196	.213
PI_{PassBox_diff}	.058	.111	.125	.164	.220
PI_PassBox	.057	.078	.086	.145	.186
PI_{TackWon_diff}	.052	.074	.058	.138	.087
PI_TackWon	.012	.157	.060	.162	.138
PI_{Goal_diff}	.038	−.054	−.043	−.022	−.099
PI_Goal	−.022	−.075	−.052	−.028	−.087
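Each cell in this table is a Pearson correlation between a PI value and a binary PG. The sketch below shows how one such coefficient could be computed in plain Python. It is illustrative only: the function name `pearson` and the sample values are made up and do not reproduce any figure from the table.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (dev_x * dev_y)


# Made-up example: one PI value per window and the matching binary PG label.
pi_values = [310.0, 420.0, 150.0, 505.0, 275.0, 460.0]
pg_labels = [0, 1, 0, 1, 0, 1]
r = pearson(pi_values, pg_labels)
print(round(r, 3))
```

With a binary label, this coefficient is the point-biserial correlation, so it measures how far the mean PI value differs between windows followed by the event and windows not followed by it.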


DOI: https://doi.org/10.2478/ijcss-2025-0011 | Journal eISSN: 1684-4769


Language: English

Page range: 16 - 44

Published on: Jul 31, 2025

Published by: International Association of Computer Science in Sport

In partnership with: Paradigm Publishing Services

Publication frequency: 2 issues per year

Keywords:

Performance Indicators

Related subjects:

Computer sciences,

Databases and data mining,

Computer sciences, other,

Sports and recreation,

Physical education,

Sports and recreation, other

© 2025 Steffen Lang, Thomas Wimmer, Alexander Erben, Daniel Link, published by International Association of Computer Science in Sport
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
