Getting NBA Shots in Context: Analysing Basketball Shots with Graph Embeddings

Marc Schmid; Moritz Schöpf; Otto Kolbinger

doi:10.2478/ijcss-2025-0005


International Journal of Computer Science in Sport

Volume 24 (2025): Issue 1 (February 2025)

By: Marc Schmid, Moritz Schöpf and Otto Kolbinger

Open Access

May 2025

Figures & Tables

Left: Tracking data provided by SportsVU, Right: Video footage of Stephen Curry shooting a 3pt shot over Anthony Davis in a game of the Golden State Warriors against the New Orleans Pelicans

Left: Spatio-temporal subgraph of tracking data from multiple time frames. Spatial edges are grey (in the x-y plane), and temporal edges are lime (in the time dimension). Right: Spatio-temporal graph for a game situation with seven frames before the shot. Red and blue nodes are attackers and defenders, respectively. Green nodes are the ball. Each player is annotated with their position (G: Guard, F: Forward, C: Center). Rings of the same color around the players indicate the players' identity. Spatial edges are grey and indicate the distance between the nodes. The shooter node has a magenta outline. Temporal edges are green and connect nodes from neighboring frames. The numbers indicate the temporal delay in time frames.
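The graph structure described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_spatio_temporal_edges`, the input layout, the fully connected spatial edges within each frame, and the restriction of temporal edges to directly consecutive frames are all assumptions made for illustration.

```python
import math
from itertools import combinations


def build_spatio_temporal_edges(frames):
    """Build spatial and temporal edge lists from tracking frames.

    frames: list of dicts mapping an entity id (player or ball) to an
    (x, y) position, ordered in time. This layout is hypothetical.
    """
    spatial, temporal = [], []
    for t, frame in enumerate(frames):
        # Spatial edges: connect every pair of nodes within one frame
        # (an assumption) and weight each edge by Euclidean distance.
        for a, b in combinations(sorted(frame), 2):
            spatial.append(((t, a), (t, b), math.dist(frame[a], frame[b])))
        # Temporal edges: connect the same entity in neighboring frames;
        # the weight is the delay in frames (always 1 in this sketch).
        if t + 1 < len(frames):
            for entity in frame:
                if entity in frames[t + 1]:
                    temporal.append(((t, entity), (t + 1, entity), 1))
    return spatial, temporal


frames = [
    {"ball": (0.0, 0.0), "shooter": (3.0, 4.0)},
    {"ball": (1.0, 1.0), "shooter": (3.0, 4.0)},
]
spatial_edges, temporal_edges = build_spatio_temporal_edges(frames)
```

Nodes are identified by a `(frame index, entity id)` pair, so the same player appears once per frame and is linked to itself across time by the temporal edges.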

Calibration curves (main plot) and the corresponding distribution of predicted probabilities (top-left detail box) for the six differently trained models that predict the expected shot quality.

Normalized values of the learned feature mask by the GNNExplainer model for the shot classification task with output node shooter and mean of all embeddings (orange). The graph shows the difference in importance for an average aggregate over all nodes, and the average shooter node.

Left: Embeddings of the neural network predictions, averaged per shooter and colored by position. Pure guards are turquoise, pure forwards are pink, and pure centers are red. Guard-Forwards are blue and Forward-Centers are green. Right: Embeddings of every single situation, colored by 2-point (red) and 3-point (blue) shots. Both panels are displayed with a t-SNE embedding visualization.

Left: Shot of Serge Ibaka. Right: Shot from Blake Griffin. The attacking team is displayed in red with the shooter in magenta. The defending team is displayed in blue and the ball in dark green. Spatial edges are drawn in black/grey and temporal edges in lime. The thickness of the lines and nodes indicates the attention values, while the darker colour indicates the stronger attention of the network.

Stephen Curry’s similarity measures compared: Embedding Similarity (ours), RAPTOR, RAPM, averaged season statistics and the correlation coefficient of the weight matrix of the NMF.

Rank	Embedding Similarity	RAPTOR	RAPM	Normalized Stats	NMF-weight-correlation
1	K. Lowry (0.008)	C. Paul (26)	C. Paul (4.01)	B. Beal (0.074)	D. Augustin (0.85)
2	Se. Curry (0.009)	D. Wade (22)	L. James (3.9)	K. Irving (0.074)	E. Gordon (0.80)
3	P. Mills (0.011)	L. James (20)	D. Green (3.56)	I. Thomas (0.077)	Ryan Anderson (0.80)
4	M. Williams (0.012)	K. Leonard (18)	R. Gobert (3.39)	K. Walker (0.081)	Jodie Meeks (0.79)
5	D. Lillard (0.013)	J. Harden (17)	P. Patterson (3.33)	D. Lillard (0.088)	Ersan Ilyasova (0.78)

Features and number of players for each reference model.

Player	Position (x and y)	Velocity (x and y)	Acceleration (x and y)	Basket (distance and angle)	Nominal Position	Height
Shooter	x	x	x	x	x	x
Closest Defender	x	x	x	x	x	x
Second Closest Defender	x	x	x	x	x	x

AUC, accuracy, F1-score, log loss, Brier loss and ECE for logistic regression, naive Bayes, gradient-boosted classifier, multi-layer perceptron, GNN and GNN trained with Brier loss. Arrows point upwards (downwards) if a higher (lower) value represents better performance.

Model	AUC ↑	Accuracy ↑	F1 ↑	Log loss ↓	Brier loss ↓	ECE ↓
Logistic Regression	0.5914	0.5861	0.6188	0.6727	0.2399	0.0294
Naïve Bayes	0.5777	0.4699	0.6187	2.0678	0.4238	0.4887
Gradient-Boosted Classifier	0.5989	0.5895	0.6205	0.6696	0.2384	0.0278
Multi-Layer Perceptron	0.5690	0.57822	0.6187	0.7915	0.2712	0.1573
GNN - NLL	0.6102	0.6069	0.6245	0.6693	0.2379	0.073
GNN - Brier	0.6174	0.6093	0.6259	0.6522	0.2287	0.0263
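For readers unfamiliar with the two calibration-oriented columns, the following is a minimal sketch of how Brier loss and expected calibration error (ECE) are commonly computed for a binary made/missed prediction. The function names, the ten equal-width bins and the toy inputs are assumptions for illustration; the binning scheme used in the article is not stated here.

```python
def brier_loss(probs, labels):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Sample-weighted mean gap between mean predicted probability and
    observed hit rate, over equal-width probability bins (an assumption)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        hit_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / len(probs) * abs(mean_prob - hit_rate)
    return ece


# Toy example: two shots predicted at 0.8, one made and one missed.
toy_probs = [0.8, 0.8]
toy_labels = [1, 0]
toy_brier = brier_loss(toy_probs, toy_labels)
toy_ece = expected_calibration_error(toy_probs, toy_labels)
```

Both measures reward probabilities that match observed frequencies, which is why they are read alongside the calibration plot rather than accuracy alone.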

Hyperparameter boundary settings for the optimization of the baseline models via Optuna.

Model	Framework	Hyperparameter
MLP	sklearn	Layers: 1 < x < 6; Neurons: x < 400; lr: 0.0001 < x < 0.01
GBC	sklearn	Max depth: 3 < x < 10; n_estimators: 10 < x < 200
NB	sklearn	-
Log. Reg.	sklearn	-


DOI: https://doi.org/10.2478/ijcss-2025-0005 | Journal eISSN: 1684-4769


Language: English

Page range: 73 - 93

Published on: May 14, 2025

Published by: International Association of Computer Science in Sport

In partnership with: Paradigm Publishing Services

Publication frequency: 2 issues per year

Keywords: Basketball Analytics, Graph Neural Networks, Embeddings

Related subjects: Computer sciences; Databases and data mining; Computer sciences, other; Sports and recreation; Physical education; Sports and recreation, other

© 2025 Marc Schmid, Moritz Schöpf, Otto Kolbinger, published by International Association of Computer Science in Sport
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
