Exporting event data to a dataframe
Setup
Start by loading some event data with the kloppy package. This demonstration uses StatsBomb open event data.
from kloppy import statsbomb

# Load the StatsBomb open dataset
dataset = statsbomb.load_open_data(
    match_id=15946,
    # Optional arguments
    coordinates="statsbomb",
)
/cw/dtaijupiter/NoCsBack/dtai/pieterr/Projects/kloppy/kloppy/_providers/statsbomb.py:83: UserWarning: You are about to use StatsBomb public data. By using this data, you are agreeing to the user agreement. The user agreement can be found here: https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf warnings.warn(
Test if the loading worked by printing the home and away teams.
# Get teams
home_team, away_team = dataset.metadata.teams
f"{home_team} vs {away_team}"
'Barcelona vs Deportivo Alavés'
Filter the shot events from the dataset
# Only keep shots
shots = dataset.filter("shot")
Convert to Polars DataFrame
Parameters:
- player_id: Includes the player's unique identifier.
- Lambda function: Extracts player_name and is_goal status from each shot event.
  - player_name: Converts the player object to a string.
  - is_goal: Retrieves whether the shot resulted in a goal (True or False).
- coordinates_*: Includes all coordinate-related fields in the dataset.
- prev_pass_player: Captures the player who made the pass before the shot.
- engine="polars": Specifies the use of the Polars library for DataFrame processing. Alternatively, using engine="pandas" would convert the dataset into a Pandas DataFrame.
# Convert kloppy dataset to Polars DataFrame
shots.to_df(
    "player_id",
    lambda event: {
        "player_name": str(event.player),
        "is_goal": event.result.is_success,
    },
    "coordinates_*",
    prev_pass_player=lambda event: str(event.prev("pass").player),
    engine="polars",
)
| player_id | player_name | is_goal | coordinates_x | coordinates_y | prev_pass_player |
|---|---|---|---|---|---|
| str | str | bool | f64 | f64 | str |
| "5503" | "Lionel Andrés Messi Cuccittini" | false | 111.45 | 52.85 | "Ivan Rakitić" |
| "5211" | "Jordi Alba Ramos" | false | 113.85 | 26.35 | "Lionel Andrés Messi Cuccittini" |
| "5503" | "Lionel Andrés Messi Cuccittini" | false | 93.65 | 34.65 | "Rubén Duarte Sánchez" |
| "6613" | "Rubén Sobrino Pozuelo" | false | 109.15 | 39.05 | "Manuel Alejandro García Sánche… |
| "5246" | "Luis Alberto Suárez Díaz" | false | 107.75 | 24.65 | "Ousmane Dembélé" |
| … | … | … | … | … | … |
| "6935" | "Adrián Marín Gómez" | false | 114.45 | 32.75 | "Ibai Gómez Pérez" |
| "3501" | "Philippe Coutinho Correia" | false | 113.15 | 31.35 | "Lionel Andrés Messi Cuccittini" |
| "3501" | "Philippe Coutinho Correia" | true | 105.25 | 33.35 | "Arthur Henrique Ramos de Olive… |
| "5503" | "Lionel Andrés Messi Cuccittini" | false | 106.55 | 46.75 | "Sergi Roberto Carnicer" |
| "5503" | "Lionel Andrés Messi Cuccittini" | true | 111.45 | 36.15 | "Luis Alberto Suárez Díaz" |
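The four kinds of column specification used above (a plain column name, a lambda returning several columns, a wildcard pattern, and a keyword lambda) can be pictured with a small standard-library sketch. This is only a conceptual model and not kloppy's implementation: the `to_records` helper, the flat event dicts, and the `first_name` column are hypothetical and introduced purely for illustration. The two sample events reuse values from rows of the output above.

```python
from fnmatch import fnmatch


def to_records(events, *columns, **named_columns):
    """Build one dict per event from a mix of column specifications.

    Hypothetical helper that mimics the *shape* of the to_df call above;
    it operates on plain dicts, not on kloppy event objects.
    """
    records = []
    for event in events:
        row = {}
        for column in columns:
            if callable(column):
                # A positional callable returns a dict with several columns.
                row.update(column(event))
            elif "*" in column:
                # A wildcard pattern selects every field whose name matches.
                row.update({k: v for k, v in event.items() if fnmatch(k, column)})
            else:
                # A plain string selects a single field by name.
                row[column] = event[column]
        for name, func in named_columns.items():
            # A keyword callable yields one column, named after the keyword.
            row[name] = func(event)
        records.append(row)
    return records


# Toy events (assumed flat layout), with values taken from the table above.
events = [
    {"player_id": "5503", "player": "Lionel Andrés Messi Cuccittini",
     "is_goal": False, "coordinates_x": 111.45, "coordinates_y": 52.85},
    {"player_id": "3501", "player": "Philippe Coutinho Correia",
     "is_goal": True, "coordinates_x": 105.25, "coordinates_y": 33.35},
]

rows = to_records(
    events,
    "player_id",
    lambda e: {"player_name": e["player"], "is_goal": e["is_goal"]},
    "coordinates_*",
    first_name=lambda e: e["player"].split()[0],
)

for row in rows:
    print(row)
```

Columns appear in the order the specifications are given, with keyword columns last, which matches the column order seen in the output above.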
Filter using lambda functions
You can also filter with a lambda function. For example, the snippet below keeps only the actions performed by players whose starting position is left center-back (LCB) and converts them into a Polars DataFrame.
left_centerbacks_actions = dataset.filter(
    lambda event: event.player
    and event.player.starting_position
    and str(event.player.starting_position.code) == "LCB"
).to_df(
    "player_id",
    lambda event: {"player_name": str(event.player), "event_name": event.event_name},
    "coordinates_*",
    engine="polars",
)
left_centerbacks_actions
| player_id | player_name | event_name | coordinates_x | coordinates_y |
|---|---|---|---|---|
| str | str | str | f64 | f64 |
| "6855" | "Guillermo Alfonso Maripán Loay… | "Ball Receipt*" | 33.75 | 27.95 |
| "6855" | "Guillermo Alfonso Maripán Loay… | "carry" | 33.75 | 27.95 |
| "6855" | "Guillermo Alfonso Maripán Loay… | "pass" | 36.75 | 27.25 |
| "5492" | "Samuel Yves Umtiti" | "Ball Receipt*" | 36.55 | 33.25 |
| "5492" | "Samuel Yves Umtiti" | "carry" | 36.55 | 33.25 |
| … | … | … | … | … |
| "5492" | "Samuel Yves Umtiti" | "pass" | 57.95 | 21.55 |
| "6855" | "Guillermo Alfonso Maripán Loay… | "pass" | 12.25 | 42.05 |
| "5492" | "Samuel Yves Umtiti" | "interception" | 15.65 | 39.65 |
| "5492" | "Samuel Yves Umtiti" | "pass" | 15.65 | 39.65 |
| "6855" | "Guillermo Alfonso Maripán Loay… | "pressure" | 15.45 | 41.35 |
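The chained `and` in the filter above guards against missing values: each attribute is checked before the next one is accessed, presumably because an event may have no player attached and a player may have no starting position. The sketch below reproduces that pattern with standard-library dataclasses. The `Position`, `Player`, and `Event` classes are simplified stand-ins, not kloppy's classes, and the "Player B" and "Player C" entries with their position codes are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Position:
    code: str


@dataclass
class Player:
    name: str
    starting_position: Optional[Position] = None


@dataclass
class Event:
    event_name: str
    player: Optional[Player] = None


def is_left_centerback(event: Event) -> bool:
    # `and` short-circuits: evaluation stops at the first falsy operand,
    # so `.starting_position` and `.code` are never read from None.
    return bool(
        event.player
        and event.player.starting_position
        and event.player.starting_position.code == "LCB"
    )


events = [
    Event("pass", Player("Samuel Yves Umtiti", Position("LCB"))),
    Event("pass", Player("Player B")),                    # no starting position
    Event("half_start"),                                  # no player attached
    Event("shot", Player("Player C", Position("RW"))),    # different position
]

kept = [event for event in events if is_left_centerback(event)]
print([event.player.name for event in kept])

# Without the guard, reading through a missing player raises AttributeError.
try:
    events[2].player.starting_position
except AttributeError as error:
    print(f"Unguarded access failed: {error}")
```

Wrapping the expression in `bool(...)` makes the predicate return a strict True or False rather than the last evaluated operand; a bare lambda, as in the snippet above, relies on truthiness instead.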