{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exporting event data to a dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "Start by loading some event data with kloppy. This demonstration uses the StatsBomb open event data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/cw/dtaijupiter/NoCsBack/dtai/pieterr/Projects/kloppy/kloppy/_providers/statsbomb.py:83: UserWarning: \n", "\n", "You are about to use StatsBomb public data.\n", "By using this data, you are agreeing to the user agreement. \n", "The user agreement can be found here: https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf\n", "\n", " warnings.warn(\n" ] } ], "source": [ "from kloppy import statsbomb\n", "\n", "# Load the StatsBomb open dataset\n", "dataset = statsbomb.load_open_data(\n", "    match_id=15946,\n", "    # Optional arguments\n", "    coordinates=\"statsbomb\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that the data loaded correctly by displaying the home and away teams." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Barcelona vs Deportivo Alavés'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get teams\n", "home_team, away_team = dataset.metadata.teams\n", "f\"{home_team} vs {away_team}\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Filter the `shot` events from the dataset. 
" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Only keep shots\n", "shots = dataset.filter(\"shot\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Convert to Polars DataFrame\n", "Parameters:\n", "\n", "- `player_id`: Includes the player’s unique identifier.\n", "\n", "- Lambda function: Extracts `player_name` and `is_goal` status from each shot event.\n", " - `player_name`: Converts the player object to a string.\n", " - `is_goal`: Retrieves whether the shot resulted in a goal (True or False).\n", "\n", "- `coordinates_*`: Includes all coordinate-related fields in the dataset.\n", "\n", "- `prev_pass_player`: Captures the player who made the pass before the shot.\n", "\n", "- `engine=\"polars\"`: Specifies the use of the Polars library for DataFrame processing. Alternatively, using `engine=\"pandas\"` would convert the dataset into a Pandas DataFrame. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div><style>\n", ".dataframe > thead > tr,\n", ".dataframe > tbody > tr {\n", " text-align: right;\n", " white-space: pre-wrap;\n", "}\n", "</style>\n", "<small>shape: (28, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>player_id</th><th>player_name</th><th>is_goal</th><th>coordinates_x</th><th>coordinates_y</th><th>prev_pass_player</th></tr><tr><td>str</td><td>str</td><td>bool</td><td>f64</td><td>f64</td><td>str</td></tr></thead><tbody><tr><td>"5503"</td><td>"Lionel Andrés Messi Cuccittini"</td><td>false</td><td>111.45</td><td>52.85</td><td>"Ivan Rakitić"</td></tr><tr><td>"5211"</td><td>"Jordi Alba Ramos"</td><td>false</td><td>113.85</td><td>26.35</td><td>"Lionel Andrés Messi Cuccittini"</td></tr><tr><td>"5503"</td><td>"Lionel Andrés Messi Cuccittini"</td><td>false</td><td>93.65</td><td>34.65</td><td>"Rubén Duarte Sánchez"</td></tr><tr><td>"6613"</td><td>"Rubén Sobrino 
Pozuelo"</td><td>false</td><td>109.15</td><td>39.05</td><td>"Manuel Alejandro García Sánche…</td></tr><tr><td>"5246"</td><td>"Luis Alberto Suárez Díaz"</td><td>false</td><td>107.75</td><td>24.65</td><td>"Ousmane Dembélé"</td></tr><tr><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td></tr><tr><td>"6935"</td><td>"Adrián Marín Gómez"</td><td>false</td><td>114.45</td><td>32.75</td><td>"Ibai Gómez Pérez"</td></tr><tr><td>"3501"</td><td>"Philippe Coutinho Correia"</td><td>false</td><td>113.15</td><td>31.35</td><td>"Lionel Andrés Messi Cuccittini"</td></tr><tr><td>"3501"</td><td>"Philippe Coutinho Correia"</td><td>true</td><td>105.25</td><td>33.35</td><td>"Arthur Henrique Ramos de Olive…</td></tr><tr><td>"5503"</td><td>"Lionel Andrés Messi Cuccittini"</td><td>false</td><td>106.55</td><td>46.75</td><td>"Sergi Roberto Carnicer"</td></tr><tr><td>"5503"</td><td>"Lionel Andrés Messi Cuccittini"</td><td>true</td><td>111.45</td><td>36.15</td><td>"Luis Alberto Suárez Díaz"</td></tr></tbody></table></div>" ], "text/plain": [ "shape: (28, 6)\n", "┌───────────┬──────────────────────┬─────────┬───────────────┬───────────────┬─────────────────────┐\n", "│ player_id ┆ player_name ┆ is_goal ┆ coordinates_x ┆ coordinates_y ┆ prev_pass_player │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ str ┆ bool ┆ f64 ┆ f64 ┆ str │\n", "╞═══════════╪══════════════════════╪═════════╪═══════════════╪═══════════════╪═════════════════════╡\n", "│ 5503 ┆ Lionel Andrés Messi ┆ false ┆ 111.45 ┆ 52.85 ┆ Ivan Rakitić │\n", "│ ┆ Cuccittini ┆ ┆ ┆ ┆ │\n", "│ 5211 ┆ Jordi Alba Ramos ┆ false ┆ 113.85 ┆ 26.35 ┆ Lionel Andrés Messi │\n", "│ ┆ ┆ ┆ ┆ ┆ Cuccittini │\n", "│ 5503 ┆ Lionel Andrés Messi ┆ false ┆ 93.65 ┆ 34.65 ┆ Rubén Duarte │\n", "│ ┆ Cuccittini ┆ ┆ ┆ ┆ Sánchez │\n", "│ 6613 ┆ Rubén Sobrino ┆ false ┆ 109.15 ┆ 39.05 ┆ Manuel Alejandro │\n", "│ ┆ Pozuelo ┆ ┆ ┆ ┆ García Sánche… │\n", "│ 5246 ┆ Luis Alberto Suárez ┆ false ┆ 107.75 ┆ 24.65 ┆ Ousmane Dembélé │\n", "│ ┆ Díaz ┆ ┆ ┆ ┆ │\n", "│ 
… ┆ … ┆ … ┆ … ┆ … ┆ … │\n", "│ 6935 ┆ Adrián Marín Gómez ┆ false ┆ 114.45 ┆ 32.75 ┆ Ibai Gómez Pérez │\n", "│ 3501 ┆ Philippe Coutinho ┆ false ┆ 113.15 ┆ 31.35 ┆ Lionel Andrés Messi │\n", "│ ┆ Correia ┆ ┆ ┆ ┆ Cuccittini │\n", "│ 3501 ┆ Philippe Coutinho ┆ true ┆ 105.25 ┆ 33.35 ┆ Arthur Henrique │\n", "│ ┆ Correia ┆ ┆ ┆ ┆ Ramos de Olive… │\n", "│ 5503 ┆ Lionel Andrés Messi ┆ false ┆ 106.55 ┆ 46.75 ┆ Sergi Roberto │\n", "│ ┆ Cuccittini ┆ ┆ ┆ ┆ Carnicer │\n", "│ 5503 ┆ Lionel Andrés Messi ┆ true ┆ 111.45 ┆ 36.15 ┆ Luis Alberto Suárez │\n", "│ ┆ Cuccittini ┆ ┆ ┆ ┆ Díaz │\n", "└───────────┴──────────────────────┴─────────┴───────────────┴───────────────┴─────────────────────┘" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Convert the kloppy dataset to a Polars DataFrame\n", "shots.to_df(\n", "    \"player_id\",\n", "    lambda event: {\n", "        \"player_name\": str(event.player),\n", "        \"is_goal\": event.result.is_success,\n", "    },\n", "    \"coordinates_*\",\n", "    prev_pass_player=lambda event: str(event.prev(\"pass\").player),\n", "    engine=\"polars\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Filter using lambda functions\n", "You can also filter events with a lambda function. For example, this snippet keeps only the actions performed by left center-backs (LCBs) and converts them into a Polars DataFrame." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div><style>\n", ".dataframe > thead > tr,\n", ".dataframe > tbody > tr {\n", " text-align: right;\n", " white-space: pre-wrap;\n", "}\n", "</style>\n", "<small>shape: (252, 5)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>player_id</th><th>player_name</th><th>event_name</th><th>coordinates_x</th><th>coordinates_y</th></tr><tr><td>str</td><td>str</td><td>str</td><td>f64</td><td>f64</td></tr></thead><tbody><tr><td>"6855"</td><td>"Guillermo Alfonso Maripán Loay…</td><td>"Ball Receipt*"</td><td>33.75</td><td>27.95</td></tr><tr><td>"6855"</td><td>"Guillermo Alfonso Maripán Loay…</td><td>"carry"</td><td>33.75</td><td>27.95</td></tr><tr><td>"6855"</td><td>"Guillermo Alfonso Maripán Loay…</td><td>"pass"</td><td>36.75</td><td>27.25</td></tr><tr><td>"5492"</td><td>"Samuel Yves Umtiti"</td><td>"Ball Receipt*"</td><td>36.55</td><td>33.25</td></tr><tr><td>"5492"</td><td>"Samuel Yves Umtiti"</td><td>"carry"</td><td>36.55</td><td>33.25</td></tr><tr><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td></tr><tr><td>"5492"</td><td>"Samuel Yves Umtiti"</td><td>"pass"</td><td>57.95</td><td>21.55</td></tr><tr><td>"6855"</td><td>"Guillermo Alfonso Maripán Loay…</td><td>"pass"</td><td>12.25</td><td>42.05</td></tr><tr><td>"5492"</td><td>"Samuel Yves Umtiti"</td><td>"interception"</td><td>15.65</td><td>39.65</td></tr><tr><td>"5492"</td><td>"Samuel Yves Umtiti"</td><td>"pass"</td><td>15.65</td><td>39.65</td></tr><tr><td>"6855"</td><td>"Guillermo Alfonso Maripán Loay…</td><td>"pressure"</td><td>15.45</td><td>41.35</td></tr></tbody></table></div>" ], "text/plain": [ "shape: (252, 5)\n", "┌───────────┬─────────────────────────────────┬───────────────┬───────────────┬───────────────┐\n", "│ player_id ┆ player_name ┆ event_name ┆ coordinates_x ┆ coordinates_y │\n", "│ --- ┆ --- ┆ --- ┆ --- ┆ --- │\n", "│ str ┆ str ┆ str ┆ f64 ┆ f64 │\n", 
"╞═══════════╪═════════════════════════════════╪═══════════════╪═══════════════╪═══════════════╡\n", "│ 6855 ┆ Guillermo Alfonso Maripán Loay… ┆ Ball Receipt* ┆ 33.75 ┆ 27.95 │\n", "│ 6855 ┆ Guillermo Alfonso Maripán Loay… ┆ carry ┆ 33.75 ┆ 27.95 │\n", "│ 6855 ┆ Guillermo Alfonso Maripán Loay… ┆ pass ┆ 36.75 ┆ 27.25 │\n", "│ 5492 ┆ Samuel Yves Umtiti ┆ Ball Receipt* ┆ 36.55 ┆ 33.25 │\n", "│ 5492 ┆ Samuel Yves Umtiti ┆ carry ┆ 36.55 ┆ 33.25 │\n", "│ … ┆ … ┆ … ┆ … ┆ … │\n", "│ 5492 ┆ Samuel Yves Umtiti ┆ pass ┆ 57.95 ┆ 21.55 │\n", "│ 6855 ┆ Guillermo Alfonso Maripán Loay… ┆ pass ┆ 12.25 ┆ 42.05 │\n", "│ 5492 ┆ Samuel Yves Umtiti ┆ interception ┆ 15.65 ┆ 39.65 │\n", "│ 5492 ┆ Samuel Yves Umtiti ┆ pass ┆ 15.65 ┆ 39.65 │\n", "│ 6855 ┆ Guillermo Alfonso Maripán Loay… ┆ pressure ┆ 15.45 ┆ 41.35 │\n", "└───────────┴─────────────────────────────────┴───────────────┴───────────────┴───────────────┘" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "left_centerbacks_actions = dataset.filter(\n", " lambda event: event.player\n", " and event.player.starting_position\n", " and str(event.player.starting_position.code) == \"LCB\"\n", ").to_df(\n", " \"player_id\",\n", " lambda event: {\"player_name\": str(event.player), \"event_name\": event.event_name},\n", " \"coordinates_*\",\n", " engine=\"polars\",\n", ")\n", "\n", "left_centerbacks_actions" ] } ], "metadata": { "kernelspec": { "display_name": "/home/pieterr/Jupiter/Projects/kloppy", "language": "python", "name": "kloppy" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 4 }