{ "cells": [ { "cell_type": "markdown", "id": "44465b73", "metadata": {}, "source": [ "# Sportec\n", "\n", "- [Load local event files](#load-local-event-files)\n", "- [Load local tracking files](#load-local-tracking-files)\n", "- [Load Open Data](#load-open-data)\n", "- [Load remote files](#load-remote-files)\n", "\n", "## Load local event files" ] }, { "cell_type": "code", "execution_count": 1, "id": "4f6455fb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>event_id</th>\n", " <th>event_type</th>\n", " <th>result</th>\n", " <th>success</th>\n", " <th>period_id</th>\n", " <th>timestamp</th>\n", " <th>end_timestamp</th>\n", " <th>ball_state</th>\n", " <th>ball_owning_team</th>\n", " <th>team_id</th>\n", " <th>player_id</th>\n", " <th>coordinates_x</th>\n", " <th>coordinates_y</th>\n", " <th>end_coordinates_x</th>\n", " <th>end_coordinates_y</th>\n", " <th>receiver_player_id</th>\n", " <th>set_piece_type</th>\n", " <th>body_part_type</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>17364900000006</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0.000</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>DFL-OBJ-0000SP</td>\n", " <td>56.41</td>\n", " <td>68.00</td>\n", " <td>77.75</td>\n", " <td>38.71</td>\n", " <td>DFL-OBJ-0000ZS</td>\n", " <td>KICK_OFF</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>17364900000007</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " 
<td>3.123</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>DFL-OBJ-0000ZS</td>\n", " <td>73.94</td>\n", " <td>37.21</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>DFL-OBJ-002G3I</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>17364900000014</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>31.797</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>DFL-CLU-00000A</td>\n", " <td>DFL-OBJ-00017V</td>\n", " <td>35.57</td>\n", " <td>68.00</td>\n", " <td>21.24</td>\n", " <td>28.58</td>\n", " <td>DFL-OBJ-0027B9</td>\n", " <td>THROW_IN</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>17364900000031</td>\n", " <td>SHOT</td>\n", " <td>BLOCKED</td>\n", " <td>False</td>\n", " <td>1</td>\n", " <td>79.480</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>DFL-OBJ-002706</td>\n", " <td>21.24</td>\n", " <td>28.58</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>RIGHT_FOOT</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>17364900000036</td>\n", " <td>PASS</td>\n", " <td>INCOMPLETE</td>\n", " <td>False</td>\n", " <td>1</td>\n", " <td>95.173</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>DFL-OBJ-002G3I</td>\n", " <td>8.72</td>\n", " <td>4.21</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " event_id event_type result success period_id timestamp \\\n", "0 17364900000006 PASS COMPLETE True 1 0.000 \n", "1 17364900000007 PASS COMPLETE True 1 3.123 \n", "2 17364900000014 PASS COMPLETE True 1 31.797 \n", "3 17364900000031 SHOT BLOCKED False 1 79.480 \n", "4 17364900000036 PASS INCOMPLETE False 1 
95.173 \n", "\n", " end_timestamp ball_state ball_owning_team team_id player_id \\\n", "0 None alive None DFL-CLU-000004 DFL-OBJ-0000SP \n", "1 None alive None DFL-CLU-000004 DFL-OBJ-0000ZS \n", "2 None alive None DFL-CLU-00000A DFL-OBJ-00017V \n", "3 None alive None DFL-CLU-000004 DFL-OBJ-002706 \n", "4 None alive None DFL-CLU-000004 DFL-OBJ-002G3I \n", "\n", " coordinates_x coordinates_y end_coordinates_x end_coordinates_y \\\n", "0 56.41 68.00 77.75 38.71 \n", "1 73.94 37.21 NaN NaN \n", "2 35.57 68.00 21.24 28.58 \n", "3 21.24 28.58 NaN NaN \n", "4 8.72 4.21 NaN NaN \n", "\n", " receiver_player_id set_piece_type body_part_type \n", "0 DFL-OBJ-0000ZS KICK_OFF None \n", "1 DFL-OBJ-002G3I None None \n", "2 DFL-OBJ-0027B9 THROW_IN None \n", "3 None None RIGHT_FOOT \n", "4 None None None " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from kloppy import sportec\n", "\n", "dataset = sportec.load_event(\n", " event_data=\"../../kloppy/tests/files/sportec_events.xml\",\n", " meta_data=\"../../kloppy/tests/files/sportec_meta.xml\",\n", " \n", " # Optional arguments\n", " coordinates=\"sportec\",\n", " event_types=[\"pass\", \"shot\"]\n", ")\n", "\n", "dataset.to_df().head()" ] }, { "cell_type": "markdown", "id": "81989fc6", "metadata": {}, "source": [ "## Load local tracking files" ] }, { "cell_type": "code", "execution_count": 3, "id": "958f17ee", "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>period_id</th>\n", " <th>timestamp</th>\n", " <th>frame_id</th>\n", " <th>ball_state</th>\n", " <th>ball_owning_team_id</th>\n", " 
<th>ball_x</th>\n", " <th>ball_y</th>\n", " <th>ball_z</th>\n", " <th>ball_speed</th>\n", " <th>DFL-OBJ-002G3I_x</th>\n", " <th>...</th>\n", " <th>DFL-OBJ-002G3I_d</th>\n", " <th>DFL-OBJ-002G3I_s</th>\n", " <th>DFL-OBJ-002G5S_x</th>\n", " <th>DFL-OBJ-002G5S_y</th>\n", " <th>DFL-OBJ-002G5S_d</th>\n", " <th>DFL-OBJ-002G5S_s</th>\n", " <th>DFL-OBJ-002FVJ_x</th>\n", " <th>DFL-OBJ-002FVJ_y</th>\n", " <th>DFL-OBJ-002FVJ_d</th>\n", " <th>DFL-OBJ-002FVJ_s</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>0.00</td>\n", " <td>10000</td>\n", " <td>dead</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>2.69</td>\n", " <td>0.26</td>\n", " <td>0.06</td>\n", " <td>0.00</td>\n", " <td>0.35</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>0.00</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>1</td>\n", " <td>0.04</td>\n", " <td>10001</td>\n", " <td>alive</td>\n", " <td>DFL-CLU-00000A</td>\n", " <td>3.41</td>\n", " <td>0.26</td>\n", " <td>0.08</td>\n", " <td>65.59</td>\n", " <td>0.34</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>1.74</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>0.08</td>\n", " <td>10002</td>\n", " <td>alive</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>4.22</td>\n", " <td>0.33</td>\n", " <td>0.09</td>\n", " <td>65.16</td>\n", " <td>0.32</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>1.76</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>1</td>\n", " <td>0.12</td>\n", " <td>10003</td>\n", " <td>alive</td>\n", " 
<td>DFL-CLU-000004</td>\n", " <td>5.02</td>\n", " <td>0.38</td>\n", " <td>0.09</td>\n", " <td>74.34</td>\n", " <td>0.31</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>1.78</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>1</td>\n", " <td>0.16</td>\n", " <td>10004</td>\n", " <td>alive</td>\n", " <td>DFL-CLU-000004</td>\n", " <td>5.79</td>\n", " <td>0.44</td>\n", " <td>0.08</td>\n", " <td>73.58</td>\n", " <td>0.29</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>1.80</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>5 rows × 21 columns</p>\n", "</div>" ], "text/plain": [ " period_id timestamp frame_id ball_state ball_owning_team_id ball_x \\\n", "0 1 0.00 10000 dead DFL-CLU-000004 2.69 \n", "1 1 0.04 10001 alive DFL-CLU-00000A 3.41 \n", "2 1 0.08 10002 alive DFL-CLU-000004 4.22 \n", "3 1 0.12 10003 alive DFL-CLU-000004 5.02 \n", "4 1 0.16 10004 alive DFL-CLU-000004 5.79 \n", "\n", " ball_y ball_z ball_speed DFL-OBJ-002G3I_x ... DFL-OBJ-002G3I_d \\\n", "0 0.26 0.06 0.00 0.35 ... None \n", "1 0.26 0.08 65.59 0.34 ... None \n", "2 0.33 0.09 65.16 0.32 ... None \n", "3 0.38 0.09 74.34 0.31 ... None \n", "4 0.44 0.08 73.58 0.29 ... 
None \n", "\n", " DFL-OBJ-002G3I_s DFL-OBJ-002G5S_x DFL-OBJ-002G5S_y DFL-OBJ-002G5S_d \\\n", "0 0.00 NaN NaN None \n", "1 1.74 NaN NaN None \n", "2 1.76 NaN NaN None \n", "3 1.78 NaN NaN None \n", "4 1.80 NaN NaN None \n", "\n", " DFL-OBJ-002G5S_s DFL-OBJ-002FVJ_x DFL-OBJ-002FVJ_y DFL-OBJ-002FVJ_d \\\n", "0 NaN NaN NaN None \n", "1 NaN NaN NaN None \n", "2 NaN NaN NaN None \n", "3 NaN NaN NaN None \n", "4 NaN NaN NaN None \n", "\n", " DFL-OBJ-002FVJ_s \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", "[5 rows x 21 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from kloppy import sportec\n", "\n", "\n", "dataset = sportec.load_tracking(\n", " raw_data=\"../../kloppy/tests/files/sportec_positional.xml\",\n", " meta_data=\"../../kloppy/tests/files/sportec_meta.xml\",\n", " \n", " # Optional arguments\n", " sample_rate=1,\n", " limit=10,\n", " coordinates=\"sportec\",\n", " only_alive=False\n", ")\n", "\n", "dataset.to_df().head()" ] }, { "cell_type": "markdown", "id": "0b4471cd", "metadata": {}, "source": [ "## Load Open Data\n", "The [Sportec Open DFL Tracking and Event Data (Bassek et al. 2025)](https://www.nature.com/articles/s41597-025-04505-y) dataset contains 7 open games.\n", "\n", "You can use the following match IDs to load these games automatically with kloppy.\n", "\n", "| match_id | home | away |\n", "| --- | --- | --- |\n", "| J03WMX | 1. FC Köln | FC Bayern München |\n", "| J03WN1 | VfL Bochum 1848 | Bayer 04 Leverkusen |\n", "| J03WPY | Fortuna Düsseldorf | 1. FC Nürnberg |\n", "| J03WOH | Fortuna Düsseldorf | SSV Jahn Regensburg |\n", "| J03WQQ | Fortuna Düsseldorf | FC St. Pauli |\n", "| J03WOY | Fortuna Düsseldorf | F.C. Hansa Rostock |\n", "| J03WR9 | Fortuna Düsseldorf | 1. 
FC Kaiserslautern |" ] }, { "cell_type": "code", "execution_count": null, "id": "8d434c80", "metadata": {}, "outputs": [], "source": [ "from kloppy import sportec\n", "\n", "match_id = \"J03WMX\"\n", "event_dataset = sportec.load_open_event_data(match_id=match_id)\n", "tracking_dataset = sportec.load_open_tracking_data(match_id=match_id)" ] }, { "cell_type": "markdown", "id": "964133be", "metadata": {}, "source": [ "## Load remote files\n", "Kloppy supports remote files through the `fsspec` filesystem interface under the hood. This lets you work with files in AWS S3, Google Cloud, Azure Blob, HDFS, FTP, and SFTP without extra tools.\n", "For example, you can pass:\n", "- Individual S3 file paths (e.g. `raw_data=s3://.../sportec_positional.xml`)\n", "\n", "Note: Kloppy might throw an error the first time to help you identify missing cloud-specific dependencies such as `s3fs`." ] }, { "cell_type": "code", "execution_count": null, "id": "576fd98a", "metadata": {}, "outputs": [], "source": [ "from kloppy import sportec\n", "\n", "dataset = sportec.load_tracking(\n", " raw_data=\"s3://.../sportec_positional.xml\",\n", " meta_data=\"s3://.../sportec_meta.xml\",\n", " \n", " # Optional arguments\n", " sample_rate=1,\n", " limit=10,\n", " coordinates=\"sportec\",\n", " only_alive=False\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 5 }