{ "cells": [ { "cell_type": "markdown", "id": "a93fc613", "metadata": {}, "source": [ "# StatsPerform\n", "\n", "- [Load local event files](#load-local-event-files)\n", " - [MA1 and MA3](#ma1-and-ma3)\n", " - [Opta F7 and F24 (or F73)](#opta-f7-and-f24-or-f73)\n", "- [Load local tracking files](#load-local-tracking-files)\n", "- [Load remote files](#load-remote-files)\n", " \n", "## Load local event files\n", "\n", "### MA1 and MA3" ] }, { "cell_type": "code", "execution_count": null, "id": "78b9c1c0-b956-4662-885c-7273654ccaf8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>event_id</th>\n", " <th>event_type</th>\n", " <th>result</th>\n", " <th>success</th>\n", " <th>period_id</th>\n", " <th>timestamp</th>\n", " <th>end_timestamp</th>\n", " <th>ball_state</th>\n", " <th>ball_owning_team</th>\n", " <th>team_id</th>\n", " <th>player_id</th>\n", " <th>coordinates_x</th>\n", " <th>coordinates_y</th>\n", " <th>end_coordinates_x</th>\n", " <th>end_coordinates_y</th>\n", " <th>receiver_player_id</th>\n", " <th>set_piece_type</th>\n", " <th>pass_type</th>\n", " <th>body_part_type</th>\n", " <th>is_counter_attack</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>2328589789</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:00.030000</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>aksjicf4keobpav3tuujngell</td>\n", " <td>50.0</td>\n", " <td>50.0</td>\n", " <td>33.9</td>\n", " <td>51.0</td>\n", 
<td>None</td>\n", " <td>KICK_OFF</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>2328589863</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:02.075000</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>apdrig6xt1hxub1986s3uh1x</td>\n", " <td>33.9</td>\n", " <td>51.0</td>\n", " <td>34.0</td>\n", " <td>88.6</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>2328589885</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:06.803000</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>46vr35g415omy60ahbiv40wk5</td>\n", " <td>34.2</td>\n", " <td>88.6</td>\n", " <td>56.1</td>\n", " <td>92.7</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>CHIPPED_PASS</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>2328589929</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:09.373000</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>2nrmndj0uq3f46c2cb1fbf85</td>\n", " <td>56.8</td>\n", " <td>94.2</td>\n", " <td>48.0</td>\n", " <td>96.6</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>2328590577</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:40.486000</td>\n", " <td>None</td>\n", " <td>alive</td>\n", " <td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " 
<td>75xi6hloabmnjn2kzgj1g8h1s</td>\n", " <td>46vr35g415omy60ahbiv40wk5</td>\n", " <td>46.1</td>\n", " <td>94.7</td>\n", " <td>32.7</td>\n", " <td>73.8</td>\n", " <td>None</td>\n", " <td>FREE_KICK</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " event_id event_type result success period_id timestamp \\\n", "0 2328589789 PASS COMPLETE True 1 0 days 00:00:00.030000 \n", "1 2328589863 PASS COMPLETE True 1 0 days 00:00:02.075000 \n", "2 2328589885 PASS COMPLETE True 1 0 days 00:00:06.803000 \n", "3 2328589929 PASS COMPLETE True 1 0 days 00:00:09.373000 \n", "4 2328590577 PASS COMPLETE True 1 0 days 00:00:40.486000 \n", "\n", " end_timestamp ball_state ball_owning_team \\\n", "0 None alive 75xi6hloabmnjn2kzgj1g8h1s \n", "1 None alive 75xi6hloabmnjn2kzgj1g8h1s \n", "2 None alive 75xi6hloabmnjn2kzgj1g8h1s \n", "3 None alive 75xi6hloabmnjn2kzgj1g8h1s \n", "4 None alive 75xi6hloabmnjn2kzgj1g8h1s \n", "\n", " team_id player_id coordinates_x \\\n", "0 75xi6hloabmnjn2kzgj1g8h1s aksjicf4keobpav3tuujngell 50.0 \n", "1 75xi6hloabmnjn2kzgj1g8h1s apdrig6xt1hxub1986s3uh1x 33.9 \n", "2 75xi6hloabmnjn2kzgj1g8h1s 46vr35g415omy60ahbiv40wk5 34.2 \n", "3 75xi6hloabmnjn2kzgj1g8h1s 2nrmndj0uq3f46c2cb1fbf85 56.8 \n", "4 75xi6hloabmnjn2kzgj1g8h1s 46vr35g415omy60ahbiv40wk5 46.1 \n", "\n", " coordinates_y end_coordinates_x end_coordinates_y receiver_player_id \\\n", "0 50.0 33.9 51.0 None \n", "1 51.0 34.0 88.6 None \n", "2 88.6 56.1 92.7 None \n", "3 94.2 48.0 96.6 None \n", "4 94.7 32.7 73.8 None \n", "\n", " set_piece_type pass_type body_part_type is_counter_attack \n", "0 KICK_OFF None None None \n", "1 None None None None \n", "2 None CHIPPED_PASS None None \n", "3 None None None None \n", "4 FREE_KICK None None None " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from kloppy import statsperform\n", "\n", "dataset = statsperform.load_event(\n", " 
ma1_data=\"../../kloppy/tests/files/statsperform_event_ma1.json\",\n", " ma3_data=\"../../kloppy/tests/files/statsperform_event_ma3.json\",\n", " \n", " # Optional arguments\n", " coordinates=\"opta\", \n", " pitch_length=102.5,\n", " pitch_width=69.0,\n", " event_types=[\"pass\", \"shot\"],\n", ")\n", "\n", "dataset.to_df().head()" ] }, { "cell_type": "markdown", "id": "d73a2fad", "metadata": {}, "source": [ "### Opta F7 and F24 (or F73)\n", "\n", "You can also use the 'old' Opta F7 and F24 format or the F7 and F73 format. \n", "\n", "To use F73, simply pass it to the `f24_data` parameter instead of the f24 file." ] }, { "cell_type": "code", "execution_count": null, "id": "f4c2c391", "metadata": {}, "outputs": [ { "data": { "application/vnd.microsoft.datawrangler.viewer.v0+json": { "columns": [ { "name": "index", "rawType": "int64", "type": "integer" }, { "name": "event_id", "rawType": "object", "type": "string" }, { "name": "event_type", "rawType": "object", "type": "string" }, { "name": "result", "rawType": "object", "type": "string" }, { "name": "success", "rawType": "bool", "type": "boolean" }, { "name": "period_id", "rawType": "int64", "type": "integer" }, { "name": "timestamp", "rawType": "timedelta64[ns]", "type": "unknown" }, { "name": "end_timestamp", "rawType": "timedelta64[ns]", "type": "unknown" }, { "name": "ball_state", "rawType": "object", "type": "string" }, { "name": "ball_owning_team", "rawType": "object", "type": "string" }, { "name": "team_id", "rawType": "object", "type": "string" }, { "name": "player_id", "rawType": "object", "type": "string" }, { "name": "coordinates_x", "rawType": "float64", "type": "float" }, { "name": "coordinates_y", "rawType": "float64", "type": "float" }, { "name": "end_coordinates_x", "rawType": "float64", "type": "float" }, { "name": "end_coordinates_y", "rawType": "float64", "type": "float" }, { "name": "receiver_player_id", "rawType": "object", "type": "unknown" }, { "name": "set_piece_type", "rawType": "object", 
"type": "unknown" }, { "name": "pass_type", "rawType": "object", "type": "unknown" }, { "name": "body_part_type", "rawType": "object", "type": "unknown" }, { "name": "is_counter_attack", "rawType": "object", "type": "unknown" } ], "conversionMethod": "pd.DataFrame", "ref": "4a95a790-e8f6-4284-a0da-4aab3c81671d", "rows": [ [ "0", "1510681159", "PASS", "COMPLETE", "True", "1", "0 days 00:00:00.431000", "0 days 00:00:03.395000", "alive", "569", "569", "48337", "50.1", "49.4", "36.4", "45.1", "76001", "KICK_OFF", null, null, null ], [ "1", "1646695660", "PASS", "COMPLETE", "True", "1", "0 days 00:00:03.395000", "0 days 00:00:05.299000", "alive", "569", "569", "76001", "36.4", "45.1", "28.0", "39.8", "164266", null, null, null, null ], [ "2", "1782829017", "PASS", "COMPLETE", "True", "1", "0 days 00:00:05.299000", "0 days 00:00:06.995000", "alive", "569", "569", "164266", "27.9", "39.8", "29.1", "62.0", "77384", null, null, null, null ], [ "3", "1909884550", "PASS", "COMPLETE", "True", "1", "0 days 00:00:06.995000", "0 days 00:00:08.971000", "alive", "569", "569", "77384", "29.3", "62.6", "26.3", "37.6", "164266", null, null, null, null ], [ "4", "1515097980", "PASS", "COMPLETE", "True", "1", "0 days 00:00:08.971000", null, "alive", "569", "569", "164266", "26.3", "34.3", "29.1", "7.4", null, null, null, null, null ] ], "shape": { "columns": 20, "rows": 5 } }, "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>event_id</th>\n", " <th>event_type</th>\n", " <th>result</th>\n", " <th>success</th>\n", " <th>period_id</th>\n", " <th>timestamp</th>\n", " <th>end_timestamp</th>\n", " <th>ball_state</th>\n", " 
<th>ball_owning_team</th>\n", " <th>team_id</th>\n", " <th>player_id</th>\n", " <th>coordinates_x</th>\n", " <th>coordinates_y</th>\n", " <th>end_coordinates_x</th>\n", " <th>end_coordinates_y</th>\n", " <th>receiver_player_id</th>\n", " <th>set_piece_type</th>\n", " <th>pass_type</th>\n", " <th>body_part_type</th>\n", " <th>is_counter_attack</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1510681159</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:00.431000</td>\n", " <td>0 days 00:00:03.395000</td>\n", " <td>alive</td>\n", " <td>569</td>\n", " <td>569</td>\n", " <td>48337</td>\n", " <td>50.1</td>\n", " <td>49.4</td>\n", " <td>36.4</td>\n", " <td>45.1</td>\n", " <td>76001</td>\n", " <td>KICK_OFF</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>1646695660</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:03.395000</td>\n", " <td>0 days 00:00:05.299000</td>\n", " <td>alive</td>\n", " <td>569</td>\n", " <td>569</td>\n", " <td>76001</td>\n", " <td>36.4</td>\n", " <td>45.1</td>\n", " <td>28.0</td>\n", " <td>39.8</td>\n", " <td>164266</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1782829017</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:05.299000</td>\n", " <td>0 days 00:00:06.995000</td>\n", " <td>alive</td>\n", " <td>569</td>\n", " <td>569</td>\n", " <td>164266</td>\n", " <td>27.9</td>\n", " <td>39.8</td>\n", " <td>29.1</td>\n", " <td>62.0</td>\n", " <td>77384</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>1909884550</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 
days 00:00:06.995000</td>\n", " <td>0 days 00:00:08.971000</td>\n", " <td>alive</td>\n", " <td>569</td>\n", " <td>569</td>\n", " <td>77384</td>\n", " <td>29.3</td>\n", " <td>62.6</td>\n", " <td>26.3</td>\n", " <td>37.6</td>\n", " <td>164266</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>1515097980</td>\n", " <td>PASS</td>\n", " <td>COMPLETE</td>\n", " <td>True</td>\n", " <td>1</td>\n", " <td>0 days 00:00:08.971000</td>\n", " <td>NaT</td>\n", " <td>alive</td>\n", " <td>569</td>\n", " <td>569</td>\n", " <td>164266</td>\n", " <td>26.3</td>\n", " <td>34.3</td>\n", " <td>29.1</td>\n", " <td>7.4</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " event_id event_type result success period_id timestamp \\\n", "0 1510681159 PASS COMPLETE True 1 0 days 00:00:00.431000 \n", "1 1646695660 PASS COMPLETE True 1 0 days 00:00:03.395000 \n", "2 1782829017 PASS COMPLETE True 1 0 days 00:00:05.299000 \n", "3 1909884550 PASS COMPLETE True 1 0 days 00:00:06.995000 \n", "4 1515097980 PASS COMPLETE True 1 0 days 00:00:08.971000 \n", "\n", " end_timestamp ball_state ball_owning_team team_id player_id \\\n", "0 0 days 00:00:03.395000 alive 569 569 48337 \n", "1 0 days 00:00:05.299000 alive 569 569 76001 \n", "2 0 days 00:00:06.995000 alive 569 569 164266 \n", "3 0 days 00:00:08.971000 alive 569 569 77384 \n", "4 NaT alive 569 569 164266 \n", "\n", " coordinates_x coordinates_y end_coordinates_x end_coordinates_y \\\n", "0 50.1 49.4 36.4 45.1 \n", "1 36.4 45.1 28.0 39.8 \n", "2 27.9 39.8 29.1 62.0 \n", "3 29.3 62.6 26.3 37.6 \n", "4 26.3 34.3 29.1 7.4 \n", "\n", " receiver_player_id set_piece_type pass_type body_part_type is_counter_attack \n", "0 76001 KICK_OFF None None None \n", "1 164266 None None None None \n", "2 77384 None None None None \n", "3 164266 None None None None 
\n", "4 None None None None None " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from kloppy import opta\n", "\n", "dataset = opta.load(\n", " f7_data=\"../../kloppy/tests/files/opta_f7.xml\",\n", " f24_data=\"../../kloppy/tests/files/opta_f24.xml\",\n", " # f24_data=\"../../kloppy/tests/files/opta_f73.xml\",\n", " \n", " # Optional arguments\n", " coordinates=\"opta\", \n", " event_types=[\"pass\", \"shot\"],\n", ")\n", "\n", "dataset.to_df().head()" ] }, { "cell_type": "markdown", "id": "63f318e2-fe18-4f16-98fc-cf0faa173eb9", "metadata": {}, "source": [ "## Load local tracking files" ] }, { "cell_type": "code", "execution_count": null, "id": "8a96a838-ef9c-42a7-a647-0a6e0090ff86", "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>period_id</th>\n", " <th>timestamp</th>\n", " <th>frame_id</th>\n", " <th>ball_state</th>\n", " <th>ball_owning_team_id</th>\n", " <th>ball_x</th>\n", " <th>ball_y</th>\n", " <th>ball_z</th>\n", " <th>ball_speed</th>\n", " <th>a2s2c6anax9wnlsw1s6vunl5h_x</th>\n", " <th>...</th>\n", " <th>6wfwy94p5bm0zv3aku0urfq39_d</th>\n", " <th>6wfwy94p5bm0zv3aku0urfq39_s</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_x</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_y</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_d</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_s</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_x</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_y</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_d</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_s</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>0 days 
00:00:00</td>\n", " <td>1598184000000</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>52.350</td>\n", " <td>33.250</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.803</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.268</td>\n", " <td>33.556</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.100000</td>\n", " <td>1598184000100</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>50.615</td>\n", " <td>35.325</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.558</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.265</td>\n", " <td>33.529</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.200000</td>\n", " <td>1598184000200</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>49.630</td>\n", " <td>36.140</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.310</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.264</td>\n", " <td>33.502</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.300000</td>\n", " <td>1598184000300</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>48.725</td>\n", " <td>36.625</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.059</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.268</td>\n", " <td>33.476</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.400000</td>\n", " 
<td>1598184000400</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>47.890</td>\n", " <td>37.130</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>51.804</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.277</td>\n", " <td>33.452</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>5 rows × 101 columns</p>\n", "</div>" ], "text/plain": [ " period_id timestamp frame_id ball_state \\\n", "0 1 0 days 00:00:00 1598184000000 alive \n", "1 1 0 days 00:00:00.100000 1598184000100 alive \n", "2 1 0 days 00:00:00.200000 1598184000200 alive \n", "3 1 0 days 00:00:00.300000 1598184000300 alive \n", "4 1 0 days 00:00:00.400000 1598184000400 alive \n", "\n", " ball_owning_team_id ball_x ball_y ball_z ball_speed \\\n", "0 None 52.350 33.250 0.0 None \n", "1 None 50.615 35.325 0.0 None \n", "2 None 49.630 36.140 0.0 None \n", "3 None 48.725 36.625 0.0 None \n", "4 None 47.890 37.130 0.0 None \n", "\n", " a2s2c6anax9wnlsw1s6vunl5h_x ... 6wfwy94p5bm0zv3aku0urfq39_d \\\n", "0 52.803 ... None \n", "1 52.558 ... None \n", "2 52.310 ... None \n", "3 52.059 ... None \n", "4 51.804 ... 
None \n", "\n", " 6wfwy94p5bm0zv3aku0urfq39_s 6ekdnbnk56xlxforb5owt3dn9_x \\\n", "0 None 5.268 \n", "1 None 5.265 \n", "2 None 5.264 \n", "3 None 5.268 \n", "4 None 5.277 \n", "\n", " 6ekdnbnk56xlxforb5owt3dn9_y 6ekdnbnk56xlxforb5owt3dn9_d \\\n", "0 33.556 None \n", "1 33.529 None \n", "2 33.502 None \n", "3 33.476 None \n", "4 33.452 None \n", "\n", " 6ekdnbnk56xlxforb5owt3dn9_s ct32113pfx5q9avf2c0x208ru_x \\\n", "0 None NaN \n", "1 None NaN \n", "2 None NaN \n", "3 None NaN \n", "4 None NaN \n", "\n", " ct32113pfx5q9avf2c0x208ru_y ct32113pfx5q9avf2c0x208ru_d \\\n", "0 NaN None \n", "1 NaN None \n", "2 NaN None \n", "3 NaN None \n", "4 NaN None \n", "\n", " ct32113pfx5q9avf2c0x208ru_s \n", "0 None \n", "1 None \n", "2 None \n", "3 None \n", "4 None \n", "\n", "[5 rows x 101 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from kloppy import statsperform\n", "\n", "dataset = statsperform.load_tracking(\n", " ma1_data=\"../../kloppy/tests/files/statsperform_tracking_ma1.json\",\n", " ma25_data=\"../../kloppy/tests/files/statsperform_tracking_ma25.txt\",\n", "\n", " # Optional arguments\n", " coordinates=\"statsperform\",\n", " only_alive=True,\n", " limit=50,\n", " sample_rate=(1/2),\n", " pitch_length=102.5,\n", " pitch_width=69.0,\n", " tracking_system=\"sportvu\",\n", ")\n", "\n", "dataset.to_df().head()" ] }, { "cell_type": "code", "execution_count": null, "id": "bb1c3e31-53fa-49db-a3d4-26e91b68adee", "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>period_id</th>\n", " <th>timestamp</th>\n", " <th>frame_id</th>\n", " 
<th>ball_state</th>\n", " <th>ball_owning_team_id</th>\n", " <th>ball_x</th>\n", " <th>ball_y</th>\n", " <th>ball_z</th>\n", " <th>ball_speed</th>\n", " <th>a2s2c6anax9wnlsw1s6vunl5h_x</th>\n", " <th>...</th>\n", " <th>6wfwy94p5bm0zv3aku0urfq39_d</th>\n", " <th>6wfwy94p5bm0zv3aku0urfq39_s</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_x</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_y</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_d</th>\n", " <th>6ekdnbnk56xlxforb5owt3dn9_s</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_x</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_y</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_d</th>\n", " <th>ct32113pfx5q9avf2c0x208ru_s</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00</td>\n", " <td>1598184000000</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>52.350</td>\n", " <td>33.250</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.803</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.268</td>\n", " <td>33.556</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.100000</td>\n", " <td>1598184000100</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>50.615</td>\n", " <td>35.325</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.558</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.265</td>\n", " <td>33.529</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.200000</td>\n", " <td>1598184000200</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>49.630</td>\n", " <td>36.140</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.310</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.264</td>\n", " 
<td>33.502</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.300000</td>\n", " <td>1598184000300</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>48.725</td>\n", " <td>36.625</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>52.059</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.268</td>\n", " <td>33.476</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>1</td>\n", " <td>0 days 00:00:00.400000</td>\n", " <td>1598184000400</td>\n", " <td>alive</td>\n", " <td>None</td>\n", " <td>47.890</td>\n", " <td>37.130</td>\n", " <td>0.0</td>\n", " <td>None</td>\n", " <td>51.804</td>\n", " <td>...</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>5.277</td>\n", " <td>33.452</td>\n", " <td>None</td>\n", " <td>None</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>None</td>\n", " <td>None</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>5 rows × 101 columns</p>\n", "</div>" ], "text/plain": [ " period_id timestamp frame_id ball_state \\\n", "0 1 0 days 00:00:00 1598184000000 alive \n", "1 1 0 days 00:00:00.100000 1598184000100 alive \n", "2 1 0 days 00:00:00.200000 1598184000200 alive \n", "3 1 0 days 00:00:00.300000 1598184000300 alive \n", "4 1 0 days 00:00:00.400000 1598184000400 alive \n", "\n", " ball_owning_team_id ball_x ball_y ball_z ball_speed \\\n", "0 None 52.350 33.250 0.0 None \n", "1 None 50.615 35.325 0.0 None \n", "2 None 49.630 36.140 0.0 None \n", "3 None 48.725 36.625 0.0 None \n", "4 None 47.890 37.130 0.0 None \n", "\n", " a2s2c6anax9wnlsw1s6vunl5h_x ... 6wfwy94p5bm0zv3aku0urfq39_d \\\n", "0 52.803 ... None \n", "1 52.558 ... None \n", "2 52.310 ... None \n", "3 52.059 ... None \n", "4 51.804 ... 
None \n", "\n", " 6wfwy94p5bm0zv3aku0urfq39_s 6ekdnbnk56xlxforb5owt3dn9_x \\\n", "0 None 5.268 \n", "1 None 5.265 \n", "2 None 5.264 \n", "3 None 5.268 \n", "4 None 5.277 \n", "\n", " 6ekdnbnk56xlxforb5owt3dn9_y 6ekdnbnk56xlxforb5owt3dn9_d \\\n", "0 33.556 None \n", "1 33.529 None \n", "2 33.502 None \n", "3 33.476 None \n", "4 33.452 None \n", "\n", " 6ekdnbnk56xlxforb5owt3dn9_s ct32113pfx5q9avf2c0x208ru_x \\\n", "0 None NaN \n", "1 None NaN \n", "2 None NaN \n", "3 None NaN \n", "4 None NaN \n", "\n", " ct32113pfx5q9avf2c0x208ru_y ct32113pfx5q9avf2c0x208ru_d \\\n", "0 NaN None \n", "1 NaN None \n", "2 NaN None \n", "3 NaN None \n", "4 NaN None \n", "\n", " ct32113pfx5q9avf2c0x208ru_s \n", "0 None \n", "1 None \n", "2 None \n", "3 None \n", "4 None \n", "\n", "[5 rows x 101 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from kloppy import statsperform\n", "\n", "dataset = statsperform.load_tracking(\n", " ma1_data=\"../../kloppy/tests/files/statsperform_tracking_ma1.xml\",\n", " ma25_data=\"../../kloppy/tests/files/statsperform_tracking_ma25.txt\",\n", "\n", " # Optional arguments\n", " coordinates=\"statsperform\",\n", " only_alive=True,\n", " limit=50,\n", " sample_rate=(1/2),\n", " pitch_length=102.5,\n", " pitch_width=69.0,\n", " tracking_system=\"sportvu\",\n", ")\n", "\n", "dataset.to_df().head()" ] }, { "cell_type": "markdown", "id": "0a531211", "metadata": {}, "source": [ "## Load remote files\n", "Kloppy supports remote files through `fsspec` under the hood. This allows you to work with files in AWS S3, Google Cloud, Azure Blob, HDFS, FTP, and SFTP without extra tools.\n", "For example, you can pass:\n", "- Individual S3 file paths (e.g. `ma1_data=s3://.../statsperform_event_ma1.xml`)\n", "\n", "Note: Kloppy might throw an error the first time to help you identify missing cloud-specific dependencies such as `s3fs`.
" ] }, { "cell_type": "code", "execution_count": null, "id": "52f705c2", "metadata": {}, "outputs": [], "source": [ "from kloppy import statsperform\n", "\n", "dataset = statsperform.load_event(\n", " ma1_data=\"s3://.../statsperform_event_ma1.xml\",\n", " ma3_data=\"s3://.../statsperform_event_ma3.xml\",\n", " \n", " # Optional arguments\n", " coordinates=\"opta\",\n", " pitch_length=102.5,\n", " pitch_width=69.0,\n", " event_types=[\"pass\", \"shot\"],\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 5 }