Code data¶
Apart from event and tracking data, analysts often work with code data. This type of data can be collected by hand using tools like SportsCode.
Kloppy can both read and write codes in the SportsCode XML format.
Reading XML file¶
In [1]:
from kloppy import sportscode
In [2]:
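# Write a small SportsCode XML file to disk; UTF-8 is set explicitly
# because the sample contains non-ASCII characters.
with open("file.xml", "w", encoding="utf-8") as fp:
    fp.write("""<?xml version="1.0"?>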
<file>
<ALL_INSTANCES>
<instance>
<ID>P1</ID>
<start>3.6</start>
<end>9.7</end>
<code>PASS</code>
<label>
<group>Team</group>
<text>Henkie</text>
</label>
<label>
<group>Packing.Value</group>
<text>1</text>
</label>
<label>
<group>Receiver</group>
<text>Klaas Nøme</text>
</label>
</instance>
<instance>
<ID>P2</ID>
<start>68.3</start>
<end>74.5</end>
<code>PASS</code>
<label>
<group>Team</group>
<text>Henkie</text>
</label>
<label>
<group>Packing.Value</group>
<text>3</text>
</label>
<label>
<group>Receiver</group>
<text>Piet</text>
</label>
</instance>
<instance>
<ID>P3</ID>
<start>103.6</start>
<end>109.6</end>
<code>SHOT</code>
<label>
<group>Team</group>
<text>Henkie</text>
</label>
<label>
<group>Expected.Goal.Value</group>
<text>0.13</text>
</label>
</instance>
</ALL_INSTANCES>
</file>""")
code_dataset = sportscode.load("file.xml")
In [3]:
code_dataset.to_df()
Out[3]:
| | code_id | period_id | timestamp | end_timestamp | code | Team | Packing.Value | Receiver | Expected.Goal.Value |
|---|---|---|---|---|---|---|---|---|---|
| 0 | P1 | 1 | 3.6 | 9.7 | PASS | Henkie | 1.0 | Klaas Nøme | NaN |
| 1 | P2 | 1 | 68.3 | 74.5 | PASS | Henkie | 3.0 | Piet | NaN |
| 2 | P3 | 1 | 103.6 | 109.6 | SHOT | Henkie | NaN | None | 0.13 |
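To make the mapping from the XML to the table above concrete, here is a minimal standard-library sketch that flattens each `<instance>` into a dictionary with one key per label `<group>`. It is an illustration only, not kloppy's implementation: the helper name `instances_to_rows` is hypothetical, label values are kept as strings, and no `period_id` is added.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<file>
  <ALL_INSTANCES>
    <instance>
      <ID>P1</ID>
      <start>3.6</start>
      <end>9.7</end>
      <code>PASS</code>
      <label><group>Team</group><text>Henkie</text></label>
      <label><group>Packing.Value</group><text>1</text></label>
    </instance>
    <instance>
      <ID>P3</ID>
      <start>103.6</start>
      <end>109.6</end>
      <code>SHOT</code>
      <label><group>Team</group><text>Henkie</text></label>
      <label><group>Expected.Goal.Value</group><text>0.13</text></label>
    </instance>
  </ALL_INSTANCES>
</file>"""


def instances_to_rows(xml_text):
    """Flatten every <instance> into a dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    for instance in root.iter("instance"):
        row = {
            "code_id": instance.findtext("ID"),
            "timestamp": float(instance.findtext("start")),
            "end_timestamp": float(instance.findtext("end")),
            "code": instance.findtext("code"),
        }
        # Each label contributes one column named after its <group>.
        for label in instance.findall("label"):
            row[label.findtext("group")] = label.findtext("text")
        rows.append(row)
    return rows


rows = instances_to_rows(SAMPLE)
for row in rows:
    print(row)
```

A group that is missing from an instance simply produces no key in that row, which is why a tabular view shows an empty value there.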
The code dataset can also be filtered, for example to keep only the passes.
In [4]:
passes = code_dataset.filter(lambda code: code.code == 'PASS')
passes.to_df()
Out[4]:
| | code_id | period_id | timestamp | end_timestamp | code | Team | Packing.Value | Receiver |
|---|---|---|---|---|---|---|---|---|
| 0 | P1 | 1 | 3.6 | 9.7 | PASS | Henkie | 1 | Klaas Nøme |
| 1 | P2 | 1 | 68.3 | 74.5 | PASS | Henkie | 3 | Piet |
Writing XML file¶
In [5]:
sportscode.save(passes, "file.xml")
# Read the file back to inspect what was written.
with open("file.xml", "r", encoding="utf-8") as fp:
    print(fp.read())
<?xml version='1.0' encoding='utf-8'?>
<file>
  <ALL_INSTANCES>
    <instance>
      <ID>P1</ID>
      <start>3.6</start>
      <end>9.7</end>
      <code>PASS</code>
      <label>
        <group>Team</group>
        <text>Henkie</text>
      </label>
      <label>
        <group>Packing.Value</group>
        <text>1</text>
      </label>
      <label>
        <group>Receiver</group>
        <text>Klaas Nøme</text>
      </label>
    </instance>
    <instance>
      <ID>P2</ID>
      <start>68.3</start>
      <end>74.5</end>
      <code>PASS</code>
      <label>
        <group>Team</group>
        <text>Henkie</text>
      </label>
      <label>
        <group>Packing.Value</group>
        <text>3</text>
      </label>
      <label>
        <group>Receiver</group>
        <text>Piet</text>
      </label>
    </instance>
  </ALL_INSTANCES>
</file>
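The layout written above can be reproduced with the standard library, which may help when checking a file by hand. The sketch below is not how kloppy serialises codes; `build_instance` is a hypothetical helper and its arguments are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def build_instance(parent, code_id, start, end, code, labels):
    """Append one <instance> element to parent (hypothetical helper)."""
    instance = ET.SubElement(parent, "instance")
    ET.SubElement(instance, "ID").text = str(code_id)
    ET.SubElement(instance, "start").text = str(start)
    ET.SubElement(instance, "end").text = str(end)
    ET.SubElement(instance, "code").text = code
    # One <label> per (group, text) pair, in insertion order.
    for group, text in labels.items():
        label = ET.SubElement(instance, "label")
        ET.SubElement(label, "group").text = group
        ET.SubElement(label, "text").text = str(text)
    return instance


root = ET.Element("file")
all_instances = ET.SubElement(root, "ALL_INSTANCES")
build_instance(
    all_instances, "P1", 3.6, 9.7, "PASS",
    {"Team": "Henkie", "Packing.Value": 1, "Receiver": "Klaas Nøme"},
)

# encoding="unicode" returns a str instead of bytes.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The string is produced without line breaks or indentation; the element order and nesting are what matter for comparison with the file above.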
Converting an event dataset into a code dataset¶
In [6]:
from kloppy import statsbomb
dataset = statsbomb.load_open_data()
/Users/koen/Developer/Projects/PySport/kloppy/.venv/lib/python3.10/site-packages/kloppy-3.10.2-py3.10.egg/kloppy/_providers/statsbomb.py:67: UserWarning: You are about to use StatsBomb public data. By using this data, you are agreeing to the user agreement. The user agreement can be found here: https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf warnings.warn(
In [7]:
from kloppy.domain import Code, CodeDataset, EventType

dataset_shots = dataset.filter(
    lambda event: event.event_type == EventType.SHOT
)

code_dataset = (
    CodeDataset
    .from_dataset(
        dataset_shots,
        lambda event: Code(
            code_id=None,  # make it auto increment on write
            code=event.event_name,
            period=event.period,
            timestamp=max(0, event.timestamp - 7),
            end_timestamp=event.timestamp + 5,
            labels={
                'Player': str(event.player),
                'Team': str(event.team)
            },
            # In the future next two won't be needed anymore
            ball_owning_team=None,
            ball_state=None
        )
    )
)
In [8]:
code_dataset.to_df()
Out[8]:
| | code_id | period_id | timestamp | end_timestamp | code | Player | Team |
|---|---|---|---|---|---|---|---|
| 0 | None | 1 | 142.094 | 154.094 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 1 | None | 1 | 332.239 | 344.239 | shot | Jordi Alba Ramos | Barcelona |
| 2 | None | 1 | 921.625 | 933.625 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 3 | None | 1 | 972.616 | 984.616 | shot | Rubén Sobrino Pozuelo | Deportivo Alavés |
| 4 | None | 1 | 1088.914 | 1100.914 | shot | Luis Alberto Suárez Díaz | Barcelona |
| 5 | None | 1 | 1835.287 | 1847.287 | shot | Ousmane Dembélé | Barcelona |
| 6 | None | 1 | 2097.861 | 2109.861 | shot | Ivan Rakitić | Barcelona |
| 7 | None | 1 | 2241.168 | 2253.168 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 8 | None | 1 | 2243.989 | 2255.989 | shot | Gerard Piqué Bernabéu | Barcelona |
| 9 | None | 1 | 2301.083 | 2313.083 | shot | Ousmane Dembélé | Barcelona |
| 10 | None | 1 | 2427.592 | 2439.592 | shot | Luis Alberto Suárez Díaz | Barcelona |
| 11 | None | 1 | 2603.612 | 2615.612 | shot | Ousmane Dembélé | Barcelona |
| 12 | None | 2 | 152.524 | 164.524 | shot | Jordi Alba Ramos | Barcelona |
| 13 | None | 2 | 360.400 | 372.400 | shot | Mubarak Wakaso | Deportivo Alavés |
| 14 | None | 2 | 527.355 | 539.355 | shot | Luis Alberto Suárez Díaz | Barcelona |
| 15 | None | 2 | 589.388 | 601.388 | shot | Philippe Coutinho Correia | Barcelona |
| 16 | None | 2 | 627.490 | 639.490 | shot | Jordi Alba Ramos | Barcelona |
| 17 | None | 2 | 733.847 | 745.847 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 18 | None | 2 | 929.156 | 941.156 | shot | Philippe Coutinho Correia | Barcelona |
| 19 | None | 2 | 1084.954 | 1096.954 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 20 | None | 2 | 1236.588 | 1248.588 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 21 | None | 2 | 1406.375 | 1418.375 | shot | Ivan Rakitić | Barcelona |
| 22 | None | 2 | 1640.492 | 1652.492 | shot | Luis Alberto Suárez Díaz | Barcelona |
| 23 | None | 2 | 1990.679 | 2002.679 | shot | Adrián Marín Gómez | Deportivo Alavés |
| 24 | None | 2 | 2184.606 | 2196.606 | shot | Philippe Coutinho Correia | Barcelona |
| 25 | None | 2 | 2254.578 | 2266.578 | shot | Philippe Coutinho Correia | Barcelona |
| 26 | None | 2 | 2655.638 | 2667.638 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
| 27 | None | 2 | 2795.770 | 2807.770 | shot | Lionel Andrés Messi Cuccittini | Barcelona |
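Each row in the table above spans 12 seconds: the `Code` constructor sets `timestamp` to 7 seconds before the shot (never below 0) and `end_timestamp` to 5 seconds after it. The same arithmetic can be sketched as a standalone function; `clip_window` and its parameter names are hypothetical and chosen only for this example.

```python
def clip_window(timestamp, lead_in=7.0, lead_out=5.0):
    """Return (start, end) of a clip around an event, in seconds.

    The start is clamped at 0 so that an event in the first few seconds
    of a period never produces a negative start time.
    """
    start = max(0.0, timestamp - lead_in)
    end = timestamp + lead_out
    return start, end


# A shot 149.094 s into the period, and one only 3 s in (start is clamped).
print(clip_window(149.094))
print(clip_window(3.0))
```

Widening or narrowing the clip is then a matter of changing `lead_in` and `lead_out`, for example to leave more room for the build-up before a shot.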
In [9]:
sportscode.save(code_dataset, "file.xml")
# Read the file back to inspect what was written.
with open("file.xml", "r", encoding="utf-8") as fp:
    print(fp.read())
<?xml version='1.0' encoding='utf-8'?>
<file>
  <ALL_INSTANCES>
    <instance>
      <ID>1</ID>
      <start>142.094</start>
      <end>154.094</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>2</ID>
      <start>332.239</start>
      <end>344.239</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Jordi Alba Ramos</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>3</ID>
      <start>921.625</start>
      <end>933.625</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>4</ID>
      <start>972.616</start>
      <end>984.616</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Rubén Sobrino Pozuelo</text>
      </label>
      <label>
        <group>Team</group>
        <text>Deportivo Alavés</text>
      </label>
    </instance>
    <instance>
      <ID>5</ID>
      <start>1088.914</start>
      <end>1100.914</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Luis Alberto Suárez Díaz</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>6</ID>
      <start>1835.287</start>
      <end>1847.287</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Ousmane Dembélé</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>7</ID>
      <start>2097.861</start>
      <end>2109.861</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Ivan Rakitić</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>8</ID>
      <start>2241.168</start>
      <end>2253.168</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>9</ID>
      <start>2243.989</start>
      <end>2255.989</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Gerard Piqué Bernabéu</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>10</ID>
      <start>2301.083</start>
      <end>2313.083</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Ousmane Dembélé</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>11</ID>
      <start>2427.592</start>
      <end>2439.592</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Luis Alberto Suárez Díaz</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>12</ID>
      <start>2603.612</start>
      <end>2615.612</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Ousmane Dembélé</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>13</ID>
      <start>2857.7909999999997</start>
      <end>2869.7909999999997</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Jordi Alba Ramos</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>14</ID>
      <start>3065.667</start>
      <end>3077.667</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Mubarak Wakaso</text>
      </label>
      <label>
        <group>Team</group>
        <text>Deportivo Alavés</text>
      </label>
    </instance>
    <instance>
      <ID>15</ID>
      <start>3232.622</start>
      <end>3244.622</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Luis Alberto Suárez Díaz</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>16</ID>
      <start>3294.6549999999997</start>
      <end>3306.6549999999997</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Philippe Coutinho Correia</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>17</ID>
      <start>3332.7569999999996</start>
      <end>3344.7569999999996</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Jordi Alba Ramos</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>18</ID>
      <start>3439.1139999999996</start>
      <end>3451.1139999999996</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>19</ID>
      <start>3634.423</start>
      <end>3646.423</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Philippe Coutinho Correia</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>20</ID>
      <start>3790.2209999999995</start>
      <end>3802.2209999999995</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>21</ID>
      <start>3941.8549999999996</start>
      <end>3953.8549999999996</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>22</ID>
      <start>4111.642</start>
      <end>4123.642</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Ivan Rakitić</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>23</ID>
      <start>4345.759</start>
      <end>4357.759</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Luis Alberto Suárez Díaz</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>24</ID>
      <start>4695.946</start>
      <end>4707.946</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Adrián Marín Gómez</text>
      </label>
      <label>
        <group>Team</group>
        <text>Deportivo Alavés</text>
      </label>
    </instance>
    <instance>
      <ID>25</ID>
      <start>4889.873</start>
      <end>4901.873</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Philippe Coutinho Correia</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>26</ID>
      <start>4959.844999999999</start>
      <end>4971.844999999999</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Philippe Coutinho Correia</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>27</ID>
      <start>5360.905</start>
      <end>5372.905</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
    <instance>
      <ID>28</ID>
      <start>5501.037</start>
      <end>5513.037</end>
      <code>shot</code>
      <label>
        <group>Player</group>
        <text>Lionel Andrés Messi Cuccittini</text>
      </label>
      <label>
        <group>Team</group>
        <text>Barcelona</text>
      </label>
    </instance>
  </ALL_INSTANCES>
</file>
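In the XML above, the first-half start times match the `timestamp` column of the DataFrame, while the second-half ones are larger by a constant: 152.524 becomes roughly 2857.791, a shift of 2705.267 seconds. This suggests, although the page does not state it, that the writer places all codes on one continuous timeline by adding the length of the preceding period. The `None` code IDs have also been written as 1 to 28. A minimal sketch of the timeline arithmetic follows; `to_absolute` is a hypothetical helper and the 2705.267 offset is inferred from the output, not taken from kloppy.

```python
def to_absolute(period_id, timestamp, period_durations):
    """Shift a period-relative timestamp onto one continuous timeline.

    period_durations maps period_id -> duration in seconds; the durations
    of all earlier periods are added to the timestamp.
    """
    offset = sum(
        duration
        for pid, duration in period_durations.items()
        if pid < period_id
    )
    return timestamp + offset


# Assumption: the first period lasted 2705.267 s (inferred from the output).
durations = {1: 2705.267}

print(to_absolute(1, 142.094, durations))  # first half: no shift
print(to_absolute(2, 152.524, durations))  # second half: shifted
```

The long decimals in some of the written values, such as 2857.7909999999997, are ordinary floating-point rounding from this kind of addition.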
In [10]:
import os
os.unlink("file.xml")
In [14]:
# Chain filter and map operators
new_dataset = (
    code_dataset
    .filter(lambda record: record.labels['Team'] == 'Barcelona')
    .map(lambda record: record.replace(code='Schot Barcelona'))
)
new_dataset.to_df()
Out[14]:
| | code_id | period_id | timestamp | end_timestamp | code | Player | Team |
|---|---|---|---|---|---|---|---|
| 0 | None | 1 | 142.094 | 154.094 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 1 | None | 1 | 332.239 | 344.239 | Schot Barcelona | Jordi Alba Ramos | Barcelona |
| 2 | None | 1 | 921.625 | 933.625 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 3 | None | 1 | 1088.914 | 1100.914 | Schot Barcelona | Luis Alberto Suárez Díaz | Barcelona |
| 4 | None | 1 | 1835.287 | 1847.287 | Schot Barcelona | Ousmane Dembélé | Barcelona |
| 5 | None | 1 | 2097.861 | 2109.861 | Schot Barcelona | Ivan Rakitić | Barcelona |
| 6 | None | 1 | 2241.168 | 2253.168 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 7 | None | 1 | 2243.989 | 2255.989 | Schot Barcelona | Gerard Piqué Bernabéu | Barcelona |
| 8 | None | 1 | 2301.083 | 2313.083 | Schot Barcelona | Ousmane Dembélé | Barcelona |
| 9 | None | 1 | 2427.592 | 2439.592 | Schot Barcelona | Luis Alberto Suárez Díaz | Barcelona |
| 10 | None | 1 | 2603.612 | 2615.612 | Schot Barcelona | Ousmane Dembélé | Barcelona |
| 11 | None | 2 | 152.524 | 164.524 | Schot Barcelona | Jordi Alba Ramos | Barcelona |
| 12 | None | 2 | 527.355 | 539.355 | Schot Barcelona | Luis Alberto Suárez Díaz | Barcelona |
| 13 | None | 2 | 589.388 | 601.388 | Schot Barcelona | Philippe Coutinho Correia | Barcelona |
| 14 | None | 2 | 627.490 | 639.490 | Schot Barcelona | Jordi Alba Ramos | Barcelona |
| 15 | None | 2 | 733.847 | 745.847 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 16 | None | 2 | 929.156 | 941.156 | Schot Barcelona | Philippe Coutinho Correia | Barcelona |
| 17 | None | 2 | 1084.954 | 1096.954 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 18 | None | 2 | 1236.588 | 1248.588 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 19 | None | 2 | 1406.375 | 1418.375 | Schot Barcelona | Ivan Rakitić | Barcelona |
| 20 | None | 2 | 1640.492 | 1652.492 | Schot Barcelona | Luis Alberto Suárez Díaz | Barcelona |
| 21 | None | 2 | 2184.606 | 2196.606 | Schot Barcelona | Philippe Coutinho Correia | Barcelona |
| 22 | None | 2 | 2254.578 | 2266.578 | Schot Barcelona | Philippe Coutinho Correia | Barcelona |
| 23 | None | 2 | 2655.638 | 2667.638 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
| 24 | None | 2 | 2795.770 | 2807.770 | Schot Barcelona | Lionel Andrés Messi Cuccittini | Barcelona |
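The filter-then-map pattern used above can be illustrated with plain Python, assuming records are immutable and `replace` returns a modified copy, as `dataclasses.replace` does. `SimpleCode` is a hypothetical stand-in written for this sketch, not kloppy's `Code` class.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SimpleCode:
    """Hypothetical stand-in for a code record."""
    code: str
    timestamp: float
    labels: dict = field(default_factory=dict)


codes = [
    SimpleCode("shot", 142.094, {"Team": "Barcelona"}),
    SimpleCode("shot", 972.616, {"Team": "Deportivo Alavés"}),
    SimpleCode("shot", 1088.914, {"Team": "Barcelona"}),
]

# Filter first, then map each remaining record to a renamed copy.
barcelona = [
    replace(record, code="Schot Barcelona")
    for record in codes
    if record.labels["Team"] == "Barcelona"
]

print([record.code for record in barcelona])
print([record.code for record in codes])  # the original records are unchanged
```

Because each step produces new records rather than editing existing ones, the order of the steps matters only for which records reach the mapping function, not for the state of the source list.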