Navigating
Kloppy 3.2 adds some powerful tools to navigate through your event data. In this tutorial you will learn how to use them.
Dataset
At the dataset level you can use filter, find and find_all. All three methods accept the same argument for selecting the right events.
You can pass a string or a function. A string must be either 'event_type', 'event_type.result' or '.result'. Some examples: 'shot.goal', 'pass' or '.complete'.
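To make the three string forms concrete, here is a minimal sketch of how such a selector could be interpreted. It is not kloppy's actual implementation; parse_selector and matches are hypothetical helper names, and the case-insensitive comparison is an assumption made for this illustration.

```python
from typing import Optional, Tuple


def parse_selector(selector: str) -> Tuple[Optional[str], Optional[str]]:
    """Split 'event_type.result' into its two optional parts (hypothetical helper)."""
    event_type, _, result = selector.partition(".")
    # An empty part means "match anything" for that field.
    return (event_type or None, result or None)


def matches(selector: str, event_type: str, result: Optional[str]) -> bool:
    """Return True when an event's type and result satisfy the selector."""
    wanted_type, wanted_result = parse_selector(selector)
    # Assumption: comparisons ignore case.
    if wanted_type is not None and wanted_type != event_type.lower():
        return False
    if wanted_result is not None and (result is None or wanted_result != result.lower()):
        return False
    return True


print(parse_selector("shot.goal"))
print(parse_selector("pass"))
print(parse_selector(".complete"))
print(matches(".complete", "CARRY", "COMPLETE"))
```

A selector without a dot constrains only the event type, and a selector starting with a dot constrains only the result.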
Let's have a look at how these work.
from kloppy import statsbomb
# Load a Statsbomb open data dataset
dataset = statsbomb.load_open_data()
# Create a new dataset which contains all goals
filtered_dataset = dataset.filter('shot.goal')
# Show the results
filtered_dataset.to_df()
/Users/koen/Developer/Projects/PySport/kloppy/.venv/lib/python3.10/site-packages/kloppy-3.7.1-py3.10.egg/kloppy/_providers/statsbomb.py:67: UserWarning: You are about to use StatsBomb public data. By using this data, you are agreeing to the user agreement. The user agreement can be found here: https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf warnings.warn(
| | event_id | event_type | result | success | period_id | timestamp | end_timestamp | ball_state | ball_owning_team | team_id | player_id | coordinates_x | coordinates_y | end_coordinates_x | end_coordinates_y | body_part_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4c7c4ab1-6b9f-4504-a237-249c2e0c549f | SHOT | GOAL | True | 2 | 1091.954 | None | alive | 217 | 217 | 5503 | 0.800417 | 0.563125 | None | None | LEFT_FOOT |
| 1 | 683c6752-13bc-4892-94ed-22e1c938f1f7 | SHOT | GOAL | True | 2 | 2261.578 | None | alive | 217 | 217 | 3501 | 0.875417 | 0.388125 | None | None | RIGHT_FOOT |
| 2 | 55d71847-9511-4417-aea9-6f415e279011 | SHOT | GOAL | True | 2 | 2802.770 | None | alive | 217 | 217 | 5503 | 0.932917 | 0.431875 | None | None | LEFT_FOOT |
The filtered dataset doesn't contain any events other than goals. Let's validate that: when we try to find all passes, we should get an empty list.
passes = filtered_dataset.find_all('pass')
len(passes)
0
The original dataset does contain passes, right?
passes = dataset.find_all('pass')
len(passes)
1132
We have now already touched on the find_all method. It accepts the same argument as filter. The difference is that find_all returns a list of events, whereas filter returns a new Dataset. The find method returns the first matching event, or None when it cannot find one.
dataset.find('shot')
<StatsBombShotEvent event_id='65f16e50-7c5d-4293-b2fc-d20887a772f9' time='P1T02:29' player='Lionel Andrés Messi Cuccittini' result='OFF_TARGET'>
print(filtered_dataset.find('pass'))
None
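The different return types of the three methods can be sketched with a toy model. ToyEvent and ToyDataset are hypothetical stand-ins, not kloppy classes, and the sketch only covers the function form of the argument.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ToyEvent:
    # Hypothetical, stripped-down event with just a type and a result.
    event_type: str
    result: Optional[str] = None


Predicate = Callable[[ToyEvent], bool]


@dataclass
class ToyDataset:
    events: List[ToyEvent]

    def find_all(self, predicate: Predicate) -> List[ToyEvent]:
        # Every matching event as a plain list (possibly empty).
        return [event for event in self.events if predicate(event)]

    def find(self, predicate: Predicate) -> Optional[ToyEvent]:
        # The first matching event, or None when nothing matches.
        return next((event for event in self.events if predicate(event)), None)

    def filter(self, predicate: Predicate) -> "ToyDataset":
        # A new dataset; the original is left untouched.
        return ToyDataset(self.find_all(predicate))


toy = ToyDataset([
    ToyEvent("pass", "complete"),
    ToyEvent("shot", "off_target"),
    ToyEvent("shot", "goal"),
])


def is_pass(event: ToyEvent) -> bool:
    return event.event_type == "pass"


goals = toy.filter(lambda e: e.event_type == "shot" and e.result == "goal")
print(len(goals.events), len(toy.events))
print(goals.find_all(is_pass))
print(goals.find(is_pass))
print(toy.find(lambda e: e.event_type == "shot"))
```

Because filter returns a dataset, calls can be chained on its result, while find_all and find hand back events directly.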
Event
At the Event level there are also new methods for navigating: prev and next. These let you quickly find the previous or next event. Both methods also accept the same filter argument as the Dataset methods, which makes it easy to find a certain type of event instead of just the one directly before or after.
Let's have a look at how this works.
# Load a Statsbomb open data dataset
dataset = statsbomb.load_open_data()
first_goal = dataset.find('shot.goal')
first_goal
<StatsBombShotEvent event_id='4c7c4ab1-6b9f-4504-a237-249c2e0c549f' time='P2T18:12' player='Lionel Andrés Messi Cuccittini' result='GOAL'>
# Let's look at the previous and next events
print(first_goal.prev())
print(first_goal)
print(first_goal.next())
<GenericEvent:Foul Won event_id='eed04441-624f-4f23-9843-7bd069c16232' time='P2T17:02' player='Lionel Andrés Messi Cuccittini' result='None'>
<StatsBombShotEvent event_id='4c7c4ab1-6b9f-4504-a237-249c2e0c549f' time='P2T18:12' player='Lionel Andrés Messi Cuccittini' result='GOAL'>
<GenericEvent:Goal Keeper event_id='5080ad86-383c-40c5-b718-508d8c9be454' time='P2T18:13' player='Fernando Pacheco Flores' result='None'>
But what if we want to find the last complete pass before the goal?
first_goal.prev('pass.complete')
<StatsBombPassEvent event_id='df4a42e4-e5d3-4573-853a-604e46a588d4' time='P2T16:58' player='Ivan Rakitić' result='COMPLETE'>
Or, when we don't care about the event type but want to make sure the result is complete:
first_goal.prev('.complete')
<CarryEvent event_id='95bded73-4861-4374-99cf-2a278ff07ea6' time='P2T17:02' player='Lionel Andrés Messi Cuccittini' result='COMPLETE'>
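The behaviour of prev and next with and without a filter can be pictured as a walk over the ordered event list. The sketch below uses plain dicts as hypothetical stand-ins for events, and prev_event and next_event are illustrative names. Returning None when nothing matches is an assumption of this sketch, not something the outputs above confirm for kloppy.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for an event: a dict with "type" and "result" keys.
Event = dict
Predicate = Callable[[Event], bool]


def prev_event(events: List[Event], index: int,
               predicate: Optional[Predicate] = None) -> Optional[Event]:
    """Walk backwards from events[index] and return the first match."""
    for candidate in reversed(events[:index]):
        if predicate is None or predicate(candidate):
            return candidate
    return None  # assumption: no match yields None


def next_event(events: List[Event], index: int,
               predicate: Optional[Predicate] = None) -> Optional[Event]:
    """Walk forwards from events[index] and return the first match."""
    for candidate in events[index + 1:]:
        if predicate is None or predicate(candidate):
            return candidate
    return None  # assumption: no match yields None


events = [
    {"type": "pass", "result": "complete"},
    {"type": "carry", "result": "complete"},
    {"type": "foul_won", "result": None},
    {"type": "shot", "result": "goal"},
    {"type": "goalkeeper", "result": None},
]
goal_index = 3

# Without a filter: the directly adjacent events.
print(prev_event(events, goal_index)["type"])  # foul_won
print(next_event(events, goal_index)["type"])  # goalkeeper

# With a filter: the nearest earlier event that matches.
print(prev_event(events, goal_index, lambda e: e["result"] == "complete")["type"])  # carry
print(prev_event(events, goal_index, lambda e: e["type"] == "pass")["type"])  # pass
```

As in the real output above, filtering on the result alone finds the nearest complete event of any type, which here is the carry rather than the pass.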
Related events
Some vendors include related_events in their data. The related events can be accessed via the get_related_events method, or via related_pass, related_carry, etc. for each event type.
The get_related_events method returns a list, which can be empty. The related_pass-style methods return an Event, or None when there is no related event of that type.
carry_event = first_goal.prev('carry')
carry_event.get_related_events()
[<TakeOnEvent event_id='a1b51bfb-9198-4180-966a-91937f399d2d' time='P2T17:02' player='Lionel Andrés Messi Cuccittini' result='COMPLETE'>, <GenericEvent:Dribbled Past event_id='a1b860e4-71b4-4366-b1bf-7290a82a380f' time='P2T17:02' player='Rubén Duarte Sánchez' result='None'>, <FoulCommittedEvent event_id='e44eea88-d7ae-4806-b322-04134755e187' time='P2T17:02' player='Daniel Alejandro Torres Rojas' result='None'>, <GenericEvent:Foul Won event_id='eed04441-624f-4f23-9843-7bd069c16232' time='P2T17:02' player='Lionel Andrés Messi Cuccittini' result='None'>]
print(carry_event.related_pass())
None
print(carry_event.related_take_on())
<TakeOnEvent event_id='a1b51bfb-9198-4180-966a-91937f399d2d' time='P2T17:02' player='Lionel Andrés Messi Cuccittini' result='COMPLETE'>
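The difference between the list-returning method and the per-type accessors can be sketched as follows. ToyEvent, get_related_events and related_of_type are hypothetical names for this illustration. Storing related ids and resolving them through a lookup table is an assumption, not a description of kloppy's internals.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyEvent:
    # Hypothetical event that stores the ids of its related events.
    event_id: str
    event_type: str
    related_event_ids: List[str] = field(default_factory=list)


def get_related_events(event: ToyEvent, by_id: Dict[str, ToyEvent]) -> List[ToyEvent]:
    # Resolve ids to events; the list is empty when nothing is related.
    return [by_id[i] for i in event.related_event_ids if i in by_id]


def related_of_type(event: ToyEvent, by_id: Dict[str, ToyEvent],
                    event_type: str) -> Optional[ToyEvent]:
    # First related event of the requested type, or None when there is none.
    related = get_related_events(event, by_id)
    return next((e for e in related if e.event_type == event_type), None)


take_on = ToyEvent("a1", "take_on")
foul_won = ToyEvent("b2", "foul_won")
carry = ToyEvent("c3", "carry", related_event_ids=["a1", "b2"])
by_id = {e.event_id: e for e in (take_on, foul_won, carry)}

print([e.event_type for e in get_related_events(carry, by_id)])
print(related_of_type(carry, by_id, "pass"))
print(related_of_type(carry, by_id, "take_on").event_id)
```

The per-type accessor is a convenience over the full list: it saves a manual loop when only one kind of related event is of interest.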