pip install nba_api
Successfully installed nba_api-1.5.2 numpy-1.26.4 requests-2.32.3
API NBA - Pandas
Case Study
- We’ll use the API provided for the NBA to determine how well the Golden State Warriors performed against the Toronto Raptors
- Determine the number of points by which they won or lost each game
- If they won by 3 the value will be 3
- If they lost by 2 the value will be -2
- All that’s required to access the API is an id
- Import the module teams
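The sign convention described above can be sketched with a small helper. The function name `point_differential` and the scores are hypothetical, chosen only to mirror the "won by 3" and "lost by 2" cases; the API itself reports this value in a column, so this is purely an illustration.

```python
def point_differential(team_points: int, opponent_points: int) -> int:
    """Return the margin from the team's point of view (positive means a win)."""
    return team_points - opponent_points


# Hypothetical final scores, for illustration only
print(point_differential(110, 107))  # 3  (won by 3)
print(point_differential(98, 100))   # -2 (lost by 2)
```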
Setup
- First install nba_api with pip from a terminal (the command prompt in my case)
library(reticulate)
from nba_api.stats.static import teams
from nba_api.stats.endpoints import leaguegamefinder
import requests
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Data
- NBA API is found here: https://pypi.org/project/nba_api/
Get Teams
- The method get_teams() returns a list of dictionaries
- The dictionary key id has a unique identifier for each team as a value
- Let’s look at the first 3 elements of the list
# Get the list of all the teams
nba_teams = teams.get_teams()
# Let's look at the first 3
nba_teams[0:3]
[{'id': 1610612737, 'full_name': 'Atlanta Hawks', 'abbreviation': 'ATL', 'nickname': 'Hawks', 'city': 'Atlanta', 'state': 'Georgia', 'year_founded': 1949}, {'id': 1610612738, 'full_name': 'Boston Celtics', 'abbreviation': 'BOS', 'nickname': 'Celtics', 'city': 'Boston', 'state': 'Massachusetts', 'year_founded': 1946}, {'id': 1610612739, 'full_name': 'Cleveland Cavaliers', 'abbreviation': 'CLE', 'nickname': 'Cavaliers', 'city': 'Cleveland', 'state': 'Ohio', 'year_founded': 1970}]
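Because the result is a plain list of dictionaries, a single team can also be looked up without any conversion. The sketch below copies the three entries from the output above; `find_team` is a hypothetical helper written for this example, not part of nba_api.

```python
# The first three entries, copied from the output above
sample_teams = [
    {'id': 1610612737, 'full_name': 'Atlanta Hawks', 'abbreviation': 'ATL',
     'nickname': 'Hawks', 'city': 'Atlanta', 'state': 'Georgia', 'year_founded': 1949},
    {'id': 1610612738, 'full_name': 'Boston Celtics', 'abbreviation': 'BOS',
     'nickname': 'Celtics', 'city': 'Boston', 'state': 'Massachusetts', 'year_founded': 1946},
    {'id': 1610612739, 'full_name': 'Cleveland Cavaliers', 'abbreviation': 'CLE',
     'nickname': 'Cavaliers', 'city': 'Cleveland', 'state': 'Ohio', 'year_founded': 1970},
]


def find_team(team_list, nickname):
    """Return the first team dict whose 'nickname' matches, or None if absent."""
    return next((team for team in team_list if team['nickname'] == nickname), None)


celtics = find_team(sample_teams, 'Celtics')
print(celtics['id'])                        # 1610612738
print(find_team(sample_teams, 'Warriors'))  # None: not in this 3-team sample
```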
Convert to DF
- To make it easier to read, let’s convert the list to a dictionary, then convert the dict to a df
- Let’s create a function that converts the list to a dictionary
- Name the function: one_dict()
Create one_dict()
- keys=list_dict[0].keys() takes the first element in nba_teams and extracts all of its keys
- out_dict={key:[] for key in keys} creates a new dict with those keys, each mapped to an empty list
- The for loop goes through every key and value in each element of nba_teams and appends the value to the list stored under the matching key in the new dictionary
- Return the newly created dict
def one_dict(list_dict):
    # Collect the keys from the first dictionary in the list
    keys = list_dict[0].keys()
    print("This is the value of keys printed from inside def one_dict - keys =\n", keys)
    # Start every key with an empty list
    out_dict = {key: [] for key in keys}
    print("This is the value of out_dict printed from inside def one_dict - out_dict =\n", out_dict)
    # Append each team's values under the matching key
    for dict_ in list_dict:
        for key, value in dict_.items():
            out_dict[key].append(value)
    return out_dict
Convert list to dict
- convert the list of teams: nba_teams to a dictionary using the function we created
# use the function to convert the list to dict
dict_nba_team = one_dict(nba_teams)
This is the value of keys printed from inside def one_dict - keys =
dict_keys(['id', 'full_name', 'abbreviation', 'nickname', 'city', 'state', 'year_founded'])
This is the value of out_dict printed from inside def one_dict - out_dict =
{'id': [], 'full_name': [], 'abbreviation': [], 'nickname': [], 'city': [], 'state': [], 'year_founded': []}
# Let's look at the keys of the dictionary
list(dict_nba_team)
['id', 'full_name', 'abbreviation', 'nickname', 'city', 'state', 'year_founded']
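To see the shape of the result more directly, here is a minimal sketch of the same conversion run on a made-up two-team list. The name `one_dict_quiet` and the ids 1 and 2 are hypothetical; the logic follows one_dict() but leaves out the debugging prints.

```python
def one_dict_quiet(list_dict):
    """Turn a list of same-keyed dicts into one dict of lists (no debug prints)."""
    out_dict = {key: [] for key in list_dict[0]}
    for dict_ in list_dict:
        for key, value in dict_.items():
            out_dict[key].append(value)
    return out_dict


# Made-up records, for illustration only
tiny = [
    {'id': 1, 'nickname': 'Hawks'},
    {'id': 2, 'nickname': 'Celtics'},
]
print(one_dict_quiet(tiny))
# {'id': [1, 2], 'nickname': ['Hawks', 'Celtics']}
```

Each key ends up holding one list with an entry per team, which is the column-oriented layout that pd.DataFrame accepts.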
Convert dict to df
df_teams = pd.DataFrame(dict_nba_team)
df_teams.head(5)
id full_name ... state year_founded
0 1610612737 Atlanta Hawks ... Georgia 1949
1 1610612738 Boston Celtics ... Massachusetts 1946
2 1610612739 Cleveland Cavaliers ... Ohio 1970
3 1610612740 New Orleans Pelicans ... Louisiana 2002
4 1610612741 Chicago Bulls ... Illinois 1966
[5 rows x 7 columns]
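As an aside, pandas can also build a DataFrame straight from a list of dictionaries, so the intermediate dict is not strictly required. This is an alternative to the walkthrough's approach, shown on two trimmed records taken from the output above; `df_direct` is a name introduced here for illustration.

```python
import pandas as pd

# Two records trimmed to three keys each, values taken from the output above
sample_teams = [
    {'id': 1610612737, 'nickname': 'Hawks', 'year_founded': 1949},
    {'id': 1610612738, 'nickname': 'Celtics', 'year_founded': 1946},
]

# Each dict becomes a row; the dict keys become the column names
df_direct = pd.DataFrame(sample_teams)
print(df_direct.columns.tolist())  # ['id', 'nickname', 'year_founded']
print(df_direct.shape)             # (2, 3)
```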
Search and Filter
- Let’s search the ‘nickname’ column for “Warriors”
df_warriors = df_teams[df_teams['nickname'] == 'Warriors']
df_warriors
id full_name ... state year_founded
7 1610612744 Golden State Warriors ... California 1946
[1 rows x 7 columns]
Extract ID
- Now we can extract the ID of the team and use it to request information from the API
id_warriors = df_warriors['id']
id_warriors
7 1610612744
Name: id, dtype: int64
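Note that id_warriors is a one-element pandas Series (row label 7 plus the value), not a bare integer. The walkthrough uses the Series as is; if a plain int is ever preferred, a sketch like the following pulls it out. The two-row frame is a stand-in built for this example, and `id_series` and `id_scalar` are hypothetical names.

```python
import pandas as pd

# Stand-in for df_teams with just two rows (ids taken from the outputs above)
df_teams_sample = pd.DataFrame({
    'id': [1610612737, 1610612744],
    'nickname': ['Hawks', 'Warriors'],
})

# Boolean filtering keeps the result as a Series, even with a single match
id_series = df_teams_sample[df_teams_sample['nickname'] == 'Warriors']['id']

# .iloc[0] selects the first (here, only) value by position
id_scalar = int(id_series.iloc[0])

print(type(id_series).__name__)  # Series
print(id_scalar)                 # 1610612744
```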
API Call
Now that we have the id for the team, let’s call the API and pass it that id
- The function “League Game Finder”, which we imported earlier, will make the API call
- It’s in the leaguegamefinder module imported during setup
- The parameter team_id_nullable is the unique ID for the Warriors
- Under the hood, the NBA API makes an HTTP request
- The requested information comes back in an HTTP response, which is assigned to the object gamefinder
gamefinder = leaguegamefinder.LeagueGameFinder(team_id_nullable=id_warriors)
Review the Request
# It is long so let's not run it
gamefinder.get_json()
Review the Response
- The gamefinder object has a method get_data_frames() that returns a list of dataframes; the first element holds the game data
- If we view the dataframe, we can see it contains information about all the games the Warriors played
- The PLUS_MINUS column contains the score differential: if the value is negative, the Warriors lost by that many points; if it is positive, they won by that many points
- The MATCHUP column shows who the Warriors were playing: GSW stands for Golden State Warriors and TOR means Toronto Raptors
- vs. signifies a home game and the @ symbol means an away game
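The MATCHUP convention can be made concrete with a small parser. This is a sketch that assumes the two formats seen in this walkthrough, 'GSW vs. TOR' and 'GSW @ TOR'; `parse_matchup` is a hypothetical helper, not something nba_api provides.

```python
def parse_matchup(matchup: str):
    """Split a MATCHUP string into (team, opponent, is_home)."""
    if ' vs. ' in matchup:
        team, opponent = matchup.split(' vs. ')
        return team, opponent, True   # 'vs.' marks a home game
    if ' @ ' in matchup:
        team, opponent = matchup.split(' @ ')
        return team, opponent, False  # '@' marks an away game
    raise ValueError(f"Unrecognized MATCHUP format: {matchup!r}")


print(parse_matchup('GSW vs. TOR'))  # ('GSW', 'TOR', True)
print(parse_matchup('GSW @ TOR'))    # ('GSW', 'TOR', False)
```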
games = gamefinder.get_data_frames()[0]
games.head()
SEASON_ID TEAM_ID TEAM_ABBREVIATION ... TOV PF PLUS_MINUS
0 22024 1610612744 GSW ... 7 20 18.0
1 22024 1610612744 GSW ... 21 22 -8.0
2 22024 1610612744 GSW ... 13 25 41.0
3 22024 1610612744 GSW ... 17 27 36.0
4 12024 1610612744 GSW ... 13 14 58.0
[5 rows x 28 columns]
games.columns
Index(['SEASON_ID', 'TEAM_ID', 'TEAM_ABBREVIATION', 'TEAM_NAME', 'GAME_ID',
'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'PTS', 'FGM', 'FGA', 'FG_PCT',
'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB',
'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PLUS_MINUS'],
dtype='object')
Create df for Home
- Let’s create two dataframes: one for games against the Raptors at home
- And one for games against them on the road
games_home = games[games['MATCHUP'] == 'GSW vs. TOR']
games_away = games[games['MATCHUP'] == 'GSW @ TOR']
games_home.head(3)
SEASON_ID TEAM_ID TEAM_ABBREVIATION ... TOV PF PLUS_MINUS
65 22023 1610612744 GSW ... 10 11 -15.0
159 22022 1610612744 GSW ... 12 18 12.0
308 22021 1610612744 GSW ... 16 18 15.0
[3 rows x 28 columns]
games_away.head(3)
SEASON_ID TEAM_ID TEAM_ABBREVIATION ... TOV PF PLUS_MINUS
42 22023 1610612744 GSW ... 14 14 15.0
106 22023 1610612744 GSW ... 21 29 1.2
177 22022 1610612744 GSW ... 17 21 16.0
[3 rows x 28 columns]
Average Differential
- To find the average point differential across all the home games and all the away games returned, take the mean of PLUS_MINUS for each dataframe
games_home['PLUS_MINUS'].mean()
3.375
games_away['PLUS_MINUS'].mean()
-1.7212121212121212
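For a sense of how such an average comes about, here is the same calculation done by hand on only the three home-game PLUS_MINUS values displayed earlier (-15.0, 12.0, 15.0). This is a partial sample, so the result differs from the full-data mean above; `home_sample` is a name introduced for this sketch.

```python
from statistics import mean

# PLUS_MINUS for the three home games shown in the games_home output above
home_sample = [-15.0, 12.0, 15.0]

# One loss by 15 and wins by 12 and 15 net out to +12 over 3 games
avg = mean(home_sample)
print(avg)  # 4.0
```

A positive mean says the Warriors outscored the Raptors on average over those games, while a negative mean says the opposite.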
Plot + and -
fig, ax = plt.subplots()

# Draw both series on the same axes; the legend follows the plotting order
games_away.plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax)
games_home.plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax)
ax.legend(["away", "home"])
plt.show()
Average Points Scored
games_home['PTS'].mean()
107.875
games_away['PTS'].mean()
103.24242424242425