API NBA - Pandas

Case Study


  • We’ll use the API provided for the NBA to determine how well the Golden State Warriors performed against the Toronto Raptors
  • Determine the number of points by which they won or lost each game
  • A win by 3 points is recorded as 3
  • A loss by 2 points is recorded as -2
  • All that’s required to query the API for a team is that team’s id
  • The teams module, imported below, provides those ids
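The sign convention above can be sketched in a few lines. The function name point_differential and the scores are made up for illustration; they do not come from the API.

```python
def point_differential(team_points: int, opponent_points: int) -> int:
    """Return the margin from the team's point of view (positive means a win)."""
    return team_points - opponent_points


# Made-up final scores, purely for illustration
win_margin = point_differential(110, 107)
loss_margin = point_differential(98, 100)
print(win_margin, loss_margin)
```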

Setup


  • nba_api has to be installed with pip first (I ran this from the command prompt), so let’s do that
pip install nba_api
Successfully installed nba_api-1.5.2 numpy-1.26.4 requests-2.32.3
library(reticulate)
from nba_api.stats.static import teams
from nba_api.stats.endpoints import leaguegamefinder
import requests

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Data


Get Teams

  • The method get_teams() returns a list of dictionaries
  • The dictionary key id has a unique identifier for each team as a value
  • Let’s look at the first 3 elements of the list
# Get the list of all the teams
nba_teams = teams.get_teams()

# Let's look at the first 3
nba_teams[0:3]
[{'id': 1610612737, 'full_name': 'Atlanta Hawks', 'abbreviation': 'ATL', 'nickname': 'Hawks', 'city': 'Atlanta', 'state': 'Georgia', 'year_founded': 1949}, {'id': 1610612738, 'full_name': 'Boston Celtics', 'abbreviation': 'BOS', 'nickname': 'Celtics', 'city': 'Boston', 'state': 'Massachusetts', 'year_founded': 1946}, {'id': 1610612739, 'full_name': 'Cleveland Cavaliers', 'abbreviation': 'CLE', 'nickname': 'Cavaliers', 'city': 'Cleveland', 'state': 'Ohio', 'year_founded': 1970}]

Convert to DF

  • To make the data easier to read, let’s convert the list to a dictionary, then convert the dictionary to a df
  • Let’s create a function, one_dict(), that converts the list to a dictionary

Create one_dict()

  • keys=list_dict[0].keys() takes the first element of nba_teams and extracts all of its keys, giving us the key names
  • out_dict={key:[] for key in keys} creates a new dict that maps each of those keys to an empty list []
  • The for loop goes through every key and value in each element of nba_teams and appends the value to the list stored under the matching key in the new dictionary
  • The function returns the newly created dict
def one_dict(list_dict):
    keys=list_dict[0].keys()
    print("This is the value of keys printed from inside def one_dict - keys =\n",keys)
    out_dict={key:[] for key in keys}
    print("This is the value of out_dict printed from inside def one_dict - out_dict =\n",out_dict)
    for dict_ in list_dict:
        for key, value in dict_.items():
            out_dict[key].append(value)
    return out_dict
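
To see the conversion on something small, here is a minimal sketch of the same idea without the debug prints, run on two entries taken from the output above. The name list_of_dicts_to_dict_of_lists and the empty-list guard are my own additions, not part of the original function.

```python
def list_of_dicts_to_dict_of_lists(records):
    """Turn [{'a': 1}, {'a': 2}] into {'a': [1, 2]}."""
    if not records:
        # Guard added for illustration: an empty list has no first element
        return {}
    columns = {key: [] for key in records[0]}
    for record in records:
        for key, value in record.items():
            columns[key].append(value)
    return columns


# Two trimmed-down entries from the teams list shown earlier
sample_teams = [
    {"id": 1610612737, "nickname": "Hawks"},
    {"id": 1610612738, "nickname": "Celtics"},
]
columns = list_of_dicts_to_dict_of_lists(sample_teams)
print(columns)
```

Like one_dict(), this sketch assumes every dictionary in the list has the same keys as the first one.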

Convert list to dict

  • Convert the list of teams, nba_teams, to a dictionary using the function we created
# use the function to convert the list to dict
dict_nba_team = one_dict(nba_teams)
This is the value of keys printed from inside def one_dict - keys =
 dict_keys(['id', 'full_name', 'abbreviation', 'nickname', 'city', 'state', 'year_founded'])
This is the value of out_dict printed from inside def one_dict - out_dict =
 {'id': [], 'full_name': [], 'abbreviation': [], 'nickname': [], 'city': [], 'state': [], 'year_founded': []}
# Let's look at the keys of the dictionary
list(dict_nba_team)
['id', 'full_name', 'abbreviation', 'nickname', 'city', 'state', 'year_founded']

Convert dict to df

df_teams = pd.DataFrame(dict_nba_team)
df_teams.head(5)
           id             full_name  ...          state year_founded
0  1610612737         Atlanta Hawks  ...        Georgia         1949
1  1610612738        Boston Celtics  ...  Massachusetts         1946
2  1610612739   Cleveland Cavaliers  ...           Ohio         1970
3  1610612740  New Orleans Pelicans  ...      Louisiana         2002
4  1610612741         Chicago Bulls  ...       Illinois         1966

[5 rows x 7 columns]
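
As a side note, pd.DataFrame also accepts a list of dictionaries directly, so the intermediate dictionary is mostly useful as an exercise. A small sketch on two trimmed-down entries from the teams list (the variable names here are only for illustration):

```python
import pandas as pd

# Two trimmed-down entries from the teams list shown earlier
sample_teams = [
    {"id": 1610612737, "full_name": "Atlanta Hawks", "nickname": "Hawks"},
    {"id": 1610612738, "full_name": "Boston Celtics", "nickname": "Celtics"},
]

# Each dictionary becomes a row and each key becomes a column
df_sample = pd.DataFrame(sample_teams)
print(df_sample)
```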

Search and Filter

  • Let’s search the ‘nickname’ column for “Warriors”
df_warriors = df_teams[df_teams['nickname'] == 'Warriors']
df_warriors
           id              full_name  ...       state year_founded
7  1610612744  Golden State Warriors  ...  California         1946

[1 rows x 7 columns]

Extract ID

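  • Now we can extract the ID of the team and use it to request information from the API
  • Note that selecting the column this way returns a one-element Series (index 7), not a plain integer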
id_warriors = df_warriors['id']
id_warriors
7    1610612744
Name: id, dtype: int64
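
If a plain integer is preferred over a one-element Series, one option is to pull the value out with .iloc[0]. This is a sketch on a small stand-in frame rather than the real df_teams, and the check for exactly one match is my own addition.

```python
import pandas as pd

# Stand-in for df_teams with two rows taken from the output above
df_sample = pd.DataFrame(
    {"id": [1610612737, 1610612744], "nickname": ["Hawks", "Warriors"]}
)

# Select the id column for the matching row(s)
matches = df_sample.loc[df_sample["nickname"] == "Warriors", "id"]
if len(matches) != 1:
    raise ValueError(f"expected exactly one match, found {len(matches)}")

# Convert the NumPy integer to a built-in int
team_id = int(matches.iloc[0])
print(team_id)
```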

API Call


Now that we have the id for the team, let’s call the API and pass it that id

  • The class LeagueGameFinder, from the leaguegamefinder module we imported earlier, makes the API call
  • The parameter team_id_nullable takes the unique ID for the Warriors
  • Under the hood, nba_api makes an HTTP request
  • The requested information comes back in an HTTP response, which is assigned to the object gamefinder
gamefinder = leaguegamefinder.LeagueGameFinder(team_id_nullable=id_warriors)

Review the Request

# Returns the response as JSON; the output is long, so let's not run it
gamefinder.get_json()
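
To give a rough idea of what such JSON might look like, here is a sketch with a tiny hand-written payload. The layout (a resultSets list whose entries carry headers and rowSet) is an assumption on my part, and the rows are made up, so check the real output before relying on it.

```python
import json

# Hypothetical, heavily trimmed payload; the field names are an assumption
# and the rows are made up for illustration.
sample_json = json.dumps({
    "resultSets": [{
        "name": "LeagueGameFinderResults",
        "headers": ["TEAM_ABBREVIATION", "MATCHUP", "PLUS_MINUS"],
        "rowSet": [
            ["GSW", "GSW vs. TOR", 12.0],
            ["GSW", "GSW @ TOR", 16.0],
        ],
    }]
})

payload = json.loads(sample_json)
result_set = payload["resultSets"][0]

# Pair each row's values with the column headers
rows = [dict(zip(result_set["headers"], row)) for row in result_set["rowSet"]]
print(rows)
```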

Review the Response

  • The gamefinder object has a method get_data_frames() that returns a list of dataframes; the first element holds the game data
  • If we view that dataframe, we can see it contains information about all the games the Warriors played
  • The PLUS_MINUS column contains the point differential: if the value is negative, the Warriors lost by that many points; if it is positive, the Warriors won by that many points
  • The MATCHUP column shows the two teams in each game: GSW stands for Golden State Warriors and TOR stands for Toronto Raptors
  • vs. signifies a home game and the @ symbol signifies an away game
games = gamefinder.get_data_frames()[0]
games.head()
  SEASON_ID     TEAM_ID TEAM_ABBREVIATION  ... TOV  PF PLUS_MINUS
0     22024  1610612744               GSW  ...   7  20       18.0
1     22024  1610612744               GSW  ...  21  22       -8.0
2     22024  1610612744               GSW  ...  13  25       41.0
3     22024  1610612744               GSW  ...  17  27       36.0
4     12024  1610612744               GSW  ...  13  14       58.0

[5 rows x 28 columns]
games.columns
Index(['SEASON_ID', 'TEAM_ID', 'TEAM_ABBREVIATION', 'TEAM_NAME', 'GAME_ID',
       'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'PTS', 'FGM', 'FGA', 'FG_PCT',
       'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB',
       'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PLUS_MINUS'],
      dtype='object')
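
The MATCHUP convention described above can be sketched as a small parser. The function parse_matchup is hypothetical (it is not part of nba_api) and assumes the two formats "AAA vs. BBB" and "AAA @ BBB".

```python
def parse_matchup(matchup: str):
    """Split a MATCHUP string into (team, opponent, is_home)."""
    if " vs. " in matchup:
        team, opponent = matchup.split(" vs. ")
        return team, opponent, True
    if " @ " in matchup:
        team, opponent = matchup.split(" @ ")
        return team, opponent, False
    raise ValueError(f"unrecognized MATCHUP format: {matchup!r}")


home = parse_matchup("GSW vs. TOR")
away = parse_matchup("GSW @ TOR")
print(home, away)
```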

Create df for Home and Away

  • Let’s create 2 dfs: one for home games against the Raptors and one for games against them on the road
games_home =  games[games['MATCHUP'] == 'GSW vs. TOR']
games_away =  games[games['MATCHUP'] == 'GSW @ TOR']
games_home.head(3)
    SEASON_ID     TEAM_ID TEAM_ABBREVIATION  ... TOV  PF PLUS_MINUS
65      22023  1610612744               GSW  ...  10  11      -15.0
159     22022  1610612744               GSW  ...  12  18       12.0
308     22021  1610612744               GSW  ...  16  18       15.0

[3 rows x 28 columns]
games_away.head(3)
    SEASON_ID     TEAM_ID TEAM_ABBREVIATION  ... TOV  PF PLUS_MINUS
42      22023  1610612744               GSW  ...  14  14       15.0
106     22023  1610612744               GSW  ...  21  29        1.2
177     22022  1610612744               GSW  ...  17  21       16.0

[3 rows x 28 columns]

Average Differential

  • To find the average point differential across all of the home games and all of the away games against the Raptors, take the mean of PLUS_MINUS
games_home['PLUS_MINUS'].mean()
3.375
games_away['PLUS_MINUS'].mean()
-1.7212121212121212
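
The same kind of summary can be produced in one pass with groupby instead of two separate dataframes. This is a sketch on made-up rows, not real game results; the VENUE column is a helper I am adding for illustration.

```python
import pandas as pd

# Made-up rows for illustration only; these are not real game results
sample_games = pd.DataFrame({
    "MATCHUP": ["GSW vs. TOR", "GSW vs. TOR", "GSW @ TOR", "GSW @ TOR"],
    "PLUS_MINUS": [12.0, -4.0, 6.0, -10.0],
    "PTS": [110, 100, 105, 95],
})

# Label each row as home or away based on the MATCHUP text
is_home = sample_games["MATCHUP"].str.contains("vs.", regex=False)
sample_games["VENUE"] = is_home.map({True: "home", False: "away"})

# Average the differential and the points scored per venue
summary = sample_games.groupby("VENUE")[["PLUS_MINUS", "PTS"]].mean()
print(summary)
```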

Plot + and -

fig, ax = plt.subplots()

games_away.plot(x='GAME_DATE',y='PLUS_MINUS', ax=ax)
games_home.plot(x='GAME_DATE',y='PLUS_MINUS', ax=ax)
ax.legend(["away", "home"])
plt.show()
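
One thing that may help the plot: if GAME_DATE arrives as text, converting it to real dates and sorting puts the games in chronological order on the x-axis. The sketch below assumes dates formatted like YYYY-MM-DD and uses made-up rows, so verify the format against the real dataframe first.

```python
import pandas as pd

# Made-up rows; the date format is an assumption about the API output
sample_games = pd.DataFrame({
    "GAME_DATE": ["2024-01-07", "2023-01-27", "2023-12-18"],
    "PLUS_MINUS": [-15.0, 12.0, 3.0],
})

# Parse the text into datetime values, then order the games oldest first
sample_games["GAME_DATE"] = pd.to_datetime(sample_games["GAME_DATE"])
sample_sorted = sample_games.sort_values("GAME_DATE").reset_index(drop=True)
print(sample_sorted)
```

A frame prepared this way could then be passed to the same .plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax) calls used above.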

Average Points Scored

games_home['PTS'].mean()
107.875
games_away['PTS'].mean()
103.24242424242425