NHL API

In this tutorial we will use the NHL API to practice pulling data from an API and formatting it for a downstream process.

The NHL API appears to be undocumented, but some dedicated users have published most of its functionality. Below are links to the most helpful sources I found.

https://gitlab.com/dword4/nhlapi
https://gitlab.com/dword4/nhlapi/-/blob/master/records-api.md

Endpoint examples

https://records.nhl.com/records/skater-records/goals/skater-most-goals-one-season
https://api.nhle.com/stats/rest/en/franchise?sort=fullName&include=lastSeason.id&include=firstSeason.id

https://api.nhle.com/stats/rest/en/skater/summary?isAggregate=false&isGame=false&sort=%5B%7B%22property%22:%22points%22,%22direction%22:%22DESC%22%7D,%7B%22property%22:%22goals%22,%22direction%22:%22DESC%22%7D,%7B%22property%22:%22assists%22,%22direction%22:%22DESC%22%7D%5D&start=0&limit=50&factCayenneExp=gamesPlayed%3E=1&cayenneExp=franchiseId%3D32%20and%20gameTypeId=2%20and%20seasonId%3C=20192020%20and%20seasonId%3E=20192020
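The last URL above is percent-encoded, which makes its filters hard to read. As a small sketch using only the standard library (the variable names are illustrative and not part of the API), the query string can be decoded to show the sort order and the filter expressions it carries:

```python
import json
from urllib.parse import urlsplit, parse_qs

url = (
    "https://api.nhle.com/stats/rest/en/skater/summary"
    "?isAggregate=false&isGame=false"
    "&sort=%5B%7B%22property%22:%22points%22,%22direction%22:%22DESC%22%7D,"
    "%7B%22property%22:%22goals%22,%22direction%22:%22DESC%22%7D,"
    "%7B%22property%22:%22assists%22,%22direction%22:%22DESC%22%7D%5D"
    "&start=0&limit=50&factCayenneExp=gamesPlayed%3E=1"
    "&cayenneExp=franchiseId%3D32%20and%20gameTypeId=2%20and%20"
    "seasonId%3C=20192020%20and%20seasonId%3E=20192020"
)

# parse_qs percent-decodes every value and returns a dict of lists
query = parse_qs(urlsplit(url).query)

# the sort parameter is itself a JSON list of {property, direction} objects
sort_spec = json.loads(query["sort"][0])

print(sort_spec)
print(query["factCayenneExp"][0])
print(query["cayenneExp"][0])
```

Reading the decoded values, the request asks for skaters of franchise 32 in the 2019-2020 regular season, sorted by points, then goals, then assists.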

Before we get started let’s load the following libraries.

import requests
import json
import pandas as pd

Get Team Info

The first endpoint we will work with gives us information about the teams:

https://statsapi.web.nhl.com/api/v1/teams

Get Request

teams_url = "https://statsapi.web.nhl.com/api/v1/teams"

team_response = requests.get(teams_url)

# check that the response is valid
print(team_response.status_code)

API Status Code

200: Everything went okay, and the result has been returned (if any).
301: The server is redirecting you to a different endpoint. This can happen when a company switches domain names, or an endpoint name is changed.
400: The server thinks you made a bad request. This can happen when you don’t send along the right data, among other things.
401: The server thinks you’re not authenticated. Many APIs require login credentials, so this happens when you don’t send the right credentials to access an API.
403: The resource you’re trying to access is forbidden: you don’t have the right permissions to see it.
404: The resource you tried to access wasn’t found on the server.
503: The server is not ready to handle the request.
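The codes listed above fall into families identified by their first digit. As a rough sketch, the standard library's `http.HTTPStatus` can translate a code into its official phrase; `describe_status` is a hypothetical helper written for this tutorial, not part of `requests`:

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description of an HTTP status code."""
    families = {2: "success", 3: "redirect", 4: "client error", 5: "server error"}
    phrase = HTTPStatus(code).phrase              # e.g. "Not Found" for 404
    family = families.get(code // 100, "other")   # group by the first digit
    return f"{code} {phrase} ({family})"

for code in (200, 301, 400, 401, 403, 404, 503):
    print(describe_status(code))
```

Note that `HTTPStatus(code)` raises a `ValueError` for codes it does not know, so this helper only suits standard codes.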

With a 200 response code, we are good to start looking at the content.

Pull out the content

To get the contents of our response object, we can use either of two methods:

.content with json.loads
.json()

# convert the content into a python dictionary
team_content = json.loads(team_response.content)
type(team_content)

dict

team_content.keys()

dict_keys(['copyright', 'teams'])

We can now access the contents of the data using dictionary actions.

team_content['teams'][0]

{'id': 1,
 'name': 'New Jersey Devils',
 'link': '/api/v1/teams/1',
 'venue': {'name': 'Prudential Center',
  'link': '/api/v1/venues/null',
  'city': 'Newark',
  'timeZone': {'id': 'America/New_York', 'offset': -5, 'tz': 'EST'}},
 'abbreviation': 'NJD',
 'teamName': 'Devils',
 'locationName': 'New Jersey',
 'firstYearOfPlay': '1982',
 'division': {'id': 18,
  'name': 'Metropolitan',
  'nameShort': 'Metro',
  'link': '/api/v1/divisions/18',
  'abbreviation': 'M'},
 'conference': {'id': 6, 'name': 'Eastern', 'link': '/api/v1/conferences/6'},
 'franchise': {'franchiseId': 23,
  'teamName': 'Devils',
  'link': '/api/v1/franchises/23'},
 'shortName': 'New Jersey',
 'officialSiteUrl': 'http://www.newjerseydevils.com/',
 'franchiseId': 23,
 'active': True}

team_response.json() == json.loads(team_response.content)

True

Convert the content to a DataFrame

#convert the dictionary into a dataframe
df_team_content = pd.DataFrame(team_content['teams'])
df_team_content.head()
#df_team_content.info()

#convert the types of most columns in the dataframe
df_team_content2 = df_team_content.convert_dtypes()
df_team_content2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 15 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   id               32 non-null     Int64  
 1   name             32 non-null     string 
 2   link             32 non-null     string 
 3   venue            32 non-null     object 
 4   abbreviation     32 non-null     string 
 5   teamName         32 non-null     string 
 6   locationName     32 non-null     string 
 7   firstYearOfPlay  32 non-null     string 
 8   division         32 non-null     object 
 9   conference       32 non-null     object 
 10  franchise        32 non-null     object 
 11  shortName        32 non-null     string 
 12  officialSiteUrl  32 non-null     string 
 13  franchiseId      32 non-null     Int64  
 14  active           32 non-null     boolean
dtypes: Int64(2), boolean(1), object(4), string(8)
memory usage: 3.8+ KB

# find the rangers info
df_team_content2.query("teamName == 'Rangers'")

json_normalize for DataFrame Conversion

Instead of passing the dictionary to the DataFrame constructor, we could also use the json_normalize function.

df_team_content_jn = pd.json_normalize(team_content['teams'])
df_team_content_jn.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 29 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   id                     32 non-null     int64  
 1   name                   32 non-null     object 
 2   link                   32 non-null     object 
 3   abbreviation           32 non-null     object 
 4   teamName               32 non-null     object 
 5   locationName           32 non-null     object 
 6   firstYearOfPlay        32 non-null     object 
 7   shortName              32 non-null     object 
 8   officialSiteUrl        32 non-null     object 
 9   franchiseId            32 non-null     int64  
 10  active                 32 non-null     bool   
 11  venue.name             32 non-null     object 
 12  venue.link             32 non-null     object 
 13  venue.city             32 non-null     object 
 14  venue.timeZone.id      32 non-null     object 
 15  venue.timeZone.offset  32 non-null     int64  
 16  venue.timeZone.tz      32 non-null     object 
 17  division.id            32 non-null     int64  
 18  division.name          32 non-null     object 
 19  division.nameShort     32 non-null     object 
 20  division.link          32 non-null     object 
 21  division.abbreviation  32 non-null     object 
 22  conference.id          32 non-null     int64  
 23  conference.name        32 non-null     object 
 24  conference.link        32 non-null     object 
 25  franchise.franchiseId  32 non-null     int64  
 26  franchise.teamName     32 non-null     object 
 27  franchise.link         32 non-null     object 
 28  venue.id               26 non-null     float64
dtypes: bool(1), float64(1), int64(6), object(21)
memory usage: 7.2+ KB

Notice that json_normalize() flattened the nested dictionaries by concatenating their keys.

df_team_content_jn.head()

If the dictionary has multiple levels, the keys from every level are concatenated. For example, consider the dictionary in the venue column.

team_content['teams'][0]['venue']

{'name': 'Prudential Center',
 'link': '/api/v1/venues/null',
 'city': 'Newark',
 'timeZone': {'id': 'America/New_York', 'offset': -5, 'tz': 'EST'}}

Since the value for timeZone is a dict, the parent keys (venue, timeZone) are concatenated with each key within timeZone (id, offset, tz).
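The key concatenation can be reproduced on a small hand-written record. This is a minimal sketch: the record below is a trimmed-down stand-in for one entry of `team_content['teams']`, and the `sep` argument shows how to join keys with a character other than the default dot.

```python
import pandas as pd

# a trimmed-down record shaped like one entry of team_content['teams']
records = [
    {"id": 1,
     "venue": {"city": "Newark",
               "timeZone": {"id": "America/New_York", "tz": "EST"}}}
]

# default behaviour: nested keys are joined with "."
flat = pd.json_normalize(records)
print(list(flat.columns))

# the sep argument changes the character used to join the keys
flat_underscore = pd.json_normalize(records, sep="_")
print(list(flat_underscore.columns))
```

Underscore-separated names are convenient when columns are later referenced in `query()` strings or plotting tools that treat a dot as special.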

df_team_content_jn["venue.timeZone.tz"].head()

0    EST
1    EST
2    EST
3    EST
4    EST
Name: venue.timeZone.tz, dtype: object

We could also apply json_normalize() to a column of dictionaries to return a dataframe. Consider the venue column from our previous df_team_content2 dataframe.

pd.json_normalize(df_team_content2.venue).head()

Now that we have looked at different approaches, let’s use the result from json_normalize() going forward.

df_team_content = df_team_content_jn

Request Roster Data

Now that we have collected the basic team info, let’s move down a level to collect roster information.

Extracting the team ID using regex

# get the rangers link to their team site
rangers_link = df_team_content.query("teamName == 'Rangers'").link.values[0]
rangers_link

'/api/v1/teams/3'

type(rangers_link)

str

# get the rangers team id from the end of the url
import re

pattern = r"\d+$"  # one or more digits at the end of the string

re.findall(pattern, rangers_link)

['3']
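Team ids are not limited to a single digit, so the choice of pattern matters. The sketch below compares `\d$` (exactly one trailing digit) with `\d+$` (all trailing digits) on a hypothetical link for a two-digit team id:

```python
import re

# hypothetical link for a team whose id has two digits
link = "/api/v1/teams/12"

print(re.findall(r"\d$", link))    # only the final digit
print(re.findall(r"\d+$", link))   # every trailing digit

# re.search returns a match object, so the id can be converted directly
team_id = int(re.search(r"\d+$", link).group())
print(team_id)
```

Writing the patterns as raw strings (`r"..."`) also avoids Python's warnings about unrecognized escape sequences such as `\d`.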

#get the base url of the api
teams_url

url_pattern = ".+com"

base_url = re.search(url_pattern, teams_url).group()
base_url

'https://statsapi.web.nhl.com'

#create the url to get the rangers team info
rangers_url = base_url + rangers_link

rangers_url

'https://statsapi.web.nhl.com/api/v1/teams/3'

Add Parameters to Team URL

In order to get the roster data, we need to use either a modifier or the roster endpoint:

?expand=team.roster
/roster appended to the end of the url

We also need to specify a season.

yyyyYYYY format
- yyyy: start year
- YYYY: end year
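Because the end year is always the start year plus one, the season string can be derived from the start year alone. A minimal sketch, where `season_id` is a hypothetical helper name and not something the API provides:

```python
def season_id(start_year):
    """Return the 8-character season string for a season starting in start_year."""
    start_year = int(start_year)   # accept either 2018 or "2018"
    return f"{start_year}{start_year + 1}"

print(season_id(2018))
print(season_id(1999))
```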

# create a url to get the rangers team roster info
params = "?expand=team.roster&season=20182019"
rangers_roster_url = rangers_url + params
rangers_roster_url

'https://statsapi.web.nhl.com/api/v1/teams/3?expand=team.roster&season=20182019'
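Concatenating the query string by hand works, but it becomes error-prone as parameters accumulate. As an alternative sketch, the parameters can be kept in a dictionary and encoded with the standard library's `urlencode`:

```python
from urllib.parse import urlencode

rangers_url = "https://statsapi.web.nhl.com/api/v1/teams/3"

# keep the parameters in a dict so they are easy to change
params = {"expand": "team.roster", "season": "20182019"}

# urlencode joins the pairs with "&" and escapes any special characters
rangers_roster_url = rangers_url + "?" + urlencode(params)
print(rangers_roster_url)
```

`requests.get` also accepts the same dictionary through its `params` argument and builds the query string itself.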

rangers_response = requests.get(rangers_roster_url)
rangers_response.status_code

#convert the content
rangers_content = rangers_response.json()
rangers_content.keys()

dict_keys(['copyright', 'teams'])

rangers_content['teams'][0]['roster']['roster'][1]

{'person': {'id': 8471686,
  'fullName': 'Marc Staal',
  'link': '/api/v1/people/8471686'},
 'jerseyNumber': '18',
 'position': {'code': 'D',
  'name': 'Defenseman',
  'type': 'Defenseman',
  'abbreviation': 'D'}}

Make a function to get team info

Now that the endpoints take parameters, it would be useful to create a function that simplifies building the API URL.

For now, the urls we are creating require a team number and season.

test = 19992000
test = str(test)
pattern = re.compile(r'\d{8}$')  # exactly 8 digits; match() anchors the start, $ the end
bool(pattern.match(test))

True

def create_team_roster_url(team_number, season):
    """Build the roster URL for a team id and an 8-digit season such as 20192020."""
    
    base_url = "https://statsapi.web.nhl.com/api/v1/teams/"
 
    #verify team number makes sense
    if int(team_number) > 33:
        raise ValueError("Please use a correct value.  Team numbers must be less than 33")

    # convert to strings
    team_number = str(team_number) # convert to a string in case a number was supplied
    season = str(season)
    
    #verify the season is exactly 8 digits ($ rejects longer strings)
    pattern = re.compile(r'\d{8}$')
    is_season_formatted = bool(pattern.match(season))
    if not is_season_formatted:
        raise ValueError("You did not provide the season with 8 digits.  Specify start and end season with four digits and no spaces")
     
    # combine the components into one url
    url = base_url + team_number + "/roster/" + "?season=" + season
    return url

create_team_roster_url(3,20192020)

'https://statsapi.web.nhl.com/api/v1/teams/3/roster/?season=20192020'

Checking Errors

create_team_roster_url(3,2019)

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-63-f09f6485bbbd> in <module>
----> 1 create_team_roster_url(3,2019)


<ipython-input-61-803bcea8f4e1> in create_team_roster_url(team_number, season)
     16     is_season_formatted = bool(pattern.match(season))
     17     if not is_season_formatted:
---> 18         raise ValueError("You did not provide the season with 8 digits.  Specify start and end season with four digits and no spaces")
     19 
     20     # combine the components into one url


ValueError: You did not provide the season with 8 digits.  Specify start and end season with four digits and no spaces

create_team_roster_url(100,20192020)

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-64-7641c6e970a0> in <module>
----> 1 create_team_roster_url(100,20192020)


<ipython-input-61-803bcea8f4e1> in create_team_roster_url(team_number, season)
      6     #verify team number makes sense
      7     if int(team_number) > 33:
----> 8         raise ValueError("Please use a correct value.  Team numbers must be less than 33")
      9 
     10     # convert to strings


ValueError: Please use a correct value.  Team numbers must be less than 33

Send Get Request for Rangers Data

# get the rangers roster from 2019-2020
rangers_roster_url = create_team_roster_url(3,20192020) 
rangers_response = requests.get(rangers_roster_url)
rangers_response.status_code

# convert the roster
rangers_roster_content = rangers_response.json()["roster"]
rangers_roster_content[0]

{'person': {'id': 8471686,
  'fullName': 'Marc Staal',
  'link': '/api/v1/people/8471686'},
 'jerseyNumber': '18',
 'position': {'code': 'D',
  'name': 'Defenseman',
  'type': 'Defenseman',
  'abbreviation': 'D'}}

# convert the roster to a dataframe
df_rangers_roster = pd.json_normalize(rangers_roster_content).astype(str)
df_rangers_roster.head()

Stat Types


stat_type_url = "https://statsapi.web.nhl.com/api/v1/statTypes"

response_stat_type = requests.get(stat_type_url)
response_stat_type.status_code

response_stat_type.json()

[{'displayName': 'yearByYear', 'gameType': None},
 {'displayName': 'yearByYearRank', 'gameType': None},
 {'displayName': 'yearByYearPlayoffs', 'gameType': None},
 {'displayName': 'yearByYearPlayoffsRank', 'gameType': None},
 {'displayName': 'careerRegularSeason', 'gameType': None},
 {'displayName': 'careerPlayoffs', 'gameType': None},
 {'displayName': 'gameLog', 'gameType': None},
 {'displayName': 'playoffGameLog', 'gameType': None},
 {'displayName': 'vsTeam', 'gameType': None},
 {'displayName': 'vsTeamPlayoffs', 'gameType': None},
 {'displayName': 'vsDivision', 'gameType': None},
 {'displayName': 'vsDivisionPlayoffs', 'gameType': None},
 {'displayName': 'vsConference', 'gameType': None},
 {'displayName': 'vsConferencePlayoffs', 'gameType': None},
 {'displayName': 'byMonth', 'gameType': None},
 {'displayName': 'byMonthPlayoffs', 'gameType': None},
 {'displayName': 'byDayOfWeek', 'gameType': None},
 {'displayName': 'byDayOfWeekPlayoffs', 'gameType': None},
 {'displayName': 'homeAndAway', 'gameType': None},
 {'displayName': 'homeAndAwayPlayoffs', 'gameType': None},
 {'displayName': 'winLoss', 'gameType': None},
 {'displayName': 'winLossPlayoffs', 'gameType': None},
 {'displayName': 'onPaceRegularSeason', 'gameType': None},
 {'displayName': 'regularSeasonStatRankings', 'gameType': None},
 {'displayName': 'playoffStatRankings', 'gameType': None},
 {'displayName': 'goalsByGameSituation', 'gameType': None},
 {'displayName': 'goalsByGameSituationPlayoffs', 'gameType': None},
 {'displayName': 'statsSingleSeason',
  'gameType': {'id': 'R',
   'description': 'Regular season',
   'postseason': False}},
 {'displayName': 'statsSingleSeasonPlayoffs',
  'gameType': {'id': 'P', 'description': 'Playoffs', 'postseason': True}}]

Player Stats

# create a url to get player stats
def create_player_stats_url(id, param = ""):
    base_url = "https://statsapi.web.nhl.com/api/v1/people/"
    if param == "":
        url = base_url + id + "/"
    else:
        url = base_url + id + "/stats/?" + param
    return url

# get Artemi's player id
artemi_id = df_rangers_roster[df_rangers_roster['person.fullName'].str.contains("Artemi")]['person.id'].values[0]
artemi_id

'8478550'

# create a url for every player
df_rangers_roster["player_stats_link"] = create_player_stats_url(df_rangers_roster["person.id"], "stats=statsSingleSeason&season=20182019")
df_rangers_roster["player_stats_link"][0]

'https://statsapi.web.nhl.com/api/v1/people/8471686/stats/?stats=statsSingleSeason&season=20182019'

# get players stats
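def get_player_stats(url):
    first_layer = "stats" 
    response = requests.get(url)
    try:
        content = json.loads(response.content)[first_layer][0]['splits'][0]
    except (KeyError, IndexError, TypeError, ValueError):
        # no stats for this season (or an unparsable response): return an empty dict
        content = {}
    return content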

# test the function on one player
test_url = df_rangers_roster["player_stats_link"][0]
test_return = get_player_stats(test_url)
test_return
#pd.json_normalize(test_return['people'])

{'season': '20182019',
 'stat': {'timeOnIce': '1534:12',
  'assists': 10,
  'goals': 3,
  'pim': 32,
  'shots': 84,
  'games': 79,
  'hits': 94,
  'powerPlayGoals': 0,
  'powerPlayPoints': 0,
  'powerPlayTimeOnIce': '02:23',
  'evenTimeOnIce': '1303:27',
  'penaltyMinutes': '32',
  'faceOffPct': 0.0,
  'shotPct': 3.57,
  'gameWinningGoals': 0,
  'overTimeGoals': 0,
  'shortHandedGoals': 0,
  'shortHandedPoints': 0,
  'shortHandedTimeOnIce': '228:22',
  'blocked': 119,
  'plusMinus': -9,
  'points': 13,
  'shifts': 2061,
  'timeOnIcePerGame': '19:25',
  'evenTimeOnIcePerGame': '16:29',
  'shortHandedTimeOnIcePerGame': '02:53',
  'powerPlayTimeOnIcePerGame': '00:01'}}
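The try/except in get_player_stats matters because the chained lookup fails as soon as one level is missing. The sketch below isolates that lookup so it can be tried without calling the API. `first_split` is a hypothetical helper, and the two payloads are hand-written. It is an assumption that a player without stats comes back with an empty `splits` list.

```python
def first_split(payload):
    """Return payload['stats'][0]['splits'][0], or {} when any level is missing."""
    try:
        return payload["stats"][0]["splits"][0]
    except (KeyError, IndexError, TypeError):
        return {}

# hand-written payloads shaped like the response above (values are illustrative)
with_stats = {"stats": [{"splits": [{"season": "20182019", "stat": {"goals": 3}}]}]}
without_stats = {"stats": [{"splits": []}]}

print(first_split(with_stats))
print(first_split(without_stats))
```

Returning an empty dict instead of raising keeps one row per player, which is what lets the results line up with the roster later on.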

Store the results in a data frame

# get stats for each player
df_rangers_roster["player_json"] =  df_rangers_roster["player_stats_link"].apply(get_player_stats)

df_rangers_roster["player_json"][0]

{'season': '20182019',
 'stat': {'timeOnIce': '1534:12',
  'assists': 10,
  'goals': 3,
  'pim': 32,
  'shots': 84,
  'games': 79,
  'hits': 94,
  'powerPlayGoals': 0,
  'powerPlayPoints': 0,
  'powerPlayTimeOnIce': '02:23',
  'evenTimeOnIce': '1303:27',
  'penaltyMinutes': '32',
  'faceOffPct': 0.0,
  'shotPct': 3.57,
  'gameWinningGoals': 0,
  'overTimeGoals': 0,
  'shortHandedGoals': 0,
  'shortHandedPoints': 0,
  'shortHandedTimeOnIce': '228:22',
  'blocked': 119,
  'plusMinus': -9,
  'points': 13,
  'shifts': 2061,
  'timeOnIcePerGame': '19:25',
  'evenTimeOnIcePerGame': '16:29',
  'shortHandedTimeOnIcePerGame': '02:53',
  'powerPlayTimeOnIcePerGame': '00:01'}}

df_rangers_stats_2018_2019 = pd.json_normalize(df_rangers_roster["player_json"])
df_rangers_stats_2018_2019.head()

#find the players from the 2019-2020 roster that did not have stats in 2018-2019
#they were all rookies in 2018-2019
df_rangers_roster[df_rangers_stats_2018_2019['season'].isnull().values]

df_rangers_stats = pd.concat([df_rangers_roster, df_rangers_stats_2018_2019], axis = 1)
df_rangers_stats[df_rangers_stats["person.id"] == artemi_id]["stat.powerPlayGoals"]

17    6.0
Name: stat.powerPlayGoals, dtype: float64
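`pd.concat` with `axis = 1` lines rows up by index label, so the step above relies on both dataframes sharing the same default index in the same player order. A minimal sketch with made-up ids and values shows the alignment, including a player whose stats came back empty:

```python
import pandas as pd

# made-up roster and stats, in the same player order
roster = pd.DataFrame({"person.id": ["1", "2", "3"]})
stats = pd.json_normalize([
    {"season": "20182019", "stat": {"goals": 3}},
    {},                                   # player with no stats that season
    {"season": "20182019", "stat": {"goals": 28}},
])

# rows are matched on the index labels 0, 1, 2
combined = pd.concat([roster, stats], axis=1)
print(combined)
```

The empty dictionary becomes a row of missing values, which is also why integer stats such as goals show up as floats after the concat.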

Compare Players

example = "https://suggest.svc.nhl.com/svc/suggest/v1/minplayers/wayne%20gre/99999"

The trailing number sets the maximum number of results to return.
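The `%20` in the example URL is a percent-encoded space. As a sketch, a search URL of this shape can be assembled with the standard library's `quote`; `build_suggest_url` is a hypothetical helper, and the URL layout is taken from the example above rather than from any official documentation:

```python
from urllib.parse import quote

def build_suggest_url(name, limit):
    """Build a player-search URL; name is percent-encoded, limit caps the results."""
    base_url = "https://suggest.svc.nhl.com/svc/suggest/v1/minplayers/"
    return base_url + quote(name) + "/" + str(limit)

print(build_suggest_url("wayne gre", 99999))
```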

url1 = "https://suggest.svc.nhl.com/svc/suggest/v1/minplayers/wayne/2"

url2 = "https://statsapi.web.nhl.com/api/v1/teams/?teamId=3&expand=team.stats&season=20132014"

Team Stats

def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(y)
    return out
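To see what flatten_json produces, the sketch below applies a restated copy of the function (repeated so the snippet runs on its own, with `isinstance` and `enumerate` in place of `type(...) is` and a manual counter) to a small dictionary. The `tags` key is made up to show how lists are handled.

```python
def flatten_json(y):
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            # descend into each key, extending the prefix
            for key in x:
                flatten(x[key], name + key + '_')
        elif isinstance(x, list):
            # list items are keyed by their position
            for i, item in enumerate(x):
                flatten(item, name + str(i) + '_')
        else:
            # drop the trailing underscore from the accumulated prefix
            out[name[:-1]] = x

    flatten(y)
    return out

venue = {'name': 'Prudential Center',
         'timeZone': {'id': 'America/New_York', 'offset': -5},
         'tags': ['arena', 'nhl']}   # 'tags' is made up to show list handling

print(flatten_json(venue))
```

Unlike json_normalize, which leaves lists inside a cell, this approach gives every list element its own key (`tags_0`, `tags_1`).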

import time

def get_team_stats(start, stop):
    # team 3 is the Rangers; the season id is appended to this url
    base_url = "https://statsapi.web.nhl.com/api/v1/teams/?teamId=3&expand=team.stats&season="
    start = int(start)
    stop = int(stop)
    
    stats = []
    
    # use < rather than != so the loop also ends if start is past stop
    while start < stop:
        url = base_url + str(start) + str(start + 1)
        response = requests.get(url)
        content = response.json()['teams']
        try:
            team_stats = [team_dict['teamStats'][0] for team_dict in content][0]['splits'][0]
            if type(team_stats) is dict:
                team_stats.update({"season_start":start, "season_stop" : start + 1})
                stats.append(team_stats)
            else:
                for item in team_stats:
                    item.update({"season_start":start, "season_stop" : start + 1})
                    stats.append(item)
        except (KeyError, IndexError, TypeError):
            # seasons without stats are skipped
            pass
        time.sleep(1)  # pause between requests to go easy on the server
        start += 1
        print(str(start) + " season complete")
    
    #stats = [item for sublist in stats for item in sublist ]
    df_stats = pd.json_normalize(stats, sep = "_")
    return df_stats

#rangers_stats = get_team_stats(2012,2014)

df_rangers = get_team_stats(1926,2020)

1927 season complete
1928 season complete
1929 season complete
1930 season complete
1931 season complete
1932 season complete
1933 season complete
1934 season complete
1935 season complete
1936 season complete
1937 season complete
1938 season complete
1939 season complete
1940 season complete
1941 season complete
1942 season complete
1943 season complete
1944 season complete
1945 season complete
1946 season complete
1947 season complete
1948 season complete
1949 season complete
1950 season complete
1951 season complete
1952 season complete
1953 season complete
1954 season complete
1955 season complete
1956 season complete
1957 season complete
1958 season complete
1959 season complete
1960 season complete
1961 season complete
1962 season complete
1963 season complete
1964 season complete
1965 season complete
1966 season complete
1967 season complete
1968 season complete
1969 season complete
1970 season complete
1971 season complete
1972 season complete
1973 season complete
1974 season complete
1975 season complete
1976 season complete
1977 season complete
1978 season complete
1979 season complete
1980 season complete
1981 season complete
1982 season complete
1983 season complete
1984 season complete
1985 season complete
1986 season complete
1987 season complete
1988 season complete
1989 season complete
1990 season complete
1991 season complete
1992 season complete
1993 season complete
1994 season complete
1995 season complete
1996 season complete
1997 season complete
1998 season complete
1999 season complete
2000 season complete
2001 season complete
2002 season complete
2003 season complete
2004 season complete
2005 season complete
2006 season complete
2007 season complete
2008 season complete
2009 season complete
2010 season complete
2011 season complete
2012 season complete
2013 season complete
2014 season complete
2015 season complete
2016 season complete
2017 season complete
2018 season complete
2019 season complete
2020 season complete

import re

re.sub("\\.", "_", "test this.out")

df_rangers.rename(columns = lambda x: re.sub("\\.", "_", x), inplace = True)
df_rangers.columns

Index(['season_start', 'season_stop', 'stat_gamesPlayed', 'stat_wins',
       'stat_losses', 'stat_ot', 'stat_pts', 'stat_ptPctg',
       'stat_goalsPerGame', 'stat_goalsAgainstPerGame', 'stat_evGGARatio',
       'stat_powerPlayPercentage', 'stat_powerPlayGoals',
       'stat_powerPlayGoalsAgainst', 'stat_powerPlayOpportunities',
       'stat_penaltyKillPercentage', 'stat_shotsPerGame', 'stat_shotsAllowed',
       'stat_winScoreFirst', 'stat_winOppScoreFirst', 'stat_winLeadFirstPer',
       'stat_winLeadSecondPer', 'stat_winOutshootOpp', 'stat_winOutshotByOpp',
       'stat_faceOffsTaken', 'stat_faceOffsWon', 'stat_faceOffsLost',
       'stat_faceOffWinPercentage', 'stat_shootingPctg', 'stat_savePctg',
       'team_id', 'team_name', 'team_link'],
      dtype='object')

from bokeh.plotting import figure
from bokeh.io import output_notebook, show
from bokeh.models import ColumnDataSource
from bokeh.models.tools import HoverTool
from bokeh.models import SingleIntervalTicker, LinearAxis, Range1d, Label, \
Arrow, NormalHead
output_notebook()

# recent Bokeh releases use width/height in place of plot_width/plot_height
p = figure(width=400, height=400, 
           x_axis_type=None, title = "Rangers Points by Season")

source = ColumnDataSource(df_rangers)

p.line(x = "season_stop", y = "stat_pts", source = source, line_width = 2, 
       legend_label = "Points")
p.line(x = "season_stop", y = "stat_gamesPlayed", source = source, line_color = "grey",
      line_width = 4, line_dash = "dashed",
      legend_label = "Games Played")
p.yaxis.minor_tick_line_color = None

# add annotations
labels = Label(x = 1995, y = 20, text = "Lockout", level = "overlay")
p.add_layout(Arrow(end=NormalHead(line_color="black", line_width=1),
                   x_start=2005, y_start=25, x_end=1995, y_end=45))
p.add_layout(Arrow(end=NormalHead(line_color="black", line_width=1),
                   x_start=2005, y_start=25, x_end=2013, y_end=45))
p.add_layout(labels)

hover = HoverTool()
hover.tooltips=[
    ("Year","@season_stop"),
    ('Points', "@stat_pts"),
    ('Games Played', '@stat_gamesPlayed')

]

ticker = SingleIntervalTicker(interval=10)
xaxis = LinearAxis(ticker=ticker)
p.add_layout(xaxis, 'below')

p.add_tools(hover)


p.legend.location = "top_left"
p.legend.title_text_font_style = "bold"
p.legend.title_text_font_size = "20px"
p.legend.background_fill_alpha = 0
p.legend.border_line_alpha = 0
p.legend.margin = -1

show(p)

# recent Bokeh releases use width/height in place of plot_width/plot_height
p = figure(width=400, height=400, 
           x_axis_type=None, title = "Rangers Wins and Win % When Scoring First")


p.line(x = "season_stop", y = "stat_wins", source = source, line_color = "firebrick",
      line_width = 4, line_dash = "dashed")
p.yaxis.minor_tick_line_color = None

# Setting the second y axis range name and range
p.extra_y_ranges = {"percent": Range1d(start=0, end=1)}

# Adding the second axis to the plot.  
p.add_layout(LinearAxis(y_range_name="percent"), 'right')

p.line(x = "season_stop", y = "stat_winScoreFirst", source = source, 
       line_width = 2, line_color = "green",
      y_range_name = "percent")

hover = HoverTool()
hover.tooltips=[
    ("Year","@season_stop"),
    ('% Win Score First', "@stat_winScoreFirst"),
    ('Wins', '@stat_wins')

]

ticker = SingleIntervalTicker(interval=10)
xaxis = LinearAxis(ticker=ticker)
p.add_layout(xaxis, 'below')

p.add_tools(hover)

show(p)

df_rangers.stat_winLeadFirstPer

0     0.938
1     0.875
2     0.833
3     0.917
4     0.923
      ...  
88    0.806
89    0.741
90    0.625
91    0.526
92    0.741
Name: stat_winLeadFirstPer, Length: 93, dtype: float64
