Flow and Dynamics in NBA games - Part I

A look from 2001 to now

Posted by Thomas Vincent on June 12, 2015

I have been interested in the game flow and dynamics of NBA games for a while, so I have decided to start a series of blog posts that will hopefully culminate in a web app that processes and displays game flows for present and historical games. Before that, however, I wanted to sink my teeth into the data and run a few analyses that may be of interest.

To start off, I wrote a Python script that scraped all available play-by-play data for all games played in the 2001-2014 seasons. The output was stored in a SQL database of ~450 MB, which translates to a 7,364,787 x 8 matrix when read into memory. With the data now available for analysis, we can proceed towards producing some visualizations. First, I set out to explore the score differential for each team, which I figured would be an interesting proxy for overall performance. In this blog post, I look at the average number of points by which teams were ahead or behind when playing at home or away.
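As a rough illustration of the loading step, here is a minimal sketch of reading a play-by-play table into memory. It assumes a SQLite file; the table name `play_by_play` and the eight column names are hypothetical stand-ins chosen only to mirror the "x 8" shape, since the actual schema is not shown in this post, and the three rows are made up.

```python
import sqlite3

# Hypothetical schema: eight columns, chosen only to mirror the "x 8" shape.
SCHEMA = """
CREATE TABLE play_by_play (
    game_id TEXT, season INTEGER, home_team TEXT, away_team TEXT,
    period INTEGER, clock TEXT, home_score INTEGER, away_score INTEGER
)
"""


def load_play_by_play(conn, table="play_by_play"):
    """Return (column_names, rows) for every row in the given table."""
    # Table names cannot be bound as parameters, so check before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


# In-memory stand-in for the real database, filled with made-up events.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO play_by_play VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("g1", 2014, "SAS", "DAL", 1, "11:40", 2, 0),
        ("g1", 2014, "SAS", "DAL", 1, "11:12", 2, 3),
        ("g1", 2014, "SAS", "DAL", 1, "10:55", 5, 3),
    ],
)
columns, rows = load_play_by_play(conn)
print(f"{len(rows)} x {len(columns)}")
```

With the real database, the same function would be pointed at the file on disk instead of `:memory:`, and the printed shape would be the rows-by-columns size quoted above.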

Here, the score differential attributed to a team during any given season was simply calculated as the average score difference observed at every scoring event. More formally, we can write this as:

\[ScoreDiff = \frac{\sum_{n=1}^{N} (S_{n} - O_{n})}{N}\]

where \(N\) is the total number of scoring events in the games a team played at home or away, and \(S_{n}\) and \(O_{n}\) are the scores for the home and away team at scoring event \(n\), respectively. Therefore, a positive \(ScoreDiff\) indicates that a team tends to be in the lead, while a negative \(ScoreDiff\) indicates that it tends to be behind. The first chart below shows the average score differential for teams playing on their home court.
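To make the formula concrete, here is a minimal sketch of the averaging step in plain Python. The event keys (`season`, `home_team`, `away_team`, `home_score`, `away_score`) and the function name `average_score_diff` are hypothetical, and the three events are made up. It also assumes that each team's difference is measured from its own side, so the away team receives the negated home-minus-away value; the original script may have handled this differently.

```python
from collections import defaultdict


def average_score_diff(events):
    """Average score difference per (team, season, venue) over scoring events."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of differences, event count]
    for event in events:
        home_minus_away = event["home_score"] - event["away_score"]
        # Assumption: each team's difference is measured from its own side,
        # so the away team gets the negated home-minus-away value.
        sides = (
            (event["home_team"], "home", home_minus_away),
            (event["away_team"], "away", -home_minus_away),
        )
        for team, venue, diff in sides:
            running = totals[(team, event["season"], venue)]
            running[0] += diff
            running[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up scoring events from a single hypothetical game.
events = [
    {"season": 2014, "home_team": "SAS", "away_team": "DAL", "home_score": 2, "away_score": 0},
    {"season": 2014, "home_team": "SAS", "away_team": "DAL", "home_score": 2, "away_score": 3},
    {"season": 2014, "home_team": "SAS", "away_team": "DAL", "home_score": 5, "away_score": 3},
]
score_diff = average_score_diff(events)
print(score_diff[("SAS", 2014, "home")])  # (2 - 1 + 2) / 3 = 1.0
```

Keying the result by team, season and venue gives one value per cell of a team-by-season chart, with separate home and away views.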

Average score differential for teams when playing at home

One cool thing about these plots is that they let us quickly see which teams have been consistently excellent at home. For example, we can see that the San Antonio Spurs and Dallas Mavericks are two teams that have achieved positive score differentials all the way from 2000 to 2014 when playing at home. Could it be a Texas thing?

Strong performance at home for the Spurs and Mavericks

Next, we can look at the score differentials achieved by teams playing away. In this case, we see that San Antonio is again the team that plays the best when away from home. Among the historically worst teams on the road, we have New York, Toronto, Golden State and Utah.

Average score differential for teams playing away

Worst performing teams away

The charts below show the advantage that stems from teams playing on their home court. Surprisingly, the average score differential per season (both away and at home) was not correlated with how many wins each team achieved. This is clearly shown in the two plots below. I would guess that this is due to the fact that many teams go on significant runs that may actually outweigh the fact that they are not necessarily in the lead all the time.


Correlation between home score differential and win shares

Correlation between away score differential and win shares
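The correlation claim above can also be checked numerically with a Pearson coefficient. The sketch below is one way to do that, not necessarily how the plots were produced; the function name `pearson_r` is hypothetical and the per-team values are made-up toy numbers, not results from the data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up per-team values for one season: average score differential and wins.
toy_score_diffs = [3.1, -1.4, 0.2, 2.5, -2.0]
toy_wins = [41, 45, 38, 50, 33]
r = pearson_r(toy_score_diffs, toy_wins)
print(round(r, 3))
```

A value close to 0 would match the lack of correlation described above, while values near 1 or -1 would indicate a strong linear relationship.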

In this first part, I have looked at the average number of points by which teams lead or trail across all games played in a season. I have seen other examples online of game flow charts that typically display the score differential over the course of a game. While this method of looking at game flow is interesting, it is somewhat limited. In the upcoming part of this blog series, I will look at scoring streaks, proportion of time in the lead, and largest leads.