NBA possession count details, flaws and suggestions

The number of possessions is one of the most important stats in a basketball game. It is used by most (if not all) advanced stats for teams (e.g.: points per possession), players (e.g.: usage rate) and lineups (e.g.: net rating). In the last post, I wrote about ways to count possessions in the NBA using play-by-play data. At first I believed it would be a pretty simple task, seeing that it’s such an important stat, but I ended up finding a few inconsistencies and flaws after comparing the results I expected to be correct with those from the official NBA.com stats page. In this post, I will go into detail on some of these flaws and propose fixes to make the count more accurate and standardized, with examples in R.

Let’s look at a game from last year’s playoffs: game 1 of the first round series between Philadelphia and Washington. Here’s the 3rd quarter play-by-play, taken directly from the NBA API:

library(httr)
library(jsonlite)
library(tidyverse)

pbp_call <- GET("http://data.nba.net/prod/v1/20210523/0042000101_pbp_3.json")
pbp_data <- fromJSON(rawToChar(pbp_call$content)) %>%
  pluck("plays") %>%
  as_tibble() %>%
  select(1:5) %>%
  mutate(numberEvent = row_number())
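
The parsing step can also be separated from the download, so it can be checked on a small hand-written payload without hitting the API. This is only a sketch: `parse_pbp()` is a hypothetical helper name, the toy JSON is made up, and it assumes the response keeps the five column names shown in the output further down. It selects the columns by name instead of by position.

```r
library(jsonlite)
library(dplyr)

# Hypothetical helper: turns the JSON text of a play-by-play response into a tibble
parse_pbp <- function(json_text) {
  fromJSON(json_text)$plays %>%
    as_tibble() %>%
    # pick columns by name so a change in column order does not break the code
    select(clock, eventMsgType, description, personId, teamId) %>%
    mutate(numberEvent = row_number())
}

# Made-up payload with one extra field, just to exercise the function
toy_json <- '{"plays": [
  {"clock": "12:00", "eventMsgType": "12", "description": "Start Period",
   "personId": "", "teamId": "", "extra": "x"},
  {"clock": "11:39", "eventMsgType": "6", "description": "[PHI] Foul",
   "personId": "1", "teamId": "2", "extra": "y"}
]}'

toy_plays <- parse_pbp(toy_json)
names(toy_plays)
```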

We need to remember that a possession ends every time there is:

  • a made field goal
  • a missed field goal where the shooting team doesn’t get the rebound
  • a turnover
  • a 2 or 3 free throw trip where the shooting team doesn’t keep the ball after the last attempt (via offensive rebound or flagrant/clear path foul)

We can identify field goals, free throws and turnovers:

pbp_data_poss <- pbp_data %>%
  mutate(possession = case_when(eventMsgType %in% c(1, 2) ~ 1,  # made and missed fg
                                eventMsgType == 3 & str_detect(description, "1 of 2|1 of 3") ~ 1, # free throws
                                eventMsgType == 5 ~ 1,  # turnovers
                                TRUE ~ 0))
pbp_data_poss
## # A tibble: 104 x 7
##    clock eventMsgType description        personId  teamId numberEvent possession
##    <chr> <chr>        <chr>              <chr>     <chr>        <int>      <dbl>
##  1 12:00 12           Start Period       ""        ""               1          0
##  2 11:39 6            [PHI] Simmons Fou~ "1627732" "1610~           2          0
##  3 11:39 3            [WAS 63-61] Westb~ "201566"  "1610~           3          1
##  4 11:39 3            [WAS 64-61] Westb~ "201566"  "1610~           4          0
##  5 11:23 6            [WAS] Len Foul: S~ "203458"  "1610~           5          0
##  6 11:23 3            [PHI 62-64] Embii~ "203954"  "1610~           6          1
##  7 11:23 3            [PHI 63-64] Embii~ "203954"  "1610~           7          0
##  8 11:13 2            [WAS] Hachimura J~ "1629060" "1610~           8          1
##  9 11:09 4            [PHI] Embiid Rebo~ "203954"  "1610~           9          0
## 10 11:03 2            [PHI] Green Pullu~ "201980"  "1610~          10          1
## # ... with 94 more rows

There are other situations we need to keep an eye on (read more about them in my previous post), but we don’t have any of them in this quarter. To identify the cases where a team got an offensive rebound to extend the possession, or where the possession restarted after a free throw (flagrant/clear path/technical/away from play), let’s look at the times where the same team had more than 1 possession in a row:

pbp_data_poss %>%
  filter(possession == 1) %>%
  filter(possession == lead(possession) & teamId == lead(teamId))
## # A tibble: 5 x 7
##   clock   eventMsgType description        personId teamId numberEvent possession
##   <chr>   <chr>        <chr>              <chr>    <chr>        <int>      <dbl>
## 1 09:55   2            [PHI] Curry 3pt S~ 203552   16106~          16          1
## 2 08:49   2            [PHI] Harris Driv~ 202699   16106~          27          1
## 3 08:39   2            [PHI] Embiid Turn~ 203954   16106~          29          1
## 4 02:38   2            [WAS] Westbrook 3~ 201566   16106~          72          1
## 5 00:45.2 2            [WAS] Bertans Jum~ 202722   16106~          90          1
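
To make the `lead()` comparison easier to follow, here is a minimal sketch on a made-up table of possession-ending events. The team labels "A" and "B" and the event numbers are invented for illustration.

```r
library(dplyr)

# Hypothetical possession-ending events, already filtered to possession == 1
toy <- tibble::tibble(
  numberEvent = c(3, 8, 10, 12, 15),
  teamId      = c("A", "B", "B", "A", "B"),
  possession  = 1
)

# lead() shifts a column up by one row, so each row is compared with the next one.
# The last row is compared with NA, and filter() drops rows where the test is NA.
toy_consec <- toy %>%
  filter(possession == lead(possession) & teamId == lead(teamId))

toy_consec$numberEvent
```

Only event 8 is returned, because it is the only event followed by another possession-ending event from the same team.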

If we look at these plays in the play-by-play, we’ll find that they were followed by an offensive rebound. Since the possession is restarted by the rebound and we need to count only the end of possession, we must change the possession column in these rebounded shots to 0 and update it in the original table:

change_consec <- pbp_data_poss %>%
  filter(possession == 1) %>%
  filter(possession == lead(possession) & teamId == lead(teamId)) %>%
  mutate(possession = 0) %>%
  select(numberEvent, possession)

pbp_data_poss_changed <- pbp_data_poss %>%
  rows_update(change_consec, by = "numberEvent")

pbp_data_poss_changed %>%
  filter(possession == 1) %>%
  filter(possession == lead(possession) & teamId == lead(teamId))
## # A tibble: 0 x 7
## # ... with 7 variables: clock <chr>, eventMsgType <chr>, description <chr>,
## #   personId <chr>, teamId <chr>, numberEvent <int>, possession <dbl>

Now that we don’t have any consecutive possessions, let’s count the total for each team:

pbp_data_poss_changed %>%
  group_by(teamId) %>%
  summarise(total_poss = sum(possession)) %>%
  ungroup()
## # A tibble: 3 x 2
##   teamId       total_poss
##   <chr>             <dbl>
## 1 ""                    0
## 2 "1610612755"         24
## 3 "1610612764"         24
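
Since the same steps are reused for another game later in this post, they can be bundled into a function. This is a sketch under the same assumptions as the code above: `count_possessions()` is a hypothetical name, and the toy play-by-play is made up to exercise an offensive rebound, a turnover and a free throw trip.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: flag possession-ending events, zero out the ones
# followed by another ending from the same team, then sum per team
count_possessions <- function(pbp) {
  flagged <- pbp %>%
    mutate(possession = case_when(
      eventMsgType %in% c(1, 2) ~ 1,                                    # made and missed fg
      eventMsgType == 3 & str_detect(description, "1 of 2|1 of 3") ~ 1, # free throw trips
      eventMsgType == 5 ~ 1,                                            # turnovers
      TRUE ~ 0
    ))

  change_consec <- flagged %>%
    filter(possession == 1) %>%
    filter(possession == lead(possession) & teamId == lead(teamId)) %>%
    mutate(possession = 0) %>%
    select(numberEvent, possession)

  flagged %>%
    rows_update(change_consec, by = "numberEvent") %>%
    group_by(teamId) %>%
    summarise(total_poss = sum(possession), .groups = "drop")
}

# Made-up quarter: A misses, rebounds its own miss and scores; B turns it over;
# A then shoots two free throws
toy_pbp <- tibble::tibble(
  eventMsgType = c("12", "2", "4", "1", "5", "3", "3"),
  description  = c("Start Period", "[A] Miss", "[A] Rebound", "[A] Layup",
                   "[B] Bad Pass Turnover",
                   "[A] Free Throw 1 of 2", "[A] Free Throw 2 of 2"),
  teamId       = c("", "A", "A", "A", "B", "A", "A"),
  numberEvent  = 1:7
)

toy_counts <- count_possessions(toy_pbp)
toy_counts
```

In the toy data, team A ends up with 2 possessions (the miss and the layup collapse into one) and team B with 1.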

Just to make sure, we can look at the official play-by-play and count each possession of the 3rd quarter:

It confirms that each team had 24 possessions in the quarter. However, if we look at the number of possessions on NBA.com/stats:

Notice that it has 25 possessions for Washington. This is because, after Ben Simmons’ missed field goal at the end of the quarter, the Wizards get a rebound with 00:00.1 on the clock:

This 0.1 second is enough for the NBA to consider that Washington had an extra possession before the end of the quarter, bringing their total to 25, even though the quarter ended before the player had the chance to put the ball on the floor. I believe the NBA should stop counting these situations as possessions unless the player attempts a field goal or commits a turnover, for a few reasons:

  • it relies too much on the play-by-play operator to enter the accurate time (to the decimal point) of mostly meaningless plays. Often, the recorded time is not very reliable.
  • some data sources, including another version of the play-by-play on the NBA API, don’t have the decimal seconds on the quarter clock. This makes it impossible to count the official number of possessions from it.
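
The second point can be made concrete with a small sketch. The helper `clock_to_seconds()` is hypothetical (it does not come from any package) and assumes clock strings in the `MM:SS` or `MM:SS.s` form seen in the tables above. The "without decimals" values assume that such a source simply drops the fraction of a second.

```r
# Hypothetical helper: converts "MM:SS" or "MM:SS.s" clock strings to seconds
clock_to_seconds <- function(clock) {
  parts <- strsplit(clock, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

with_decimals    <- c("00:45.2", "00:00.1", "00:00.0")
without_decimals <- c("00:45", "00:00", "00:00")  # same plays, decimals dropped (assumed)

# With decimals, a rebound at 0.1 seconds still shows time left on the clock
time_left_dec <- clock_to_seconds(with_decimals) > 0

# Without decimals, the same rebound is indistinguishable from the buzzer
time_left_nodec <- clock_to_seconds(without_decimals) > 0

time_left_dec
time_left_nodec
```

Any rule that depends on "time left greater than zero" gives a different answer for the second play depending on which version of the clock is available.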

To illustrate the first point, let’s look at a game from the 2019-20 season and find the number of possessions for each team:

pbp_call <- GET("http://data.nba.net/prod/v1/20200223/0021900844_pbp_1.json")
pbp_data <- fromJSON(rawToChar(pbp_call$content)) %>%
  pluck("plays") %>%
  as_tibble() %>%
  select(1:5) %>%
  mutate(numberEvent = row_number())

pbp_data_poss <- pbp_data %>%
  mutate(possession = case_when(eventMsgType %in% c(1, 2) ~ 1,  # made and missed fg
                                eventMsgType == 3 & str_detect(description, "1 of 2|1 of 3") ~ 1, # free throws
                                eventMsgType == 5 ~ 1,  # turnovers
                                TRUE ~ 0))

change_consec <- pbp_data_poss %>%
  filter(possession == 1) %>%
  filter(possession == lead(possession) & teamId == lead(teamId)) %>%
  mutate(possession = 0) %>%
  select(numberEvent, possession)

pbp_data_poss_changed <- pbp_data_poss %>%
  rows_update(change_consec, by = "numberEvent")

pbp_data_poss_changed %>%
  group_by(teamId) %>%
  summarise(total_poss = sum(possession)) %>%
  ungroup()
## # A tibble: 3 x 2
##   teamId       total_poss
##   <chr>             <dbl>
## 1 ""                    0
## 2 "1610612743"         28
## 3 "1610612750"         27

According to our code and manual count, Minnesota has 27 possessions, while Denver has 28.

On NBA.com/stats, both teams have 27 possessions. The reason is that, if you look at the time on the steal and field goal in the last play, it was recorded as 00:00.0, while there was clearly more time left:

I suspect that the NBA uses a script to count the possessions, and that the script stops counting when the clock hits 00:00.0, even if more plays follow it with the wrong clock. This happens at least 10 times during a season, which is why I believe the league needs to stop relying on game time to count possessions and just look at the plays that end one (field goals, turnovers, free throw trips). If a team gets the ball at the end of the quarter and doesn’t do any of these things, it shouldn’t count as a possession.

The second inconsistency I found is regarding fouls where the ball goes back to the shooting team after the free throw (technical, clear path, flagrant, away from play). I don’t believe the NBA has a set standard for these situations, as they sometimes count the free throws plus the ensuing play as 2 possessions, and other times they count it as only 1. We have examples of both just by looking at last year’s playoffs. In OT of game 1 of the first round series between Miami and Milwaukee:

Accounting for the extra possession Miami had when they got the ball with 00:00.5 left, each team should have 9 possessions. But NBA.com/stats gives the Bucks 10, because they had flagrant free throws in the period (at 04:13). Since the ball goes back to Milwaukee, restarting the possession, this shouldn’t be counted twice. That is exactly how it was handled in the first quarter of game 2 of the first round series between the Lakers and the Suns:

Counting the flagrant foul free throws + following play as only one possession, each team has 25 in total. This matches exactly the number of official possessions on NBA.com/stats. So, in this case, the NBA restarts the possession after the free throws, as I believe it should. In games that were 3 days apart, different criteria were adopted for the same situation.
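
One way to make this rule explicit, instead of relying only on the consecutive-possession check, is to leave these free throws out when flagging possession-ending events. The sketch below uses made-up rows. The description wording ("Free Throw Flagrant 1 of 2") and the keyword pattern are assumptions about how the API labels these free throws, and away from play free throws may not be identifiable from the description at all.

```r
library(dplyr)
library(stringr)

# Made-up rows; the description wording is an assumption, not copied from the API
toy_ft <- tibble::tibble(
  eventMsgType = c("3", "3", "3", "3"),
  description  = c("[A] Free Throw 1 of 2", "[A] Free Throw 2 of 2",
                   "[B] Free Throw Flagrant 1 of 2", "[B] Free Throw Flagrant 2 of 2"),
  teamId       = c("A", "A", "B", "B")
)

# Assumed keywords for free throws after which the shooting team keeps the ball
retain_ball <- "Flagrant|Clear Path|Technical"

toy_ft_flagged <- toy_ft %>%
  mutate(possession = case_when(
    # checked first, so these free throws never count as a possession
    eventMsgType == 3 & str_detect(description, retain_ball) ~ 0,
    eventMsgType == 3 & str_detect(description, "1 of 2|1 of 3") ~ 1,
    TRUE ~ 0
  ))

toy_ft_flagged$possession
```

Only the regular trip by team A is flagged. The flagrant free throws by team B are left to be counted with whatever play ends the possession that follows them.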

Conclusion: for a stat that is so important, I believe the league has to address these 2 situations to make the stat easier to replicate. I suggest that it stop counting possessions at the end of a quarter unless there is a field goal, free throw trip or turnover before the clock expires, no matter how much time is left when the team gets the ball. I also suggest that technical/flagrant/clear path/away from play free throws not be counted as a possession. I believe this will vastly improve the accuracy of the official data.

Thank you for reading!
