Introduction
One of the NCAA Men’s Basketball metrics I’ve been fascinated with lately is the Game Excitement Index. Game Excitement Index (GEI) attempts to quantify how exciting a particular game was after it has been played. Related metrics have been implemented for NFL games by Brian Burke, NBA games by InPredict (Mike Beuoy), and for March Madness by FiveThirtyEight. One can compute GEI for college basketball games using my ncaahoopR package. I define GEI as follows:
\[
\text{GEI} = \frac{2400}{t}\sum_{i = 2}^n |p_i - p_{i-1}|
\]
where \(t\) is the length of the game (in seconds), \(n\) is the number of plays in the game, and \(p_i\) is the home team’s win probability on play \(i\) of the game. One can think of GEI as a measure of the length of the win probability curve if it were to be unwound, normalized to the length of a standard regulation game (2400 seconds). The reason I choose to normalize by game length is that I don’t want sloppy, “boring” games that simply happen to go to 2 or 3 overtimes to be pegged as more exciting. In general, this small normalization has little effect, as games that go deep into overtime are generally pretty exciting to begin with. In this article, I hope to explore which games, teams, and conferences have produced the most exciting basketball this season, while showing off how one can use ncaahoopR to answer interesting college basketball questions.
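To make the formula concrete, here is a minimal sketch that applies it to a vector of win probabilities. The function name `gei_from_wp` and the toy probabilities are made up for illustration; this is not the package's internal implementation, which works from scraped play-by-play data.

```r
# Sketch of the GEI formula for a single game.
# `wp` is the home team's win probability after each play (values in [0, 1]);
# `game_seconds` is the game length t in seconds (2400 for regulation).
gei_from_wp <- function(wp, game_seconds = 2400) {
  stopifnot(length(wp) >= 2, game_seconds > 0)
  # Sum of absolute play-to-play changes, scaled to a 40-minute game
  (2400 / game_seconds) * sum(abs(diff(wp)))
}

# Made-up win probabilities, for illustration only
wp_toy <- c(0.50, 0.60, 0.40, 0.90, 1.00)
gei_regulation <- gei_from_wp(wp_toy)                       # t = 2400
gei_overtime   <- gei_from_wp(wp_toy, game_seconds = 2700)  # one 5-minute OT
```

The same win probability path yields a smaller GEI when it is spread over an overtime game, which is the normalization described above.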
Obtaining the Data
The code below uses ncaahoopR to
- Get each team’s schedule
- Compute GEI for each game this season
Note that a complete list of teams can be found in the ids data frame built into the package.
library(ncaahoopR)
library(dplyr)  # provides %>%, filter(), mutate(), left_join()

### Scrape All Schedules
n_teams <- nrow(ids)
i <- 1
for(team in ids$team) {
  print(paste("Getting Team:", i, "of", n_teams))
  # Keep only games that have already been played
  schedule <- get_schedule(team) %>%
    filter(date < Sys.Date()) %>%
    mutate("team" = team)
  if(i == 1) {
    master <- schedule
  }else{
    master <- rbind(master, schedule)
  }
  i <- i + 1
}

### Get Unique Game IDs
# Each game appears once per team in master, so de-duplicate
game_ids <- unique(master$game_id)

### Compute GEI for Each Game
n <- length(game_ids)
df <- data.frame("game_id" = game_ids,
                 "gei" = NA)
for(i in seq_len(nrow(df))) {
  print(paste("GEI:", i, "of", n))
  df$gei[i] <- game_excitement_index(game_ids[i])
}

### Attach GEI to Each Team's Schedule
master <- left_join(master, df, by = "game_id")
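A loop over thousands of scraped games can be stopped by a single call that errors. One possible guard, sketched below, is to wrap the GEI call in `tryCatch()` and record `NA` for any game that fails. The helper `safe_gei` and the stub `fake_gei` are hypothetical names used only so the example runs offline; in practice one would pass `game_excitement_index` as `gei_fn`, and whether that function errors or returns `NA` for a given game is not something this sketch assumes.

```r
# Hypothetical helper: returns NA instead of stopping when one game fails.
# `gei_fn` stands in for game_excitement_index() so the sketch runs offline.
safe_gei <- function(game_id, gei_fn) {
  tryCatch(
    gei_fn(game_id),
    error = function(e) {
      message("Skipping game ", game_id, ": ", conditionMessage(e))
      NA_real_
    }
  )
}

# Stub that fails for one made-up ID, to show the behaviour
fake_gei <- function(game_id) {
  if (game_id == 2) stop("no play-by-play data")
  game_id * 1.5
}

toy_ids <- c(1, 2, 3)
toy_gei <- vapply(toy_ids, safe_gei, numeric(1), gei_fn = fake_gei)
```

Games that come back as `NA` can then be dropped or retried later without rerunning the whole scrape.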
Analysis
Below is a histogram of the Game Excitement Index for the 2018-19 season (for all games for which play-by-play data is available). Through the first 6 weeks of the season, GEI has a mean of roughly 3.6 and a standard deviation of about 2.6. GEI appears to follow some sort of Gamma distribution. The distribution is skewed right, with over 62 percent of games registering a GEI less than 4.
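As a rough consistency check on the Gamma remark, one can match a Gamma distribution to the quoted mean and standard deviation and ask what share of games it would place below a GEI of 4. This is a method-of-moments sketch using only the two summary numbers above, not a fit to the actual game data, and the variable names are illustrative.

```r
# Method-of-moments Gamma parameters from the summary statistics quoted above.
# This is a rough consistency check, not a fit to the actual game data.
gei_mean <- 3.6
gei_sd   <- 2.6

gamma_shape <- (gei_mean / gei_sd)^2   # mean^2 / variance
gamma_rate  <- gei_mean / gei_sd^2     # mean / variance

# Share of games a Gamma with these parameters would put below GEI = 4
p_below_4 <- pgamma(4, shape = gamma_shape, rate = gamma_rate)
```

If the Gamma description is reasonable, `p_below_4` should land in the neighborhood of the observed share of games with GEI below 4.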
Looking at the plot above, we can see that very few games have a GEI greater than 10. In fact, 10 is the 99th percentile for GEI, and through 12/21/2018, only 23 games have reached that mark. Below are the most exciting games of the season.
team | opponent | date | location | team_score | opp_score | gei |
---|---|---|---|---|---|---|
St John’s | VCU | 2018-11-20 | N | 87 | 86 | 14.34 |
Kentucky | Seton Hall | 2018-12-08 | N | 83 | 84 | 12.49 |
Louisville | Michigan State | 2018-11-27 | H | 82 | 78 | 11.27 |
Campbell | UNC Wilmington | 2018-11-06 | H | 97 | 93 | 11.14 |
Hampton | Norfolk State | 2018-11-29 | A | 89 | 94 | 11.10 |
UMass Lowell | Wagner | 2018-11-10 | H | 88 | 84 | 10.86 |
Columbia | Fordham | 2018-11-18 | A | 69 | 70 | 10.79 |
BYU | Illinois State | 2018-11-28 | A | 89 | 92 | 10.75 |
SMU | Wright State | 2018-11-21 | N | 77 | 76 | 10.72 |
Columbia | Delaware | 2018-12-02 | H | 86 | 87 | 10.59 |
Texas State | UTSA | 2018-12-01 | A | 69 | 68 | 10.52 |
Santa Clara | USC | 2018-12-18 | H | 102 | 92 | 10.48 |
Boston College | Providence | 2018-12-04 | H | 95 | 100 | 10.47 |
E Kentucky | Northern Kentucky | 2018-12-08 | H | 76 | 74 | 10.46 |
LIU Brooklyn | Milwaukee | 2018-11-20 | A | 87 | 92 | 10.44 |
Iona | Long Beach State | 2018-11-19 | N | 85 | 86 | 10.32 |
Towson | UMBC | 2018-12-11 | A | 80 | 76 | 10.25 |
BYU | UNLV | 2018-12-15 | N | 90 | 92 | 10.24 |
Fairfield | LIU Brooklyn | 2018-11-13 | H | 87 | 89 | 10.22 |
American | UMBC | 2018-11-24 | A | 73 | 69 | 10.18 |
Abil Christian | Pacific | 2018-11-23 | A | 73 | 71 | 10.04 |
Denver | Wyoming | 2018-12-11 | A | 90 | 87 | 10.02 |
N Illinois | Northern Kentucky | 2018-11-09 | H | 85 | 88 | 10.01 |
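A table like the one above can be pulled from the joined schedule data by keeping one row per game, filtering on the cutoff, and sorting. The sketch below uses base R and a small made-up data frame, `toy_master`, in place of the scraped `master`; the column names follow the table, but the values are invented.

```r
# Toy stand-in for `master`: each game appears once per team, as in the scrape.
toy_master <- data.frame(
  game_id  = c(1, 1, 2, 2, 3, 3),
  team     = c("A", "B", "C", "D", "E", "F"),
  opponent = c("B", "A", "D", "C", "F", "E"),
  gei      = c(10.4, 10.4, 3.2, 3.2, 12.1, 12.1)
)

# Keep one row per game so each game is listed only once
one_per_game <- toy_master[!duplicated(toy_master$game_id), ]

# Games above the cutoff (which() drops any NA GEI values), most exciting first
top_games <- one_per_game[which(one_per_game$gei > 10), ]
top_games <- top_games[order(-top_games$gei), ]
```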
Let’s take a closer look at the most exciting game of the season, VCU vs. St. John’s, an OT thriller in the Legends Classic championship game during Thanksgiving “Feast Week”. We can make the win probability chart for the game using the function gg_wp_chart(), as follows.
gg_wp_chart(game_id = 401096927, home_col = "black", away_col = "red")
Next, we can look at which teams have the highest and lowest average GEI. Due to the skewed nature of the GEI distribution, it probably makes most sense to rank teams by median GEI. We can also classify games into a few different categories based on their GEI:
- Heart Pounders: GEI > 8
- Thrillers: 4 < GEI \(\leq\) 8
- Average Games: 1 < GEI \(\leq\) 4
- Duds: GEI \(\leq\) 1
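One way to assign these labels is with `cut()`, as in the sketch below. The function name `gei_category` is hypothetical, and one choice here is an assumption: a game with a GEI of exactly 1 is treated as a Dud, so that every value falls into some category.

```r
# Hypothetical helper that maps GEI values to the four categories above.
gei_category <- function(gei) {
  cut(
    gei,
    breaks = c(-Inf, 1, 4, 8, Inf),
    labels = c("Duds", "Average Games", "Thrillers", "Heart Pounders"),
    right = TRUE  # intervals are (lower, upper], so 4 is Average and 8 is a Thriller
  )
}

# A few GEI values to illustrate the boundaries
toy_gei_values <- c(0.53, 1.00, 4.00, 7.43, 8.00, 14.34)
toy_categories <- as.character(gei_category(toy_gei_values))
```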
Most Exciting Teams
team | median_gei | mean_gei | max_gei | min_gei | heart_pounders | thrillers | average_games | duds |
---|---|---|---|---|---|---|---|---|
American | 7.43 | 6.26 | 10.18 | 0.98 | 3 | 2 | 1 | 1 |
San Jose State | 7.15 | 6.27 | 9.85 | 1.05 | 4 | 2 | 2 | 0 |
Delaware | 7.05 | 6.17 | 10.59 | 2.43 | 3 | 4 | 3 | 0 |
UMKC | 6.97 | 6.93 | 9.42 | 4.38 | 1 | 2 | 0 | 0 |
Harvard | 6.87 | 6.43 | 9.26 | 2.03 | 2 | 4 | 1 | 0 |
Iona | 6.85 | 6.58 | 10.32 | 1.14 | 3 | 3 | 2 | 0 |
Fordham | 6.66 | 5.62 | 10.79 | 0.48 | 2 | 4 | 2 | 1 |
Seton Hall | 6.46 | 5.79 | 12.49 | 0.74 | 2 | 3 | 3 | 1 |
Wyoming | 6.39 | 5.36 | 10.02 | 1.05 | 2 | 5 | 4 | 0 |
VCU | 6.11 | 7.38 | 14.34 | 3.66 | 1 | 3 | 1 | 0 |
CSU Bakersfield | 6.07 | 5.52 | 9.14 | 1.77 | 1 | 5 | 2 | 0 |
Niagara | 5.97 | 6.18 | 9.92 | 1.50 | 2 | 7 | 1 | 0 |
Ga Southern | 5.91 | 5.10 | 7.96 | 1.61 | 0 | 8 | 3 | 0 |
Arizona State | 5.89 | 4.87 | 9.55 | 0.56 | 2 | 4 | 2 | 2 |
Saint Joe’s | 5.82 | 5.31 | 8.42 | 1.46 | 1 | 3 | 1 | 0 |
Belmont | 5.78 | 5.30 | 8.13 | 1.88 | 1 | 5 | 2 | 0 |
High Point | 5.74 | 5.03 | 9.25 | 1.15 | 2 | 4 | 5 | 0 |
Lafayette | 5.74 | 5.27 | 8.75 | 0.59 | 3 | 3 | 3 | 1 |
Indiana | 5.62 | 4.39 | 9.41 | 0.34 | 2 | 4 | 2 | 3 |
Pacific | 5.60 | 5.46 | 10.04 | 1.06 | 3 | 3 | 4 | 0 |
Least Exciting Teams
team | median_gei | mean_gei | max_gei | min_gei | heart_pounders | thrillers | average_games | duds |
---|---|---|---|---|---|---|---|---|
MD-E Shore | 0.53 | 1.31 | 5.35 | 0.23 | 0 | 1 | 2 | 8 |
AR-Pine Bluff | 0.61 | 1.21 | 5.14 | 0.30 | 0 | 1 | 2 | 6 |
Miss Valley St | 0.67 | 0.83 | 1.72 | 0.32 | 0 | 0 | 2 | 7 |
Texas Tech | 0.68 | 2.63 | 7.68 | 0.26 | 0 | 4 | 0 | 6 |
Coppin State | 0.73 | 1.45 | 8.29 | 0.30 | 1 | 0 | 2 | 7 |
Illinois | 0.77 | 1.14 | 2.16 | 0.48 | 0 | 0 | 1 | 2 |
Chicago State | 0.80 | 2.46 | 8.19 | 0.38 | 1 | 3 | 1 | 7 |
Alabama State | 0.84 | 1.23 | 2.84 | 0.31 | 0 | 0 | 3 | 6 |
Auburn | 0.94 | 2.18 | 7.80 | 0.24 | 0 | 2 | 2 | 5 |
Alcorn State | 0.95 | 1.30 | 3.07 | 0.44 | 0 | 0 | 5 | 5 |
Virginia Tech | 0.95 | 2.48 | 8.45 | 0.39 | 1 | 1 | 2 | 5 |
Georgia Tech | 0.99 | 2.60 | 7.25 | 0.39 | 0 | 3 | 1 | 4 |
NC State | 1.00 | 2.52 | 8.50 | 0.23 | 1 | 2 | 3 | 5 |
S Carolina St | 1.00 | 1.87 | 5.92 | 0.39 | 0 | 2 | 4 | 6 |
UNC Asheville | 1.01 | 2.47 | 7.52 | 0.34 | 0 | 2 | 3 | 4 |
Saint Mary’s | 1.03 | 0.98 | 1.35 | 0.69 | 0 | 0 | 3 | 2 |
UNC | 1.03 | 2.16 | 6.38 | 0.46 | 0 | 1 | 4 | 4 |
TCU | 1.13 | 2.51 | 7.13 | 0.64 | 0 | 1 | 1 | 2 |
Duke | 1.16 | 2.23 | 9.80 | 0.25 | 1 | 1 | 5 | 5 |
Maine | 1.23 | 2.07 | 5.27 | 0.30 | 0 | 2 | 2 | 3 |
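The per-team columns in the tables above can be computed from one row per team-game. The sketch below does this in base R on a made-up data frame, `toy_games`; the helper `summarise_team` is a hypothetical name, and a GEI of exactly 1 is counted as a Dud by assumption.

```r
# Toy stand-in for the team-game data: one row per team per game.
toy_games <- data.frame(
  team = c("A", "A", "A", "B", "B", "B"),
  gei  = c(9.0, 5.0, 2.0, 0.5, 0.8, 3.0)
)

# Summary statistics and category counts for one team's GEI values
summarise_team <- function(gei) {
  c(
    median_gei     = median(gei),
    mean_gei       = mean(gei),
    max_gei        = max(gei),
    min_gei        = min(gei),
    heart_pounders = sum(gei > 8),
    thrillers      = sum(gei > 4 & gei <= 8),
    average_games  = sum(gei > 1 & gei <= 4),
    duds           = sum(gei <= 1)
  )
}

# One row per team, sorted from most to least exciting by median GEI
team_stats <- do.call(rbind, lapply(split(toy_games$gei, toy_games$team), summarise_team))
team_stats <- team_stats[order(-team_stats[, "median_gei"]), , drop = FALSE]
```

Taking the first or last rows of `team_stats` then gives the most and least exciting teams, respectively.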
Perhaps not surprisingly, many of the least exciting teams so far are MEAC and SWAC teams, which often schedule buy games (and get blown out) against high-major opponents. Teams like Duke, UVA, and Texas Tech are likely on the list for the same reason: scheduling and destroying many weaker opponents. Duke’s appearance on this list is actually a testament to its dominance this season. Even with Kentucky, Texas Tech, Auburn, Indiana, and Gonzaga on its non-conference schedule, its level of dominance means Duke’s win probability charts flatline early and are marked by low GEI scores. This also shows a limitation of GEI as a measure of excitement. Against most opponents, a team like Duke will still be fairly heavily favored when the score is close and, as such, won’t be able to rack up as high a Game Excitement Index. Perhaps at this stage of the season, GEI is best used to rank mid-major teams, and it would be wise to wait until conference play begins to evaluate high-major teams on this metric.
GEI Game Types by Conference
Most Exciting Game by Date
Finally, one can look at the most exciting game on each day of the season. I got the idea for the below chart from Jordan Sperber’s look at the best ranked games each day per KenPom FanMatch, which seeks to quantify the quality of a game before it is played.
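Picking the most exciting game on each date amounts to a grouped maximum. A minimal base R sketch follows, using a made-up data frame `toy_daily` with one row per game; the dates and GEI values are invented for illustration.

```r
# Toy stand-in: one row per game, with its date and GEI.
toy_daily <- data.frame(
  date    = as.Date(c("2018-11-06", "2018-11-06", "2018-11-07", "2018-11-07")),
  game_id = c(1, 2, 3, 4),
  gei     = c(11.14, 2.30, 0.90, 6.50)
)

# For each date, keep the row with the largest GEI
by_date <- split(toy_daily, toy_daily$date)
best_by_date <- do.call(rbind, lapply(by_date, function(d) d[which.max(d$gei), ]))
```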
By no means is GEI meant to be a perfect metric, and it might fail to capture important aspects of the game that a given individual might find to be exciting. That being said, I think it does a pretty good job of capturing a lot of what makes games fun to watch, and helps raise awareness of a lot of good mid-major basketball that often flies under the radar. I’ll keep updating these materials as conference play kicks off in the coming weeks to see if/how the metric changes, but for now, we can sit back, relax, and appreciate some good college hoops.