Introduction to regularized adjusted plus-minus (RAPM)
An introduction to ridge regression in the context of estimating basketball player effects.
Motivation
Measuring a player’s effect on game outcomes is one of the most fundamental tasks in sports analytics. But this is not a simple thing to do, and how to do it varies greatly between sports! In the National Basketball Association (NBA), traditional box-score statistics provide a limited view of a player’s performance. To measure an individual player’s contribution, it is necessary to adjust for the presence of their teammates and opponents. Regularized adjusted plus-minus (RAPM) models, in their different versions, are a popular approach in the basketball analytics community for addressing this challenge. In this module, you will build a RAPM model in R for NBA players to estimate an individual player’s effect when on the court.
Learning Objectives
By the end of this module, you will be able to:
- Fit, interpret, and understand the limitations of adjusted plus-minus models.
- Understand the role of penalization in ridge regression.
- Become familiar with the basics of implementing ridge regression in R with glmnet.
- Fit, interpret, and evaluate players using regularized adjusted plus-minus models.
Data
The dataset and description are available at the SCORE Network Data Repository.
Module Materials
Prior to working through this module, students are expected to have the following:
- Familiarity with R and basic tidyverse data wrangling functions.
- Exposure to linear regression.
- Familiarity with cross-validation.
The module marks which portions are challenging exercises and is designed to take an undergraduate student roughly 3-4 hours to complete.