Fantasy Football Rushing Yards Over Expected Model

Introduction

As I continue my search for new ways to compare and evaluate players, it is evident that “traditional” counting stats are simply not enough anymore. To get the full scope of a player, you need more. And if not all rushing attempts are created equal, why should we treat their outcomes as equal?

Next Gen Stats has figured this out and uses tracking data to evaluate players relative to their surroundings. Unfortunately, they keep their data private. This is where my new model, found exclusively at brotofantasy.com, comes in. Using publicly available stats, I attempt to predict the number of yards the “average” running back would get in any given rushing scenario.

Instead of using speed, direction and the position of defenders, I used situational data to get an expected rushing yards (xRY) number for each play, then compared it to the actual result of the play to get rushing yards over expected (RYOE), which is simply actual yards minus xRY.

In simple terms, xRY of a given play can be viewed as the average yards gained by all the players in the past decade that had a rush attempt in the same game situation at the moment of handoff… but extended to all possible scenarios.
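
To make that definition concrete, here is a minimal sketch of the "average of all plays in the same situation" idea. The plays and the choice of down and distance as the whole "situation" are made up for illustration; the real model uses 14 features and a learned model rather than a lookup table.

```python
from collections import defaultdict

# Hypothetical toy plays: (down, yards_to_go, yards_gained). Not real data.
plays = [
    (1, 10, 4), (1, 10, 6), (1, 10, 2),
    (3, 1, 1), (3, 1, 3),
]

# Group historical outcomes by game situation.
by_situation = defaultdict(list)
for down, to_go, gained in plays:
    by_situation[(down, to_go)].append(gained)

# xRY for a situation = mean yards gained on past plays in that situation.
xry = {sit: sum(v) / len(v) for sit, v in by_situation.items()}

def ryoe(down, to_go, gained):
    """Rushing yards over expected for one play: actual minus expected."""
    return gained - xry[(down, to_go)]

print(xry[(1, 10)])    # 4.0
print(ryoe(1, 10, 6))  # 2.0
print(ryoe(3, 1, 1))   # -1.0
```

Summed over the same plays used to build the averages, RYOE comes out to zero by construction, which is the property an over/under expectation metric should have.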

Below, I describe what went into making the model and then present the results.

The Model

The data used for the model included all rushing plays since 2010 (provided by nflfastR), excluding QB scrambles and plays that resulted in penalties, to ensure we had a representative sample. Then, all plays were split on whether the rush was up the middle or to the outside, in order to calculate the average yards allowed by defensive units in those situations.
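
The filtering and splitting step could look something like the sketch below. The field names (`defteam`, `run_location`, `yards_gained`, `qb_scramble`, `penalty`) are modeled on nflfastR-style columns but should be read as assumptions, and the rows are invented; this is not the exact code behind the model.

```python
from collections import defaultdict

# Hypothetical rows with assumed nflfastR-style field names; values are made up.
plays = [
    {"defteam": "PHI", "run_location": "middle", "yards_gained": 3, "qb_scramble": 0, "penalty": 0},
    {"defteam": "PHI", "run_location": "left", "yards_gained": 7, "qb_scramble": 0, "penalty": 0},
    {"defteam": "PHI", "run_location": "right", "yards_gained": 5, "qb_scramble": 0, "penalty": 0},
    {"defteam": "PHI", "run_location": "middle", "yards_gained": 12, "qb_scramble": 1, "penalty": 0},
    {"defteam": "PHI", "run_location": "middle", "yards_gained": 1, "qb_scramble": 0, "penalty": 1},
    {"defteam": "DAL", "run_location": "middle", "yards_gained": 2, "qb_scramble": 0, "penalty": 0},
]

def keep(play):
    """Drop QB scrambles and plays that resulted in a penalty."""
    return play["qb_scramble"] == 0 and play["penalty"] == 0

def direction(play):
    """Collapse run location into 'middle' vs 'outside' (left or right)."""
    return "middle" if play["run_location"] == "middle" else "outside"

# (defense, direction) -> [total yards allowed, carries faced]
totals = defaultdict(lambda: [0, 0])
for p in filter(keep, plays):
    key = (p["defteam"], direction(p))
    totals[key][0] += p["yards_gained"]
    totals[key][1] += 1

avg_allowed = {key: yards / n for key, (yards, n) in totals.items()}
for key, value in sorted(avg_allowed.items()):
    print(key, value)
```

The resulting per-defense averages can then be attached to each play as a feature describing how tough the defense is against that type of run.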

Once we had all the necessary info, 80% of the data was fed to an extreme gradient boosting (XGBoost) algorithm with 14 features to train the model, and the rest was held out for testing. The general idea of gradient boosting is to build an ensemble of small models (typically decision trees) one after another, with each new model “learning” from the mistakes the previous ones made. In practice, this tends to produce very accurate predictions.

The 14 features included field position, Vegas win probability, time remaining, down, distance and whether the play was run out of shotgun or under center, among others.
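
For readers who want to see the "models learning from previous mistakes" idea in code, here is a stripped-down sketch. It is not XGBoost and not the actual model: it uses one-split trees (stumps), plain squared error, two invented features and synthetic data, and it leaves out XGBoost's regularization. The 80/20 split mirrors the one described above.

```python
import random

rng = random.Random(0)

def make_play():
    """Synthetic play: features [down, yards_to_go], target yards gained (invented)."""
    down, to_go = rng.randint(1, 4), rng.randint(1, 10)
    yards = 2.0 + 0.6 * to_go - 0.5 * (down - 1) + rng.gauss(0, 1)
    return [down, to_go], yards

data = [make_play() for _ in range(250)]
cut = int(0.8 * len(data))               # 80% to train, 20% held out
train, test = data[:cut], data[cut:]

def fit_stump(xs, residuals):
    """Best one-split tree (feature, threshold, left mean, right mean) by squared error."""
    best = None
    for j in range(len(xs[0])):
        values = sorted({x[j] for x in xs})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def stump_value(stump, x):
    j, t, lm, rm = stump
    return lm if x[j] <= t else rm

def fit_boosted(xs, ys, n_rounds=40, learning_rate=0.1):
    base = sum(ys) / len(ys)             # start from the overall mean
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]   # current mistakes
        stump = fit_stump(xs, residuals)                 # next model fits them
        stumps.append(stump)
        preds = [p + learning_rate * stump_value(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, lr = model
    return base + lr * sum(stump_value(s, x) for s in stumps)

def mse(model, rows):
    return sum((y - predict(model, x)) ** 2 for x, y in rows) / len(rows)

train_x = [x for x, _ in train]
train_y = [y for _, y in train]
baseline = fit_boosted(train_x, train_y, n_rounds=0)   # predicts the mean only
boosted = fit_boosted(train_x, train_y)

print("held-out MSE, mean only:", round(mse(baseline, test), 3))
print("held-out MSE, boosted:  ", round(mse(boosted, test), 3))
```

Each round fits a new stump to whatever the current ensemble still gets wrong, and the small learning rate keeps any single stump from dominating. Comparing against the mean-only baseline on the held-out 20% is a quick sanity check that the boosting is picking up real signal.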

Each one adds a level of specificity so that the model can better differentiate between plays.

The Results

So how did the model perform? To be perfectly honest: very, very well.

When tested against data the model had never seen before, I found very straightforward relationships. While the average yards per play in the test data (24,800 plays) sat at 4.208, the average expected yards from our model came out at… 4.200. That gives us an average rushing yards over expected (RYOE) of 0.008, virtually 0, right where we want it. And that includes all the breakaway plays where RBs gain 20+ yards, even though the model's maximum predicted value is ~10 yards.

For that same reason, the relationship between xRY and actual yards on a play-by-play basis will be underwhelming at best. There simply will never be an instance where the model predicts 70+ yards (something Miles Sanders did 3 times this past season).

However, if we group by player and season to analyze those results, we get the following:

[Figure: expected vs. actual rushing yards, grouped by player and season, test data]

That’s a 94.8% correlation (0.898 R^2) on unseen data, for those keeping track at home. Ninety-four!!
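
As a rough sketch of how a player-season comparison like this can be computed: group the per-carry rows, total the actual and expected yards, and take the Pearson correlation across groups. The rows below are invented, and aggregating by totals (rather than per-carry averages) is my assumption for the example. R^2 is taken here as the square of the correlation, which lines up with the numbers above (0.948 squared is roughly 0.90).

```python
from collections import defaultdict
from math import sqrt

# Hypothetical per-carry rows: (player, season, actual_yards, xry). Made-up values.
carries = [
    ("RB A", 2020, 5, 4.1), ("RB A", 2020, 3, 3.9), ("RB A", 2020, 6, 4.5),
    ("RB B", 2020, 2, 4.0), ("RB B", 2020, 9, 4.4),
    ("RB C", 2020, 1, 3.8),
    ("RB A", 2019, 4, 4.2), ("RB A", 2019, 7, 4.3),
    ("RB A", 2019, 3, 4.0), ("RB A", 2019, 5, 4.6),
]

# (player, season) -> [total actual yards, total expected yards]
totals = defaultdict(lambda: [0.0, 0.0])
for player, season, actual, expected in carries:
    totals[(player, season)][0] += actual
    totals[(player, season)][1] += expected

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

actual_totals = [a for a, _ in totals.values()]
expected_totals = [e for _, e in totals.values()]
r = pearson(actual_totals, expected_totals)
r_squared = r ** 2
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

Grouping smooths out the play-to-play noise, which is why the relationship looks so much stronger here than it would on individual carries.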

Taking it one step further, if we take rushers with at least 20 carries in that same test data, the root-mean-square error (RMSE) between yards per carry and xRY/Att is 1.13. Not half bad.

Furthermore, when compared against nflfastR’s expected points added (EPA), RYOE had a very interesting nonlinear relationship with 72.7% correlation (0.528 R^2), supporting the idea that rushing for more yards than expected does lead to a higher point expectancy.
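
Here is a small sketch of that calculation, with made-up rushers and an assumed row layout. It also shows why the 20-carry minimum matters: a low-volume back with one long run can blow up the error on his own.

```python
from math import sqrt

# Hypothetical rows: (player, carries, yards_per_carry, xry_per_attempt). Made-up values.
rushers = [
    ("RB A", 25, 5.0, 4.0),
    ("RB B", 40, 3.0, 4.0),
    ("RB C", 5, 9.0, 4.0),   # small sample, inflated by one long run
]

def rmse(rows):
    """Root-mean-square error between yards per carry and xRY per attempt."""
    squared_errors = [(ypc - xry) ** 2 for _, _, ypc, xry in rows]
    return sqrt(sum(squared_errors) / len(squared_errors))

MIN_CARRIES = 20
qualified = [row for row in rushers if row[1] >= MIN_CARRIES]

print(rmse(qualified))  # 1.0
print(rmse(rushers))    # 3.0
```

Because the errors are squared before averaging, RMSE punishes big misses more heavily than a plain average of absolute errors would.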

[Figure: RYOE vs. nflfastR EPA]

I am sure one question still remains: how did the model compare to the NGS version of RYOE? The answer, again, is very well. For the 51 RBs that met the criteria, there was an 80.4% correlation (0.646 R^2) to the 2020 NGS data, as you can see below.

[Figure: model RYOE vs. NGS RYOE for 51 RBs, 2020]

Again, on a different portion of the data, notice how our average RYOE was virtually 0, as any over/under expectation metric should be.

Finally, to tie the whole results section together, let's look at the 2020 leaders in RYOE:

[Figure: 2020 leaders in RYOE]

Outside of some shockers (I’m looking at you, Darrell Henderson), I think the league leaders are about as expected.

Limitations

While this model is very accurate and compares well to the one built on tracking data, it is not without its limitations. In many cases, the situation (and therefore the model) may call for a 5-yard gain when in reality a defender is already in the backfield at the moment of handoff, making those 5 yards impossible to gain. That is something we have no way of knowing from situational data alone.

We have big projects in the pipeline that take advantage of the model’s strengths, but specific play-by-play analysis isn’t one of them. To get the most out of the model, you have to know what it is capable of and what it is not.

A huge thanks to Ben Baldwin and Sebastian Carl for their work on nflfastR, and to Tej Seth for the idea and for laying the foundation this project was built upon.

For this and much more Fantasy Football and Football Analytics content, you can find us at @BRotoFFCasanova and @BRotoFantasy on Twitter and, of course, at brotofantasy.com.

By Santiago Casanova (@BRotoFFCasanova)

