THE DATA MADE ME DO IT

An amateur coder’s guide to fantasy football

Data analytics, save my fantasy team!

By AVANI LAKKIREDDY
Former USC quarterback Sam Darnold has a high “boom” probability, according to a data analysis. He is pictured on Nov. 26, 2016. (Nick Entin / Daily Trojan file photo)

There comes a time in early September when, inevitably, the call of fantasy football becomes too loud to drown out. Commissioners are chosen. Leagues are drawn up. Fantasy boards are created. Players are drafted.

And, just like every other year, hope flows eternal through the minds of those contained within the thousands of leagues. “This will be my year,” you say. “I know Ladd McConkey is going to go off,” you say. “Joe Burrow is going to have a legacy season,” you say. And you draft accordingly. But just a few weeks after that fated draft, the highest of highs turn into the lowest of lows. 

Suddenly, you’re on a two-week losing streak, with your first-round pick underperforming and your quarterback listed as questionable. Your defense has just given up 41 points, and somehow, every single running back on your roster is putting up 2.5 points. Just two weeks later you think to yourself, “Man, I should have just autodrafted.” 


The dread has finally set in. You’re cooked.

I found myself in this state of panic as well. It was in this self-inflicted hole that I decided enough was enough. I — an amateur coder at best — would create a model that would accurately predict the boom and bust probabilities of key offensive players this NFL season. 

Methodology 

The idea came to me in my “Applied Machine Learning and Data Mining” class, thanks to my professor, Martin Prescher. As he was lecturing about logistic regression and classification, I was thinking about how bad my fantasy team was. The two paths converged, and my idea was born: a classification system, coded in Python, that predicts whether players will overperform or underperform week to week.

The vast majority of the process was data processing and cleaning. I used ESPN’s point-per-reception (PPR) scoring as my system and sourced PPR data from both ESPN and Stathead (thank you, free trial). I also included strength of schedule as a feature in my model, which I sourced from Fox Sports.
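For readers curious how a logistic regression turns features into a probability, here is a minimal from-scratch sketch. It is not the column's actual code: the two features (`schedule_ease` and `recent_form`), the training rows and the learning settings are all made up for illustration, and a real project would more likely lean on a machine learning library.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on log loss."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of log loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def boom_probability(x, weights, bias):
    """Estimated probability that a player overperforms their projection."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical training data: [schedule_ease, recent_form], each scaled 0 to 1.
# Label 1 means the player overperformed that week, 0 means they did not.
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.3], [0.1, 0.4], [0.3, 0.1]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic(X, y)
easy = boom_probability([0.85, 0.75], weights, bias)  # soft schedule, hot streak
hard = boom_probability([0.15, 0.25], weights, bias)  # tough schedule, cold streak
print(f"easy matchup: {easy:.2f}, hard matchup: {hard:.2f}")
```

The output is a number between 0 and 1 for each player, so a fantasy manager can rank players by it instead of treating the prediction as a flat yes or no.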

I defined a “boom” as a week in which a player’s actual points are at least 120% of their projected points. A “bust” is at the opposite end of the spectrum: a week in which a player scores less than 80% of their projected points. The model also outputs a “boom probability,” a percentage that estimates how likely a player is to overperform their projection.
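As a rough sketch, the labeling rule above could look like this in Python. The function name, the “neutral” label for everything in between, and the choice to count exactly 120% as a boom are my assumptions for illustration, not details taken from the column's code.

```python
def classify_week(actual_points, projected_points, boom_ratio=1.2, bust_ratio=0.8):
    """Label one player-week relative to that player's own projection."""
    if projected_points <= 0:
        raise ValueError("projected_points must be positive")
    ratio = actual_points / projected_points
    if ratio >= boom_ratio:   # assumed: exactly 120% counts as a boom
        return "boom"
    if ratio < bust_ratio:    # strictly less than 80% of the projection
        return "bust"
    return "neutral"          # assumed name for the middle band


# Made-up numbers, each against a 10-point projection.
print(classify_week(15.0, 10.0))  # 150% of projection
print(classify_week(7.0, 10.0))   # 70% of projection
print(classify_week(10.5, 10.0))  # 105% of projection
```

Because the label depends only on the ratio of actual to projected points, two players with very different raw scores can land in the same category.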

Margin of error

Before I share the results of my model, a note on accuracy and the nature of this classifier. First, the boom classification is relative. For example, quarterback Josh Allen will, on average, score more points than running back Kenneth Walker III. However, Walker may be classified as a “boom” and Allen as a “bust” because each is measured against his own projection: Walker can overperform his while Allen underperforms his, even if Allen scores more points than Walker.

Second, after running my classifier, I calculated the F1-score of my model. Some technical knowledge here: an F1-score combines two measures into a single number. Precision is how often the model is right when it predicts a class, and recall is how many of the actual instances of that class it catches. The highest possible F1-score is 1, or 100%.
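To make the F1-score concrete, here is a small sketch that computes it from raw counts using the standard formula, the harmonic mean of precision and recall. The counts in the example are invented and are not the model's real numbers.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall)."""
    predicted = true_positives + false_positives   # everything the model called a boom
    actual = true_positives + false_negatives      # everything that really was a boom
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 3 booms caught, 1 false alarm, 3 booms missed.
example = f1_score(true_positives=3, false_positives=1, false_negatives=3)
print(round(example, 2))
```

Since the harmonic mean is dragged down by whichever of the two measures is weaker, a model that misses most of the real booms will post a low F1-score even if its few boom calls are usually right.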

The “boom” side of the model received an F1-score of 20%, while the “bust” side received an F1-score of 67%. That demonstrates important limitations of my model: it is better at predicting busts than booms and is, overall, not that accurate.

A likely reason for this inaccuracy is that fantasy points depend on more than the features I included in my model. There are hundreds of factors involved in a football player’s performance, down to the nitty-gritty that can’t even be quantified. A more specific set of features may lead to better predictions. All this is to say: take my advice with about a teaspoon of salt.

The results

After running my model, I pulled the players it classified as booms and sorted them by position, focusing on those with the highest boom probability going forward.

The first running back on the list is Seattle Seahawks running back Kenneth Walker III, who, after a quiet week one, exploded for 18.8 points in Seattle’s win against the Steelers. Also, Indianapolis Colts back Jonathan Taylor has a high boom probability after a monster week two. 

Quarterbacks Sam Darnold (fight on) of the Seahawks, Bo Nix of the Denver Broncos and Bryce Young of the Carolina Panthers also have high boom probabilities, with all three putting up fine performances in weeks one and two. With five starting quarterbacks out injured right now (let them play on grass!), all three may be good pickups.

Wide receivers populate most of the list, with the Colts’ Josh Downs, Miami Dolphins’ Tyreek Hill, Detroit Lions’ Jameson Williams and San Francisco 49ers’ Jauan Jennings all having high boom probabilities. Green Bay Packers tight end Tucker Kraft is also high on the list.

On the other hand, running backs Breece Hall of the New York Jets, Josh Jacobs of the Packers and Dylan Sampson of the Cleveland Browns are rated low in boom probability and listed as busts.

The rest of the bottom 10 are mostly quarterbacks: the Washington Commanders’ Jayden Daniels, Arizona Cardinals’ Kyler Murray, Pittsburgh Steelers’ Aaron Rodgers, Packers’ Jordan Love, Chicago Bears’ Caleb Williams (fight on) and Jets’ Justin Fields. Jets wide receiver Garrett Wilson rounds out the list.

Both Daniels and Fields are currently listed as questionable with injuries, while most of the other quarterbacks on the list had relatively quiet weeks last time out. The lone exception on the bust list is Love, who scored 20.88 points in the Packers’ convincing win against the Commanders, per Fantasy Pros.

So, put in your trade requests, claim those players out of free agency and bask in the little reprieve you have before your roster falls apart again when your starting quarterback gets turf toe in Week 3. Happy Fantasy Football Season!

See my full dataset here and my Python code here

Avani Lakkireddy is a sophomore using data analytics to find patterns and model behavior in college and professional sports in her column “The Data Made Me Do It,” which runs every other Friday.


© University of Southern California/Daily Trojan. All rights reserved.