VIDEO TRANSCRIPTIONS:
Hello and welcome back to First Line Sports Analytics on YouTube! With 9 minutes to go in Super Bowl 51, the Atlanta Falcons led the New England Patriots 28-12. They had the ball on 2nd down with 2 to go, 35 yards away from the end zone. ESPN Stats and Info’s win probability model reported that the Falcons had a 99.6% chance of winning their first-ever Super Bowl. But they went on to lose in overtime in one of the most storied endings in NFL history.
That ending sent crowds flooding to social media and the airwaves, questioning how this model could be accurate if the Patriots ended up winning, and even asking what the point of probability models was at all if games aren’t played on spreadsheets. But what many failed to realize was what this actually showed about probability models and how useful they can really be.
This is the first in what will be a series of videos in which we look at concepts of probability and statistics in the context of sports, with the intention of introducing the fundamental place that math has in the sports experience.
Let’s go over some of the fundamentals of probability, which are rooted in the mathematical concept of sets. A set is a collection of objects being investigated. An object, then, is any single item in a set. Say we have some action, like an at-bat in baseball. One set could be all of the scenarios that lead to the batter reaching base. An object within that set could be a home run. Every possible result of an at-bat exists in what is called the sample space, which can contain any number of sets with any number of objects within them.
Let’s consider drawing a heart from a conventional 52-card deck. The action is drawing a card, the set is all of the hearts in the deck, and the sample space is all 52 cards in the deck. To calculate the probability in question, we would take the number of cards in the set, 13, and divide by the number of cards in the sample space, 52, which gives 13/52, or 25%. This calculation is an example of what is called a “probability function”.
The input of a probability function is the outcome, or set of outcomes, in question as part of the “experiment”. In a case like the deck of cards, the output is a fraction where the numerator is the number of outcomes in the set in question, and the denominator is the number of outcomes in the sample space. This fraction is most commonly shown as a decimal or percentage, especially in sports. This number represents the probability of the event in question.
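To make the card example concrete, here is a minimal Python sketch of that calculation. The names `sample_space`, `hearts`, and `probability` are chosen just for this illustration, and the function assumes every card is equally likely to be drawn.

```python
from fractions import Fraction

SUITS = ["hearts", "diamonds", "clubs", "spades"]
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

# The sample space: every card that could be drawn from the deck.
sample_space = [(rank, suit) for suit in SUITS for rank in RANKS]

# The set in question: every card that is a heart.
hearts = [card for card in sample_space if card[1] == "hearts"]


def probability(event, space):
    """Size of the set divided by size of the sample space.

    Only valid when every outcome in the space is equally likely.
    """
    return Fraction(len(event), len(space))


p_heart = probability(hearts, sample_space)
print(p_heart, float(p_heart))  # prints: 1/4 0.25
```

Using `Fraction` keeps the answer as an exact 13/52 (reduced to 1/4) before it is turned into the decimal you would usually see on a broadcast.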
Picking a card out of a deck represents what is called a simple sample space, meaning the probability of each single outcome is equal to the probability of any other single outcome. In this case, that is 1 out of 52 for each card. But sample spaces in most situations are not simple, and they are never simple in sports. Sports have an almost infinite number of possible scenarios and outcomes, each with its own likelihood.
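Here is a rough sketch of what changes when the sample space is not simple. The at-bat probabilities below are made up purely for illustration (they are not real baseball data), and the names `at_bat`, `reaches_base`, and `event_probability` are hypothetical. Instead of counting outcomes, we add up the likelihood of each outcome in the set.

```python
# Made-up probabilities for a single at-bat, for illustration only.
# They are NOT real data; they just need to add up to 1.
at_bat = {
    "out": 0.68,
    "walk": 0.08,
    "single": 0.15,
    "double": 0.05,
    "triple": 0.01,
    "home run": 0.03,
}

# The set in question: outcomes where the batter reaches base.
reaches_base = ("walk", "single", "double", "triple", "home run")


def event_probability(event, distribution):
    """Add up the probability of every outcome in the event."""
    return sum(distribution[outcome] for outcome in event)


p_on_base = event_probability(reaches_base, at_bat)

# Counting outcomes, as we did with the cards, gives a very different answer.
naive_count = len(reaches_base) / len(at_bat)

print(f"weighted: {p_on_base:.2f}")    # prints: weighted: 0.32
print(f"counting: {naive_count:.2f}")  # prints: counting: 0.83
```

Five of the six listed outcomes put the batter on base, but because an out is so much more likely than any of the others, simply counting outcomes would badly overstate the chance of reaching base.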
In Super Bowl 51, that 99.6% win probability for the Falcons meant that in the estimated sample space of sequences to end that game, 99.6% of them resulted in a victory for the Falcons. But 0.4% of the situations in this model resulted in the Patriots winning. Just because one of these 0.4% of possible results actually happened doesn’t mean that the model was “wrong” or “broken”. It just meant the Falcons were more likely to win. In fact, a Patriots win is consistent with what the model said: their comeback was incredibly unlikely, but still possible.
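One way to get a feel for what 0.4% means is to imagine replaying the end of that game many times. The sketch below is only a toy coin-flip simulation using the 99.6% figure quoted above. It is not how ESPN's model works, and the function name, the seed, and the number of replays are arbitrary choices for this example.

```python
import random

FAVORITE_WIN_PROBABILITY = 0.996  # the figure quoted for the Falcons
N_GAMES = 100_000                 # arbitrary number of imagined replays


def simulate_upsets(p_favorite, n_games, seed=51):
    """Count how many replays the underdog wins.

    Each replay is one independent random draw: the favorite wins
    with probability p_favorite, otherwise the underdog wins.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(1 for _ in range(n_games) if rng.random() >= p_favorite)


upsets = simulate_upsets(FAVORITE_WIN_PROBABILITY, N_GAMES)
print(f"{upsets} upsets in {N_GAMES} replays ({upsets / N_GAMES:.2%})")
```

With this many replays, the underdog should win roughly 4 times in every 1,000. That is rare, but it is not zero, and any single game could be one of them.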
So, why run a model if it just reaffirms what we already feel? Well, there are lots of other uses for probability models in sports besides live spectator entertainment. For some of those further uses and more about probability in sports in general, watch out for more videos in this series!