Hello and welcome to First Line Sports Analytics on YouTube! In this video, we will be continuing our journey into probability in the context of sports by discussing random variables. If you haven’t seen the previous video in this series, you can click on the thumbnail in the top-right of your screen or click the link in the description below.
We left off discussing probability fundamentals and probability functions. To recap, probability functions consider an action, count the number of events in a set in which that action occurred, then divide that by the number of possible events in the set or sample space.
The next step in understanding probability functions is to introduce random variables. First, a variable is just a value or number that changes, or varies. While we’d casually think of the term “random” as referring to a situation in which a bunch of things are equally likely to happen, with random variables, we’re referring specifically to a situation in which something can have different values, each value with its own probability. In this context, the word “random” just refers to the fact that something can take on any one of a number of values. Random variables actually serve as the input in probability functions. As an example, if we were to roll a die and ask “what’s the probability of rolling a number less than three?”, what we’re really asking is “what is the probability of the random variable (the number on the die) being a 1 or 2, given the probability function provided?”
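To make the die question concrete, here is a minimal Python sketch. It assumes a fair six-sided die, and the names `die_pmf` and `prob_less_than` are just made up for illustration.

```python
from fractions import Fraction

# Probability function for a die. Assumption: the die is fair,
# so each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob_less_than(pmf, threshold):
    """Add up the probability of every value of the random variable below threshold."""
    return sum((p for value, p in pmf.items() if value < threshold), Fraction(0))

# "What's the probability of rolling a number less than three?"
# is the same as asking for P(X = 1) + P(X = 2).
p_less_than_three = prob_less_than(die_pmf, 3)
print(p_less_than_three)  # 1/3
```

The random variable (the number on the die) is the input, and the probability comes out the other side.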
Since the output of a probability function, the probability of something happening, is always between 0 and 1, the probabilities of all values of the random variable within the sample space have to add up to 1, and all values outside of the sample space must have a probability of zero. Meaning, the probability of rolling a 1 through 6 on a die is 1, because those are all of the possible values, but the probability of rolling any other number is zero.
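We can check both of those rules with a quick sketch. Again, this assumes a fair die, and the helper name `probability` is hypothetical.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, each face with probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def probability(pmf, value):
    """Return P(X = value); anything outside the sample space gets probability zero."""
    return pmf.get(value, Fraction(0))

# The probabilities of all values in the sample space add up to 1.
total = sum(die_pmf.values(), Fraction(0))
print(total)  # 1

# A value outside the sample space, like rolling a 7, has probability zero.
print(probability(die_pmf, 7))  # 0
```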
Random variables can be classified as either discrete or continuous. A discrete random variable is one that can only take on distinct, separate values, like the die. It can only be 1, 2, 3, 4, 5, or 6. Not 1.2. A continuous random variable is one that can equal any number within a certain interval. For example, tomorrow’s temperature is continuous, so we could look at the probability that it’s going to be between 75 and 80 degrees outside tomorrow. In terms of using these different kinds of variables in probability functions, it is the difference between looking for the probability of a single value [ P(X = x) ] and the probability of a range of values [ P(a <= X <= b) ].
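Here is a rough sketch of that difference in Python. The die is assumed to be fair, and the temperature model is purely an assumption for illustration: a normal distribution with a mean of 77 degrees and a standard deviation of 5 degrees, not a real forecast.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete: P(X = x) for a single value of a fair die (assumed fair).
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
p_exactly_four = die_pmf[4]
print(p_exactly_four)  # 1/6

# Continuous: P(a <= X <= b) for a range of values.
# Assumption: tomorrow's temperature follows a normal distribution
# with mean 77 and standard deviation 5. These numbers are invented.
temperature = NormalDist(mu=77, sigma=5)
p_75_to_80 = temperature.cdf(80) - temperature.cdf(75)
print(round(p_75_to_80, 3))
```

For the continuous case, the probability of the range comes from subtracting the cumulative probability at 75 from the cumulative probability at 80.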
So, why don’t we relate this back to sports? Let’s say a basketball player takes 10 shots. The number of shots they make would be a discrete random variable because it can only be a whole number from 0 to 10. But let’s also say that the player was on the floor for 30 minutes in that game. The amount of time they were on offense would be a continuous random variable because it can be any value between 0 and 30 minutes. Both of these are random variables and can serve as part of probability functions.
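As a sketch of what a probability function for the made shots could look like, the code below makes two big assumptions that are not from this video: the player makes each shot with a hypothetical probability of 45%, and each shot is independent of the others. Under those assumptions, the number of makes follows what is called a binomial distribution. The names `made_shots_pmf` and `MAKE_PROBABILITY` are made up for this example.

```python
from math import comb

ATTEMPTS = 10
MAKE_PROBABILITY = 0.45  # hypothetical shooting percentage, not real data

def made_shots_pmf(attempts, make_probability):
    """P(X = makes) for every whole number of makes, assuming independent shots."""
    return {
        makes: comb(attempts, makes)
        * make_probability**makes
        * (1 - make_probability) ** (attempts - makes)
        for makes in range(attempts + 1)
    }

shots_pmf = made_shots_pmf(ATTEMPTS, MAKE_PROBABILITY)

# The discrete random variable can only be a whole number from 0 to 10.
print(sorted(shots_pmf))

# The probabilities of all possible values add up to 1.
print(round(sum(shots_pmf.values()), 6))  # 1.0

# Probability of making at least half of the shots.
p_at_least_five = sum(p for makes, p in shots_pmf.items() if makes >= 5)
print(round(p_at_least_five, 3))
```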
So, in summary, a random variable is a variable that can take on different values, each with its own probability. Random variables serve as the input in probability functions, and they can be either discrete or continuous.
But, what can we get out of probability functions? What kind of information can we tell about the situation being modeled? That is what we will begin to cover in the next video in this series!