
The probability puzzle - Why the probability of a continuous measurement is zero.

  • Writer: Christos Nikolaou
  • Apr 24
  • 5 min read


What is the chance that your next TPLO candidate needs a TPA of 5 or 6.5 degrees? Zero!


In a previous post, we discussed the function that describes the probability of a variable, which we call the distribution of the variable. We said that this function gives the probability that the variable (e.g. height) is within a specific interval (e.g. between 160 and 170 cm) (Figure 1).


One question that may have come to mind is: "Why do we talk about the probability of a variable (e.g. the height of an adult) being within a specific interval (e.g. 160-170 cm) and not about the probability of the variable taking a specific value (e.g. 160 cm)?"



Figure 1. P(160<x<170) is the probability that the height of an adult is within the range of 160-170 cm. The graph is for the purposes of illustration and does not represent the proper graph of a normal distribution, where the probability is depicted by the area under the curve and not the y-axis. The sample mean of 180 cm is 1.43 standard deviations (SD) away from the population mean. Hence, the test statistic is 1.43.

Another valid question would be: "Why do we define the p-value as the probability of a test statistic being more extreme than the one we found, instead of the probability that it is exactly the one we found?" As a reminder, when we use the z-test or the t-test, the test statistic is the number of standard deviations our sample mean is away from the population mean. The p-value is the probability that a sample mean is even further away from the population mean or, in other words, that the test statistic is even further away from 0. If our test statistic was 1.43, the p-value is the probability that it is between 1.43 and positive infinity or between -1.43 and negative infinity. These are the intervals we are looking into (Figure 2).



Figure 2. p-value is the probability that the test statistic is between 1.43 and plus infinity or -1.43 and minus infinity. So, we are talking about the probability of a value being within an interval.
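To make the two tails concrete, here is a minimal sketch of the calculation for a test statistic of 1.43. It assumes a z-test, so that the test statistic follows a standard normal distribution; the function name `two_sided_p_value` is only illustrative and does not come from the post.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability that a standard normal test statistic is at least as far from 0 as |z|."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)  # assumption: z-test
    # Area of the interval from |z| to plus infinity.
    upper_tail = 1.0 - standard_normal.cdf(abs(z))
    # By symmetry, the interval from minus infinity to -|z| has the same area.
    return 2.0 * upper_tail


p = two_sided_p_value(1.43)
print(f"two-sided p-value for a test statistic of 1.43: {p:.3f}")
```

Both terms are areas over intervals, which is why the p-value is a positive number rather than zero.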

Before we answer these questions, we need to distinguish between a continuous and a discrete variable. A continuous variable can take any value between two numbers. For example, the height of an adult can be any number within the range of heights in the adult population. It can be 160 cm, 160.3 cm, or 160.03 cm. In other words, a continuous variable can take an infinite number of values within any interval. On the other hand, a discrete variable can only take specific, separate values. For example, when we throw a die, we can get 1, 2, 3, 4, 5 or 6.


Let's assume that the die is fair, meaning that there is no reason why one number will occur more often than the others. We also know that the number that occurs does not affect what number will occur next. For example, if we roll a 3, this does not affect the next roll. In other words, the outcomes (1, 2, 3, 4, 5, 6) are equally likely and the rolls are independent. In this case, the probability of getting any one number out of six is exactly that, i.e. 1/6.


Now, let's assume that the die is still fair, but it has two threes and no twos. So, the possible outcomes are 1, 3, 3, 4, 5, 6. Now, we have two threes out of a total of six equally likely faces. So the probability of rolling a 3 is 2/6 or 1/3.
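The counting in the two die examples can be sketched in a few lines. The helper `face_probabilities` is a hypothetical name used only for this illustration, and it assumes that every face of the die is equally likely to land up.

```python
from collections import Counter
from fractions import Fraction


def face_probabilities(faces):
    """Map each distinct value to (number of faces showing it) / (total number of faces)."""
    counts = Counter(faces)
    total = len(faces)
    return {value: Fraction(count, total) for value, count in counts.items()}


standard_die = face_probabilities([1, 2, 3, 4, 5, 6])
modified_die = face_probabilities([1, 3, 3, 4, 5, 6])  # two threes, no twos

print(standard_die[3])         # 1/6
print(modified_die[3])         # 1/3
print(modified_die.get(2, 0))  # 0, because no face shows a 2
```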


So far, we have been discussing the probability of events of a discrete variable. The variable is the result of rolling the die. We can call it X. The event is the value we get when we roll it. For example, if we roll a 2, X = 2, where 2 is the event.


Now let's move to a different example. We know that there is a fault in an underground cable with a length of 40 m. We call the origin of the cable 0. We name every single cross-section of the cable after its distance from 0. So, the cross-section that is 20.23 m away from the origin is called 20.23. There is no reason for us to believe that the fault in the cable is more likely to occur at one cross-section than the others. Also, a fault at one cross-section does not affect the likelihood of a fault occurring at any other cross-section. What is the probability of the fault occurring at the cross-section 20.23?



Figure 3. A cable of 40 m in length. The cross-sections at 20 and 20.23 m from the origin are depicted.

If we follow the definition of the probability of a variable with independent and equally likely events, then the probability of the fault occurring at cross-section 20.23 is one out of the total number of cross-sections. This time, our variable is the distance of the cross-section from the origin, and the events can be any number between 0 and 40. This is a continuous variable. So, we have infinitely many possible events. As a result, the probability of the fault occurring at one of them is one out of an infinite number of options. As the number of options grows without bound, one divided by that number gets closer and closer to 0, so the probability is 0.
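One way to see this is to cut the 40 m cable into equal sections and keep making the sections thinner. The sketch below is only an illustration; the name `single_section_probability` and the chosen section widths are assumptions, not part of the original example.

```python
from fractions import Fraction

CABLE_LENGTH_M = 40  # length of the cable in the example


def single_section_probability(section_width_m: Fraction) -> Fraction:
    """Probability that the fault lies in one particular section of the given width."""
    n_sections = Fraction(CABLE_LENGTH_M) / section_width_m
    return 1 / n_sections  # one section out of n equally likely sections


# Thinner sections mean more of them, so each one gets a smaller probability.
for width in (Fraction(1), Fraction(1, 100), Fraction(1, 10**6)):
    probability = single_section_probability(width)
    print(f"width {float(width):g} m -> probability {float(probability):.3g}")
```

A cross-section has no width at all, so it sits at the end of this shrinking sequence, where the probability is 0.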


So, when the variable is continuous (length, height, blood pressure, etc.), the probability of the variable taking any single exact value is zero. In our example, the probability of the fault occurring exactly at cross-section 20.23 is zero.
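The same thing happens with height. The sketch below assumes, purely for illustration, that adult height is normally distributed with a mean of 170 cm and a standard deviation of 7 cm; these numbers are hypothetical and are not stated in this post. It shrinks an interval around 160 cm until it becomes the single value 160 cm.

```python
from statistics import NormalDist

# Hypothetical population: mean 170 cm, SD 7 cm (illustrative values only).
heights = NormalDist(mu=170.0, sigma=7.0)


def interval_probability(dist: NormalDist, low: float, high: float) -> float:
    """Area under the distribution between low and high."""
    return dist.cdf(high) - dist.cdf(low)


# An interval has a positive probability.
print(f"P(160 < x < 170) = {interval_probability(heights, 160.0, 170.0):.3f}")

# Shrinking the interval around 160 cm shrinks the probability towards 0.
for half_width in (5.0, 0.5, 0.005, 0.0):
    p = interval_probability(heights, 160.0 - half_width, 160.0 + half_width)
    print(f"160 +/- {half_width:g} cm -> {p:.6f}")
```

With a half-width of 0, the interval is the single value 160 cm and the area under the curve is exactly 0.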


But what if we want to calculate the probability of a fault occurring between cross-sections 0 and 20? This is like asking the question "What is the probability of the fault occurring in one half of the cable?". The intuitive answer would be 1/2, because we have two halves and we are asking for the probability of the fault occurring in one of them. So, we have two possible events: the fault happens in the first half, or the fault happens in the second half. The events are equally likely and mutually exclusive, so the probability of either of them happening is 1/2. More generally, when the fault is equally likely to be anywhere, the probability of an interval is its length divided by the length of the cable: 20/40 = 1/2.
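The interval calculation can be checked with a short simulation. This is a minimal sketch that assumes the fault position is uniformly distributed along the 40 m cable; the function names, the number of trials and the random seed are illustrative choices.

```python
import random

CABLE_LENGTH_M = 40.0


def uniform_interval_probability(start_m: float, end_m: float) -> float:
    """Exact probability: length of the interval divided by the length of the cable."""
    return (end_m - start_m) / CABLE_LENGTH_M


def simulated_interval_probability(start_m: float, end_m: float,
                                   n_trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the same probability by placing the fault at random many times."""
    rng = random.Random(seed)
    hits = sum(
        start_m <= rng.uniform(0.0, CABLE_LENGTH_M) <= end_m
        for _ in range(n_trials)
    )
    return hits / n_trials


print(uniform_interval_probability(0.0, 20.0))    # 0.5
print(simulated_interval_probability(0.0, 20.0))  # close to 0.5
```

Narrowing the interval, for example to 20.22-20.24 m, makes both numbers small but still positive. Only an interval of zero length gives a probability of zero.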


So, what is the probability that a dog needs a Tibial Plateau Angle (TPA) of 6.5 or 5 degrees? Zero!


It would be more helpful to have the probability that a dog needs a TPA within a specific interval. This is called a prediction interval, and I have calculated it, based on current evidence, in another post.


In this post, we got an idea of why we only discuss the probabilities of intervals when the variable is continuous. We will look into a more rigorous mathematical approach in another post. But for now, we have an idea of why the p-value is not the probability of a test statistic occurring, but the probability that a test statistic is more extreme than the one we found.
