How is the p-value found and what is a Type 1 error?

Christos Nikolaou
Apr 21
2 min read

Watch the post on YouTube.

Following the example from the previous post, let's say that the mean height in our sample is not 1.43 standard deviations away from the mean but z standard deviations away from it. So z, the number of standard deviations that separate our sample from the mean, is now treated as a variable (i.e., it can take any value in the set of real numbers). If we know the probability distribution of this variable, we can calculate the probability that it is equal to or greater than any given number. In our example, we want the probability that it equals or exceeds 1.43.

Now, I can tell you that the z variable, which is the test statistic in our example, follows a specific normal distribution that we call the z-distribution (the standard normal distribution, with mean 0 and standard deviation 1). The details do not matter here. What matters is that it is a normal distribution, and we graph it in such a way that the probability of getting a value within an interval equals the area under the curve over that interval. We call such functions probability density functions. So, the probability that we get a value equal to or greater than 1.43, or a value equal to or lower than -1.43, is the area under the curve from 1.43 to plus infinity plus the area from minus infinity to -1.43. This combined area is the p-value, and the statistical software we use calculates it for us. If this area is greater than 0.05, our sample is not an extreme sample. If it is less than 0.05, our sample is either an extreme sample from our population or a sample from a different population.
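To make the area calculation concrete, here is a minimal sketch in Python of what the statistical software does behind the scenes. It assumes the z-distribution is the standard normal distribution and uses only the standard library; the function name `two_sided_p_value` is made up for this illustration.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Return the area in both tails of the standard normal curve beyond |z|."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area from |z| to plus infinity (the upper tail).
    upper_tail = 1.0 - standard_normal.cdf(abs(z))
    # By symmetry, the area from minus infinity to -|z| is the same size.
    return 2.0 * upper_tail


p_value = two_sided_p_value(1.43)
print(f"p-value for z = 1.43: {p_value:.4f}")
print("extreme sample" if p_value < 0.05 else "not an extreme sample")
```

For z = 1.43 the two tails together cover roughly 0.15 of the total area, which is greater than 0.05, so this sample would not count as extreme.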

In statistics, when the probability of getting a sample this extreme is less than 0.05, we conclude that our sample belongs to a different population, because it is very unlikely that our own population would produce it. But there is a chance we are wrong, because extreme values do exist and we may get them. If the sample really does come from our population, the probability of getting such an extreme value is 0.05, and every time we get one we say it does not belong. So, the probability that we mistakenly say a sample does not belong to this population when it actually does is 0.05. In other words, the probability that we make the error of telling our very tall or very short friends they are not part of our town when they are is 0.05, and we call this error a Type 1 error.
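A small simulation can illustrate where the 0.05 comes from. The sketch below assumes the test statistic follows the standard normal distribution and that every sample truly comes from our population, then counts how often we would still call a sample extreme. The function name, the number of samples, and the seed are arbitrary choices for this illustration.

```python
import random
from statistics import NormalDist


def simulate_type_1_error_rate(n_samples: int = 100_000,
                               alpha: float = 0.05,
                               seed: int = 0) -> float:
    """Fraction of samples from our own population that get flagged as extreme."""
    rng = random.Random(seed)
    # Cutoff that leaves alpha / 2 of the area in each tail (about 1.96 for 0.05).
    critical_z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    false_alarms = 0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)  # a test statistic from our own population
        if abs(z) >= critical_z:
            false_alarms += 1  # we would wrongly say "different population"
    return false_alarms / n_samples


print(f"Simulated Type 1 error rate: {simulate_type_1_error_rate():.3f}")
```

The printed rate should land close to 0.05: about one in twenty samples from our own town gets wrongly labelled as coming from somewhere else.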
