Understanding the Uniform Distribution of P-Values in Hypothesis Testing

chowpy · Sep 13, 2010

I read the following statement on Wikipedia, but I don't see how to derive it.

"when a p-value is used as a test statistic for a simple null hypothesis, and the distribution of the test statistic is continuous, then the test statistic (p-value) is uniformly distributed between 0 and 1 if the null hypothesis is true."

Can anyone explain it in more detail?
Thanks~~

Mapes · Sep 14, 2010

Hi chowpy, welcome to PF!

Imagine that you have a data set A of one or more experimental observations. You also have a null hypothesis in mind (a possible distribution of results that data set A may or may not have come from). Say you're comparing the means of these two distributions (but it could be any parameter that you're comparing).

The p-value is the probability, computed under the null hypothesis, of obtaining a data set at least as extreme* as your actual data set A. (If the p-value is very low, we might decide that A came from another distribution, and therefore reject the null hypothesis; that's what hypothesis testing is all about.)

If the null hypothesis is actually true, then the p-value is equally likely to land anywhere from 0 to 1. In other words, a data set extreme enough to give a p-value of 0.20 or less arises exactly 20% of the time, and the same holds for any other threshold between 0 and 1. *By more extreme I mean a data set with a mean farther away from the mean of the null hypothesis, in the example we're using.
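A quick simulation can make this concrete. The sketch below is only an illustration under assumptions that are not in the thread: the null distribution is taken to be normal with known standard deviation, the test is a two-sided z-test on the mean, and the names `two_sided_z_pvalue` and `simulate_pvalues` are made up for the example.

```python
import random
from statistics import NormalDist, fmean


def two_sided_z_pvalue(sample, mu0, sigma):
    """Two-sided p-value for H0: mean == mu0, assuming a known sigma."""
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    # Probability, under H0, of a sample mean at least this far from mu0.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_pvalues(n_sims=20000, n=10, mu0=0.0, sigma=1.0, seed=0):
    """Draw every data set from the null itself and collect the p-values."""
    rng = random.Random(seed)
    return [
        two_sided_z_pvalue([rng.gauss(mu0, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(n_sims)
    ]


pvalues = simulate_pvalues()

# Share of simulated data sets whose p-value is 0.20 or less.
frac_below_020 = sum(p <= 0.20 for p in pvalues) / len(pvalues)
print(f"fraction of p-values <= 0.20: {frac_below_020:.3f}")

# Count the p-values in ten equal-width bins covering [0, 1].
counts = [0] * 10
for p in pvalues:
    counts[min(int(p * 10), 9)] += 1
print("counts per decile:", counts)
```

If the p-values are uniform, the printed fraction should sit near 0.20 and the ten bin counts should be roughly equal, up to simulation noise.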

Does this answer your question?

statdad · Sep 14, 2010

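Remember what it means for a random variable X to be uniformly distributed on (0,1):

[tex]
\Pr(X \le a) = a \quad \text{for any } a \in (0,1)
[/tex]

Let P denote the p-value as a random variable, and let T stand for a generic test statistic that has a continuous distribution under the null hypothesis. For concreteness, take a test that rejects for small values of T, so that an observed value t gives the p-value [tex] \Pr(T \le t) [/tex]; the other tails work the same way.

Pick an a in (0,1). Since T has a continuous distribution, there is a number [tex] t_a [/tex] that satisfies

[tex]
\Pr(T \le t_a) = a
[/tex]

Now, the events [tex] P \le a [/tex] and [tex] T \le t_a [/tex] are equivalent (the p-value is at most a exactly when T falls at or below [tex] t_a [/tex]), so that

[tex]
\Pr(P \le a) = \Pr(T \le t_a) = a
[/tex]

Comparing this to the meaning of "uniformly distributed on (0,1)" shows the result.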
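The key step in the argument above, that the two events coincide, can be checked numerically. This is a minimal sketch under assumptions of my own: T is taken to be standard normal under the null, the test is lower-tailed so that the p-value is the null CDF evaluated at T, and `check_equivalence` is a hypothetical helper name, with `t_a` standing for the quantile used in the argument.

```python
import random
from statistics import NormalDist

std_normal = NormalDist()  # assumed null distribution of T


def lower_tail_pvalue(t):
    """P-value of a lower-tailed test: Pr(T <= t) under the null."""
    return std_normal.cdf(t)


def check_equivalence(a, n_sims=50000, seed=1):
    """Compare the events {P <= a} and {T <= t_a} on simulated draws of T."""
    rng = random.Random(seed)
    t_a = std_normal.inv_cdf(a)  # the quantile with Pr(T <= t_a) = a
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_sims)]
    p_event = [lower_tail_pvalue(t) <= a for t in draws]
    t_event = [t <= t_a for t in draws]
    mismatches = sum(pe != te for pe, te in zip(p_event, t_event))
    return sum(p_event) / n_sims, sum(t_event) / n_sims, mismatches


for a in (0.05, 0.30, 0.75):
    freq_p, freq_t, mismatches = check_equivalence(a)
    print(f"a={a:.2f}  Pr(P<=a)~{freq_p:.3f}  Pr(T<=t_a)~{freq_t:.3f}  "
          f"mismatches={mismatches}")
```

Both estimated frequencies should be close to a, and the two events should agree on essentially every draw, which is the content of the final equation.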

chowpy · Sep 16, 2010

Thanks Mapes and statdad~
I understand it now~

blue_raver22 · Sep 22, 2010

Sure, I'd be happy to explain this further. In hypothesis testing, the p-value is a measure of the strength of evidence against the null hypothesis. A p-value of 0.05 or less is typically considered statistically significant, meaning that the evidence against the null hypothesis is strong enough to reject it and accept the alternative hypothesis.

The statement you read is referring to the distribution of p-values when the null hypothesis is actually true. In this case, the p-value is calculated based on the assumption that the null hypothesis is true, and it represents the probability of obtaining a test statistic (such as a mean or a correlation coefficient) at least as extreme as the one observed in the sample data.

Now, if the null hypothesis is true, the test statistic follows a known distribution. This distribution is called the null distribution, and it is determined by the type of test being conducted and the sample size. When the null distribution is continuous (meaning that the statistic can take on any value within a certain range), the p-value also has a continuous distribution.

The statement is saying that if the null hypothesis is true and the null distribution is continuous, then the p-value will be uniformly distributed between 0 and 1. This means that the p-value is equally likely to fall in any two intervals of the same width between 0 and 1, and there is no bias towards any particular region. This is important because it gives a cutoff value (such as 0.05) a precise meaning: when the null hypothesis is true, the p-value falls at or below the cutoff with probability equal to the cutoff, so a test at the 0.05 level rejects a true null hypothesis only 5% of the time.
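The link between uniformity and the cutoff can be sketched with a short simulation. The setup is assumed for illustration only: the test statistic is drawn directly from a standard normal null, the p-value is two-sided, and `false_positive_rate` is a made-up helper name.

```python
import random
from statistics import NormalDist


def false_positive_rate(alpha, n_sims=100000, seed=2):
    """Fraction of true-null experiments whose p-value is <= alpha."""
    rng = random.Random(seed)
    cdf = NormalDist().cdf
    rejections = 0
    for _ in range(n_sims):
        z = rng.gauss(0.0, 1.0)        # test statistic drawn under the null
        p = 2.0 * (1.0 - cdf(abs(z)))  # two-sided p-value
        rejections += p <= alpha       # True counts as 1
    return rejections / n_sims


rates = {alpha: false_positive_rate(alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, rate in rates.items():
    print(f"cutoff {alpha:.2f}: rejected a true null in {rate:.3%} of runs")
```

Each estimated rejection rate should land close to its cutoff, which is what a uniform p-value under the null implies.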

In summary, the statement is simply saying that when the null hypothesis is true and the null distribution is continuous, the p-value will follow a uniform distribution between 0 and 1. This is a fundamental concept in hypothesis testing that helps us interpret the strength of evidence against the null hypothesis. I hope this explanation helps!
