Confused about the intuitive explanation of degrees of freedom

In summary, the concept of degrees of freedom (D.F.) is often explained with the example of estimating a population's variance from a sample. The sample mean is calculated from n numbers, but only n-1 of these numbers are said to be free to vary when determining the variance, because the n-th number must be chosen so that the mean of all n numbers comes out to the calculated sample mean. Some people say this makes sense intuitively. The original poster disputes this, arguing that the sample mean itself then becomes an additional degree of freedom. Degrees of freedom remain an important concept in statistics and are also applied in other tests, such as the chi-square test.
  • #1
kotreny
One common explanation of the concept of D.F. is this:

Suppose you have n numbers (a, b, c,...) that make up a sample of a population. You want to estimate the variance of the population with the sample variance. But the sample mean m is being calculated from these numbers, so when determining the variance ((a-m)^2 + (b-m)^2 + (c-m)^2 + ...)/n, only n-1 numbers are free to vary. The n-th number must be chosen so that the mean of all n numbers comes out to m. Thus, there are only n-1 "degrees of freedom."

But wait--shouldn't m be free to vary in this case? The value of the n-th number is a function of the other numbers and m. Fair enough, but that means m must become the n-th degree of freedom!
 
  • #2
I am not sure what your point is. However, in estimating the variance, the sample variance uses the divisor n-1 so that it is an unbiased estimate of the true variance.
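The unbiasedness claim can be checked numerically. The following is a minimal simulation sketch, not anything posted in the thread: the function name, the normal population with standard deviation 2 (true variance 4), the sample size of 5, and the fixed seed are all illustrative assumptions.

```python
import random

def average_variance_estimates(n=5, sigma=2.0, trials=20000, seed=0):
    """Average the /n and /(n-1) variance estimates over many samples.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        m = sum(sample) / n                       # sample mean
        ss = sum((x - m) ** 2 for x in sample)    # sum of squared residuals
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials

biased, unbiased = average_variance_estimates()
print("divisor n:  ", biased)
print("divisor n-1:", unbiased)
```

With these assumed settings, the average using the divisor n should land near 4 * (n-1)/n = 3.2, while the average using n-1 should land near the true variance of 4.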
 
  • #3
Sorry, I forgot to add that this is a common intuitive explanation for why the n-1 creates an unbiased sample variance. I take it it's a bad one? Regardless, n-1 is generally said to be the number of degrees of freedom in the case of n numbers whose residuals must sum to zero. Supposedly, only n-1 numbers are useful as information because they are free to vary. The n-th number is completely determined by the previous n-1 numbers and the condition that all n residuals sum to zero. Sometimes the explanation describes the sample mean as the condition. My argument is that either of these additional conditions qualifies as a degree of freedom itself, making it n degrees of freedom no matter what.
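The constraint being debated can be made concrete with a small sketch. The five numbers below are made up for illustration and do not come from the thread.

```python
# Hypothetical sample, chosen only for illustration.
sample = [2, 4, 4, 5, 10]
n = len(sample)
m = sum(sample) / n                      # sample mean

# Residuals about the sample mean always sum to zero.
residuals = [x - m for x in sample]
print("residuals:", residuals)
print("sum of residuals:", sum(residuals))

# So the last residual is fixed by the first n-1 residuals.
last_residual = -sum(residuals[:-1])
print("last residual from the others:", last_residual)

# Equivalently, the n-th number is fixed by the other n-1 numbers AND m.
recovered_last = n * m - sum(sample[:-1])
print("n-th number from the others and m:", recovered_last)
```

Note that recovering the n-th number needs m as an input in addition to the other n-1 numbers, which is the point the objection turns on.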

Here is a small sample of links with the D.F. explanation I am questioning. Either all are wrong (not likely), I misinterpreted them, or my own reasoning is naive. Please, clear up the situation for me if you can.

http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)#Linear_regression

http://www.tufts.edu/~gdallal/dof.htm

http://arnoldkling.com/apstats/df.html

 
  • #4
As a mathematician specializing in probability theory (not statistics), I have not worked with the concept of degrees of freedom. However, the proof of the use of n-1 comes directly from computing the expected value of the sample variance. To make it equal to the true variance, you need the divisor n-1.
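A sketch of that proof, under assumptions not spelled out in the thread: the n numbers x_1, ..., x_n are independent draws from a population with mean \mu and variance \sigma^2 (both symbols introduced here for illustration), and m is the sample mean.

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - m)^2
  &= \sum_{i=1}^{n} (x_i - \mu)^2 - n\,(m - \mu)^2 \\
\mathbb{E}\!\left[\sum_{i=1}^{n} (x_i - m)^2\right]
  &= n\sigma^2 - n \cdot \frac{\sigma^2}{n}
   = (n-1)\,\sigma^2 \\
\mathbb{E}\!\left[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - m)^2\right]
  &= \sigma^2
\end{align*}
```

The second line uses Var(m) = \sigma^2 / n, which relies on the independence assumption. Dividing by n instead would give an expected value of (n-1)\sigma^2 / n, which underestimates the true variance.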
 
  • #5
Thank you for replying anyway. I am familiar with the proof you speak of, but some people have said that the n-1 "makes sense because it is the number of degrees of freedom." I rather doubt this claim; in fact, as I said twice, I doubt the entire claim that n-1 is even the number of D.F. to begin with.
 
  • #6
I, too, have been struggling with this concept. I don't think degrees of freedom really work in an intuitive manner, so I'm just settling with using n-1 for sample variance to make it an unbiased estimator.
 
  • #7
Hi mezza8, thanks for the input and welcome to the forums. Even if we discard completely the D.F. connection to the sample variance, D.F. is still an important concept in statistics. It is applied in the chi-square test for example. A lot of people say that degrees of freedom is an intuitive concept, and make the questionable argument seen in my links and discussed above. (Check the YouTube one for a particularly clear demonstration of this dubious reasoning. If the link doesn't work for any reason, the uploader's name is jdeisenberg. You can search that with "degrees of freedom.") I hope I have made clear why I think this argument is false. When using an estimated parameter to justify removing a D.F., the parameter itself becomes the so-called removed D.F.
 

Related to Confused about the intuitive explanation of degrees of freedom

1. What are degrees of freedom in statistics?

Degrees of freedom in statistics refer to the number of independent pieces of information available for estimating a population parameter. It is the number of values in a sample that are free to vary once certain constraints or conditions are applied.

2. Why is it important to understand degrees of freedom?

Understanding degrees of freedom is important because it helps us determine the appropriate statistical test to use and interpret the results correctly. It also allows us to understand the limitations and reliability of our data.

3. How are degrees of freedom calculated?

Degrees of freedom are calculated by subtracting the number of constraints or conditions from the total number of values in a sample. For example, if we have a sample size of 50 and we estimate the variance using the sample mean (one constraint), the degrees of freedom would be 50 - 1 = 49.
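The counting rule above amounts to a single subtraction. A minimal sketch, with a hypothetical function name and a guard against more constraints than values:

```python
def degrees_of_freedom(n_values, n_constraints):
    """Count-based degrees of freedom: values minus constraints.

    Hypothetical helper for illustration only.
    """
    if n_constraints > n_values:
        raise ValueError("more constraints than values")
    return n_values - n_constraints

# Sample of 50 with one constraint (the estimated mean).
df = degrees_of_freedom(50, 1)
print(df)  # 49
```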

4. Can degrees of freedom be negative?

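No, degrees of freedom cannot be negative. Under the counting definition used here, it is the number of values that are free to vary, so it must be a nonnegative integer. Some procedures, such as Welch's t-test, use approximate degrees of freedom that are not whole numbers, but these are still not negative.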

5. How do degrees of freedom relate to sample size?

Degrees of freedom are determined by the sample size and the number of parameters being estimated. As the sample size increases, the degrees of freedom also increase, meaning there is more information available for estimating population parameters.
