Interpreting a very small reduced chi squared value

  • Thread starter X-Kirk
In summary: N is the number of non-zero values in your data. As chi squared is an indicator of goodness of fit, your χ2 should be smaller if the line you fitted is a better fit than if you had used a straight line. So I think your data does not support the use of chi square to compare the fit of lines. Your results may be due to incorrect assumptions about the nature of the data.
  • #1
X-Kirk
So I have been analyzing data I took in an experiment recently, and have been using chi squared as a "goodness of fit" test against a linear model.
I am using Excel and used the LINEST function (least-squares fitting) to get an idea of a theoretical gradient and intercept for my data. Using these I found a series of normalized residuals:
R[itex]_i[/itex] = [itex]\frac{obs-exp}{error}[/itex]
and my χ2 is the sum of the squares of these normalized residuals. I have then calculated my reduced χ2 by dividing this value by my number of degrees of freedom: the number of data points minus 2 (as I have fitted a gradient and an intercept).
However my reduced χ2 is much less than 1 (specifically 0.007). I understand that this typically means I have overestimated my errors. I have checked these in great detail now; my errors are all in measurements which have very clearly defined uncertainties that I can't change. In light of this I am confused about how I should interpret this result. What does this test tell me?

I only have 8 data points for this particular set which I know is not a lot. Is it still reasonable to use chi square like this for such a small data set or might that be the reason I am not getting a very good result?
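The procedure described in this post can be sketched in Python with NumPy standing in for Excel's LINEST. The data values and the 0.2 uncertainty below are hypothetical, since the original numbers are not given in the thread:

```python
import numpy as np

# Hypothetical example: 8 points with a clearly defined measurement uncertainty
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 14.2, 15.9])
err = np.full_like(y, 0.2)  # stated uncertainty on each y measurement

# Least-squares fit of a straight line (what LINEST does)
slope, intercept = np.polyfit(x, y, 1)

# Normalized residuals R_i = (obs - exp) / error
expected = slope * x + intercept
R = (y - expected) / err

# Chi-square is the sum of the SQUARES of the normalized residuals
chi2 = np.sum(R**2)

# Reduced chi-square: divide by nu = N - 2 (two fitted parameters)
nu = len(y) - 2
chi2_red = chi2 / nu
print(chi2_red)
```

A reduced χ2 far below 1 with this setup would mean the scatter of the points about the fitted line is much smaller than the quoted `err` predicts.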
 
  • #2
X-Kirk said:
R[itex]_i[/itex] = [itex]\frac{obs-exp}{error}[/itex]

Confusing description. What is exp (do you mean estimated?), and what is error? Please state the hypothesis to test, and against what alternative. Have you checked that your test statistic is distributed as chi-squared under the null hypothesis?

Though I have not understood your problem and method exactly, for goodness of fit there should be some classes (bins). Then the frequency chi-squared is of the form Ʃ[(obs freq − exp freq)²/exp freq].
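The binned (frequency) form of chi-square mentioned here can be illustrated with made-up counts:

```python
# Binned (frequency) chi-square: sum over classes of (O - E)^2 / E
observed = [18, 25, 30, 27]   # hypothetical observed counts per bin
expected = [20, 25, 30, 25]   # counts predicted by the model

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)  # (18-20)^2/20 + 0 + 0 + (27-25)^2/25 = 0.2 + 0.16 = 0.36
```

Note that here the expected *count* appears in the denominator, because Poisson counting statistics make the variance of a bin equal to its expected count; this is the distinction the original poster raises in the next post.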
 
  • #3
Sorry, exp means the expected/theoretical value.

The experiment deals with the resistivity of a superconductor at different temperatures. I have the characteristic result where the resistivity is zero until reaching a critical temperature, then a rapid, near-vertical increase, followed by a straight line above the critical temperature, like so: http://www-outreach.phy.cam.ac.uk/physics_at_work/2011/exhibit/images/irc1.jpg

What I am trying to do is take the section of the graph above Tc and use chi squared to tell me how good a fit it is to a straight line. Then from there I could determine the best-fitting gradient and its uncertainty. Is this clearer?

My understanding was that the equation you quoted was specifically for situations involving counting, where the distribution of measurements is a Poisson probability distribution? Whereas I am doing a least-squares fit to a straight line with non-uniform error bars. I may have misunderstood which method to use?
 
  • #4
X-Kirk said:
So I have been analyzing data I took in an experiment recently, and have been using chi squared as a "goodness of fit" test against a linear model.
I am using Excel and used the LINEST function (least-squares fitting) to get an idea of a theoretical gradient and intercept for my data. Using these I found a series of normalized residuals:
R[itex]_i[/itex] = [itex]\frac{obs-exp}{error}[/itex]
and my χ2 is the sum of the squares of these normalized residuals. I have then calculated my reduced χ2 by dividing this value by my number of degrees of freedom: the number of data points minus 2 (as I have fitted a gradient and an intercept).

It appears you are following the same procedure as in the Wikipedia article http://en.wikipedia.org/wiki/Goodness_of_fit shown in the Example for Regression Analysis.

Why didn't you divide by 2-1=1 instead of by 2?

That example says you need to know the standard deviation of the population of errors. The [itex] \sigma [/itex] does not denote a value that you calculate from your sample of data.


What I am trying to do is take the section of the graph above Tc and use chi squared to tell me how good a fit it is to a straight line. Then from there I could determine the best-fitting gradient and its uncertainty. Is this clearer?

Since you determined the slope of the line that minimized the sum of the squares of the errors, I don't understand what you mean by finding the best-fitting gradient after you do a chi-square test.

The question of what the "uncertainty" of the gradient means is complicated, although many curve-fitting software packages purport to give a number for it. I think the "uncertainty" of the gradient amounts to an estimate of its standard deviation - but this is not straightforward. You don't have several samples of the best-fitting gradient; you only have one value of it. So trying to compute its standard deviation requires some assumptions.
 
  • #5
The reference book I am working from (Measurements and their Uncertainties by I.G. Hughes and T.P.A. Hase) defines reduced chi squared as:
[itex]\chi^2_\nu = \frac{1}{\nu}\sum_i \frac{(y_{i,obs}-y_{i,exp})^2}{\alpha_i^2}[/itex]
where [itex]\alpha_i[/itex] is the uncertainty on the individual [itex]y_{i,obs}[/itex].
and [itex]\nu[/itex] is the number of degrees of freedom: the number of data points minus the number of fitted parameters. On the Wikipedia page this is effectively the same thing as N-n-1; here I have 8 data points and 2 fitted parameters, so my [itex]\nu[/itex]=6. I don't see why I need the standard deviation?

I determined a rough estimate for the slope using a crude least-squares fit. Then, using the Solver add-in built into Excel, I minimized my chi squared by varying my estimates for the gradient and intercept slightly. This gives me the best approximation (that I know of) to a straight line. Then I can use Solver again to vary the gradient until my chi squared is equal to its minimum value plus one (χ2min+1). The difference between these gradients gives me the uncertainty.
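This Solver procedure (minimize χ2, then vary the gradient until χ2 rises by exactly 1) can be sketched in Python with hypothetical data; a simple bisection stands in for Excel's Solver:

```python
import numpy as np

# Hypothetical data: 8 points with a common measurement uncertainty
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 14.2, 15.9])
err = np.full_like(y, 0.2)

def chi2(m, c):
    return np.sum(((y - (m * x + c)) / err) ** 2)

# Best-fit gradient and intercept (equal errors, so an unweighted
# least-squares fit minimizes chi-square)
m_best, c_best = np.polyfit(x, y, 1)
chi2_min = chi2(m_best, c_best)

# Profile chi-square: for each trial gradient, use the intercept that
# minimizes chi-square at that gradient (what Solver would converge to)
w = 1.0 / err**2
def chi2_profile(m):
    c_opt = np.sum(w * (y - m * x)) / np.sum(w)  # best intercept for this m
    return chi2(m, c_opt)

# Bisect for the gradient at which chi-square has risen by exactly 1
lo, hi = m_best, m_best + 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if chi2_profile(mid) < chi2_min + 1.0:
        lo = mid
    else:
        hi = mid
sigma_m = 0.5 * (lo + hi) - m_best  # 1-sigma uncertainty on the gradient
print(sigma_m)
```

Re-minimizing the intercept at each trial gradient matters: holding the intercept fixed while varying the gradient overstates the rise in χ2 and so understates the uncertainty.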

The original question was: what does it mean to have such a small reduced χ2? Is there any situation in which this is appropriate, or does it always mean you have overestimated your uncertainties?
 
  • #6
X-Kirk said:
My understanding was the equation you quoted there was specifically for situations involving counting where the distribution of measurements is a Poisson probability distribution? Whereas I am doing a least-squares fit to a straight line with non-uniform error bars. I may have misunderstood which method to use?
My method has nothing to do with Poisson or any other inherent error distribution.
Your method is not applicable unless the errors follow a normal distribution.
Do you have three series of data: fitted, theoretical and observed?
One descriptive way to assess the fit quality is the square of the correlation coefficient between the fitted and observed values. This value is the fraction of the total variance explained by the regression.
 
  • #7
X-Kirk said:
I don't see why I need the standard deviation?
What do you mean by "uncertainty", if you don't take it to mean "standard deviation"?

The difference between these gradients gives me the uncertainty.
Is that what your reference book says to do to find "the uncertainty" or is this your own concept?

The original question was what does it mean to have such a small χ2? Is there any situation in which this is appropriate or does it always mean you have over estimated your uncertainties?

Chi-square is a statistic, so it is a random variable. If you have chosen the correct model there is still a small probability that chi-square can take on an extremely large or small value. So a particular value of chi-square doesn't "mean" anything, in the sense of guaranteeing a particular conclusion or hypothesis. If we set up the formalities of a statistical "hypothesis test" then we can say precisely what various statistics tell us. I don't know if that's what you are trying to do.

Here are two scenarios. Which is more likely to produce a small chi-square value?

1) Compare a model, known (or assumed) to be correct, to the data and compute the reduced chi-square statistic.
2) Take the data and fit whatever model minimizes the reduced chi-square statistic for that particular data.

Scenario 2) always gives you the smaller chi-square value, since you can adjust the model to fit the particular data. You can pick the model to be the correct model if it minimizes chi-square or you can pick the model to be a wrong model if the wrong model minimizes chi-square.

The Wikipedia article is talking about scenario 1). You are doing scenario 2). I don't know what your book is doing, but I suspect it is scenario 1).

Applying statistics and curve-fitting to a real world problem is a subjective process. To pose it as a mathematical problem that has a definite solution takes more information or assumptions than most people care to deal with. The way that I'd explain the subjective statements (in the Wikipedia article and in other textbooks) about "over-fitting" and "under-fitting" is this: if you compare the correct model to the data, then comparing its predictions to typical experimental data will most likely give you "an average amount of error", not an extremely large amount or an extremely small amount. To make practical use of this idea, you have to know what "an average amount of error" is. You can't know what "an average amount of error" is if all you have is the data. To know what "average error" is, you need to know more facts, such as the precision of the measuring equipment used in the experiment. This is why the Wikipedia article says you must know [itex] \sigma [/itex] (instead of estimating [itex] \sigma [/itex] from the raw data).
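The difference between the two scenarios can be demonstrated with a small simulation (the true model, noise level, and trial count below are all hypothetical). Because least squares minimizes the sum of squared residuals over all lines, the chi-square of the fitted line can never exceed the chi-square of the true model evaluated on the same data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Scenario 1 baseline: a known-correct model y = 2x + 1 with noise of known sigma
true_m, true_c, sigma = 2.0, 1.0, 0.5
x = np.linspace(1, 8, 8)

trials = 1000
n_smaller = 0
for _ in range(trials):
    y = true_m * x + true_c + rng.normal(0, sigma, size=x.size)

    # Scenario 1: chi-square against the known-correct model
    chi2_true = np.sum(((y - (true_m * x + true_c)) / sigma) ** 2)

    # Scenario 2: chi-square against the line fitted to this same data
    m_fit, c_fit = np.polyfit(x, y, 1)
    chi2_fit = np.sum(((y - (m_fit * x + c_fit)) / sigma) ** 2)

    if chi2_fit <= chi2_true:  # fitting can only reduce chi-square
        n_smaller += 1

print(n_smaller, "of", trials)
```

This is why a model tuned to the data (scenario 2) systematically produces an optimistically small chi-square compared with testing a fixed model (scenario 1).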
 

Related to Interpreting a very small reduced chi squared value

1. What does a very small reduced chi squared value indicate?

A very small reduced chi squared value indicates that the observed data scatter about the model by much less than the stated uncertainties would predict. Rather than signalling an exceptionally good model, this usually suggests that the measurement uncertainties have been overestimated.

2. Is a small reduced chi squared value always desirable?

No, a small reduced chi squared value is not always desirable. It can also be an indication of overfitting or excessive model complexity, which may not accurately reflect the underlying data.

3. How is the reduced chi squared value calculated?

The reduced chi squared value is calculated by dividing the chi squared value by the number of degrees of freedom (the number of data points minus the number of fitted parameters). The chi squared value itself is obtained by summing the squared differences between the observed data and the expected theoretical values, each divided by the squared measurement uncertainty.

4. What is considered a small reduced chi squared value?

A reduced chi squared value close to 1 indicates a good fit between the observed data and the expected theoretical model. Values much smaller than 1 (such as 0.1 or below) are considered very small, and usually point to overestimated uncertainties rather than a better fit.

5. Can a very small reduced chi squared value be a result of chance?

Yes, it is possible for a very small reduced chi squared value to be a result of chance. This is why it is important to also consider other factors, such as the sample size and the complexity of the model, when interpreting the reduced chi squared value.
