Confused about statistics terms

Saying that the sample variance (the version that divides by n-1) is an unbiased estimator of the population variance means that if you take many samples from the population and calculate each sample's variance, the average of these values approaches the population variance as the number of samples grows. In general, a statistic is an unbiased estimator of a parameter if its expected value equals the true population parameter. Hope that helps!
  • #1
Will
I have always thought that the operation:
[tex]\sqrt{\frac{(x_1-\overline{x})^2 + \cdots + (x_N-\overline{x})^2}{N-1}}[/tex]
was known as the standard deviation. But now my physics text says that it is:
[tex]\sqrt{\frac{(x_1-\overline{x})^2 + \cdots + (x_N-\overline{x})^2}{N}}[/tex]
and another website I went to says that yes, the second equation is the one correctly called the standard deviation, that the first is the square root of the bias-corrected variance, and that the two are often confused. So what does the "standard deviation" operation on my TI-89 compute, and how do I do the other one?
 
  • #2
I've only had an introductory course in Statistics, but I was taught that the division by N corresponded to the population's standard deviation whereas the division by n-1 corresponded to the sample's standard deviation.

i.e.,

[tex]\sigma = \sqrt{\frac{\sum{(x-\mu)^2}}{N}}[/tex] will be used to find the standard deviation of a population,

and

[tex]S = \sqrt{\frac{\sum{(x-\overline{x})^2}}{n-1}}[/tex] will be used to find the standard deviation of a sample of the population.

I would assume that your calculator would assume that a list is a sample, not a population, and so would use the second equation. However, I know that the TI-83+ will give you both if you perform 1-Var Stats on the list.
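If you don't have the calculator handy, here is a minimal sketch of the two formulas above using Python's standard `statistics` module. The data list is made up purely for illustration; `pstdev` divides by N and `stdev` divides by n-1.

```python
import statistics

# Made-up data, chosen only so the arithmetic is easy to check by hand.
# Mean is 5, and the sum of squared deviations from the mean is 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation: sqrt(32 / 8), i.e. divide by N.
pop_sd = statistics.pstdev(data)

# Sample standard deviation: sqrt(32 / 7), i.e. divide by n-1.
sample_sd = statistics.stdev(data)

print(pop_sd)     # 2.0
print(sample_sd)  # roughly 2.138
```

The n-1 version is always the larger of the two, and the gap shrinks as the list gets longer.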

Perhaps someone more educated can help you more.

cookiemonster
 
  • #3
cookiemonster is totally correct.
Imagine you wanted to calculate the standard deviation of people's heights. If you had every single person's height, you would calculate the population standard deviation, whose formula is
[tex]\sigma = \sqrt{\frac{\sum{(x-\mu)^2}}{N}}[/tex]

More realistically, you would estimate the standard deviation using a sample of, say, 100 people's heights. In this case, use the sample standard deviation formula
[tex]S = \sqrt{\frac{\sum{(x-\overline{x})^2}}{n-1}}[/tex]
Since in practice you almost always work with a sample, the sample standard deviation is the usual default meaning of "standard deviation".

Sample standard deviation is usually denoted "s" or "sigma (n-1)", and population standard deviation is denoted "sigma (n)". Your calculator almost certainly has a "sigma (n-1)" button.
 
  • #4
The difference between dividing by N vs. dividing by N-1 results from the fact that for the entire population, you can calculate the exact average, while for a sample you only have an approximate average. Deviations measured from the sample average come out slightly too small on the whole, and dividing by N-1 compensates for that. When you calculate the mathematical expectation of the sample variance (with N-1), it will equal the population variance.
 
  • #5
People appear to be making claims here that aren't true. The N-1 version does not give you the population S.D. It is the unbiased estimator of the population deviation, i.e. the best we can do under certain constraints. The unbiased estimator of the population mean is the sample mean, fortunately.

Just imagine we draw two samples from the same population. From what was written above, one would think there are two S.D.s of the population, one coming from each sample.
 
  • #6
matt, would you explain the difference a bit more? I don't quite understand what you're getting at and my intro class never really got into enough detail to warrant a thorough treatment.

cookiemonster
 
  • #7
Ok, so we have a population with unknown mean and standard deviation. We take a sample from it and we want to work out some statistics. We've got the mean and the ordinary standard deviation lying around (the one dividing by n). The question asked is 'do these form an unbiased estimate of the population mean and s.d.?'

Firstly, X is an unbiased estimator of a parameter Y if E(X)=Y. The sample mean is an unbiased estimator of the population mean, but if you actually work out the expectation of the standard deviation (the one dividing by n), it is not the population s.d.

So we use the n-1 quantity instead; the n-1 comes out of that expectation calculation as a measure of the bias of our estimate.

My nit-picking was that you gave the impression that it IS the standard deviation of the population. It isn't, since that is unknown; it is an unbiased estimator of it, and is often called, by abuse of notation, the pop. s.d., which is subtly different, as there is an implication there that we mean more.

CORRECTION

The square of the statistic we are referring to as n-1 is the unbiased estimator of the pop. variance; it is not in general itself an unbiased estimator of the pop. s.d., see e.g. the Wolfram entry. My memory is getting terrible these days. At least I think it might be, I don't recall clearly.
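
For what it's worth, here is a sketch of the expectation calculation behind the n-1. It assumes the sample values [itex]x_1,\dots,x_n[/itex] are independent draws from a population with mean [itex]\mu[/itex] and variance [itex]\sigma^2[/itex]; the independence is an assumption of this sketch (it holds for sampling with replacement, or approximately for a very large population).

Expanding around the true mean gives

[tex]\sum_{i=1}^{n}(x_i-\overline{x})^2 = \sum_{i=1}^{n}(x_i-\mu)^2 - n(\overline{x}-\mu)^2[/tex]

Taking expectations term by term,

[tex]E\left[\sum_{i=1}^{n}(x_i-\mu)^2\right] = n\sigma^2, \qquad E\left[n(\overline{x}-\mu)^2\right] = n\cdot\frac{\sigma^2}{n} = \sigma^2[/tex]

so that

[tex]E\left[\sum_{i=1}^{n}(x_i-\overline{x})^2\right] = (n-1)\sigma^2[/tex]

Dividing by n therefore gives an expectation of [itex]\frac{n-1}{n}\sigma^2[/itex], which is too small, while dividing by n-1 gives exactly [itex]\sigma^2[/itex]. The square root does not inherit this property: since the square root is concave, Jensen's inequality gives

[tex]E[S] = E\left[\sqrt{S^2}\right] \le \sqrt{E[S^2]} = \sigma[/tex]

so S tends to underestimate [itex]\sigma[/itex] slightly, even with the n-1.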
 
  • #8
So, if I'm reading Mathworld correctly and remembering the class correctly, if we repeatedly sampled a population and repeatedly calculated [itex]s_{N-1}^2[/itex] of these samples, the average of these values would yield the true variance?

cookiemonster
 
  • #9
Not exactly. The average of any finite number of sample variances is still only an estimate; it gets closer to the true variance as the number of samples grows. The only way to *know* the *true* population variance exactly is to measure every member of the population.
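
A rough simulation of the repeated-sampling idea, as a sketch rather than a proof. Everything here is an assumption made for illustration: a normal population with [itex]\sigma = 2[/itex] (so the true variance is 4), samples of size 5, 20000 repetitions, and the function name `average_estimates` is just a hypothetical helper.

```python
import random

def average_estimates(num_samples=20000, n=5, mu=0.0, sigma=2.0, seed=0):
    """Draw many samples of size n and average three estimates over them."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_var_n1 = 0.0  # running sum of variances computed with n-1
    total_var_n = 0.0   # running sum of variances computed with n
    total_sd_n1 = 0.0   # running sum of standard deviations (n-1 version)
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        sum_sq = sum((x - mean) ** 2 for x in sample)
        total_var_n1 += sum_sq / (n - 1)
        total_var_n += sum_sq / n
        total_sd_n1 += (sum_sq / (n - 1)) ** 0.5
    return (total_var_n1 / num_samples,
            total_var_n / num_samples,
            total_sd_n1 / num_samples)

avg_var_n1, avg_var_n, avg_sd_n1 = average_estimates()
print("average variance, n-1 version:", avg_var_n1)  # should land near 4
print("average variance, n version:  ", avg_var_n)   # should land near 3.2
print("average s.d., n-1 version:    ", avg_sd_n1)   # should land a bit below 2
```

The averaged n-1 variance hovers around the true variance without ever being guaranteed to hit it exactly, the n version comes out low by a factor of about (n-1)/n, and the averaged standard deviation stays below 2 even with the n-1.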
 

1. What are some common statistical terms and what do they mean?

Some common statistical terms include mean, median, mode, standard deviation, and correlation. Mean refers to the average of a set of numbers, median refers to the middle number in a set of ordered numbers, mode refers to the most frequently occurring number in a set, standard deviation measures how far the values in a set of data typically lie from their mean, and correlation measures the strength and direction of a linear relationship between two variables.

2. What is the difference between descriptive and inferential statistics?

Descriptive statistics involves summarizing and describing a set of data, such as calculating the mean and standard deviation. Inferential statistics involves making inferences and generalizations about a larger population based on a smaller sample of data.

3. How do I determine which statistical test to use?

The type of statistical test to use depends on the research question, the type of data being analyzed, and the assumptions of the data. It is important to consult with a statistician or refer to statistical textbooks or resources to determine the appropriate test for your specific research question.

4. What is the p-value and why is it important?

The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. It is important because a small p-value means the observed data would be unlikely if the null hypothesis held, which is the usual basis for calling a result statistically significant. It is not the probability that the result is due to chance.
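
As a small illustration of that definition, here is a sketch that computes a one-sided p-value exactly for a made-up coin experiment: 15 heads in 20 flips, with the null hypothesis that the coin is fair. The scenario and the helper name `one_sided_p_value` are assumptions for the example only.

```python
from math import comb

def one_sided_p_value(heads, flips, p_heads=0.5):
    """Probability of seeing at least `heads` heads in `flips` flips
    when each flip lands heads with probability `p_heads` (the null)."""
    return sum(
        comb(flips, k) * p_heads ** k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )

p = one_sided_p_value(15, 20)
print(round(p, 4))  # 0.0207
```

Here "at least as extreme" means 15 or more heads; a two-sided test would also count 5 or fewer heads.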

5. How can I avoid common statistical mistakes?

Some common statistical mistakes include misinterpreting p-values, using inappropriate statistical tests, and not checking assumptions of the data. To avoid these mistakes, it is important to have a good understanding of statistical concepts, consult with a statistician or use statistical software, and thoroughly check and validate your data before drawing conclusions.
