When not to use the Student t test?

In summary, the Student t test assumes normally distributed data with equal variances. If the data are not normally distributed, a non-parametric test such as the Mann-Whitney test is recommended; if the variances differ significantly, the Welch-corrected t test should be used. How strictly these rules need to be followed depends on the sample size: according to the source cited in the thread, they work well for more than 100 samples and poorly for fewer than 12. For data that are not normally distributed, a logarithmic transformation before the t test is an alternative to the Mann-Whitney test. A linear regression on the log-transformed data, with one dummy variable per sample group, can also be used to estimate differences between the groups.
  • #1
Monique
A Student t test assumes normally distributed data with equal variances.
I know you can test for normality with the Kolmogorov-Smirnov test and test the equality of variances with the F-test.
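A rough sketch of those two checks in Python, assuming NumPy and SciPy are available. The function name `check_assumptions` and the returned dictionary keys are made up for illustration. Note that the Kolmogorov-Smirnov p-value is only approximate here, because the mean and SD of the reference normal are estimated from the same data.

```python
import numpy as np
from scipy import stats

def check_assumptions(a, b):
    """Hypothetical helper: normality check per sample plus an F-test on the variances."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # KS test of each sample against a normal with that sample's own mean and SD.
    # Because the parameters are estimated from the data, the p-value is approximate.
    ks_p = [stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue
            for x in (a, b)]

    # Two-sided F-test for equal variances: ratio of the sample variances.
    f = a.var(ddof=1) / b.var(ddof=1)
    dfa, dfb = len(a) - 1, len(b) - 1
    f_p = min(1.0, 2 * min(stats.f.cdf(f, dfa, dfb), stats.f.sf(f, dfa, dfb)))

    return {"ks_p": ks_p, "f": f, "f_p": f_p}

# Made-up numbers, only to show the call
result = check_assumptions([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

With small samples both checks have little power, so a non-significant result does not show that the assumptions hold.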

When the data are not normal you use a non-parametric test (the Mann-Whitney test); when the variances are significantly different you use the Welch-corrected t test.

How strictly should I follow those rules?
According to this site (http://www.graphpad.com/articles/interpret/Analyzing_two_groups/choos_anal_comp_two.htm) the rules work well for >100 samples and work poorly for <12 samples. How about the region in between?

I have sample sets of n around 20, and some are not normally distributed. Can I go ahead and do a t test, or should I maybe log transform all the data before doing the t test? Or do a Mann-Whitney test?

Thanks for your input, here is a graph with the data distribution for the 4 samples, together with the 95% CI:
http://img301.imageshack.us/img301/9940/scatter95cifg4.jpg
 
  • #2
The question of equal variances is easy: there is a variant of the t-test designed for unequal variances. For ex., proc ttest in SAS will produce one statistic under H0: equal variances, and another statistic under unequal variances, and it will also test for equality of the variances.
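For readers without SAS, the two statistics can be sketched in plain Python (standard library only). The function name `t_statistics` is made up for illustration; p-values are left out because they need a t-distribution CDF, which the standard library does not provide.

```python
import math
import statistics

def t_statistics(a, b):
    """Return (t, df) for the pooled Student test and for the Welch test."""
    na, nb = len(a), len(b)
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances

    # Student: one pooled variance estimate, df = na + nb - 2
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t_student = (ma - mb) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    df_student = na + nb - 2

    # Welch: separate variances, Welch-Satterthwaite degrees of freedom
    se2 = va / na + vb / nb
    t_welch = (ma - mb) / math.sqrt(se2)
    df_welch = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))

    return (t_student, df_student), (t_welch, df_welch)

# Made-up data with equal sample sizes but a fourfold difference in variance
student, welch = t_statistics([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

When the two sample sizes are equal, the two t values coincide and only the degrees of freedom differ, so the Welch correction costs little in that case.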

A first "gut" reaction to the question of normality is, you should use both types of tests (parametric and non). If the results agree, no worry. You should think some more only if their results turn out differently from each other.
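A minimal sketch of this "run both and compare" approach, assuming SciPy is available. The function name `compare_tests`, the 0.05 threshold and the data are all made up for illustration, and the log-scale test assumes strictly positive values.

```python
import numpy as np
from scipy import stats

def compare_tests(a, b, alpha=0.05):
    """Run a Welch t test (raw and log scale) and a Mann-Whitney test on the same data."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    p = {
        "welch": stats.ttest_ind(a, b, equal_var=False).pvalue,
        # log scale requires strictly positive data
        "welch_log": stats.ttest_ind(np.log(a), np.log(b), equal_var=False).pvalue,
        "mann_whitney": stats.mannwhitneyu(a, b, alternative="two-sided").pvalue,
    }
    # "Agree" here means every test lands on the same side of alpha
    decisions = {name: value < alpha for name, value in p.items()}
    agree = len(set(decisions.values())) == 1
    return p, agree

# Made-up right-skewed samples
a = [1.2, 1.5, 1.9, 2.3, 2.8, 3.1, 3.9, 4.4, 5.2, 7.5]
b = [12, 15, 19, 23, 28, 31, 39, 44, 52, 75]
p_values, agree = compare_tests(a, b)
```

The Mann-Whitney test depends only on ranks, so its result is the same on the raw and on the log-transformed data; only the t test is affected by the transformation.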

The data look as if a logarithmic transformation would do the trick, esp. for the 3rd and the 4th samples.
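To illustrate why a log transformation can help with right-skewed data, here is a small standard-library sketch. The numbers are made up (not the data in the graph) and `skewness` is a helper written for this example.

```python
import math
import statistics

def skewness(xs):
    """Population skewness: mean of the cubed standardized deviations."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

raw = [1, 2, 4, 8, 16, 32, 64]          # made-up, strongly right-skewed values
logged = [math.log(x) for x in raw]     # evenly spaced after the transformation

skew_raw = skewness(raw)        # clearly positive
skew_logged = skewness(logged)  # essentially zero
```

Real data will rarely become this symmetric, so it is worth re-checking the distribution after transforming.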

What I would have done is to estimate the linear regression Log(Y) = a + b2 d2 + b3 d3 + b4 d4 + ε, where di = 1 if Y is in the i'th sample (i = 2, 3, 4) and di = 0 otherwise; the b's are the parameters to be estimated, and ε is the error term. Each bi represents the difference between the mean of Log(Y) in the i'th sample and the mean of Log(Y) in the control sample, and the intercept a is the control mean. In this model, the first sample is made the control group by having its dummy variable excluded from the regression, but one can easily change that. I'd first run this as an unweighted regression; alternatively I'd run a weighted regression to control for unequal variances (a problem technically known as heteroscedasticity).
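A sketch of the unweighted version of this regression with NumPy least squares. The function name `dummy_regression` and the data are made up for illustration; standard errors and the weighted variant are not covered here.

```python
import numpy as np

def dummy_regression(groups):
    """OLS of log(Y) on an intercept and one dummy per non-control group.

    groups: list of sequences of positive values; groups[0] is the control.
    Returns [a, b2, b3, ...].
    """
    y = np.log(np.concatenate([np.asarray(g, dtype=float) for g in groups]))
    labels = np.concatenate([np.full(len(g), i) for i, g in enumerate(groups)])

    # Design matrix: column of ones, then d2, d3, ... (control dummy left out)
    columns = [np.ones(len(y))]
    columns += [(labels == i).astype(float) for i in range(1, len(groups))]
    X = np.column_stack(columns)

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Made-up samples; the first one plays the role of the control
groups = [[1, 2, 3], [2, 4, 6, 8], [10, 20], [5, 5, 5]]
coef = dummy_regression(groups)
```

In this setup `coef[0]` equals the mean of log(Y) in the control sample and each further coefficient equals the difference between that group's mean log(Y) and the control's.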
 
  • #3


The Student t test is not appropriate to use when the data does not meet the assumptions of normality and equal variances. In this case, it would be more appropriate to use a non-parametric test such as the Mann-Whitney test. It is important to follow these rules because using a test that assumes normality and equal variances on non-normal data can lead to incorrect conclusions.

The strictness of these rules can vary depending on the size of your sample. As mentioned in the article you referenced, these rules work well for sample sizes larger than 100, but may not work as well for smaller sample sizes. In your case, with sample sizes around 20, it would be best to err on the side of caution and use a non-parametric test.

In terms of transforming your data, it is generally recommended to only transform data if it is necessary to meet the assumptions of the test being used. In this case, if your data is not normally distributed, it would be appropriate to use the Mann-Whitney test instead of transforming the data and using a t test.

Overall, it is important to carefully consider the assumptions of the test being used and choose the appropriate test for your data. In this case, the Mann-Whitney test would be the most appropriate choice for your sample sizes and non-normal data.
 

Related to When not to use the Student t test?

1. When should I not use the Student t test?

The Student t test should not be used when the data clearly depart from a normal distribution and the sample is small. The t test was designed for small samples, but with few observations (a common rule of thumb is fewer than 30) it depends more heavily on the normality assumption, and that assumption is also harder to verify.

2. Can I use the Student t test for non-parametric data?

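No. Strictly speaking, it is the test that is parametric or non-parametric, not the data. The Student t test is a parametric test that assumes normally distributed data. When that assumption does not hold, a non-parametric test should be used instead: the Mann-Whitney U test for two independent groups, or the Wilcoxon signed-rank test for paired data.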

3. Is the Student t test appropriate for comparing more than two groups?

No, the Student t test can only be used to compare two groups. When comparing more than two groups, an ANOVA (analysis of variance) test should be used.
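As a short illustration, a one-way ANOVA on three groups can be run with SciPy's `f_oneway` (assuming SciPy is available; the data below are made up).

```python
from scipy import stats

# Made-up samples: the third group has a clearly higher mean
g1 = [1, 2, 3, 4, 5]
g2 = [2, 3, 4, 5, 6]
g3 = [10, 11, 12, 13, 14]

anova = stats.f_oneway(g1, g2, g3)
f_stat, p_value = anova.statistic, anova.pvalue
```

A significant result only says that at least one group mean differs; it does not say which one.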

4. Can I use the Student t test if the variances of the two groups are unequal?

If the variances of the two groups are unequal, the Student t test is fairly robust as long as the two sample sizes are equal, so it may still be used. If the sample sizes are unequal, an alternative such as the Welch t test, which does not assume equal variances, should be used.

5. Are there any other situations where the Student t test should not be used?

The unpaired (independent two-sample) Student t test should not be used for paired data, where the same individuals are measured in two different conditions. In this case, a paired t test or a non-parametric equivalent such as the Wilcoxon signed-rank test should be used.
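A small sketch of why the pairing matters, assuming SciPy is available. The measurements are made up: each individual increases by about 1 unit, while the differences between individuals are much larger.

```python
from scipy import stats

# Made-up before/after measurements on the same five individuals
before = [10.0, 12.0, 14.0, 16.0, 18.0]
after = [11.0, 13.2, 14.8, 17.1, 18.9]

paired = stats.ttest_rel(before, after)    # tests the within-individual differences
unpaired = stats.ttest_ind(before, after)  # ignores the pairing

paired_p, unpaired_p = paired.pvalue, unpaired.pvalue
```

Here the paired test detects the consistent shift, while the unpaired test does not, because the spread between individuals swamps the small within-individual change.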
