[SOLVED] Performance of students - Hypothesis testing

mathmari

Well-known member
MHB Site Helper
Apr 14, 2013
4,004
Hey!! :eek:

A teacher wants to find out whether the order of the exam tasks has an impact on the students' performance. He therefore creates two versions ($ X $ and $ Y $) of an exam in which the tasks are arranged differently. The versions are distributed randomly so that $ n $ students receive version $ X $ and $ m = n $ students receive version $ Y $. We denote the expected score for $ X $ by $ \mu_X $ and the expected score for $ Y $ by $ \mu_Y $. The variances are denoted $ \sigma_X^2 $ and $ \sigma_Y^2 $; the scores are assumed to be normally distributed.


(a) Formulate a suitable null hypothesis for the question of the teacher.
(b) Consider that $n = 30, \overline{X} = 79, \overline{Y}= 74, S_X' = 14, S_Y' = 20$. Check the null hypothesis of (a) with significance level $\alpha=5\%$.
(c) Consider that $\overline{X} = 79, \overline{Y}= 80, S_X' = 14, S_Y' = 20$. For which sample size $n$ can we reject the null hypothesis with significance level $\alpha=1\%$ ?


I have done the following:

(a) The null hypothesis is $H_0: \mu_X=\mu_Y$, right? (Wondering)


(b) Since we don't know whether the variances are equal or different, we first have to test with an F-test whether we have the same $\sigma$.
  • If $\sigma_X=\sigma_Y$ then we apply a two-sample t-test.
  • If $\sigma_X<\sigma_Y$ then we apply a Welch test.

The test is the following:

The null hypothesis and the alternative hypothesis are $H_0:\sigma_Y^2=\sigma_X^2$ and $H_1:\sigma_Y^2>\sigma_X^2$, respectively.

The test statistic is \begin{equation*}F=\frac{{S_Y'}^2}{{S_X'}^2}=\frac{20^2}{14^2}=\frac{400}{196}\approx 2.0408\end{equation*}
$F$ is F-distributed with degrees of freedom $\nu_Y=n_Y-1=30-1=29$, $\nu_X=n_X-1=30-1=29$.

We have that $1-\alpha=95\%$.

The null hypothesis will be rejected if $F>F_{1-\alpha}(\nu_Y, \nu_X)=F_{0.95}(29, 29)$.

It holds that $F_{0.95}(29, 29)=1.86$.

Since $F=2.0408>1.86=F_{0.95}(29, 29)$, we reject the null hypothesis.
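The F-test above can also be checked numerically. The following is only a minimal sketch in R, assuming the summary values from part (b); the variable names are made up for illustration.

```r
# Summary values from part (b); names are illustrative only
s_x   <- 14
s_y   <- 20
n     <- 30
alpha <- 0.05

# One-sided F-test of H0: sigma_Y^2 = sigma_X^2 against H1: sigma_Y^2 > sigma_X^2,
# with the larger sample variance in the numerator
f_stat <- s_y^2 / s_x^2
f_crit <- qf(1 - alpha, df1 = n - 1, df2 = n - 1)
reject_equal_variances <- f_stat > f_crit

print(c(F = f_stat, critical = f_crit))
print(reject_equal_variances)
```

Here `qf` returns the quantile of the F-distribution, so no table lookup is needed.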


So, we apply a Welch-Test.


The null hypothesis is $H_0: \mu_X-\mu_Y=0$ and the alternative hypothesis is $H_1:\mu_X-\mu_Y\neq 0$.

The test statistic $T$ for the t-test with unknown variances is \begin{equation*}T=\frac{\overline{X}-\overline{Y}-0}{\sqrt{\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}}}=\frac{79-74}{\sqrt{\frac{14^2}{30}+\frac{20^2}{30}}}=\frac{5}{\sqrt{\frac{196}{30}+\frac{400}{30}}}=\frac{5}{\sqrt{\frac{298}{15}}}\approx 1.1218\end{equation*}

The null hypothesis will be rejected if $|T|>t_{k;1-\alpha/2}$.

The number of degrees of freedom is \begin{align*}k&=\frac{\left (\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}\right )^2}{\frac{1}{n_X-1}\left (\frac{{S_X'}^2}{n_X}\right )^2+\frac{1}{n_Y-1}\left (\frac{{S_Y'}^2}{n_Y}\right )^2}=\frac{\left (\frac{14^2}{30}+\frac{20^2}{30}\right )^2}{\frac{1}{30-1}\left (\frac{14^2}{30}\right )^2+\frac{1}{30-1}\left (\frac{20^2}{30}\right )^2}=\frac{\left (\frac{196}{30}+\frac{400}{30}\right )^2}{\frac{1}{29}\left (\frac{196}{30}\right )^2+\frac{1}{29}\left (\frac{400}{30}\right )^2} \\ & =\frac{\left (\frac{596}{30}\right )^2}{\frac{1}{29}\left (\frac{38416}{900}+\frac{160000}{900}\right )}=\frac{\frac{355216}{900}}{\frac{1}{29}\cdot \frac{198416}{900}}=\frac{355216\cdot 29}{ 198416}=\frac{10301264}{ 198416}\approx 51.9175\end{align*} so $k=52$.

So we get the critical value $t_{k;1-\alpha/2}=t_{52;0.975}\approx 2.007$.

Since $|T|=1.1218<2.007\approx t_{52;0.975}$ we don't reject the null hypothesis.
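As a cross-check, the Welch statistic and its degrees of freedom can be computed directly from the summary values. This is just a sketch; `welch_from_summary` is a hypothetical helper written for this post, not a built-in R function.

```r
# Hypothetical helper: Welch t statistic and Welch-Satterthwaite degrees of
# freedom computed from summary statistics only
welch_from_summary <- function(mean_x, mean_y, s_x, s_y, n_x, n_y) {
  vx <- s_x^2 / n_x
  vy <- s_y^2 / n_y
  t_stat <- (mean_x - mean_y) / sqrt(vx + vy)
  df <- (vx + vy)^2 / (vx^2 / (n_x - 1) + vy^2 / (n_y - 1))
  list(t = t_stat, df = df)
}

res    <- welch_from_summary(79, 74, 14, 20, 30, 30)
alpha  <- 0.05
# Two-sided test, so the quantile is taken at 1 - alpha/2
t_crit <- qt(1 - alpha / 2, df = res$df)
reject <- abs(res$t) > t_crit

print(c(T = res$t, df = res$df, critical = t_crit))
print(reject)
```

`qt` accepts non-integer degrees of freedom, so the unrounded value can be passed in directly if one prefers not to round.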


Is everything correct? (Wondering)


(c) Do we have to do the same as in (b) just with unknown n? (Wondering)
 

Klaas van Aarsen

MHB Seeker
Staff member
Mar 5, 2012
8,686
Hey mathmari !!

Just a nitpick. Generally we round degrees of freedom down, so I believe we should pick $k=51$.
That's because we want to be sure with a confidence of 'at least' $1-\alpha$ before we reject the null hypothesis.
In case of doubt, we can't.
So we should round to the safe side. (Nerd)
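To see why rounding down is the safe side, one can compare the two candidate critical values. A small sketch, assuming the two-sided test at $\alpha=5\%$ from (b):

```r
alpha <- 0.05

# Two-sided critical values for the two candidate degrees of freedom
t_crit_51 <- qt(1 - alpha / 2, df = 51)
t_crit_52 <- qt(1 - alpha / 2, df = 52)
print(c(df51 = t_crit_51, df52 = t_crit_52))

# Fewer degrees of freedom give the larger critical value,
# so rounding down makes rejection harder, never easier
print(t_crit_51 > t_crit_52)
```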


As for your two questions: yes, the rest looks correct, and yes, for (c) we do the same as in (b), just with unknown $n$. (Nod)
 

mathmari

Ah ok, I understand! (Nerd)


For (c) we have the following:

Do we check again with an F-test whether the variances are equal? Or is that not necessary, so that the same holds as in (b)? (Wondering)

The F-test would be the following:

The null hypothesis is $H_0:\sigma_Y^2=\sigma_X^2$ and the alternative hypothesis is $H_1:\sigma_Y^2>\sigma_X^2$.

The test statistic is \begin{equation*}F=\frac{{S_Y'}^2}{{S_X'}^2}=\frac{20^2}{14^2}=\frac{400}{196}\approx 2.0408\end{equation*}
$F$ is F-distributed with degrees of freedom $\nu_Y=\nu_X=n-1$.

We have that $1-\alpha=99\%$.

The null hypothesis will be rejected if $F>F_{1-\alpha}(\nu_Y, \nu_X)=F_{0.99}(n-1, n-1)$.

How can we determine $F_{0.99}(n-1, n-1)$ without knowing $n$ ? (Wondering)
 

Klaas van Aarsen

We already know that we'll need a bigger $n$ than we had for (b), don't we?

Let's inspect the F-table with the smaller $\alpha$ and with the same $F$-value (since the sample standard deviations are the same as in (b)).
What happens if we increase the degrees of freedom of both the numerator and the denominator?
Is there a possibility that we can assume equal variances after all? (Wondering)
 

mathmari

So we have to check in this table for which $n$ that happens. For $n-1\geq 30$, don't we get critical values smaller than $F=2.0408$, so that the null hypothesis is rejected?

So we again have to apply a Welch test.

Or am I wrong? (Wondering)
 

Klaas van Aarsen

There seem to be mistakes in that table. For instance $F_{0.99}(31,31)=1.98$ is lower than the values to the left and right of it, which is not possible. (Worried)
I think we should use another table.

In R we can do:
Code:
> qf(0.99, 43:45, 43:45)
[1] 2.056934 2.039508 2.022824
So for $n-1\ge 44$ the critical $F$-values are below our $F=2.0408$, so we will have to reject the null hypothesis for those $n$, and apply the Welch-test. (Thinking)
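Instead of probing a few values by hand, we could let R search for the smallest number of degrees of freedom at which the F-test rejects. A rough sketch, where the search range `2:200` is an arbitrary choice:

```r
# Observed F statistic (it does not depend on n)
f_stat <- 20^2 / 14^2

# Candidate degrees of freedom n - 1; the upper bound 200 is an arbitrary choice
df     <- 2:200
f_crit <- qf(0.99, df1 = df, df2 = df)

# Smallest n - 1 for which the critical value drops below the observed F
df_min <- min(df[f_crit < f_stat])
print(df_min)      # smallest n - 1
print(df_min + 1)  # corresponding group size n
```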
 

mathmari

Ah ok!



We want to find for which $n$ the null hypothesis of (a) can be rejected.
So do we have to distinguish cases and find $n$ if $\sigma_X=\sigma_Y$, i.e. with a two-sample t-test, and also if $\sigma_X<\sigma_Y$, i.e. with a Welch test?

(Wondering)
 

Klaas van Aarsen

Yep. We can do that.
So for $n-1 < 44$ we should assume equal variances, find the critical $n$, and verify that it indeed satisfies $n-1 < 44$.
And for $n-1 \ge 44$ we should assume unequal variances, find the critical $n$, and verify that it indeed satisfies $n-1 \ge 44$. (Thinking)
 

mathmari

For $n-1\ge 44$ we apply the Welch-Test.

The null hypothesis is $H_0: \mu_X-\mu_Y=0$ and the alternative hypothesis is $H_1:\mu_X-\mu_Y\neq 0$.

The test statistic $T$ for the t-test with unknown variances is \begin{equation*}T=\frac{\overline{X}-\overline{Y}-0}{\sqrt{\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}}}=\frac{79-80}{\sqrt{\frac{14^2}{n}+\frac{20^2}{n}}}=\frac{-1}{\sqrt{\frac{196}{n}+\frac{400}{n}}}=\frac{-1}{\sqrt{\frac{596}{n}}}=-\frac{\sqrt{n}}{2\sqrt{149}}\end{equation*} so that for $n\geq 45$ we have $|T|=\frac{\sqrt{n}}{2\sqrt{149}}\geq \frac{\sqrt{45}}{2\sqrt{149}}\approx 0.2748$.

The null hypothesis will be rejected if $|T|>t_{k;1-\alpha/2}$.

The number of degrees of freedom is \begin{align*}k&=\frac{\left (\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}\right )^2}{\frac{1}{n_X-1}\left (\frac{{S_X'}^2}{n_X}\right )^2+\frac{1}{n_Y-1}\left (\frac{{S_Y'}^2}{n_Y}\right )^2}=\frac{\left (\frac{14^2}{n}+\frac{20^2}{n}\right )^2}{\frac{1}{n-1}\left (\frac{14^2}{n}\right )^2+\frac{1}{n-1}\left (\frac{20^2}{n}\right )^2}=\frac{\left (\frac{196}{n}+\frac{400}{n}\right )^2}{\frac{1}{n-1}\left (\frac{196}{n}\right )^2+\frac{1}{n-1}\left (\frac{400}{n}\right )^2} \\ & =\frac{\left (\frac{596}{n}\right )^2}{\frac{1}{n-1}\left (\frac{38416}{n^2}+\frac{160000}{n^2}\right )}=\frac{\frac{355216}{n^2}}{\frac{1}{n-1}\cdot \frac{198416}{n^2}}=\frac{355216\cdot (n-1)}{ 198416}\geq \frac{355216\cdot 44}{ 198416}\approx 78.7714\end{align*} so $k\geq 78$, with $k=78$ (rounded down) for $n=45$.
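For equal group sizes the degrees-of-freedom formula collapses to a multiple of $n-1$. A short derivation, where the shorthand $a={S_X'}^2$ and $b={S_Y'}^2$ is introduced here only for readability:
\begin{align*}k&=\frac{\left (\frac{a}{n}+\frac{b}{n}\right )^2}{\frac{1}{n-1}\left (\frac{a}{n}\right )^2+\frac{1}{n-1}\left (\frac{b}{n}\right )^2}=\frac{\frac{(a+b)^2}{n^2}}{\frac{1}{n-1}\cdot \frac{a^2+b^2}{n^2}}=(n-1)\cdot \frac{(a+b)^2}{a^2+b^2} \\ &=(n-1)\cdot \frac{(196+400)^2}{196^2+400^2}=(n-1)\cdot \frac{355216}{198416}\approx 1.7903\, (n-1)\end{align*}
Since $a^2+b^2\leq (a+b)^2\leq 2(a^2+b^2)$ for positive $a, b$, the value of $k$ always lies between $n-1$ and $2(n-1)$ in this equal-$n$ setting.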

The critical value is therefore $t_{k;1-\alpha/2}=t_{78;0.995}=2.375$.

How can we compare $|T|\geq 0.2748$ and $t_{78;0.995}=2.375$ where we have inequalities? (Wondering)
 

Klaas van Aarsen

They are not inequalities if we pick a specific n.
In fact we have found that for n=45, we cannot reject H0.
We will need a bigger n.
How about n=100? Or n=1000? (Wondering)
 

mathmari

Ahh.. We reject the null hypothesis if $$|T|>t_{78;0.995}\Rightarrow \frac{\sqrt{n}}{2\sqrt{149}}>2.375\Rightarrow n>3361.81$$ right? (Wondering)
 

Klaas van Aarsen

If n is bigger, don't the degrees of freedom k also become bigger? (Wondering)
Then the critical t-value becomes smaller until it approaches the critical z-value. Doesn't it?
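Because both $|T|$ and the degrees of freedom depend on $n$, one way to handle this is to scan over $n$ and let the critical value change along with it. The following is only a sketch: the upper bound `10000` is an arbitrary search limit, and the degrees of freedom are rounded down as suggested earlier in the thread.

```r
s_x <- 14
s_y <- 20
diff_means <- 79 - 80
alpha <- 0.01

# Candidate group sizes in the Welch case (n - 1 >= 44); 10000 is an arbitrary cap
n  <- 45:10000
vx <- s_x^2 / n
vy <- s_y^2 / n

# |T| and the Welch-Satterthwaite degrees of freedom, both as functions of n
t_abs <- abs(diff_means) / sqrt(vx + vy)
df    <- (vx + vy)^2 / (vx^2 / (n - 1) + vy^2 / (n - 1))

# Two-sided critical value with the degrees of freedom rounded down
t_crit <- qt(1 - alpha / 2, df = floor(df))

# Smallest n for which H0 is rejected
n_min <- min(n[t_abs > t_crit])
print(n_min)
```

The result can differ slightly from a hand calculation, depending on how the critical value is rounded.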
 

mathmari

So, for a large number of degrees of freedom the critical t-value approximates the critical z-value and so $t_{k;0.995}\approx z_{0.995}= 2.575$ ? (Wondering)
 

Klaas van Aarsen

Ah. You already had the critical z-value! (Blush)
Then it's all correct.
 

mathmari

For $n-1\ge 44$ we apply the Welch-test.

The null hypothesis is $H_0: \mu_X-\mu_Y=0$ and the alternative hypothesis is $H_1: \mu_X-\mu_Y\neq 0$.

The test statistic $T$ for the t-test is \begin{equation*}T=\frac{\overline{X}-\overline{Y}-0}{\sqrt{\frac{S_X'^2}{n_X}+\frac{S_Y'^2}{n_Y}}}=\frac{79-80}{\sqrt{\frac{14^2}{n}+\frac{20^2}{n}}}=\frac{-1}{\sqrt{\frac{596}{n}}}=-\frac{\sqrt{n}}{\sqrt{596}}\approx -0.04096\sqrt{n}\end{equation*}

The null hypothesis will be rejected if $|T|>t_{k;1-\alpha/2}=t_{k;0.995}$.

From $n\geq 30$ the t-distribution can be approximated by the normal distribution.

Since this holds in this case, $n\geq 45$, we have that $t_{k;0.995}\approx z_{0.995}=2.575$.

Therefore, for the null hypothesis to be rejected, the following must hold: \begin{equation*}|T|>t_{k;0.995}\Rightarrow 0.04096\sqrt{n}>2.575 \Rightarrow n>3952.16\end{equation*}
So, the null hypothesis will be rejected for a sample of size $n\geq 3953$.
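Under the normal approximation the condition can also be solved for $n$ in closed form, without rounding $1/\sqrt{596}$ first:
\begin{align*}|T|=\frac{\sqrt{n}\cdot |\overline{X}-\overline{Y}|}{\sqrt{{S_X'}^2+{S_Y'}^2}}>z_{0.995}\ \Longleftrightarrow \ n>z_{0.995}^2\cdot \frac{{S_X'}^2+{S_Y'}^2}{(\overline{X}-\overline{Y})^2}=2.575^2\cdot \frac{596}{1}\approx 3951.85\end{align*}
The small gap to $3952.16$ comes from rounding $1/\sqrt{596}$ to $0.04096$, and taking $z_{0.995}\approx 2.5758$ instead gives roughly $3954.3$. So the threshold is sensitive to rounding, and the smallest admissible $n$ can shift by one or two.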


Is everything correct? (Wondering)



Let's consider the case $n-1<44$. Here we apply a two-sample t-test.

The test statistic is $T=\frac{\overline{X}-\overline{Y}}{S\cdot \sqrt{\frac{1}{n_X}+\frac{1}{n_Y}}}$ with $S=\sqrt{\frac{(n_X-1){S_X'}^2+(n_Y-1){S_Y'}^2}{n_X+n_Y-2}}$, right?

So, we have that $S=\sqrt{\frac{(n-1)14^2+(n-1)20^2}{n+n-2}}=\sqrt{\frac{(n-1)196+(n-1)400}{2n-2}}=\sqrt{\frac{596(n-1)}{2(n-1)}}=\sqrt{\frac{596}{2}}\approx 17.2627$.

Therefore we get $T=\frac{79-80}{17.2627\cdot \sqrt{\frac{1}{n}+\frac{1}{n}}}=\frac{-\sqrt{n}}{17.2627\cdot \sqrt{2}}=-0.0409615 \sqrt{n}$.

How could we determine $t_{k;0.995}$ here? We cannot approximate the t-distribution by a normal distribution for $n<30$.

(Wondering)
 

Klaas van Aarsen

The Welch-test part for $n-1\ge 44$, with the conclusion that the null hypothesis is rejected for $n\geq 3953$, looks correct to me. (Nod)

For the case $n-1<44$ with the two-sample t-test: how about checking every $n$ between $1$ and $44$?
Maybe we can find a pattern so that we have to check fewer values. (Wondering)
 

mathmari

As I read now, the null hypothesis is rejected if $|T|>t_{1-\alpha/2;n_X+n_Y-2}$, isn't it?

Then we have the following:
$$|T|>t_{1-\alpha/2;n_X+n_Y-2}\Rightarrow |T|>t_{0.995;n+n-2}\Rightarrow 0.0409615 \sqrt{n}>t_{0.995;2n-2}$$

To check that for every $n$ between $1$ and $44$ (i.e. for every $2n-2$ between $0$ and $86$) in R, do we write [m]qt(0.01, 0 : 86)[/m]? If yes, we get only negative values, and that would mean that the above holds for every $n$.

(Wondering)
 

Klaas van Aarsen

Shouldn't we check [m]qt(0.995, 0 : 86)[/m]? (Wondering)
 

mathmari

Oh yes (Blush)

So, on the left side of the inequality the largest number that we get, i.e. for $n=44$, is about $0.271708$. On the right side every number is at least $2.634212$.
So that inequality doesn't hold for any $n$.

Is this correct? (Wondering)
 

Klaas van Aarsen

Looks correct to me. (Nod)
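For completeness, the whole check for the equal-variance case fits in a few lines of R. This is only a sketch; it starts at $n=2$, since $n=1$ would give $2n-2=0$, which is not a valid number of degrees of freedom.

```r
# Group sizes in the equal-variance case (n - 1 < 44), starting at n = 2
n <- 2:44

# Pooled standard deviation; for equal group sizes it does not depend on n
s_pooled <- sqrt((14^2 + 20^2) / 2)

# |T| of the two-sample t-test and the two-sided 1% critical value with 2n - 2 df
t_abs  <- abs(79 - 80) / (s_pooled * sqrt(2 / n))
t_crit <- qt(0.995, df = 2 * n - 2)

print(max(t_abs))            # largest left-hand side, attained at n = 44
print(min(t_crit))           # smallest right-hand side, attained at n = 44
print(any(t_abs > t_crit))   # is H0 rejected for any n in this range?
```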
 

mathmari
