# [SOLVED]Performance of students - Hypothesis testing

#### mathmari

##### Well-known member
MHB Site Helper
Hey!!

A teacher wants to find out whether the order of the exam tasks has an impact on the performance of the students. Therefore, he creates two versions ($X$ and $Y$) of an exam in which the exam tasks are arranged differently. The versions are randomly distributed so that $n$ students receive version $X$ and $m = n$ students receive version $Y$. We denote the expected score on version $X$ by $\mu_X$ and the expected score on version $Y$ by $\mu_Y$. The variances are denoted $\sigma_X^2$ and $\sigma_Y^2$; the scores are assumed to be normally distributed.

(a) Formulate a suitable null hypothesis for the question of the teacher.
(b) Consider that $n = 30, \overline{X} = 79, \overline{Y}= 74, S_X' = 14, S_Y' = 20$. Check the null hypothesis of (a) with significance level $\alpha=5\%$.
(c) Consider that $\overline{X} = 79, \overline{Y}= 80, S_X' = 14, S_Y' = 20$. For which sample size $n$ can we reject the null hypothesis with significance level $\alpha=1\%$ ?

I have done the following:

(a) The null hypothesis is $H_0: \mu_X=\mu_Y$, right?

(b) Since we don't know whether the variances are equal or different, we first have to test with an F-test whether both versions have the same $\sigma$.
• If $\sigma_X=\sigma_Y$ then we apply a two-sample t-test.
• If $\sigma_X<\sigma_Y$ then we apply a Welch-test.

The test is the following:

The null hypothesis and the alternative hypothesis are $H_0:\sigma_Y^2=\sigma_X^2$ and $H_1:\sigma_Y^2>\sigma_X^2$, respectively.

The test statistic is \begin{equation*}F=\frac{{S_Y'}^2}{{S_X'}^2}=\frac{20^2}{14^2}=\frac{400}{196}\approx 2.0408\end{equation*}
$F$ is F-distributed with degrees of freedom $\nu_Y=n_Y-1=30-1=29$, $\nu_X=n_X-1=30-1=29$.

We have that $1-\alpha=95\%$.

The null hypothesis will be rejected if $F>F_{1-\alpha}(\nu_Y, \nu_X)=F_{0.95}(29, 29)$.

It holds that $F_{0.95}(29, 29)=1.86$.

Since $F=2.0408>1.86=F_{0.95}(29, 29)$, we reject the null hypothesis.

So, we apply a Welch-Test.

The null hypothesis is $H_0: \mu_X-\mu_Y=0$ and the alternative hypothesis is $H_1:\mu_X-\mu_Y\neq 0$.

The test statistic $T$ for the t-test with unknown variances is \begin{equation*}T=\frac{\overline{X}-\overline{Y}-0}{\sqrt{\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}}}=\frac{79-74}{\sqrt{\frac{14^2}{30}+\frac{20^2}{30}}}=\frac{5}{\sqrt{\frac{196}{30}+\frac{400}{30}}}=\frac{5}{\sqrt{\frac{298}{15}}}\approx 1.1218\end{equation*}

The null hypothesis will be rejected if $|T|>t_{k;1-\alpha/2}$.

The number of degrees of freedom is \begin{align*}k&=\frac{\left (\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}\right )^2}{\frac{1}{n_X-1}\left (\frac{{S_X'}^2}{n_X}\right )^2+\frac{1}{n_Y-1}\left (\frac{{S_Y'}^2}{n_Y}\right )^2}=\frac{\left (\frac{14^2}{30}+\frac{20^2}{30}\right )^2}{\frac{1}{30-1}\left (\frac{14^2}{30}\right )^2+\frac{1}{30-1}\left (\frac{20^2}{30}\right )^2}=\frac{\left (\frac{196}{30}+\frac{400}{30}\right )^2}{\frac{1}{29}\left (\frac{196}{30}\right )^2+\frac{1}{29}\left (\frac{400}{30}\right )^2} \\ & =\frac{\left (\frac{596}{30}\right )^2}{\frac{1}{29}\left (\frac{38416}{900}+\frac{160000}{900}\right )}=\frac{\frac{355216}{900}}{\frac{1}{29}\cdot \frac{198416}{900}}=\frac{355216\cdot 29}{ 198416}=\frac{10301264}{ 198416}\approx 51.9175\end{align*} so $k=52$.
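The two numbers above can be double-checked with a few lines of Python. This is only a minimal sketch using the standard library; the function name `welch_t_and_df` is made up for illustration and is not part of the thread.

```python
import math

def welch_t_and_df(mean_x, mean_y, sd_x, sd_y, n_x, n_y):
    """Return the Welch test statistic and the Welch-Satterthwaite degrees of freedom."""
    vx = sd_x**2 / n_x   # S_X'^2 / n_X
    vy = sd_y**2 / n_y   # S_Y'^2 / n_Y
    t = (mean_x - mean_y) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx**2 / (n_x - 1) + vy**2 / (n_y - 1))
    return t, df

# Data from part (b): n = 30 per group, means 79 and 74, standard deviations 14 and 20.
t_stat, df = welch_t_and_df(79, 74, 14, 20, 30, 30)
print(round(t_stat, 4), round(df, 4))
```

The critical t-value itself is not computed here, because the Python standard library has no t-quantile function; in R it would come from `qt`.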

So we get the critical value $t_{k;1-\alpha/2}=t_{52;0.975}\approx 2.007$.

Since $|T|=1.1218<2.007\approx t_{52;0.975}$ we don't reject the null hypothesis.

Is everything correct?

(c) Do we have to do the same as in (b), just with unknown $n$?

#### Klaas van Aarsen

##### MHB Seeker
Staff member
Hey mathmari!!

Just a nitpick. Generally we round degrees of freedom down, so I believe we should pick $k=51$.
That's because we want to be sure with a confidence of 'at least' $1-\alpha$ before we reject the null hypothesis.
In case of doubt, we can't.
So we should round to the safe side.

Yep to both questions: the rest is correct, and (c) works the same way as (b), just with unknown $n$.

#### mathmari

Ah ok, I understand!

We have the following at (c):

We check again with an F-test whether the variances are equal. Or is that not necessary, because the result is the same as in (b)? The F-test would be the following:

The null hypothesis is $H_0:\sigma_Y^2=\sigma_X^2$ and the alternative hypothesis is $H_1:\sigma_Y^2>\sigma_X^2$.

The test statistic is \begin{equation*}F=\frac{{S_Y'}^2}{{S_X'}^2}=\frac{20^2}{14^2}=\frac{400}{196}\approx 2.0408\end{equation*}
$F$ is F-distributed with degrees of freedom $\nu_Y=\nu_X=n-1$.

We have that $1-\alpha=99\%$.

The null hypothesis will be rejected if $F>F_{1-\alpha}(\nu_Y, \nu_X)=F_{0.99}(n-1, n-1)$.

How can we determine $F_{0.99}(n-1, n-1)$ without knowing $n$?

#### Klaas van Aarsen

We already know that we'll need a bigger $n$ than we had for (b) don't we?

Let's inspect the F-table with the smaller $\alpha$ and with the same $F$-value (since the sample variances are the same as in (b)).
What happens if we increase the degrees of freedom of both the numerator and the denominator?
Is there a possibility that we can assume equal variances after all?

#### mathmari

So we have to check in this table for which $n$ this happens. For $n-1\geq 30$, don't we get critical values smaller than $F=2.0408$, so that the null hypothesis is rejected?

So we have to apply a Welch-test again.

Or am I wrong?

#### Klaas van Aarsen

There seem to be mistakes in that table. For instance, $F_{0.99}(31,31)=1.98$ is lower than the values to the left and right of it, which is not possible. I think we should use another table.

In R we can do:

```r
> qf(0.99, 43:45, 43:45)
[1] 2.056934 2.039508 2.022824
```

So for $n-1\ge 44$ the critical $F$-values are below our $F=2.0408$, so we will have to reject the null hypothesis for those $n$, and apply the Welch-test.

#### mathmari

Ah ok!

We want to find the $n$ for which the null hypothesis of (a) can be rejected.
So do we have to distinguish cases and find $n$ if $\sigma_X=\sigma_Y$, i.e. with a two-sample t-test, and also if $\sigma_X<\sigma_Y$, i.e. with a Welch-test?

#### Klaas van Aarsen

Yep. We can do that.
So for $n-1 < 44$ we should assume equal variances, find the critical $n$, and verify that it indeed satisfies $n-1 < 44$.
And for $n-1 \ge 44$ we should assume unequal variances, find the critical $n$, and verify that it indeed satisfies $n-1 \ge 44$.

#### mathmari

For $n-1\ge 44$ we apply the Welch-test.

The null hypothesis is $H_0: \mu_X-\mu_Y=0$ and the alternative hypothesis is $H_1:\mu_X-\mu_Y\neq 0$.

The test statistic $T$ for the t-test with unknown variances is \begin{equation*}T=\frac{\overline{X}-\overline{Y}-0}{\sqrt{\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}}}=\frac{79-80}{\sqrt{\frac{14^2}{n}+\frac{20^2}{n}}}=\frac{-1}{\sqrt{\frac{196}{n}+\frac{400}{n}}}=\frac{-1}{\sqrt{\frac{596}{n}}}=-\frac{\sqrt{n}}{2\sqrt{149}}\end{equation*} so that $|T|=\frac{\sqrt{n}}{2\sqrt{149}}\geq \frac{\sqrt{45}}{2\sqrt{149}}\approx 0.2748$.

The null hypothesis will be rejected if $|T|>t_{k;1-\alpha/2}$.

The degree of freedom is \begin{align*}k&=\frac{\left (\frac{{S_X'}^2}{n_X}+\frac{{S_Y'}^2}{n_Y}\right )^2}{\frac{1}{n_X-1}\left (\frac{{S_X'}^2}{n_X}\right )^2+\frac{1}{n_Y-1}\left (\frac{{S_Y'}^2}{n_Y}\right )^2}=\frac{\left (\frac{14^2}{n}+\frac{20^2}{n}\right )^2}{\frac{1}{n-1}\left (\frac{14^2}{n}\right )^2+\frac{1}{n-1}\left (\frac{20^2}{n}\right )^2}=\frac{\left (\frac{196}{n}+\frac{400}{n}\right )^2}{\frac{1}{n-1}\left (\frac{196}{n}\right )^2+\frac{1}{n-1}\left (\frac{400}{n}\right )^2} \\ & =\frac{\left (\frac{596}{n}\right )^2}{\frac{1}{n-1}\left (\frac{38416}{n^2}+\frac{160000}{n^2}\right )}=\frac{\frac{355216}{n^2}}{\frac{1}{n-1}\cdot \frac{198416}{n^2}}=\frac{355216\cdot (n-1)}{ 198416}\geq \frac{355216\cdot 44}{ 198416}=78.7714\end{align*} so $k=78$.

The critical value is therefore $t_{k;1-\alpha/2}=t_{78;0.995}=2.375$.

How can we compare $|T|\geq 0.2748$ and $t_{78;0.995}=2.375$ when we only have inequalities?

#### Klaas van Aarsen

They are not inequalities if we pick a specific $n$.
In fact we have found that for $n=45$, we cannot reject $H_0$.
We will need a bigger $n$.
How about $n=100$? Or $n=1000$?

#### mathmari

Ahh.. We reject the null hypothesis if $$|T|>t_{78;0.995}\Rightarrow \frac{\sqrt{n}}{2\sqrt{149}}>2.375\Rightarrow n>3361.81$$ right?

#### Klaas van Aarsen

If $n$ is bigger, don't the degrees of freedom $k$ also become bigger? Then the critical t-value becomes smaller until it approaches the critical z-value. Doesn't it?
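To see how $|T|$ and $k$ move together as $n$ grows, here is a small Python sketch for the values $n=45, 100, 1000$ mentioned earlier. The helper names `abs_t` and `welch_df` are illustrative only, and equal group sizes are assumed as in the exercise.

```python
import math

SD_X, SD_Y = 14, 20   # sample standard deviations from part (c)
DIFF = 79 - 80        # difference of the sample means

def abs_t(n):
    """|T| of the Welch test when both groups have size n."""
    return abs(DIFF) / math.sqrt((SD_X**2 + SD_Y**2) / n)

def welch_df(n):
    """Welch-Satterthwaite degrees of freedom when both groups have size n."""
    vx, vy = SD_X**2 / n, SD_Y**2 / n
    return (vx + vy) ** 2 / ((vx**2 + vy**2) / (n - 1))

for n in (45, 100, 1000):
    print(n, round(abs_t(n), 4), round(welch_df(n), 2))
```

The degrees of freedom grow linearly in $n$, while $|T|$ only grows like $\sqrt{n}$, so a fixed critical value taken at $k=78$ is not the right one for much larger $n$.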

#### mathmari

So, for large degrees of freedom the critical t-value approximates the critical z-value, and so $t_{k;0.995}\approx z_{0.995}= 2.575$?

#### Klaas van Aarsen

Ah. You already had the critical z-value! Then it's all correct.

#### mathmari

For $n-1\ge 44$ we apply the Welch-test.

The null hypothesis is $H_0: \mu_X-\mu_Y=0$ and the alternative hypothesis is $H_1: \mu_X-\mu_Y\neq 0$.

The test statistic $T$ for the t-test is \begin{equation*}T=\frac{\overline{X}-\overline{Y}-0}{\sqrt{\frac{S_X'^2}{n_X}+\frac{S_Y'^2}{n_Y}}}=\frac{79-80}{\sqrt{\frac{14^2}{n}+\frac{20^2}{n}}}=\frac{-1}{\sqrt{\frac{596}{n}}}=-\frac{\sqrt{n}}{\sqrt{596}}\approx -0.04096\sqrt{n}\end{equation*}

The null hypothesis will be rejected if $|T|>t_{k;1-\alpha/2}=t_{k;0.995}$.

For $n\geq 30$ the t-distribution can be approximated by the normal distribution.

Since this holds in this case ($n\geq 45$), we have that $t_{k;0.995}\approx z_{0.995}=2.575$.

Therefore, to reject the null hypothesis the following must hold: \begin{equation*}|T|>t_{k;0.995}\Rightarrow 0.04096\sqrt{n}>2.575 \Rightarrow n>3952.16\end{equation*}
So, the null hypothesis will be rejected for a sample of size $n\geq 3953$.
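The same bound can be recomputed without rounding the intermediate values. The following is a minimal Python sketch under the normal approximation used above; the variable names are illustrative, and `statistics.NormalDist` from the standard library supplies the z-quantile.

```python
import math
from statistics import NormalDist

alpha = 0.01
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical z-value
var_sum = 14**2 + 20**2                        # S_X'^2 + S_Y'^2 = 596
diff = abs(79 - 80)                            # |mean_X - mean_Y|

# |T| = diff * sqrt(n / var_sum) > z_crit  <=>  n > var_sum * (z_crit / diff)^2
n_bound = var_sum * (z_crit / diff) ** 2
n_min = math.floor(n_bound) + 1                # smallest integer strictly above the bound
print(round(z_crit, 4), round(n_bound, 2), n_min)
```

With the unrounded quantile the threshold comes out a couple of units higher than the value obtained from the rounded $2.575$ and $0.04096$, so the exact integer should be read as approximate; the order of magnitude (just under $4000$ students per version) is the same.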

Is everything correct?

Let's consider the case $n-1<44$. Here we apply a two-sample t-test.

The test statistic is $T=\frac{\overline{X}-\overline{Y}}{S\cdot \sqrt{\frac{1}{n_X}+\frac{1}{n_Y}}}$ with $S=\sqrt{\frac{(n_X-1)S_X^2+(n_Y-1)S_Y^2}{n_X+n_Y-2}}$, right?

So, we have that $S=\sqrt{\frac{(n-1)14^2+(n-1)20^2}{n+n-2}}=\sqrt{\frac{(n-1)196+(n-1)400}{2n-2}}=\sqrt{\frac{596(n-1)}{2(n-1)}}=\sqrt{\frac{596}{2}}\approx 17.2627$.

Therefore we get $T=\frac{79-80}{17.2627\cdot \sqrt{\frac{1}{n}+\frac{1}{n}}}=\frac{-\sqrt{n}}{17.2627\cdot \sqrt{2}}=-0.0409615 \sqrt{n}$.
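As a quick check that the pooled standard deviation really does not depend on $n$ when both groups have the same size, here is a short Python sketch. The function names `pooled_sd` and `pooled_t` are made up for illustration.

```python
import math

def pooled_sd(sd_x, sd_y, n_x, n_y):
    """Pooled standard deviation S of the two-sample t-test."""
    return math.sqrt(((n_x - 1) * sd_x**2 + (n_y - 1) * sd_y**2) / (n_x + n_y - 2))

def pooled_t(mean_x, mean_y, sd_x, sd_y, n):
    """Two-sample t statistic when both groups have size n."""
    s = pooled_sd(sd_x, sd_y, n, n)
    return (mean_x - mean_y) / (s * math.sqrt(2 / n))

for n in (10, 30, 44):
    print(n, round(pooled_sd(14, 20, n, n), 4), round(pooled_t(79, 80, 14, 20, n), 4))
```

For equal group sizes the pooled statistic coincides with the Welch statistic from the previous case; only the degrees of freedom ($2n-2$ instead of $k$) differ.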

How could we determine $t_{k;0.995}$ here? We cannot approximate the t-distribution by a normal distribution for $n<30$.

#### Klaas van Aarsen

The Welch case for $n-1\ge 44$ looks correct to me.

For the case $n-1<44$: how about checking every $n$ between $1$ and $44$?
Maybe we can find a pattern so that we have to check fewer values.

#### mathmari

As I read now, the null hypothesis is rejected if $|T|>t_{1-\alpha/2;n_X+n_Y-2}$, isn't it?

Then we have the following:
$$|T|>t_{1-\alpha/2;n_X+n_Y-2}\Rightarrow |T|>t_{0.995;n+n-2}\Rightarrow 0.0409615 \sqrt{n}>t_{0.995;2n-2}$$

To check that for every $n$ between $1$ and $44$ (i.e. for every $2n-2$ between $0$ and $86$) in R, do we write `qt(0.01, 0:86)`? If yes, we get only negative values, and that would mean that the above holds for every $n$.

#### Klaas van Aarsen

Shouldn't we check `qt(0.995, 0:86)`?

#### mathmari

Oh yes! So, on the left side of the inequality the largest number that we get, i.e. for $n=44$, is about $0.271708$. On the right side every number is greater than $2.634212$.
So that inequality doesn't hold for any $n$.
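The conclusion can also be confirmed without t-tables. The upper quantile $t_{0.995;\nu}$ is larger than $z_{0.995}$ for every finite $\nu$, so it is enough to compare the left side with the z-value. The sketch below does that in Python; `abs_t` is an illustrative helper name, and $n$ starts at $2$ because $2n-2=0$ degrees of freedom does not define a t-distribution.

```python
import math
from statistics import NormalDist

z_crit = NormalDist().inv_cdf(0.995)   # lower bound for every t_{0.995; 2n-2}

def abs_t(n):
    """|T| of the pooled two-sample t-test for group size n (S^2 = 298)."""
    return math.sqrt(n) / math.sqrt(2 * 298)

# Group sizes in the equal-variance case (n - 1 < 44) that would allow rejection.
rejecting = [n for n in range(2, 45) if abs_t(n) > z_crit]
print(round(abs_t(44), 6), round(z_crit, 4), rejecting)
```

An empty list means that even the weaker z-threshold is never reached, so the stricter t-threshold is not reached either.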

Is this correct?

#### Klaas van Aarsen

Looks correct to me.

#### mathmari

Great!! Thank you so much!!