Minimize the sum of Type I and Type II errors

In summary, the problem is to determine the rejection region R that minimizes the sum of the Type I and Type II error probabilities for the given hypotheses and sample. This involves finding the critical value alpha by minimizing that sum directly, using the known CDF of the normal distribution and the error function rather than a table. Along the way a few difficulties come up, such as the exponential in the derivative never being zero on its own and uncertainty about the meaning of sigma in the error function.
  • #1
GabrielN00

Homework Statement


Given [tex]X_1,\dots,X_n[/tex] a simple random sample with normal variables ([tex]\mu, \sigma^2[/tex]). We assume [tex]\mu[/tex] is known but [tex]\sigma^2[/tex] is unknown.

The hypothesis is
[tex]
\begin{cases}
H_0: & \mu=\mu_0 \\
H_1: & \mu=\mu_1 > \mu_0
\end{cases}
[/tex]

Determine the rejection region [tex] R[/tex] in order to minimize the [tex] P_{H_0}(R)+P_{H_1}(R^c)[/tex] .

Homework Equations



The Attempt at a Solution

I'm having problems both to understand the rejection regions and to find the minimum of the sum.

The "plan" would be to consider [tex]z=\displaystyle\frac{\bar{X}-\mu}{(s/\sqrt{n})}[/tex]

I could proceed to do a one-tail test and find the minimum, but the very first problem is that my [tex]\alpha[/tex] value is unknown, so I cannot look it up in a table.

I'm clueless at even how to get a usable expression for each type error, since everything I am able to find suggest the use of a table, but the problem clearly doesn't make use of one.
 
  • #2
I assume that α is the lower value of the region R. In that case, the problem is to determine α as a function of μ0 and μ1 (and the sample variance) by minimizing the sum of the probabilities. That is a calculus problem that requires you to use the equations of the CDF rather than a table.

The CDF of a normal distribution is known. See the CDF equation in https://en.wikipedia.org/wiki/Normal_distribution and the erf function in https://en.wikipedia.org/wiki/Error_function
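Just to see what that objective looks like numerically, here is a minimal sketch (SciPy and the specific numbers are assumptions for illustration, not part of the problem): the total error is written directly in terms of the normal CDF and minimized over the cutoff.

```python
# Sketch: minimize P_H0(reject) + P_H1(accept) over the cutoff "a".
# Illustrative values only -- mu0, mu1, sigma and n are made up.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

mu0, mu1, sigma, n = 0.0, 1.0, 2.0, 25
s = sigma / np.sqrt(n)  # standard deviation of the sample mean Xbar

def total_error(a):
    type1 = 1.0 - norm.cdf(a, loc=mu0, scale=s)  # P(Xbar > a | mu = mu0)
    type2 = norm.cdf(a, loc=mu1, scale=s)        # P(Xbar <= a | mu = mu1)
    return type1 + type2

res = minimize_scalar(total_error, bounds=(mu0, mu1), method="bounded")
print(res.x, res.fun)  # cutoff that minimizes the sum, and the minimal sum
```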
 
  • #3
GabrielN00 said:

Homework Statement


Given [tex]X_1,\dots,X_n[/tex] a simple random sample with normal variables ([tex]\mu, \sigma^2[/tex]). We assume [tex]\mu[/tex] is known but [tex]\sigma^2[/tex] is unknown.

The hypothesis is
[tex]
\begin{cases}
H_0: & \mu=\mu_0 \\
H_1: & \mu=\mu_1 > \mu_0
\end{cases}
[/tex]

Determine the rejection region [tex] R[/tex] in order to minimize the [tex] P_{H_0}(R)+P_{H_1}(R^c)[/tex] .

Homework Equations



The Attempt at a Solution

I'm having problems both to understand the rejection regions and to find the minimum of the sum.

The "plan" would be to consider [tex]z=\displaystyle\frac{\bar{X}-\mu}{(s/\sqrt{n})}[/tex]

I could proceed to do a one-tail test and find the minimum, but the very first problem is that my [tex]\alpha[/tex] value is unknown, so I cannot look it up in a table.

I'm clueless at even how to get a usable expression for each type error, since everything I am able to find suggest the use of a table, but the problem clearly doesn't make use of one.

The question makes no sense. You say that ##\mu## is known, and then you say the hypotheses involve ##\mu##!

It makes sense to test hypotheses about ##\sigma## when ##\mu## is known, or to test hypotheses about ##\mu## when ##\sigma## is known (or even to test hypotheses about ##\mu## or ##\sigma## when neither of these is known).
 
  • #4
FactChecker said:
I assume that α is the lower value of the region R. In that case, the problem is to determine α as a function of μ0 and μ1 (and the sample variance) by minimizing the sum of the probabilities. That is a calculus problem that requires you to use the equations of the CDF rather than a table.

The CDF of a normal distribution is known. See the CDF equation in https://en.wikipedia.org/wiki/Normal_distribution and the erf function in https://en.wikipedia.org/wiki/Error_function

I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.
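To make that recipe explicit (a sketch of the bookkeeping only; here ##s## just stands for the standard deviation of whatever statistic gets compared to the cutoff, and ##\phi## is the standard normal pdf):
$$\frac{d}{d\alpha}\,\Phi\!\left(\frac{\alpha-\mu}{s}\right) = \frac{1}{s}\,\phi\!\left(\frac{\alpha-\mu}{s}\right),$$
so differentiating the sum of the two error terms only ever produces Gaussian pdfs, and no table or closed form for ##\Phi## is needed.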
 
  • Like
Likes FactChecker
  • #5
Ray Vickson said:
The question makes no sense. You say that ##\mu## is known, and then you say the hypotheses involve ##\mu##!

It makes sense to test hypotheses about ##\sigma## when ##\mu## is known, or to test hypotheses about ##\mu## when ##\sigma## is known (or even to test hypotheses about ##\mu## or ##\sigma## when neither of these is known).
That's what the problem says, but given what you point out I think it might have been a typo. It should read that ##\mu## is unknown while ##\sigma## is known.
I'm not sure how I can edit the main post. I don't see an edit button.

StoneTemplePython said:
I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.

##\alpha## is the critical value, the value we cross when we enter the rejection region.

Alright, let's consider the following: using the error function above I have that the error function with mean 0 and variance ##\sigma ## is ##\frac{1}{2\pi}\int_0^{\alpha/(\sigma\sqrt{2})} e^{-t^2}dt##.

This error gives the probability of falling in ##(-\alpha,\alpha)##, but I am interested in the rejection region, that is ##(-\infty, \alpha)\cup(\alpha, +\infty)##. Therefore, I think I should consider the complementary error function ##erfc(\alpha) = 1-\frac{1}{2\pi}\int_0^{\alpha/(\sigma\sqrt{2})} e^{-t^2}dt = \frac{1}{2\pi}\int_{\alpha/(\sigma\sqrt{2})}^{\infty} e^{-t^2}dt##

Now I could differentiate and get that ##\frac{d}{d\alpha}\mathrm{erfc}(\alpha) = - \frac{1}{2\pi}e^{-\alpha^2/(2\sigma^2)}##. I should set it to ##0## and find ##\alpha##, to "solve" the problem.

There are three issues here:
(1) ##e^{-\alpha^2/(2\sigma^2)}## will never be zero for any ##\alpha##.
(2) I haven't actually brought the hypothesis testing into this anywhere.
(3) It is not clear what the ##\sigma## in the error function is. The Wikipedia entry linked above says that errors generally have mean zero, but it is possible for the error to have a variance. Is the ##\sigma## in the normal distribution the very same ##\sigma## as in the error function?
 
Last edited by a moderator:
  • #6
StoneTemplePython said:
I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.
Oh! Good point!
 
  • #7
StoneTemplePython said:
I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.

I can't really see how. I guess you mean ##f_X=\frac{1}{\sigma\sqrt{2\pi}e^{-\frac{(x-\mu)^2}{2\sigma^2}}} ## but how can it be used to find the minimal sum of the errors?

Working solely with the error functions I thought I should consider ##erf(\alpha) ## to calculate ##P_{H_0}(R)## and ##erfc(\alpha)## to calculate ##P_{H_1}(R)##.
 
  • #8
GabrielN00 said:
I can't really see how. I guess you mean ##f_X=\frac{1}{\sigma\sqrt{2\pi}e^{-\frac{(x-\mu)^2}{2\sigma^2}}} ## but how can it be used to find the minimal sum of the errors?

Working solely with the error functions I thought I should consider ##erf(\alpha) ## to calculate ##P_{H_0}(R)## and ##erfc(\alpha)## to calculate ##P_{H_1}(R)##.

No, the correct form is
$$f_X(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-(x-\mu)^2/(2 \sigma^2) }$$.
The density function of the standard normal (mean = 0, s.d.= 1) is usually denoted by ##\phi## and its cumulative distribution by ##\Phi##:
$$\phi(t) = \frac{1}{\sqrt{2 \pi}} e^{-t^2/2}, \;\; \Phi(z) = \int_{-\infty}^z \phi(t) \, dt. $$
The relationship between ##\Phi## and ##\text{erf}## is
$$\Phi(z) =\frac{1}{2} + \frac{1}{2} \text{erf} \left( \frac{z}{\sqrt{2}} \right), $$
provided that your definition of "erf" is ##\text{erf}(z) = (2/\sqrt{\pi}) \int_0^z e^{-t^2} \, dt##.

Anyway, you want to test a value of ##\mu_0## (H0) against a larger value ##\mu_1## (H1), so you will accept the null hypothesis provided that the sample mean ##\bar{X}## is not too large. So you accept H0 if ##\bar{X} \leq \alpha## and reject H0 if ##\bar{X} > \alpha##. The type I error is ##E_1 = P(\bar{X} > \alpha | \mu = \mu_0)##, and you can work this out in terms of ##\Phi## (or erfc), ##\alpha##, ##\mu_0## and ##\sigma##. The type II error is ##E_2 = P(\bar{X} \leq \alpha | \mu = \mu_1)##, and you can work this out in terms of ##\Phi##, ##\alpha##, ##\mu_1## and ##\sigma##. Altogether, you get ##E_1+E_2 = G(\alpha)## for some function ##G## that you can write out in terms of ##\Phi## or erfc. Then, as usual, you look for a solution of ##G'(\alpha) = 0## in your search for a minimum.
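For concreteness (a sketch of the bookkeeping, writing ##s=\sigma/\sqrt{n}## for the standard deviation of ##\bar{X}##), this reads
$$G(\alpha) = E_1 + E_2 = 1 - \Phi\!\left(\frac{\alpha-\mu_0}{s}\right) + \Phi\!\left(\frac{\alpha-\mu_1}{s}\right),$$
and ##G'(\alpha)=0## is the equation to solve for ##\alpha##.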
 
  • Like
Likes GabrielN00
  • #9
Ray Vickson said:
Anyway, you want to test a value of ##\mu_0## (H0) against a larger value ##\mu_1## (H1), so you will accept the null hypothesis provided that the sample mean ##\bar{X}## is not too large. So you accept H0 if ##\bar{X} \leq \alpha## and reject H0 if ##\bar{X} > \alpha##. The type I error is ##E_1 = P(\bar{X} > \alpha | \mu = \mu_0)##, and you can work this out in terms of ##\Phi## (or erfc), ##\alpha##, ##\mu_0## and ##\sigma##. The type II error is ##E_2 = P(\bar{X} \leq \alpha | \mu = \mu_1)##, and you can work this out in terms of ##\Phi##, ##\alpha##, ##\mu_1## and ##\sigma##. Altogether, you get ##E_1+E_2 = G(\alpha)## for some function ##G## that you can write out in terms of ##\Phi## or erfc. Then, as usual, you look for a solution of ##G'(\alpha) = 0## in your search for a minimum.

Thank you. In regard to this last part, I am not entirely sure how to work ##P(\bar{X} > \alpha | \mu = \mu_0)## out. Normally I'd proceed as ##P(\bar{X} > \alpha | \mu = \mu_0)=1-P(\bar{X} \leq \alpha | \mu = \mu_0)##. But to compute the conditional probability, shouldn't some joint distribution function be involved?
 
  • #10
GabrielN00 said:
Thank you. In regard to this last part, I am not entirely sure how to work ##P(\bar{X} > \alpha | \mu = \mu_0)## out. Normally I'd proceed as ##P(\bar{X} > \alpha | \mu = \mu_0)=1-P(\bar{X} \leq \alpha | \mu = \mu_0)##. But to compute the conditional probability, shouldn't some joint distribution function be involved?

No. The conditional probability ##P(A|\mu=\mu_0)## assumes that ##\mu = \mu_0## and so uses the distribution ##\text{Normal}(\mu_0, \sigma)##, with both mean and variance known. No joint distributions are involved.
 
  • #11
Ray Vickson said:
No. The conditional probability ##P(A|\mu=\mu_0)## assumes that ##\mu = \mu_0## and so uses the distribution ##\text{Normal}(\mu_0, \sigma)##, with both mean and variance known. No joint distributions are involved.

Would it be right to say ## P(X\leq \alpha | \mu=\mu_0) = \frac{f_X(\alpha)}{Normal(\mu_0,\sigma)}=\frac{[2/(\sigma \sqrt{2\pi})]e^{-\frac{-(x-\mu)}{2\pi^2}}}{(1/\sqrt{2\pi})e^{-\alpha^2/2}} ## ?
 
  • #12
GabrielN00 said:
Would it be right to say ## P(X\leq \alpha | \mu=\mu_0) = \frac{f_X(\alpha)}{Normal(\mu_0,\sigma)}=\frac{[2/(\sigma \sqrt{2\pi})]e^{-\frac{-(x-\mu)}{2\pi^2}}}{(1/\sqrt{2\pi})e^{-\alpha^2/2}} ## ?
No. That equation still has x in it. The x values must be integrated over (-∞,α). And you should not divide it by anything. And the μ in the numerator should be μ0. (Those are the mistakes that immediately jump out at me. There may be more.)
 
  • #13
GabrielN00 said:
Would it be right to say ## P(X\leq \alpha | \mu=\mu_0) = \frac{f_X(\alpha)}{Normal(\mu_0,\sigma)}=\frac{[2/(\sigma \sqrt{2\pi})]e^{-\frac{-(x-\mu)}{2\pi^2}}}{(1/\sqrt{2\pi})e^{-\alpha^2/2}} ## ?

No. In probability we define ##P(A|B) = P(A\, \& \, B)/P(B)##, so if we know ##P(A\, \& \, B)## and ##P(B)## we can compute ##P(A|B)##. However, that is not the usual way we deal with conditional probabilities. Most often we know ##P(A|B)## directly. If we also happen to know ##P(B)##, then we could calculate ##P(A\, \& \, B)##.

In this problem we know how to compute ##P(\bar{X} \leq \alpha|\mu = \mu_0)## directly because---as I already stated very clearly---we use ##N(\mu_0,\sigma)## with both mean and variance known.
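Concretely, it is just an ordinary normal probability; writing ##s## for the standard deviation of ##\bar{X}## (##\sigma/\sqrt{n}## for a sample of size ##n##),
$$P(\bar{X} \leq \alpha \mid \mu = \mu_0) = \Phi\!\left(\frac{\alpha-\mu_0}{s}\right),$$
computed from the distribution of ##\bar{X}## under ##H_0## alone.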
 
  • Like
Likes GabrielN00
  • #14
FactChecker said:
No. That equation still has x in it. The x values must be integrated over (-∞,α). And you should not divide it by anything. And the μ in the numerator should be μ0. (Those are the mistakes that immediately jump out at me. There may be more.)

Thank you. I'm very sorry this is taking so long, but thank you again for answering my messages.

Maybe it goes like this?

##P_{H_0}(R)+P_{H_1}(R)=E_1(\alpha)+E_2(\alpha)=P(X>\alpha | \mu=\mu_1)+ P(X\leq\alpha | \mu=\mu_0) =1 - P(X\leq\alpha | \mu=\mu_1)+P(X\leq\alpha | \mu=\mu_0) = 1 - \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_1)^2}{2\sigma^2}}dx + \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_0)^2}{2\sigma^2}}dx ##.

If I differentiate the integrals and set it to zero, the remaining equation is ## \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) - \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) = 0##

Then the equation to solve is ##e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} = e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} ## which happens only when ## \frac{(\alpha-\mu_0)^2}{2\sigma^2} = \frac{(\alpha-\mu_1)^2}{2\sigma^2} ##.

Then ##\alpha^2-2\alpha\mu_0+\mu_0^2 = \alpha^2-2\alpha\mu_1+\mu_1^2## and it follows that ##2\alpha(\mu_1 -\mu_0)=\mu_1^2-\mu_0^2##.

Then both the Type I and Type II errors are minimized when ##\alpha = \frac{\mu_1^2-\mu_0^2}{2(\mu_1-\mu_0)}##.
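As a quick numerical sanity check of that formula (just a sketch; SciPy and the specific numbers below are assumptions, not part of the problem), the sum of the error probabilities ##E_1 = P(\bar{X}>\alpha\mid\mu_0)## and ##E_2 = P(\bar{X}\leq\alpha\mid\mu_1)## is indeed smallest at the value above:

```python
# Sanity check: the error sum should be smallest at the derived cutoff.
# mu0, mu1 and the standard deviation "s" of the sample mean are illustrative.
import numpy as np
from scipy.stats import norm

mu0, mu1, s = 0.0, 1.0, 0.4
alpha_star = (mu1**2 - mu0**2) / (2 * (mu1 - mu0))  # formula derived above

def error_sum(a):
    return (1 - norm.cdf(a, mu0, s)) + norm.cdf(a, mu1, s)  # E1 + E2

grid = np.linspace(mu0 - 1, mu1 + 1, 2001)
print(alpha_star)                        # derived cutoff
print(grid[np.argmin(error_sum(grid))])  # numerical minimizer, essentially the same
```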
 
Last edited by a moderator:
  • Like
Likes FactChecker
  • #15
GabrielN00 said:
Thank you. I'm very sorry this is taking so long, but thank you again for answering my messages.

Maybe it goes like this?

##P_{H_0}(R)+P_{H_1}(R)=E_1(\alpha)+E_2(\alpha)=P(X\leq\alpha | \mu=\mu_0)+P(X>\alpha | \mu=\mu_1)=P(X\leq\alpha | \mu=\mu_0) + 1 - P(X\leq\alpha | \mu=\mu_1)=\int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_0)^2}{2\sigma^2}}dx + 1 - \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_1)^2}{2\sigma^2}}dx##.

If I differentiate the integrals and set it to zero, the remaining equation is ## \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) - \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) ##

I think you have the type I and type II errors backwards: the type I error is ##P(\bar{X} > \alpha| \mu=\mu_0)##, and that does not look like what you wrote. I don't think your final optimality equation will be much affected by this, but it is good to get things right before proceeding.

Your final equation should have an ##=0## in it. Then you can cancel out some things and be left with a solvable equation for which you can give a closed-form algebraic solution.
 
  • #16
Ray Vickson said:
I think you have the type I and type II errors backwards: the type I error is ##P(\bar{X} > \alpha| \mu=\mu_0)##, and that does not look like what you wrote. I don't think your final optimality equation will be much affected by this, but it is good to get things right before proceeding.

Your final equation should have an ##=0## in it. Then you can cancel out some things and be left with a solvable equation for which you can give a closed-form algebraic solution.
I will fix it now. I clicked Reply instead of Preview so it sent the post before I was done writing it :(
 
  • #17
GabrielN00 said:
Thank you. I'm very sorry this is taking so long, but thank you again for answering my messages.

Maybe it goes like this?

##P_{H_0}(R)+P_{H_1}(R)=E_1(\alpha)+E_2(\alpha)=P(X>\alpha | \mu=\mu_1)+ P(X\leq\alpha | \mu=\mu_0) =1 - P(X\leq\alpha | \mu=\mu_1)+P(X\leq\alpha | \mu=\mu_0) = 1 - \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_1)^2}{2\sigma^2}}dx + \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_0)^2}{2\sigma^2}}dx ##.

If I differentiate the integrals and set it to zero, the remaining equation is ## \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) - \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) = 0##

Then the equation to solve is ##e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} = e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} ## which happens only when ## \frac{(\alpha-\mu_0)^2}{2\sigma^2} = \frac{(\alpha-\mu_1)^2}{2\sigma^2} ##.

Then ##\alpha^2-2\alpha\mu_0+\mu_0^2 = \alpha^2-2\alpha\mu_1+\mu_1^2## and it follows that ##2\alpha(\mu_1 -\mu_0)=\mu_1^2-\mu_0^2##.

Then both the Type I and Type II errors are minimized when ##\alpha = \frac{\mu_1^2-\mu_0^2}{2(\mu_1-\mu_0)}##.

The answer looks a lot simpler if you recall that ##\mu_1^2 - \mu_0^2 = (\mu_1 - \mu_0) (\mu_1 + \mu_0)##.
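Cancelling the common factor ##\mu_1-\mu_0## leaves
$$\alpha = \frac{\mu_0+\mu_1}{2},$$
i.e. the midpoint between the two hypothesized means.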
 
  • Like
Likes GabrielN00 and FactChecker
  • #18
Ha! So after all that, the answer is what might have been guessed (although maybe not proven any easier):
To minimize the sum, place the start of the rejection region halfway between μ0 and μ1.

At the moment I don't have what it takes to figure it out, but there is probably a good intuitive "geometric" way to prove that.
 
  • #19
FactChecker said:
Ha! So after all that, the answer is what might have been guessed (although maybe not proven any easier):
To minimize the sum, place the start of the rejection region halfway between μ0 and μ1.

At the moment I don't have what it takes to figure it out, but there is probably a good intuitive "geometric" way to prove that.

The mid-point is equidistant between the regions ##\alpha \leq \mu_0## and ##\alpha \geq \mu_1##, for what that is worth.
 
  • Like
Likes GabrielN00
  • #20
In hindsight, now that the answer α = (μ0 + μ1)/2 has been found, it is easy to see. Since the variance is the same and only the means are different, the situation looks like the figure below when α is the midpoint value. It is clear that moving α down will increase the Type 1 error more than it decreases the Type 2 error, thus increasing the total error. Similarly, moving α up will increase the Type 2 error more than it decreases the Type 1 error, thus increasing the total error. So the midpoint value is the minimum.
[Attached figure: minimizeSumOfErrors.png]
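For anyone who wants to reproduce a picture like the attached one, here is a rough matplotlib sketch (all the numbers are illustrative assumptions, not from the thread): it draws the two densities of the sample mean, marks the midpoint cutoff, and shades the Type 1 tail (under H0, right of the cutoff) and the Type 2 tail (under H1, left of the cutoff).

```python
# Rough sketch of the figure: two normal densities for the sample mean,
# one centred at mu0 and one at mu1, with the cutoff at the midpoint.
# All numbers are illustrative.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

mu0, mu1, s = 0.0, 1.0, 0.4          # s = sigma / sqrt(n), made up here
cut = (mu0 + mu1) / 2                # midpoint cutoff
x = np.linspace(mu0 - 3 * s, mu1 + 3 * s, 500)

plt.plot(x, norm.pdf(x, mu0, s), label="under $H_0$ (mean $\\mu_0$)")
plt.plot(x, norm.pdf(x, mu1, s), label="under $H_1$ (mean $\\mu_1$)")
plt.axvline(cut, linestyle="--", label="cutoff (midpoint)")
# Type 1 error: right tail of the H0 density; Type 2 error: left tail of the H1 density.
plt.fill_between(x[x >= cut], norm.pdf(x[x >= cut], mu0, s), alpha=0.3)
plt.fill_between(x[x <= cut], norm.pdf(x[x <= cut], mu1, s), alpha=0.3)
plt.legend()
plt.show()
```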
 
  • Like
Likes StoneTemplePython and GabrielN00
  • #21
The unusual symmetry of this problem makes it possible to solve it geometrically. The analytical approach taken earlier is much more powerful for general minimization problems.
 
Last edited:

Related to Minimize the sum of Type I and Type II errors

What is the goal of minimizing the sum of Type I and Type II errors?

The goal of minimizing the sum of Type I and Type II errors is to strike a balance between rejecting a true null hypothesis (a Type I error) and failing to reject a false null hypothesis (a Type II error). This is important in scientific research to ensure that the results are accurate and reliable.

What is a Type I error?

A Type I error, also known as a false positive, occurs when a true null hypothesis is incorrectly rejected. This means that the researcher concludes there is a significant effect or relationship when in reality there is none.

What is a Type II error?

A Type II error, also known as a false negative, occurs when a false null hypothesis is incorrectly accepted. This means that the researcher fails to detect a significant effect or relationship when in reality there is one.

How do you minimize the sum of Type I and Type II errors?

To minimize the sum of Type I and Type II errors, researchers can adjust the significance level (alpha) or the power of the study. A lower significance level reduces the likelihood of Type I errors, while a higher power increases the likelihood of detecting true effects and reducing Type II errors.
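For the one-sided normal-mean test discussed in this thread, that trade-off can be seen in a short sketch (SciPy and the numbers are illustrative assumptions): lowering the significance level moves the cutoff up, which raises the Type II error.

```python
# Illustration: as the significance level shrinks, the cutoff moves up
# and the Type II error grows. Numbers are illustrative.
from scipy.stats import norm

mu0, mu1, s = 0.0, 1.0, 0.4   # s = std. dev. of the sample mean (made up)
for alpha_level in (0.10, 0.05, 0.01):
    cutoff = norm.ppf(1 - alpha_level, mu0, s)  # one-sided critical value under H0
    beta = norm.cdf(cutoff, mu1, s)             # Type II error under H1
    print(alpha_level, round(cutoff, 3), round(beta, 3))
```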

Why is it important to minimize the sum of Type I and Type II errors?

It is important to minimize the sum of Type I and Type II errors to ensure the validity and accuracy of research findings. If the sum of these errors is high, it may lead to incorrect conclusions and misleading results. Minimizing these errors helps to improve the reliability and credibility of scientific research.
