Minimize the sum of Type I and Type II errors

In summary, the problem is to determine the rejection region R that minimizes the sum of the Type I and Type II error probabilities for the given hypotheses and sample. This involves finding the critical value alpha by minimizing that sum directly, using the known CDF of the normal distribution and the error function rather than a table. Along the way a few difficulties come up, such as the exponential in the derivative never being zero on its own and uncertainty about the meaning of sigma in the error function.
  • #1
GabrielN00

Homework Statement


Given [tex]X_1,\dots,X_n[/tex] a simple random sample with normal variables ([tex]\mu, \sigma^2[/tex]). We assume [tex]\mu[/tex] is known but [tex]\sigma^2[/tex] is unknown.

The hypothesis is
[tex]
\begin{cases}
H_0: & \mu=\mu_0 \\
H_1: & \mu=\mu_1 > \mu_0
\end{cases}
[/tex]

Determine the rejection region [tex] R[/tex] in order to minimize the [tex] P_{H_0}(R)+P_{H_1}(R^c)[/tex] .

Homework Equations



The Attempt at a Solution

I'm having problems both to understand the rejection regions and to find the minimum of the sum.

The "plan" would be to consider [tex]z=\displaystyle\frac{\bar{X}-\mu}{(s/\sqrt{n})}[/tex]

I could proceed to do a one-tail test and find the minimum, but the very first problem is that my [tex]\alpha[/tex] value is unknown, so I cannot look it up in a table.

I'm clueless at even how to get a usable expression for each type error, since everything I am able to find suggest the use of a table, but the problem clearly doesn't make use of one.
 
  • #2
I assume that α is the lower value of the region R. In that case, the problem is to determine α as a function of μ0 and μ1 (and the sample variance) by minimizing the sum of the probabilities. That is a calculus problem that requires you to use the equations of the CDF rather than a table.

The CDF of a normal distribution is known. See the CDF equation in https://en.wikipedia.org/wiki/Normal_distribution and the erf function in https://en.wikipedia.org/wiki/Error_function
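Just to see what that objective looks like numerically, here is a minimal sketch (SciPy and the specific numbers are assumptions for illustration, not part of the problem): the total error is written directly in terms of the normal CDF and minimized over the cutoff.

```python
# Sketch: minimize P_H0(reject) + P_H1(accept) over the cutoff "a".
# Illustrative values only -- mu0, mu1, sigma and n are made up.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

mu0, mu1, sigma, n = 0.0, 1.0, 2.0, 25
s = sigma / np.sqrt(n)  # standard deviation of the sample mean Xbar

def total_error(a):
    type1 = 1.0 - norm.cdf(a, loc=mu0, scale=s)  # P(Xbar > a | mu = mu0)
    type2 = norm.cdf(a, loc=mu1, scale=s)        # P(Xbar <= a | mu = mu1)
    return type1 + type2

res = minimize_scalar(total_error, bounds=(mu0, mu1), method="bounded")
print(res.x, res.fun)  # cutoff that minimizes the sum, and the minimal sum
```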
 
  • #3
GabrielN00 said:

Homework Statement


Given [tex]X_1,\dots,X_n[/tex] a simple random sample with normal variables ([tex]\mu, \sigma^2[/tex]). We assume [tex]\mu[/tex] is known but [tex]\sigma^2[/tex] is unknown.

The hypothesis is
[tex]
\begin{cases}
H_0: & \mu=\mu_0 \\
H_1: & \mu=\mu_1 > \mu_0
\end{cases}
[/tex]

Determine the rejection region [tex] R[/tex] in order to minimize the [tex] P_{H_0}(R)+P_{H_1}(R^c)[/tex] .

Homework Equations



The Attempt at a Solution

I'm having problems both to understand the rejection regions and to find the minimum of the sum.

The "plan" would be to consider [tex]z=\displaystyle\frac{\bar{X}-\mu}{(s/\sqrt{n})}[/tex]

I could proceed to do a one-tail test and find the minimum, but the very first problem is that my [tex]\alpha[/tex] value is unknown, so I cannot look it up in a table.

I'm clueless at even how to get a usable expression for each type error, since everything I am able to find suggest the use of a table, but the problem clearly doesn't make use of one.

The question makes no sense. You say that ##\mu## is known, and then you say the hypotheses involve ##\mu##!

It makes sense to test hypotheses about ##\sigma## when ##\mu## is known, or to test hypotheses about ##\mu## when ##\sigma## is known (or even to test hypotheses about ##\mu## or ##\sigma## when neither of these is known).
 
  • #4
FactChecker said:
I assume that α is the lower value of the region R. In that case, the problem is to determine α as a function of μ0 and μ1 (and the sample variance) by minimizing the sum of the probabilities. That is a calculus problem that requires you to use the equations of the CDF rather than a table.

The CDF of a normal distribution is known. See the CDF equation in https://en.wikipedia.org/wiki/Normal_distribution and the erf function in https://en.wikipedia.org/wiki/Error_function

I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.
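To make that recipe explicit (a sketch of the bookkeeping only; here ##s## just stands for the standard deviation of whatever statistic gets compared to the cutoff, and ##\phi## is the standard normal pdf):
$$\frac{d}{d\alpha}\,\Phi\!\left(\frac{\alpha-\mu}{s}\right) = \frac{1}{s}\,\phi\!\left(\frac{\alpha-\mu}{s}\right),$$
so differentiating the sum of the two error terms only ever produces Gaussian pdfs, and no table or closed form for ##\Phi## is needed.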
 
  • Like
Likes FactChecker
  • #5
Ray Vickson said:
The question makes no sense. You say that ##\mu## is known, and then you say the hypotheses involve ##\mu##!

It makes sense to test hypotheses about ##\sigma## when ##\mu## is known, or to test hypotheses about ##\mu## when ##\sigma## is known (or even to test hypotheses about ##\mu## or ##\sigma## when neither of these is known).
That's what the problem says, but given what you point out I think it might have been a typo. It should read that ##\mu## is unknown while ##\sigma## is known.
I'm not sure how I can edit the main post. I don't see an edit button.

StoneTemplePython said:
I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.

##\alpha## is the critical value, the value we cross when we enter the rejection region.

Alright, let's consider the following: using the error function above I have that the error function with mean 0 and variance ##\sigma ## is ##\frac{1}{2\pi}\int_0^{\alpha/(\sigma\sqrt{2})} e^{-t^2}dt##.

This error gives the probability of falling in ##(-\alpha,\alpha)##, but I am interested in the rejection region, that is ##(-\infty, \alpha)\cup(\alpha, +\infty)##. Therefore, I think I should consider the complementary error function ##erfc(\alpha) = 1-\frac{1}{2\pi}\int_0^{\alpha/(\sigma\sqrt{2})} e^{-t^2}dt = \frac{1}{2\pi}\int_{\alpha/(\sigma\sqrt{2})}^{\infty} e^{-t^2}dt##

Now I could differentiate and get that ##\frac{d}{d\alpha}\mathrm{erfc}(\alpha) = - \frac{1}{2\pi}e^{-\alpha^2/(2\sigma^2)}##. I should set it to ##0## and find ##\alpha##, to "solve" the problem.

There are three issues here:
(1) ##e^{-\alpha^2/(2\sigma^2)}## will never be zero for any ##\alpha##.
(2) I haven't actually brought the hypothesis testing into this anywhere.
(3) It is not clear what the ##\sigma## in the error function is. The Wikipedia entry linked above says that errors generally have mean zero, but it is possible for the error to have a variance. Is the ##\sigma## in the normal distribution the very same ##\sigma## as in the error function?
 
Last edited by a moderator:
  • #6
StoneTemplePython said:
I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.
Oh! Good point!
 
  • #7
StoneTemplePython said:
I'm still not sure what ##\alpha## is, but do you actually need the CDF here in closed-esque form?

Typically the way these problems are set up is that you have a function you want to minimize: the usual recipe is to differentiate once and set the result equal to zero. Once you differentiate, you can use the simple pdf for the Gaussian, and so you only ever work abstractly with the CDF of the Gaussian, denoted by "CDF" or ##\Phi## or something like that.

I can't really see how. I guess you mean ##f_X=\frac{1}{\sigma\sqrt{2\pi}e^{-\frac{(x-\mu)^2}{2\sigma^2}}} ## but how can it be used to find the minimal sum of the errors?

Working solely with the error functions I thought I should consider ##erf(\alpha) ## to calculate ##P_{H_0}(R)## and ##erfc(\alpha)## to calculate ##P_{H_1}(R)##.
 
  • #8
GabrielN00 said:
I can't really see how. I guess you mean ##f_X=\frac{1}{\sigma\sqrt{2\pi}e^{-\frac{(x-\mu)^2}{2\sigma^2}}} ## but how can it be used to find the minimal sum of the errors?

Working solely with the error functions I thought I should consider ##erf(\alpha) ## to calculate ##P_{H_0}(R)## and ##erfc(\alpha)## to calculate ##P_{H_1}(R)##.

No, the correct form is
$$f_X(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-(x-\mu)^2/(2 \sigma^2) }$$.
The density function of the standard normal (mean = 0, s.d.= 1) is usually denoted by ##\phi## and its cumulative distribution by ##\Phi##:
$$\phi(t) = \frac{1}{\sqrt{2 \pi}} e^{-t^2/2}, \;\; \Phi(z) = \int_{-\infty}^z \phi(t) \, dt. $$
The relationship between ##\Phi## and ##\text{erf}## is
$$\Phi(z) =\frac{1}{2} + \frac{1}{2} \text{erf} \left( \frac{z}{\sqrt{2}} \right), $$
provided that your definition of "erf" is ##\text{erf}(z) = (2/\sqrt{\pi}) \int_0^z e^{-t^2} \, dt##.

Anyway, you want to test a value of ##\mu_0## (H0) against a larger value ##\mu_1## (H1), so you will accept the null hypothesis provided that the sample mean ##\bar{X}## is not too large. So you accept H0 if ##\bar{X} \leq \alpha## and reject H0 if ##\bar{X} > \alpha##. The type I error is ##E_1 = P(\bar{X} > \alpha | \mu = \mu_0)##, and you can work this out in terms of ##\Phi## (or erfc), ##\alpha##, ##\mu_0## and ##\sigma##. The type II error is ##E_2 = P(\bar{X} \leq \alpha | \mu = \mu_1)##, and you can work this out in terms of ##\Phi##, ##\alpha##, ##\mu_1## and ##\sigma##. Altogether, you get ##E_1+E_2 = G(\alpha)## for some function ##G## that you can write out in terms of ##\Phi## or erfc. Then, as usual, you look for a solution of ##G'(\alpha) = 0## in your search for a minimum.
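For concreteness (a sketch of the bookkeeping, writing ##s=\sigma/\sqrt{n}## for the standard deviation of ##\bar{X}##), this reads
$$G(\alpha) = E_1 + E_2 = 1 - \Phi\!\left(\frac{\alpha-\mu_0}{s}\right) + \Phi\!\left(\frac{\alpha-\mu_1}{s}\right),$$
and ##G'(\alpha)=0## is the equation to solve for ##\alpha##.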
 
  • Like
Likes GabrielN00
  • #9
Ray Vickson said:
Anyway, you want to test a value of ##\mu_0## (H0) against a larger value ##\mu_1## (H1), so you will accept the null hypothesis provided that the sample mean ##\bar{X}## is not too large. So you accept H0 if ##\bar{X} \leq \alpha## and reject H0 if ##\bar{X} > \alpha##. The type I error is ##E_1 = P(\bar{X} > \alpha | \mu = \mu_0)##, and you can work this out in terms of ##\Phi## (or erfc), ##\alpha##, ##\mu_0## and ##\sigma##. The type II error is ##E_2 = P(\bar{X} \leq \alpha | \mu = \mu_1)##, and you can work this out in terms of ##\Phi##, ##\alpha##, ##\mu_1## and ##\sigma##. Altogether, you get ##E_1+E_2 = G(\alpha)## for some function ##G## that you can write out in terms of ##\Phi## or erfc. Then, as usual, you look for a solution of ##G'(\alpha) = 0## in your search for a minimum.

Thank you. In regard to this last part, I am not entirely sure how to work ##P(\bar{X} > \alpha | \mu = \mu_0)## out. Normally I'd proceed as ##P(\bar{X} > \alpha | \mu = \mu_0)=1-P(\bar{X} \leq \alpha | \mu = \mu_0)##. But to compute the conditional probability, shouldn't some joint distribution function be involved?
 
  • #10
GabrielN00 said:
Thank you. In regard to this last part, I am not entirely sure how to work ##P(\bar{X} > \alpha | \mu = \mu_0)## out. Normally I'd proceed as ##P(\bar{X} > \alpha | \mu = \mu_0)=1-P(\bar{X} \leq \alpha | \mu = \mu_0)##. But to compute the conditional probability, shouldn't some joint distribution function be involved?

No. The conditional probability ##P(A|\mu=\mu_0)## assumes that ##\mu = \mu_0## and so uses the distribution ##\text{Normal}(\mu_0, \sigma)##, with both mean and variance known. No joint distributions are involved.
 
  • #11
Ray Vickson said:
No. The conditional probability ##P(A|\mu=\mu_0)## assumes that ##\mu = \mu_0## and so uses the distribution ##\text{Normal}(\mu_0, \sigma)##, with both mean and variance known. No joint distributions are involved.

Would it be right to say ## P(X\leq \alpha | \mu=\mu_0) = \frac{f_X(\alpha)}{Normal(\mu_0,\sigma)}=\frac{[2/(\sigma \sqrt{2\pi})]e^{-\frac{-(x-\mu)}{2\pi^2}}}{(1/\sqrt{2\pi})e^{-\alpha^2/2}} ## ?
 
  • #12
GabrielN00 said:
Would it be right to say ## P(X\leq \alpha | \mu=\mu_0) = \frac{f_X(\alpha)}{Normal(\mu_0,\sigma)}=\frac{[2/(\sigma \sqrt{2\pi})]e^{-\frac{-(x-\mu)}{2\pi^2}}}{(1/\sqrt{2\pi})e^{-\alpha^2/2}} ## ?
No. That equation still has x in it. The x values must be integrated over (-∞,α). And you should not divide it by anything. And the μ in the numerator should be μ0. (Those are the mistakes that immediately jump out at me. There may be more.)
 
  • #13
GabrielN00 said:
Would it be right to say ## P(X\leq \alpha | \mu=\mu_0) = \frac{f_X(\alpha)}{Normal(\mu_0,\sigma)}=\frac{[2/(\sigma \sqrt{2\pi})]e^{-\frac{-(x-\mu)}{2\pi^2}}}{(1/\sqrt{2\pi})e^{-\alpha^2/2}} ## ?

No. In probability we define ##P(A|B) = P(A\, \& \, B)/P(B)##, so if we know ##P(A\, \& \, B)## and ##P(B)## we can compute ##P(A|B)##. However, that is not the usual way we deal with conditional probabilities. Most often we know ##P(A|B)## directly. If we also happen to know ##P(B)##, then we could calculate ##P(A\, \& \, B)##.

In this problem we know how to compute ##P(\bar{X} \leq \alpha|\mu = \mu_0)## directly because---as I already stated very clearly---we use ##N(\mu_0,\sigma)## with both mean and variance known.
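Concretely, it is just an ordinary normal probability; writing ##s## for the standard deviation of ##\bar{X}## (##\sigma/\sqrt{n}## for a sample of size ##n##),
$$P(\bar{X} \leq \alpha \mid \mu = \mu_0) = \Phi\!\left(\frac{\alpha-\mu_0}{s}\right),$$
computed from the distribution of ##\bar{X}## under ##H_0## alone.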
 
  • Like
Likes GabrielN00
  • #14
FactChecker said:
No. That equation still has x in it. The x values must be integrated over (-∞,α). And you should not divide it by anything. And the μ in the numerator should be μ0. (Those are the mistakes that immediately jump out at me. There may be more.)

Thank you. I'm very sorry this is taking so long, but thank you again for answering my messages.

Maybe it goes like this?

##P_{H_0}(R)+P_{H_1}(R)=E_1(\alpha)+E_2(\alpha)=P(X>\alpha | \mu=\mu_1)+ P(X\leq\alpha | \mu=\mu_0) =1 - P(X\leq\alpha | \mu=\mu_1)+P(X\leq\alpha | \mu=\mu_0) = 1 - \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_1)^2}{2\sigma^2}}dx + \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_0)^2}{2\sigma^2}}dx ##.

If I differentiate the integrals and set it to zero, the remaining equation is ## \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) - \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) = 0##

Then the equation to solve is ##e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} = e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} ## which happens only when ## \frac{(\alpha-\mu_0)^2}{2\sigma^2} = \frac{(\alpha-\mu_1)^2}{2\sigma^2} ##.

Then ##\alpha^2-2\alpha\mu_0+\mu_0^2 = \alpha^2-2\alpha\mu_1+\mu_1^2## and it follows that ##2\alpha(\mu_1 -\mu_0)=\mu_1^2-\mu_0^2##.

Then both the Type I and Type II errors are minimized when ##\alpha = \frac{\mu_1^2-\mu_0^2}{2(\mu_1-\mu_0)}##.
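As a quick numerical sanity check of that formula (just a sketch; SciPy and the specific numbers below are assumptions, not part of the problem), the sum of the error probabilities ##E_1 = P(\bar{X}>\alpha\mid\mu_0)## and ##E_2 = P(\bar{X}\leq\alpha\mid\mu_1)## is indeed smallest at the value above:

```python
# Sanity check: the error sum should be smallest at the derived cutoff.
# mu0, mu1 and the standard deviation "s" of the sample mean are illustrative.
import numpy as np
from scipy.stats import norm

mu0, mu1, s = 0.0, 1.0, 0.4
alpha_star = (mu1**2 - mu0**2) / (2 * (mu1 - mu0))  # formula derived above

def error_sum(a):
    return (1 - norm.cdf(a, mu0, s)) + norm.cdf(a, mu1, s)  # E1 + E2

grid = np.linspace(mu0 - 1, mu1 + 1, 2001)
print(alpha_star)                        # derived cutoff
print(grid[np.argmin(error_sum(grid))])  # numerical minimizer, essentially the same
```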
 
Last edited by a moderator:
  • Like
Likes FactChecker
  • #15
GabrielN00 said:
Thank you. I'm very sorry this is taking so long, but thank you again for answering my messages.

Maybe it goes like this?

##P_{H_0}(R)+P_{H_1}(R)=E_1(\alpha)+E_2(\alpha)=P(X\leq\alpha | \mu=\mu_0)+P(X>\alpha | \mu=\mu_1)=P(X\leq\alpha | \mu=\mu_0) + 1 - P(X\leq\alpha | \mu=\mu_1)=\int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_0)^2}{2\sigma^2}}dx + 1 - \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_1)^2}{2\sigma^2}}dx##.

If I differentiate the integrals and set it to zero, the remaining equation is ## \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) - \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) ##

I think you have the type I and type II errors backwards: the type I error is ##P(\bar{X} > \alpha| \mu=\mu_0)##, and that does not look like what you wrote. I don't think your final optimality equation will be much affected by this, but it is good to get things right before proceeding.

Your final equation should have an ##=0## in it. Then you can cancel out some things and be left with a solvable equation for which you can give a closed-form algebraic solution.
 
  • #16
Ray Vickson said:
I think you have the type I and type II errors backwards: the type I error is ##P(\bar{X} > \alpha| \mu=\mu_0)##, and that does not look like what you wrote. I don't think your final optimality equation will be much affected by this, but it is good to get things right before proceeding.

Your final equation should have an ##=0## in it. Then you can cancel out some things and be left with a solvable equation for which you can give a closed-form algebraic solution.
I will fix it now. I clicked Reply instead of Preview so it sent the post before I was done writing it :(
 
  • #17
GabrielN00 said:
Thank you. I'm very sorry this is taking so long, but thank you again for answering my messages.

Maybe it goes like this?

##P_{H_0}(R)+P_{H_1}(R)=E_1(\alpha)+E_2(\alpha)=P(X>\alpha | \mu=\mu_1)+ P(X\leq\alpha | \mu=\mu_0) =1 - P(X\leq\alpha | \mu=\mu_1)+P(X\leq\alpha | \mu=\mu_0) = 1 - \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_1)^2}{2\sigma^2}}dx + \int_{-\infty}^\alpha \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(x-\mu_0)^2}{2\sigma^2}}dx ##.

If I differentiate the integrals and set it to zero, the remaining equation is ## \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) - \left( \frac{1}{\sigma \sqrt{2\pi}}e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} - \frac{1}{\sigma \sqrt{2\pi}} \right) = 0##

Then the equation to solve is ##e^{-\frac{(\alpha-\mu_0)^2}{2\sigma^2}} = e^{-\frac{(\alpha-\mu_1)^2}{2\sigma^2}} ## which happens only when ## \frac{(\alpha-\mu_0)^2}{2\sigma^2} = \frac{(\alpha-\mu_1)^2}{2\sigma^2} ##.

Then ##\alpha^2-2\alpha\mu_0+\mu_0^2 = \alpha^2-2\alpha\mu_1+\mu_1^2## and it follows that ##2\alpha(\mu_1 -\mu_0)=\mu_1^2-\mu_0^2##.

Then both the Type I and Type II errors are minimized when ##\alpha = \frac{\mu_1^2-\mu_0^2}{2(\mu_1-\mu_0)}##.

The answer looks a lot simpler if you recall that ##\mu_1^2 - \mu_0^2 = (\mu_1 - \mu_0) (\mu_1 + \mu_0)##.
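Cancelling the common factor ##\mu_1-\mu_0## leaves
$$\alpha = \frac{\mu_0+\mu_1}{2},$$
i.e. the midpoint between the two hypothesized means.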
 
  • Like
Likes GabrielN00 and FactChecker
  • #18
Ha! So after all that, the answer is what might have been guessed (although maybe not proven any easier):
To minimize the sum, place the start of the rejection region halfway between μ0 and μ1.

At the moment I don't have what it takes to figure it out, but there is probably a good intuitive "geometric" way to prove that.
 
  • #19
FactChecker said:
Ha! So after all that, the answer is what might have been guessed (although maybe not proven any easier):
To minimize the sum, place the start of the rejection region halfway between μ0 and μ1.

At the moment I don't have what it takes to figure it out, but there is probably a good intuitive "geometric" way to prove that.

The mid-point is equidistant between the regions ##\alpha \leq \mu_0## and ##\alpha \geq \mu_1##, for what that is worth.
 
  • Like
Likes GabrielN00
  • #20
In hindsight, now that the answer α = (μ0 + μ1)/2 has been found, it is easy to see. Since the variance is the same and only the means are different, the situation looks like the figure below when α is the midpoint value. It is clear that moving α down will increase the Type 1 error more than it decreases the Type 2 error, thus increasing the total error. Similarly, moving α up will increase the Type 2 error more than it decreases the Type 1 error, thus increasing the total error. So the midpoint value is the minimum.
[Attached figure: minimizeSumOfErrors.png]
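For anyone who wants to reproduce a picture like the attached one, here is a rough matplotlib sketch (all the numbers are illustrative assumptions, not from the thread): it draws the two densities of the sample mean, marks the midpoint cutoff, and shades the Type 1 tail (under H0, right of the cutoff) and the Type 2 tail (under H1, left of the cutoff).

```python
# Rough sketch of the figure: two normal densities for the sample mean,
# one centred at mu0 and one at mu1, with the cutoff at the midpoint.
# All numbers are illustrative.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

mu0, mu1, s = 0.0, 1.0, 0.4          # s = sigma / sqrt(n), made up here
cut = (mu0 + mu1) / 2                # midpoint cutoff
x = np.linspace(mu0 - 3 * s, mu1 + 3 * s, 500)

plt.plot(x, norm.pdf(x, mu0, s), label="under $H_0$ (mean $\\mu_0$)")
plt.plot(x, norm.pdf(x, mu1, s), label="under $H_1$ (mean $\\mu_1$)")
plt.axvline(cut, linestyle="--", label="cutoff (midpoint)")
# Type 1 error: right tail of the H0 density; Type 2 error: left tail of the H1 density.
plt.fill_between(x[x >= cut], norm.pdf(x[x >= cut], mu0, s), alpha=0.3)
plt.fill_between(x[x <= cut], norm.pdf(x[x <= cut], mu1, s), alpha=0.3)
plt.legend()
plt.show()
```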
 
  • Like
Likes StoneTemplePython and GabrielN00
  • #21
The unusual symmetry of this problem makes it possible to solve it geometrically. The analytical approach taken earlier is much more powerful for general minimization problems.
 
Last edited:

Related to Minimize the sum of Type I and Type II errors

What is the goal of minimizing the sum of Type I and Type II errors?

The goal of minimizing the sum of Type I and Type II errors is to strike a balance between rejecting a true null hypothesis (a Type I error) and failing to reject a false null hypothesis (a Type II error). This is important in scientific research to ensure that the results are accurate and reliable.

What is a Type I error?

A Type I error, also known as a false positive, occurs when a true null hypothesis is incorrectly rejected. This means that the researcher concludes there is a significant effect or relationship when in reality there is none.

What is a Type II error?

A Type II error, also known as a false negative, occurs when a false null hypothesis is incorrectly accepted. This means that the researcher fails to detect a significant effect or relationship when in reality there is one.

How do you minimize the sum of Type I and Type II errors?

To minimize the sum of Type I and Type II errors, researchers can adjust the significance level (alpha) or the power of the study. A lower significance level reduces the likelihood of Type I errors, while a higher power increases the likelihood of detecting true effects and reducing Type II errors.
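For the one-sided normal-mean test discussed in this thread, that trade-off can be seen in a short sketch (SciPy and the numbers are illustrative assumptions): lowering the significance level moves the cutoff up, which raises the Type II error.

```python
# Illustration: as the significance level shrinks, the cutoff moves up
# and the Type II error grows. Numbers are illustrative.
from scipy.stats import norm

mu0, mu1, s = 0.0, 1.0, 0.4   # s = std. dev. of the sample mean (made up)
for alpha_level in (0.10, 0.05, 0.01):
    cutoff = norm.ppf(1 - alpha_level, mu0, s)  # one-sided critical value under H0
    beta = norm.cdf(cutoff, mu1, s)             # Type II error under H1
    print(alpha_level, round(cutoff, 3), round(beta, 3))
```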

Why is it important to minimize the sum of Type I and Type II errors?

It is important to minimize the sum of Type I and Type II errors to ensure the validity and accuracy of research findings. If the sum of these errors is high, it may lead to incorrect conclusions and misleading results. Minimizing these errors helps to improve the reliability and credibility of scientific research.
