mathpariah
hey
Gonna get straight to the point. I need to establish a 95% confidence interval for the difference between two probabilities p_1 and p_2. They are the probabilities that a cabin hook will hold for a certain force (25 kN), one for each type of hook.
There are two samples: "the originals" with size n_1=107 and the "cheap pirated ones" with size n_2=92. y_i is the number of hooks in the respective sample which managed to successfully keep their acts together at 25 kN.
y_1=84 is an observation of Y_1, which is Bin(107, p_1) ≈ N(107*p_1, sqrt(107*p_1*q_1)) where q_1=1-p_1
and
y_2=12 is an observation of Y_2, which is Bin(92, p_2) ≈ N(92*p_2, sqrt(92*p_2*q_2)) where q_2=1-p_2
Now p_1 and p_2 can be estimated as follows:
^p_1 = y_1/107, which is an observation from ^P_1 ≈ N(p_1, sqrt((p_1*q_1)/107))
and ^p_2 = y_2/92, which is an observation from ^P_2 ≈ N(p_2, sqrt((p_2*q_2)/92))
which gives us the estimated probability difference:
^P_1-^P_2 ≈ N(p_1-p_2, sqrt((p_1*q_1/107)+(p_2*q_2/92)))
which means you get the variable:
(^P_1-^P_2-(p_1-p_2))/sqrt((^P_1*^Q_1/107)+(^P_2*^Q_2/92)) ≈ N(0,1)
Closing this stuff with z=z_0.975=1.96 from the normal dist. table gives me
INTERVAL_(p_1-p_2) = (^p_1-^p_2 ± 1.96*sqrt((^p_1*^q_1/107)+(^p_2*^q_2/92))) = (0.55, 0.76)
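To double check the arithmetic, here is a minimal Python sketch of the interval above. The function name `diff_proportion_ci` and its parameters are my own, purely for illustration, and it assumes the same normal approximation as the derivation. The quantile comes from the standard library's `NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(y1, n1, y2, n2, level=0.95):
    """Normal-approximation interval for p1 - p2 (hypothetical helper)."""
    p1_hat = y1 / n1
    p2_hat = y2 / n2
    # Two-sided interval: alpha/2 sits in each tail, so the quantile
    # needed is the one at 1 - alpha/2 (0.975 for a 95% interval).
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Estimated standard deviation of ^P_1 - ^P_2.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


low, high = diff_proportion_ci(84, 107, 12, 92)
print(f"({low:.2f}, {high:.2f})")  # prints (0.55, 0.76)
```

With y_1=84, n_1=107, y_2=12, n_2=92 this reproduces the (0.55, 0.76) stated above.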
This solution is supposedly correct and all I need is someone to help me understand the following:
1. When you subtract two stochastic variables you subtract the expected values from each other, which I get, but the standard deviation is different... it becomes the sqrt of the sum of the independent stochastic variables' standard deviations?
2. Why is the standard dev of the stochastic variable ^P_1 sqrt((p_1*q_1)/n_1)? Can't you just call it σ_1? Are they the same? And is that always the case? Is the probability of something happening, times 1 minus that probability, divided by the sample size, equal to σ^2?
3. Why do you use the 97.5% probability when the question originally stated 95%? I know you use 1-α/2 to get there, but I've never understood WHEN you can just go with 95% and when you have to use 97.5%. For F distributions it seems going with 95% is OK even with 2 samples.
4. Why do you use a Binomial distribution for this kind of problem?
Would be pretty much amazing if anyone could help out with any or all of these questions, I am a donkey when it comes to math stat. Thanks.
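For questions 1 and 2, a small simulation can at least check the formulas numerically. This is only a sketch: the "true" values p_1=0.8 and p_2=0.15 are assumptions picked for illustration (not known quantities), and `simulate_p_hat` is a hypothetical helper name. It compares the simulated spread of ^P_1 and of ^P_1-^P_2 with sqrt(p*q/n) and with the square root of the sum of the two variances.

```python
import random
from math import sqrt
from statistics import pstdev

random.seed(0)  # fixed seed so the run is repeatable


def simulate_p_hat(n, p, reps):
    """Draw `reps` sample proportions, each from n independent trials."""
    return [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]


# Assumed "true" probabilities, for illustration only.
n1, p1 = 107, 0.80
n2, p2 = 92, 0.15
reps = 5000

p1_hats = simulate_p_hat(n1, p1, reps)
p2_hats = simulate_p_hat(n2, p2, reps)
diffs = [a - b for a, b in zip(p1_hats, p2_hats)]

# Standard deviations according to the formulas in the post.
sd1_theory = sqrt(p1 * (1 - p1) / n1)
sd2_theory = sqrt(p2 * (1 - p2) / n2)
# For independent variables the variances add, not the standard deviations.
sd_diff_theory = sqrt(sd1_theory**2 + sd2_theory**2)

print("sd of ^P_1:        simulated", pstdev(p1_hats), "formula", sd1_theory)
print("sd of ^P_1 - ^P_2: simulated", pstdev(diffs), "formula", sd_diff_theory)
print("sum of the two sds (for comparison):", sd1_theory + sd2_theory)
```

The simulated values should land close to the formula values, while the plain sum of the two standard deviations comes out noticeably larger than the spread of the difference.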