Comparing two averages with different group sizes

Mohammad · Aug 17, 2006

Hi everyone,

I have recently been intrigued by a seemingly simple problem: how to compare the averages of two groups of different sizes.

For example: Suppose you have a driver A who wins 100 out of 200 races, and a driver B who wins 1 out of 2 races. It is clear that although the average is the same, driver A's achievement is less likely to occur (so it can be considered more valuable?).

I worked out a solution based on the Binomial distribution, using each driver's MLE of the win probability (wins divided by races) as the parameter.

Pr(X = 100|1/2) = 0.0563 (N = 200)
Pr(X = 1|1/2) = 0.5 (N = 2)

The result matches my expectation, as it indicates that the first event is less likely to occur. The problem, however, comes when I have a situation like this:

Driver A wins 65 out of 161 races.
Driver B wins 68 out of 244 races.

By evaluating the probabilities in the same way I got:

Pr(X = 65|65/161) = 0.0640
Pr(X = 68|68/244) = 0.0569
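The four probabilities quoted in this post can be reproduced with a short sketch. It assumes the plain Binomial probability mass function with p set to each driver's own win rate; the function name `binom_pmf` and the `records` list are only illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k wins in n races when each race is won with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# (wins, races) pairs taken from the post; p is each driver's own MLE, wins / races.
records = [(100, 200), (1, 2), (65, 161), (68, 244)]

for wins, races in records:
    p_hat = wins / races
    prob = binom_pmf(wins, races, p_hat)
    print(f"Pr(X = {wins} | p = {wins}/{races}, N = {races}) = {prob:.4f}")
```

Each value is the probability of the single most likely outcome under that driver's own estimated rate, so it mostly reflects how spread out the distribution is for that N and p rather than which driver performed better.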

Intuitively, I reject this result because it is clear that driver A did a better job (because both drivers won almost the same number of races). I know it is probably because of the parameter I am using, but I don't know how to fix it.

Any thoughts?

EnumaElish · Nov 3, 2006

See the t-test notes at http://geography.uoregon.edu/GeogR/topics/ttest.htm. Different group sizes are factored into the problem through the weighting of the variances across the two groups.
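As a sketch of what that weighting looks like, here is the standard pooled-variance form of the two-sample t statistic. The symbols are assumptions introduced for illustration: group means $\bar{x}_1, \bar{x}_2$, sample variances $s_1^2, s_2^2$, and group sizes $n_1, n_2$. The linked page may use different notation or a different variant of the test.

```latex
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
```

The larger group contributes more to $s_p^2$, and the factor $\sqrt{1/n_1 + 1/n_2}$ shrinks the standard error as either group grows.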

tacman · Nov 10, 2006

Hello, thank you for sharing your thoughts and calculations on this interesting problem. Comparing averages of two groups with different sizes can definitely be tricky, and I think you are on the right track with using the Binomial distribution and maximum likelihood estimation (MLE) to solve it. However, I believe the issue with your second example lies in the choice of parameter for the Binomial distribution.

In your first example, the parameter was 1/2 for both drivers, so the two probabilities were computed under the same win rate and differed only because of the number of races. However, in the second example, you used each driver's own winning rate (65/161 and 68/244) as the parameter. This might not be the best approach because each probability is then measured against a different baseline: it tells you how likely a driver's record is under his own estimated rate, not how the two drivers compare with each other.

One way to potentially solve this problem is to evaluate both drivers against the same win probability (a common baseline, such as the pooled rate over all 405 races) while keeping each driver's own number of races (161 and 244) as N. This would give you the probability of each driver's record under one shared assumption about skill, which might give you a better comparison between the two drivers in terms of their overall success rate.

Another approach could be to use the Normal distribution as an approximation to compare the two winning rates directly, for example with a two-proportion z-test. The standard error in such a test depends on both group sizes, so it could potentially give a more accurate comparison.
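One concrete form of the Normal-distribution idea is a pooled two-proportion z-test. The sketch below is a minimal version under stated assumptions: races are treated as independent, both samples are large enough for the Normal approximation, and the function name `two_proportion_z` is made up for illustration.

```python
from math import sqrt, erfc

def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    # Win rate under the null hypothesis that both drivers share one rate.
    pooled = (wins_a + wins_b) / (n_a + n_b)
    # Standard error of the difference; both group sizes enter here.
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard Normal beyond |z|.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Second example from the thread: 65 of 161 versus 68 of 244.
z, p_value = two_proportion_z(65, 161, 68, 244)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A positive z means driver A's observed rate is higher. For very small samples, such as 1 win out of 2 races, the Normal approximation is poor and an exact method would be a safer choice.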

Overall, I think you are on the right track and it's great that you are thinking critically about how to compare averages of different group sizes. Hopefully, these suggestions will help you refine your solution. Good luck!
