Optimal linear combination of independent estimates for minimizing variance

  • Thread starter Andre
  • Tags
    Statistics
In summary, the thread discusses estimating the age of an event from several independent records obtained with different methods. The original poster treats each record as a normal distribution, multiplies the distributions point by point, and arrives at 13218 ± 14 years. The poster also asks whether to tell the author of one record that their method makes its result an outlier. A reply poses a related problem: find the weights that minimize the variance of a linear combination of two unbiased estimators of a random variable.
  • #1
Andre
edit done

I posted this here in an attempt to follow the rules for independent study. My knowledge of statistics is very rudimentary, so I would like to know whether my approach makes any sense or not.

Homework Statement



The question is to determine the age of a certain event from a series of binomially/normally distributed datasets, all looking for the same event age but with different methods.

Record 1 was counted 10 times by different people, giving an average of 13200 years with σ = 20

Record 2 is given as 12985 certain years plus 150 uncertain layers that may or may not be years. The authors split the difference and conclude: 13260 years with an absolute error of 75 years

Record 3 is reported as 13215 counted years with a 1% error

Record 4 is calculated, etc., giving a result of 13195 years with σ = 35

So what would be a realistic average value with realistic σ?

Homework Equations



Normal distribution.

The Attempt at a Solution



I wondered whether it would work to treat all record sets as normal distributions and then multiply all four of them at each data point. To get σ values for records 2 and 3, I used the absolute error as the 3σ range, giving 25 and 44 years respectively.

Then I created this spreadsheet, using 5-year intervals, which is ample resolution in this field.

https://dl.dropboxusercontent.com/u/22026080/numbers-crunching.xlsx

I just multiplied the values of the four records at each data point (column I) and then normalized the result so that it sums to 1 under the graph (column F). Columns J-K-L are just a help to find the average and σ, which turn out to be 13220 ± 20.
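The spreadsheet procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reproduction of the linked spreadsheet: the grid range (13000 to 13500), the helper `normal_pdf`, and all variable names are made up here, and the σ values for records 2 and 3 follow the post's choice of reading the quoted absolute error as a 3σ range.

```python
import math

# Record means (years) and sigmas as stated in the post.
# Records 2 and 3: the quoted absolute error is read as a 3-sigma range.
means = [13200.0, 13260.0, 13215.0, 13195.0]
sigmas = [20.0, 75.0 / 3.0, 0.01 * 13215.0 / 3.0, 35.0]


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# 5-year grid; the range is an assumption, chosen wide enough for all records.
grid = [13000.0 + 5.0 * k for k in range(101)]

# Multiply the four densities at every grid point.
product = []
for x in grid:
    p = 1.0
    for mu, s in zip(means, sigmas):
        p *= normal_pdf(x, mu, s)
    product.append(p)

# Normalize so the grid values sum to 1, then take mean and std dev.
total = sum(product)
normalized = [p / total for p in product]
grid_mean = sum(w * x for w, x in zip(normalized, grid))
grid_var = sum(w * (x - grid_mean) ** 2 for w, x in zip(normalized, grid))
grid_sigma = math.sqrt(grid_var)

print(f"combined estimate: {grid_mean:.1f} +/- {grid_sigma:.1f} years")
```

With these inputs the combined mean should come out near 13218 with a σ of roughly 14. The result need not match the spreadsheet's 13220 ± 20, since the spreadsheet's exact formulas are not shown in the post.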



Does this make sense?

The result is close to the 2σ boundary with record 2. Therefore I made a refinement tool (column G) to find the least squares (manual - trial and error) which turned out to be 13218 ± 14(?) years.

Does that make sense too, since I'm working with 5 year intervals?
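One possible reading of the least-squares refinement is a weighted least-squares fit: choose the age μ that minimizes the sum of ((x_i − μ)/σ_i)² over the four records. That minimization has a closed form, the inverse-variance weighted mean, so no grid or trial and error is needed. The sketch below assumes this reading and the σ values of 20, 25, 44 and 35 quoted in the post; the function names `combine` and `chi_square` are hypothetical.

```python
import math

# Record means (years) and sigmas as quoted in the post.
means = [13200.0, 13260.0, 13215.0, 13195.0]
sigmas = [20.0, 25.0, 44.0, 35.0]


def chi_square(mu, means, sigmas):
    """Weighted sum of squared deviations of the records from a trial age mu."""
    return sum(((m - mu) / s) ** 2 for m, s in zip(means, sigmas))


def combine(means, sigmas):
    """Inverse-variance weighted mean and its standard deviation.

    This is the value of mu that minimizes chi_square, assuming the
    records are independent and their sigmas are correct.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * m for w, m in zip(weights, means)) / total
    sigma = math.sqrt(1.0 / total)
    return mean, sigma


best_mean, best_sigma = combine(means, sigmas)
print(f"weighted mean: {best_mean:.1f} +/- {best_sigma:.1f} years")
print(f"chi-square at the minimum: {chi_square(best_mean, means, sigmas):.2f}")
```

Because the answer comes from a formula rather than from the grid, it does not depend on the 5-year spacing. The quoted uncertainty is only as good as the assumption that all four σ values are realistic and the records are independent.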

Finally, does it make sense to mail the author of record 2 telling him that his method of splitting the difference between certain and uncertain years makes his result an outlier?
 
  • #2
Here is a simple theoretical problem which points to one possible approach.

Suppose [itex]X_1[/itex] and [itex]X_2[/itex] are two independent estimates of a random variable [itex]X[/itex], with standard deviations [itex]\sigma_1[/itex] and [itex]\sigma_2[/itex], respectively. Consider a linear combination of [itex]X_1[/itex] and [itex]X_2[/itex], i.e.
[tex]Y = \lambda_1 X_1 + \lambda_2 X_2[/tex] where
[tex]0 \leq \lambda_i \leq 1[/tex] and [tex]\lambda_1 + \lambda_2 = 1[/tex]

If we assume that [itex]X_1[/itex] and [itex]X_2[/itex] are unbiased estimators of [itex]X[/itex], i.e. [itex]E(X_i) = E(X)[/itex], then you should find it easy to show that the mean of [itex]Y[/itex] is equal to the mean of [itex]X[/itex].

What choice of [itex]\lambda_1[/itex] and [itex]\lambda_2[/itex] minimizes the variance of [itex]Y[/itex]?

With just a tiny bit of statistics and calculus, you should be able to solve this problem. Then maybe you can see a way to apply the result to your original problem.
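For readers who want to check their work, here is one way the calculation can go. It is a sketch that uses only the assumptions stated above (independence, unbiasedness, and [itex]\lambda_1 + \lambda_2 = 1[/itex]).

Because [itex]X_1[/itex] and [itex]X_2[/itex] are independent, the variances add:
[tex]\mathrm{Var}(Y) = \lambda_1^2 \sigma_1^2 + \lambda_2^2 \sigma_2^2 = \lambda_1^2 \sigma_1^2 + (1 - \lambda_1)^2 \sigma_2^2[/tex]

Setting the derivative with respect to [itex]\lambda_1[/itex] to zero:
[tex]\frac{d\,\mathrm{Var}(Y)}{d\lambda_1} = 2 \lambda_1 \sigma_1^2 - 2 (1 - \lambda_1) \sigma_2^2 = 0[/tex]
[tex]\lambda_1 = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}, \qquad \lambda_2 = \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}[/tex]

The second derivative is [itex]2(\sigma_1^2 + \sigma_2^2) > 0[/itex], so this is a minimum, and both weights lie between 0 and 1 as required. The resulting variance is
[tex]\mathrm{Var}(Y)_{\min} = \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2}[/tex]

In words, each estimate is weighted in proportion to [itex]1/\sigma_i^2[/itex], and the combined variance is smaller than either individual variance.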
 
