Interpretation of Confidence Intervals and the like

In summary: Your professor is correct. Without assuming a prior distribution for the parameter, you cannot turn a frequentist confidence interval into a probability statement about where the parameter lies.
  • #1
MadRocketSci2
48
1
Statistics question: I'm having trouble understanding my statistics professor's objection to this interpretation of confidence intervals.

If you have some distribution with parameter x: X = Dist(x), and you perform a random experiment drawing N random variables from it, and derive from those some estimator y related to x, then I *want* to say the following, but my statistics prof insists it's an invalid interpretation:

If you have a space with the (unknown) parameter x and the statistic y, for each x there is a conditional probability distribution for the y that will be yielded from a draw of N variables. This leads to a joint probability distribution linking x and y. Now that you have a value for y, you are in a subspace of the space (y=whatever you got), and there is a distribution of probability for x. The probability that you find x within a certain range relates to this distribution.

My statistics prof objects that x is a specific value, not a random variable. While I understand that x *in fact* is either within or not within the interval, the best you can give with the imperfect information available is a finite probability. You have limited knowledge due to the statistic y, which is more than no knowledge and less than exact knowledge of what the parameter x is.

He insists you cannot make a probability statement about it, and that an already-computed confidence interval gives you no information about the underlying parameter, only a statement about the long-run behavior of future intervals. That x either is or is not within any given interval, and that you cannot say anything about it.

(I already understand the point about a certain proportion of future confidence interval draws containing the parameter).

I don't know why you can't do this. Can anyone try to enlighten me? Any perspectives you have might help me understand the error.
 
  • #2
MadRocketSci2 said:
Statistics question: I'm having trouble understanding my statistics professor's objection to this interpretation of confidence intervals.

Statistics question: I'm having trouble understanding my statistics professor's objection to this interpretation of confidence intervals. [...]

I think your professor is correct.

If you really don't know the distribution (which is the case in most practical problems), then you can't say much at all about the process that generated the data.

But let's say you know for certain that your process always follows that distributional family, with only the parameters unknown.

When you construct a confidence interval, you need to be aware that in hypothesis testing there are four probabilities to consider: P(H1 True | Interval), P(H1 False | Interval), P(H0 True | Interval) and P(H0 False | Interval), where H1 and H0 are the alternative and null hypotheses respectively.

The thing is that we can fall into a false interpretation, which means we can't say that the value has to lie in a given interval with such-and-such probability. It's a very subtle point, and depending on how you phrase it, some people will interpret it one way and some another. To see what I mean, think about the four situations above involving H0 and H1, and in particular look at Type I and Type II errors.
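To make the Type I / Type II distinction concrete, here is a minimal simulation sketch in Python. The test setup is entirely made up for illustration (a z-test of H0: mu = 0 against mu = 0.5, known sigma, alpha = 0.05); it just estimates both error rates by repeated sampling:

```python
import math
import random

random.seed(1)
SIGMA, N, TRIALS, Z = 1.0, 20, 4000, 1.96  # Z = two-sided 5% critical value

def rejects(true_mu):
    """Draw one sample of size N and report whether the z-test rejects H0: mu = 0."""
    sample = [random.gauss(true_mu, SIGMA) for _ in range(N)]
    zstat = (sum(sample) / N) / (SIGMA / math.sqrt(N))
    return abs(zstat) > Z

# Type I error rate: H0 is true (mu = 0) but we reject it.
type1 = sum(rejects(0.0) for _ in range(TRIALS)) / TRIALS
# Type II error rate: H1 is true (here mu = 0.5) but we fail to reject H0.
type2 = sum(not rejects(0.5) for _ in range(TRIALS)) / TRIALS
print(type1, type2)
```

The estimated Type I rate should hover near the chosen alpha of 0.05, while the Type II rate depends on the (assumed) effect size and sample size.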

To make it a bit clearer, let's say you play a game called the 'trust game'.

It's very simple: after N questions and N observations of checking the answers, you have to decide whether someone is telling the truth.

Now let's say you do it 100 times. Everything is true. Now 10,000 times. Again all true. Now 1,000,000 times. All true. But then on the 1,000,001st round the person lies. Even though he told the truth a million times, he is still not entirely truthful, and the premise that he is truthful is now false.

This is the kind of thing you have to be aware of in statistics. Future data can make a conclusion stronger, but it can never guarantee it. If you fall into the trap of thinking that 1,000,000 consistent observations prove the guy is truthful, you are setting yourself up for some very bad interpretations and reasoning.
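The long-run "coverage" interpretation the professor allows can be checked directly by simulation. A sketch in Python, with made-up values for the true mean and sigma, assuming a normal with known sigma so the z-interval is exact:

```python
import math
import random

random.seed(0)
TRUE_MU, SIGMA, N, TRIALS = 5.0, 2.0, 30, 2000
Z = 1.96  # two-sided 95% critical value for a known-sigma normal mean

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    mean = sum(sample) / N
    half = Z * SIGMA / math.sqrt(N)  # half-width of the 95% interval
    if mean - half <= TRUE_MU <= mean + half:
        covered += 1

coverage = covered / TRIALS
print(coverage)  # should be close to 0.95
```

About 95% of the intervals contain the true mean; that is a statement about the procedure over repeated draws, not about any single realized interval.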
 
  • #3
MadRocketSci2 said:
If you have a space with the (unknown) parameter x and the statistic y, for each x there is a conditional probability distribution for the y that will be yielded from a draw of N variables. This leads to a joint probability distribution linking x and y.

It doesn't lead to a joint probability distribution of x and y, since you only know a conditional density that gives P(Y = y | X = x). You could get a joint density if you also had a density giving P(X = x) unconditionally. There is a branch of statistics that involves assuming such information, which is called a "prior distribution" for X: Bayesian statistics. The type of statistics your professor is teaching is "frequentist" statistics, and he is correct that frequentist confidence intervals cannot be interpreted as giving a probability that the parameter X is in a particular interval. (The type of interval that does this, in Bayesian statistics, is called a "credible interval".)

Most common-sense people ask questions like "What is the probability that my idea is true, given the data?" or "What is the probability that the parameter is in this interval, given the data?" If you cut through the terminology of frequentist statistics, you find that these questions are never answered. Instead you get numbers that quantify the probability of the data, given the assumption that a hypothesis is true or that a parameter has a specific value. It's the difference between the probability of A given B and the probability of B given A.
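The gap between P(data | hypothesis) and P(hypothesis | data) can be shown with a toy numerical example. Everything here is an assumption for illustration: a coin that is either fair (p = 0.5) or biased (p = 0.8), data of 8 heads in 10 flips, and a 50/50 prior over the two hypotheses:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k heads in n flips with heads-probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Frequentist-style quantities: probability of the data given each hypothesis.
p_data_given_fair = binom_pmf(8, 10, 0.5)
p_data_given_biased = binom_pmf(8, 10, 0.8)

# To invert the conditioning we must supply priors -- assumed 50/50 here.
prior_fair = prior_biased = 0.5
evidence = p_data_given_fair * prior_fair + p_data_given_biased * prior_biased
p_fair_given_data = p_data_given_fair * prior_fair / evidence
print(p_data_given_fair, p_fair_given_data)
```

Note that P(fair | data) simply cannot be computed from the likelihoods alone; delete the prior and the last two lines have nothing to work with. That is the professor's objection in miniature.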
 
  • #4
Thanks, Stephen, chiro.

That clears a few things up. I forgot about the prior probability P(X = x), which is necessary to construct the joint distribution.

Without making some assumption there, you can't get there from here. That's probably what my professor was getting at.

In the specific example we were working through (a normal distribution with known variance, using the sample mean), a uniform prior was what I was unconsciously assuming. Now that I think about it, for general statistics and general parameters this might not always work.

If you were to choose a prior probability distribution to assume, you need the resulting conditional density to normalize: ∫ f(x|y) dx = 1 for all y (or a sum if discrete). This places constraints on P(X = x), right? Not every prior fX(x) satisfies this, it would seem, so you aren't completely free in your choice of prior probability? (If a choice meeting this constraint exists in the first place!)
 
  • #5
MadRocketSci2 said:
...a uniform prior probability was what I was unconsciously assuming. Now that I think about it, for general statistics and general parameters, this might not always work.

If you were thinking of a uniform probability over all real numbers, this would never work, since there can be no such distribution. However, it is possible to assume a uniform probability over the interval [-L, L] where L is a large number. From that you can compute a credible interval. It is (in your example) also possible to take the limit of this answer (as a function of L) as L approaches infinity. What you get, as I recall, is a Bayesian "credible interval" for the mean that is exactly the same numerical interval as the frequentist "confidence interval" - only now the interval has the interpretation that you want. Of course, taking such a limit raises interesting philosophical questions.

If you were to choose a prior probability distribution to assume, you need the resulting conditional density to normalize: ∫ f(x|y) dx = 1 for all y (or a sum if discrete). This places constraints on P(X = x), right? Not every prior fX(x) satisfies this, it would seem, so you aren't completely free in your choice of prior probability?

You'll find that almost any legitimate probability density for X will satisfy that in real-life problems. You do need the density f(y|x) to exist for all x where the prior is non-zero.

The question is what prior density a researcher can assume without being accused of "fixing" the outcome of his statistical tests. Sometimes there is actual prior data about X and you just fit a prior distribution to it. People also study which mathematical families of prior distributions give results that are easy to work with, and use those families. (Look up "conjugate prior distribution".) A more philosophical approach, called "the Maximum Entropy Principle" (advocated by E. T. Jaynes, some of whose writings you can find online), is to assume the prior distribution with maximum entropy subject to whatever other constraints the researcher knows about X.
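The wide-prior limit described above can be illustrated with the normal–normal conjugate pair. This is only a sketch: the data summary (xbar, n, sigma) and the sequence of prior widths are made-up numbers, and a normal prior of growing standard deviation stands in for the uniform-on-[-L, L] prior:

```python
import math

def posterior_normal(xbar, n, sigma, prior_mean, prior_sd):
    """Posterior for a normal mean with known sigma and a conjugate normal prior."""
    prec = n / sigma**2 + 1 / prior_sd**2        # posterior precision
    mean = (n * xbar / sigma**2 + prior_mean / prior_sd**2) / prec
    return mean, math.sqrt(1 / prec)

xbar, n, sigma = 10.0, 25, 2.0
freq_half = 1.96 * sigma / math.sqrt(n)  # frequentist 95% CI half-width

for prior_sd in (1.0, 10.0, 1000.0):     # widening the prior toward "flat"
    m, s = posterior_normal(xbar, n, sigma, 0.0, prior_sd)
    print(prior_sd, (m - 1.96 * s, m + 1.96 * s))
# As prior_sd grows, the 95% credible interval approaches
# (xbar - freq_half, xbar + freq_half), the frequentist interval.
```

With a tight prior the credible interval is pulled toward the prior mean; as the prior flattens out, it converges numerically to the confidence interval, which is exactly the limit Stephen describes.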
 

Related to Interpretation of Confidence Intervals and the like

What is a confidence interval?

A confidence interval is a range of values that is likely to contain the true value of a population parameter. It is based on a sample of data and is used to estimate the true value of the population parameter with a certain level of confidence.

How is a confidence interval calculated?

A confidence interval is typically calculated using a formula that takes into account the sample size, the variability of the sample, and the desired level of confidence. A common approach takes the sample mean plus or minus a critical value times the standard error, where the standard error is the sample standard deviation divided by the square root of the sample size.
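As a sketch of that formula in Python (the sample values are fabricated for illustration, and for a sample this small a t critical value would be more accurate than the large-sample 1.96):

```python
import math
import statistics

# Hypothetical sample data (made up for illustration).
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
n = len(data)
mean = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean
z = 1.96                                    # large-sample 95% critical value
ci = (mean - z * se, mean + z * se)
print(mean, ci)
```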

What is the significance of the confidence level?

The confidence level describes the long-run behavior of the procedure: if the sampling were repeated many times, that percentage of the resulting intervals would contain the true parameter. It is typically expressed as a percentage, such as 95% or 99%. A higher confidence level produces intervals that capture the true value more often, at the cost of being wider.

How do I interpret a confidence interval?

A confidence interval can be interpreted as a range of values produced by a procedure that, in repeated sampling, contains the true value of the population parameter at the stated rate. The true value either is or is not in any particular realized interval; the confidence level describes the procedure, not the single interval. Additionally, a wider confidence interval indicates more uncertainty in the estimate, while a narrower interval indicates more precision.

What are some limitations of confidence intervals?

Confidence intervals are based on a sample of data and are subject to sampling error, so any single interval may not contain the true value of the population parameter. Additionally, confidence intervals assume that the sample is representative of the population, and the usual formulas assume the data are at least approximately normally distributed. If these assumptions are not met, the stated coverage may not be accurate.
