Conditional Expectation of a random variable

In summary: From the definition of conditional probability, P(A|B) = P(A and B)/P(B), so P(A and B) = P(A|B) P(B) and P(B and A) = P(B|A) P(A). Since P(A and B) = P(B and A), we get P(A|B) P(B) = P(B|A) P(A), and dividing by P(B) gives P(A|B) = (P(B|A) P(A))/P(B). That's called Bayes rule. We can use it to compute the probability of A given B in cases where it's easier to compute P(B|A).
  • #1
kblue
My professor made a rather concise statement in class, which amounts to this: E(Y|X=xi) is a constant, while E(Y|X) is a variable. Could anyone help me understand how the expectation is calculated for the second case? I understand that for different values of xi, we'll have different values for the expectation. This is where my thoughts are all muddled up:

[itex]E(Y|X) = \sum_i y_i \, P(Y=y_i|X) = \sum_i y_i \, \frac{P(X|Y=y_i)\,P(Y=y_i)}{P(X)}[/itex].

Could anyone explain the above computation, and how that is a variable? Also, it is my understanding that summing the probability P(Y=yi|X) over all values of Y won't be 1. Is this true?
 
  • #2
kblue said:
Could anyone explain the above computation, and how that is a variable?

One not-quite-correct explanation is to confuse "random variables" with ordinary variables.

It would go like this:

If you had an ordinary function such as g(X) = 3X, then it would be fair to say that the expression g(3) represents a constant and the expression g(X) represents a function of X, which I suppose you would call "variable".

[itex] E(Y|X) [/itex] is some function of [itex] X [/itex].
When you give [itex] X [/itex] a specific value this is denoted by [itex] E(Y|X=x) [/itex] and that notation represents a constant.

The expression [itex] E(Y|X) [/itex] is not a two-variable function. The "[itex]Y[/itex]" in that notation just tells you that you must do a summation over all possible values of [itex] Y [/itex]. Since you do that summation, the answer is not a function of the variable [itex] Y [/itex].
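Here is a rough sketch of that view, using a made-up joint pmf (all of the numbers are hypothetical, chosen only for illustration): E(Y|X) is the function that sends x to E(Y|X=x), and each evaluation at a fixed x is a constant.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y), written with exact fractions
# so the results come out as clean rational numbers.
joint = {
    (0, 1): Fraction(1, 10), (0, 2): Fraction(3, 10),
    (1, 1): Fraction(2, 5),  (1, 2): Fraction(1, 5),
}

def cond_expectation(x):
    """E(Y | X = x): the y's are summed out, so the result depends only on x."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(y * p / p_x for (xi, y), p in joint.items() if xi == x)

# E(Y|X) is the map x -> E(Y|X=x); each individual value is a constant.
print(cond_expectation(0))  # 7/4
print(cond_expectation(1))  # 4/3
```

Note that nothing named Y survives inside `cond_expectation`: the summation over the possible values of Y is exactly what removes it.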

If we need a more precise explanation, we must heed the saying (that was the theme of a thread on the forum recently) "Random variables are not random and they are not variables".

It would be fair to say that [itex] E(Y|X) [/itex] depends on the random variable [itex]Y[/itex] because this says it depends on the entire distribution of [itex] Y [/itex]. "Random variables" are not ordinary variables because the definition of a "random variable" carries with it all the baggage about a distribution function that is not present in the definition of ordinary variables. So [itex] E(Y|X) [/itex] isn't a function of an ordinary variable named "[itex]Y[/itex]".

Random variables technically do not take on specific values. It is their realizations that have specific values. When we say something like "Suppose the random variable X = 5", what we should say is "Suppose we have a realization of the random variable X, and the value of that realization is 5". The statement "X = x" means a realization of the random variable X has the value x.

I, myself, would have a hard time defining the notation [itex] E(Y|X) [/itex] using those precise notions, and I tend to think about it in the crude way that I first explained! The "[itex]E(Y[/itex]" tells you to sum a certain function of possible realizations of the random variable [itex] Y [/itex] over all possible values that a realization may take. The [itex] X [/itex] tells you that when you do that sum, you assume that one particular value of the random variable [itex] X [/itex] has been realized, and we abuse notation by denoting that value with the letter [itex] X [/itex] also. That particular value is a "variable" in the ordinary sense of the word.
"Variables" and "constants" are not adequately explained in ordinary mathematics courses. For example, in the earlier discussion the literal "x" is used to represent a "constant". We are asked to pretend it is a specific numerical value, yet at the same time it could be any specific numerical value. By contrast, in the function [itex] g(X) = 3X [/itex] we might be asked to pretend the literal "[itex]X[/itex]" is a "variable", but it seems to be on the same footing as the literal "x" insofar as it can take on any specific value. In ordinary math classes, you have to make your way through discussions that distinguish between variables and constants without having formal training in how to do that. (And most people with mathematical aptitude are able to.)

If you've taken logic courses or done structured computer programming, you know that symbols have a certain "scope". Within a certain context (such as an argument to a function) they can take unspecified values and within another context (such as a "read-only" global variable referenced inside a function but initialized outside of it) they hold only one specific value. That's the sort of formalism needed to deal with the distinction between variables and constants in a rigorous manner.
kblue said:
Also, it is my understanding that summing the probability P(Y=yi|X) over all values of Y won't be 1. Is this true?
No, I don't think that's true, provided that by "summing" you mean that each term in the sum conditions on the same (unspecified) realization of the random variable X. For a fixed realization X = x, the numbers P(Y=yi|X=x) form a conditional probability distribution over the yi, so they do sum to 1.
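As a sanity check, here is a minimal sketch with a made-up joint pmf (the numbers are purely illustrative): once you fix a realization of X, the conditional probabilities of the yi sum to exactly 1.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y); exact fractions avoid float round-off.
joint = {
    (0, 1): Fraction(1, 10), (0, 2): Fraction(3, 10),
    (1, 1): Fraction(2, 5),  (1, 2): Fraction(1, 5),
}

x = 1  # fix one realization of X
p_x = sum(p for (xi, _), p in joint.items() if xi == x)          # P(X=1) = 3/5
total = sum(p / p_x for (xi, _), p in joint.items() if xi == x)  # sum over yi of P(Y=yi|X=1)
print(total)  # 1
```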

To understand the computation you asked about, think about Bayes rule.
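For instance, with hypothetical numbers chosen only to illustrate the rule, Bayes rule can be checked numerically: P(A|B) computed from the definition agrees with P(B|A) P(A)/P(B).

```python
# Made-up probabilities for two events A and B.
p_a_and_b = 0.06  # P(A and B)
p_a = 0.30        # P(A)
p_b = 0.20        # P(B)

p_a_given_b = p_a_and_b / p_b  # P(A|B), from the definition of conditional probability
p_b_given_a = p_a_and_b / p_a  # P(B|A), likewise

# Bayes rule recovers P(A|B) from P(B|A), P(A), and P(B).
bayes = p_b_given_a * p_a / p_b
assert abs(p_a_given_b - bayes) < 1e-12
```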
 

Related to Conditional Expectation of a random variable

1. What is the definition of conditional expectation?

The conditional expectation of a random variable is the expected value of that variable given the knowledge of another variable. It is denoted as E[X|Y], where X is the random variable and Y is the condition on which the expectation is calculated.

2. How is conditional expectation different from regular expectation?

Regular expectation, or simply expectation, is the expected value of a random variable without any condition. On the other hand, conditional expectation takes into account a specific condition and calculates the expected value of the random variable based on that condition.

3. What is the formula for calculating conditional expectation?

The formula for calculating conditional expectation is E[X|Y] = ∑ x * P(X=x|Y), where x is the possible value of the random variable X and P(X=x|Y) is the probability of X taking that value given the condition Y.
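A one-line numerical instance of this formula, using a hypothetical conditional pmf:

```python
# Suppose P(X=1|Y) = 0.25 and P(X=2|Y) = 0.75 for some fixed condition Y.
cond_pmf = {1: 0.25, 2: 0.75}

e_x_given_y = sum(x * p for x, p in cond_pmf.items())  # E[X|Y] = 1(0.25) + 2(0.75)
print(e_x_given_y)  # 1.75
```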

4. How is conditional expectation useful in statistics and probability?

Conditional expectation is useful in statistics and probability as it allows us to make more accurate predictions and calculations based on a given condition. It also helps in understanding the relationship between two random variables and how one affects the expected value of the other.

5. Can conditional expectation be negative?

Yes, conditional expectation can be negative. This simply means that, given the condition, the expected value of the random variable is less than zero; it does not by itself indicate a negative relationship between the two variables. It is important to consider the sign of the conditional expectation when interpreting its meaning in a given context.
