Multiplication of conditional probability with several variables

In summary: The notation p(t|x,w) means the same thing as p(t|x,w,X,T) because t is a function only of x and w, so t has no random variation due to X,T. Under that reading, p(t|x,X,T) is the integral over w of p(t|x,w)p(w|X,T).
  • #1
Ronald_Ku
Dear All,

I am new to machine learning and I am currently confused about the following problem:

What is the result of P(X|Y)P(Y|Z)?
In my book, it is written as P(X|Z). But I don't think that is correct, since
P(X|Z) = P(X|Y,Z)P(Y|Z)
but clearly P(X|Y) ≠ P(X|Y,Z).

I am assuming the events are not independent.

I have simplified the problem in the above equation. The actual equation is
p(w|x,t,α,β) ∝ p(t|x,w,β)p(w|α), from Pattern Recognition and Machine Learning by Christopher M. Bishop.

Any help and ideas will be much appreciated.
 
  • #2
Ronald_Ku said:
since
P(X|Z)= P(X|Y,Z)P(Y|Z)

Are you saying the above is given as a special condition in the problem?

Or did you mean [itex] P( \ ( X \cap Y) | Z\ ) = P(X | \ (Y \cap Z)\ )\ P(Y | Z) [/itex] ?
 
  • #3
Stephen Tashi said:
Are you saying the above is given as a special condition in the problem?

Or did you mean [itex] P( \ ( X \cap Y) | Z\ ) = P(X | \ (Y \cap Z)\ )\ P(Y | Z) [/itex] ?


Yes, you are correct.
What I mean is P(x,y|z)=P(x|y,z)P(y|z)
 
  • #4
Ronald_Ku said:
what is the result of P(X|Y)P(Y|Z)?
In my book, it is written to be P(X|Z)

I don't see why that would be correct. Perhaps you need to explain the entire context for it. I don't have a copy of Bishop's book.
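A quick numerical check can make the objection concrete. The sketch below uses a made-up joint table over three binary variables (the numbers are purely an assumption for illustration) and compares the sum over y of P(x|y,z)P(y|z), which always recovers P(x|z), with the sum over y of P(x|y)P(y|z), which in general does not. The helper names `prob` and `cond` are hypothetical.

```python
from itertools import product

# Hypothetical joint distribution P(x, y, z) over three binary variables.
# The weights are invented for illustration; dividing by 36 makes them sum to 1.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
joint = {xyz: w / 36 for xyz, w in zip(product((0, 1), repeat=3), weights)}

NAMES = ("x", "y", "z")

def prob(**fixed):
    """Sum the joint table over all entries that match the fixed values."""
    return sum(p for xyz, p in joint.items()
               if all(xyz[NAMES.index(k)] == v for k, v in fixed.items()))

def cond(target, given):
    """P(target | given), both passed as dicts such as {"x": 1}."""
    return prob(**target, **given) / prob(**given)

x, z = 1, 0
p_x_given_z = cond({"x": x}, {"z": z})

# Always valid: sum over y of P(x | y, z) P(y | z).
with_z = sum(cond({"x": x}, {"y": y, "z": z}) * cond({"y": y}, {"z": z})
             for y in (0, 1))

# Not valid in general: sum over y of P(x | y) P(y | z).
without_z = sum(cond({"x": x}, {"y": y}) * cond({"y": y}, {"z": z})
                for y in (0, 1))

print(p_x_given_z, with_z, without_z)
```

For this table the first sum matches P(x|z) and the second does not. The two would agree if X were conditionally independent of Z given Y, which is an extra assumption and not a general rule.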
 
  • #5
It is in the introductory chapter of the book, which discusses polynomial curve fitting.
X,T refer to a training set, while t refers to the predicted value at position x.
w refers to the set of parameters of an order-M polynomial, so that
y(x,w) = w0 + w1*x + w2*x^2 + . . . + wM*x^M

It claims the following equation for the prediction of t, given the training set and the position x:
[itex] p(t|x, X, T) = \int p(t|x,w) p(w|X, T) dw [/itex]

That means p(t|x,w)p(w|X,T) = p(t,w|x,X,T), for later marginalization.
But I believe that p(t|x,w) ≠ p(t|x,w,X,T).

If this is not clear enough, I can explain more.
 
  • #6
Ronald_Ku said:
[itex] p(t|x, X, T) = \int p(t|x,w) p(w|X, T) dw [/itex]

To make sense of an expression denoting a probability, we must understand what the "probability space" is. Can you describe the space associated with the notation p(t,x,X,T)? Is it possible that some of those variables are not random variables, but ordinary variables instead? For example, if I have 3 loaded dice then I might use the notation
p(X,k)
to mean "the probability of getting a result of X when I roll the k-th die". That interpretation doesn't imply that "k" is a random variable. It doesn't imply that there is an experiment where I pick a die at random.
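As a rough illustration of the dice example, here is a sketch in which k is only an index into a table of distributions. The face probabilities are invented, and the function name `p` is just a stand-in for the notation p(X,k).

```python
from fractions import Fraction

# Hypothetical loaded dice: entry k holds the face probabilities of the k-th die.
dice = {
    1: [Fraction(1, 6)] * 6,
    2: [Fraction(1, 10)] * 5 + [Fraction(1, 2)],
    3: [Fraction(1, 4)] * 2 + [Fraction(1, 8)] * 4,
}

def p(face, k):
    """Probability of rolling `face` (1..6) with the k-th die; k is a plain index."""
    return dice[k][face - 1]

# Each die is a complete distribution over faces on its own.
row_sums = {k: sum(row) for k, row in dice.items()}

# Summing over k as well gives 3, not 1, so p(X, k) is not a joint distribution.
total = sum(sum(row) for row in dice.values())

print(row_sums, total)
```

Nothing in this table says how a die gets chosen. Turning k into a random variable would need an additional distribution over k, which the notation alone does not supply.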
 
  • #7
Let me check that I understand you: in the expression p(x|m,n), m and n need not be random variables. They can be parameters. Whether a quantity is a random variable depends on the setup of the experiment, right?
In your case, k can be a random variable, and p(x,k) then means the probability of picking the k-th die at random and getting x when rolling it, if the experiment is set up that way.

I am not sure how this applies to my case.
In my case, the notation p(t|x, X, T) means the probability of finding t, given the training set X,T and the position x. t is obviously a random variable. But x,X,T could also be parameters. It is not explicitly stated whether they are random variables or parameters. The experiment could be predicting t at position x, given a fixed set X,T. Or the experiment could be predicting t while picking x,X,T at random and then considering p(t|x,X,T). I don't know which experiment the author has in mind.
 
  • #8
The fact that a p(...) notation can be interpreted in various ways doesn't mean that an equation using it will be correct for each possible interpretation. I suppose an author might use ambiguous notation to assert that a whole family of equations is correct by writing one equation. In your case, I'll guess the author only has one specific interpretation in mind.

One way to make sense of:

[itex] p(t|x, X, T) = \int p(t|x,w) p(w|X,T) dw [/itex]

is to consider [itex] X,T [/itex] to be ordinary variables, not random variables. So within the equation [itex] X,T [/itex] can be treated as if they have some constant value.

The random variable [itex] t [/itex] is a function only of the random variables [itex] x [/itex] and [itex] w [/itex]
(i.e., [itex] t = w_0 + w_1 x + ... + w_n x^n [/itex]). So the notation [itex] p(t|x,w) [/itex] means the same thing as [itex] p(t|x,w,X,T) [/itex] because [itex] t [/itex] has no random variation due to [itex] X, T [/itex].

But by that interpretation, the author could have written [itex] p(w | X,T) [/itex] as [itex] p(w) [/itex]. I suppose he needed to mention [itex] X, T [/itex] somewhere on the right-hand side.

Leaving [itex] X,T [/itex] unmentioned, it isn't controversial that

[itex] p(t|x) = \int p(t|x,w) p(w) dw [/itex]

or, mentioning them everywhere, that

[itex] p(t|x,X,T) = \int p(t|x,w,X,T) p(w| X,T) dw [/itex]
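
For what it's worth, here is a sketch of the steps that connect this last form to the shorter one. It rests on two assumptions that are not stated in the thread: that [itex] t [/itex] depends on the training data only through [itex] w [/itex], and that [itex] w [/itex] does not depend on the new input [itex] x [/itex].

[itex] p(t|x,X,T) = \int p(t,w|x,X,T) dw [/itex] (sum rule)

[itex] = \int p(t|x,w,X,T) p(w|x,X,T) dw [/itex] (product rule)

[itex] = \int p(t|x,w) p(w|X,T) dw [/itex] (using [itex] p(t|x,w,X,T) = p(t|x,w) [/itex] and [itex] p(w|x,X,T) = p(w|X,T) [/itex])

The first two lines hold for any densities. Only the last line uses the two assumptions.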
 
  • #9
Thanks so much. I will try to proceed in this direction and see if anything odd comes up again.
 
  • #10
I have another question.
Suppose the above equations need to be considered together with the following equation:
p(w|X, T, α, β) ∝ p(T|X, w, β)p(w|α) ------(a)
where α, β are fixed.

The left-hand side, p(w|X,T), is the posterior probability. On the right-hand side, p(w) is the prior probability.
So X,T are random variables, right?
The book mentions that p(w|X,T) in the integral will be given by (a).
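
To see how (a) and the integral might fit together numerically, here is a toy sketch. Everything in it is an assumption made for illustration and not taken from the book: a one-parameter model y(x,w) = w*x, Gaussian noise with precision β, a zero-mean Gaussian prior with precision α, and made-up data. The posterior is normalised on a grid, then plugged into the integral.

```python
import math

# Hypothetical training set and hyperparameters, invented for illustration.
X = [0.0, 0.5, 1.0, 1.5]
T = [0.1, 0.9, 2.1, 2.9]
alpha, beta = 2.0, 25.0   # prior precision and noise precision, both fixed

def gauss(v, mean, precision):
    """Gaussian density with the given mean and precision (1 / variance)."""
    return math.sqrt(precision / (2 * math.pi)) * math.exp(-0.5 * precision * (v - mean) ** 2)

# Grid over the single weight w of the toy model y(x, w) = w * x.
step = 0.001
grid = [-2.0 + i * step for i in range(6001)]   # covers [-2, 4]

# Right-hand side of (a): likelihood p(T | X, w, beta) times prior p(w | alpha).
unnorm = [math.prod(gauss(t, w * x, beta) for x, t in zip(X, T)) * gauss(w, 0.0, alpha)
          for w in grid]

# Normalise numerically so that the grid values integrate to 1.
evidence = sum(unnorm) * step
posterior = [u / evidence for u in unnorm]

def predictive(t, x):
    """p(t | x, X, T) as a Riemann sum of p(t | x, w) p(w | X, T) over w."""
    return sum(gauss(t, w * x, beta) * pw for w, pw in zip(grid, posterior)) * step

posterior_mean = sum(w * pw for w, pw in zip(grid, posterior)) * step
print(posterior_mean, predictive(2.0, 1.0))
```

In this sketch X and T only ever appear as fixed numbers inside the likelihood, and only w is integrated over, so the computation itself does not force X,T to be treated as random.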
 
  • #11
Ronald_Ku said:
So X,T are random variables. Right?
In the book, it mentions that p(w|X,T) in the integral will be given by (a)

It isn't possible to interpret equations without some context. Establishing the context requires a verbal explanation.
A person who is familiar with the type of problem that Bishop is solving might understand his notation, but I haven't read a statement of what these equations are supposed to accomplish.

An elementary question that needs a verbal explanation is whether the p(...) notation is supposed to indicate the probability of an event or whether it is supposed to denote a probability density function evaluated somewhere. (The value of a density function evaluated at a point isn't equal to "the probability of" that point.)
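
A small sketch of that distinction, using a normal density with a standard deviation of 0.1 chosen only for illustration: the density at the mean is larger than 1, so it cannot be a probability, while the probability of an interval comes from integrating the density.

```python
import math

def normal_pdf(v, mean=0.0, sigma=1.0):
    """Density of a normal distribution evaluated at v."""
    return math.exp(-0.5 * ((v - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(v, mean=0.0, sigma=1.0):
    """Probability that a normal variable is at most v."""
    return 0.5 * (1 + math.erf((v - mean) / (sigma * math.sqrt(2))))

sigma = 0.1  # illustrative choice; any small sigma shows the same effect

# Density at the mean: roughly 3.99, which exceeds 1.
density_at_mean = normal_pdf(0.0, sigma=sigma)

# Probability of the interval [-0.05, 0.05]: a genuine probability, below 1.
prob_interval = normal_cdf(0.05, sigma=sigma) - normal_cdf(-0.05, sigma=sigma)

print(density_at_mean, prob_interval)
```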
 

Related to Multiplication of conditional probability with several variables

1. What is the formula for multiplying conditional probabilities with several variables?

The formula for multiplying conditional probabilities with several variables is P(A and B and C) = P(A|B and C) * P(B|C) * P(C).

2. How is the formula for multiplying conditional probabilities with several variables derived?

The formula for multiplying conditional probabilities with several variables is derived from the general rule of conditional probability, which states that P(A and B) = P(A|B) * P(B).

3. Can the formula for multiplying conditional probabilities with several variables be extended to more than three variables?

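Yes, the formula for multiplying conditional probabilities with several variables can be extended to any number of variables. Each factor conditions on all of the variables that come after it: P(A1 and A2 and ... and An) = P(A1|A2 and ... and An) * P(A2|A3 and ... and An) * ... * P(An). Conditional probabilities for pairs of variables alone are not enough in general.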
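
As a rough check of the extended formula, the sketch below builds an arbitrary joint table over four binary variables (random weights with a fixed seed, chosen only for illustration) and confirms that the product of conditionals reproduces every joint probability. The helper names `marginal` and `chain` are hypothetical.

```python
import random
from itertools import product

random.seed(0)
outcomes = list(product((0, 1), repeat=4))      # values of (a, b, c, d)
raw = [random.random() for _ in outcomes]
total = sum(raw)
joint = {o: r / total for o, r in zip(outcomes, raw)}

def marginal(suffix):
    """Probability that the trailing variables take the values in `suffix`."""
    n = len(suffix)
    return sum(p for o, p in joint.items() if o[4 - n:] == suffix)

def chain(o):
    """P(a|b,c,d) * P(b|c,d) * P(c|d) * P(d), each factor a ratio of marginals."""
    result = 1.0
    for i in range(4):
        result *= marginal(o[i:]) / marginal(o[i + 1:])
    return result

max_error = max(abs(chain(o) - joint[o]) for o in outcomes)
print(max_error)
```

The agreement is exact up to rounding because the factors telescope: each denominator cancels the next numerator. That cancellation only happens when every factor conditions on all of the remaining variables.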

4. What is the difference between conditional probability and joint probability?

Conditional probability refers to the probability of an event occurring given that another event has already occurred. Joint probability, on the other hand, refers to the probability of two or more events occurring simultaneously. The formula for multiplying conditional probabilities with several variables is derived from the concept of joint probability.

5. How is the multiplication of conditional probabilities with several variables used in real-life applications?

The multiplication of conditional probabilities with several variables is used in various real-life applications, such as in medical diagnosis, risk assessment, and financial analysis. It helps to calculate the probability of multiple events occurring together, which is useful in making informed decisions.
