Determining the distribution of a variable

In summary, to find the distribution of a variable that is a function of other random variables, you add up the probabilities of all the combinations of values that make the function equal to the desired value; for continuous random variables the adding up is done by integration. For a sum of two variables this operation is called a convolution. If the variables are independent, the joint density is the product of their individual densities. The moment generating function can also be used to obtain the moments of the combined variable, and a table of integral transforms or software such as Mathematica or Maple can be used to obtain the probability density function directly.
  • #1
musicgold
Hi,

I am trying to understand how to predict the distribution of a variable that is a combination of other random variables. For example, how should I determine the distribution of Variable A, which is the sum of Variable X and Variable Y? Variable X has a normal distribution and Variable Y has a uniform distribution.

Also, what will be the distribution of Variable B, which is a product of X and Y?

Thanks.
 
  • #2
musicgold said:
I am trying to understand how to predict the distribution of a variable that is a combination of other random variables.

The very general idea is this: if you want the probability that A = r, where A is a function of X and Y, you must add up the probabilities of all the combinations of values of X and Y that make the function equal to r.

In the case of continuous random variables, the "adding up" is done by integration instead of taking a finite sum.

In the special case where A is the sum of X and Y, the process of doing the summation or integration is called a "convolution".

You can think of the convolution that computes Pr(A = r) for A = X + Y as a summation over all values x and y such that x + y = r. So if we take y as given, then x = r - y.

The summation is

[tex] \Pr( A = r ) = \sum_y \Pr( X = r-y \ \text{and} \ Y = y ) [/tex]

If X and Y are independent, then the joint probability factors as Pr(X = r-y and Y = y) = Pr(X = r-y) Pr(Y = y).
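
As a quick toy check of the discrete version (dice are just my illustration, not part of your question), the distribution of the sum of two independent fair dice comes straight from that summation:

Code:
(* toy discrete check: A = X + Y for two independent fair dice,
   using Pr(A = r) = Sum over y of Pr(X = r - y) Pr(Y = y) *)
p[k_] := If[1 <= k <= 6, 1/6, 0];
prA[r_] := Sum[p[r - y] p[y], {y, 1, 6}];
Table[{r, prA[r]}, {r, 2, 12}]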

For continuous random variables, if X and Y are independent, X has density f(x), and Y has density g(y), then the density h(r) of A = X + Y is

[tex] h(r) = \int f(r-y) g(y) \ dy [/tex]
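
If you have software handy, you can let it do that integral for you. Here's a minimal sketch in Mathematica, assuming X is standard normal and Y is uniform on 0 to 1 (adjust the distributions to match your actual variables):

Code:
(* density of A = X + Y via the convolution integral h(r) = Integral of f(r - y) g(y) dy,
   assuming X ~ NormalDistribution[0, 1] and Y ~ UniformDistribution[{0, 1}] *)
h[r_] := Integrate[
   PDF[NormalDistribution[0, 1], r - y] PDF[UniformDistribution[{0, 1}], y],
   {y, -Infinity, Infinity}, Assumptions -> Element[r, Reals]]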


Do you feel up to doing the calculus to solve your particular example? I think the answer will involve the cumulative distribution function of the normal, so it isn't a "closed form" solution.
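
For instance, if X is standard normal and Y is uniform on the interval from 0 to 1 (adjust for other parameters), the convolution works out to a difference of standard normal CDFs:

[tex] h(r) = \int_0^1 \phi(r-y) \, dy = \Phi(r) - \Phi(r-1) [/tex]

where φ and Φ are the standard normal density and CDF.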

For the case of the product of two random variables, A = XY, you need to do a calculation that sums (or integrates) over all possible values of x and y where xy = r, so x = r/y.
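
For independent continuous X and Y with densities f and g, the change of variables brings in a factor of 1/|y|, and the standard result for the density of A = XY is

[tex] h(r) = \int f\!\left( \frac{r}{y} \right) g(y) \, \frac{1}{|y|} \, dy [/tex]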
 
  • #3
If you're only interested in the moments of A = X + Y, then the moment generating function becomes very useful, because the MGF of A is the product of the MGFs of X and Y (provided they're independent).
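
For example, if X is standard normal and Y is uniform on the interval from 0 to 1, the MGF of the sum is

[tex] M_A(t) = M_X(t) \, M_Y(t) = e^{t^2/2} \cdot \frac{e^t - 1}{t} [/tex]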

If you have access to a good table of integral transforms, or to Mathematica/Maple, this is also an easy way to get the pdf of your variable. For example, taking the uniform distribution over the interval from 0 to 1 and the Gaussian factor Exp[-x^2] in place of the normal density (unnormalized, so the output is only proportional to a pdf), you'd just do

Code:
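(* Fourier convolution theorem: multiply the transforms of the two factors, then invert;
   with Mathematica's default conventions the result is their convolution up to an overall constant *)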
InverseFourierTransform[(FourierTransform[
    UnitStep[x] - UnitStep[x - 1], x, t] FourierTransform[Exp[-x^2], 
    x, t]), t, x]

And Mathematica spits out

[tex]\frac{\text{Erf}[1-x]+\text{Erf}[x]}{2 \sqrt{2}} [/tex]

Note that above I used the characteristic function (the Fourier transform of the density) rather than the MGF, but it's the same idea.
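
If you want a properly normalized density directly, a variant along these lines should work (a sketch; the Sqrt[2 Pi] compensates for the constant in Mathematica's default Fourier convention):

Code:
(* pdf of X + Y with X ~ NormalDistribution[0, 1] and Y ~ UniformDistribution[{0, 1}];
   Sqrt[2 Pi] accounts for the 1/Sqrt[2 Pi] in Mathematica's default FourierTransform *)
InverseFourierTransform[
 Sqrt[2 Pi] FourierTransform[PDF[NormalDistribution[0, 1], x], x, t] *
  FourierTransform[PDF[UniformDistribution[{0, 1}], x], x, t], t, x]

The output should reduce to the difference of normal CDFs mentioned in the previous post.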
 

Related to Determining the distribution of a variable

1. What is the purpose of determining the distribution of a variable?

The purpose of determining the distribution of a variable is to understand how the data is spread out and to identify patterns or trends within the data. This can help in making decisions, predicting future outcomes, and identifying potential outliers.

2. How is the distribution of a variable typically represented?

The distribution of a variable is typically represented graphically using a histogram, box plot, or scatter plot. These visual representations allow for an easy interpretation of the data and can help identify the shape, center, and spread of the distribution.

3. What are some common measures used to describe the distribution of a variable?

Some common measures used to describe the distribution of a variable include the mean, median, and mode. Standard deviation and variance are also commonly used to describe the spread of the data.

4. How does the shape of a distribution impact the analysis of a variable?

The shape of a distribution can provide valuable insights into the data. A symmetrical distribution, such as a normal distribution, indicates that the data is evenly spread out around the mean. A skewed distribution, on the other hand, can suggest that the data is not evenly distributed and may require further investigation.

5. Can the distribution of a variable change over time?

Yes, the distribution of a variable can change over time. This can happen due to various factors such as changes in the underlying population, external events, or the collection method of the data. It is important to regularly analyze and monitor the distribution of a variable to identify any changes and make necessary adjustments in analysis or decision making.
