Bivariate Normal Distribution, contour ellipse containing given % samples?

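In summary, the thread asks for the density value p such that the contour ellipse of all points (x,y) with pdf(x,y) = p contains a given % of the samples drawn from a bivariate normal distribution. For a multivariate normal, the region containing a given probability is the ellipsoid on which the squared Mahalanobis distance stays below the matching chi-square quantile.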
  • #1
codiloo
Given a bivariate Gaussian distribution,
I'm attempting to find the density value p for which
the ellipse of all points (x,y) with pdf(x,y) = p contains
a given % of the samples drawn from the distribution.

I want the 2D equivalent of the 1-dimensional case.
Given a normal distribution N(0,1):
e.g. the interval between the points with p = 0.24197072 contains 68.3% of all samples
e.g. the interval between the points with p = 0.05399097 contains 95.4% of all samples
e.g. the interval between the points with p = 0.00443185 contains 99.7% of all samples
In two dimensions these interval boundaries become an ellipse, and I'm interested in finding the p value corresponding to a given % of samples contained in the contour ellipse.
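For reference, the 1-D numbers can be reproduced with a short standard-library Python sketch. The helper name `interval_for_density` is made up for illustration; it returns the density at ±z and the probability mass between -z and +z for N(0,1).

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def interval_for_density(z):
    """Return (density at +/- z, probability mass between -z and +z) for N(0, 1)."""
    density = std_normal.pdf(z)
    coverage = std_normal.cdf(z) - std_normal.cdf(-z)
    return density, coverage

# Density level and coverage at 1, 2 and 3 standard deviations.
for z in (1, 2, 3):
    density, coverage = interval_for_density(z)
    print(f"z = +/-{z}: density = {density:.8f}, coverage = {coverage:.2%}")
```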

Some extra info:
A numerical approximation in MATLAB or Python (using NumPy, SciPy?) is OK; I don't need an analytic formula.

Actually I just want to draw the ellipses containing 75%, 95%, 99% of the samples in Python (using matplotlib) for a given Gaussian distribution (varying mean & covariance). I know how to do this if I obtain p first (contour plots).

Thank you for reading my question and I hope you can help.
 
  • #2
Hi codiloo,

For a multivariate normal, the hyper-ellipsoid that contains probability p is given by

[tex](x-\mu)^T \Sigma^{-1} (x-\mu) \le \chi^2_k(p)[/tex]

where x is a k-dimensional vector, [itex]\mu[/itex] is the k-dimensional mean vector, [itex]\Sigma[/itex] is the variance-covariance matrix and [itex]\chi^2_k(p)[/itex] is the p quantile of the chi-square distribution with k degrees of freedom.

When k = 2 the expression describes the region enclosed by the ellipse you are asking for, and [itex]\chi^2_2[/itex] is an exponential distribution with mean 2, so its p quantile has the closed form [itex]\chi^2_2(p) = -2\ln(1-p)[/itex].
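A minimal Python sketch of how this could be used for the 75%, 95% and 99% ellipses, assuming k = 2 so that the chi-square quantile is -2 ln(1 - p). The function names `contour_level` and `ellipse_points` and the example mean and covariance are illustrative assumptions, not something from the thread. Only the standard library is used, and the boundary points are generated by mapping a circle through the Cholesky factor of the covariance.

```python
import math

def contour_level(coverage, cov):
    """Squared Mahalanobis radius c and density value p of the contour
    ellipse that holds a fraction `coverage` of a bivariate normal."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    c = -2.0 * math.log(1.0 - coverage)      # chi-square quantile, 2 d.o.f.
    p = math.exp(-0.5 * c) / (2.0 * math.pi * math.sqrt(det))
    return c, p

def ellipse_points(mean, cov, coverage, n=200):
    """Points on the boundary (x - mu)^T Sigma^{-1} (x - mu) = c."""
    (sxx, sxy), (_, syy) = cov
    c, _ = contour_level(coverage, cov)
    # Cholesky factor L of the 2x2 covariance, Sigma = L L^T.
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 * l21)
    r = math.sqrt(c)
    pts = []
    for i in range(n + 1):
        t = 2.0 * math.pi * i / n
        u, v = r * math.cos(t), r * math.sin(t)   # circle of radius sqrt(c)
        pts.append((mean[0] + l11 * u, mean[1] + l21 * u + l22 * v))
    return pts

# Hypothetical example distribution.
mean = (1.0, -2.0)
cov = ((2.0, 0.6), (0.6, 1.0))
for q in (0.75, 0.95, 0.99):
    c, p = contour_level(q, cov)
    xs, ys = zip(*ellipse_points(mean, cov, q))
    print(f"{q:.0%}: chi-square threshold = {c:.4f}, density level p = {p:.6f}")
```

The `xs` and `ys` sequences can be passed to `matplotlib.pyplot.plot(xs, ys)` to draw each ellipse, or the density level `p` can be used as a contour level in a contour plot.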
 
  • #3
After a huge calculation involving rotating co-ordinates I ended up with
P[(x, y) lies inside the contour pdf(x,y) = k] = 1 - 2πkD, where D is the square root of the determinant of the covariance matrix, i.e. D = √(σ₁²σ₂² - ρ⁴), reading ρ² as the covariance of x and y.
Note e.g. that the peak pdf value is 1/(2πD).
If it's right, there must be an easier way.
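One way to sanity-check the 1 - 2πkD formula is a small Monte Carlo experiment. The sketch below is an illustration under its own assumptions: the distribution has zero mean, it is parametrised by standard deviations `sx`, `sy` and a correlation coefficient `rho` (not the ρ of the formula above), and the function name `fraction_inside` is made up. With this parametrisation D = sx·sy·√(1 - rho²).

```python
import math
import random

def fraction_inside(k, sx, sy, rho, n=100_000, seed=0):
    """Monte Carlo estimate of P[pdf(X, Y) >= k] for a zero-mean bivariate
    normal, returned together with the closed form 1 - 2*pi*k*D."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    d = sx * sy * s                       # square root of det(covariance)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = sx * z1                       # correlated sample (x, y)
        y = sy * (rho * z1 + s * z2)
        q = (x * x / (sx * sx) - 2.0 * rho * x * y / (sx * sy)
             + y * y / (sy * sy)) / (1.0 - rho * rho)
        pdf = math.exp(-0.5 * q) / (2.0 * math.pi * d)
        hits += pdf >= k                  # sample lies inside the contour
    return hits / n, 1.0 - 2.0 * math.pi * k * d

estimate, closed_form = fraction_inside(k=0.05, sx=1.5, sy=0.8, rho=0.4)
print(f"simulated = {estimate:.4f}, 1 - 2*pi*k*D = {closed_form:.4f}")
```

The formula only applies for k between 0 and the peak value 1/(2πD); above the peak the contour is empty.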
 

Related to Bivariate Normal Distribution, contour ellipse containing given % samples?

1. What is a bivariate normal distribution?

A bivariate normal distribution is a probability distribution that describes the relationship between two continuous variables. It is often used in statistics to model the joint distribution of two random variables.

2. How is a bivariate normal distribution different from a univariate normal distribution?

A univariate normal distribution only describes the distribution of one variable, while a bivariate normal distribution describes the relationship between two variables. This means that a bivariate normal distribution has two means, two variances, and a correlation coefficient, while a univariate normal distribution only has one mean and one variance.

3. What is the significance of the contour ellipse in a bivariate normal distribution?

The contour ellipse in a bivariate normal distribution is a curve of constant probability density. The region it encloses contains a fixed percentage of the samples, which makes it a useful tool for understanding the spread of the data and identifying outliers.

4. How is the given percentage of samples determined for the contour ellipse in a bivariate normal distribution?

The percentage is chosen by the researcher, typically as a confidence (coverage) level. For example, a 95% level gives a contour ellipse containing 95% of the samples; the matching threshold on the squared Mahalanobis distance is the 0.95 quantile of the chi-square distribution with 2 degrees of freedom.

5. What factors can affect the shape and size of the contour ellipse in a bivariate normal distribution?

The shape and size of the contour ellipse are set by the variances and the correlation coefficient of the two variables; the means only set its center. Larger variances make the ellipse larger along the corresponding axes. The correlation coefficient controls the tilt and elongation: with zero correlation the axes of the ellipse line up with the coordinate axes (and the ellipse is a circle if the variances are equal), and as the correlation approaches ±1 the ellipse becomes narrower and more elongated.
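The effect of the variances and the correlation can be checked from the eigenvalues of the 2x2 covariance matrix. The sketch below is illustrative only: the function name `ellipse_axes` and the parameter values are assumptions, and the size factor uses the closed form -2 ln(1 - coverage) for the chi-square quantile with 2 degrees of freedom.

```python
import math

def ellipse_axes(sx, sy, rho, coverage=0.95):
    """Semi-major axis, semi-minor axis and tilt angle (radians) of the
    ellipse containing `coverage` of a bivariate normal."""
    sxx, syy, sxy = sx * sx, sy * sy, rho * sx * sy
    # Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = 0.5 * (sxx + syy)
    delta = math.hypot(0.5 * (sxx - syy), sxy)
    lam_max, lam_min = half_trace + delta, half_trace - delta
    c = -2.0 * math.log(1.0 - coverage)   # chi-square quantile, 2 d.o.f.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)   # direction of major axis
    return math.sqrt(c * lam_max), math.sqrt(c * lam_min), angle

# Equal variances: the axis ratio grows as the correlation increases.
for rho in (0.0, 0.5, 0.9):
    a, b, angle = ellipse_axes(1.0, 1.0, rho)
    print(f"rho = {rho}: axis ratio = {a / b:.3f}, tilt = {math.degrees(angle):.1f} deg")
```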
