Why are normal distributions so frequent?

In summary: From this perspective, it becomes clear that as soon as we know anything about the distribution more than the mean and variance, using the normal curve would CERTAINLY be a mistake because you have to take advantage of all the information you have for probability to work.
  • #1
Why are there so many physical processes which are described (with more or less accuracy) by a normal distribution?
  • #3
A more nefarious reason: It's easy. The normal distribution is extremely amenable to analysis. People oftentimes use a normal distribution when they shouldn't be doing that. I myself have committed that statistical crime.
  • #4
D H said:
A more nefarious reason: It's easy. The normal distribution is extremely amenable to analysis. People oftentimes use a normal distribution when they shouldn't be doing that. I myself have committed that statistical crime.

Something short of a crime, though. A misdemeanor, at most, surely!
  • #5
The worst crime I've seen is when companies use it to rank employee performance. Sorry for the Dilbert moment. :)
  • #6
Filip Larsen said:
The short answer is, that it is due to the Central Limit Theorem [1].

[1] http://en.wikipedia.org/wiki/Central_limit_theorem

I'm reading up on the CLT and I'm getting more and more confused. I'm getting the idea that according to it every physical experiment would end up giving a normal distribution, but that is obviously false. Can someone clear my head?
  • #7
carllacan said:
I'm reading up on the CLT and I'm getting more and more confused. I'm getting the idea that according to it every physical experiment would end up giving a normal distribution, but that is obviously false. Can someone clear my head?

Not the experiment itself, but the average of many results of the experiment. Suppose the results of an experiment follow a probability distribution that satisfies fairly general conditions (finite mean and variance, etc.). When you take an average of many results of that experiment, the average will not exactly equal the mean, but the difference between the average and the mean will be close to normally distributed. The more data you average, the closer you can expect the average to be to the mean.
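To make the "average of many results" point concrete, here is a minimal simulation sketch. The choice of an exponential distribution for the single experiment, the sample sizes, and the helper names `sample_means` and `skewness` are all assumptions made for illustration; nothing here comes from the thread itself. A single exponential draw is strongly skewed, while the averages should look more and more symmetric as more draws go into each average.

```python
import random
import statistics


def sample_means(n_per_average, n_averages, rng):
    """Return n_averages averages, each taken over n_per_average exponential draws."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n_per_average))
        for _ in range(n_averages)
    ]


def skewness(values):
    """Sample skewness; it is 0 for a symmetric distribution such as the normal."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / spread) ** 3 for v in values)


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (1, 10, 100):
    means = sample_means(n, 5000, rng)
    # Columns: draws per average, mean of the averages, their spread, their skewness
    print(n, round(statistics.fmean(means), 3),
          round(statistics.pstdev(means), 3), round(skewness(means), 3))
```

The individual results (n = 1) stay skewed no matter how many you collect; only the averages drift toward a bell shape, with a spread that shrinks roughly like one over the square root of n.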

This also explains why so many random errors are assumed to be normally distributed. If the error of a result is the sum of many, many unknown factors, but you know that on average the error has a certain mean and variance, then a normal distribution is a natural choice. Common exceptions are when you know that the errors are never negative (Chi-squared, log normal, etc.), when the error tends to be proportional to the expected result, or when you know something about the frequency content of the random errors (white noise, pink noise, brown noise, etc).
  • #8
Another idea is that the normal distribution is the distribution that assumes the least information, given a specific mean and variance (maximum entropy).

Here's a silly example of where that could go wrong and also serves as a baby example of how that might work.

Say we are arguing over the existence of God. Because we are in complete ignorance and have no information about it, we should assume that it's 50/50 odds or probability 0.5 there is a God, probability 0.5 there isn't.

So, this is clearly nonsense, but there's a kind of logic to it. We really shouldn't make any assumption at all about the probabilities, in the absence of any information. However, distributing the probability evenly between the two possibilities assumes the least. So, if we are forced to make assumptions, the way to minimize what we are assuming is to distribute the probabilities evenly.
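The "distributing the probability evenly assumes the least" idea can be checked numerically for the two-outcome case. This is only a small sketch; the function name `binary_entropy` and the choice of bits as the unit are illustrative assumptions.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits, of a two-outcome distribution (p, 1 - p)."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# The entropy peaks at p = 0.5 and falls off symmetrically on both sides.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(binary_entropy(p), 4))
```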

So, that's entropy maximization with no constraint (uniform distribution). If you specify mean and variance and allow a continuous distribution and do an analogous thing, you get the normal curve. One way to think of the central limit theorem is that repeated trials destroy information when they are averaged because the peculiarities of one particular trial are averaged away and only the overall trend remains (it's not clear from this perspective that the maximum entropy, that is, the normal curve, is actually achieved and this is one of the subtleties that have to be addressed when actually proving the CLT, using this approach).
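As a sanity check (not a proof) of the claim that fixing mean and variance and maximizing entropy gives the normal curve, one can compare the differential entropies of a few familiar distributions that share the same variance. The sketch below uses the standard closed-form entropies of the normal, uniform and Laplace distributions; the function names and the choice of these particular competitors are assumptions for illustration.

```python
import math


def entropy_normal(sigma):
    """Differential entropy (nats) of a normal distribution with std dev sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)


def entropy_uniform(sigma):
    """Differential entropy of a uniform distribution with std dev sigma."""
    width = sigma * math.sqrt(12)  # a uniform of width w has variance w^2 / 12
    return math.log(width)


def entropy_laplace(sigma):
    """Differential entropy of a Laplace distribution with std dev sigma."""
    scale = sigma / math.sqrt(2)  # a Laplace with scale b has variance 2 b^2
    return 1 + math.log(2 * scale)


sigma = 1.0
for name, h in (("normal", entropy_normal(sigma)),
                ("uniform", entropy_uniform(sigma)),
                ("laplace", entropy_laplace(sigma))):
    print(name, round(h, 4))
```

Changing `sigma` shifts all three entropies by the same amount, so the ordering does not depend on the scale, which fits the remark about fixing the mean and variance by a change of scale.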

A sneaky trick here is that you can artificially fix the mean and variance to be whatever you want, say 0 and 1 by a change of scale, even if the distributions were not originally that way, and this is exploited in the central limit theorem.

What's the relation between probability and information?

Here's a video series that explains it.

Entropy maximization is a slightly broader reason than the central limit theorem.

From this perspective, it becomes clear that as soon as we know anything about the distribution more than the mean and variance, using the normal curve would CERTAINLY be a mistake because you have to take advantage of all the information you have for probability to work. The converse is not clear. If you only know the mean and variance and nothing else, it may or may not be a mistake to use the normal curve. The normal curve just minimizes the possible amount of mistake made, in some sense.
  • #9
Very interesting. And why is it that the normal distribution has the most entropy?
  • #10
carllacan said:
Very interesting. And why is it that the normal distribution has the most entropy?

I haven't figured out a much better answer than, "you do an ugly calculation and that's what happens," at the moment. I have a few thoughts, but they'd probably be sort of incoherent without a lot of work on my part, so I think I will pass on sharing them.
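For readers who want to see the calculation, here is one standard sketch, offered as an outline rather than as the poster's own argument. The symbols p, q, the multipliers \lambda_i, and the entropy functional h are notation introduced here. The first part finds the stationary density with Lagrange multipliers; the second part shows it is actually the maximum, using the fact that relative entropy is never negative.

```latex
% Maximize h[p] = -\int p \ln p \, dx subject to
%   \int p \, dx = 1, \quad \int x \, p \, dx = \mu, \quad \int (x-\mu)^2 p \, dx = \sigma^2 .
\[
\frac{\delta}{\delta p}\Big[ -\!\int p \ln p \, dx
  + \lambda_0 \!\int p \, dx
  + \lambda_1 \!\int x \, p \, dx
  + \lambda_2 \!\int (x-\mu)^2 p \, dx \Big] = 0
\;\Longrightarrow\;
\ln p(x) = \lambda_0 - 1 + \lambda_1 x + \lambda_2 (x-\mu)^2 .
\]
% Normalizability needs \lambda_2 < 0; the mean constraint forces \lambda_1 = 0;
% the variance constraint gives \lambda_2 = -1/(2\sigma^2); normalization fixes \lambda_0:
\[
p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\Big( -\frac{(x-\mu)^2}{2\sigma^2} \Big),
\qquad
h[p] = \tfrac{1}{2} \ln\!\big( 2\pi e \sigma^2 \big).
\]
% Why this is the maximum: for any density q with the same mean and variance,
% \ln p is quadratic in x, so \int q \ln p \, dx = \int p \ln p \, dx, and therefore
\[
h[q] = -\!\int q \ln q \, dx
     = -\!\int q \ln p \, dx - \int q \ln\frac{q}{p} \, dx
     \;\le\; -\!\int q \ln p \, dx
     = h[p],
\]
% because the relative entropy \int q \ln(q/p) \, dx is nonnegative.
```

The second step is what keeps the argument short: once the Gaussian form is guessed, only the first two moments of q ever enter the comparison.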
  • #11
I realized a minor correction is needed here. If you get information that is consistent with the normal, then the normal still might be the right choice. It's new contradictory information that you need to worry about. So, you don't really need the maximum entropy principle to say that you should correct for new information because that's just true without much further thought. The maximum entropy principle just underscores it. Also, you might have other information that is equivalent to knowing the mean and variance.
  • #12
Personally, I like the way that Taylor (Intro to Error Analysis) explains it.

Consider a quantity that has a true value u that you want to make a measurement of.

If you have only a single source of error with magnitude E, and no bias in your measurements, you'll measure values of u + E and u - E with equal probability.

If you have 2 sources of error with magnitude E, and no bias in your measurements, you'll measure values of:
u - 2E (0.25)
u (0.50)
u + 2E (0.25)
The quantities in the brackets are the probabilities.

Extend the argument now to N sources of error. Your possible measurements will have values between u - NE and u + NE. Binary outcomes like this follow a binomial distribution, which would allow you to calculate the probability of any result in between these two values.

Then you just consider the limit as your number of sources of error N approaches infinity and the magnitude of your error E approaches zero. If you plot it out, you can see that you're approaching a normal distribution.

So a normal distribution results from any situation that's subject to a large number of very small, random variations. In that sense, it's not surprising that the normal distribution is so common.
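The limiting argument above can be tried out directly. The sketch below works in units of E, so a net error of (2k - N)E is stored as the integer 2k - N; the function names and the choice N = 100 are illustrative assumptions, not taken from Taylor's book. It compares the exact binomial probabilities with a normal density of variance N (in units of E squared) multiplied by the spacing 2 between neighbouring outcomes.

```python
import math


def error_distribution(n_sources):
    """Probability of each net error (in units of E) from n_sources +/-E errors."""
    return {
        2 * k - n_sources: math.comb(n_sources, k) / 2 ** n_sources
        for k in range(n_sources + 1)
    }


def normal_approximation(offset, n_sources):
    """Normal density with variance n_sources, times the outcome spacing of 2."""
    sigma = math.sqrt(n_sources)
    density = math.exp(-offset ** 2 / (2 * n_sources)) / (sigma * math.sqrt(2 * math.pi))
    return 2 * density


print(error_distribution(2))  # the two-source case from the post

n = 100
exact = error_distribution(n)
worst = max(abs(p - normal_approximation(x, n)) for x, p in exact.items())
print("largest gap between binomial and normal for N =", n, "is", worst)
```

Raising `n` should shrink the largest gap further, which is the "plot it out" step done with numbers instead of a graph.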
  • #13
carllacan said:
And why is it that the normal distribution has the most entropy?

"Normal distribution" refers to a family of distributions and not all members of that family have the same entropy. It isn't clear what you mean by "the" normal distribution.
  • #14
Stephen Tashi said:
"Normal distribution" refers to a family of distributions and not all members of that family have the same entropy. It isn't clear what you mean by "the" normal distribution.

In holeomorphic's post, a mean and variance are given. So the normal distribution is uniquely determined.
  • #15
Is there a "proof" that the normal distribution is, in fact, the binomial distribution as n approaches infinity?

If so, that would explain a lot.
  • #16
carllacan said:
Why are there so many physical processes which are described (with more or less accuracy) by a normal distribution?

The normal distribution results whenever a large number of small, independent factors are summed up. That's fairly common. It often shows up with measurements because the large factors have already been dealt with.

It is "robust" in that deviations from the assumptions often don't make that much difference.

In real life there is seldom an exact match to the normal. Usually the differences show up in the tails of the distributions. This is one reason that confidence intervals are commonly set at 95%, as this avoids the far tails.
  • #17
Is there a "proof" that the normal distribution is, in fact, the binomial distribution as n approaches infinity?

If so, that would explain a lot.

Of course there is. That's just a special case of what we've been talking about because a binomial distribution comes from a repeated experiment, involving 2 possible outcomes, usually called "success" and "failure".

I think there are three ways to prove it that I know of in the case of the binomial distribution. One (historically, the first) involves approximating the binomial coefficients, using Stirling's approximation. A shorter non-rigorous version of this, not involving Stirling's approximation, is more insightful, and to my mind, Stirling's approximation should really be thought of as a consequence of the central limit theorem, applied to i.i.d. Poisson random variables, since the algebraic proof of it is extremely ugly.

The shorter, heuristic version is explained here:


There is also a proof using entropy, as I've mentioned, and finally, there's also a proof using characteristic functions, which are basically Fourier transforms.

The first proof I mentioned is only for the binomial, the other two prove the central limit theorem in general.
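For reference, the characteristic-function route can be outlined in a few lines. This is a sketch under stated assumptions, not a full proof: the X_i are taken to be i.i.d. with mean 0 and variance 1 (which the change of scale mentioned earlier arranges), and the symbols \varphi_X and Z_n are notation introduced here.

```latex
% Z_n is the rescaled sum of n i.i.d. variables with mean 0 and variance 1.
\[
Z_n = \frac{X_1 + \dots + X_n}{\sqrt{n}},
\qquad
\varphi_{Z_n}(t) = \mathbb{E}\big[ e^{i t Z_n} \big]
                 = \Big[ \varphi_X\!\Big( \frac{t}{\sqrt{n}} \Big) \Big]^{n} .
\]
% Finite variance gives the second-order expansion \varphi_X(s) = 1 - s^2/2 + o(s^2), so
\[
\varphi_{Z_n}(t) = \Big[ 1 - \frac{t^2}{2n} + o\!\Big( \frac{1}{n} \Big) \Big]^{n}
\;\longrightarrow\; e^{-t^2/2}
\qquad (n \to \infty),
\]
% which is the characteristic function of the standard normal distribution.
% Levy's continuity theorem then turns this pointwise convergence into
% convergence in distribution.
% Binomial special case with steps X_i = \pm 1, each with probability 1/2:
\[
\varphi_X(s) = \cos s,
\qquad
\cos^{n}\!\Big( \frac{t}{\sqrt{n}} \Big) \;\longrightarrow\; e^{-t^2/2} .
\]
```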

Related to Why are normal distributions so frequent?

1. Why are normal distributions so frequently used in statistics?

Normal distributions, also known as Gaussian distributions, are used frequently in statistics because they describe many real-world phenomena reasonably well. This is due to the central limit theorem, which states that the suitably normalized sum of a large number of independent and identically distributed random variables with finite variance tends towards a normal distribution.

2. What makes normal distributions so special?

One of the main reasons normal distributions are considered special is because they have a symmetric bell-shaped curve, with the mean, median, and mode all being equal. This makes them easy to understand and work with in statistical analyses.

3. How does the normal distribution relate to the concept of "average"?

The normal distribution is often used to represent the concept of "average" because it is the most common distribution for values to fall around a central value. This central value is represented by the mean of the distribution, making it a useful measure of central tendency.

4. Are all data points in a normal distribution truly "normal"?

"Normal" describes the distribution as a whole, not individual data points. Some data points may deviate significantly from the mean, especially in larger datasets. However, the majority of the data will still fall within a few standard deviations of the mean (about 99.7% within three for an exactly normal distribution).

5. Can a dataset have more than one normal distribution?

Yes, a dataset can be a mixture of more than one normal distribution. This typically occurs when there are distinct subgroups within the dataset that follow different normal distributions. In this case, it may be more appropriate to analyze each subgroup separately to get a more accurate representation of the data.
