Welcome to our community

Be a part of something great, join today!

Monte Carlo method to generate standard normal

  • Thread starter
  • Admin
  • #1

Jameson

Administrator
Staff member
Jan 26, 2012
4,052
Problem:
I need to use the Monte Carlo method to generate the mean, variance and kurtosis of the standard normal distribution. It has to be coded in Matlab, so there are two parts to this question:

1) The theory
2) The code

Any help on either is appreciated.

My attempt:
To find $E[X]$ for a random variable we can use the definition \(\displaystyle E[X]=\int_{-\infty}^{\infty}x*f(x)dx\), where $f(x)$ is the distribution's pdf.

I believe the first step is to general a random sample to use. Now my thought is to use random standard uniform variables on $[0,1]$ but I'm not sure.

Once we do that we find the mean for each $X$ and average them for n samples. It should converge to 0.

There are clearly some mistakes here in the set up so any guidance or comments is appreciated.
 
  • Thread starter
  • Admin
  • #2

Jameson

Administrator
Staff member
Jan 26, 2012
4,052
Oh man was I way off. :eek:

I need to simply generate a large amount of standard normal random variables and then take their average to estimate $E[X]$.

Here's the Matlab code I did for this in case anyone is interested:

Code:
clear all;

x=randn(10000,1);

for i=1:10000
    m(i,1)=sum(x(1:i))/i;
end;

figure(1)
plot (1:10000,m(1:10000,:))
Now on to the variance and kurtosis!
 

zzephod

Well-known member
Feb 3, 2013
134
Oh man was I way off. :eek:

I need to simply generate a large amount of standard normal random variables and then take their average to estimate $E[X]$.

Here's the Matlab code I did for this in case anyone is interested:

Code:
clear all;

x=randn(10000,1);

for i=1:10000
    m(i,1)=sum(x(1:i))/i;
end;

figure(1)
plot (1:10000,m(1:10000,:))
Now on to the variance and kurtosis!
Code that generalises could be something like:

Code:
N=100;M=20;
x=randn(N,M);
mN=mean(x);  %takes means of the columns of x, leaves a sample of M means of N standard normals
GrandMean=mean(mN)
Code:
N=100;M=20;
x=randn(N,M);
vN=var(x,0);  %takes var of the columns of x, leaves a sample of M bias corrected vars of N standard normals
GrandVar=mean(vN)
Code:
N=100;M=20;
x=randn(N,M);
kN=kurtosis(x,0);  %takes kurts of the columns of x, leaves a sample of M bias 
                   %corrected kurts of N standard normals
GrandKur=mean(kN)
If you just want the statistic from a large sample set M=1.

If you are interested in the sampling distribution of your statistics look at the histograms of mN, vN and kN.

You might also be interested in the un-corrected stats in which case IIRC use var(x,1) and kurtosis(x,1)

.
 
Last edited:
  • Thread starter
  • Admin
  • #4

Jameson

Administrator
Staff member
Jan 26, 2012
4,052
For some reason I have a tendency to write things as one long column or row vector, so I'll make something that's 100x1 instead of 10x10. Sometimes the 10x10 configuration seems to have benefits, like you're pointing out.

Here is the code I just finished. I'm not quite satisfied because I had to cheat a little bit to get the kurtosis and use the fact that I know the mean is 0 and the S.D. is 1, instead of using the estimators I found.

Code:
clear all;
x=randn(10000,1);

for i=1:10000
    %mean
    m(i,1)=sum(x(1:i))/i;
    varianceofm(i,1)=var(m(1:i));

    %variance
    sse(i,1)=(x(i)-m(i))^2;
    v(i,1)=(1/(i))*sum(sse(1:i));
    varianceofv(i,1)=var(v(1:i));

    %kurtosis
    k(i,1)=(1/i)*sum(x(1:i).^4);
    varianceofk(i,1)=var(k(1:i));
end;
 

Bacterius

Well-known member
MHB Math Helper
Jan 26, 2012
644
You can also use a quasirandom (low discrepancy) number sequence to converge faster and require less samples, though it's not that big a deal for a 1D integral, but it's good to know since most numerical computing languages have something like that built-in as a drop-in replacement for e.g. randn().

Uniform random sequences converge at a rate of $O(\sqrt{n})$ for $n$ samples, whereas low-discrepancy sequences converge at a rate of $O(n)$, i.e. after $n$ samples the error will be on the order of $\frac{1}{n}$ compared to $\frac{1}{\sqrt{n}}$. The catch is that you need to pay attention to how you generate those sequences, as they may be correlated depending on the algorithm used (especially in high dimensions e.g. > 10, where it just gets messy) but the language will take care of that for you.