Testing how random my sample is

In summary, the thread discusses how to test a sample of natural numbers for randomness. The poster asks which test to perform, first considers fitting a Gaussian and comparing residuals, and then discusses normality checks such as the Shapiro-Wilk test as well as the runs test. Maxima and R are mentioned as software for these tests. The thread ends with the poster thanking the respondents and planning to try the runs test the next day.
  • #1
fluidistic
Hello guys,
I have a sample of about 400 natural numbers, though I can get more. To give you an idea, the mean and the standard deviation are 29038031 and 1842882 respectively, and I expect the numbers to follow a Gaussian distribution. I'd like to perform a test that tells me the probability that my sample is truly random. I just don't know which test to perform. I've read about the diehard tests, but I don't see how I could apply them.

So I'd like to hear some suggestions. Thanks!

Edit: 1st idea that I have: get more numbers. Then perform a Gaussian fit and calculate the residuals. Do the same for true random numbers following a Gaussian with the same mean and standard deviation and compare the residuals. I expect lower residuals with the true random numbers.

Edit 2: Never mind, this idea would be useless. It would tell me how far from a Gaussian my distribution of numbers is, not how random they are...
 
  • #2
A simple and intuitive way is to plot your histogrammed data on Gaussian "probability paper," so named from the days when plots were made on actual graph paper. You'll see visually how close to a Gaussian you are.
http://en.wikipedia.org/wiki/Normal_probability_plot

Other simple, but quantitative, tests include:
a) Compute the skew and kurtosis and see how close to Gaussian they are (normal values are 0 and 3).
b) The mean, mode and median are all equal for a normal distribution.

More sophisticated tests abound. See, e.g.,
http://en.wikipedia.org/wiki/Normality_test
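
The quantitative checks in a) and b) can be sketched in plain Python. This is only an illustration under stated assumptions: the `sample` list is stand-in data drawn from a Gaussian with the mean and standard deviation quoted in post #1, not the poster's actual numbers, and the helper names `skewness` and `kurtosis` are hypothetical.

```python
import random
import statistics


def skewness(xs):
    """Third standardized moment; 0 for a Gaussian."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


def kurtosis(xs):
    """Fourth standardized moment; 3 for a Gaussian."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 4 for x in xs) / len(xs)


# Stand-in data (assumption): Gaussian with the mean and sd from post #1.
rng = random.Random(0)
sample = [rng.gauss(29038031, 1842882) for _ in range(400)]

print("skewness:", skewness(sample))
print("kurtosis:", kurtosis(sample))
# For a Gaussian the mean and median coincide, so this should be near 0.
offset = statistics.fmean(sample) - statistics.median(sample)
print("(mean - median) / sd:", offset / statistics.pstdev(sample))
```

With about 400 points, the sampling scatter is roughly 0.12 for the skewness and 0.25 for the kurtosis, so small deviations from 0 and 3 are expected even for genuinely Gaussian data.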
 
  • #3
marcusl said:
A simple and intuitive way is to plot your histogrammed data on Gaussian "probability paper," so named from the days when plots were made on actual graph paper. You'll see visually how close to a Gaussian you are.
http://en.wikipedia.org/wiki/Normal_probability_plot

Other simple, but quantitative, tests include:
a) Compute the skew and kurtosis and see how close to Gaussian they are (normal values are 0 and 3).
b) The mean, mode and median are all equal for a normal distribution.

More sophisticated tests abound. See, e.g.,
http://en.wikipedia.org/wiki/Normality_test
I've just used the software Maxima to run a Shapiro-Wilk test and check whether my data follow a Gaussian, and the result is consistent with that: it returned a W statistic of over 0.99 with a p-value near 0.27, so normality is not rejected.
The thing is that I am not sure that this is telling me anything about the randomness of my numbers which is what I'm looking for.
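
The concern raised here can be made concrete with a small sketch: a normality test looks only at the values, not at their order. Assuming stand-in Gaussian data (not the poster's numbers), sorting the sample leaves the histogram, and therefore any normality test, unchanged, yet the sequence is clearly no longer random in order. The lag-1 autocorrelation used below is just one simple order-sensitive statistic, chosen here for illustration.

```python
import random
import statistics


def lag1_autocorrelation(xs):
    """Lag-1 sample autocorrelation; close to 0 for an independent sequence."""
    mean = statistics.fmean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


# Stand-in data (assumption): Gaussian with the mean and sd from post #1.
rng = random.Random(1)
shuffled = [rng.gauss(29038031, 1842882) for _ in range(400)]
ordered = sorted(shuffled)  # same values, hence the same histogram

print("independent order:", lag1_autocorrelation(shuffled))
print("sorted order:     ", lag1_autocorrelation(ordered))
```

Both lists would pass or fail a normality test identically, which is why a test that depends on the order of the observations is needed to address randomness.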
 
  • #4
  • #5
WWGD said:
I am not sure I understood; do you know that the original distribution is normal and then you want to know if the sample is random? Have you tried the runs test?

https://home.ubalt.edu/ntsbarsh/business-stat/opre504.htm#rrunstest

And a good thing is that the test is non-parametric.
Thanks a lot! That's exactly what I was looking for; I'll try it tomorrow.
Meanwhile I tried a very similar test in R, and the result gave no evidence against randomness: the p-value was over 0.4, where the null hypothesis is that the data are random and the alternative hypothesis is non-randomness in the data.
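
For reference, a runs test of the kind discussed above can be sketched in plain Python. This is a minimal version of the Wald-Wolfowitz runs test about the median, using the normal approximation for the p-value; the function name `runs_test` and the stand-in data are assumptions for illustration, and this is not necessarily the variant implemented by the R routine the poster used.

```python
import math
import random
import statistics


def runs_test(xs):
    """Wald-Wolfowitz runs test about the median.

    Returns (number of runs, z statistic, two-sided p-value), with the
    p-value taken from the normal approximation.
    """
    median = statistics.median(xs)
    # True if above the median, False if below; ties with the median are dropped.
    signs = [x > median for x in xs if x != median]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2
    # A new run starts at every change of sign along the sequence.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return runs, z, p_value


# Stand-in data (assumption): Gaussian with the mean and sd from post #1.
rng = random.Random(3)
sample = [rng.gauss(29038031, 1842882) for _ in range(400)]
print("random order:", runs_test(sample))
print("sorted order:", runs_test(sorted(sample)))
```

Too few runs (as in the sorted case) points to trends or clustering, and too many runs points to alternation; a large p-value only means the test found no evidence against randomness.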
 
  • #6
Glad I could help.
 

Related to testing how random my sample is

1. What is the purpose of testing how random my sample is?

Testing how random a sample is helps determine whether the data show non-random structure, such as trends or clustering. This matters because randomness is a key assumption in many scientific experiments, and violations can affect the validity and reliability of the results.

2. How is randomness measured in a sample?

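Randomness is assessed with statistical tests rather than measured directly. The runs test checks whether the order of the values is consistent with independence, while the chi-square and Kolmogorov-Smirnov tests check whether the values match an expected distribution. Each test compares a statistic computed from the sample with what would be expected if the data were truly random.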

3. What factors can affect the level of randomness in a sample?

The level of randomness in a sample can be affected by a variety of factors, such as the sampling method used, the size of the sample, and the nature of the data being collected. Other factors such as human error or bias can also impact the randomness of a sample.

4. Is it necessary to have a completely random sample for accurate results?

It is not always necessary to have a completely random sample for accurate results, as the level of randomness needed will depend on the specific research question and the type of analysis being conducted. However, a more random sample can increase the reliability and generalizability of the results.

5. What can be done to increase the randomness of a sample?

To increase the randomness of a sample, researchers can use random sampling techniques such as simple random sampling, stratified random sampling, or cluster sampling. Other methods, such as randomizing the order of data collection or using random number generators, can also help increase the level of randomness in a sample.
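
Two of the sampling techniques named above can be sketched with Python's standard library. The population of unit IDs, the stratum rule `i % 4`, and the function names are all hypothetical, chosen only to keep the example self-contained.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, rng):
    """Draw k items without replacement; every subset of size k is equally likely."""
    return rng.sample(population, k)


def stratified_sample(population, key, fraction, rng):
    """Draw the same fraction from every stratum defined by key(item)."""
    strata = defaultdict(list)
    for item in population:
        strata[key(item)].append(item)
    chosen = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one item per stratum
        chosen.extend(rng.sample(members, k))
    return chosen


rng = random.Random(2)
population = list(range(1000))  # hypothetical unit IDs
srs = simple_random_sample(population, 100, rng)
strat = stratified_sample(population, key=lambda i: i % 4, fraction=0.1, rng=rng)
print(len(srs), len(strat))
```

Stratified sampling guarantees that each stratum is represented in proportion, whereas simple random sampling only achieves this on average.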
