Calculating noise in a data sample - what region to use

In summary, the requestor is asking about calculating the standard error of a data set, and whether it makes sense to use this value to determine error in the data. They also ask about the noise in the data, and whether it can be characterized.
  • #1
tappy
I have a data set of the number of counts vs. the position where the counts were detected, and I want to find the noise in the sample. Am I right to think that 'noise' here means the standard error (SE = stdev/sqrt(N)), where N is the number of x-axis points?
Also, if the above is true, should the SE be calculated over the whole length of the data? The data has almost zero (noisy) counts along the first half of the line profile (as expected), and the signal kicks in later. Should the SE be calculated separately on the noisy baseline at the start and the real signal near the end, or over the whole data set?
Thanks in advance
 
  • #2
Hi tappy, :welcome:

A bit more context, please: what's this about, and can you post a picture of the measurements?
For some, noise is the background level; for others it's the sigma in that level, etc.
 
  • #3
[Attached image: al at% vs wt %.png]

The graph shows counts along a line profile.
I have calculated the stdev and SE over the whole profile, over the part with signal, and over the part without signal. What bothers me is whether it makes sense to use the SE over the whole profile as the error in the data, since the SE in the part with signal and in the part without signal is so much smaller in comparison.
Over the whole profile (N = 192), stdev is 17.7 and SE is 1.3; in the area with signal, stdev is 4.0 and SE is 0.3; in the area without signal, stdev is 0.8 and SE is < 0.1.
Should I use the same SE over the whole data range, or use a separate SE for each region?
Thanks.
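To make the comparison above concrete, here is a minimal sketch of computing stdev and SE per region. The profile is synthetic (a hypothetical step from about 1 count to about 40 counts with a ramp between positions 90 and 120, plus Poisson noise); it is not the data from the attached graph, and the helper name `std_and_se` is made up for illustration.

```python
import numpy as np

def std_and_se(values):
    """Return (sample standard deviation, standard error of the mean)."""
    values = np.asarray(values, dtype=float)
    sd = values.std(ddof=1)               # ddof=1: sample standard deviation
    return sd, sd / np.sqrt(values.size)  # SE = stdev / sqrt(N)

rng = np.random.default_rng(0)
position = np.arange(192)

# Hypothetical true level: flat, then a ramp between 90 and 120, then flat.
true_level = np.interp(position, [0, 90, 120, 191], [1.0, 1.0, 40.0, 40.0])
counts = rng.poisson(true_level)

regions = {
    "whole profile": np.ones_like(position, dtype=bool),
    "no signal (< 90)": position < 90,
    "signal (> 120)": position > 120,
}

results = {name: std_and_se(counts[mask]) for name, mask in regions.items()}
for name, (sd, se) in results.items():
    n = int(regions[name].sum())
    print(f"{name:18s} N={n:3d}  stdev={sd:6.2f}  SE={se:5.2f}")
```

In a sketch like this, the whole-profile stdev is dominated by the step between the two levels rather than by point-to-point scatter, which is why it comes out much larger than either per-region value.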
 
  • #4
Apparently there is some transition between 90 and 120, so I'd leave that part out of the analysis. If you expect the signal to be constant below 90 and above 120, you have four average values, each with a stdev and an SE.
tappy said:
I want to find the noise in the sample
The noise looks pretty Poisson-like, so it seems reasonable to average a number of channels. Seems to me the noise in the sample as such isn't all that interesting. Don't you process the result and come up with one answer, like Al mass % minus Al at % for > 120 = so-and-so? And you want the SE in that number?

Or is it two answers, like: blue > 120 minus blue < 90 +/- ... and Orange > 120 minus orange < 90 +/- ...
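For the "two answers" form, one common way to attach an error to a difference of two region averages is to add the two SEs in quadrature. The sketch below assumes the two regions are independent and that the level is constant within each region; the function name `step_height` and the synthetic data are hypothetical, while the cut positions 90 and 120 are taken from the post above.

```python
import numpy as np

def mean_and_se(values):
    """Return (mean, standard error of the mean) for one region."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(values.size)

def step_height(position, counts, low_edge=90, high_edge=120):
    """Mean above high_edge minus mean below low_edge, with its SE.

    The transition region between the two edges is left out. The SE of
    the difference assumes the two regions are statistically independent.
    """
    position = np.asarray(position)
    counts = np.asarray(counts, dtype=float)
    low_mean, low_se = mean_and_se(counts[position < low_edge])
    high_mean, high_se = mean_and_se(counts[position > high_edge])
    diff = high_mean - low_mean
    diff_se = np.hypot(high_se, low_se)  # sqrt(high_se**2 + low_se**2)
    return diff, diff_se

# Synthetic stand-in for one trace (not the data from the thread).
rng = np.random.default_rng(1)
position = np.arange(192)
counts = np.where(position < 105, rng.poisson(1.0, 192), rng.poisson(40.0, 192))

diff, diff_se = step_height(position, counts)
print(f"step height = {diff:.2f} +/- {diff_se:.2f}")
```

Running this once per trace (blue and orange) would give the two answers, each with its own uncertainty.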
 
  • #5
Great, thanks for that. Leaving out the data between 90 and 120 makes sense, but it's good to hear it from someone else. Indeed, the noise isn't that interesting in this case; what got me worried was how the error should be presented. Your suggestion of four average values is a good one, and it is how I will proceed.

Thanks for the help.
 
  • #6
So generally, if you take a measurement, even if you measure the exact same thing repeatedly, you don't get the same result. The measured value is therefore considered a random variable, X.

Generally, you like to consider the measurement to be some true value plus some noise, ##X=x+\epsilon##, where x is not a random variable, it is the true value being measured, and ##\epsilon## is a random variable called the noise. So finding the noise means to characterize that random variable, ##\epsilon##.

If your noise is unbiased and Gaussian, then ##\epsilon \sim N(0,\sigma)##, so the noise can be characterized with a single number, the standard deviation ##\sigma##. Your noise does not look so simple, so it may take more effort to characterize it. It may be Poisson distributed.
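As an illustration of the two cases, the sketch below simulates repeated measurements of one fixed true value. The true value, the sigma, and the sample size are arbitrary choices, not values from the thread. For Gaussian noise the sample standard deviation estimates ##\sigma##; for Poisson counts a quick sanity check is that the variance is close to the mean.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000          # number of repeated measurements (arbitrary)
true_value = 40.0   # x, the fixed quantity being measured (arbitrary)

# Gaussian case: X = x + eps, with eps drawn from N(0, sigma).
sigma = 4.0
gaussian_measurements = true_value + rng.normal(0.0, sigma, n)
sigma_hat = gaussian_measurements.std(ddof=1)  # estimate of sigma

# Poisson case: counts with mean x; the variance should be close to the mean.
poisson_measurements = rng.poisson(true_value, n)
variance_to_mean = poisson_measurements.var(ddof=1) / poisson_measurements.mean()

print(f"Gaussian: estimated sigma = {sigma_hat:.2f} (simulated with {sigma})")
print(f"Poisson:  variance / mean = {variance_to_mean:.2f} (about 1 expected)")
```

A variance-to-mean ratio far from 1 in a flat region of real data would suggest the counts are not purely Poisson, for example because of extra instrumental scatter.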
 

Related to Calculating noise in a data sample - what region to use

1. What is the purpose of calculating noise in a data sample?

The purpose of calculating noise in a data sample is to measure the amount of random variation or uncertainty present in the data. This can help in understanding the reliability and accuracy of the data and can also aid in identifying and removing any outliers or errors.

2. How is noise calculated in a data sample?

Noise in a data sample is typically estimated as the standard deviation of the sample, taken over a region where the underlying signal is expected to be constant. This measures how much the data points vary around the mean. A higher standard deviation indicates a higher level of noise in the data sample.

3. What is the significance of choosing the right region when calculating noise in a data sample?

The region chosen for calculating noise in a data sample can greatly impact the results. If the region includes a transition in the signal, outliers, or other extreme values, the spread of the signal itself gets counted as noise, which inflates the estimate and skews the results. It is important to choose a region that is representative of the data and does not contain any extreme values.

4. How do you determine which region to use when calculating noise in a data sample?

The region used for calculating noise in a data sample should be chosen based on the specific data set and the research question at hand. It is important to consider the distribution of the data and any known factors that may affect the results. Some common techniques for selecting a region include using the interquartile range or visually inspecting a histogram of the data.

5. Can noise be completely eliminated from a data sample?

Noise is a natural and inevitable part of any data sample. While it can be reduced through careful data collection and analysis, it cannot be completely eliminated. The goal is to minimize the noise as much as possible to obtain more accurate and reliable results.
