Statistical assessment of the quality of event detection

In summary, the conversation discusses the development of an algorithm to detect events in the time domain and the desire to know its efficiency. The problem is related to the time duration of the data, which consists of hundreds of minutes spread over dozens of files. Instead of calculating the specificity and sensitivity for the entire data set, the idea is to choose random samples. The main concern is the quality of detection, particularly in terms of sensitivity and specificity; however, there is no prior information available aside from the algorithm's own output. The suggestion is to simulate data containing "real" events and compare them to the algorithm's detections. It is also mentioned that a null hypothesis and a way to compute probabilities are necessary for statistical hypothesis testing, and that standard methods for this kind of problem may already exist in fields such as engineering.
  • #1
StarWars
Hello.

I developed an algorithm to detect events in the time domain and I want to know the efficiency of the algorithm.

The problem is related to the time duration of the data.

Each file has data with a time duration of hundreds of minutes and I have dozens of files.

Instead of calculating the specificity and the sensitivity of this algorithm for the entire data set, I was thinking of choosing random samples.

My question is:

What is the correct approach to have a valid statistical analysis?

Thank you.
 
  • #2
Unfortunately, applications of statistics involve subjective judgements. If you want practical advice about a valid statistical approach, you need to give more practical details of the situation. For example, what are you concerned about: the number of events in a file, or the exact time when an event occurs? Do you have information about when an event "really" happened versus when the algorithm said it happened?
 
  • #3
I am studying sounds in the time domain. Usually the signal has a low-amplitude profile, just noise. Sometimes a sound is generated and there is an increase in the signal amplitude.

The goal of the algorithm is to detect this increase in signal amplitude. Unfortunately, the generation of a sound can be regarded as random. It is possible that the low-amplitude profile lasts for minutes or even hours without a single sound being generated. On the other hand, it is possible that there is a sequence of sounds over several minutes, with a time difference between sound n+1 and sound n of just a few seconds.

I am concerned with the quality of the detection, i.e. sensitivity and specificity. That means I want to know whether a generated sound is detected or not, and whether there is a "detected" sound when no sound was generated.

I do not have any prior information, just the one given by the algorithm.

Thank you
 
  • #4
StarWars said:
I do not have any prior information, just the one given by the algorithm.

If that means that you have no way to compare the detections from the algorithm to real events then I think you should resort to simulating data from "real" events and seeing how well the algorithm detects them. To simulate data from events, you need algorithms to do the simulation.
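
For illustration, here is a minimal sketch in Python of that simulate-and-compare idea. Everything in it is an assumption made up for the example: the noise level, the burst shape, the injection rate, the matching tolerance, and in particular `detect_events`, which is just a placeholder threshold detector standing in for your actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 1000                      # sampling rate (Hz), assumed
duration = 600                 # 10 minutes of simulated signal
t = np.arange(int(fs * duration)) / fs

# Background: low-amplitude noise, as described in post #3
signal = 0.1 * rng.standard_normal(t.size)

# Inject "real" events: short bursts at random but known times
true_times = np.sort(rng.uniform(0, duration, size=20))
for t0 in true_times:
    idx = (t >= t0) & (t < t0 + 0.5)          # 0.5 s burst
    signal[idx] += 1.0 * np.sin(2 * np.pi * 50 * t[idx])

def detect_events(x, fs, threshold=0.5, win=0.1):
    """Placeholder detector: flag windows whose RMS exceeds a threshold."""
    n = int(win * fs)
    rms = np.sqrt(np.convolve(x**2, np.ones(n) / n, mode="same"))
    above = rms > threshold
    # Report the start time of each contiguous run above the threshold
    starts = np.flatnonzero(np.diff(above.astype(int)) == 1)
    return starts / fs

detected_times = detect_events(signal, fs)

# Match detections to injected events within a tolerance (assumed 1 s)
tol = 1.0
hits = sum(np.any(np.abs(detected_times - t0) < tol) for t0 in true_times)
false_alarms = sum(np.all(np.abs(true_times - td) >= tol) for td in detected_times)

print(f"sensitivity ~ {hits / len(true_times):.2f}")
print(f"false alarms: {false_alarms}")
```

Sweeping the detection threshold in a sketch like this gives you the sensitivity/false-alarm trade-off (an ROC-style curve) rather than a single number.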

If you want to do statistical hypothesis testing on each set of data, you need a "null hypothesis", which could be that no sounds are present and that the data is generated by some specific random process. You need a way to compute the probability of getting similar data when those assumptions are true. If you have no algorithm or formula to compute this probability then you can't do hypothesis testing.
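
To make that concrete, here is a sketch of a Monte Carlo version, assuming the null hypothesis is pure Gaussian noise with a known standard deviation and using the maximum windowed RMS as the test statistic. Both of those choices are placeholders for illustration, not something dictated by your problem.

```python
import numpy as np

rng = np.random.default_rng(1)

def max_windowed_rms(x, n):
    """Test statistic: largest RMS over sliding windows of length n samples."""
    rms = np.sqrt(np.convolve(x**2, np.ones(n) / n, mode="same"))
    return rms.max()

fs = 1000
n_win = int(0.1 * fs)          # 100 ms analysis window
sigma = 0.1                    # assumed noise std under the null hypothesis

# "Observed" segment (here simulated, with one burst the null cannot explain)
x_obs = sigma * rng.standard_normal(5 * fs)
x_obs[2000:2500] += 0.5

t_obs = max_windowed_rms(x_obs, n_win)

# Null distribution of the statistic from noise-only simulations
n_sims = 500
t_null = np.array([
    max_windowed_rms(sigma * rng.standard_normal(x_obs.size), n_win)
    for _ in range(n_sims)
])

# p-value: fraction of noise-only runs at least as extreme as the observation
p_value = (1 + np.sum(t_null >= t_obs)) / (n_sims + 1)
print(f"p-value under the noise-only null ~ {p_value:.4f}")
```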

There may be situations in engineering and science where people have developed standard methods of dealing with your problem. You can try asking about your problem in the engineering or science sections of the forum, giving more details there.
 
  • #5


I would recommend using a random sampling approach to assess the quality of your event detection algorithm. This means randomly selecting a subset of your data files and evaluating the algorithm's performance on those files. By doing this, you can obtain a representative sample of your data and determine the overall efficiency of the algorithm.

To ensure a valid statistical analysis, it is important to choose a large enough sample size to accurately represent the entire data set. This will help to reduce the chance of bias and provide more reliable results. Additionally, it may be beneficial to stratify your sample to include a diverse range of data files, such as those with varying time durations.

Once you have selected your sample, you can calculate the specificity and sensitivity of the algorithm on each file and then average across all files to estimate the overall performance, as sketched below. This provides a more comprehensive assessment of the algorithm's efficiency than evaluating it on a single file would.
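
As a rough sketch of that bookkeeping (the file names and the per-file true/false positive and negative counts below are hypothetical; in practice they would come from comparing detections against annotated or simulated ground truth, with "negatives" defined as non-event analysis windows):

```python
import random
import statistics

# Hypothetical per-file confusion counts: (tp, fn, fp, tn).
# tn counts non-event analysis windows; all numbers are made up.
file_results = {
    "rec_01.wav": (18, 2, 3, 977),
    "rec_02.wav": (25, 5, 1, 969),
    "rec_03.wav": (10, 0, 6, 984),
    "rec_04.wav": (30, 4, 2, 964),
    "rec_05.wav": (12, 3, 0, 985),
}

# Randomly sample a subset of files for evaluation
sample = random.sample(list(file_results), k=3)

sensitivities, specificities = [], []
for name in sample:
    tp, fn, fp, tn = file_results[name]
    sensitivities.append(tp / (tp + fn))   # true positive rate
    specificities.append(tn / (tn + fp))   # true negative rate

print("sampled files:", sample)
print(f"mean sensitivity ~ {statistics.mean(sensitivities):.3f}")
print(f"mean specificity ~ {statistics.mean(specificities):.3f}")
```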

In addition to random sampling, it may also be helpful to compare your algorithm's performance to a known benchmark or gold standard. This can provide a point of reference and help to validate the results of your statistical analysis.

Overall, using a random sampling approach and comparing the results to a benchmark can help to ensure a valid statistical analysis and provide a more accurate evaluation of your event detection algorithm's quality.
 

Related to Statistical assessment of the quality of event detection

1. What is "Statistical assessment of the quality of event detection"?

"Statistical assessment of the quality of event detection" is a method used to evaluate the accuracy and reliability of event detection algorithms. It involves analyzing the statistical properties of detected events and comparing them to expected patterns and noise levels.

2. Why is it important to assess the quality of event detection?

Assessing the quality of event detection is important because it allows us to determine the accuracy and effectiveness of event detection algorithms. This helps us to understand the limitations and strengths of these algorithms and make improvements to them.

3. What are some common statistical measures used in event detection assessment?

Some common statistical measures used in event detection assessment include false positive and false negative rates, precision and recall, and F-measure. These measures help to evaluate the accuracy and completeness of event detection results.
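
For reference, all of these follow directly from the counts of true/false positives and negatives; a minimal sketch with made-up counts:

```python
# Illustrative counts; in practice they come from matching detections
# against ground-truth events.
tp, fp, fn = 42, 8, 6

precision = tp / (tp + fp)   # fraction of detections that are real events
recall = tp / (tp + fn)      # fraction of real events detected (sensitivity)
f_measure = 2 * precision * recall / (precision + recall)

print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
print(f"F-measure = {f_measure:.3f}")
```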

4. How do you determine the expected patterns and noise levels for event detection?

The expected patterns and noise levels for event detection can be determined through various methods, such as analyzing historical data, creating synthetic data sets, or conducting experiments with known events. These methods help to establish a baseline for comparison with the detected events.

5. Can statistical assessment alone determine the quality of event detection?

No, statistical assessment alone cannot determine the quality of event detection. It should be used in conjunction with other methods, such as visual inspection and expert judgment, to get a comprehensive understanding of the performance of event detection algorithms.
