Audio Analysis/Synthesis, DSP/FFT/Spectrogram

  • Thread starter neurocomp2003
  • Tags
    Audio
In summary, the conversation revolves around understanding digital audio for brain modelling research, with the ultimate goal of having microphone input and speaker output. The original poster asks about the format of the raw sound buffer played by a soundcard, whether it depends on platform and hardware, and requests links to references. The use of FFT and wavelets for audio analysis is also raised, with the poster citing a link that claims wavelets give better time and frequency resolution. In the reply, the WAV format is recommended as a working choice, with some useful links provided for more information.
  • #1
neurocomp2003
Hi, I'm trying to understand digital audio on my own for brain modelling research (I'm trying to wire up input to a neural net, which I've already done for a vision component using a camera).
[edited...] I forgot to mention that the ultimate goal is to have microphone input and speaker output.

I made a similar inquiry in the computer->software forum, but it focused on the software side rather than the DSP part and got no reply. In this post I'd rather focus on the DSP/audio part. I use Windows.

Questions :
[Q1] What is the format of the raw sound/wave buffer (the one that can be played by a soundcard), and is it platform and hardware (soundcard) dependent? In image buffers you have some ordering of RGBA or some other colour scheme. Is it the same for the sound buffer, i.e. some sequence of [t,Hz,Phase,Amplitude]? Is this format PCM? Links to references would help.

[Q2] Is the raw sound buffer the same input that should go into an FFT?

[Q3] What is displayed on the 3rd axis of a spectrogram [t vs Hz vs ###]? Is it the PSD of the FFT, and is the PSD just log(sqrt(||(Re,Im)||)), or is it just the Re component, or something to do with amplitude?

[Q4] Is it better to perform audio analysis with an FFT or with wavelets (spectrogram vs. scalogram)? I came across a link that says wavelets are better for both Hz and time resolution.

Thanks for your time, I'm really new to DSP.

Oh yeah, any links to newsgroups where I can ask these questions would also help.

Most of my Google searches come up with how to play sound, along with window functions, sample rates and other wave file parameters, but they don't actually say what goes into a sound buffer.
 
  • #2
To Neuro,

Interesting subject.
I have chosen the WAV format for my work.
Note:
(1) it is lossless
(2) it is simple (and ugh! max size).
(3) works between my Windows and Linux systems.
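
As a rough illustration of what a WAV file actually holds, here is a minimal sketch using only the Python standard library. It assumes 16-bit mono PCM at 8000 Hz (values picked for the example, not taken from this thread): the raw buffer is simply a sequence of signed amplitude samples taken at fixed time intervals, with no frequency or phase fields, and the WAV container adds a header describing that layout.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000   # samples per second (assumed for this example)
FREQ_HZ = 440.0      # test tone frequency (assumed)
N_SAMPLES = 800      # 0.1 s of audio

# Synthesize a sine wave and quantize each sample to a signed 16-bit integer.
samples = [
    int(round(0.5 * 32767 * math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE)))
    for n in range(N_SAMPLES)
]

# Pack as little-endian 16-bit PCM: this byte string is the "raw buffer".
pcm_bytes = struct.pack("<%dh" % N_SAMPLES, *samples)

# Wrap the raw PCM in a WAV container (written to memory, not to disk).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)            # mono
    w.setsampwidth(2)            # 2 bytes = 16 bits per sample
    w.setframerate(SAMPLE_RATE)
    w.writeframes(pcm_bytes)

# Read it back: the header gives the layout, the frames are the same PCM bytes.
buf.seek(0)
with wave.open(buf, "rb") as r:
    info = (r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
    frames = r.readframes(r.getnframes())

decoded = list(struct.unpack("<%dh" % info[3], frames))
print(info, decoded[:5])
```

For stereo the samples would be interleaved (left, right, left, right, ...), much like the channel ordering in an image buffer.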

Check these URLs for a starter.

WIKI
Audio File format Definition
http://en.wikipedia.org/wiki/Audio_file_format
"It is important to distinguish between a file format and a codec."

NCH
Audio File Formats Types:
http://www.nch.com.au/acm/formats.html
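Note: a WAV file usually holds uncompressed PCM samples behind a short RIFF header; raw header-less PCM is a separate format, although the two are often lumped together as "WAV".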

http://www.digitaltips.org/audio/audio101.asp

http://wiki.audacityteam.org/index.php?title=WAV
Note: Little Endian vs. Big Endian transmission methods.

http://edutechwiki.unige.ch/en/Digital_audio

Hope that helps get you started.

glene77is, Memphis, TN
 

Related to Audio Analysis/Synthesis, DSP/FFT/Spectrogram

1. What is audio analysis and synthesis?

Audio analysis and synthesis is the process of breaking down and manipulating sound signals to understand their characteristics and then recreate those signals using electronic or digital means. It involves using mathematical algorithms and techniques to analyze and synthesize sound waves.

2. What is DSP and how is it related to audio analysis and synthesis?

DSP stands for Digital Signal Processing. It is a branch of engineering that deals with the manipulation and analysis of digital signals, including audio signals. DSP techniques are used in audio analysis and synthesis to process and manipulate sound signals, often in real time.

3. What is FFT and how is it used in audio analysis?

FFT stands for Fast Fourier Transform and it is a mathematical algorithm used to convert a signal from its original domain (time or space) to a representation in the frequency domain. In audio analysis, FFT is used to break down an audio signal into its individual frequency components, allowing for the identification of different frequencies and their amplitudes.
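
To make this concrete, here is a minimal sketch of a recursive radix-2 FFT in plain Python, applied to a block of time-domain samples. The sample rate, block length, and test tone are assumptions chosen so that the tone lands exactly on one frequency bin; a real project would more likely call an optimized library routine. Note that the input is just the real-valued amplitude samples, the same kind of data a PCM buffer holds.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # transform of even-indexed samples
    odd = fft(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

SAMPLE_RATE = 8000   # Hz (assumed)
N = 256              # block length, a power of two
TONE_HZ = 500.0      # falls exactly on bin 16, since 16 * 8000 / 256 = 500

signal = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(N)]
spectrum = fft(signal)

# For real input only the first N/2 bins are independent; bin k maps to
# the frequency k * SAMPLE_RATE / N.
magnitudes = [abs(c) for c in spectrum[: N // 2]]
peak_bin = max(range(N // 2), key=lambda k: magnitudes[k])
peak_hz = peak_bin * SAMPLE_RATE / N
print(peak_bin, peak_hz)
```

Each output value is a complex number; its magnitude gives the amplitude of that frequency component and its angle gives the phase.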

4. What is a spectrogram and how is it used in audio analysis?

A spectrogram is a visual representation of the frequencies and amplitudes of an audio signal over time. It is typically created by applying the FFT to short, overlapping, windowed segments of the signal (a short-time Fourier transform) and displaying the intensity of each frequency, often on a logarithmic (dB) scale, as a function of time. Spectrograms are used in audio analysis to examine the frequency content of a sound signal and how it changes over time.
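
A minimal sketch of how the values behind such a plot can be computed, in plain Python. The frame length, hop size, Hann window, and dB scaling are common choices assumed here for illustration, not the only convention. The third axis in this sketch is the magnitude of each FFT bin, sqrt(Re^2 + Im^2), converted to decibels.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def spectrogram_db(samples, frame_len=64, hop=32):
    """Return one list of dB magnitudes (bins 0..frame_len/2) per time frame."""
    # Periodic Hann window to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_len)]
        spectrum = fft(chunk)[: frame_len // 2 + 1]
        # Small offset avoids log10(0) for empty bins.
        frames.append([20 * math.log10(abs(c) + 1e-12) for c in spectrum])
    return frames

def tone(freq_hz, count, sample_rate):
    """Generate `count` samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]

SAMPLE_RATE = 8000  # Hz (assumed); bin spacing is 8000 / 64 = 125 Hz
# 0.1 s at 500 Hz followed by 0.1 s at 2000 Hz.
samples = tone(500.0, 800, SAMPLE_RATE) + tone(2000.0, 800, SAMPLE_RATE)

spec = spectrogram_db(samples)
first_peak = max(range(len(spec[0])), key=lambda k: spec[0][k])
last_peak = max(range(len(spec[-1])), key=lambda k: spec[-1][k])
print(len(spec), first_peak * 125, last_peak * 125)
```

The strongest bin moves from 500 Hz in the first frame to 2000 Hz in the last, which is the change a spectrogram image would show as a jump in the bright band. Longer frames give finer frequency resolution but coarser time resolution, which is the trade-off that motivates wavelet-based alternatives.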

5. How are audio analysis and synthesis used in real-world applications?

Audio analysis and synthesis have a wide range of applications, including music production, speech recognition, noise cancellation, and audio enhancement. Related signal-analysis techniques are also used in medical diagnostics and research, such as analyzing heart and lung sounds, and in sonar and radar systems for detecting and identifying objects.
