We closed this forum 18 June 2010. It has served us well since 2005 as the ALPHA forum did before it from 2002 to 2005. New discussions are ongoing at the new URL http://forum.processing.org. You'll need to sign up and get a new user account. We're sorry about that inconvenience, but we think it's better in the long run. The content on this forum will remain online.
Detecting *change* in frequency of human voice?
Nov 20th, 2006, 1:52pm
 
Hi all - hope someone can help...

I need to detect changes in the pitch/frequency of someone 'singing' (squawking) into the mic. I don't need to detect the actual pitch (e.g. B# or something), just the approximate magnitude of the change if they start on a low 'note'/frequency and then raise it to a high 'note'.

At the moment I'm using ESS and just finding the (e.g.) 6 spectrum bins with the largest magnitude and averaging their indexes, i.e. if bins 23, 45, 46, 47, 67, and 68 were the largest in spectrum[], I just add those numbers together and divide by 6.

This kind of works, but not that well: the average fluctuates pretty wildly even when I hold the same note (or just about, anyway :), though it does generally rise when I go up an octave, say. I had a scout around the web, and it seems that auto-correlation is a good way of detecting the actual pitch when faced with the noisy/multiple frequencies of the human voice (see http://www.dsprelated.com/showmessage/67695/1.php for a Java implementation), but I don't really know how to use it, and anyway I just need the approximate magnitude of changes in pitch.
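For anyone who hasn't used auto-correlation before, here is a minimal sketch of the idea in plain Java. It is not the implementation from the link above, and the names (AutoCorrPitch, estimateHz, minHz, maxHz) are made up for illustration. It assumes you already have one frame of raw audio samples and know the sample rate. The frame is compared with a delayed copy of itself, and the delay (lag) that matches best is taken as one period of the voice.

```java
// Minimal auto-correlation pitch sketch. All names here are hypothetical.
class AutoCorrPitch {
    /**
     * Estimates the fundamental frequency (Hz) of one frame of samples,
     * searching only lags that correspond to the range minHz..maxHz.
     * Returns -1 if no lag gives a positive correlation.
     */
    static float estimateHz(float[] frame, float sampleRate,
                            float minHz, float maxHz) {
        // A high frequency means a short period, so maxHz gives the smallest lag.
        int minLag = Math.max(1, (int) (sampleRate / maxHz));
        int maxLag = Math.min(frame.length - 1, (int) (sampleRate / minHz));
        int bestLag = -1;
        double best = 0;
        for (int lag = minLag; lag <= maxLag; lag++) {
            // Correlate the frame with itself shifted by 'lag' samples.
            double sum = 0;
            for (int i = 0; i + lag < frame.length; i++) {
                sum += frame[i] * frame[i + lag];
            }
            if (sum > best) {
                best = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? sampleRate / bestLag : -1;
    }
}
```

Since only the change matters here, you could call this once per frame and compare successive estimates (ideally as a ratio, since pitch is heard logarithmically). A bare version like this can still jump by an octave on real voice input, so treat it as a starting point.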

If anyone can think of a better way of doing this I'd very much like to know: say, root mean square instead of a plain average, or damping the average the way the FFT spectrum is damped.
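The damping idea can be sketched as simple exponential smoothing: each new per-frame average only nudges the running value instead of replacing it. This is a plain Java sketch, not how ESS damps its spectrum internally. The class name SmoothedValue and the alpha parameter are hypothetical.

```java
// Exponential smoothing ("damping") of a noisy per-frame value.
// The class and field names are hypothetical.
class SmoothedValue {
    private final float alpha;       // 0..1; smaller means heavier damping
    private float value;
    private boolean started = false;

    SmoothedValue(float alpha) {
        this.alpha = alpha;
    }

    /** Feeds in one new reading and returns the damped value. */
    float update(float sample) {
        if (!started) {
            // The first reading is used as the starting point.
            value = sample;
            started = true;
        } else {
            // Move a fraction 'alpha' of the way toward the new reading.
            value += alpha * (sample - value);
        }
        return value;
    }
}
```

Calling update() once per frame with the bin average gives a steadier number, at the cost of some lag. An alpha somewhere around 0.1 to 0.3 is a reasonable place to start experimenting, but that is a guess and not a measured value.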

OK, all the best, and thanks to everyone who's making Processing so great to use.

Thoams.
Re: Detecting *change* in frequency of human voice
Reply #1 - Jan 25th, 2007, 5:32pm
 
You probably want a weighted average rather than a direct average of those bins. Something like:

Code:

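// Assumes bigBins holds the indexes of the largest bins and fft holds the bin magnitudes.
float top = 0, bottom = 0;  // locals must be initialized before use
for (int i = 0; i < bigBins.length; i++) {
  top += bigBins[i] * fft[bigBins[i]];  // bin index weighted by its magnitude
  bottom += fft[bigBins[i]];            // total magnitude of the chosen bins
}
float center = top / bottom;  // magnitude-weighted average bin index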


You might also want to visualize where the actual peaks are. Depending on your FFT size, I wouldn't be surprised if you're getting overtones in the "biggest bins" list, which will definitely throw off the center.
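Putting the two steps together (picking the largest bins, then taking the weighted center), a self-contained sketch in plain Java might look like this. SpectrumCenter, largestBins and weightedCenter are hypothetical names, and the spectrum array is assumed to hold one magnitude per FFT bin.

```java
import java.util.Arrays;

// Weighted center of the n strongest spectrum bins. All names are hypothetical.
class SpectrumCenter {
    /** Returns the indexes of the n largest-magnitude bins, strongest first. */
    static int[] largestBins(float[] spectrum, int n) {
        Integer[] idx = new Integer[spectrum.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Sort the bin indexes by descending magnitude.
        Arrays.sort(idx, (a, b) -> Float.compare(spectrum[b], spectrum[a]));
        int count = Math.min(n, idx.length);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = idx[i];
        }
        return out;
    }

    /** Magnitude-weighted average bin index, or -1 if the chosen bins are all zero. */
    static float weightedCenter(float[] spectrum, int[] bins) {
        float top = 0, bottom = 0;
        for (int b : bins) {
            top += b * spectrum[b];
            bottom += spectrum[b];
        }
        return bottom > 0 ? top / bottom : -1;
    }
}
```

For example, if bins 3 and 6 are the two largest with magnitudes 4 and 2, the plain average of the indexes is 4.5, while the weighted center is 4.0, pulled toward the stronger bin. For a standard FFT, bin k corresponds to roughly k * sampleRate / fftSize Hz, if you want the result in Hz.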