How to smooth audio FFT data

edited December 2013 in Questions about Code

I was looking at this Web Audio API demo, part of this nice book

If you look at the demo, the FFT peaks fall smoothly. I'm trying to do the same in Processing (Java mode) using the minim library. I've looked at how this is done with the Web Audio API in the doFFTAnalysis() method and tried to replicate it with minim. I also tried to port how abs() works for the complex type:

// 26.2.7/3 abs(__z):  Returns the magnitude of __z.
template<typename _Tp>
  inline _Tp
  __complex_abs(const complex<_Tp>& __z)
  {
    _Tp __x = __z.real();
    _Tp __y = __z.imag();
    const _Tp __s = std::max(abs(__x), abs(__y));
    if (__s == _Tp())  // well ...
      return __s;
    __x /= __s;
    __y /= __s;
    return __s * sqrt(__x * __x + __y * __y);
  }

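That libstdc++ routine translates almost line-for-line to Java. Here is a minimal standalone sketch of the port (plain Java, no minim; the class and method names are mine). It divides by the larger component first, just like the C++ version, so that squaring cannot overflow:

```java
public class ComplexAbs {
    // Scaled magnitude of a complex number, ported from libstdc++'s
    // __complex_abs: normalize by the larger |component| before squaring.
    public static float complexAbs(float re, float im) {
        float s = Math.max(Math.abs(re), Math.abs(im));
        if (s == 0.0f) return 0.0f;      // guard against dividing by zero
        re /= s;
        im /= s;
        return s * (float) Math.sqrt(re * re + im * im);
    }

    public static void main(String[] args) {
        System.out.println(complexAbs(3f, 4f));  // 5.0
    }
}
```

Note that the C++ code takes abs() of each component and bails out when the scale is zero; both details matter, because FFT components can be negative or zero.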
So my code looks like this:

import ddf.minim.*;
import ddf.minim.analysis.*;

private int blockSize = 512;
private Minim minim;
private AudioInput in;
private FFT         mfft;
private float[]    time = new float[blockSize];//time domain
private float[]    real = new float[blockSize];
private float[]    imag = new float[blockSize];
private float[]    freq = new float[blockSize];//smoothed freq. domain

public void setup() {
  minim = new Minim(this);
  in = minim.getLineIn(Minim.STEREO, blockSize);
  mfft = new FFT( in.bufferSize(), in.sampleRate() );
}
public void draw() {
  background(255);
  for (int i = 0; i < blockSize; i++) time[i] = in.left.get(i);
  mfft.forward( time);
  real = mfft.getSpectrumReal();
  imag = mfft.getSpectrumImaginary();

  final float magnitudeScale = 1.0 / mfft.specSize();
  final float k = (float)mouseX/width;

  for (int i = 0; i < blockSize; i++)
  {      
      float creal = real[i];
      float cimag = imag[i];
      float s = Math.max(Math.abs(creal), Math.abs(cimag));
      float absComplex = 0;
      if (s > 0) {
        creal /= s;
        cimag /= s;
        absComplex = (float)(s * Math.sqrt(creal*creal + cimag*cimag));
      }
      float scalarMagnitude = absComplex * magnitudeScale;        
      freq[i] = (k * mfft.getBand(i) + (1 - k) * scalarMagnitude);

      line( i, height, i, height - freq[i]*8 );
  }
  fill(0);
  text("smoothing: " + k,10,10);
}

I'm not getting errors, which is good, but I'm not getting the expected behaviour, which is bad. I expected the peaks to fall more slowly when the smoothing (k) is close to 1, but as far as I can tell my code only scales the bands.

Unfortunately maths and sound aren't my strong points, so I'm stabbing in the dark. Can someone more experienced please give me some pointers on how I could get smooth FFT peaks?
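From what I can tell, doFFTAnalysis() keeps its smoothed array between frames and blends each new magnitude into it, so the state carries over from one frame to the next; my loop above blends two quantities computed from the same frame, which can only rescale the bands. Here is roughly what I mean, as a plain-Java sketch (class and method names are mine; a k near 1 makes peaks fall slowly):

```java
public class SpectrumSmoother {
    private final float[] smoothed;
    private final float k;  // smoothing constant: 0 = no smoothing, ~0.8 = slow falloff

    public SpectrumSmoother(int bins, float k) {
        this.smoothed = new float[bins];
        this.k = k;
    }

    // Blend the previous smoothed frame with the new magnitudes:
    //   smoothed[i] = k * smoothed[i] + (1 - k) * magnitudes[i]
    // The smoothed array persists between calls, which is what makes
    // the peaks decay over time instead of just being rescaled.
    public float[] update(float[] magnitudes) {
        for (int i = 0; i < smoothed.length; i++) {
            smoothed[i] = k * smoothed[i] + (1 - k) * magnitudes[i];
        }
        return smoothed;
    }
}
```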

UPDATE

I've got a simple solution for smoothing (continuously diminish the previous value of each FFT band unless the current band is higher):

import ddf.minim.analysis.*;
import ddf.minim.*;

Minim       minim;
AudioInput  in;
FFT         fft;

float smoothing = 0;
float[] fftReal;
float[] fftImag;
float[] fftSmooth;
int specSize;
void setup(){
  size(640, 360, P3D);
  minim = new Minim(this);
  in = minim.getLineIn(Minim.STEREO, 512);
  fft = new FFT(in.bufferSize(), in.sampleRate());
  specSize = fft.specSize();
  fftSmooth = new float[specSize];
  fftReal   = new float[specSize];
  colorMode(HSB,specSize,100,100);
}

void draw(){
  background(0);
  stroke(255);

  fft.forward( in.left);
  fftReal = fft.getSpectrumReal();
  fftImag = fft.getSpectrumImaginary();
  for(int i = 0; i < specSize; i++)
  {
    float band = fft.getBand(i);

    fftSmooth[i] *= smoothing;
    if(fftSmooth[i] < band) fftSmooth[i] = band;
    stroke(i,100,50);
    line( i, height, i, height - fftSmooth[i]*8 );
    stroke(i,100,100);
    line( i, height, i, height - band*8 );
  }
  text("smoothing: " + (int)(smoothing*100),10,10);
}
void keyPressed(){
  float inc = 0.01;
  if(keyCode == UP && smoothing < 1-inc) smoothing += inc;
  if(keyCode == DOWN && smoothing > inc) smoothing -= inc;
}

FFT smooth

The faded graph is the smoothed one and the fully saturated one is the live one.

However, I'm still missing something in comparison to the Web Audio API demo:

Web Audio API demo

I think the Web Audio API might take into account that the mid and high frequencies need to be scaled to be closer to what we perceive, but I'm not sure how to tackle that.

I was trying to read more on how the RealtimeAnalyser class does this in the Web Audio API, but it seems the FFTFrame class's doFFT method might do the logarithmic scaling. I haven't figured out how doFFT works yet.
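My current guess is that the analyser converts magnitudes to decibels, the way its minDecibels/maxDecibels attributes suggest, and then maps a fixed dB window onto the display range. Something like this plain-Java sketch (the -100/-30 dB window is my assumption, taken from the Web Audio defaults; the names are mine):

```java
public class DbScale {
    static final float MIN_DB = -100f;  // assumed window, matching Web Audio defaults
    static final float MAX_DB = -30f;

    // Convert a linear magnitude to decibels: dB = 20 * log10(magnitude).
    public static float toDecibels(float magnitude) {
        return 20f * (float) Math.log10(magnitude);
    }

    // Map the dB value into 0..1 for drawing, clamping at the window edges.
    public static float normalized(float magnitude) {
        float db = toDecibels(magnitude);
        float n = (db - MIN_DB) / (MAX_DB - MIN_DB);
        return Math.max(0f, Math.min(1f, n));
    }
}
```

Because the scale is logarithmic, the quiet mid and high bands get lifted relative to the loud low bands, which would explain why the Web Audio demo's graph looks so much more balanced than my linear one.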

How can I scale a raw FFT graph logarithmically to account for perception? Thank you, George
