Week 20

Race to the finish!

There is still quite a lot of work to do. This week I spent my time calculating the signal-to-noise ratio (SNR) over all of the segmented CHILDES files and the VoxForge data. I did this so I could get a better picture (mean, variance) of how rich the audio data is.
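As a rough sketch of that batch computation (the directory name is a placeholder, and the built-in snr(x) call stands in for whichever SNR estimate is ultimately used), in MATLAB it looks something like this:

    % Loop over a folder of segmented .wav files and summarize their SNRs.
    % 'childes_segments' is a hypothetical directory name.
    files = dir(fullfile('childes_segments', '*.wav'));
    snrs  = zeros(numel(files), 1);
    for k = 1:numel(files)
        [x, ~]  = audioread(fullfile('childes_segments', files(k).name));
        x       = x(:, 1);          % keep one channel if the file is stereo
        snrs(k) = snr(x);           % placeholder SNR estimate for this segment
    end
    fprintf('mean SNR = %.2f dB, variance = %.2f\n', mean(snrs), var(snrs));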

From taking a quick look at some of the segmented .wav files, it appears that some of them are really noisy and some barely contain speech. At this point, I don’t think there is much we can do in terms of compression, normalization, or any other signal processing to get the data to LDC quality. Truthfully, we could spend a few months just cleaning up the speech data.

For a quick review of SNR:

Signal-to-noise ratio is defined as the ratio of the power of a signal (the meaningful information) to the power of background noise (the unwanted signal).

Typically, SNR is expressed in decibels (dB): SNR_dB = 10 * log10(P_signal / P_noise).
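As a quick worked example of that definition (toy numbers, not the project data): a 440 Hz tone at unit amplitude buried in noise of one-tenth that amplitude has a power ratio of about 50, or roughly 17 dB.

    % Toy illustration of the power-ratio definition of SNR.
    fs    = 16000;
    t     = (0:fs-1)' / fs;               % one second of samples
    sig   = sin(2*pi*440*t);              % "meaningful" signal: a 440 Hz tone
    noise = 0.1 * randn(size(t));         % "unwanted" background noise
    p_signal = mean(sig.^2);              % average signal power (~0.5)
    p_noise  = mean(noise.^2);            % average noise power  (~0.01)
    snr_db   = 10 * log10(p_signal / p_noise)   % roughly 17 dB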

At first, I was simply calculating the SNR of each signal with MATLAB’s snr function. However, when I compared the mean SNR of the VoxForge data with the mean SNR of the CHILDES data, I found that the values were pretty similar. This wasn’t particularly bad news, but it made me suspicious of the MATLAB function.

According to the MATLAB documentation, there are a couple of ways one can use snr() (a quick sketch follows the list):

  1. r = snr(x,y) returns the signal-to-noise ratio (SNR) in decibels of a signal x.
  2. r = snr(x) returns the SNR in decibels relative to the carrier (dBc) of a real-valued sinusoidal input signal, x.
  • For additional ways to use snr(), visit the function’s documentation page.
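For reference, here is a minimal sketch of those two call forms; the file name and variable names are placeholders, not part of the actual pipeline:

    % Form (2) works from the input alone; form (1) needs a separate noise
    % vector of the same size as the signal.
    [x, ~] = audioread('segment_0001.wav');   % hypothetical segmented file
    x  = x(:, 1);                             % keep one channel if stereo
    r2 = snr(x);                              % form (2): SNR in dBc
    % r1 = snr(x_clean, x_noise);             % form (1), if a noise estimate existed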

I was actually using (2) because I didn’t have a separate noise signal to work with. This isn’t really a problem; it’s just that using SNR in this way may provide little insight into the actual quality of the signal. So, Dr. Brumberg suggested that I use another definition of SNR:

Here, SNR is the ratio between the mean of the input signal (its expected value) and its standard deviation: SNR = μ / σ. So, we attempt to use this definition in conjunction with MATLAB’s hilbert function.

The hilbert function returns the complex analytic signal of a real data sequence. The idea here is that we take the mean amplitude (the envelope, i.e. the magnitude of the analytic signal) and divide it by the standard deviation (the square root of the variance). I haven’t seen this done anywhere, but I still think it could be a useful measure; a sketch is below.
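A minimal sketch of that idea, assuming both the mean and the standard deviation are taken over the envelope (my reading of the approach, not a confirmed recipe):

    % Envelope-based SNR estimate via the analytic signal.
    [x, ~]  = audioread('segment_0001.wav');  % hypothetical segmented file
    x       = x(:, 1);                        % keep one channel if stereo
    env     = abs(hilbert(x));                % amplitude envelope (magnitude of analytic signal)
    snr_est = mean(env) / std(env);           % mean-to-standard-deviation ratio
    snr_db  = 20 * log10(snr_est);            % optionally express the amplitude ratio in dB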

Best,
EO