Week 36 (Extension)

The Return of LTASS

For a recap, I’m still estimating the Long Term Average Speech Spectrum (LTASS) and other associated frequency characteristics for the CHILDES corpus.

Resampling

In order to properly calculate LTASS it is necessary that all the audio files in CHILDES corpus have the same sampling frequency. Namely, a sampling frequency of 16 kHz. 16 kHz was chosen because 16 kHz is a smaller quantity than 22 kHz and 44.1 kHz. Moreover, humans can perceive sounds between the frequency ranges of 20 Hz to 20 kHz while intelligible speech (voice frequency) doesn’t contain frequencies above 3.4 kHz. And according to the Nyquist Sampling Theorem, in order to reliably reconstruct a signal, the sampling frequency must be greater than twice the maximum frequency component. In fact, this is why it’s common for audio equipment to have sampling rates of 44.1 kHz (Sampling Frequency > (2 * 20 kHz)), 48 kHz, and even 96 kHz. The effects of having such a high sampling rate on sound and speech perception is debatable (call an audiophile). Though, in 1999 Oohashi et al. showed that brain activity is still affected by the presence of high-frequency sounds (i.e. sounds with frequencies above 20 kHz).

% Read in audio file
[input_signal, Fs] = audioread('Bernstein_Children_Alice_alice1_1.wav');

% To change the sample rate from originalFs to desiredFs
% Determine the rational number P/Q, such that
% (P/Q)*originalFs = desiredFs
[P,Q] = rat(16e3/Fs)

% Check to see how much this new quantity differs from the original sampling frequency
abs(P/Q*Fs-16000)

% Resample audio file at desired sampling frequency
resampled_signal = resample(input_signal, P, Q);

Clipping

I noticed MATLAB was outputting clipping warnings when I was using the audiowrite function, so I looked into that:

Warning: Data clipped when writing file.
> In audiowrite>clipInputData (line 396)
  In audiowrite (line 176)
  In get_spectra (line 84)

In general, clipping is a form of waveform distortion that occurs when an amplifier is overdriven and attempts to deliver an output that is beyond its maximum capability:

clipped_waveform Figure from Lava Kumar Bangalore

In the case of analog clipping (physical-based systems), you ask the amplifier to amplify a signal by some value (let’s say 10 dB), which it will attempt to deliver given it’s physical constraints (i.e. voltage supplied to the system, materials, size) at a certain time. However, the MATLAB warning I was getting refers to digital clipping (digital-based systems). In the case of digital clipping, the input signal is restricted by the range of “chosen representation”. For example, if I try to amplify (double) the integer 32,000 through a system that uses a 16-bit representation, then instead of getting back 64,000 the output would be maxed (truncated) at 32,767 (2^16).

According to MATLAB documentation, the audiowrite function has a valid range of -1.0 <= y <= 1.0 when y is a double. Hence, values that are outside of that range will get clipped. Unfortunately, these audio files have very poor quality and distortion (clipping) is quite noticeable already in audio files like this. Though, we can attempt to suppress these warnings by normalizing the audio files before resampling:

.
.
.
% Normalize audio
if(info.NumChannels == 1)
    input_signal = input_signal/max(abs(input_signal));
else
    input_signal = input_signal/max(abs(input_signal(:)));
end
input_signal = input_signal * 0.9; % Leave headroom
% Resample audio
[P,Q] = rat(desiredFs/Fs);
resampled_signal = resample(input_signal,P,Q);
audiowrite(thisBaseFileName,resampled_signal,desiredFs);

Best,
EO