Engee documentation
Notebook

Spectral analysis and filtering of an audio file

This example describes a typical technical computing workflow for digital signal processing tasks. The example involves importing, visualising and listening to an audio file containing a mixture of a useful signal and high-frequency interference. Then the signal is analysed in the frequency domain, the specification of a suitable digital filter is selected, its synthesis is performed, and the filter characteristics are visualised. Finally, the developed filter is applied to the original signal, the filtering result is analysed and visualised.

Connecting libraries and visualising audio data

We will need basic libraries for digital signal processing and working with WAV files:

In [ ]:
#Pkg.add("DSP")
#Pkg.add("WAV")
In [ ]:
using DSP
using WAV
using Base64

We will import an audio file containing a mixture of a voice message (countdown) and a crowd noise overlaid on it, previously passed through an upper-pass filter. Such a synthesised signal will allow us to more clearly separate the frequency ranges of the useful signal and the noise.

In [ ]:
filename = "$(@__DIR__)/countdown_noisy.wav"
audio_data, fs = wavread(filename);
nsamples = length(audio_data);
dt = 1/fs;
t = 0:dt:((nsamples - 1) .* dt);
original = audio_data[1:nsamples];

Here we declare a custom function that allows you to listen to WAV files and use the interactive player tool:

In [ ]:
function audioplayer(s, fs);
    buf = IOBuffer();
    wavwrite(s, buf; Fs=fs);
    data = base64encode(unsafe_string(pointer(buf.data), buf.size));
    markup = """<audio controls="controls" {autoplay}>
                <source src="data:audio/wav;base64,$data" type="audio/wav" />
                Your browser does not support the audio element.
                </audio>"""
    display("text/html", markup);
end 

Let's listen to the original audio file (useful signal against a background of high-frequency noise):

In [ ]:
audioplayer(original, fs)

Let's display the signal in the time domain: on the graph we can observe characteristic periodic amplitude bursts corresponding to the voice message of the countdown (for the speed of drawing in the time domain it is useful to thin the long signals - in our case we take every 10th count).

In [ ]:
plot(t[1:10:end], original[1:10:end], 
    xlabel = "Время, с", 
    ylabel = "Амплитуда", 
    title = "Исходный сигнал во времени")
Out[0]:

Spectral analysis

Visualising the signal in the time domain does not give us a clear understanding of how to separate the voice signal from the noise. That is why we need to perform spectral analysis of the signal, i.e. estimate its spectrum (power spectral density). For this purpose we will use the function welch_pgram. It should be noted that to save calculation time we take only the first 2^16 samples of the original signal.

In [ ]:
endindx = 2^16;
sig_crop = original[1:endindx];
p1 = DSP.welch_pgram(sig_crop, onesided=true, fs=fs);
plot(freq(p1), pow2db.(power(p1)), 
    xguide = "Частота, Гц", 
    yguide = "Мощность/частота, дБ/Гц",
    title = "Спектр исходного сигнала")
Out[0]:

Noise is non-stationary in time, so it is useful to estimate the spectrogram of the signal, i.e. the time dependence of the spectrum. For this purpose, let us use the function spectrogram:

In [ ]:
spgr = DSP.spectrogram(original, 511, 256; fs = fs);
heatmap(spgr.time, spgr.freq, pow2db.(spgr.power), 
        ylim=(0,10000), 
        clim=(-140,-40), 
        xlabel = "Время, сек", 
        ylabel = "Частота, Гц",
        title = "Спектрограмма исходного сигнала")
Out[0]:

As can be seen from the spectrogram, throughout the entire audio file the noise in the spectrum occupies the band above 3 kHz, which means that the speech signal can be separated by a standard low-pass filter (LF).

Digital filter design

Let's design a digital filter with suitable characteristics. To create a low-pass filter we use the function digitalfilter, in which we specify the filter response type lowpass with a passband frequency of 3000 Hz, and declare the method of its synthesis Elliptic: this corresponds to an elliptic filter with infinite impulse response (IIR) and the required characteristics (filter order - 10, ripple in the passband - not more than 0.1 dB, suppression in the fence band - not less than 60 dB):

In [ ]:
myfilt = digitalfilter(Lowpass(3000; fs), Elliptic(10, 0.1, 60));

Let's calculate and display the filter characteristics. First of all, we are interested in the amplitude-frequency response (AFR), which allows us to understand which frequencies of the original signal spectrum our filter will pass through and which ones will be suppressed. To do this, let's use the function freqresp:

In [ ]:
H, w = freqresp(myfilt);
freq_vec = fs*w/(2*pi);
plot(freq_vec, pow2db.(abs.(H))*2, linewidth=3,
     xlabel = "Частота, Гц",
     ylabel = "Амплитуда, дБ",
     title = "АЧХ фильтра нижних частот")
Out[0]:

Now let's display the phase-frequency response (FFR) of the filter with the function phaseresp:

In [ ]:
phi, w = phaseresp(myfilt);
plot(freq_vec, rad2deg.(phi/2), linewidth = 3, 
    xlabel = "Частота, Гц",
    ylabel = "Фаза, град",
    title = "ФЧХ фильтра нижних частот")
Out[0]:

Finally, let's calculate and display the impulse response (IR) of the filter (more precisely, its first 100 samples, since our filter has an infinite IR) with the function impresp:

In [ ]:
myimpresp = impresp(myfilt,100);
plot(myimpresp, line=:stem, marker=:circle, 
    xlabel = "Отсчёты ИХ", 
    ylabel = "Значения ИХ",
    title = "Импульсная характеристика ФНЧ")
Out[0]:

Let's apply the developed filter to the source audio signal with the function filt and display the result of processing in the frequency domain:

In [ ]:
filtered = filt(myfilt, original);
outsig = filtered[1:endindx];
p2 = DSP.welch_pgram(outsig, onesided=true, fs=fs);
plot(freq(p1), [pow2db.(power(p1)), pow2db.(power(p2))], 
    xlabel = "Частота, Гц", 
    ylabel = "Мощность/частота, дБ/Гц",
    title = "Спектр обработанного сигнала")
Out[0]:

The spectrum shows that frequencies above 3 kHz have been successfully suppressed.

Let's also display the result of filtering in the time domain:

In [ ]:
plot(t[1:10:end], [original[1:10:end], filtered[1:10:end]], 
    xlabel = "Время, с", 
    ylabel = "Амплитуда", 
    title = "Обработанный сигнал во времени")
Out[0]:

The graph shows that there is no noise filling between the spoken words. Let's make sure that the processing was successful by listening to the filtered audio signal:

In [ ]:
audioplayer(filtered, fs)

The applied filter successfully removed noise from the signal, but also suppressed the high-frequency components of the voice signal, so when listening to it, it sounds not so brightly coloured with overtones and hisses.

Conclusion

Using a selected audio file as an example, we have looked at a typical digital signal processing workflow in Engee, including import, visualisation, analysis and processing.