
LPC Speech Analysis and Synthesis

This example shows how to implement a speech compression method using the EngeeDSP library. It implements linear predictive coding (LPC), a technique used mainly in audio and speech processing to represent the spectral envelope of a digital speech signal in compressed form, based on a linear prediction model. The model performs analysis and synthesis of the LPC speech signal.

In the "analysis" section, reflection coefficients are extracted from the signal and used to calculate the residual signal.

In the synthesis section, the signal is reconstructed using the residual signal and reflection coefficients.

The residual signal and reflection coefficients require fewer bits to encode than the original speech signal.
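
To make the analysis/synthesis idea concrete, the cell below is a minimal plain-Julia sketch of direct-form LPC: the residual is the output of the prediction-error (all-zero) filter, and passing that residual through the inverse (all-pole) filter reconstructs the signal exactly when no quantization is applied. The coefficients and the test signal are purely illustrative; the model in this example uses a lattice structure driven by reflection coefficients instead.

In [ ]:
# Direct-form LPC on a toy frame: analysis produces the residual,
# synthesis rebuilds the frame from the residual and the same coefficients.
a = [1.0, -0.9, 0.4]            # illustrative prediction-error coefficients (a[1] == 1)
s = sin.(2π .* (0:79) ./ 16)    # toy "speech" frame of 80 samples

# Analysis (all-zero filter): e[n] = sum_k a[k] * s[n-k]
e = [sum(a[k+1] * s[n-k] for k in 0:length(a)-1 if n-k >= 1) for n in 1:length(s)]

# Synthesis (all-pole filter): ŝ[n] = e[n] - sum_{k>=1} a[k] * ŝ[n-k]
ŝ = zeros(length(e))
for n in 1:length(e)
    acc = e[n]
    for k in 1:length(a)-1
        n - k >= 1 && (acc -= a[k+1] * ŝ[n-k])
    end
    ŝ[n] = acc
end

maximum(abs.(ŝ .- s))           # ≈ 0: perfect reconstruction without quantization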

Connecting libraries, declaring input data

Connecting libraries

In [ ]:
import Pkg
Pkg.add(["WAV"])
In [ ]:
using .EngeeDSP;
using Plots;
plotlyjs();
using WAV;
using Base64;

Declaring variables

In [ ]:
sample = 80;                     # frame length in samples
var_load = load_audio();         # audio loader object
step = EngeeDSP.step;
In1 = step( var_load, "$(@__DIR__)/check_signal.wav", sample );   # read the file as a sequence of 80-sample frames
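
In1 is used below as a collection of 80-sample frames of the source file. The exact container returned by the EngeeDSP loader is a library detail, but a roughly equivalent framing can be obtained directly with WAV.jl; the sketch below is for reference only and assumes check_signal.wav is a mono file.

In [ ]:
# Reference framing with WAV.jl (illustration only; the model itself uses the EngeeDSP loader above)
y, fs = wavread("$(@__DIR__)/check_signal.wav")     # samples in [-1, 1] and the sampling rate
nframes = size(y, 1) ÷ sample                       # number of complete 80-sample frames
frames = [y[(i-1)*sample + 1 : i*sample, 1] for i in 1:nframes]
length(frames), fs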

Declaring LPC data structures and functions

In this model, the speech signal is split into frames of 80 samples, which are buffered into overlapping frames of 160 samples (an overlap of 80 samples).

Each frame is then multiplied by a Hamming window.

Tenth-order autocorrelation coefficients are computed, and the reflection coefficients are then obtained from them using the Levinson-Durbin algorithm. The speech signal is passed through the analysis filter, an all-zero (FIR lattice) filter whose coefficients are the reflection coefficients found above; the filter output is the residual signal.
This residual signal is passed through the synthesis filter, which is the inverse of the analysis filter (an all-pole lattice filter).

The output of the synthesis filter is the reconstructed speech signal.
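
The Levinson-Durbin recursion itself is compact enough to write out; the cell below is a minimal reference implementation in plain Julia that returns the prediction coefficients A and the reflection coefficients K from an autocorrelation sequence. It is shown only to illustrate the computation; the EngeeDSP LevinsonDurbin block is what the model actually uses, and sign conventions for reflection coefficients differ between references.

In [ ]:
# Reference Levinson-Durbin recursion.
# r holds the autocorrelation values r(0)..r(p) in r[1]..r[p+1];
# returns a (prediction coefficients, a[1] == 1) and k (reflection coefficients).
function levinson_durbin(r::AbstractVector{<:Real})
    p = length(r) - 1
    a = zeros(p + 1); a[1] = 1.0
    k = zeros(p)
    E = r[1]                              # prediction-error energy
    for m in 1:p
        acc = r[m + 1]
        for i in 1:m-1
            acc += a[i + 1] * r[m - i + 1]
        end
        k[m] = -acc / E
        a_new = copy(a)
        for i in 1:m-1
            a_new[i + 1] = a[i + 1] + k[m] * a[m - i + 1]
        end
        a_new[m + 1] = k[m]
        a = a_new
        E *= 1 - k[m]^2                   # error energy shrinks at every order
    end
    return a, k
end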

In [ ]:
mutable struct LPCAnalysisAndSynthesisOfSpeech
    obj_Pre_Emphasis
    obj_Overlap_Analysis_Windows
    obj_Window
    obj_Autocorrelation
    obj_Levinson_Durbin
    obj_Time_Varying_Analysis_Filter
    obj_Pad
    obj_FFT
    obj_MathFunction1
    obj_RC_To_InvSine
    obj_5_bit_Quantizer
    obj_6_bit_Quantizer
    obj_Inv_Sine_to_RC
    obj_Time_Varying_Synthesis_Filter
    obj_De_emphasis_Filter
    function LPCAnalysisAndSynthesisOfSpeech()
        new(
        DescretFIRFilter("Dialog parameters","Direct form",[1, -.95],"Columns as channels",0,false,"None"),
        Buffer(160,80,0),
        WindowFunction("Apply window to input","Hamming","Symmetric"),
        Autocorrelation("Biased",10),
        LevinsonDurbin("A and K",false,true),
        DescretFIRFilter("Input port","Lattice MA","Columns as channels",0,false,"None"),
        EngeeDSP.Pad("Columns","Specify via dialog",0,"User-specified",256,"End","None"),
        EngeeFFT("Auto",false,false,true),
        MathFunction("reciprocal","Exact","auto"),
        TrigonometricFunction("asin","auto",false),
        Quantizer(0.1,false),
        Quantizer(0.03125,false),
        TrigonometricFunction("sin","None","auto"),
        AllpoleFilter("Input port","Lattice AR","Columns as channels",0),
        AllpoleFilter("Dialog parameters","Direct form",[1 , -.95],"Columns as channels",0)
        )
    end
end

Configuring the LPC settings.

In [ ]:
function setup(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
    Pre_Emphasis_set = EngeeDSP.setup(obj.obj_Pre_Emphasis,In1) 
    Overlap_Analysis_Windows_set = EngeeDSP.setup(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_set);
    Window_set = EngeeDSP.setup(obj.obj_Window,Overlap_Analysis_Windows_set);
    Autocorrelation_set = EngeeDSP.setup(obj.obj_Autocorrelation,Window_set);
    Levinson_Durbin_set = EngeeDSP.setup(obj.obj_Levinson_Durbin,Autocorrelation_set);
    Time_Varying_Analysis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_set,Levinson_Durbin_set[2]);
    RC_To_InvSine_set = EngeeDSP.setup(obj.obj_RC_To_InvSine,Levinson_Durbin_set[2]);
    bit_Quantizer_5_set = EngeeDSP.setup(obj.obj_5_bit_Quantizer,RC_To_InvSine_set);
    bit_Quantizer_6_set = EngeeDSP.setup(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_set);
    Inv_Sine_to_RC_set = EngeeDSP.setup(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_set);
    Time_Varying_Synthesis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_set,Inv_Sine_to_RC_set);
    De_emphasis_Filter_set = EngeeDSP.setup(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_set);
end
Out[0]:
setup (generic function with 1 method)

Defining the function that performs one processing step (one frame of the signal).

In [ ]:
function step1(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
    Pre_Emphasis_out = EngeeDSP.step(obj.obj_Pre_Emphasis,In1); 
    Overlap_Analysis_Windows_out = EngeeDSP.step(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_out); 
    Window_out = EngeeDSP.step(obj.obj_Window,Overlap_Analysis_Windows_out); 
    Autocorrelation_out = EngeeDSP.step(obj.obj_Autocorrelation,Window_out); 
    Levinson_Durbin_out = EngeeDSP.step(obj.obj_Levinson_Durbin,Autocorrelation_out);
    Time_Varying_Analysis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_out,Levinson_Durbin_out[2]);
    Pad_out = EngeeDSP.step(obj.obj_Pad,Levinson_Durbin_out[1]);
    FFT_out = EngeeDSP.step(obj.obj_FFT,Pad_out);
    MathFunction1_out = EngeeDSP.step(obj.obj_MathFunction1,FFT_out);
    RC_To_InvSine_out = EngeeDSP.step(obj.obj_RC_To_InvSine,Levinson_Durbin_out[2]);
    bit_Quantizer_5_out = EngeeDSP.step(obj.obj_5_bit_Quantizer,RC_To_InvSine_out);
    bit_Quantizer_6_out = EngeeDSP.step(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_out); 
    Inv_Sine_to_RC_out = EngeeDSP.step(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_out);
    Time_Varying_Synthesis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_out,Inv_Sine_to_RC_out);
    De_emphasis_Filter_out = EngeeDSP.step(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_out);
    De_emphasis_Filter_out,MathFunction1_out;
end
Out[0]:
step1 (generic function with 1 method)

Implementation of the LPC algorithm

Calling the processing function frame by frame and collecting the outputs.

In [ ]:
obj = LPCAnalysisAndSynthesisOfSpeech();
setup(obj,In1[1]);
Out_a = zeros(size(vcat(In1...)));     # reconstructed (synthesized) signal
Out_p = In1.*1im;                      # per-frame LPC spectral envelopes (complex)
size(In1,1)                            # number of frames in the input

for i = 0:size(In1,1) - 2
    # The filter blocks are stateful, so each frame is passed through the chain once;
    # step1 returns both the synthesized frame and its LPC spectrum.
    output_a, output_p = step1(obj, In1[i+1])
    Out_a[sample*i + 1 : sample*(i + 1)] = output_a
    Out_p[i+1] = output_p
end

Processing and analysis of results

Setting up the player.

In [ ]:
In2 = vcat(In1...);
function audioplayer(s, fs)
    # Write the signal into an in-memory WAV file and embed it in the page
    # as a base64-encoded HTML <audio> element
    buf = IOBuffer();
    wavwrite(s, buf; Fs=fs);
    data = base64encode(take!(buf));
    markup = """<audio controls="controls" {autoplay}>
                <source src="data:audio/wav;base64,$data" type="audio/wav" />
                Your browser does not support the audio element.
                </audio>"""
    display("text/html", markup);
end
Out[0]:
audioplayer (generic function with 1 method)

Playback of the original audio file

In [ ]:
audioplayer(In2, 8000)

Playback of the reconstructed (LPC-coded and synthesized) audio signal.

In [ ]:
audioplayer(Out_a, 8000)

Visual comparison plot of the original and reconstructed signals.

In [ ]:
  plot(In2)
  plot!(Out_a)
Out[0]:

Plotting the spectrum analyzer output: the LPC spectral envelope of the first two frames.

In [ ]:
# Keep the one-sided half of each 256-point spectrum (129 bins) and convert magnitude to dB;
# the hypot(eps, eps) term keeps the argument of the logarithm positive.
u1 = Out_p[1]; u2 = Out_p[2];
uSubset1 = u1[1:floor(Int,length(u1)/2+1)];
uSubset2 = u2[1:floor(Int,length(u2)/2+1)];
y1 = 6.02059991327962 .* log2.(abs.(uSubset1) .+ hypot.(eps.(real(uSubset1)), eps.(imag(uSubset1))));
y2 = 6.02059991327962 .* log2.(abs.(uSubset2) .+ hypot.(eps.(real(uSubset2)), eps.(imag(uSubset2))));
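
The scaling constant used above is simply a change of logarithm base: 6.02059991327962 is 20*log10(2), so 6.0206*log2(x) equals 20*log10(x), i.e. the spectra are plotted in decibels. A quick numeric check:

In [ ]:
20 * log10(2)                                    # ≈ 6.02059991327962
x = 0.37                                         # any positive magnitude value
(6.02059991327962 * log2(x), 20 * log10(x))      # both expressions give the same dB value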
In [ ]:
freqs = 1:length(y1)                    # one-sided frequency bins (129 for a 256-point FFT)
plot(freqs, y1[:], fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
a = plot!(freqs, y1[:], msw = 0, ms = 2.5, xlabel = "Frequency")
plot(freqs, y2[:], fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
b = plot!(freqs, y2[:], msw = 0, ms = 2.5, xlabel = "Frequency")
plot(a, b)
Out[0]:

Conclusion

In this demonstration we examined how to work with functions from the EngeeDSP library and also showed how interactive elements, such as the embedded audio player, can be created inside a script. The example illustrates the operation of the LPC coding method and makes it possible to evaluate its effectiveness.
LPC is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, providing highly accurate estimates of speech parameters.