LPC speech analysis and synthesis
This example shows how to implement a speech compression method using the EngeeDSP library. It implements linear predictive coding (LPC), a technique used primarily in audio and speech processing to represent the spectral envelope of a digital speech signal in compressed form by means of a linear prediction model. The model performs LPC analysis and synthesis of a speech signal.
In the analysis section, reflection coefficients are extracted from the signal and used to compute the residual signal.
In the synthesis section, the signal is reconstructed from the residual signal and the reflection coefficients.
The residual signal and the reflection coefficients require fewer bits to encode than the original speech signal.
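For reference, the underlying model works as follows (an order-10 predictor is assumed here, matching the configuration used later in this example): each speech sample is predicted from the previous p samples, and the residual is the prediction error

e(n) = s(n) - ( a1·s(n-1) + a2·s(n-2) + … + ap·s(n-p) ),   p = 10,

where s(n) is the (pre-emphasised) speech signal and a1…ap are the prediction coefficients. The synthesis filter 1/A(z) applies the inverse operation, reconstructing the signal from the residual and the coefficients.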
Connecting libraries, declaring input data
Connecting libraries
import Pkg; Pkg.add(["WAV"])
using .EngeeDSP;
using Plots;
plotlyjs();
using WAV;
using Base64;
Declaring variables
sample = 80;
var_load = load_audio();
step = EngeeDSP.step;
In1 = step( var_load, "$(@__DIR__)/check_signal.wav", sample );
Declaring LPC data structures and functions
In this model, the speech signal is read in blocks of 80 samples, which are buffered into overlapping analysis frames of 160 samples (80-sample overlap).
Each frame is weighted with a Hamming window.
Autocorrelation coefficients up to order ten are computed, and the reflection coefficients are then derived from them with the Levinson-Durbin algorithm. The original speech signal is passed through an analysis filter, an all-zero (FIR lattice) filter whose coefficients are the reflection coefficients obtained above. The output of this filter is the residual signal.
The residual signal is passed through the synthesis filter, which is the inverse of the analysis filter.
The output of the synthesis filter is the reconstructed speech signal.
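To make the analysis step concrete before building the model from EngeeDSP blocks, it can be sketched in plain Julia. The sketch below is only an illustration, not the EngeeDSP implementation: the helper names hamming_win, biased_autocorr and levinson_durbin are made up for this example. It windows one frame, computes the biased autocorrelation up to lag 10 and derives the reflection coefficients with the Levinson-Durbin recursion.
# Symmetric Hamming window of length N
hamming_win(N) = 0.54 .- 0.46 .* cos.(2π .* (0:N-1) ./ (N-1))
# Biased autocorrelation r[k] = (1/N) * sum over n of x[n]*x[n+k], for lags 0..p
biased_autocorr(x, p) = [sum(x[1:end-k] .* x[1+k:end]) / length(x) for k in 0:p]
# Levinson-Durbin recursion: prediction coefficients a and reflection coefficients k from r[0..p]
function levinson_durbin(r)
    p = length(r) - 1
    a = zeros(p); k = zeros(p); E = r[1]
    for m in 1:p
        acc = r[m+1] - (m > 1 ? sum(a[1:m-1] .* r[m:-1:2]) : 0.0)
        k[m] = acc / E
        a_prev = copy(a)
        a[m] = k[m]
        for i in 1:m-1
            a[i] = a_prev[i] - k[m] * a_prev[m-i]
        end
        E *= 1 - k[m]^2
    end
    return a, k
end
frame = randn(160)                       # stand-in for one 160-sample speech frame
w = frame .* hamming_win(length(frame))  # Hamming-windowed frame
r = biased_autocorr(w, 10)               # autocorrelation coefficients up to order 10
a, k = levinson_durbin(r)                # prediction and reflection coefficients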
mutable struct LPCAnalysisAndSynthesisOfSpeech
    obj_Pre_Emphasis
    obj_Overlap_Analysis_Windows
    obj_Window
    obj_Autocorrelation
    obj_Levinson_Durbin
    obj_Time_Varying_Analysis_Filter
    obj_Pad
    obj_FFT
    obj_MathFunction1
    obj_RC_To_InvSine
    obj_5_bit_Quantizer
    obj_6_bit_Quantizer
    obj_Inv_Sine_to_RC
    obj_Time_Varying_Synthesis_Filter
    obj_De_emphasis_Filter
    function LPCAnalysisAndSynthesisOfSpeech()
        new(
            DescretFIRFilter("Dialog parameters","Direct form",[1, -.95],"Columns as channels",0,false,"None"), # obj_Pre_Emphasis
            Buffer(160,80,0),                                               # obj_Overlap_Analysis_Windows
            WindowFunction("Apply window to input","Hamming","Symmetric"), # obj_Window
            Autocorrelation("Biased",10),                                   # obj_Autocorrelation
            LevinsonDurbin("A and K",false,true),                           # obj_Levinson_Durbin
            DescretFIRFilter("Input port","Lattice MA","Columns as channels",0,false,"None"), # obj_Time_Varying_Analysis_Filter
            EngeeDSP.Pad("Columns","Specify via dialog",0,"User-specified",256,"End","None"), # obj_Pad
            EngeeFFT("Auto",false,false,true),                              # obj_FFT
            MathFunction("reciprocal","Exact","auto"),                      # obj_MathFunction1
            TrigonometricFunction("asin","auto",false),                     # obj_RC_To_InvSine
            Quantizer(0.1,false),                                           # obj_5_bit_Quantizer
            Quantizer(0.03125,false),                                       # obj_6_bit_Quantizer
            TrigonometricFunction("sin","None","auto"),                     # obj_Inv_Sine_to_RC
            AllpoleFilter("Input port","Lattice AR","Columns as channels",0), # obj_Time_Varying_Synthesis_Filter
            AllpoleFilter("Dialog parameters","Direct form",[1 , -.95],"Columns as channels",0) # obj_De_emphasis_Filter
        )
    end
end
Setting up the LPC processing objects.
function setup(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
    Pre_Emphasis_set = EngeeDSP.setup(obj.obj_Pre_Emphasis,In1);
    Overlap_Analysis_Windows_set = EngeeDSP.setup(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_set);
    Window_set = EngeeDSP.setup(obj.obj_Window,Overlap_Analysis_Windows_set);
    Autocorrelation_set = EngeeDSP.setup(obj.obj_Autocorrelation,Window_set);
    Levinson_Durbin_set = EngeeDSP.setup(obj.obj_Levinson_Durbin,Autocorrelation_set);
    Time_Varying_Analysis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_set,Levinson_Durbin_set[2]);
    RC_To_InvSine_set = EngeeDSP.setup(obj.obj_RC_To_InvSine,Levinson_Durbin_set[2]);
    bit_Quantizer_5_set = EngeeDSP.setup(obj.obj_5_bit_Quantizer,RC_To_InvSine_set);
    bit_Quantizer_6_set = EngeeDSP.setup(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_set);
    Inv_Sine_to_RC_set = EngeeDSP.setup(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_set);
    Time_Varying_Synthesis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_set,Inv_Sine_to_RC_set);
    De_emphasis_Filter_set = EngeeDSP.setup(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_set);
end
Defining the processing step that passes a single frame through the analysis and synthesis chain.
function step1(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
    Pre_Emphasis_out = EngeeDSP.step(obj.obj_Pre_Emphasis,In1);
    Overlap_Analysis_Windows_out = EngeeDSP.step(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_out);
    Window_out = EngeeDSP.step(obj.obj_Window,Overlap_Analysis_Windows_out);
    Autocorrelation_out = EngeeDSP.step(obj.obj_Autocorrelation,Window_out);
    Levinson_Durbin_out = EngeeDSP.step(obj.obj_Levinson_Durbin,Autocorrelation_out);
    Time_Varying_Analysis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_out,Levinson_Durbin_out[2]);
    Pad_out = EngeeDSP.step(obj.obj_Pad,Levinson_Durbin_out[1]);
    FFT_out = EngeeDSP.step(obj.obj_FFT,Pad_out);
    MathFunction1_out = EngeeDSP.step(obj.obj_MathFunction1,FFT_out);
    RC_To_InvSine_out = EngeeDSP.step(obj.obj_RC_To_InvSine,Levinson_Durbin_out[2]);
    bit_Quantizer_5_out = EngeeDSP.step(obj.obj_5_bit_Quantizer,RC_To_InvSine_out);
    bit_Quantizer_6_out = EngeeDSP.step(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_out);
    Inv_Sine_to_RC_out = EngeeDSP.step(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_out);
    Time_Varying_Synthesis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_out,Inv_Sine_to_RC_out);
    De_emphasis_Filter_out = EngeeDSP.step(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_out);
    return De_emphasis_Filter_out, MathFunction1_out
end
Implementation of the LPC algorithm
Calling the processing functions frame by frame and collecting the outputs.
obj = LPCAnalysisAndSynthesisOfSpeech()
setup(obj,In1[1]);
Out_a = zeros(size(vcat(In1...)));   # reconstructed signal as one long vector
Out_p = In1 .* 1im;                  # complex container with one slot per frame for the spectra
for i = 0:size(In1,1) - 2
    output = step1(obj, In1[i+1])    # a single call returns the synthesised frame and its LPC spectrum
    Out_a[sample*i + 1 : sample*(i + 1)] = output[1]   # synthesised frame
    Out_p[i+1] = output[2]                             # LPC spectral envelope of the frame
end
Processing and analysing the results
Setting up the player.
In2 = vcat(In1...);
function audioplayer(s, fs)
    buf = IOBuffer();
    wavwrite(s, buf; Fs=fs);
    data = base64encode(take!(buf));   # WAV bytes encoded as base64 for the data URI
    markup = """<audio controls="controls">
    <source src="data:audio/wav;base64,$data" type="audio/wav" />
    Your browser does not support the audio element.
    </audio>"""
    display("text/html", markup);
end
Playing back the original audio file
audioplayer(In2, 8000)
Playing back the reconstructed (encoded and decoded) audio file.
audioplayer(Out_a, 8000)
Plot comparing the original and reconstructed signals.
plot(In2)
plot!(Out_a)
Plotting the spectrum-analyser output.
u1 = Out_p[1]; u2 = Out_p[2];                   # spectra of the first two frames
uSubset1 = u1[1:floor(Int,length(u1)/2+1)];     # keep the one-sided spectrum (129 of 256 bins)
uSubset2 = u2[1:floor(Int,length(u2)/2+1)];
y1 = 6.02059991327962 .* log2.(abs.(uSubset1) .+ hypot.(eps.(real(uSubset1)),eps.(imag(uSubset1))));
y2 = 6.02059991327962 .* log2.(abs.(uSubset2) .+ hypot.(eps.(real(uSubset2)),eps.(imag(uSubset2))));
plot(1:129, y1, fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
a = plot!(1:129, y1, msw = 0, ms = 2.5, xlabel = "Frequency")
plot(1:129, y2, fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
b = plot!(1:129, y2, msw = 0, ms = 2.5, xlabel = "Frequency")
plot(a, b, layout = (2, 1))
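The constant 6.02059991327962 used above simply converts a base-2 logarithm to decibels: 20·log10(x) = (20 / log2(10))·log2(x) ≈ 6.0206·log2(x), so the curves show the magnitude of the LPC spectral envelope in dB. The hypot(eps, eps) term only adds a tiny offset so that the logarithm of an exactly zero bin stays finite.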
Conclusion
In this demonstration we have learnt how to work with functions from the EngeeDSP library and have shown how interactive elements, such as the embedded audio player, can be created inside scripts. The example illustrates how the LPC coding method works and makes it possible to evaluate its efficiency.
LPC is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, providing accurate estimates of the speech parameters.