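LPC Speech Analysis and Synthesis
This example shows how to implement a speech compression method with the EngeeDSP library. It implements linear predictive coding (LPC), a method used mainly in audio and speech processing to represent the spectral envelope of a digital speech signal in compressed form, using the information from a linear predictive model. The model performs both LPC analysis and LPC synthesis of a speech signal.
In the analysis part, reflection coefficients are extracted from the signal and used to compute the residual signal.
In the synthesis part, the signal is reconstructed from the residual signal and the reflection coefficients.
The residual signal and the reflection coefficients require fewer bits to encode than the original speech signal.
Loading libraries and declaring input data
Load the libraries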
In [ ]:
import Pkg; Pkg.add("WAV")
In [ ]:
using .EngeeDSP;
using Plots;
plotlyjs();
using WAV;
using Base64;
Declare the variables
In [ ]:
sample = 80;
var_load = load_audio();
step = EngeeDSP.step;
In1 = step( var_load, "$(@__DIR__)/check_signal.wav", sample );
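Declaring the LPC data structure and functions
In this model the speech signal is split into frames of 160 samples with an overlap of 80 samples (20 ms and 10 ms at the 8000 Hz rate used for playback below).
Each frame is multiplied by a Hamming window.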
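The framing and windowing step can be sketched in plain Julia, without EngeeDSP. This is a minimal illustration, not the library's implementation: the helper names `hamming` and `frames` are hypothetical, and unlike the `Buffer` block (which receives 80-sample inputs and starts from a zero initial condition) the sketch slices a signal that is already fully in memory.

```julia
# Symmetric Hamming window of length N (hypothetical helper).
hamming(N) = [0.54 - 0.46 * cos(2π * n / (N - 1)) for n in 0:N-1]

# Split x into frames of frame_len samples; consecutive frames start hop samples apart.
function frames(x::AbstractVector, frame_len::Int, hop::Int)
    n_frames = (length(x) - frame_len) ÷ hop + 1
    return [x[(k-1)*hop+1 : (k-1)*hop+frame_len] for k in 1:n_frames]
end

# 0.1 s test tone at 8 kHz (made-up input, only for illustration).
x = sin.(2π * 200 .* (0:799) ./ 8000)

w = hamming(160)
# 160-sample frames that advance by 80 samples, so neighbours share 80 samples.
windowed = [f .* w for f in frames(x, 160, 80)]
```

With a hop of 80 samples, every sample away from the signal edges is covered by two frames, which is what lets the per-frame coefficients change smoothly over time.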
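The autocorrelation coefficients up to the tenth order are computed, and the Levinson-Durbin algorithm then derives the reflection coefficients from them. The original speech signal is passed through the analysis filter, an all-zero filter whose coefficients are the reflection coefficients obtained above. The output of this filter is the residual signal.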
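For readers who want to see what happens inside this step, here is a minimal sketch of the biased autocorrelation and the Levinson-Durbin recursion in plain Julia. The function names `autocorr` and `levinson` are hypothetical, and the sign convention (prediction polynomial A(z) = 1 + a1 z^-1 + ..., with the last coefficient of each order equal to the reflection coefficient) is an assumption that may differ from the one EngeeDSP uses.

```julia
# Biased autocorrelation estimate for lags 0..p.
function autocorr(x::AbstractVector, p::Int)
    N = length(x)
    return [sum(x[n] * x[n+k] for n in 1:N-k) / N for k in 0:p]
end

# Levinson-Durbin recursion.
# r holds the autocorrelation for lags 0..p (r[1] is lag 0).
# Returns the polynomial a = [1, a1, ..., ap], the reflection
# coefficients k[1..p] and the final prediction error.
function levinson(r::AbstractVector, p::Int)
    a = zeros(p + 1); a[1] = 1.0
    k = zeros(p)
    err = r[1]
    for m in 1:p
        acc = r[m+1]
        for i in 1:m-1
            acc += a[i+1] * r[m-i+1]
        end
        k[m] = -acc / err
        prev = copy(a)
        for i in 1:m-1
            a[i+1] = prev[i+1] + k[m] * prev[m-i+1]
        end
        a[m+1] = k[m]
        err *= 1 - k[m]^2
    end
    return a, k, err
end

# Autocorrelation of an ideal first-order process, r(lag) = 0.5^lag.
r = [0.5^lag for lag in 0:10]
A, K, E = levinson(r, 10)
```

For this input only the first reflection coefficient is nonzero, because a first-order predictor already explains the whole correlation structure.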
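The residual signal is then passed through the synthesis filter, which is the inverse of the analysis filter.
The output of the synthesis filter is the reconstructed signal, which approximates the original speech.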
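The inverse relationship between the two filters can be checked with a small lattice implementation. The sketch below is written under simplifying assumptions: the reflection coefficients are fixed rather than updated every frame, nothing is quantized, and the function names are hypothetical. Under those conditions the synthesis filter recovers the input exactly (up to rounding).

```julia
# All-zero (lattice MA) analysis filter: x -> residual e.
function lattice_analysis(x::AbstractVector, k::AbstractVector)
    p = length(k)
    gdelay = zeros(p)                 # gdelay[m] holds g_{m-1}[n-1]
    e = zeros(length(x))
    for n in eachindex(x)
        f = x[n]; g = x[n]            # stage 0: f_0[n] = g_0[n] = x[n]
        for m in 1:p
            gd = gdelay[m]
            gdelay[m] = g             # keep g_{m-1}[n] for the next sample
            fnew = f + k[m] * gd      # f_m[n]
            g = k[m] * f + gd         # g_m[n]
            f = fnew
        end
        e[n] = f
    end
    return e
end

# All-pole (lattice AR) synthesis filter: residual e -> y.
function lattice_synthesis(e::AbstractVector, k::AbstractVector)
    p = length(k)
    gdelay = zeros(p)                 # gdelay[m] holds g_{m-1}[n-1]
    y = zeros(length(e))
    for n in eachindex(e)
        f = e[n]                      # f_p[n]
        for m in p:-1:1
            f -= k[m] * gdelay[m]     # f_{m-1}[n]
            if m < p
                gdelay[m+1] = k[m] * f + gdelay[m]   # g_m[n]
            end
        end
        gdelay[1] = f                 # g_0[n] = y[n]
        y[n] = f
    end
    return y
end

k = [-0.5, 0.3, 0.1]                  # made-up reflection coefficients
x = [sin(0.3n) for n in 1:50]
residual = lattice_analysis(x, k)
restored = lattice_synthesis(residual, k)
```

In the model itself the residual and the coefficients are quantized before synthesis, so the reconstruction there is only approximate.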
In [ ]:
mutable struct LPCAnalysisAndSynthesisOfSpeech
obj_Pre_Emphasis
obj_Overlap_Analysis_Windows
obj_Window
obj_Autocorrelation
obj_Levinson_Durbin
obj_Time_Varying_Analysis_Filter
obj_Pad
obj_FFT
obj_MathFunction1
obj_RC_To_InvSine
obj_5_bit_Quantizer
obj_6_bit_Quantizer
obj_Inv_Sine_to_RC
obj_Time_Varying_Synthesis_Filter
obj_De_emphasis_Filter
function LPCAnalysisAndSynthesisOfSpeech()
new(
DescretFIRFilter("Dialog parameters","Direct form",[1, -.95],"Columns as channels",0,false,"None"),
Buffer(160,80,0),
WindowFunction("Apply window to input","Hamming","Symmetric"),
Autocorrelation("Biased",10),
LevinsonDurbin("A and K",false,true),
DescretFIRFilter("Input port","Lattice MA","Columns as channels",0,false,"None"),
EngeeDSP.Pad("Columns","Specify via dialog",0,"User-specified",256,"End","None"),
EngeeFFT("Auto",false,false,true),
MathFunction("reciprocal","Exact","auto"),
TrigonometricFunction("asin","auto",false),
Quantizer(0.1,false),
Quantizer(0.03125,false),
TrigonometricFunction("sin","None","auto"),
AllpoleFilter("Input port","Lattice AR","Columns as channels",0),
AllpoleFilter("Dialog parameters","Direct form",[1 , -.95],"Columns as channels",0)
)
end
end
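The `obj_RC_To_InvSine`, `obj_5_bit_Quantizer` and `obj_Inv_Sine_to_RC` fields form a chain: reflection coefficients are mapped through `asin`, quantized, and mapped back through `sin`. A minimal sketch of that chain follows. It assumes that the first argument of `Quantizer` is the quantization interval and that the block computes `step * round(x / step)`; neither is confirmed by the source, and the `quantize` helper and sample coefficients are hypothetical.

```julia
# Uniform quantizer with a given interval (assumed behaviour of the Quantizer block).
quantize(x, step) = step * round(x / step)

k  = [-0.92, 0.41, -0.13, 0.05]   # made-up reflection coefficients
g  = asin.(k)                     # arcsine domain
gq = quantize.(g, 0.1)            # interval taken from Quantizer(0.1, false)
kq = sin.(gq)                     # back to reflection coefficients
```

Because `sin` never leaves the range [-1, 1], the restored coefficients cannot exceed magnitude 1 however the rounding falls, and quantizing in the arcsine domain gives finer resolution to coefficients close to ±1.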
Set up the LPC blocks.
In [ ]:
function setup(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
Pre_Emphasis_set = EngeeDSP.setup(obj.obj_Pre_Emphasis,In1)
Overlap_Analysis_Windows_set = EngeeDSP.setup(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_set);
Window_set = EngeeDSP.setup(obj.obj_Window,Overlap_Analysis_Windows_set);
Autocorrelation_set = EngeeDSP.setup(obj.obj_Autocorrelation,Window_set);
Levinson_Durbin_set = EngeeDSP.setup(obj.obj_Levinson_Durbin,Autocorrelation_set);
Time_Varying_Analysis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_set,Levinson_Durbin_set[2]);
RC_To_InvSine_set = EngeeDSP.setup(obj.obj_RC_To_InvSine,Levinson_Durbin_set[2]);
bit_Quantizer_5_set = EngeeDSP.setup(obj.obj_5_bit_Quantizer,RC_To_InvSine_set);
bit_Quantizer_6_set = EngeeDSP.setup(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_set);
Inv_Sine_to_RC_set = EngeeDSP.setup(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_set);
Time_Varying_Synthesis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_set,Inv_Sine_to_RC_set);
De_emphasis_Filter_set = EngeeDSP.setup(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_set);
end
Declare the function that performs one processing step.
In [ ]:
function step1(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
Pre_Emphasis_out = EngeeDSP.step(obj.obj_Pre_Emphasis,In1);
Overlap_Analysis_Windows_out = EngeeDSP.step(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_out);
Window_out = EngeeDSP.step(obj.obj_Window,Overlap_Analysis_Windows_out);
Autocorrelation_out = EngeeDSP.step(obj.obj_Autocorrelation,Window_out);
Levinson_Durbin_out = EngeeDSP.step(obj.obj_Levinson_Durbin,Autocorrelation_out);
Time_Varying_Analysis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_out,Levinson_Durbin_out[2]);
Pad_out = EngeeDSP.step(obj.obj_Pad,Levinson_Durbin_out[1]);
FFT_out = EngeeDSP.step(obj.obj_FFT,Pad_out);
MathFunction1_out = EngeeDSP.step(obj.obj_MathFunction1,FFT_out);
RC_To_InvSine_out = EngeeDSP.step(obj.obj_RC_To_InvSine,Levinson_Durbin_out[2]);
bit_Quantizer_5_out = EngeeDSP.step(obj.obj_5_bit_Quantizer,RC_To_InvSine_out);
bit_Quantizer_6_out = EngeeDSP.step(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_out);
Inv_Sine_to_RC_out = EngeeDSP.step(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_out);
Time_Varying_Synthesis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_out,Inv_Sine_to_RC_out);
De_emphasis_Filter_out = EngeeDSP.step(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_out);
De_emphasis_Filter_out,MathFunction1_out;
end
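The first and last blocks of `step1` are a matched pair: the pre-emphasis FIR filter with coefficients `[1, -.95]` and the de-emphasis all-pole filter with the same coefficients. A minimal plain-Julia sketch of the pair, with hypothetical function names and zero initial state, shows that one undoes the other:

```julia
# Pre-emphasis: y[n] = x[n] - a * x[n-1]
function preemphasis(x::AbstractVector, a=0.95)
    y = zeros(length(x))
    prev = 0.0
    for n in eachindex(x)
        y[n] = x[n] - a * prev
        prev = x[n]
    end
    return y
end

# De-emphasis: x[n] = y[n] + a * x[n-1]
function deemphasis(y::AbstractVector, a=0.95)
    x = zeros(length(y))
    prev = 0.0
    for n in eachindex(y)
        x[n] = y[n] + a * prev
        prev = x[n]
    end
    return x
end

x = [sin(0.2n) for n in 1:40]      # made-up input
y = preemphasis(x)
x_back = deemphasis(y)
```

Pre-emphasis boosts the high frequencies before the analysis stage, and de-emphasis restores the original spectral tilt after synthesis.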
Running the LPC algorithm
Call the functions and process the signal frame by frame.
In [ ]:
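obj = LPCAnalysisAndSynthesisOfSpeech()
setup(obj,In1[1]);
Out_a = zeros(size(vcat(In1...)))   # reconstructed speech, same length as the input
Out_p = In1.*1im                    # complex container for the per-frame spectra
# As in the original loop, the last frame is not processed.
for i = 0:size(In1,1) - 2
    # Call step1 once per frame: the blocks keep internal state, so a second
    # call on the same frame would advance the buffer and the filters twice.
    output = step1(obj,In1[i+1])
    Out_a[sample*i + 1 : sample*(i + 1)] = output[1]
    Out_p[i+1] = output[2]
end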
Processing and analysing the results
Set up the audio player.
In [ ]:
In2 = vcat(In1...);
function audioplayer(s, fs)
    buf = IOBuffer()
    wavwrite(s, buf; Fs=fs)              # encode the samples as WAV in memory
    data = base64encode(take!(buf))      # embed the WAV bytes as base64
    markup = """<audio controls="controls">
    <source src="data:audio/wav;base64,$data" type="audio/wav" />
    Your browser does not support the audio element.
    </audio>"""
    display("text/html", markup)
end
Play the original audio.
In [ ]:
audioplayer(In2, 8000)
Play the audio reconstructed after LPC coding.
In [ ]:
audioplayer(Out_a, 8000)
Plot the original and reconstructed signals for a visual comparison.
In [ ]:
plot(In2)
plot!(Out_a)
Prepare the data for the spectrum analyzer plots.
In [ ]:
u1 = Out_p[1]; u2 = Out_p[2];
uSubset1 = u1[1:floor(Int,length(u1)/2+1)];
uSubset2 = u2[1:floor(Int,length(u2)/2+1)];
y1 = 6.02059991327962 .* log2.(abs.(uSubset1)+hypot.(eps.(real(uSubset1)),eps.(imag(uSubset1))));
y2 = 6.02059991327962 .* log2.(abs.(uSubset2)+hypot.(eps.(real(uSubset2)),eps.(imag(uSubset2))));
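Two details of the cell above are easy to miss: the constant 6.02059991327962 is 20·log10(2), so `6.02059991327962 * log2(x)` equals `20 * log10(x)`, and keeping the first N/2 + 1 bins of a 256-point spectrum gives the 129 points plotted next. The sketch below reproduces the Pad, FFT, reciprocal and decibel chain for a made-up two-coefficient polynomial. It uses a naive DFT instead of an FFT so that it needs no packages, and the helper names are hypothetical.

```julia
# 20*log10(x) written through log2, as in the notebook cell.
to_db(x) = 6.02059991327962 * log2(x)

# Naive N-point DFT of x, zero-padded to length N.
function dft(x::AbstractVector, N::Int)
    return [sum(x[n+1] * cis(-2π * k * n / N) for n in 0:length(x)-1) for k in 0:N-1]
end

A = [1.0, -0.9]                              # made-up prediction polynomial
H = 1 ./ dft(A, 256)                         # spectral envelope 1 / A(e^{jω})
half = H[1:floor(Int, length(H)/2 + 1)]      # bins 0..N/2, i.e. 129 values
envelope_db = to_db.(abs.(half))
```

For this polynomial the envelope is highest at zero frequency, where |A| = 0.1, and lowest at the Nyquist bin, where |A| = 1.9.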
In [ ]:
plot([1:129], y1[:], fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
a = plot!([1:129],y1[:], msw = 0, ms = 2.5, xlabel = "Frequency")
plot([1:129], y2[:], fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
b = plot!([1:129],y2[:], msw = 0, ms = 2.5, xlabel = "Frequency")
plot!(a,b)
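Conclusion
In this demonstration we examined how to work with the functions of the EngeeDSP library and showed that library blocks can be created and driven from a script. The example shows how the LPC coding method works and lets us evaluate its effectiveness.
LPC is one of the most powerful speech analysis methods and one of the most useful for encoding good-quality speech at a low bit rate, and it provides highly accurate estimates of speech parameters.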