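LPC Speech Analysis and Synthesis
This example shows how to implement a speech compression method with the EngeeDSP library. It implements linear predictive coding (LPC), a technique used mainly in audio and speech processing that represents the spectral envelope of a digital speech signal in compressed form using the parameters of a linear predictive model. The model performs both LPC analysis and LPC synthesis of a speech signal.
In the analysis part, reflection coefficients are extracted from the signal and used to compute the residual signal.
In the synthesis part, the signal is reconstructed from the residual signal and the reflection coefficients.
Compared with the original speech signal, the residual signal and the reflection coefficients require fewer bits to encode.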
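As a rough illustration of that saving, the sketch below counts bits per 80-sample frame. It assumes the bit widths suggested by the quantizer names used later in this example (5 bits per reflection coefficient, 6 bits per residual sample), ten coefficients per frame, and 16-bit PCM for the original signal. The PCM width in particular is an assumption and is not stated anywhere in the example.

```julia
# Assumed parameters (illustrative only, see the paragraph above).
frame_len  = 80   # samples per frame
n_coeffs   = 10   # reflection coefficients per frame
bits_coeff = 5    # bits per quantized coefficient (assumed from the block name)
bits_resid = 6    # bits per quantized residual sample (assumed from the block name)
bits_pcm   = 16   # bits per original sample (assumption)

# Bits needed for one frame before and after LPC coding.
bits_original = frame_len * bits_pcm
bits_encoded  = n_coeffs * bits_coeff + frame_len * bits_resid
ratio = bits_original / bits_encoded

println("original: $bits_original bits, encoded: $bits_encoded bits, ratio: $(round(ratio, digits = 2))")
```

Under these assumptions most of the encoded bits go to the residual, so the residual quantizer resolution dominates the achievable compression.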
Loading libraries and declaring input data
Loading libraries
In [ ]:
import Pkg
Pkg.add(["WAV"])
In [ ]:
using .EngeeDSP;
using Plots;
plotlyjs();
using WAV;
using Base64;
Declaring variables
In [ ]:
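sample = 80;                  # samples per input frame
var_load = load_audio();
step = EngeeDSP.step;         # short alias for the block step function
# Read the WAV file as a sequence of frames of `sample` samples each.
In1 = step( var_load, "$(@__DIR__)/check_signal.wav", sample );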
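Declaring the LPC data structure and functions
In this model, the speech signal is divided into frames of 160 samples with an overlap of 80 samples between consecutive frames, which corresponds to the Buffer(160,80,0) block in the code below.
Each frame is multiplied by a Hamming window.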
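The framing and windowing step can be sketched in plain Julia without EngeeDSP. The helper names `hamming` and `overlapping_frames` are hypothetical, the window is the textbook symmetric Hamming definition, and the test tone is made up. Unlike a buffer block with initial conditions, this sketch does not prepend zeros before the first frame.

```julia
# Symmetric Hamming window of length n (standard textbook definition).
hamming(n) = [0.54 - 0.46 * cos(2π * k / (n - 1)) for k in 0:n-1]

# Split x into overlapping frames: each frame has `framelen` samples and
# shares `overlap` samples with the previous one (hop = framelen - overlap).
function overlapping_frames(x::AbstractVector, framelen::Int, overlap::Int)
    hop = framelen - overlap
    nframes = max(0, fld(length(x) - framelen, hop) + 1)
    return [x[(i - 1) * hop + 1 : (i - 1) * hop + framelen] for i in 1:nframes]
end

x = sin.(2π * 200 .* (0:799) ./ 8000)      # 0.1 s test tone sampled at 8 kHz
frames = overlapping_frames(x, 160, 80)     # 160-sample frames, 80-sample overlap
w = hamming(160)
windowed = [f .* w for f in frames]         # apply the window to every frame
```

With an 80-sample hop, the second half of each frame reappears as the first half of the next one, so every input sample (except at the edges) is analyzed twice.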
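Autocorrelation coefficients are computed up to lag 10, and the reflection coefficients are then derived from them with the Levinson-Durbin algorithm. The original speech signal is passed through an analysis filter, an all-zero filter whose coefficients are the reflection coefficients obtained above. The filter output is the residual signal.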
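A minimal sketch of these two steps in plain Julia is given here. The function names `autocorr_biased` and `levinson_durbin` are hypothetical and are not the EngeeDSP API. The sign convention used (the last polynomial coefficient at each order equals the reflection coefficient) is one common choice, and the library blocks may use a different one.

```julia
# Biased autocorrelation estimate for lags 0..maxlag (requires maxlag < length(x)).
function autocorr_biased(x::AbstractVector{<:Real}, maxlag::Int)
    n = length(x)
    return [sum(x[t] * x[t + lag] for t in 1:n-lag) / n for lag in 0:maxlag]
end

# Levinson-Durbin recursion.
# Input:  r = [r(0), r(1), ..., r(p)]
# Output: a   = [1, a1, ..., ap], coefficients of the prediction-error filter A(z)
#         k   = reflection coefficients k1..kp
#         err = final prediction-error power
function levinson_durbin(r::AbstractVector{<:Real})
    p = length(r) - 1
    a = zeros(p + 1); a[1] = 1.0
    k = zeros(p)
    err = float(r[1])
    for m in 1:p
        # acc = r(m) + sum_{i=1}^{m-1} a_i * r(m - i)
        acc = r[m + 1]
        for i in 1:m-1
            acc += a[i + 1] * r[m - i + 1]
        end
        k[m] = -acc / err
        prev = copy(a)
        # Order update: a_i <- a_i + k_m * a_{m-i}
        for i in 1:m-1
            a[i + 1] = prev[i + 1] + k[m] * prev[m - i + 1]
        end
        a[m + 1] = k[m]
        err *= 1 - k[m]^2
    end
    return a, k, err
end

r_x = autocorr_biased([1.0, 2.0, 3.0, 4.0], 2)   # small made-up signal
r = [1.0, 0.5, 0.25]                              # autocorrelation of an ideal AR(1) process
a, k, err = levinson_durbin(r)
```

For the AR(1) autocorrelation used here, the recursion recovers a first-order predictor and the second reflection coefficient vanishes, which is a convenient sanity check.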
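The residual signal is passed through a synthesis filter, which is the inverse of the analysis filter.
The output of the synthesis filter is the reconstructed speech signal, which matches the original up to the quantization applied to the residual and to the coefficients.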
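The inverse relationship between the two filters can be checked with a small lattice sketch in plain Julia. The function names are hypothetical, the reflection coefficients are fixed and made up (the model updates them every frame), and no quantization is applied, so the reconstruction here is exact up to rounding error.

```julia
# All-zero (MA) lattice analysis filter driven by reflection coefficients k.
function lattice_analysis(x::AbstractVector{<:Real}, k::AbstractVector{<:Real})
    p = length(k)
    bprev = zeros(p)                  # bprev[m] holds b_{m-1}[n-1]
    e = zeros(length(x))
    for n in 1:length(x)
        f = float(x[n])               # f_0[n]
        b = float(x[n])               # b_0[n]
        for m in 1:p
            fnew = f + k[m] * bprev[m]    # f_m[n]
            bnew = k[m] * f + bprev[m]    # b_m[n]
            bprev[m] = b                  # keep b_{m-1}[n] for the next sample
            f, b = fnew, bnew
        end
        e[n] = f                      # residual sample
    end
    return e
end

# All-pole (AR) lattice synthesis filter, the inverse of lattice_analysis.
function lattice_synthesis(e::AbstractVector{<:Real}, k::AbstractVector{<:Real})
    p = length(k)
    bprev = zeros(p)                  # bprev[m] holds b_{m-1}[n-1]
    y = zeros(length(e))
    for n in 1:length(e)
        f = float(e[n])               # f_p[n]
        bcur = zeros(p + 1)           # bcur[m + 1] will hold b_m[n]
        for m in p:-1:1
            f = f - k[m] * bprev[m]               # f_{m-1}[n]
            bcur[m + 1] = k[m] * f + bprev[m]     # b_m[n]
        end
        y[n] = f                      # reconstructed sample, equal to f_0[n]
        bcur[1] = f                   # b_0[n] = y[n]
        bprev .= bcur[1:p]            # b_0[n]..b_{p-1}[n] become the next delays
    end
    return y
end

x = [sin(0.3 * t) + 0.2 * cos(1.1 * t) for t in 1:50]   # made-up test signal
k = [-0.5, 0.3, 0.1]                                     # made-up reflection coefficients
residual = lattice_analysis(x, k)
reconstructed = lattice_synthesis(residual, k)
```

Because both filters maintain the same backward-error states, feeding the residual through the synthesis lattice with the same coefficients returns the input signal.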
In [ ]:
mutable struct LPCAnalysisAndSynthesisOfSpeech
    obj_Pre_Emphasis
    obj_Overlap_Analysis_Windows
    obj_Window
    obj_Autocorrelation
    obj_Levinson_Durbin
    obj_Time_Varying_Analysis_Filter
    obj_Pad
    obj_FFT
    obj_MathFunction1
    obj_RC_To_InvSine
    obj_5_bit_Quantizer
    obj_6_bit_Quantizer
    obj_Inv_Sine_to_RC
    obj_Time_Varying_Synthesis_Filter
    obj_De_emphasis_Filter
    function LPCAnalysisAndSynthesisOfSpeech()
        new(
            DescretFIRFilter("Dialog parameters","Direct form",[1, -.95],"Columns as channels",0,false,"None"),
            Buffer(160,80,0),
            WindowFunction("Apply window to input","Hamming","Symmetric"),
            Autocorrelation("Biased",10),
            LevinsonDurbin("A and K",false,true),
            DescretFIRFilter("Input port","Lattice MA","Columns as channels",0,false,"None"),
            EngeeDSP.Pad("Columns","Specify via dialog",0,"User-specified",256,"End","None"),
            EngeeFFT("Auto",false,false,true),
            MathFunction("reciprocal","Exact","auto"),
            TrigonometricFunction("asin","auto",false),
            Quantizer(0.1,false),
            Quantizer(0.03125,false),
            TrigonometricFunction("sin","None","auto"),
            AllpoleFilter("Input port","Lattice AR","Columns as channels",0),
            AllpoleFilter("Dialog parameters","Direct form",[1 , -.95],"Columns as channels",0)
        )
    end
end
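The constructor above chains an asin block, a quantizer, and a sin block for the reflection coefficients. The sketch below illustrates that chain in plain Julia. It assumes that the first Quantizer argument is the quantization interval (0.1 here) and that quantization means rounding to the nearest multiple of that interval; both are assumptions about the block, and the coefficient values are made up. Mapping through asin before uniform quantization is a common way to get finer resolution for coefficients close to ±1, where the filter is most sensitive.

```julia
# Uniform quantizer: round x to the nearest multiple of the interval q (assumed behavior).
quantize(x, q) = q * round(x / q)

k  = [-0.9, -0.35, 0.12, 0.6]   # made-up reflection coefficients, all inside (-1, 1)
g  = asin.(k)                   # map to the inverse-sine domain
gq = quantize.(g, 0.1)          # quantize with interval 0.1
kq = sin.(gq)                   # map back to reflection coefficients

max_err = maximum(abs.(kq .- k))
println("largest coefficient error: ", max_err)
```

Since the sine function never changes faster than its argument, the error in each recovered coefficient is bounded by half the quantization interval.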
Setting up the LPC blocks.
In [ ]:
function setup(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
    Pre_Emphasis_set = EngeeDSP.setup(obj.obj_Pre_Emphasis,In1)
    Overlap_Analysis_Windows_set = EngeeDSP.setup(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_set);
    Window_set = EngeeDSP.setup(obj.obj_Window,Overlap_Analysis_Windows_set);
    Autocorrelation_set = EngeeDSP.setup(obj.obj_Autocorrelation,Window_set);
    Levinson_Durbin_set = EngeeDSP.setup(obj.obj_Levinson_Durbin,Autocorrelation_set);
    Time_Varying_Analysis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_set,Levinson_Durbin_set[2]);
    RC_To_InvSine_set = EngeeDSP.setup(obj.obj_RC_To_InvSine,Levinson_Durbin_set[2]);
    bit_Quantizer_5_set = EngeeDSP.setup(obj.obj_5_bit_Quantizer,RC_To_InvSine_set);
    bit_Quantizer_6_set = EngeeDSP.setup(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_set);
    Inv_Sine_to_RC_set = EngeeDSP.setup(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_set);
    Time_Varying_Synthesis_Filter_set = EngeeDSP.setup(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_set,Inv_Sine_to_RC_set);
    De_emphasis_Filter_set = EngeeDSP.setup(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_set);
end
Out[0]:
Declaring the function that runs one processing step on a frame.
In [ ]:
function step1(obj::LPCAnalysisAndSynthesisOfSpeech,In1)
    Pre_Emphasis_out = EngeeDSP.step(obj.obj_Pre_Emphasis,In1);
    Overlap_Analysis_Windows_out = EngeeDSP.step(obj.obj_Overlap_Analysis_Windows,Pre_Emphasis_out);
    Window_out = EngeeDSP.step(obj.obj_Window,Overlap_Analysis_Windows_out);
    Autocorrelation_out = EngeeDSP.step(obj.obj_Autocorrelation,Window_out);
    Levinson_Durbin_out = EngeeDSP.step(obj.obj_Levinson_Durbin,Autocorrelation_out);
    Time_Varying_Analysis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Analysis_Filter,Pre_Emphasis_out,Levinson_Durbin_out[2]);
    Pad_out = EngeeDSP.step(obj.obj_Pad,Levinson_Durbin_out[1]);
    FFT_out = EngeeDSP.step(obj.obj_FFT,Pad_out);
    MathFunction1_out = EngeeDSP.step(obj.obj_MathFunction1,FFT_out);
    RC_To_InvSine_out = EngeeDSP.step(obj.obj_RC_To_InvSine,Levinson_Durbin_out[2]);
    bit_Quantizer_5_out = EngeeDSP.step(obj.obj_5_bit_Quantizer,RC_To_InvSine_out);
    bit_Quantizer_6_out = EngeeDSP.step(obj.obj_6_bit_Quantizer,Time_Varying_Analysis_Filter_out);
    Inv_Sine_to_RC_out = EngeeDSP.step(obj.obj_Inv_Sine_to_RC,bit_Quantizer_5_out);
    Time_Varying_Synthesis_Filter_out = EngeeDSP.step(obj.obj_Time_Varying_Synthesis_Filter,bit_Quantizer_6_out,Inv_Sine_to_RC_out);
    De_emphasis_Filter_out = EngeeDSP.step(obj.obj_De_emphasis_Filter,Time_Varying_Synthesis_Filter_out);
    return (De_emphasis_Filter_out, MathFunction1_out)
end
Out[0]:
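The Pad, FFT, and reciprocal blocks in step1 take the first output of the Levinson-Durbin block (assumed here to be the prediction polynomial A(z)), zero-pad it to 256 points, and invert its spectrum. The result is a sampled version of the LPC spectral envelope 1/A(e^{jω}). A minimal sketch with a made-up first-order polynomial follows; it uses a naive DFT written inline so that no FFT package is needed, and the name `naive_dft` is hypothetical.

```julia
# Naive O(N^2) DFT, sufficient for a 256-point illustration.
function naive_dft(x::AbstractVector)
    N = length(x)
    return [sum(x[n + 1] * cis(-2π * k * n / N) for n in 0:N-1) for k in 0:N-1]
end

a = [1.0, -0.9]                            # made-up polynomial A(z) = 1 - 0.9 z^-1
padded = vcat(a, zeros(256 - length(a)))   # zero-pad at the end to 256 points
envelope = 1 ./ naive_dft(padded)          # samples of 1 / A(e^{jω})
half = envelope[1:129]                     # bins from zero frequency up to Nyquist
```

For this low-pass example the envelope magnitude is largest at zero frequency and smallest at the Nyquist bin, as expected for a pole near z = 1.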
Implementing the LPC algorithm
Calling the functions and processing the outputs.
In [ ]:
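obj = LPCAnalysisAndSynthesisOfSpeech()
setup(obj,In1[1]);
Out_a = zeros(size(vcat(In1...)))   # reconstructed signal
Out_p = In1.*1im                    # per-frame complex spectra
# The last frame is skipped, as in the original loop bounds.
for i = 0:size(In1,1) - 2
    # Call step1 once per frame: the blocks presumably keep internal state,
    # so a second call on the same frame would advance them twice.
    output_a, output_p = step1(obj,In1[i+1])
    Out_a[sample*i + 1 : sample*(i + 1)] = output_a
    Out_p[i+1] = output_p
end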
Processing and analyzing the results
Setting up the audio player
In [ ]:
In2 = vcat(In1...);
function audioplayer(s, fs)
    buf = IOBuffer();
    wavwrite(s, buf; Fs=fs);
    # Encode the WAV bytes written to the buffer as Base64 for the data URI.
    data = base64encode(take!(buf));
    markup = """<audio controls="controls">
    <source src="data:audio/wav;base64,$data" type="audio/wav" />
    Your browser does not support the audio element.
    </audio>"""
    display("text/html", markup);
end
Out[0]:
Playing the original audio file
In [ ]:
audioplayer(In2, 8000)
Playing the encoded audio file
In [ ]:
audioplayer(Out_a, 8000)
Visual comparison of the original and encoded signals.
In [ ]:
plot(In2)
plot!(Out_a)
Out[0]:
Spectrum analyzer plot output.
In [ ]:
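# Spectra computed for the first two frames.
u1 = Out_p[1]; u2 = Out_p[2];
# Keep bins 1 to N/2 + 1 (zero frequency up to Nyquist).
uSubset1 = u1[1:floor(Int,length(u1)/2+1)];
uSubset2 = u2[1:floor(Int,length(u2)/2+1)];
# Convert magnitude to decibels: 6.0206 * log2(x) equals 20 * log10(x).
# The eps term keeps the logarithm finite when a bin is exactly zero.
y1 = 6.02059991327962 .* log2.(abs.(uSubset1)+hypot.(eps.(real(uSubset1)),eps.(imag(uSubset1))));
y2 = 6.02059991327962 .* log2.(abs.(uSubset2)+hypot.(eps.(real(uSubset2)),eps.(imag(uSubset2))));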
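The constant 6.02059991327962 in the cell above is 20·log10(2), so multiplying a base-2 logarithm by it gives the usual magnitude in decibels. A short check in plain Julia, with made-up magnitudes:

```julia
c = 20 * log10(2)                  # approximately 6.0206, the constant used above
x = [0.5, 1.0, 2.0, 10.0]          # made-up magnitudes
db_from_log2  = c .* log2.(x)      # form used in the notebook cell
db_from_log10 = 20 .* log10.(x)    # conventional decibel formula

println(maximum(abs.(db_from_log2 .- db_from_log10)))   # difference is at rounding level
```

Both forms agree to floating-point precision, so the choice between them is a matter of style.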
In [ ]:
plot([1:129], y1[:], fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
a = plot!([1:129],y1[:], msw = 0, ms = 2.5, xlabel = "Frequency")
plot([1:129], y2[:], fillrange = -20, fillalpha = 0.35, c = 1, ylabel = "Mag^2", legend = false)
b = plot!([1:129],y2[:], msw = 0, ms = 2.5, xlabel = "Frequency")
plot!(a,b)
Out[0]:
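Conclusion
In this demonstration we learned how to work with functions from the EngeeDSP library and showed that interactive blocks can be created in scripts. The example illustrates how LPC coding works and makes it possible to evaluate its effectiveness.
LPC is one of the most powerful speech analysis methods and one of the most useful methods for encoding good-quality speech at low bit rates, and it provides highly accurate estimates of speech parameters.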