CN102543086B - A device and method for voice bandwidth extension based on audio watermark - Google Patents

Info

Publication number
CN102543086B
CN102543086B CN2011104223927A CN201110422392A CN102543086B CN 102543086 B CN102543086 B CN 102543086B CN 2011104223927 A CN2011104223927 A CN 2011104223927A CN 201110422392 A CN201110422392 A CN 201110422392A CN 102543086 B CN102543086 B CN 102543086B
Authority
CN
China
Prior art keywords
frequency
watermark
domain envelope
parameters
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2011104223927A
Other languages
Chinese (zh)
Other versions
CN102543086A (en)
Inventor
陈喆
殷福亮
赵承勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology
Priority to CN2011104223927A
Publication of CN102543086A
Application granted
Publication of CN102543086B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a device and method for voice bandwidth extension based on an audio watermark. The voice uttered by a speaker is a wideband signal. Before transmission over the telephone line, the sending end embeds high-frequency parameters into the narrowband code stream and transmits the narrowband voice signal over the telephone line. The receiving end performs A-law decoding, extracts the high-frequency parameters, uses them to restore the high-frequency part of the wideband voice, and finally synthesizes the high-frequency and low-frequency voice into wideband voice. The device and method use the properties of the audio watermark to establish a hidden channel in the narrowband voice and transmit the high-frequency voice parameters over it, thereby extending the frequency band of the voice signal without changing the existing network protocol.

Figure 201110422392

Description

A device and method for voice bandwidth extension based on audio watermark

Technical field

The invention relates to speech processing technology, and in particular to a device and method for extending voice bandwidth based on an audio watermark.

Background art

The main energy of the human speech signal is concentrated between 0.3 and 3.4 kHz, so a bandwidth of 4 kHz guarantees sufficient intelligibility. For this reason G.711 (A-law and μ-law), the coding standard for the public switched telephone network (PSTN) formulated by the International Telecommunication Union (ITU), uses a sampling frequency of 8 kHz, and it is still in use today.

Narrowband speech lowers the demand on communication bandwidth while preserving a certain level of intelligibility, but it does so at the expense of naturalness: the high-frequency components of the original speech are lost, so the result does not sound natural. To improve voice quality, ITU-T proposed G.722, the first wideband speech codec for teleconferencing. Wideband voice communication can be achieved by redesigning the transmission link, but for the vast PSTN fixed-line network the cost of doing so is too high.

A traditional watermark is the mark visible when paper is held up to the light, and it is generally used to verify the authenticity of important bills. Digital watermarking exploits the redundancy and randomness that are common in digital multimedia works to embed digital information into a work and thus transmit it covertly. Digital watermarks are mainly used to protect the copyright and integrity of digital works. Because human hearing is more sensitive than vision, embedding a watermark in audio is much harder than embedding one in an image.

Audio watermarking based on the least significant bit (LSB): LSB-based voice bandwidth extension embeds the high-frequency parameters in the lowest bit of the coded stream. The method has a large embedding capacity and a simple algorithm, and it suits communication channels with a low bit error rate.

Audio watermarking based on time-domain echo hiding: this method exploits temporal masking in human hearing, in which a sound that has already ended still affects the perception of another sound. Its embedding capacity is small, and the embedded watermark has some effect on the original sound.

Audio watermarking based on the frequency-domain discrete Fourier transform (DFT): this method first applies the DFT to the audio, then selects the DFT coefficients in the 2.4 to 6.4 kHz range for embedding and replaces them with spectral components that represent the watermark sequence. The method is robust, but when the embedded watermark differs too much from the original DFT coefficients the effect on the original speech is large.

Audio watermarking based on the frequency-domain discrete cosine transform: this method first applies the discrete cosine transform to the time-domain signal, then applies the modified discrete cosine transform (MDCT) to the sequence, and embeds the watermark by altering the MDCT coefficients. The method is robust, but its embedding capacity is small.

Disadvantages of the prior art: none of the above methods balances robustness, imperceptibility, and embedding capacity well. Each has its own shortcomings, so none is well suited to voice bandwidth extension.

Summary of the invention

In view of the shortcomings of existing audio watermarking approaches to bandwidth extension, the invention provides a device and method for voice bandwidth extension based on an audio watermark.

To achieve the above purpose, the method for voice bandwidth extension based on an audio watermark provided by the invention comprises the following steps:

Step A. The QMF analysis filter bank module divides the wideband speech into two parts: narrowband speech of 0~8000 Hz and a high-frequency component of 8000~16000 Hz. The sampling frequency of both output signals is reduced to 8 kHz, giving the low-frequency signal s_L(n) and the high-frequency signal s_H(n).
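The band split in step A can be illustrated with the simplest possible QMF pair. This is a minimal sketch, assuming 2-tap Haar filters as a stand-in, because the patent does not list its filter coefficients; the function name `qmf_analysis` is hypothetical.

```python
import math

def qmf_analysis(x):
    """Split x into a low band and a high band, each at half the input rate.

    Assumption: 2-tap Haar filters stand in for the patent's unspecified
    QMF coefficients. Filtering and 2:1 decimation are fused into one step.
    """
    if len(x) % 2:
        raise ValueError("input length must be even")
    scale = 1.0 / math.sqrt(2.0)
    half = len(x) // 2
    low = [(x[2 * n] + x[2 * n + 1]) * scale for n in range(half)]
    high = [(x[2 * n] - x[2 * n + 1]) * scale for n in range(half)]
    return low, high

# A 20 ms frame at 16 kHz (320 samples) yields two 160-sample frames at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(320)]
s_l, s_h = qmf_analysis(frame)
```

A practical system would use longer FIR prototypes with better band separation; the Haar pair is chosen here only because it is short and easy to check.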

Step B. The high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, the average time-domain envelope parameter, and the average frequency-domain envelope parameter. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithm Based on Layered Wideband Speech Codec System". The extraction of each parameter is described below.

Step B1. Extract the 16 time-domain envelope parameters and the average time-domain envelope parameter:

Each 20 ms of the high-frequency component s_H(n) is divided equally into 16 segments of 10 samples each. The 16 time-domain envelope parameters T(i) are:

[equation not reproduced in this copy]

Compute the average time-domain envelope:

[equation not reproduced in this copy]

Normalize each time-domain envelope parameter T(i) by subtracting the mean (Figure 750206DEST_PATH_IMAGE004):

Figure 592260DEST_PATH_IMAGE002
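Step B1 can be sketched as follows. The segmentation (16 segments of 10 samples), the mean, and the normalization by subtraction come from the text. The per-segment measure is an assumption: the patent's formula for T(i) did not survive extraction, so this sketch uses half the base-2 logarithm of the segment energy. The name `time_envelope` is hypothetical.

```python
import math

SEGMENTS = 16   # segments per 20 ms frame, from the text
SEG_LEN = 10    # samples per segment, from the text

def time_envelope(frame):
    """Return (normalized envelope, mean) for one 160-sample high-band frame.

    Assumption: T(i) = 0.5 * log2(segment energy). The original formula is
    an image that is missing from the source.
    """
    if len(frame) != SEGMENTS * SEG_LEN:
        raise ValueError("expected a 160-sample frame")
    t = []
    for i in range(SEGMENTS):
        seg = frame[i * SEG_LEN:(i + 1) * SEG_LEN]
        energy = sum(v * v for v in seg)
        t.append(0.5 * math.log2(energy + 1e-12))  # small floor avoids log2(0)
    mean_t = sum(t) / SEGMENTS
    normalized = [ti - mean_t for ti in t]          # subtract the mean
    return normalized, mean_t
```

Sending the mean separately from the 16 differences keeps the differences in a small range, which is presumably what makes them cheap to quantize into the 66-bit watermark.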

Step B2. Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:

The 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are windowed to obtain the windowed signal (Figure 800518DEST_PATH_IMAGE006). The window function window(n) has a length of 208 samples:

Figure DEST_PATH_IMAGE007
Figure 677207DEST_PATH_IMAGE008

where N = 208.

The windowed signal is zero-padded to 256 points, and a 256-point FFT is applied to obtain S_F(k):

Figure 309790DEST_PATH_IMAGE010

where L = 256. The frequency domain is divided into 12 uniform intervals; the frequency-domain envelope parameter of each interval is computed and converted into a logarithmic weighted sub-band energy parameter.

Compute the average frequency-domain envelope:

Figure DEST_PATH_IMAGE011

Normalize each frequency-domain envelope parameter F(i) by subtracting the mean (Figure 639140DEST_PATH_IMAGE012):

Figure DEST_PATH_IMAGE013
Figure 713406DEST_PATH_IMAGE014
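The processing chain of step B2 (208-sample window, zero-padding to 256 points, transform, 12 sub-band log energies, mean removal) can be sketched as below. Several details are assumptions, because the corresponding equations are images missing from the source: a Hann window stands in for window(n), the 12 bands are taken as 10 bins each over bins 0 to 119, and the sub-band weights are uniform. All function names are hypothetical.

```python
import cmath
import math

N_WIN = 208          # 48 samples of the previous frame + 160 new ones (from the text)
N_FFT = 256          # zero-padded transform size (from the text)
N_BANDS = 12         # number of sub-bands (from the text)
BINS_PER_BAND = 10   # assumption: 12 bands of 10 bins cover bins 0..119

def hann(n_points):
    """Assumed stand-in for the patent's window(n)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def dft_bins(x, n_bins):
    """Plain DFT of the first n_bins bins; dependency-free rather than fast."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n_bins)]

def freq_envelope(prev_tail, frame):
    """Return (normalized sub-band envelope, mean) for one frame."""
    if len(prev_tail) != 48 or len(frame) != 160:
        raise ValueError("expected 48 + 160 samples")
    buf = list(prev_tail) + list(frame)
    padded = [b * w for b, w in zip(buf, hann(N_WIN))] + [0.0] * (N_FFT - N_WIN)
    spec = dft_bins(padded, N_BANDS * BINS_PER_BAND)
    f = []
    for i in range(N_BANDS):
        band = spec[i * BINS_PER_BAND:(i + 1) * BINS_PER_BAND]
        energy = sum(abs(c) ** 2 for c in band)
        f.append(0.5 * math.log2(energy + 1e-12))   # log sub-band energy
    mean_f = sum(f) / N_BANDS
    return [fi - mean_f for fi in f], mean_f
```

A real implementation would replace `dft_bins` with an FFT routine; the direct sum is used here only to keep the sketch self-contained.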

Step C. The G.711 codec module encodes the narrowband speech signal s_L(n) with an A-law encoder, producing a code stream with 8 bits per sample. The watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes the stream with an A-law decoder to obtain the narrowband speech signal.

Step D. The watermark embedding module embeds the watermark into the code stream in one of the following two ways:

D1. The watermark is embedded uniformly into the code stream: a frame has 160 samples and the watermark has 66 bits, so 1 bit is embedded at every other sample.

D2. Alternatively, the watermark is embedded selectively into samples of small amplitude. Let C0~C7 denote the lowest to the highest bit of a code word. According to G.711, the highest bit C7 is the sign bit of the sample, C6~C4 are the segment code, and C3~C0 are the intra-segment code. The smaller the segment code, the smaller the amplitude of the sample that the code word represents. The method uses bit C6 to classify samples as large signals (C6 = 1) or small signals (C6 = 0) and embeds the watermark where C6 is 0. If a frame offers fewer than 66 such positions, the remaining bits are embedded at other positions.
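The two embedding modes of step D can be sketched as follows. The sketch assumes that each code word is held in its logical C7..C0 layout, before the even-bit inversion that G.711 A-law applies for transmission, and that the fallback in D2 visits the C6 = 1 samples from the start of the frame, which mirrors the extraction order described in E2. The function names are hypothetical.

```python
WATERMARK_BITS = 66   # bits per frame, from the text
FRAME_LEN = 160       # code words per frame, from the text
C6_MASK = 0x40        # top bit of the segment code
LSB_MASK = 0x01       # C0, the bit that carries the watermark

def _check(codes, bits):
    if len(codes) != FRAME_LEN or len(bits) != WATERMARK_BITS:
        raise ValueError("expected 160 code words and 66 watermark bits")

def embed_uniform(codes, bits):
    """Mode D1: write one bit into C0 of every other code word."""
    _check(codes, bits)
    out = list(codes)
    for k, bit in enumerate(bits):
        out[2 * k] = (out[2 * k] & ~LSB_MASK & 0xFF) | bit
    return out

def embed_selective(codes, bits):
    """Mode D2: use small-amplitude samples (C6 == 0) first, then the rest."""
    _check(codes, bits)
    out = list(codes)
    small = [i for i, c in enumerate(codes) if not c & C6_MASK]
    large = [i for i, c in enumerate(codes) if c & C6_MASK]
    for pos, bit in zip(small + large, bits):
        out[pos] = (out[pos] & ~LSB_MASK & 0xFF) | bit
    return out
```

Because only C0 is modified, C6 is unchanged by embedding, so the receiver can recompute the same position list from the received code words without any side information.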

Step E. The watermark extraction module extracts the watermark in one of two ways, corresponding to step D:

E1. The watermark is extracted from the known embedding positions.

E2. Alternatively, the code stream itself indicates where the watermark is embedded. Starting from the beginning of a frame, a watermark bit is extracted from the lowest bit of each code word whose C6 is 0, and code words whose C6 is 1 are skipped. If fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and continues at the positions where C6 is 1 until 66 bits have been extracted.
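The two extraction modes of step E can be sketched in the same style. As with embedding, the sketch assumes code words in their logical C7..C0 layout; the function names are hypothetical.

```python
C6_MASK = 0x40   # top bit of the segment code

def extract_uniform(codes, n_bits=66):
    """Mode E1: read C0 of every other code word, matching uniform embedding."""
    return [codes[2 * k] & 1 for k in range(n_bits)]

def extract_selective(codes, n_bits=66):
    """Mode E2: read C0 where C6 == 0 first, then wrap to the C6 == 1 samples."""
    small = [c & 1 for c in codes if not c & C6_MASK]
    large = [c & 1 for c in codes if c & C6_MASK]
    return (small + large)[:n_bits]
```

The second pass over the C6 = 1 samples is only reached when the frame holds fewer than 66 small-amplitude samples, for example in loud voiced frames.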

Step F. The high-frequency speech recovery module uses white noise to recover the high-frequency speech:

A generated white-noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to shape its time-domain and frequency-domain envelopes, which yields the high-frequency speech signal.

Step F1. Recover high-frequency speech from white noise:

Because high-frequency and low-frequency speech are correlated to some degree, an AR model is constructed from the decoded low-frequency speech. A white-noise sequence is generated at the decoder and shaped by the AR model so that the noise takes on the characteristics of high-frequency speech.
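Step F1 can be sketched with a standard autocorrelation-method AR fit. The patent does not state the model order or the estimation method, so the order of 10, the Levinson-Durbin recursion, the light regularization, and the Gaussian noise source are all assumptions, and the function names are hypothetical.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation sums r[0..max_lag]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns a[0..order] with a[0] == 1,
    so that the prediction error is e(n) = sum_j a[j] * x(n - j)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a

def shape_noise(low_band, n_out, order=10, seed=0):
    """Filter white noise through the all-pole model 1/A(z) fitted to low_band."""
    r = autocorr(low_band, order)
    r[0] = r[0] * 1.0001 + 1e-9   # assumed regularization for numerical safety
    a = levinson(r, order)
    rng = random.Random(seed)
    out = []
    for n in range(n_out):
        y = rng.gauss(0.0, 1.0)
        for j in range(1, min(order, n) + 1):
            y -= a[j] * out[n - j]
        out.append(y)
    return out, a
```

The output of `shape_noise` has only a rough spectral shape; in the method it would still pass through the envelope adjustments of steps F2 to F5.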

Step F2. Local adjustment of the time-domain envelope. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithm Based on Layered Wideband Speech Codec System":

The time-domain envelope parameters of the high-frequency signal are computed from the normalized time-domain envelope parameters and the average time-domain envelope recovered from the watermark:

Figure DEST_PATH_IMAGE015
Figure 195334DEST_PATH_IMAGE016

The time-domain local gain factor is computed from the time-domain envelope parameters of the noise and of the high-frequency signal:

Figure DEST_PATH_IMAGE017

The time-domain envelope of the noise is adjusted with the time-domain local gain factor:

Figure 251015DEST_PATH_IMAGE018
Figure 818394DEST_PATH_IMAGE020

The gain factor between two adjacent segments is obtained by linear interpolation:

Figure 617722DEST_PATH_IMAGE022
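The local time-domain adjustment of step F2 can be sketched as below. Restoring the envelope by adding the mean back follows from the encoder-side normalization. The gain formula and the interpolation rule are assumptions, since both equations are images missing from the source: the sketch treats the envelopes as base-2 log values, so the gain is 2 raised to the envelope difference, and it ramps linearly from the previous segment's gain to the current one across each segment. The function names are hypothetical.

```python
def restore_envelope(normalized, mean):
    """Undo the encoder-side normalization: T(i) = T_M(i) + mean."""
    return [t + mean for t in normalized]

def apply_time_gains(noise, target_env, noise_env, prev_gain=1.0):
    """Scale each noise segment so its envelope approaches the target.

    Assumptions: envelopes are log2-domain values, and the gain is
    interpolated linearly from the previous segment's gain to the current
    one over the samples of each segment.
    """
    seg_len = len(noise) // len(target_env)
    out = []
    g_prev = prev_gain
    for i, (t_ref, t_noise) in enumerate(zip(target_env, noise_env)):
        g = 2.0 ** (t_ref - t_noise)
        for n in range(seg_len):
            w = (n + 1) / seg_len                     # 0 < w <= 1
            out.append(noise[i * seg_len + n] * ((1.0 - w) * g_prev + w * g))
        g_prev = g
    return out, g_prev
```

The returned final gain would be passed as `prev_gain` for the next frame so that the interpolation is continuous across frame boundaries.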

Step F3. Local adjustment of the frequency-domain envelope. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithm Based on Layered Wideband Speech Codec System":

The time-domain-adjusted signal is processed in the same way as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, which gives the logarithmic weighted sub-band energy parameters of the noise (Figure DEST_PATH_IMAGE023) and its average frequency-domain envelope (Figure 787165DEST_PATH_IMAGE024). The frequency-domain envelope of the noise is then adjusted locally in the same way as the time-domain envelope of the noise is adjusted in step F2.

Step F4. Global adjustment of the frequency-domain envelope:

The frequency-domain global gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:

Figure DEST_PATH_IMAGE025

The frequency-domain envelope of each frame is adjusted globally with the frequency-domain global gain factor:

Figure 92375DEST_PATH_IMAGE026
Figure DEST_PATH_IMAGE027

An IFFT is applied to the adjusted spectrum, and the resulting time-domain signal is windowed with the window function window(n) and stored in a buffer of length 208:

Figure 396318DEST_PATH_IMAGE028

where L = 256 and n = 0, 1, ..., 207.

The values of the last 48 points in the previous frame's buffer are added to the first 48 points in the current frame's buffer; together with the values at n = 48~159 in the current frame's buffer, they form the time-domain signal recovered for the current frame.
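The overlap-add at the end of step F4 can be sketched directly from the buffer sizes in the text: each buffer holds 208 windowed samples, consecutive buffers overlap by 48 samples, and each frame outputs 160 samples. The function name `overlap_add` is hypothetical.

```python
OVERLAP = 48     # samples shared by consecutive buffers
FRAME_LEN = 160  # output samples per frame
BUF_LEN = 208    # windowed buffer length

def overlap_add(prev_buf, cur_buf):
    """Combine two consecutive 208-sample buffers into one 160-sample frame."""
    if len(prev_buf) != BUF_LEN or len(cur_buf) != BUF_LEN:
        raise ValueError("both buffers must hold 208 samples")
    # Tail of the previous buffer overlaps the head of the current one.
    head = [prev_buf[FRAME_LEN + n] + cur_buf[n] for n in range(OVERLAP)]
    # Samples 48..159 of the current buffer are used as they are.
    return head + list(cur_buf[OVERLAP:FRAME_LEN])
```

The last 48 samples of `cur_buf` are not emitted in this call; they are consumed when the next frame arrives and the current buffer becomes `prev_buf`.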

Step F5. Global adjustment of the time-domain envelope:

The time-domain envelope is adjusted globally following the same steps as the global adjustment of the frequency-domain envelope. The adjusted signal (Figure DEST_PATH_IMAGE029) is the high-frequency signal estimated from the noise.

Step G. The QMF synthesis filter bank module raises the sampling frequency of the 8 kHz low-frequency signal (Figure 484490DEST_PATH_IMAGE030) and of the estimated high-frequency signal (Figure 190278DEST_PATH_IMAGE029) to 16 kHz and passes them through a low-pass and a high-pass FIR filter respectively. The processed signals are (Figure DEST_PATH_IMAGE031) and (Figure 338494DEST_PATH_IMAGE032). The filter coefficients are the same as those of the QMF analysis filter.

Adding the two signals gives the final wideband signal with a 16 kHz sampling frequency:

[equation not reproduced in this copy]
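The synthesis in step G can be illustrated with the same toy filter pair on both sides. This is a minimal sketch, assuming 2-tap Haar filters in place of the patent's unspecified coefficients; with this pair, upsampling, filtering, and summing collapse into two output samples per input pair, and analysis followed by synthesis reconstructs the input exactly. The function names are hypothetical.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)

def qmf_analysis(x):
    """Haar analysis: two half-rate bands (assumed stand-in filters)."""
    half = len(x) // 2
    low = [(x[2 * n] + x[2 * n + 1]) * SCALE for n in range(half)]
    high = [(x[2 * n] - x[2 * n + 1]) * SCALE for n in range(half)]
    return low, high

def qmf_synthesis(low, high):
    """Haar synthesis: interleave the sum and difference of the two bands.

    Equivalent to 1:2 upsampling of each band, low-pass / high-pass
    filtering with the same 2-tap filters, and adding the two results.
    """
    if len(low) != len(high):
        raise ValueError("bands must have equal length")
    out = []
    for l, h in zip(low, high):
        out.append((l + h) * SCALE)
        out.append((l - h) * SCALE)
    return out

# Round trip: 320 samples at 16 kHz -> 2 x 160 at 8 kHz -> 320 at 16 kHz.
x = [float(n % 7) for n in range(320)]
y = qmf_synthesis(*qmf_analysis(x))
```

In the method, the high band fed to the synthesis stage is the estimate from step F5 rather than the original high band, so the output is an approximation of the wideband input, not an exact copy.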

The invention further provides a device for voice bandwidth extension based on an audio watermark. The device comprises: a QMF analysis filter bank module, a high-frequency parameter extraction module, a G.711 codec module, a watermark embedding module, a watermark extraction module, a high-frequency speech recovery module, and a QMF synthesis filter bank module.

The QMF analysis filter bank module divides the wideband speech into two parts: narrowband speech of 0~8000 Hz and a high-frequency component of 8000~16000 Hz. The sampling frequency of both output signals is reduced to 8 kHz, giving the low-frequency signal s_L(n) and the high-frequency signal s_H(n).

The high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, the average time-domain envelope parameter, and the average frequency-domain envelope parameter. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithm Based on Layered Wideband Speech Codec System". The extraction of each parameter is described below.

Extract the 16 time-domain envelope parameters and the average time-domain envelope parameter:

Each 20 ms of the high-frequency component s_H(n) is divided equally into 16 segments of 10 samples each. The 16 time-domain envelope parameters T(i) are:

[equation not reproduced in this copy]

Compute the average time-domain envelope:

Figure 80983DEST_PATH_IMAGE036

Normalize each time-domain envelope parameter T(i) by subtracting the mean (Figure 392010DEST_PATH_IMAGE004):

Figure DEST_PATH_IMAGE037
Figure 225974DEST_PATH_IMAGE035

Extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:

The 160 samples of the current frame of the high-frequency component s_H(n), together with the last 48 samples of the previous frame, are windowed. The window function window(n) has a length of 208 samples:

Figure 934484DEST_PATH_IMAGE040

where N = 208.

The windowed signal is zero-padded to 256 points, and a 256-point FFT is applied to obtain S_F(k):

Figure DEST_PATH_IMAGE041
Figure 100017DEST_PATH_IMAGE042

where L = 256. The frequency domain is divided into 12 uniform intervals; the frequency-domain envelope parameter of each interval is computed and converted into a logarithmic weighted sub-band energy parameter.

Compute the average frequency-domain envelope:

[equation not reproduced in this copy]

Normalize each frequency-domain envelope parameter F(i) by subtracting the mean (Figure 104882DEST_PATH_IMAGE044):

Figure DEST_PATH_IMAGE045
Figure 415253DEST_PATH_IMAGE046

The G.711 codec module encodes the narrowband speech signal s_L(n) with an A-law encoder, producing a code stream with 8 bits per sample. The watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes the stream with an A-law decoder to obtain the narrowband speech signal.

The watermark embedding module embeds the watermark into the code stream in one of the following two ways:

Mode 1: The watermark is embedded uniformly into the code stream: a frame has 160 samples and the watermark has 66 bits, so 1 bit is embedded at every other sample.

Mode 2: The watermark is embedded selectively into samples of small amplitude. Let C0~C7 denote the lowest to the highest bit of a code word. According to G.711, the highest bit C7 is the sign bit of the sample, C6~C4 are the segment code, and C3~C0 are the intra-segment code. The smaller the segment code, the smaller the amplitude of the sample that the code word represents. The method uses bit C6 to classify samples as large signals (C6 = 1) or small signals (C6 = 0) and embeds the watermark where C6 is 0. If a frame offers fewer than 66 such positions, the remaining bits are embedded at other positions.

The watermark extraction module extracts the watermark in one of two ways, corresponding to the watermark embedding module:

Mode 1: The watermark is extracted from the known embedding positions.

Mode 2: The code stream itself indicates where the watermark is embedded. Starting from the beginning of a frame, a watermark bit is extracted from the lowest bit of each code word whose C6 is 0, and code words whose C6 is 1 are skipped. If fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and continues at the positions where C6 is 1 until 66 bits have been extracted.

The high-frequency speech recovery module uses white noise to recover the high-frequency speech:

A generated white-noise sequence is first passed through an AR model constructed from the low-frequency speech; the extracted high-frequency parameters are then used to shape its time-domain and frequency-domain envelopes, which yields the high-frequency speech signal.

Recover high-frequency speech from white noise:

Because high-frequency and low-frequency speech are correlated to some degree, an AR model is constructed from the decoded low-frequency speech. A white-noise sequence is generated at the decoder and shaped by the AR model module so that the noise takes on the characteristics of high-frequency speech.

Local adjustment of the time-domain envelope. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithm Based on Layered Wideband Speech Codec System":

The time-domain envelope parameters of the high-frequency signal are computed from the normalized time-domain envelope parameters and the average time-domain envelope recovered from the watermark:

Figure 648920DEST_PATH_IMAGE048

The time-domain local gain factor is computed from the time-domain envelope parameters of the noise and of the high-frequency signal:

Figure DEST_PATH_IMAGE049

The time-domain envelope of the noise is adjusted with the time-domain local gain factor:

Figure 183806DEST_PATH_IMAGE050
Figure DEST_PATH_IMAGE051
Figure 172622DEST_PATH_IMAGE052

The gain factor between two adjacent segments is obtained by linear interpolation:

Figure 97853DEST_PATH_IMAGE022

Local adjustment of the frequency-domain envelope. This part follows the approach of the document "Research and Implementation of DTX/CNG Algorithm Based on Layered Wideband Speech Codec System":

The time-domain-adjusted signal is processed in the same way as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, which gives the logarithmic weighted sub-band energy parameters of the noise and its average frequency-domain envelope. The frequency-domain envelope of the noise is then adjusted locally in the same way as the time-domain envelope of the noise is adjusted in the local time-domain adjustment.

Global adjustment of the frequency-domain envelope:

The frequency-domain global gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:

Figure 543375DEST_PATH_IMAGE025

The frequency-domain envelope of each frame is adjusted globally with the frequency-domain global gain factor:

Figure 766021DEST_PATH_IMAGE026
Figure 794020DEST_PATH_IMAGE027

An IFFT is applied to the adjusted spectrum, and the resulting time-domain signal is windowed with the window function window(n) and stored in a buffer device of length 208:

Figure 37919DEST_PATH_IMAGE028

where L = 256 and n = 0, 1, ..., 207.

The values of the last 48 points in the previous frame's buffer device are added to the first 48 points in the current frame's buffer device; together with the values at n = 48~159 in the current frame's buffer device, they form the time-domain signal recovered for the current frame.

Global adjustment of the time-domain envelope:

The time-domain envelope is adjusted globally following the same steps as the global adjustment of the frequency-domain envelope. The adjusted signal is the high-frequency signal estimated from the noise.

The QMF synthesis filter bank module raises the sampling rate of the low-frequency signal, sampled at 8 kHz, and of the estimated high-frequency signal to 16 kHz, and then passes them through the low-pass and high-pass FIR filters respectively. The processed signals are $s_{16L}(n)$ and $s_{16H}(n)$; the filter coefficients are the same as those of the QMF analysis filter.

Adding the two signals gives the final wideband signal at a 16 kHz sampling rate:

[Equation image 493620DEST_PATH_IMAGE033: the wideband output as the sum of the two filtered signals]

Beneficial effects: the present invention provides a method for improving voice quality based on audio watermarking. The method uses the properties of audio watermarking to establish a hidden channel within the narrowband speech and transmits the parameters of the high-frequency speech over this channel, thereby extending the bandwidth of the speech signal without changing the existing network protocol. The invention implements speech bandwidth extension with an adaptive audio watermark that has little impact on the original speech, embeds a relatively large amount of high-frequency information, is robust, and suits all types of speech; the recovered wideband speech sounds better than the narrowband speech.

Description of drawings

Figure 1 is the block diagram of the principle of the invention.

Figure 2 shows the window function window(n) of the invention.

Figure 3 shows the G.711 code stream format used by the invention.

Figure 4 is the block diagram of high-frequency speech recovery in the invention.

Detailed description of the embodiments

The invention is described in detail below with reference to the drawings and embodiments.

Figure 1 gives the complete block diagram of the invention. Initially, the speech produced by the talker is a wideband signal. Before transmission over the telephone line, the high-frequency parameters are embedded into the narrowband code stream, and the narrowband speech signal is transmitted over the telephone line. The receiving end performs A-law decoding, uses the high-frequency parameter extraction module to extract the high-frequency parameters and the high-frequency parameter synthesis module to recover the high-frequency part of the wideband speech, and finally combines the high-frequency and low-frequency speech into wideband speech.

The modules in the block diagram are described as follows:

1. QMF analysis filter bank module

The speech initially produced by the talker is wideband, whereas the telephone line carries narrowband speech, so the invention uses a QMF analysis filter bank to split the wideband speech into two parts: narrowband speech of 0~8000 Hz and a high-frequency component of 8000~16000 Hz. The QMF analysis filter is a 64th-order FIR filter; the coefficients of the low-pass FIR filter $h_L(n)$ are given in the appendix. The high-pass filter $h_H(n)$ is obtained by frequency-shifting the low-pass filter $h_L(n)$, that is, by modulating it with a complex sinusoidal sequence:

[Equation images DEST_PATH_IMAGE053, 676471DEST_PATH_IMAGE054, DEST_PATH_IMAGE055 and 439503DEST_PATH_IMAGE056: the modulating sequence and the resulting expression for $h_H(n)$]

Passing the wideband signal through the QMF analysis filter bank and reducing the sampling rate of the two outputs to 8 kHz yields the low-frequency signal $s_L(n)$ and the high-frequency signal $s_H(n)$.
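The band split can be sketched as follows. This is only an illustration under stated assumptions: the modulating sequence is taken to be $e^{j\pi n} = (-1)^n$ (the usual QMF choice; the patent's own expression is available only as an image), the 2-tap prototype is a toy stand-in for the 64th-order filter, and the function names are hypothetical.

```python
def highpass_from_lowpass(h_low):
    """Assumed modulation: h_H(n) = (-1)^n * h_L(n), a shift by half the sampling rate."""
    return [((-1) ** n) * c for n, c in enumerate(h_low)]


def fir_filter(h, x):
    """Direct-form FIR filter with zero initial state; returns len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def qmf_analysis(x, h_low):
    """Filter x into a low and a high band, then keep every second sample."""
    h_high = highpass_from_lowpass(h_low)
    low = fir_filter(h_low, x)[::2]
    high = fir_filter(h_high, x)[::2]
    return low, high


# Toy prototype (hypothetical); a constant input ends up entirely in the low band.
low, high = qmf_analysis([1.0] * 6, [0.5, 0.5])
```

After the first sample, the constant input appears only in `low`, which is the behaviour expected of a low-pass/high-pass pair.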

2. High-frequency parameter extraction module

The invention extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, the average time-domain envelope parameter and the average frequency-domain envelope parameter. Each parameter is extracted as follows.

(1) Extracting the 16 time-domain envelope parameters and the average time-domain envelope parameter

Each 20 ms frame of the high-frequency component $s_H(n)$ is divided equally into 16 segments of 10 samples each. The 16 time-domain envelope parameters are:

$$T(i) = \frac{1}{2}\log_2\left[\sum_{n=0}^{9} s_H^2(n+10i)\right], \quad i = 0, 1, \ldots, 15$$

The average time-domain envelope is computed:

$$M_T = \frac{1}{16}\sum_{i=0}^{15} T(i)$$

The time-domain envelope parameters $T(i)$ are normalized by subtracting the mean $M_T$:

$$T_M(i) = T(i) - M_T, \quad i = 0, 1, \ldots, 15$$
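The time-domain envelope extraction can be sketched as below for one 160-sample frame of the high band. The function name and the small energy floor (which avoids taking the logarithm of zero on silent segments) are assumptions for illustration and are not taken from the patent.

```python
import math


def time_envelope(frame, n_seg=16, seg_len=10, floor=1e-12):
    """Return (T, M_T, T_M): per-segment log energies, their mean, and the normalized values."""
    assert len(frame) == n_seg * seg_len
    T = []
    for i in range(n_seg):
        segment = frame[i * seg_len:(i + 1) * seg_len]
        energy = sum(s * s for s in segment)
        # T(i) = 1/2 * log2(segment energy); the floor is an added safeguard.
        T.append(0.5 * math.log2(max(energy, floor)))
    M_T = sum(T) / n_seg
    T_M = [t - M_T for t in T]
    return T, M_T, T_M


# A frame whose second half has twice the amplitude of the first half.
T, M_T, T_M = time_envelope([1.0] * 80 + [2.0] * 80)
```

Doubling the amplitude raises $T(i)$ by exactly 1, so the normalized values in this example are -0.5 for the first eight segments and +0.5 for the rest.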

(2) Extracting the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter

The 160 samples of the current frame of the high-frequency component $s_H(n)$ and the last 48 samples of the previous frame are windowed to give $s_H^{w}(n)$, using a window function window(n) that is 208 samples long:

$$s_H^{w}(n) = s_H(n)\,\mathrm{window}(n), \quad n = 0, 1, \ldots, N-1$$

where N = 208. The window function is shown in Figure 2.

The windowed signal is zero-padded to 256 points, and a 256-point FFT gives $S_F(k)$:

$$S_F(k) = \mathrm{FFT}\left[s_H^{w}(n)\right] = \sum_{n=0}^{L-1} s_H^{w}(n)\, e^{-j\frac{2\pi}{L}kn}, \quad k = 0, 1, \ldots, L-1$$

where L = 256. The frequency domain is divided into 12 uniformly spaced intervals; the frequency-domain envelope parameter of each interval is computed and converted into a log-weighted subband energy parameter. The subband division of the frequency-domain envelope and the calculation of the log-weighted energy $F(i)$ of each subband are given in the appendix.

The average frequency-domain envelope is computed:

$$M_F = \frac{1}{12}\sum_{i=0}^{11} F(i)$$

The frequency-domain envelope parameters $F(i)$ are normalized by subtracting the mean $M_F$:

$$F_M(i) = F(i) - M_F, \quad i = 0, 1, \ldots, 11$$
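The framing, zero-padding and transform steps can be sketched as follows. The window shape and the subband weights are available only as images, so the window is passed in as a parameter and the subband energies $F(i)$ are assumed to be computed elsewhere. A plain DFT stands in for the FFT to keep the sketch dependency-free, and the function names are hypothetical.

```python
import cmath


def windowed_spectrum(prev_tail, frame, window, fft_len=256):
    """DFT of [last 48 samples of previous frame | 160 current samples], windowed and zero-padded."""
    assert len(prev_tail) == 48 and len(frame) == 160 and len(window) == 208
    x = [s * w for s, w in zip(list(prev_tail) + list(frame), window)]
    x += [0.0] * (fft_len - len(x))  # pad 208 samples up to 256
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / fft_len)
                for n in range(fft_len))
            for k in range(fft_len)]


def normalize_envelope(F):
    """Return (M_F, F_M): the mean of the subband parameters and the mean-removed values."""
    M_F = sum(F) / len(F)
    return M_F, [f - M_F for f in F]


# A single impulse under a rectangular stand-in window has a flat magnitude spectrum.
S_F = windowed_spectrum([0.0] * 48, [1.0] + [0.0] * 159, [1.0] * 208)
```

In practice an FFT routine would replace the double loop; the direct sum is used here only because it mirrors the formula term by term.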

3. G.711 codec module

The narrowband speech signal $s_L(n)$ is encoded by an A-law encoder, giving a code stream with 8 bits per sample; the watermark information is embedded into the code stream, which is sent into the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes the stream with an A-law decoder to obtain the narrowband speech signal.

4. Watermark embedding module

The existing least-significant-bit (LSB) embedding algorithm simply embeds the watermark information in the lowest bit of the narrowband code stream. Taking into account the characteristics of the transmission protocol and the subjective perception of the human ear, two improved LSB embedding algorithms are proposed here.

The first method embeds the watermark more evenly into the code stream: a frame has 160 samples and the watermark has 66 bits, so 1 bit can be embedded in every other sample. This avoids the fluctuating listening quality caused by excessive local distortion and keeps the overall listening quality at a high level.
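A minimal sketch of this evenly spaced embedding is given below. It assumes that each code is an 8-bit integer and that the 66 bits occupy samples 0, 2, 4, …, 130 of the frame; the text says only "every other sample", so the starting offset and the function names are assumptions.

```python
def embed_uniform(codes, bits, step=2):
    """Write bits[j] into the least significant bit of codes[j * step]."""
    assert (len(bits) - 1) * step < len(codes)
    out = list(codes)
    for j, b in enumerate(bits):
        out[j * step] = (out[j * step] & 0xFE) | (b & 1)
    return out


def extract_uniform(codes, n_bits=66, step=2):
    """Read the watermark back from the same positions."""
    return [codes[j * step] & 1 for j in range(n_bits)]


frame_codes = [0xFF] * 160                      # placeholder code words
payload = [i % 2 for i in range(66)]            # placeholder watermark bits
marked = embed_uniform(frame_codes, payload)
recovered = extract_uniform(marked)
```

Only the lowest bit of the selected samples can change, so each modified sample moves by at most one quantization step.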

The second method is a selective LSB embedding algorithm based on the characteristics of the transmission protocol and of human hearing. G.711 uses non-uniform quantization: the quantization step is small when the sample value is small and large when the sample value is large. Changing the code of a small sample therefore changes the sample value only slightly, whereas changing the code of a large sample changes it by a larger amount. As a result, the signal-to-noise ratio is in theory almost the same whether the watermark is embedded in small or large samples. However, owing to temporal masking in the human ear, a large signal masks the small signal that follows it, so that modifications to the small signal are hard to perceive. Based on this property, the watermark information can be embedded selectively in samples of small amplitude, which hides the watermark better. Let C0~C7 denote the lowest to the highest bit of the code, as shown in Figure 3. According to G.711, the highest bit C7 is the sign bit of the sample, C6~C4 are the segment code, and C3~C0 are the intra-segment code. The smaller the segment code, the smaller the amplitude of the sample represented by the code. Here the C6 bit is used to divide the signal into large signals (C6=1) and small signals (C6=0), and the watermark is embedded when C6 is 0. If a frame has fewer than 66 such positions, the remaining watermark bits are embedded at other positions.
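The selective embedding can be sketched as follows. It assumes the codes are 8-bit integers laid out as in Figure 3 (C7 sign, C6~C4 segment, C3~C0 intra-segment), that is, before the even-bit inversion that G.711 A-law applies on the wire, and it assumes the "other positions" are the C6=1 samples taken in order from the start of the frame, which matches the extraction rule. The function names are hypothetical.

```python
def embedding_positions(codes, n_bits=66):
    """Indices that carry watermark bits: C6 == 0 samples first, then C6 == 1 samples."""
    small = [i for i, c in enumerate(codes) if (c >> 6) & 1 == 0]
    large = [i for i, c in enumerate(codes) if (c >> 6) & 1 == 1]
    return (small + large)[:n_bits]


def embed_selective(codes, bits):
    """Overwrite C0 at the selected positions; C6 is never modified."""
    out = list(codes)
    for i, b in zip(embedding_positions(codes, len(bits)), bits):
        out[i] = (out[i] & 0xFE) | (b & 1)
    return out


# Four-sample toy frame: samples 0 and 2 are "small", samples 1 and 3 are "large".
toy = [0x00, 0x41, 0x03, 0x40]
marked = embed_selective(toy, [1, 0, 1])
```

Because only C0 is rewritten, the C6 bits that select the positions are identical before and after embedding, so a receiver can locate the same positions without side information.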

5. Watermark extraction module

The extraction method corresponds to the embedding algorithm used. With the first algorithm, the watermark is extracted from the positions at which it was embedded. With the second, the characteristics of the code stream determine whether a sample carries a watermark bit. Scanning from the start of a frame, a watermark bit is extracted from the lowest bit when C6 is 0, and nothing is extracted when C6 is 1. If fewer than 66 bits have been extracted when the end of the frame is reached, extraction returns to the start of the frame and continues at the positions where C6 is 1 until 66 watermark bits have been extracted.
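The two-pass extraction for the selective method can be sketched as below, under the same assumption that codes are 8-bit integers in the C7..C0 layout of Figure 3; the function name is hypothetical.

```python
def extract_selective(codes, n_bits=66):
    """First pass reads C0 where C6 == 0; a second pass over C6 == 1 fills any shortfall."""
    bits = [c & 1 for c in codes if (c >> 6) & 1 == 0][:n_bits]
    if len(bits) < n_bits:
        missing = n_bits - len(bits)
        bits += [c & 1 for c in codes if (c >> 6) & 1 == 1][:missing]
    return bits


# Toy frame: two small samples (C6 == 0) and two large samples (C6 == 1).
toy = [0x01, 0x40, 0x00, 0x41]
first_two = extract_selective(toy, 2)     # served entirely by the first pass
first_three = extract_selective(toy, 3)   # needs one bit from the second pass
```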

6. High-frequency speech recovery module

Because high-frequency speech has noise-like characteristics, this module uses white noise to recover it. The generated white noise sequence is first passed through an AR model constructed from the low-frequency speech, and the extracted high-frequency parameters are then used to shape its time-domain and frequency-domain envelopes, which yields the high-frequency speech signal. The block diagram of high-frequency speech recovery is shown in Figure 4.

(1) Recovering high-frequency speech from white noise

Because high-frequency and low-frequency speech are correlated to some extent, the AR model is constructed from the decoded low-frequency speech. A white noise sequence is generated at the decoder and shaped by passing it through this AR model, so that the noise takes on the characteristics of high-frequency speech.
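The noise generation and AR shaping can be sketched as follows. The claims name the mixed congruential method as the noise generator; the generator constants, the seed, and the example AR coefficient below are illustrative assumptions, and estimating the AR coefficients from the decoded low band (for example by linear prediction) is not shown.

```python
def lcg_noise(n, state=12345, a=1664525, c=1013904223, m=2 ** 32):
    """Mixed congruential generator mapped to [-1, 1); returns (samples, final state)."""
    out = []
    for _ in range(n):
        state = (a * state + c) % m
        out.append(2.0 * state / m - 1.0)
    return out, state


def all_pole_filter(a, x):
    """AR synthesis: y(n) = x(n) - sum_{k>=1} a[k] * y(n-k), with a[0] == 1."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


noise, _ = lcg_noise(160)
shaped = all_pole_filter([1.0, -0.5], noise)  # hypothetical first-order AR model
```

Returning the generator state lets the caller continue the same noise sequence in the next frame.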

(2) Local adjustment of the time-domain envelope

The time-domain envelope parameters of the high-frequency signal are computed from the normalized time-domain envelope parameters and the average time-domain envelope recovered from the watermark:

$$\hat{T}(i) = T_M(i) + M_T, \quad i = 0, 1, \ldots, 15$$

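The time-domain local gain factor is computed from the time-domain envelope parameters of the noise and of the high-frequency signal:

$$gain\_t(i) = 2^{\hat{T}(i) - \tilde{T}(i)}$$

where $\hat{T}(i)$ is the time-domain envelope parameter of the high-frequency signal and $\tilde{T}(i)$ is that of the white noise after the AR model.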

The time-domain local gain factor is used to adjust the time-domain envelope of the noise:

$$seed_t(n+10i) = seed(n+10i)\, gain\_t(n+10i), \quad n = 0, 1, \ldots, 9, \quad i = 0, 1, \ldots, 15$$

where seed is the generated white noise sequence and $seed_t$ is the sequence after modulation by the time-domain local gain factor.

The gain factors between two segments are obtained by linear interpolation:

[Equation image 47387DEST_PATH_IMAGE022: linear interpolation of the gain factor between segments]
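The local time-domain shaping can be sketched as below. The gain follows $2^{\hat{T}(i) - \tilde{T}(i)}$ per segment, but it is held constant within each 10-sample segment: the interpolation formula is available only as an image, so the smoothing between segments is deliberately left out. The function name is hypothetical.

```python
def apply_local_time_gain(noise, T_hat, T_noise, seg_len=10):
    """Scale each segment of the noise so that its log envelope moves from T_noise to T_hat."""
    assert len(T_hat) == len(T_noise)
    assert len(noise) == seg_len * len(T_hat)
    gains = [2.0 ** (th - tn) for th, tn in zip(T_hat, T_noise)]
    # Piecewise-constant gain; the patent additionally interpolates between segments.
    shaped = [s * gains[n // seg_len] for n, s in enumerate(noise)]
    return shaped, gains


# Raising every envelope parameter by 1 doubles the amplitude of every sample.
shaped, gains = apply_local_time_gain([1.0] * 160, [1.0] * 16, [0.0] * 16)
```

Since $T(i)$ is half the base-2 logarithm of the segment energy, a difference of 1 in $T$ corresponds to a factor of 2 in amplitude, which is why the gain is a power of two of the difference.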

(3) Local adjustment of the frequency-domain envelope

The time-domain-adjusted signal is processed with the same procedure used to extract the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, which yields the log-weighted subband energy parameters of the noise and its average frequency-domain envelope $\tilde{M}_F$. The frequency-domain envelope of the noise is then adjusted locally in the same way as the time-domain envelope of the noise is adjusted locally.

(4) Global adjustment of the frequency-domain envelope

The frequency-domain global gain factor of each frame is computed from the average frequency-domain envelopes of the noise and of the high-frequency signal:

$$gain\_mf = 2^{\hat{M}_F - \tilde{M}_F}$$

where $\hat{M}_F$ is the average frequency-domain envelope parameter of the high-frequency signal and $\tilde{M}_F$ is that of the processed noise.

The frequency-domain global gain factor is used to adjust the frequency-domain envelope of each frame globally:

$$S_g(i) = \tilde{S}_F(i)\, gain\_mf, \quad i = 0, 1, \ldots, 255$$

An IFFT is applied to the adjusted spectrum, and the resulting time-domain signal is windowed with the window function of Figure 2 and stored in a buffer of length 208:

$$buf(n) = \mathrm{window}(n) \cdot \mathrm{IFFT}\{S_g(k)\}$$

where L = 256 and n = 0, 1, …, 207.

The last 48 values in the buffer of the previous frame are added to the first 48 values in the buffer of the current frame; together with the values n = 48~159 of the current frame's buffer, they form the time-domain signal recovered for the current frame.
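The global gain and the overlap-add step can be sketched as follows, assuming each frame's windowed IFFT output is already available as a 208-sample buffer; the function names are hypothetical.

```python
def global_gain(M_hat, M_noise):
    """Frame-level gain 2^(M_hat - M_noise) from the two average envelope parameters."""
    return 2.0 ** (M_hat - M_noise)


def overlap_add(prev_buf, cur_buf, overlap=48, frame_len=160):
    """Combine the tail of the previous buffer with the head of the current one.

    Output samples 0..47 are prev_buf[160..207] + cur_buf[0..47];
    output samples 48..159 are cur_buf[48..159] unchanged.
    """
    assert len(prev_buf) == len(cur_buf) == overlap + frame_len
    head = [p + c for p, c in zip(prev_buf[frame_len:], cur_buf[:overlap])]
    return head + list(cur_buf[overlap:frame_len])


frame = overlap_add([1.0] * 208, [2.0] * 208)
```

The last 48 samples of the current buffer are not emitted; they are kept and added to the head of the next frame's buffer.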

(5) Global adjustment of the time-domain envelope

The time-domain envelope is adjusted globally following the same steps as the global adjustment of the frequency-domain envelope; the adjusted signal is the high-frequency signal estimated from the noise.

7. QMF synthesis filter bank module

The sampling rate of the low-frequency signal, sampled at 8 kHz, and of the estimated high-frequency signal is raised to 16 kHz, and the two signals are then passed through the low-pass and high-pass FIR filters respectively. The processed signals are $s_{16L}(n)$ and $s_{16H}(n)$; the filter coefficients are the same as those of the QMF analysis filter.

Adding the two signals gives the final wideband signal at a 16 kHz sampling rate:

[Equation image 126760DEST_PATH_IMAGE033: the wideband output as the sum of the two filtered signals]

Summary: this embodiment proposes two improved LSB watermark embedding algorithms. The first embeds 1 bit in every other sample, which avoids the fluctuating listening quality caused by excessive local distortion and keeps the overall listening quality at a high level. The second is a selective LSB embedding algorithm based on the characteristics of the transmission protocol and of human hearing. Owing to temporal masking in the human ear, a large signal masks the small signal that follows it, so that modifications to the small signal are hard to perceive. Based on this property, the watermark information can be embedded selectively in samples of small amplitude, which hides the watermark better.

Based on the watermarking algorithms above, the system embeds the high-frequency information of the speech signal into the narrowband code stream and transmits it over the wired telephone network; the receiving end extracts the high-frequency parameters of the speech and synthesizes wideband speech, thereby extending the bandwidth of the speech signal. Because the watermarking algorithm conceals the watermark well, normal call quality is unaffected even if the receiving end lacks the modules for extracting the watermark and synthesizing wideband speech. A telephone terminal that has these modules will hear the bandwidth-extended speech, with greatly improved call quality.

The above is a further detailed description of the invention in connection with the preferred technical solutions, and the specific implementation of the invention is not to be regarded as limited to this description. Simple deductions and substitutions that a person of ordinary skill in the art makes without departing from the concept of the invention shall all be regarded as falling within the scope of protection of the invention.

Appendix

Subband division of the frequency-domain envelope:

[Image 667462DEST_PATH_IMAGE058: subband division]

Calculation of the log-weighted energy $F(i)$ of each subband:

Subband 0:

[Equation images DEST_PATH_IMAGE059 to DEST_PATH_IMAGE065]

Subbands 1~10:

[Equation images 263955DEST_PATH_IMAGE066 to 325308DEST_PATH_IMAGE080]

Subband 11:

[Equation images 999052DEST_PATH_IMAGE082 to 396142DEST_PATH_IMAGE087]

Claims (2)

1.一种基于音频水印的语音带宽扩展的方法,包括以下步骤:1. A method for voice bandwidth expansion based on audio watermarking, comprising the following steps: 步骤A.使用QMF分析滤波器组模块将宽带语音分成两个部分:0~8000Hz的窄带语音和8000~16000Hz的高频分量;并将两个输出信号通过一个降采样模块,将采样频率降至8KHz,得到低频信号sL(n)和高频信号sH(n);Step A. Use the QMF analysis filter bank module to divide the wideband speech into two parts: the narrowband speech of 0-8000 Hz and the high-frequency component of 8000-16000 Hz; and pass the two output signals through a down-sampling module to reduce the sampling frequency to 8KHz, get low frequency signal s L (n) and high frequency signal s H (n); 步骤B.通过提取高频参数模块提取30个高频参数:16个时域包络参数、12个频域包络参数、平均时域包络参数和平均频域包络参数;以下是各个参数的具体提取方法:Step B. Extract 30 high-frequency parameters by extracting high-frequency parameters module: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, average time-domain envelope parameters and average frequency-domain envelope parameters; the following are the parameters The specific extraction method: 步骤B1.提取16个时域包络参数和平均时域包络参数:Step B1. Extract 16 time-domain envelope parameters and average time-domain envelope parameters: 每20ms的高频分量sH(n)等分为16段,每段包括10个采样点;16个时域包络参数为:The high-frequency component s H (n) of every 20ms is divided into 16 segments, each segment includes 10 sampling points; the 16 time-domain envelope parameters are: TT (( ii )) == 11 22 loglog 22 [[ ΣΣ nno == 00 99 sthe s Hh 22 (( nno ++ 1010 ii )) ]] ,, ii == 0,10,1 ,, ·&Center Dot; ·&Center Dot; ·&Center Dot; ,, 1515 计算平均时域包络:Compute the average time-domain envelope: Mm TT == 11 1616 ΣΣ ii == 00 1515 TT (( ii )) 用时域包络参数T(i)与平均值MT作差进行归一化:Normalization is performed by making a difference between the time-domain envelope parameter T(i) and the mean value MT : TM(i)=T(i)-MT i=0,1,…,15T M (i) = T (i) - M T i = 0, 1, ..., 15 步骤B2.提取12个频域包络参数和平均频域包络参数:Step B2. extract 12 frequency domain envelope parameters and average frequency domain envelope parameters: 高频分量sH(n)的当前帧的160个采样点与上一帧的最后48个采用点经过加窗处理得
Figure FDA00003045505700013
这里使用窗长208个样点窗函数window(n):
The 160 sampling points of the current frame of the high-frequency component s H (n) and the last 48 sampling points of the previous frame are obtained by windowing
Figure FDA00003045505700013
Here the window function window(n) with a window length of 208 samples is used:
sthe s Hh ww (( nno )) == sthe s Hh (( nno )) windowwindow (( nno )) ,, nno == 0,10,1 ,, .. ·&Center Dot; ·&Center Dot; ·&Center Dot; NN -- 11 其中,N=208;对加窗后的信号补0至256点,然后做256点的FFT变换得SF(k):Among them, N=208; add 0 to 256 points to the windowed signal, and then perform FFT transformation of 256 points to get S F (k): SS Ff (( kk )) == FFTFFT [[ sthe s Hh ww (( nno )) ]] == ΣΣ nno == 00 LL -- 11 sthe s Hh ww (( nno )) ee -- jj 22 ππ LL knk n ,, kk == 0,10,1 ,, ·&Center Dot; ·· ·&Center Dot; ,, LL -- 11 其中,L=256;将频域分为12个均匀间隔,计算每个间隔的频域包络参数,并转换成对数加权子带能量参数;Among them, L=256; divide the frequency domain into 12 uniform intervals, calculate the frequency domain envelope parameters of each interval, and convert them into logarithmic weighted subband energy parameters; 计算平均频域包络:Compute the average frequency-domain envelope: Mm Ff == 11 1212 ΣΣ ii == 00 1111 Ff (( ii )) 将频域包络参数F(i)与平均值MF作差进行归一化:The difference between the frequency domain envelope parameter F(i) and the mean M F is normalized: FM(i)=F(i)-MF i=0,1,…,11F M (i) = F (i) - M F i = 0, 1, ..., 11 步骤C.通过G.711编解码模块将窄带语音信号sL(n)通过A律编码器编码,得到每个点8bit数据长度的码流,将水印信息嵌入到码流中,通过电话线传送到网络中;接收端从码流中提取出水印信息,并通过A律解码器解码,得到窄带语音信号;Step C. Use the G.711 codec module to encode the narrowband voice signal s L (n) through an A-law encoder to obtain a code stream with a data length of 8 bits for each point, embed the watermark information into the code stream, and transmit it through the telephone line into the network; the receiving end extracts the watermark information from the code stream, and decodes it through an A-law decoder to obtain a narrowband voice signal; 步骤D.通过水印嵌入模块将水印嵌入到码流中包括以下两种方式:Step D. Embedding the watermark into the code stream through the watermark embedding module includes the following two methods: D1.通过水印嵌入模块将水印均匀的嵌入到码流中:由于一帧信号有160个采样点,而嵌入水印的比特数为66bit,每隔一个采样点嵌入1比特信息;D1. 
Embed the watermark evenly into the code stream through the watermark embedding module: since there are 160 sampling points in a frame signal, and the number of bits embedded in the watermark is 66 bits, 1 bit of information is embedded in every other sampling point; 或者D2.通过水印嵌入模块将水印信息有选择的嵌入到幅度小的抽样点中;使用C0~C7代表编码码流的最低位到最高位;根据G.711协议,最高位C7代表采样点的符号位,C6~C4为段落码,C3~C0为段内码;段落码越小,码流所代表的采样值的幅度越小;本方法使用C6位将信号划分为大信号,即C6=1和小信号,即C6=0,当C6为0时嵌入水印;Or D2. Use the watermark embedding module to selectively embed the watermark information into the sampling points with small amplitude; use C0~C7 to represent the lowest bit to the highest bit of the coded stream; according to the G.711 protocol, the highest bit C7 represents the sampling point Sign bit, C6~C4 are paragraph codes, C3~C0 are intra-segment codes; the smaller the paragraph code, the smaller the sampling value represented by the code stream; this method uses C6 bit to divide the signal into large signals, that is, C6= 1 and small signal, that is, C6=0, when C6 is 0, the watermark is embedded; 如果一帧嵌入的位置不够66个,则选择在其他位置嵌入水印;If there are not enough 66 embedded positions in one frame, choose to embed watermarks in other positions; 步骤E.通过提取水印模块提取水印与步骤D对应,包括采用以下两种方式之一:Step E. Extracting the watermark through the extracting watermark module corresponds to step D, including adopting one of the following two methods: E1.根据嵌入水印的位置提取水印;E1. Extract the watermark according to the position where the watermark is embedded; 或者E2.根据码流的特点来判断是否嵌入了水印;从一帧的起始判断,若C6为0,则从最低位提取水印,C6为1时不提取水印;若到达帧尾时提取的水印不足66比特,则返回一帧的起始点,在C6为1处的位置提取,直到提取66比特水印;Or E2. 
Determine whether a watermark is embedded according to the characteristics of the code stream; judge from the beginning of a frame, if C6 is 0, then extract the watermark from the lowest bit, and do not extract the watermark when C6 is 1; if it reaches the end of the frame, extract the watermark If the watermark is less than 66 bits, return to the starting point of a frame, extract at the position where C6 is 1, until the 66-bit watermark is extracted; 步骤F.通过恢复高频语音模块使用白噪声来恢复高频语音:Step F. Recover high-frequency speech using white noise by the Recover High-Frequency Speech module: 首先将产生的白噪声序列通过由低频语音构造的AR模型,然后使用提取的高频参数对其进行时域包络整形和频域包络整形,即可得到高频语音信号;First pass the generated white noise sequence through the AR model constructed from low-frequency speech, and then use the extracted high-frequency parameters to perform time-domain envelope shaping and frequency-domain envelope shaping to obtain high-frequency speech signals; 步骤F1.使用白噪声恢复高频语音:Step F1. Recover high frequency speech using white noise: 由于高频语音和低频语音有一定的相关性,使用解码得到的低频语音构造AR模型;在解码端产生白噪声序列,将此序列通过构造的AR模型进行成型处理,使噪声具备高频语音的特征;Since high-frequency speech and low-frequency speech have a certain correlation, the low-frequency speech obtained by decoding is used to construct an AR model; a white noise sequence is generated at the decoding end, and the sequence is shaped through the constructed AR model to make the noise have the characteristics of high-frequency speech feature; 步骤F2.时域包络局部调整:Step F2. 
Time-domain envelope local adjustment: 从水印中恢复的归一化时域包络参数和平均时域包络计算高频信号的时域包络参数:Compute the time envelope parameters of the high-frequency signal from the normalized time envelope parameters recovered from the watermark and the average time envelope parameters: TT ^^ (( ii )) == TT Mm (( ii )) ++ Mm TT ,, ii == 0,10,1 ,, ·&Center Dot; ·· ·· ,, 1515 由噪声和高频信号的时域包络参数计算时域局部增益因子:Calculate the time-domain local gain factor from the time-domain envelope parameters of the noise and high-frequency signal: gaingain __ tt (( ii )) == 22 TT ^^ (( ii )) -- TT ~~ (( ii )) 上式中,
Figure FDA00003045505700041
为高频信号的时域包络参数,
Figure FDA00003045505700042
为经过AR模型处理的白噪声的时域包络参数;
In the above formula,
Figure FDA00003045505700041
is the time-domain envelope parameter of the high-frequency signal,
Figure FDA00003045505700042
is the time-domain envelope parameter of the white noise processed by the AR model;
使用时域局部增益因子对噪声的时域包络进行调整:Adjust the temporal envelope of the noise using a temporal local gain factor: seedt(n+10i)=seed(n+10i)gain_t(n+10i) n=0,1,…,9 i=0,1,…,15seed t (n+10i)=seed(n+10i) gain_t(n+10i) n=0,1,...,9 i=0,1,...,15 上式中,seed为产生的白噪声序列,生成方法是混合同余法;seedt序列为经过时域局部增益因子调制后的白噪声序列;In the above formula, seed is the generated white noise sequence, and the generation method is the mixed congruential method; the seed t sequence is the white noise sequence modulated by the local gain factor in the time domain; 两段之间的增益因子使用线性插值的方法进行处理:The gain factor between two segments is processed using a linear interpolation method:
Figure FDA00003045505700043
Figure FDA00003045505700043
步骤F3.频域包络局部调整:Step F3. Local adjustment of the frequency domain envelope: 对时域调整后的信号按照提取12个频域包络参数和平均频域包络参数进行处理,得到噪声的对数加权子带能量参数
Figure FDA00003045505700044
和平均频域包络
Figure FDA00003045505700045
按照时域包络局部调整中对噪声的时域包络局部调整方法,对噪声的频域包络进行局部调整;
The signal adjusted in the time domain is processed by extracting 12 frequency domain envelope parameters and the average frequency domain envelope parameters to obtain the logarithmically weighted subband energy parameters of the noise
Figure FDA00003045505700044
and the average frequency domain envelope
Figure FDA00003045505700045
According to the local adjustment method of the time domain envelope of the noise in the local adjustment of the time domain envelope, the frequency domain envelope of the noise is locally adjusted;
Step F4. Global adjustment of the frequency-domain envelope:

Calculate the frequency-domain global gain factor of each frame from the average frequency-domain envelopes of the noise and of the high-frequency signal:

$$\mathrm{gain\_mf} = 2^{\hat{M}_F - \tilde{M}_F}$$

In the above formula, $\hat{M}_F$ is the average frequency-domain envelope parameter of the high-frequency signal, and $\tilde{M}_F$ is the average frequency-domain envelope parameter of the processed noise;
Globally adjust the frequency-domain envelope of each frame using the frequency-domain global gain factor:

$$S_g(i) = \tilde{S}_F(i)\,\mathrm{gain\_mf}, \quad i = 0,1,\ldots,255$$

Apply the IFFT to the adjusted spectrum, window the resulting time-domain signal with the window function window(n), and store it in a buffer of length 208:

$$\mathrm{buf}(n) = \mathrm{window}(n) \cdot \mathrm{IFFT}\{S_g(k)\} = \mathrm{window}(n) \cdot \frac{1}{L}\sum_{k=0}^{L-1} S_g(k)\,e^{-j\frac{2\pi}{L}nk}$$

where L = 256 and n = 0, 1, …, 207;

Add the last 48 points of the previous frame's buffer to the first 48 points of the current frame's buffer; together with the values at n = 48 to 159 of the current frame's buffer, these form the time-domain signal recovered for the current frame;

Step F5. Global adjustment of the time-domain envelope:

Globally adjust the time-domain envelope following the steps of the global adjustment of the frequency-domain envelope; the adjusted signal is the high-frequency signal estimated from the noise;
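The global gain and the overlap-add of Step F4 can be sketched as follows. The class and function names are hypothetical, and the IFFT and windowing are assumed to have been done by the caller, so each `buf` passed in is already the 208-sample windowed buffer.

```python
FRAME = 160                  # samples recovered per frame
OVERLAP = 48                 # samples shared with the previous frame
BUF_LEN = FRAME + OVERLAP    # 208-sample buffer

def global_freq_gain(m_f_hat, m_f_tilde):
    """gain_mf = 2 ** (M_F_hat - M_F_tilde)."""
    return 2.0 ** (m_f_hat - m_f_tilde)

class OverlapAdd:
    """Keeps the last 48 points of the previous buffer between frames."""

    def __init__(self):
        self.tail = [0.0] * OVERLAP

    def push(self, buf):
        """Take one 208-sample windowed buffer, return 160 recovered samples."""
        if len(buf) != BUF_LEN:
            raise ValueError("expected a 208-sample buffer")
        head = [t + b for t, b in zip(self.tail, buf[:OVERLAP])]  # overlap region
        out = head + list(buf[OVERLAP:FRAME])                     # n = 48..159
        self.tail = list(buf[FRAME:])                             # saved for the next frame
        return out
```

Each call emits 48 overlapped samples followed by 112 samples taken directly from the current buffer, so the output stays at 160 samples per 20 ms frame.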
Step G. The QMF synthesis filter bank module raises the sampling frequency of the low-frequency signal, sampled at 8 kHz, and of the estimated high-frequency signal to 16 kHz, then passes them through the low-pass and high-pass FIR filters respectively. The processed signals are $s_{16L}(n)$ and $s_{16H}(n)$; the filter coefficients are the same as those of the QMF analysis filters;
Add the two signals to obtain the final wideband signal at a 16 kHz sampling frequency:

$$\tilde{s}_{wb}(n) = s_{16L}(n) + s_{16H}(n).$$
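To make the synthesis step concrete, here is a toy two-band QMF in Python. The filter coefficients are not given in this section, so a 2-tap Haar prototype is assumed purely for illustration. The sign flip on the high-band synthesis filter is the textbook alias-cancellation condition and is also an assumption; the claim only states that the synthesis coefficients match those of the analysis filters.

```python
import math

H0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]        # assumed low-pass prototype (Haar)
H1 = [h * (-1) ** n for n, h in enumerate(H0)]   # high-pass: h1[n] = (-1)^n * h0[n]

def fir(x, h):
    """Causal FIR filter; the output has the same length as x."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def analysis(x):
    """Split x into low and high bands and keep every second sample."""
    return fir(x, H0)[::2], fir(x, H1)[::2]

def upsample2(x):
    """Insert a zero after every sample (doubles the sampling rate)."""
    out = [0.0] * (2 * len(x))
    out[::2] = x
    return out

def synthesis(s_l, s_h):
    """Upsample both bands, filter, and add them to form the wideband signal."""
    s16l = fir(upsample2(s_l), H0)
    s16h = fir(upsample2(s_h), [-h for h in H1])  # sign flip cancels aliasing
    return [a + b for a, b in zip(s16l, s16h)]
```

With this prototype, `synthesis(*analysis(x))` returns `x` delayed by one sample. A practical filter bank would use a longer prototype with better band separation.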
2. A device for voice bandwidth extension based on audio watermarking, characterized in that the device comprises: a QMF analysis filter bank module, a high-frequency parameter extraction module, a G.711 codec module, a watermark embedding module, a watermark extraction module, a high-frequency speech recovery module, and a QMF synthesis filter bank module;

The QMF analysis filter bank module divides the wideband speech into two parts: the narrowband speech of 0~8000 Hz and the high-frequency component of 8000~16000 Hz; it reduces the sampling frequency of both output signals to 8 kHz, giving the low-frequency signal $s_L(n)$ and the high-frequency signal $s_H(n)$;

The high-frequency parameter extraction module extracts 30 high-frequency parameters: 16 time-domain envelope parameters, 12 frequency-domain envelope parameters, the average time-domain envelope parameter, and the average frequency-domain envelope parameter. The parameters are extracted as follows:

Extraction of the 16 time-domain envelope parameters and the average time-domain envelope parameter:

Each 20 ms of the high-frequency component $s_H(n)$ is divided equally into 16 segments of 10 sampling points each; the 16 time-domain envelope parameters are:

$$T(i) = \frac{1}{2}\log_2\left[\sum_{n=0}^{9} s_H^2(n+10i)\right], \quad i = 0,1,\ldots,15$$

Calculate the average time-domain envelope:

$$M_T = \frac{1}{16}\sum_{i=0}^{15} T(i)$$

Normalize by subtracting the mean $M_T$ from the time-domain envelope parameters $T(i)$:

$$T_M(i) = T(i) - M_T, \quad i = 0,1,\ldots,15$$

Extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter:

The 160 sampling points of the current frame of the high-frequency component $s_H(n)$, together with the last 48 sampling points of the previous frame, are windowed to give $s_H^w(n)$, using a window function window(n) with a window length of 208 samples:
$$s_H^w(n) = s_H(n)\,\mathrm{window}(n), \quad n = 0,1,\ldots,N-1$$

where N = 208;

Zero-pad the windowed signal to 256 points, then apply a 256-point FFT to obtain $S_F(k)$:

$$S_F(k) = \mathrm{FFT}[s_H^w(n)] = \sum_{n=0}^{L-1} s_H^w(n)\,e^{-j\frac{2\pi}{L}kn}, \quad k = 0,1,\ldots,L-1$$

where L = 256. Divide the frequency domain into 12 uniform intervals, calculate the frequency-domain envelope parameter of each interval, and convert it into a logarithmically weighted subband energy parameter;

Calculate the average frequency-domain envelope:

$$M_F = \frac{1}{12}\sum_{i=0}^{11} F(i)$$

Normalize by subtracting the mean $M_F$ from the frequency-domain envelope parameters $F(i)$:

$$F_M(i) = F(i) - M_F, \quad i = 0,1,\ldots,11$$

The G.711 codec module encodes the narrowband speech signal $s_L(n)$ with an A-law encoder, producing a code stream of 8 bits per sample; the watermark information is embedded into the code stream, which is transmitted to the network over the telephone line. The receiving end extracts the watermark information from the code stream and decodes the stream with an A-law decoder to obtain the narrowband speech signal;

The watermark embedding module embeds the watermark into the code stream in one of the following two modes:

Mode 1: the watermark embedding module embeds the watermark uniformly into the code stream: since one frame has 160 sampling points and the watermark has 66 bits, 1 bit of information is embedded at every other sampling point;

Mode 2: the watermark embedding module selectively embeds the watermark information into sampling points of small amplitude. C0~C7 denote the lowest to the highest bit of the encoded code stream. According to the G.711 protocol, the highest bit C7 is the sign bit of the sampling point, C6~C4 are the segment code, and C3~C0 are the intra-segment code; the smaller the segment code, the smaller the amplitude of the sample value represented by the code. This method uses bit C6 to divide the signal into large signals (C6 = 1) and small signals (C6 = 0), and embeds the watermark when C6 is 0; if a frame offers fewer than 66 such embedding positions, the watermark is embedded at other positions;

The watermark extraction module extracts the watermark in the mode corresponding to the watermark embedding module, in one of the following two modes:

Mode 1: the watermark extraction module extracts the watermark according to the positions at which it was embedded;

Mode 2: whether a watermark is embedded is judged from the characteristics of the code stream. Starting from the beginning of a frame, if C6 is 0 the watermark is extracted from the lowest bit, and if C6 is 1 no watermark is extracted; if fewer than 66 bits have been extracted when the end of the frame is reached, return to the start of the frame and extract at the positions where C6 is 1, until 66 watermark bits have been extracted;

The high-frequency speech recovery module uses white noise to recover the high-frequency speech:

First, the generated white noise sequence is passed through an AR model constructed from the low-frequency speech; its time-domain envelope and frequency-domain envelope are then shaped using the extracted high-frequency parameters, giving the high-frequency speech signal;

Recovering high-frequency speech using white noise:

Since high-frequency speech and low-frequency speech are correlated to some degree, the AR model is constructed from the decoded low-frequency speech. A white noise sequence is generated at the decoder and shaped by the constructed AR model so that the noise acquires the characteristics of high-frequency speech;

Local adjustment of the time-domain envelope:

Calculate the time-domain envelope parameters of the high-frequency signal from the normalized time-domain envelope parameters and the average time-domain envelope recovered from the watermark:

$$\hat{T}(i) = T_M(i) + M_T, \quad i = 0,1,\ldots,15$$

Calculate the time-domain local gain factor from the time-domain envelope parameters of the noise and of the high-frequency signal:

$$\mathrm{gain\_t}(i) = 2^{\hat{T}(i) - \tilde{T}(i)}$$

In the above formula, $\hat{T}(i)$ is the time-domain envelope parameter of the high-frequency signal, and $\tilde{T}(i)$ is the time-domain envelope parameter of the white noise processed by the AR model;
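The AR shaping of the white noise described for the high-frequency speech recovery module can be sketched as below. The claim does not say how the AR model is estimated or what order it has, so the autocorrelation method with the Levinson-Durbin recursion is assumed here, the order is left as a caller-chosen parameter, and all function names are hypothetical.

```python
def autocorr(x, order):
    """Autocorrelation r[0..order] of the decoded low-frequency frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion.

    Returns (a, err) where a[0] = 1 and the prediction error is
    e[n] = sum_k a[k] * x[n - k]; err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0:
        return a, 0.0                 # silent frame: leave the filter as identity
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def all_pole(w, a):
    """Shape noise w with 1/A(z): y[n] = w[n] - sum_{k>=1} a[k] * y[n - k]."""
    y = []
    for n in range(len(w)):
        y.append(w[n] - sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0))
    return y
```

In use, `a` would be estimated from each decoded low-frequency frame and `all_pole` applied to the white noise of the same frame before the envelope adjustments.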
Adjust the time-domain envelope of the noise using the time-domain local gain factor:

$$\mathrm{seed}_t(n+10i) = \mathrm{seed}(n+10i)\,\mathrm{gain\_t}(n+10i), \quad n = 0,1,\ldots,9, \quad i = 0,1,\ldots,15$$

In the above formula, seed is the generated white noise sequence, produced by the mixed congruential method; seed_t is the white noise sequence after modulation by the time-domain local gain factor;

The gain factor between two segments is obtained by linear interpolation;

Local adjustment of the frequency-domain envelope:

The time-domain-adjusted signal is processed in the same way as in the extraction of the 12 frequency-domain envelope parameters and the average frequency-domain envelope parameter, giving the logarithmically weighted subband energy parameters $\tilde{F}(i)$ and the average frequency-domain envelope $\tilde{M}_F$ of the noise. The frequency-domain envelope of the noise is then locally adjusted using the same method as the local adjustment of the time-domain envelope of the noise;
Global adjustment of the frequency-domain envelope:

Calculate the frequency-domain global gain factor of each frame from the average frequency-domain envelopes of the noise and of the high-frequency signal:

$$\mathrm{gain\_mf} = 2^{\hat{M}_F - \tilde{M}_F}$$

In the above formula, $\hat{M}_F$ is the average frequency-domain envelope parameter of the high-frequency signal, and $\tilde{M}_F$ is the average frequency-domain envelope parameter of the processed noise;
Globally adjust the frequency-domain envelope of each frame using the frequency-domain global gain factor:

$$S_g(i) = \tilde{S}_F(i)\,\mathrm{gain\_mf}, \quad i = 0,1,\ldots,255$$

In the above formula, $\tilde{S}_F(i)$ is the frequency-domain envelope of each frame of speech before the adjustment;
Apply the IFFT to the adjusted spectrum, window the resulting time-domain signal with the window function window(n), and store it in a buffer of length 208:

$$\mathrm{buf}(n) = \mathrm{window}(n) \cdot \mathrm{IFFT}\{S_g(k)\} = \mathrm{window}(n) \cdot \frac{1}{L}\sum_{k=0}^{L-1} S_g(k)\,e^{-j\frac{2\pi}{L}nk}$$

where L = 256 and n = 0, 1, …, 207;

Add the last 48 points of the previous frame's buffer to the first 48 points of the current frame's buffer; together with the values at n = 48 to 159 of the current frame's buffer, these form the time-domain signal recovered for the current frame;

Global adjustment of the time-domain envelope:

Globally adjust the time-domain envelope following the steps of the global adjustment of the frequency-domain envelope; the adjusted signal is the high-frequency signal estimated from the noise;
The QMF synthesis filter bank module raises the sampling frequency of the low-frequency signal, sampled at 8 kHz, and of the estimated high-frequency signal to 16 kHz, then passes them through the low-pass and high-pass FIR filters respectively. The processed signals are $s_{16L}(n)$ and $s_{16H}(n)$; the filter coefficients are the same as those of the QMF analysis filters;
Add the two signals to obtain the final wideband signal at a 16 kHz sampling frequency:

$$\tilde{s}_{wb}(n) = s_{16L}(n) + s_{16H}(n).$$
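The second embedding and extraction mode of claim 2 can be illustrated with a short Python sketch. It treats each A-law code as a plain 8-bit integer with C6 at bit 6 and C0 at bit 0, as the claim describes; the function names are hypothetical. Where the claim says only that the watermark is embedded "at other positions" when fewer than 66 small-signal samples exist, this sketch assumes the order used by the extraction rule, that is, the C6 = 1 positions taken from the start of the frame.

```python
FRAME_LEN = 160   # A-law codes per 20 ms frame
WM_BITS = 66      # watermark bits per frame

def embed_order(codes):
    """Positions with C6 == 0 first, then positions with C6 == 1, in frame order."""
    small = [i for i, c in enumerate(codes) if (c >> 6) & 1 == 0]
    large = [i for i, c in enumerate(codes) if (c >> 6) & 1 == 1]
    return small + large

def embed(codes, bits):
    """Write each watermark bit into C0 (the lowest bit) of the chosen codes."""
    out = list(codes)
    for pos, bit in zip(embed_order(codes), bits):
        out[pos] = (out[pos] & 0xFE) | bit
    return out

def extract(codes, n_bits=WM_BITS):
    """Read C0 back from the same positions; C6 is untouched by embedding."""
    return [codes[pos] & 1 for pos in embed_order(codes)[:n_bits]]
```

Because only C0 is modified, C6 is identical before and after embedding, so the receiver can rebuild the same position list from the received code stream without any side information.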
CN2011104223927A 2011-12-16 2011-12-16 A device and method for voice bandwidth extension based on audio watermark Expired - Fee Related CN102543086B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104223927A CN102543086B (en) 2011-12-16 2011-12-16 A device and method for voice bandwidth extension based on audio watermark

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011104223927A CN102543086B (en) 2011-12-16 2011-12-16 A device and method for voice bandwidth extension based on audio watermark

Publications (2)

Publication Number Publication Date
CN102543086A CN102543086A (en) 2012-07-04
CN102543086B CN102543086B (en) 2013-08-14

Family

ID=46349824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104223927A Expired - Fee Related CN102543086B (en) 2011-12-16 2011-12-16 A device and method for voice bandwidth extension based on audio watermark

Country Status (1)

Country Link
CN (1) CN102543086B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103474079A (en) * 2012-08-06 2013-12-25 苏州沃通信息科技有限公司 Voice encoding method
CN103001915B (en) * 2012-11-30 2015-01-28 南京邮电大学 Time domain reshaping method of asymmetric limiting light orthogonal frequency division multiplexing (OFDM) communication system
KR101771828B1 (en) * 2013-01-29 2017-08-25 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Audio Encoder, Audio Decoder, Method for Providing an Encoded Audio Information, Method for Providing a Decoded Audio Information, Computer Program and Encoded Representation Using a Signal-Adaptive Bandwidth Extension
EA028755B9 (en) 2013-04-05 2018-04-30 Долби Лабораторис Лайсэнзин Корпорейшн COMPANDING SYSTEM AND METHOD FOR REDUCING THE QUANTUM NOISE USING AN ADVANCED SPECTRAL EXPANSION
CN103258543B (en) * 2013-04-12 2015-06-03 大连理工大学 A Method for Extending the Bandwidth of Artificial Voice
EP3044789B1 (en) * 2013-09-12 2019-09-11 Saudi Arabian Oil Company Dynamic threshold methods, systems, computer readable media, and program code for filtering noise and restoring attenuated high-frequency components of acoustic signals
EP4325488A3 (en) * 2014-02-28 2024-05-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Decoding device, encoding device, decoding method, encoding method, terminal device, and base station device
CN104269173B (en) * 2014-09-30 2018-03-13 武汉大学深圳研究院 The audio bandwidth expansion apparatus and method of switch mode
CN108074578A (en) * 2016-11-17 2018-05-25 中国科学院声学研究所 A kind of transmission of audio frequency watermark and the system and method for information exchange
CN106612168B (en) * 2016-12-23 2019-07-16 中国电子科技集团公司第三十研究所 A Speech Out-of-sync Detection Method Based on PCM Coding Features
CN110544472B (en) * 2019-09-29 2021-12-31 上海依图信息技术有限公司 Method for improving performance of voice task using CNN network structure
CN112885363B (en) * 2019-11-29 2024-11-08 北京三星通信技术研究有限公司 Voice sending method and device, voice receiving method and device, and electronic device
CN112151046B (en) * 2020-09-25 2024-06-18 北京百瑞互联技术股份有限公司 Method, device and medium for adaptively adjusting multi-channel transmission code rate of LC3 encoder
CN115910081B (en) * 2021-08-05 2025-08-05 腾讯科技(深圳)有限公司 Voice signal processing method, device, electronic device and computer storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4667340A (en) * 1983-04-13 1987-05-19 Texas Instruments Incorporated Voice messaging system with pitch-congruent baseband coding
CN101140759A (en) * 2006-09-08 2008-03-12 华为技术有限公司 Bandwidth extension method and system for voice or audio signal
CN101521014A (en) * 2009-04-08 2009-09-02 武汉大学 Audio bandwidth expansion coding and decoding devices
CN102105931A (en) * 2008-07-11 2011-06-22 弗朗霍夫应用科学研究促进协会 Apparatus and method for generating a bandwidth extended signal

Also Published As

Publication number Publication date
CN102543086A (en) 2012-07-04

Similar Documents

Publication Publication Date Title
CN102543086B (en) A device and method for voice bandwidth extension based on audio watermark
Djebbar et al. A view on latest audio steganography techniques
US10403295B2 (en) Methods for improving high frequency reconstruction
CN101918999B (en) Methods and apparatus to perform audio watermarking and watermark detection and extraction
CN104200808B (en) Signal handling equipment and method
CN102522092B (en) A device and method for voice bandwidth extension based on G.711.1
CN1397064A (en) System and method for modifying speech signals
CN1808568B (en) Audio encoding/decoding apparatus having watermark insertion/abstraction function and method using the same
CN101246688B (en) A method, system and device for encoding and decoding background noise signals
CN103208289A (en) Digital audio watermarking method capable of resisting re-recording attack
Chen et al. An audio watermark-based speech bandwidth extension method
CN102074238A (en) Linear interference cancellation-based speech secrete communication method
CN114863942B (en) Model training method for voice quality conversion, method and device for improving voice quality
Zhang et al. Robust Audio Watermarking Based on Extended Improved Spread Spectrum with Perceptual Masking.
Chen et al. Speech bandwidth extension by data hiding and phonetic classification
Chen et al. Artificial bandwidth extension of telephony speech by data hiding
Chen et al. Telephony speech enhancement by data hiding
Prasad et al. Speech bandwidth extension aided by magnitude spectrum data hiding
Licai et al. Information hinding based on GSM full rate speech coding
CN114974270A (en) Audio information self-adaptive hiding method
Nishimura Data hiding for audio signals that are robust with respect to air transmission and a speech codec
KR101350599B1 (en) Method and apparatus for Transmitting and Receiving Voice Packet
Nishimura Steganographic band width extension for the AMR codec of low-bit-rate modes.
Xu et al. Content-based digital watermarking for compressed audio
CN103474079A (en) Voice encoding method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130814

Termination date: 20151216

EXPY Termination of patent right or utility model