
CN101458930B - Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus - Google Patents

Info

Publication number
CN101458930B
Authority
CN
China
Prior art keywords
frequency
exc
env
signal
limit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200710198774XA
Other languages
Chinese (zh)
Other versions
CN101458930A (en)
Inventor
胡瑞敏
张勇
谢昭
王晓晨
肖玮
马付伟
王庭红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Wuhan University WHU
Original Assignee
Huawei Technologies Co Ltd
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd, Wuhan University WHU
Priority to CN200710198774XA
Priority to PCT/CN2008/073368 (WO2009076871A1)
Publication of CN101458930A
Application granted
Publication of CN101458930B
Expired - Fee Related

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08: Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038: Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a method for generating an excitation signal in bandwidth extension, in which a narrowband low-frequency signal is spectrally folded and then synthesized to generate the required high-frequency excitation signal. The invention also provides a corresponding method and apparatus for reconstructing the high-frequency signal in bandwidth extension. Because the scheme generates the high-frequency signal from the low-frequency signal, based on the harmonic characteristics of the signal's low-frequency and high-frequency spectra, it extends both speech and music signals well. The spectral folding used also ensures continuity of the signal spectrum where the low and high bands meet. Experiments show that the scheme is suitable for extending 7 to 14 kHz super-wideband signals.

Figure 200710198774

Description

Method and apparatus for excitation signal generation and signal reconstruction in bandwidth extension

Technical field

The invention relates to the technical field of bandwidth extension, and in particular to a method for generating an excitation signal in bandwidth extension, a method for reconstructing a high-frequency signal, and corresponding apparatus. The invention is particularly suitable for super-wideband bandwidth extension.

Background

Bandwidth extension (BWE) is a technique that, by choosing an appropriate parametric model, extends a signal with a narrower frequency band to a wider frequency band, thereby improving the perceived quality of the audio signal.

When the coding bit rate is limited, for example in mobile and network environments, most of the available bits are usually allocated to the low-frequency signal, because the human ear is more sensitive to low frequencies. High-frequency components nevertheless still play an important role in the subjective impression of sound quality, so it remains desirable to reconstruct the high-frequency signal as well as possible at the decoder. The following describes a method currently used to extend narrowband speech (0 to 3.4 kHz) to wideband speech (0 to 7 kHz), ITU-T G.729.1, which uses time-domain bandwidth extension (TDBWE). The scheme comprises:

1. Encoder

① Preprocessing

Spectral folding is applied to the input high-band signal, which is obtained at a 16 kHz sampling rate: the 4 to 8 kHz band of the input signal is folded down to 0 to 4 kHz. This is equivalent to multiplying each of the 160 time-domain samples of the high-band part by (-1)^n. The folded signal is then passed through a 3/4 low-pass filter that removes its 3 to 4 kHz band, which corresponds to 7 to 8 kHz in the original band. The preprocessed signal is S'(n), n = 0, ..., 159.
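The folding step can be illustrated with a minimal Python sketch. It assumes an 8 kHz rate for the 160-sample frame (160 samples in 20 ms) and uses a 1 kHz test tone; the function names and the naive DFT helper are illustrative only and are not taken from G.729.1. Multiplying sample n by (-1)^n mirrors the spectrum within 0 to fs/2, so a component at f moves to fs/2 - f.

```python
import math

def spectral_fold(x):
    """Multiply sample n by (-1)^n, mirroring the spectrum within 0..fs/2."""
    return [s if n % 2 == 0 else -s for n, s in enumerate(x)]

def dft_magnitude(x):
    """Naive DFT magnitudes for bins 0..len(x)//2 (adequate for one short frame)."""
    size = len(x)
    mags = []
    for k in range(size // 2 + 1):
        re = sum(x[n] * math.cos(2 * math.pi * k * n / size) for n in range(size))
        im = -sum(x[n] * math.sin(2 * math.pi * k * n / size) for n in range(size))
        mags.append(math.hypot(re, im))
    return mags

FS = 8000   # assumed sampling rate of the 160-sample frame, in Hz
N = 160     # one 20 ms frame
tone = [math.cos(2 * math.pi * 1000 * n / FS) for n in range(N)]  # 1 kHz test tone
folded = spectral_fold(tone)

bin_hz = FS / N
mags_before = dft_magnitude(tone)
mags_after = dft_magnitude(folded)
peak_before = max(range(len(mags_before)), key=mags_before.__getitem__)
peak_after = max(range(len(mags_after)), key=mags_after.__getitem__)
print(peak_before * bin_hz, peak_after * bin_hz)  # 1000.0 3000.0
```

Folding twice restores the original samples, which is why the decoder can undo the encoder's fold with the same operation.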

② Extraction of the time-domain spectral envelope parameters

Each 20 ms frame of S'(n) is subdivided into 16 segments of 1.25 ms, each containing 10 samples. One time-domain spectral envelope parameter is computed for every 10 samples, as follows:

$$T_{env}(i) = \frac{1}{2}\log_2\left\{\sum_{n=0}^{9}\left[S'(n + i \times 10)\right]^2\right\},\quad i = 0, \ldots, 15,$$

giving 16 time-domain spectral envelope parameters T_env(i) in total.
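As a rough illustration of this formula, the following Python sketch computes the 16 parameters for one 160-sample frame. The function name and the small energy floor that guards against log2(0) are assumptions added for the example; the formula itself contains no floor.

```python
import math

def time_envelope(frame, seg_len=10, floor=1e-12):
    """T_env(i) = 0.5 * log2(sum of squared samples in segment i)."""
    if len(frame) % seg_len != 0:
        raise ValueError("frame length must be a multiple of seg_len")
    env = []
    for start in range(0, len(frame), seg_len):
        energy = sum(s * s for s in frame[start:start + seg_len])
        # The floor is an assumption of this sketch; it only avoids log2(0).
        env.append(0.5 * math.log2(max(energy, floor)))
    return env

# Test frame: first half at amplitude 1, second half at amplitude 2.
frame = [1.0] * 80 + [2.0] * 80
t_env = time_envelope(frame)
```

Doubling the amplitude quadruples the segment energy, so the parameter rises by exactly 1 in this log2 domain; the values behave like log-amplitudes.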

③ Extraction of the frequency-domain spectral envelope parameters

In G.729.1, the encoder extracts frequency-domain parameters only for the second 10 ms subframe (80 samples) of each 20 ms frame; the decoder obtains the parameters of the first 10 ms subframe by interpolation. To compute the frequency-domain parameters, a 128-point Hanning window w_F is applied to the sequence around the second 10 ms subframe of the S'(n) frame. The window consists of the first 72 points of a 144-point rising Hanning window and the last 56 points of a 112-point falling Hanning window, joined at the 72nd sample. It covers 32 preceding samples and 16 following samples which, together with the 80 samples of the current subframe, make 128 points. The windowed signal is:

S_w(n) = S'(n) · w_F(n + 31), n = -31, ..., 96.

S_w(n) is transformed from the time domain to the frequency domain by a fast Fourier transform (FFT) of length 64, giving S_fft(n), n = 0, ..., 64. Because of the 3/4 low-pass filtering applied during preprocessing, only the first 3/4 of the spectral data is meaningful after the transform; and because the FFT is symmetric, only the first 24 of the first 32 frequency-domain values are needed to represent the 0 to 3 kHz band. The frequency-domain spectral envelope parameters are computed from the first 20 frequency-domain values as:

$$F_{env}(j) = \frac{1}{2}\log_2\left\{\sum_{n=2j}^{2(j+1)} W_F(n-2j)\left[S_{fft}(n)\right]^2\right\},\quad j = 0, \ldots, 11,$$

where W_F(n) is a weighting function with W_F(0) = W_F(2) = 0.5 and W_F(1) = 1.
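The band summation in this formula can be sketched in Python as follows. The sketch takes the spectrum S_fft as given and does not reproduce the windowing or the FFT; it also assumes that the squared term means squared magnitude when S_fft is complex, and adds a small floor to avoid log2(0). Both choices, and the function name, are assumptions of the example.

```python
import math

W_F = (0.5, 1.0, 0.5)  # weighting function: W_F(0) = W_F(2) = 0.5, W_F(1) = 1

def freq_envelope(s_fft, num_bands=12, floor=1e-12):
    """F_env(j) = 0.5 * log2(sum over n = 2j..2(j+1) of W_F(n-2j) * |S_fft(n)|^2)."""
    env = []
    for j in range(num_bands):
        acc = sum(W_F[n - 2 * j] * abs(s_fft[n]) ** 2
                  for n in range(2 * j, 2 * (j + 1) + 1))
        env.append(0.5 * math.log2(max(acc, floor)))  # floor is an assumption
    return env

# A flat spectrum: every band sums to 0.5 + 1 + 0.5 = 2.
flat_spectrum = [1.0] * 25
f_env = freq_envelope(flat_spectrum)
```

Each band spans three bins and shares its edge bins with its neighbours at half weight, so adjacent bands overlap smoothly.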

④ Quantization of the parameters

The 16 T_env(i) and 12 F_env(j) are quantized by mean-removed split vector quantization. First, the mean M_T of T_env(i) is computed and scalar-quantized with 5 bits in the logarithmic domain. The residuals of T_env(i) and F_env(j) with respect to the quantized scalar are then computed. The 16 time-domain residuals are split into two 8-dimensional vectors, each quantized with 7 bits using the same codebook. The 12 frequency-domain residuals are split into three 4-dimensional vectors quantized with different codebooks: the first two with 5 bits each and the last with 4 bits.
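The mean removal, the split into sub-vectors, and the resulting bit budget can be sketched in Python as below. The codebooks themselves are not given in the text, so the sketch stops before the codebook search; the rounding used for the quantized mean is only a hypothetical stand-in for the 5-bit scalar quantizer.

```python
def mean_removed_split(t_env, f_env, m_t_q):
    """Subtract the quantized mean and split the residuals into sub-vectors."""
    t_res = [t - m_t_q for t in t_env]
    f_res = [f - m_t_q for f in f_env]
    t_vecs = [t_res[0:8], t_res[8:16]]              # two 8-dimensional vectors
    f_vecs = [f_res[0:4], f_res[4:8], f_res[8:12]]  # three 4-dimensional vectors
    return t_vecs, f_vecs

# Bit allocation as described in the text.
BITS = {"mean": 5, "time": (7, 7), "freq": (5, 5, 4)}
total_bits = BITS["mean"] + sum(BITS["time"]) + sum(BITS["freq"])

# Illustrative parameter values (not real codec data).
t_env = [float(i) for i in range(16)]
f_env = [0.5 * j for j in range(12)]
m_t = sum(t_env) / len(t_env)
m_t_q = float(round(m_t))  # hypothetical stand-in for the 5-bit scalar quantizer
t_vecs, f_vecs = mean_removed_split(t_env, f_env, m_t_q)
```

With this allocation the envelope parameters cost 5 + 14 + 14 = 33 bits per 20 ms frame.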

2. Decoder

① Excitation generation

The excitation signal for bandwidth extension is reconstructed from the decoded core-layer parameters. The following core-layer parameters are used: the integer pitch delay T0 and the fractional pitch delay frac; the energy Ec of the fixed-codebook contribution and the energy Ep of the adaptive-codebook contribution; the base-layer fixed-codebook excitation c(n) of the core layer and its gain g_c; the adaptive-codebook excitation v(n) and its gain g_p; and the enhancement-layer excitation c'(n) of the core layer and its gain g_enh.

The ratio between the adaptive-codebook and fixed-codebook (including the enhancement-layer codebook) excitations is computed for each frame by estimating the voiced and unvoiced gain contributions. A preliminary excitation signal is then formed from the voiced and unvoiced excitations, each multiplied by its gain. The preliminary excitation is post-processed according to the pitch delay and related parameters to obtain the final excitation signal exc(n). exc(n) must also pass through a 3/4 low-pass filter, which limits its frequency range to 0 to 3 kHz.

② Decoding of the parameters

The 16 time-domain spectral envelope parameters T_env(i) and the 12 frequency-domain spectral envelope parameters F_env(j) are decoded from the bitstream; decoding is the inverse of the quantization and encoding performed at the encoder.

③ Time-domain envelope shaping

Time-domain shaping mainly adjusts the energy of the excitation signal. The time-domain spectral envelope parameters of the excitation signal exc(n) are computed in the same way as T_env(i) at the encoder, giving 16 values T'_env(i). Subtracting T'_env(i) from T_env(i) gives the energy difference between the two, from which the gain to apply is obtained:

gain = 2^[T_env(i) - T'_env(i)];

Each of the 160 samples of the excitation signal exc(n) is then multiplied by its corresponding gain to recover the time-domain-adjusted signal S_T(n).
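The gain computation can be checked with a short Python sketch. It assumes 10-sample segments as at the encoder and applies one constant gain per segment with no smoothing between segments; the function names, the energy floor, and the test signal are illustrative assumptions. Because T_env is half the log2 energy, scaling the amplitude by 2^(T_env - T'_env) moves the measured envelope exactly onto the target.

```python
import math

def time_envelope(frame, seg_len=10, floor=1e-12):
    """T_env(i) = 0.5 * log2(segment energy); the floor is an assumption."""
    return [0.5 * math.log2(max(sum(s * s for s in frame[k:k + seg_len]), floor))
            for k in range(0, len(frame), seg_len)]

def shape_time_envelope(exc, t_env_target, seg_len=10):
    """Scale each segment of exc by gain = 2^(T_env(i) - T'_env(i))."""
    t_env_exc = time_envelope(exc, seg_len)  # T'_env(i), measured on the excitation
    shaped = []
    for i, (target, actual) in enumerate(zip(t_env_target, t_env_exc)):
        gain = 2.0 ** (target - actual)
        shaped.extend(gain * s for s in exc[i * seg_len:(i + 1) * seg_len])
    return shaped

# Illustrative excitation and target envelope (not real codec data).
exc = [math.sin(0.3 * n) + 0.1 for n in range(160)]
t_env_target = [float(i % 4) for i in range(16)]
s_t = shape_time_envelope(exc, t_env_target)
t_env_check = time_envelope(s_t)
```

Re-measuring the envelope of the shaped signal returns the target values up to rounding error.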

④ Frequency-domain envelope shaping

The decoded frequency-domain parameters F_env(j) describe the second 10 ms of the 20 ms frame. The parameters for the first 10 ms are obtained by interpolating between the frequency-domain parameters of the current frame and those of the previous 20 ms frame. The parameters for the first and second 10 ms of the current frame are denoted F_env,1(j) and F_env,2(j), respectively.

Then, as in the time domain, frequency-domain parameters are extracted from S_T(n) using the encoder's method, once every 10 ms, giving two sets of parameters denoted F'_env,1(j) and F'_env,2(j). The differences between F_env,1(j), F_env,2(j) and F'_env,1(j), F'_env,2(j) give the adjustment gains G_F,1(j) and G_F,2(j) for the two subframes. Because the frequency-domain calculation is performed band by band, a filter bank is used to adjust the spectral envelope of the band corresponding to each frequency-domain parameter, so there are 12 filters in total. The filter-bank coefficients are weighted by G_F,1(j) and G_F,2(j), and the first and second 10 ms subframes are filtered accordingly, giving the frequency-domain-shaped output signal S_HB(n).

⑤ BWE post-processing

The combined time-domain and frequency-domain adjustment may produce some spikes, so an adaptive amplitude compression function is used as post-processing to reduce deviation from the envelope. Post-processing is applied once every 80 samples, which are divided into three segments: 6 samples at the front, 70 in the middle, and 4 at the end. The envelope-adjusted output after post-processing is (one row each for the front, middle, and rear segments):

Figure DEST_PATH_GSB00000446189200011

where T_env(i) is the time-domain spectral envelope parameter corresponding to the sample being adjusted.

Because the 4 to 8 kHz high-band signal was folded to 0 to 4 kHz at the encoder, spectral folding must be applied again when the signal is restored at the decoder. The folding is similar to that at the encoder. Since the reconstructed output signal covers 0 to 3 kHz, the 3 to 4 kHz frequency-domain coefficients can be zero-filled before folding, which yields the 4 to 7 kHz high-band reconstructed signal.

In arriving at the present invention, the inventors found that the decoder-side excitation in the above bandwidth extension technique is produced with a binary excitation method similar to that of speech production models. It is therefore suited to coding speech signals but performs poorly on music-like signals. Experiments also showed that, with this excitation generation, applying the technique to 7 to 14 kHz super-wideband extension gives high noise and poor coding quality, indicating that the technique is not suitable for super-wideband extension.

The present invention provides a method for generating an excitation signal in bandwidth extension, together with a corresponding method and apparatus for reconstructing high-frequency signals, suitable for high-frequency reconstruction of audio signals such as speech and music in wideband and super-wideband extension.

A method for generating an excitation signal in bandwidth extension comprises: generating a first excitation signal exc(n), n = 0, ..., N-1, with a frequency range of 0 to B_0; performing spectral folding on exc(n) to generate a second excitation signal exc_fold(n) with a frequency range of B_0 to 2B_0; and performing synthesis filtering on exc(n) and exc_fold(n) to output a third excitation signal exc_HB(m), m = 0, ..., 2N-1, with a frequency range of 0 to 2B_0, the third excitation signal exc_HB(m) being used as the high-frequency excitation signal for reconstructing the high-frequency signal.

A method for reconstructing a high-frequency signal in bandwidth extension comprises: generating an excitation signal exc_HB(m), m = 0, ..., 2N-1, by the excitation signal generation method above; decoding to obtain time-domain spectral envelope parameters T_env(i) and frequency-domain spectral envelope parameters F_env(j), where i = 0, ..., I-1 and j = 0, ..., J-1; adjusting the time-domain spectral envelope of exc_HB(m) according to T_env(i), each T_env(i) adjusting a segment of exc_HB(m) containing A time-domain samples, A ≤ 2N/I, to generate a time-domain-adjusted signal S_T(m); adjusting the frequency-domain spectral envelope of S_T(m) according to F_env(j), each F_env(j) adjusting one subband of bandwidth B_1 in the frequency domain of S_T(m), B_1 ≤ B_2/J, where B_2 is the bandwidth of S_T(m), to generate a frequency-domain-adjusted reconstructed signal S_F(m); and performing spectral folding on S_F(m) to generate a high-frequency reconstructed signal S_HB(m) with a frequency range of 2B_0 to 2B_0 + B_2.

An apparatus for generating an excitation signal in bandwidth extension comprises: a core decoding module, configured to output a first excitation signal exc(n), n = 0, ..., N-1, with a frequency range of 0 to B_0; a spectral folding module, configured to perform spectral folding on exc(n) and output a second excitation signal exc_fold(n) with a frequency range of B_0 to 2B_0; and a synthesis filtering module, configured to perform synthesis filtering on exc(n) and exc_fold(n) and output a third excitation signal exc_HB(m), m = 0, ..., 2N-1, with a frequency range of 0 to 2B_0, the third excitation signal exc_HB(m) being used as the high-frequency excitation signal for reconstructing the high-frequency signal.

An apparatus for reconstructing a high-frequency signal in bandwidth extension comprises: an excitation signal generation unit, having the logical structure of the excitation signal generation apparatus of any one of claims 15 to 17, configured to generate an excitation signal exc_HB(m), m = 0, ..., 2N-1; a decoding unit, configured to decode and output time-domain spectral envelope parameters T_env(i) and frequency-domain spectral envelope parameters F_env(j), where i = 0, ..., I-1 and j = 0, ..., J-1; a time-domain shaping unit, configured to adjust the time-domain spectral envelope of exc_HB(m) according to T_env(i), each T_env(i) adjusting a segment of exc_HB(m) containing A time-domain samples, A ≤ 2N/I, and to output a time-domain-adjusted signal S_T(m); a frequency-domain shaping unit, configured to adjust the frequency-domain spectral envelope of S_T(m) according to F_env(j), each F_env(j) adjusting one subband of bandwidth B_1 in the frequency domain of S_T(m), B_1 ≤ B_2/J, where B_2 is the bandwidth of S_T(m), and to output a frequency-domain-adjusted reconstructed signal S_F(m); and a spectral folding unit, configured to perform spectral folding on the input S_F(m) and generate a high-frequency reconstructed signal S_HB(m) with a frequency range of 2B_0 to 2B_0 + B_2.

The above technical solution generates the required high-frequency excitation signal by spectrally folding a narrowband low-frequency signal and then synthesizing. Because the high-frequency signal is produced from the low-frequency signal, based on the harmonic characteristics of the signal's low-frequency and high-frequency spectra, it extends both speech and music signals well. The spectral folding used also ensures continuity of the signal spectrum where the low and high bands meet. Experiments show that the scheme is suitable not only for bandwidth extension of the 4 to 7 kHz band but also for extension of 7 to 14 kHz super-wideband signals.

Description of drawings

Fig. 1 is a schematic diagram of the steps of the method for generating an excitation signal according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the logical structure of the apparatus for generating an excitation signal according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the steps of the method for reconstructing a high-frequency signal according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the logical structure of the apparatus for reconstructing a high-frequency signal according to an embodiment of the present invention.

Detailed description

An embodiment of the present invention provides a method for generating an excitation signal in bandwidth extension, in which a narrowband low-frequency signal is spectrally folded and then synthesized to generate the required high-frequency excitation signal. Embodiments also provide a corresponding method for reconstructing the high-frequency signal in bandwidth extension, as well as an apparatus for generating the excitation signal and an apparatus for reconstructing the high-frequency signal. Each is described in detail below.

Referring to Fig. 1, the method for generating an excitation signal in bandwidth extension according to an embodiment of the present invention mainly comprises the following steps:

A1. Generate a first excitation signal with a frequency range of 0 to B_0. The first excitation signal is usually a narrowband excitation signal.

In this embodiment, the narrowband excitation signal exc(n), n = 0, ..., N-1, serving as the first excitation signal, is reconstructed from parameters obtained by decoding the core-layer bitstream. Depending on the core-layer coding used at the encoder, exc(n) may be reconstructed using code-excited linear prediction (CELP), for example by the excitation reconstruction described in the background section above.

To simplify processing and reduce computational complexity, this embodiment provides a simple and effective CELP-based way of generating exc(n), comprising:

① Decode the core bitstream to obtain the fixed-codebook excitation and the adaptive-codebook excitation, together with their respective gains.

Depending on the coding used by the encoder's core layer, the fixed-codebook excitation may consist of two parts, the base-layer fixed-codebook excitation c(n) and the enhancement-layer excitation c'(n), with gains g_c and g_enh respectively.

② Obtain exc(n) as the sum of the fixed-codebook and adaptive-codebook excitations, each weighted by its gain.

When the fixed-codebook excitation consists of two parts, exc(n) is computed as:

exc(n) = g_p·v(n) + g_c·c(n) + g_enh·c'(n)

where v(n) is the adaptive-codebook excitation and g_p is the gain of v(n).
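The weighted sum is simple enough to state directly in code. The Python sketch below assumes the three excitation vectors have already been decoded and have equal length; the function name and the sample values are illustrative only.

```python
def mix_excitation(v, c, c_enh, g_p, g_c, g_enh):
    """exc(n) = g_p * v(n) + g_c * c(n) + g_enh * c'(n)."""
    if not (len(v) == len(c) == len(c_enh)):
        raise ValueError("all excitation vectors must have the same length")
    return [g_p * a + g_c * b + g_enh * d for a, b, d in zip(v, c, c_enh)]

# Illustrative four-sample vectors and gains (not real codec data).
v = [1.0, 0.0, -1.0, 2.0]      # adaptive-codebook excitation
c = [0.0, 1.0, 0.0, 0.0]       # base-layer fixed-codebook excitation
c_enh = [0.0, 0.0, 0.0, 1.0]   # enhancement-layer excitation
exc = mix_excitation(v, c, c_enh, g_p=0.5, g_c=2.0, g_enh=0.25)
```

Unlike the background scheme, no voiced/unvoiced ratio estimation or pitch post-processing is involved, which is where the reduction in complexity comes from.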

Typically exc(n) has a frequency range of 0 to 4 kHz and a frame consists of 160 time-domain samples spanning 20 ms, i.e. B_0 = 4 kHz and N = 160.

A2. Perform spectral folding on exc(n) to generate a second excitation signal with a frequency range of B_0 to 2B_0. In keeping with the narrowband, low-frequency nature of exc(n), the second excitation signal can be regarded as a narrowband high-frequency signal exc_fold(n).

This process is equivalent to multiplying each of the N time-domain samples of exc(n) by (-1)^n.

A3. Perform synthesis filtering on exc(n) and exc_fold(n) and output a third excitation signal with a frequency range of 0 to 2B_0. This third excitation signal is the bandwidth-extended high-frequency excitation signal.

Synthesis filtering here means merging the spectra of exc(n) and exc_fold(n) to obtain the high-frequency excitation signal exc_HB(m), m = 0, ..., 2N-1, whose bandwidth is extended to 0 to 2B_0. One possible way of doing so is:

Apply quadrature mirror synthesis filtering to exc(n) and exc_fold(n) using a quadrature mirror filter (QMF).

In addition, depending on the needs of the application, exc_HB(m) with a frequency range of 0 to 2B_0 may be further low-pass, high-pass, or band-pass filtered to output exc_HB(m) covering only part of that range. Under current audio coding requirements, a wideband signal generally requires a frequency range of 0 to 7 kHz, comprising a low band of 0 to 4 kHz and a high band of 4 to 7 kHz; a super-wideband signal requires 0 to 14 kHz, comprising a low band of 0 to 8 kHz and a high band of 8 to 14 kHz. The coded bandwidth of the high band is thus usually 3/4 of that of the low band, so in this case the high-frequency excitation generated from the low-frequency excitation needs further processing, namely:

A4. Apply 3/4 low-pass filtering to exc_HB(m) with a frequency range of 0 to 2B_0 and output exc_HB(m) with a frequency range of 0 to 3B_0/2.

The high-frequency excitation signal with a frequency range of 0 to 3B_0/2 can then be used to reconstruct a wideband or super-wideband high-frequency signal with a frequency range of 2B_0 to 3.5B_0.

The apparatus for generating an excitation signal in bandwidth extension according to an embodiment of the present invention, which carries out the above excitation signal generation method, is described below. Referring to FIG. 2, its basic logical structure includes:

a core decoding module 101, configured to output a first excitation signal exc(n), n = 0, …, N-1, with a frequency range of 0 to B_0; the core decoding module 101 may be a CELP-based processing module, and the exc(n) it outputs may be split into two paths, supplied to the spectral folding module 102 and the synthesis filter module 103 respectively;

a spectral folding module 102, configured to perform spectral folding on exc(n) and output a second excitation signal exc_fold(n) with a frequency range of B_0 to 2B_0;

a synthesis filter module 103, configured to perform synthesis filtering on exc(n) and exc_fold(n) and output a third excitation signal exc_HB(m), m = 0, …, 2N-1, with a frequency range of 0 to 2B_0; the synthesis filter module 103 may be a quadrature mirror synthesis filter.

In addition, in line with the excitation signal frequency range requirement described for the generation method above, the excitation signal generation apparatus of this embodiment may further include:

a 3/4 low-pass filter 104, configured to receive exc_HB(m) with a frequency range of 0 to 2B_0, apply 3/4 low-pass filtering to it, and output exc_HB(m) with a frequency range of 0 to 3B_0/2.

To aid understanding of the above embodiment, the excitation signal generation process is illustrated below using an application in super-wideband bandwidth extension as an example. First, the CELP-based core layer extracts one frame of the 0 to 4 kHz excitation signal exc(n) (160 samples). This signal is then spectrally folded into the 4 to 8 kHz band, producing the excitation signal exc_fold(n) (160 samples) for that band. Next, a QMF synthesis filter combines exc(n) and exc_fold(n) into the required full-band excitation exc_qmf(m) (320 samples), whose bandwidth is 0 to 8 kHz. Finally, exc_qmf(m) is passed through the 3/4 low-pass filter to obtain the 0 to 6 kHz excitation signal exc_HB(m) (320 samples).
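This passage does not spell out how the spectral fold itself is computed. One widely used way to mirror the spectrum of a real signal is to flip the sign of every second sample, which maps a component at frequency f to fs/2 - f. The sketch below illustrates that general technique only; whether the embodiment uses it is an assumption, and the name `mirror_spectrum` is hypothetical.

```python
import math


def mirror_spectrum(x):
    """Multiply sample n by (-1)**n, mirroring the spectrum about fs/4."""
    return [(-v if n % 2 else v) for n, v in enumerate(x)]


# A cosine on bin 3 of a 16-point grid moves to bin 16/2 - 3 = 5.
n_points = 16
tone = [math.cos(2 * math.pi * 3 * n / n_points) for n in range(n_points)]
mirrored = mirror_spectrum(tone)
expected = [math.cos(2 * math.pi * 5 * n / n_points) for n in range(n_points)]

# The two sequences differ only by floating-point rounding error.
print(max(abs(a - b) for a, b in zip(mirrored, expected)))
```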

In the above embodiments of the excitation signal generation method and apparatus, the required high-frequency excitation signal is generated by spectrally folding the narrowband low-frequency signal and then synthesizing. Because the high-frequency signal is derived from the low-frequency signal, and the low- and high-frequency spectra of a signal share harmonic characteristics, both speech and music signals can be extended well. This addresses the problem that the binary excitation generation method of the speech-production-like model used in existing time-domain bandwidth extension codes music-like signals poorly. The spectral folding also keeps the signal spectrum continuous where the low and high bands meet. Experiments show that the above excitation signal generation scheme is suitable not only for bandwidth extension of 4 to 7 kHz band signals but also for extension of 7 to 14 kHz super-wideband signals.

The method for reconstructing a high-frequency signal in bandwidth extension according to an embodiment of the present invention, which builds on the above excitation signal generation method, is described below. Referring to FIG. 3, it mainly includes the following steps:

B1. Generate a high-frequency excitation signal.

The high-frequency excitation signal exc_HB(m), m = 0, …, 2N-1, is generated as in the foregoing embodiment. Its bandwidth is B_2, where B_2 = 2B_0 or 3B_0/2; the latter is usually used.

B2. Decode to obtain the time-domain spectral envelope parameters and the frequency-domain spectral envelope parameters.

The time-domain spectral envelope parameters T_env(i), i = 0, …, I-1, and the frequency-domain spectral envelope parameters F_env(j), j = 0, …, J-1, are decoded from the bitstream using the decoding scheme that corresponds to the encoding scheme used at the encoder; this embodiment does not restrict the specific encoding and decoding scheme. Note that this decoding step has no strict ordering requirement within the overall reconstruction process: it may run in parallel or in sequence with the other steps, and T_env(i) and F_env(j) need not be decoded at the same time, provided each parameter has been decoded before it is used in the reconstruction.

B3. Adjust the time-domain spectral envelope of exc_HB(m) according to T_env(i) to generate the time-domain-adjusted signal S_T(m).

The time-domain spectral envelope adjustment mirrors the extraction of the time-domain spectral envelope parameters at the encoder. Each T_env(i) adjusts one segment of exc_HB(m) containing A time-domain samples, where A ≤ 2N/I; that is, the adjusted samples may be all or only part of the 2N samples. The correspondence between each T_env(i) and the samples it adjusts is the same as in the extraction at the encoder. The adjustment itself may use, for example, the time-domain spectral envelope adjustment described in the background section above.

To achieve a better adjustment, this embodiment provides a time-domain spectral envelope adjustment procedure comprising:

① Calculate the time-domain spectral envelope parameters T'_env(i) of exc_HB(m) in the same way that the encoder calculates T_env(i).

The way the encoder calculates T_env(i) refers to the process by which the encoder extracts T_env(i) from the high-frequency signal S_hb(m) to be encoded. S_hb(m) is usually obtained by preprocessing the high-frequency part of the signal to be encoded at the encoder: the high-frequency signal obtained by band splitting after sampling is first folded down to the low band and then low-pass filtered according to the frequency range required for coding. One example of calculating T'_env(i) is as follows:

Divide the 2N samples of exc_HB(m) into I segments of A samples each, and calculate the log-domain energy of each segment:

$$T'_{env}(i) = \frac{1}{2}\log_2\left\{\sum_{a=0}^{A-1}\left[exc_{HB}(a + i \times A)\right]^2\right\}, \quad i = 0, \ldots, I-1.$$

Typically 10 samples form one segment, i.e. A = 10, in which case the number of T'_env(i) values is I = N/5.
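The segment-energy calculation above can be sketched as follows. The function name `time_envelope` is illustrative, and the small `floor` that guards against taking the logarithm of zero for a silent segment is an assumption added here; the specification does not say how zero energy is handled.

```python
import math


def time_envelope(exc_hb, seg_len=10, floor=1e-20):
    """Return T'_env(i): half the base-2 log of each segment's energy."""
    n_seg = len(exc_hb) // seg_len
    env = []
    for i in range(n_seg):
        seg = exc_hb[i * seg_len:(i + 1) * seg_len]
        energy = sum(v * v for v in seg)
        # floor is an assumed safeguard, not part of the source formula
        env.append(0.5 * math.log2(max(energy, floor)))
    return env


# A 320-sample frame (N = 160) with A = 10 yields I = 32 parameters.
print(len(time_envelope([0.5] * 320)))
```

Because the value is half the logarithm of the energy, it is in effect the base-2 logarithm of the square root of the segment energy, so a difference of 1 between two envelope values corresponds to a factor of 2 in amplitude.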

② Calculate the preliminary time-domain gain factors g_T(i) from the energy difference between T_env(i) and T'_env(i).

One example of calculating g_T(i) is as follows:

g_T(i) = 2^[T_env(i) - T'_env(i)].

Evidently, each g_T(i) corresponds to one segment of exc_HB(m) containing A time-domain samples, with the same correspondence as between T'_env(i) and the samples of exc_HB(m).
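As a minimal sketch of the gain formula, the function below turns the decoded and locally computed envelope values into amplitude gains. The name `preliminary_gains` and the sample values are illustrative, not taken from the specification.

```python
def preliminary_gains(t_env, t_env_local):
    """g_T(i) = 2 ** (T_env(i) - T'_env(i)) for each segment i."""
    return [2.0 ** (target - local) for target, local in zip(t_env, t_env_local)]


# A target one unit above the local envelope doubles the amplitude;
# a target one unit below halves it.
print(preliminary_gains([3.0, 1.0], [2.0, 2.0]))
```

Scaling a segment by g_T(i) multiplies its energy by g_T(i) squared, which shifts its half-log-energy envelope by exactly T_env(i) - T'_env(i), so the adjusted segment matches the decoded target.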

③ Interpolate each g_T(i) to obtain A gain factors.

Any suitable interpolation may be used to expand each g_T(i) into A gain factors g_{T,i}(a), a = 0, …, A-1; for example, every g_{T,i}(a) may simply be set equal to g_T(i). To obtain a better time-domain adjustment, this embodiment provides a smoothing interpolation algorithm for calculating g_{T,i}(a) when A = 10:

g_{T,i}(a) = w_T(a)·g_T(i) + [1 - w_T(a)]·g^last_{T,i}(a), where w_T(a) is a window function and g^last_{T,i}(a) is the gain factor of the corresponding sample of exc_HB(m) in the previous frame. Specifically, w_T(a) is:

$$w_T(a) = \begin{cases} \dfrac{1}{2}\left\{1 - \cos\left[\dfrac{(a+1)\pi}{6}\right]\right\}, & a = 0, \ldots, 4 \\ 1, & a = 5, \ldots, 9 \end{cases}$$

This interpolation can be understood as follows: the first 5 values of g_{T,i}(a) are smoothed using the corresponding g^last_{T,i}(a) obtained by smoothing interpolation in the previous frame, while the last 5 values take the value g_T(i).
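The window and the smoothing interpolation can be sketched as below. The names `smoothing_window` and `interpolate_gains` are illustrative. Initialising the previous-frame gains to 1.0 for the very first frame is an assumption; the specification does not state an initial value.

```python
import math


def smoothing_window(seg_len=10, ramp=5):
    """w_T(a): raised-cosine ramp for a < ramp, then 1."""
    return [0.5 * (1.0 - math.cos((a + 1) * math.pi / 6.0)) if a < ramp else 1.0
            for a in range(seg_len)]


def interpolate_gains(g_t, g_last, window):
    """Expand each g_T(i) into per-sample gains g_{T,i}(a).

    g_last[i][a] holds the interpolated gains of the previous frame.
    """
    return [[w * g + (1.0 - w) * g_last[i][a] for a, w in enumerate(window)]
            for i, g in enumerate(g_t)]


w_t = smoothing_window()
previous = [[1.0] * 10]          # assumed start-up value, not from the source
current = interpolate_gains([2.0], previous, w_t)
print([round(v, 4) for v in current[0]])
```

The first five gains ramp from the previous frame's value toward g_T(i), and the last five equal g_T(i), so the gain does not jump at the segment boundary.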

④ Adjust the gain of the A × I samples of exc_HB(m) according to g_{T,i}(a) to obtain S_T(m).

The time-domain spectral envelope of exc_HB(m) is shaped by simply multiplying each sample to be adjusted by its corresponding gain factor g_{T,i}(a):

S_T(m) = g_{T,i}(a)·exc_HB(m), with m = a + i × A.

B4. Adjust the frequency-domain spectral envelope of S_T(m) according to F_env(j) to generate the frequency-domain-adjusted reconstructed signal S_F(m).

As with the time-domain adjustment, the frequency-domain spectral envelope adjustment mirrors the extraction of the frequency-domain spectral envelope parameters at the encoder. Each F_env(j) adjusts one subband of bandwidth B_1 in the frequency domain of S_T(m), where B_1 ≤ B_2/J and B_2 is the bandwidth of S_T(m), i.e. of exc_HB(m). The correspondence between each F_env(j) and the band it adjusts is the same as in the extraction at the encoder. The adjustment itself may use, for example, the frequency-domain spectral envelope adjustment described in the background section above.

To reduce computational complexity and improve the adjustment, this embodiment provides a frequency-domain spectral envelope adjustment procedure comprising:

① In the same way that the encoder calculates F_env(j), apply a time-frequency transform to S_T(m) to generate the frequency-domain signal S_F1(m), and calculate the frequency-domain spectral envelope parameters F'_env(j) of S_F1(m).

The way the encoder calculates F_env(j) refers to the process by which the encoder extracts F_env(j) from the high-frequency signal S_hb(m) to be encoded. One example of calculating F'_env(j) is as follows:

Apply the window w_TDAC(k) to S_T(m) and to the previous frame S_T,last(m) to obtain the windowed signal S_w(k), k = 0, …, 4N-1, where

S_w(k) = w_TDAC(k)·S_T,last(k), k = 0, …, 2N-1,

S_w(k) = w_TDAC(k)·S_T(k-2N), k = 2N, …, 4N-1;

Apply a discrete cosine transform (DCT) to S_w(k) to generate S_F1(m); specifically, the modified discrete cosine transform (MDCT) may be used:

$$S_{F1}(m) = \sum_{k=0}^{4N-1} S_w(k)\cos\left[\frac{\pi}{8N}(2k+1+2N)(2m+1)\right];$$
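The transform formula can be evaluated directly, as in the sketch below. This is a plain O(N²) evaluation for illustration, not a fast algorithm, and the name `mdct` is illustrative. The output length of 2N coefficients follows from the statement later in the text that the transform yields 2N frequency-domain samples.

```python
import math


def mdct(s_w):
    """Evaluate S_F1(m), m = 0..2N-1, from 4N windowed samples S_w(k)."""
    four_n = len(s_w)
    if four_n == 0 or four_n % 4:
        raise ValueError("input length must be a positive multiple of 4")
    n = four_n // 4
    scale = math.pi / (8 * n)
    return [sum(s_w[k] * math.cos(scale * (2 * k + 1 + 2 * n) * (2 * m + 1))
                for k in range(four_n))
            for m in range(2 * n)]


# With N = 1 and a unit impulse at k = 0, each output is a single cosine term.
print(mdct([1.0, 0.0, 0.0, 0.0]))
```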

Take the first D × J samples of S_F1(m) to calculate F'_env(j):

$$F'_{env}(j) = \frac{1}{2}\log_2\left\{\sum_{d=0}^{D-1}\left[S_{F1}(d + j \times D)\right]^2\right\}.$$

Generating exc_HB(m) may have included the band-limiting 3/4 low-pass filtering, in which case only the data in the 0 to 3B_0/2 band is valid. After the time-frequency transform it therefore suffices to take the first 3N/2 of the 2N frequency-domain samples to calculate F'_env(j); in this case D × J = 3N/2.

Typically 16 samples form one subband, i.e. D = 16, in which case the number of F'_env(j) values is J = 3N/32. The window function w_TDAC(k) may be chosen as the following sine window:

w_TDAC(k) = sin[(k + 0.5)π/(4N)].
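The sine window and the two-frame windowing step can be sketched as follows; `sine_window` and `window_frames` are illustrative names. The check at the end shows a known property of this window, namely that w(k)² + w(k + 2N)² = 1, which is the condition that lets overlapping MDCT frames cancel their time-domain aliasing.

```python
import math


def sine_window(four_n):
    """w_TDAC(k) = sin((k + 0.5) * pi / (4N)) for k = 0..4N-1."""
    return [math.sin((k + 0.5) * math.pi / four_n) for k in range(four_n)]


def window_frames(s_t_last, s_t):
    """S_w(k): previous frame in the first half, current frame in the second."""
    if len(s_t_last) != len(s_t):
        raise ValueError("both frames must have 2N samples")
    window = sine_window(2 * len(s_t))
    joined = list(s_t_last) + list(s_t)
    return [w * v for w, v in zip(window, joined)]


w = sine_window(8)
print(all(abs(w[k] ** 2 + w[k + 4] ** 2 - 1.0) < 1e-12 for k in range(4)))
```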

② Calculate the preliminary frequency-domain gain factors g_F(j) from the energy difference between F_env(j) and F'_env(j); each g_F(j) corresponds to one segment of S_F1(m) containing D frequency-domain samples, where D × J ≤ 2N.

One example of calculating g_F(j) is as follows:

g_F(j) = 2^[F_env(j) - F'_env(j)].

The correspondence between each g_F(j) and the subbands of S_F1(m) is the same as that between F'_env(j) and the subbands of S_F1(m).

③ Interpolate each g_F(j) to obtain D gain factors g_{F,j}(d), d = 0, …, D-1.

The interpolation may follow the method described above for the time-domain gain factors, or another interpolation method may be used; the details are not repeated here.

④ Adjust the gain of the D × J samples of S_F1(m) according to g_{F,j}(d) to generate the adjusted frequency-domain signal S_F2(m). As with the time-domain spectral envelope adjustment, each frequency-domain sample is simply multiplied by its corresponding gain factor g_{F,j}(d):

S_F2(m) = g_{F,j}(d)·S_F1(m), with m = d + j × D.
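Steps ① to ④ of the frequency-domain adjustment can be sketched together as below. The sketch makes three assumptions that the specification leaves open: every g_{F,j}(d) is set equal to g_F(j) (the simplest interpolation mentioned for the time-domain case), bins beyond the first D × J are left untouched, and a small `floor` guards against the logarithm of zero. The name `shape_spectrum` is illustrative.

```python
import math


def shape_spectrum(s_f1, f_env, band=16, floor=1e-20):
    """Scale the first band * len(f_env) bins so each subband matches F_env(j)."""
    if len(s_f1) < band * len(f_env):
        raise ValueError("not enough bins for the requested subbands")
    s_f2 = list(s_f1)
    for j, target in enumerate(f_env):
        lo, hi = j * band, (j + 1) * band
        energy = sum(v * v for v in s_f1[lo:hi])
        local = 0.5 * math.log2(max(energy, floor))   # F'_env(j)
        gain = 2.0 ** (target - local)                # g_F(j)
        for m in range(lo, hi):
            s_f2[m] = gain * s_f1[m]                  # constant gain per subband
    return s_f2


# Two 4-bin subbands, both steered to an envelope value of 2.0.
print(shape_spectrum([1.0] * 4 + [2.0] * 4, [2.0, 2.0], band=4))
```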

⑤ Apply the inverse of the time-frequency transform to S_F2(m) to obtain S_F(m).

For example, if the MDCT was used to transform to the frequency domain before the frequency-domain adjustment, the inverse MDCT (IMDCT) is used here to transform back to the time domain.

B5. Perform spectral folding on S_F(m) to generate the high-frequency reconstructed signal S_HB(m) with a frequency range of 2B_0 to 2B_0 + B_2.

Because the encoder folds the high-frequency signal down to the low band, spectral folding must be applied again when the signal is restored at the decoder. The folding is similar to the spectral folding used when the encoder preprocesses the high-frequency signal. If, during reconstruction, the excitation signal was low-pass filtered to meet the frequency range required for coding, the frequency-domain coefficients of the high-frequency part removed by the filter can be set to 0 before folding to obtain the final high-frequency reconstructed signal.

Furthermore, because the signal undergoes both time-domain and frequency-domain adjustment in the above reconstruction process, glitches are likely to appear in the reconstructed signal. To remove them, the time- and frequency-adjusted signal S_F(m) may be post-processed before spectral folding; that is, the following step is added before step B5:

B51. Perform envelope adjustment on S_F(m) using the envelope adjustment thresholds limit_1(i) and limit_2(i). The adjusted S_F(m) is:

for m = m_1 to m_2, if |S_F,old(m)| < limit_1(i), then S_F(m) = S_F,old(m);

for m = m_2+1 to m_3, if limit_1(i) ≤ |S_F,old(m)| ≤ limit_2(i), then S_F(m) = [S_F,old(m) - limit_1(i)]/2 + limit_1(i);

for m = m_3+1 to m_4, if |S_F,old(m)| > limit_2(i), then S_F(m) = [S_F,old(m) - limit_2(i)]/16 + limit_2(i); where S_F,old(m) is S_F(m) before envelope adjustment, and limit_1(i) and limit_2(i) correspond to the time-domain samples of S_F(m) in the same way that T_env(i) does.

In the above post-processing, a preferred way of setting the thresholds limit_1(i) and limit_2(i) is:

limit_1(i) = 2^T_env(i),

limit_2(i) = [2^T_env(i)] × 2.5.

The post-processing may be performed once per 80 samples, with each 80 samples divided into three parts: the first 6 samples (the m_1 to m_2 part), the middle 70 samples (the m_2+1 to m_3 part), and the last 4 samples (the m_3+1 to m_4 part). For example, if N = 160, the time- and frequency-adjusted signal has 320 samples and can be post-processed in 4 passes: the m_1 to m_2 parts are samples 0 to 5, 80 to 85, 160 to 165, and 240 to 245; the m_2+1 to m_3 parts are samples 6 to 75, 86 to 155, 166 to 235, and 246 to 315; and the m_3+1 to m_4 parts are samples 76 to 79, 156 to 159, 236 to 239, and 316 to 319.
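The post-processing rule can be sketched as below under two stated assumptions. First, the text writes the compression formulas directly on S_F,old(m) and does not say how negative samples are treated; this sketch applies the formulas to the magnitude and restores the sign. Second, a sample that does not meet the condition given for its position is left unchanged. The name `postprocess` is illustrative.

```python
def postprocess(s_f, t_env, seg_len=10, block=80, head=6, tail=4):
    """Compress samples that exceed the envelope thresholds, block by block."""
    out = list(s_f)
    for m, x in enumerate(s_f):
        limit1 = 2.0 ** t_env[m // seg_len]   # limit_1(i) = 2^T_env(i)
        limit2 = 2.5 * limit1                 # limit_2(i) = 2.5 * 2^T_env(i)
        pos = m % block
        mag = abs(x)
        if head <= pos < block - tail and limit1 <= mag <= limit2:
            new_mag = (mag - limit1) / 2.0 + limit1
        elif pos >= block - tail and mag > limit2:
            new_mag = (mag - limit2) / 16.0 + limit2
        else:
            continue                          # assumed: all other samples kept
        out[m] = new_mag if x >= 0 else -new_mag   # assumed sign handling
    return out


frame = [0.0] * 80
frame[10], frame[11], frame[77] = 4.0, -4.0, 21.0
result = postprocess(frame, [1.0] * 8)       # limit_1 = 2, limit_2 = 5
print(result[10], result[11], result[77])
```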

The apparatus for reconstructing a high-frequency signal in bandwidth extension according to an embodiment of the present invention, which carries out the above high-frequency signal reconstruction method, is described below. Referring to FIG. 4, its basic logical structure includes:

an excitation signal generation unit 201, which has the logical structure of the excitation signal generation apparatus of the foregoing embodiment and is configured to generate the excitation signal exc_HB(m), m = 0, …, 2N-1;

a decoding unit 202, configured to decode and output the time-domain spectral envelope parameters T_env(i) and the frequency-domain spectral envelope parameters F_env(j), where i = 0, …, I-1 and j = 0, …, J-1;

a time-domain shaping unit 203, configured to adjust the time-domain spectral envelope of exc_HB(m) output by the excitation signal generation unit 201 according to T_env(i) output by the decoding unit 202, each T_env(i) adjusting one segment of exc_HB(m) containing A time-domain samples, A ≤ 2N/I, and to output the time-domain-adjusted signal S_T(m);

a frequency-domain shaping unit 204, configured to adjust the frequency-domain spectral envelope of S_T(m) output by the time-domain shaping unit 203 according to F_env(j) output by the decoding unit 202, each F_env(j) adjusting one subband of bandwidth B_1 in the frequency domain of S_T(m), B_1 ≤ B_2/J, where B_2 is the bandwidth of S_T(m), and to output the frequency-domain-adjusted reconstructed signal S_F(m);

a spectral folding unit 205, configured to perform spectral folding on the input S_F(m) to generate the high-frequency reconstructed signal S_HB(m) with a frequency range of 2B_0 to 2B_0 + B_2.

In addition, corresponding to the post-processing used in the reconstruction method above to remove signal glitches, the high-frequency signal reconstruction apparatus of this embodiment may further include:

a post-processing unit 206, configured to perform envelope adjustment on S_F(m) output by the frequency-domain shaping unit 204 using the envelope adjustment thresholds limit_1(i) and limit_2(i), the adjusted S_F(m) being: for m = m_1 to m_2, if |S_F,old(m)| < limit_1(i), then S_F(m) = S_F,old(m); for m = m_2+1 to m_3, if limit_1(i) ≤ |S_F,old(m)| ≤ limit_2(i), then S_F(m) = [S_F,old(m) - limit_1(i)]/2 + limit_1(i); for m = m_3+1 to m_4, if |S_F,old(m)| > limit_2(i), then S_F(m) = [S_F,old(m) - limit_2(i)]/16 + limit_2(i); where S_F,old(m) is S_F(m) before envelope adjustment, and limit_1(i) and limit_2(i) correspond to the time-domain samples of S_F(m) in the same way that T_env(i) does; and to output the adjusted S_F(m) to the spectral folding unit 205.

In the above embodiments of the high-frequency signal reconstruction method and apparatus, the smoothing interpolation of the time-domain gain factors yields a better time-domain adjustment; the specific frequency-domain spectral envelope adjustment avoids filtering the signal band by band with filter banks at the decoder, which simplifies processing and reduces computational complexity; and the post-shaping processing better removes the glitches introduced by shaping.

To aid understanding of the above embodiment, the high-frequency signal reconstruction process is illustrated below using an application in super-wideband bandwidth extension as an example:

① Generate the 0 to 6 kHz high-frequency excitation signal exc_HB(m), with 320 time-domain samples per frame; that is, 2N = 320, B_0 = 4 kHz, and B_2 = 3B_0/2 = 6 kHz.

② Decode 32 time-domain spectral envelope parameters T_env(i), i = 0, …, 31, from the bitstream, each corresponding to 10 time-domain samples; that is, I = 32 and A = 10.

③ Likewise divide exc_HB(m) into 32 short segments of 10 samples each and calculate the corresponding T'_env(i):

$$T'_{env}(i) = \frac{1}{2}\log_2\left\{\sum_{a=0}^{9}\left[exc_{HB}(a + i \times 10)\right]^2\right\}.$$

Then calculate the time-domain gains g_T(i) = 2^[T_env(i) - T'_env(i)] and interpolate each g_T(i) using the smoothing interpolation algorithm:

g_{T,i}(a) = w_T(a)·g_T(i) + [1 - w_T(a)]·g^last_{T,i}(a), a = 0, …, 4,

g_{T,i}(a) = g_T(i), a = 5, …, 9,

where w_T(a) = {0.0669872981f, 0.2500000000f, 0.5000000000f, 0.7500000000f, 0.9330127019f} for a = 0 to 4 in order, the suffix f denoting a floating-point number. Then calculate the time-domain-shaped signal:

S_T(m) = g_{T,i}(a)·exc_HB(m).

④ Decode 15 frequency-domain spectral envelope parameters F_env(j), j = 0, …, 14, from the bitstream, each corresponding to a subband 0.4 kHz wide; that is, J = 15.

⑤ Apply the sine window w_TDAC(k) to S_T(m) and to the previous frame S_T,last(m),

w_TDAC(k) = sin[(k + 0.5)π/640], k = 0, …, 639, to obtain the windowed signal S_w(k),

S_w(k) = w_TDAC(k)·S_T,last(k), k = 0, …, 319,

S_w(k) = w_TDAC(k)·S_T(k-2N), k = 320, …, 639;

Then apply a 640-point MDCT to the windowed sequence S_w(k) to generate the frequency-domain signal S_F1(m):

$$S_{F1}(m) = \sum_{k=0}^{639} S_w(k)\cos\left[\frac{\pi}{1280}(2k+1+320)(2m+1)\right];$$

Because 3/4 low-pass filtering was applied when generating exc_HB(m), removing the 6 to 8 kHz band, only the 0 to 6 kHz data is valid. The first 240 points of S_F1(m) are therefore taken to calculate the 15 values of F'_env(j), in groups of 16 points, i.e. D = 16:

$$F'_{env}(j) = \frac{1}{2}\log_2\left\{\sum_{d=0}^{15}\left[S_{F1}(d + j \times 16)\right]^2\right\}.$$

Then calculate the frequency-domain gains g_F(j) = 2^[F_env(j) - F'_env(j)] and obtain the frequency-domain-shaped signal S_F2(m) = g_F(j)·S_F1(m). Then apply the IMDCT to S_F2(m) to obtain S_F(m).

⑥ Process the 320 samples of S_F(m) in blocks of 80 samples, each block divided into three parts (the first 6 samples, the middle 70 samples, and the last 4 samples), and perform envelope adjustment with limit_1(i) = 2^T_env(i) and limit_2(i) = [2^T_env(i)] × 2.5.

⑦ Then perform spectral folding on the envelope-adjusted 0 to 6 kHz signal to obtain the 8 to 14 kHz high-frequency reconstructed signal S_HB(m).

Combining S_HB(m) with the low-frequency signal (0 to 8 kHz) obtained by decoding the core bitstream (for example, by QMF synthesis) yields the complete super-wideband reconstructed signal (0 to 14 kHz).
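The frame parameters quoted in this worked example follow from N, B_0, A, and D, as the sketch below shows. The function name `frame_parameters` and the dictionary keys are illustrative. The bin-width calculation assumes that the 2N MDCT coefficients span 0 to 2B_0, which is consistent with the statement that the first 240 of 320 bins cover 0 to 6 kHz.

```python
def frame_parameters(n=160, b0_hz=4000.0, seg_len=10, band=16):
    """Derive the frame sizes used in the super-wideband example."""
    frame = 2 * n                     # samples per frame of exc_HB(m)
    valid_bins = 3 * n // 2           # bins kept after the 3/4 low-pass
    bin_hz = 2 * b0_hz / frame        # assumed: 2N bins span 0 to 2*B_0
    return {
        "frame": frame,
        "I": frame // seg_len,        # number of T_env(i)
        "mdct_input": 2 * frame,      # 4N windowed samples
        "valid_bins": valid_bins,
        "J": valid_bins // band,      # number of F_env(j)
        "subband_hz": band * bin_hz,
    }


print(frame_parameters())
```

With the default values this gives 320 samples per frame, I = 32, a 640-point transform input, 240 valid bins, J = 15, and a subband width of 400 Hz, all matching the figures in the example.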

A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, which may include a ROM, a RAM, a magnetic disk, an optical disc, or the like.

The method for generating an excitation signal in bandwidth extension and the corresponding method and apparatus for reconstructing a high-frequency signal provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the invention, and the description of the above embodiments is intended only to aid understanding of the method of the invention and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (19)

1. the generation method of pumping signal is characterized in that during a bandwidth was expanded, and comprising:
The generated frequency scope is 0~B 0The first pumping signal exc (n), n=0 ..., N-1;
Exc (n) is carried out spectrum folding, and the generated frequency scope is B 0~2B 0The second pumping signal exc Fold(n);
To exc (n) and exc Fold(n) carry out synthetic filtering, reference frequency output is 0~2B 0The 3rd pumping signal exc HB(m), m=0 ..., 2N-1, described the 3rd pumping signal exc HB(m) be used for carrying out the reconstruction of high-frequency signal as high-frequency excitation signal.
2. the generation method of pumping signal according to claim 1 is characterized in that, and is described to exc (n) and exc Fold(n) step of carrying out synthetic filtering is specially: to exc (n) and exc Fold(n) carry out the orthogonal mirror image synthetic filtering.
3. the generation method of pumping signal according to claim 1 and 2 is characterized in that, also comprises:
To frequency range is 0~2B 0Exc HB(m) carry out 3/4 low-pass filtering, reference frequency output is 0~3B 0/ 2 exc HB(m).
4. the generation method of pumping signal according to claim 3 is characterized in that, the step of described generation exc (n) is specially:
The decoding core code stream obtains constant codebook excitations and adaptive codebook excitation and gain separately;
According to described constant codebook excitations of gain weighting superposition and adaptive codebook excitation acquisition exc (n) separately.
5. the generation method of pumping signal according to claim 4 is characterized in that:
Described constant codebook excitations comprises that basic layer constant codebook excitations c (n) and enhancement layer strengthen excitation c ' (n), and corresponding gain is respectively g cAnd g Enh
Calculate exc (n) according to following formula:
Exc (n)=g pV (n)+g cC (n)+g EnhC ' (n) wherein, v (n) is adaptive codebook excitation, g pBe the gain of v (n), N=160, B 0=4kHz.
6. A method for reconstructing a high-frequency signal in bandwidth extension, characterized by comprising:
Generating the excitation signal exc_HB(m), m = 0, ..., 2N-1, by the method according to any one of claims 1 to 5;
Decoding to obtain time-domain envelope parameters T_env(i) and frequency-domain envelope parameters F_env(j), where i = 0, ..., I-1 and j = 0, ..., J-1;
Adjusting the time-domain envelope of exc_HB(m) according to T_env(i), where each T_env(i) adjusts a corresponding segment of A time-domain samples in exc_HB(m), A ≤ 2N/I, to generate the time-domain-adjusted signal S_T(m);
Adjusting the frequency-domain envelope of S_T(m) according to F_env(j), where each F_env(j) adjusts a corresponding subband of bandwidth B_1 in the frequency domain of S_T(m), B_1 ≤ B_2/J, B_2 being the bandwidth of S_T(m), to generate the frequency-domain-adjusted reconstructed signal S_F(m);
Performing spectrum folding on S_F(m) to generate the high-frequency reconstructed signal S_HB(m) with frequency range 2B_0 to 2B_0 + B_2.
7. The method for reconstructing a high-frequency signal according to claim 6, characterized in that the step of adjusting the time-domain envelope of exc_HB(m) according to T_env(i) comprises:
Computing the time-domain envelope parameter T'_env(i) of exc_HB(m) in the same manner in which the encoder computes T_env(i);
Computing a preliminary time-domain gain factor g_T(i) from the energy difference between T_env(i) and T'_env(i), each g_T(i) corresponding to a segment of A time-domain samples in exc_HB(m);
Interpolating each g_T(i) to obtain A gain factors g_{T,i}(a), a = 0, ..., A-1;
Adjusting the gain of the A × I samples of exc_HB(m) according to g_{T,i}(a) to obtain S_T(m).
8. The method for reconstructing a high-frequency signal according to claim 7, characterized in that A = 10, I = N/5, and the step of computing g_T(i) from the energy difference between T_env(i) and T'_env(i) is specifically:
g_T(i) = 2^[T_env(i) - T'_env(i)];
The step of interpolating each g_T(i) to obtain A gain factors g_{T,i}(a) is specifically:
g_{T,i}(a) = w_T(a)·g_T(i) + [1 - w_T(a)]·g^last_{T,i}(a); where w_T(a) is a window function: for a = 0, ..., 4, w_T(a) = (1/2){1 - cos[(a+1)π/6]}; for a = 5, ..., 9, w_T(a) = 1; and g^last_{T,i}(a) is the gain factor of the corresponding sample of exc_HB(m) in the previous frame.
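The gain computation and interpolation of claim 8 can be sketched for a single segment as follows. The function names are hypothetical, and passing the previous gain factors as a list `g_last` of A values is an assumption about storage, not something the claim specifies.

```python
import math

A = 10  # samples per segment, as in claim 8

def window_t(a):
    """w_T(a): raised-cosine ramp for a = 0..4, then flat at 1 for a = 5..9."""
    if a < 5:
        return 0.5 * (1.0 - math.cos((a + 1) * math.pi / 6.0))
    return 1.0

def segment_gains(t_env, t_env_local, g_last):
    """Per-sample gains g_{T,i}(a) for one segment of A samples.

    g_last holds the A previous gain factors; how they are stored is an
    assumption of this sketch.
    """
    g = 2.0 ** (t_env - t_env_local)  # preliminary gain g_T(i)
    return [window_t(a) * g + (1.0 - window_t(a)) * g_last[a] for a in range(A)]

# Decoded envelope one unit above the locally measured one: g_T(i) = 2.
gains = segment_gains(t_env=3.0, t_env_local=2.0, g_last=[1.0] * A)
```

In this example the first five gains ramp from the previous value towards 2, and the last five equal 2, so the window smooths the transition between gain values.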
9. The method for reconstructing a high-frequency signal according to any one of claims 6 to 8, characterized in that the step of adjusting the frequency-domain envelope of S_T(m) according to F_env(j) comprises:
Performing, in the same manner in which the encoder computes F_env(j), a time-frequency transform on S_T(m) to generate the frequency-domain signal S_F1(m), and computing the frequency-domain envelope parameter F'_env(j) of S_F1(m);
Computing a preliminary frequency-domain gain factor g_F(j) from the energy difference between F_env(j) and F'_env(j), each g_F(j) corresponding to a segment of D frequency-domain samples in S_F1(m), D × J ≤ 2N;
Interpolating each g_F(j) to obtain D gain factors g_{F,j}(d), d = 0, ..., D-1;
Adjusting the gain of the D × J samples of S_F1(m) according to g_{F,j}(d) to generate the adjusted frequency-domain signal S_F2(m);
Performing the inverse of the time-frequency transform on S_F2(m) to obtain S_F(m).
10. The method for reconstructing a high-frequency signal according to claim 9, characterized in that the step of generating S_F1(m) and computing F'_env(j) in the same manner in which the encoder computes F_env(j) comprises:
Applying the window w_TDAC(k) to S_T(m) and to the previous frame S_{T,last}(m) to obtain the windowed signal S_w(k), k = 0, ..., 4N-1, where
S_w(k) = w_TDAC(k)·S_{T,last}(k), k = 0, ..., 2N-1,
S_w(k) = w_TDAC(k)·S_T(k-2N), k = 2N, ..., 4N-1;
Performing a discrete cosine transform on S_w(k) to generate S_F1(m),
[Formula image S200710198774XC00031]
Extracting the first D × J samples of S_F1(m) to compute F'_env(j),
[Formula image S200710198774XC00032]
11. The method for reconstructing a high-frequency signal according to claim 10, characterized in that D = 16, J = 3N/32, and the window function w_TDAC(k) is:
w_TDAC(k) = sin[(k + 0.5)π/(4N)].
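The windowing of claims 10 and 11 can be sketched as below: the previous and current 2N-sample frames are joined into 4N samples and multiplied by the sine window. The function names and the all-ones test frames are hypothetical, and the transform that follows the windowing is not shown because its formula is only available as an image.

```python
import math

def tdac_window(n):
    """w_TDAC(k) = sin[(k + 0.5) * pi / (4N)] for k = 0, ..., 4N-1."""
    return [math.sin((k + 0.5) * math.pi / (4 * n)) for k in range(4 * n)]

def window_frames(s_last, s_cur):
    """Window the previous frame followed by the current frame (2N samples each)."""
    if len(s_last) != len(s_cur) or len(s_cur) % 2 != 0:
        raise ValueError("both frames must hold the same even number of samples")
    n = len(s_cur) // 2
    w = tdac_window(n)
    joined = list(s_last) + list(s_cur)  # S_{T,last} first, then S_T
    return [w[k] * joined[k] for k in range(4 * n)]

N = 160
w = tdac_window(N)
s_w = window_frames([1.0] * (2 * N), [1.0] * (2 * N))
```

One property worth noting is that w_TDAC(k)^2 + w_TDAC(k + 2N)^2 = 1, because the second term is the cosine of the same angle; this is the usual condition that lets overlapping windowed frames add back to the original signal.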
12. The method for reconstructing a high-frequency signal according to any one of claims 6 to 11, characterized in that, before spectrum folding is performed on S_F(m), the method further comprises:
Performing envelope adjustment on S_F(m) using envelope adjustment thresholds limit_1(i) and limit_2(i), the adjusted S_F(m) being:
In the part m = m_1 to m_2, if |S_{F,old}(m)| < limit_1(i), then S_F(m) = S_{F,old}(m),
In the part m = m_2+1 to m_3, if limit_1(i) ≤ |S_{F,old}(m)| ≤ limit_2(i), then S_F(m) = [S_{F,old}(m) - limit_1(i)]/2 + limit_1(i),
In the part m = m_3+1 to m_4, if |S_{F,old}(m)| > limit_2(i), then S_F(m) = [S_{F,old}(m) - limit_2(i)]/16 + limit_2(i), where S_{F,old}(m) is S_F(m) before the envelope adjustment, and the correspondence between limit_1(i), limit_2(i) and the time-domain samples of S_F(m) is the same as the correspondence between T_env(i) and the time-domain samples of S_F(m).
13. The method for reconstructing a high-frequency signal according to claim 12, characterized in that limit_1(i) and limit_2(i) are:
limit_1(i) = 2^T_env(i), limit_2(i) = 2.5 × 2^T_env(i).
14. The method for reconstructing a high-frequency signal according to claim 13, characterized in that N = 160; the part m_1 to m_2 consists of samples 0 to 5, 80 to 85, 160 to 165, and 240 to 245; the part m_2+1 to m_3 consists of samples 6 to 75, 86 to 155, 166 to 235, and 246 to 315; and the part m_3+1 to m_4 consists of samples 76 to 79, 156 to 159, 236 to 239, and 316 to 319.
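The envelope adjustment of claims 12 to 14 can be sketched per sample as follows. The sample ranges listed in claim 14 repeat every 80 samples, so the sketch derives the region from m modulo 80. It applies the formulas literally; the claims do not say how negative samples or samples that miss their region's condition are treated, so leaving the latter unchanged is an assumption, as are the function names.

```python
def limits(t_env):
    """Thresholds of claim 13: limit_1 = 2^T_env, limit_2 = 2.5 * 2^T_env."""
    limit_1 = 2.0 ** t_env
    return limit_1, 2.5 * limit_1

def region(m):
    """Region of claim 14 for N = 160: the pattern repeats every 80 samples."""
    r = m % 80
    if r <= 5:
        return 1
    if r <= 75:
        return 2
    return 3

def adjust_sample(s_old, m, t_env):
    """Apply the claim 12 rule to one sample, using the formulas as written.

    t_env is the envelope parameter T_env(i) of the segment that contains m.
    """
    limit_1, limit_2 = limits(t_env)
    mag = abs(s_old)
    reg = region(m)
    if reg == 2 and limit_1 <= mag <= limit_2:
        return (s_old - limit_1) / 2.0 + limit_1
    if reg == 3 and mag > limit_2:
        return (s_old - limit_2) / 16.0 + limit_2
    return s_old  # region 1, or the condition for this region is not met
```

For example, with T_env(i) = 0 the thresholds are 1 and 2.5, so `adjust_sample(2.0, 10, 0.0)` halves the excess over limit_1 and `adjust_sample(18.5, 76, 0.0)` reduces the excess over limit_2 by a factor of 16.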
15. An apparatus for generating an excitation signal in bandwidth extension, characterized by comprising:
A core codec module, configured to output a first excitation signal exc(n), n = 0, ..., N-1, whose frequency range is 0 to B_0;
A spectrum folding module, configured to perform spectrum folding on exc(n) and output a second excitation signal exc_fold(n) whose frequency range is B_0 to 2B_0;
A synthesis filtering module, configured to perform synthesis filtering on exc(n) and exc_fold(n) and output a third excitation signal exc_HB(m), m = 0, ..., 2N-1, whose frequency range is 0 to 2B_0, the third excitation signal exc_HB(m) being used as the high-frequency excitation signal for reconstructing the high-frequency signal.
16. The apparatus for generating an excitation signal according to claim 15, characterized in that the synthesis filtering module is a quadrature mirror filter (QMF) synthesis filter.
17. The apparatus for generating an excitation signal according to claim 15 or 16, characterized by further comprising:
A 3/4 low-pass filter, configured to receive exc_HB(m) with frequency range 0 to 2B_0, perform 3/4 low-pass filtering on it, and output exc_HB(m) with frequency range 0 to 3B_0/2.
18. An apparatus for reconstructing a high-frequency signal in bandwidth extension, characterized by comprising:
An excitation signal generation unit, which adopts the logical structure of the apparatus for generating an excitation signal according to any one of claims 15 to 17 and is configured to generate the excitation signal exc_HB(m), m = 0, ..., 2N-1;
A decoding unit, configured to decode and output time-domain envelope parameters T_env(i) and frequency-domain envelope parameters F_env(j), where i = 0, ..., I-1 and j = 0, ..., J-1;
A time-domain shaping unit, configured to adjust the time-domain envelope of exc_HB(m) according to T_env(i), where each T_env(i) adjusts a corresponding segment of A time-domain samples in exc_HB(m), A ≤ 2N/I, and to output the time-domain-adjusted signal S_T(m);
A frequency-domain shaping unit, configured to adjust the frequency-domain envelope of S_T(m) according to F_env(j), where each F_env(j) adjusts a corresponding subband of bandwidth B_1 in the frequency domain of S_T(m), B_1 ≤ B_2/J, B_2 being the bandwidth of S_T(m), and to output the frequency-domain-adjusted reconstructed signal S_F(m);
A spectrum folding unit, configured to perform spectrum folding on the input S_F(m) to generate the high-frequency reconstructed signal S_HB(m) with frequency range 2B_0 to 2B_0 + B_2.
19. The apparatus for reconstructing a high-frequency signal according to claim 18, characterized by further comprising:
A post-processing unit, configured to perform envelope adjustment on the S_F(m) output by the frequency-domain shaping unit using envelope adjustment thresholds limit_1(i) and limit_2(i), the adjusted S_F(m) being: in the part m = m_1 to m_2, if |S_{F,old}(m)| < limit_1(i), then S_F(m) = S_{F,old}(m); in the part m = m_2+1 to m_3, if limit_1(i) ≤ |S_{F,old}(m)| ≤ limit_2(i), then S_F(m) = [S_{F,old}(m) - limit_1(i)]/2 + limit_1(i); in the part m = m_3+1 to m_4, if |S_{F,old}(m)| > limit_2(i), then S_F(m) = [S_{F,old}(m) - limit_2(i)]/16 + limit_2(i); where S_{F,old}(m) is S_F(m) before the envelope adjustment, and the correspondence between limit_1(i), limit_2(i) and the time-domain samples of S_F(m) is the same as the correspondence between T_env(i) and the time-domain samples of S_F(m); and to output the adjusted S_F(m) to the spectrum folding unit.
CN200710198774XA 2007-12-12 2007-12-12 Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus Expired - Fee Related CN101458930B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN200710198774XA CN101458930B (en) 2007-12-12 2007-12-12 Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus
PCT/CN2008/073368 WO2009076871A1 (en) 2007-12-12 2008-12-08 Method and apparatus for generating excitation signal and regenerating signal in bandwidth extension

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200710198774XA CN101458930B (en) 2007-12-12 2007-12-12 Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus

Publications (2)

Publication Number Publication Date
CN101458930A CN101458930A (en) 2009-06-17
CN101458930B true CN101458930B (en) 2011-09-14

Family

ID=40769743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200710198774XA Expired - Fee Related CN101458930B (en) 2007-12-12 2007-12-12 Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus

Country Status (2)

Country Link
CN (1) CN101458930B (en)
WO (1) WO2009076871A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2780971A1 (en) * 2009-11-19 2011-05-26 Telefonaktiebolaget L M Ericsson (Publ) Improved excitation signal bandwidth extension
US8600737B2 (en) * 2010-06-01 2013-12-03 Qualcomm Incorporated Systems, methods, apparatus, and computer program products for wideband speech coding
US12002476B2 (en) 2010-07-19 2024-06-04 Dolby International Ab Processing of audio signals during high frequency reconstruction
CA3027803C (en) 2010-07-19 2020-04-07 Dolby International Ab Processing of audio signals during high frequency reconstruction
KR101826331B1 (en) * 2010-09-15 2018-03-22 삼성전자주식회사 Apparatus and method for encoding and decoding for high frequency bandwidth extension
CN103915104B (en) * 2012-12-31 2017-07-21 华为技术有限公司 Signal bandwidth extended method and user equipment
CN104036781B (en) * 2013-03-05 2017-02-22 深港产学研基地 Voice signal bandwidth expansion device and method
CN103165134B (en) * 2013-04-02 2015-01-14 武汉大学 Coding and decoding device of audio signal high frequency parameter
CN104517611B (en) * 2013-09-26 2016-05-25 华为技术有限公司 A high-frequency excitation signal prediction method and device
US10410645B2 (en) * 2014-03-03 2019-09-10 Samsung Electronics Co., Ltd. Method and apparatus for high frequency decoding for bandwidth extension
CN104269173B (en) * 2014-09-30 2018-03-13 武汉大学深圳研究院 The audio bandwidth expansion apparatus and method of switch mode
CN109074813B (en) 2015-09-25 2020-04-03 杜比实验室特许公司 Processing high-definition audio data
CN107221334B (en) * 2016-11-01 2020-12-29 武汉大学深圳研究院 Audio bandwidth extension method and extension device
CN107545900B (en) * 2017-08-16 2020-12-01 广州广晟数码技术有限公司 Method and apparatus for generating medium and high frequency string signals for bandwidth extension encoding and decoding
CN107682096B (en) * 2017-09-14 2020-07-14 大连理工大学 A Narrowband Random Signal Generation Method Based on Multilevel Interpolation
WO2019145955A1 (en) 2018-01-26 2019-08-01 Hadasit Medical Research Services & Development Limited Non-metallic magnetic resonance contrast agent
IL313348B2 (en) 2018-04-25 2025-08-01 Dolby Int Ab Combining high-frequency reconstruction techniques with reduced post-processing delay
CN112189231B (en) 2018-04-25 2024-09-20 杜比国际公司 Integration of high-frequency audio reconstruction technology
CN110556123B (en) * 2019-09-18 2024-01-19 腾讯科技(深圳)有限公司 Band expansion method, device, electronic equipment and computer readable storage medium
CN110556121B (en) * 2019-09-18 2024-01-09 腾讯科技(深圳)有限公司 Band expansion method, device, electronic equipment and computer readable storage medium
CN116110424B (en) * 2021-11-11 2025-07-15 腾讯科技(深圳)有限公司 Voice bandwidth expansion method and related device
CN114999503B (en) * 2022-05-23 2024-08-27 北京百瑞互联技术股份有限公司 Full-bandwidth spectral coefficient generation method and system based on generation countermeasure network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1606687A (en) * 2002-09-19 2005-04-13 松下电器产业株式会社 Audio decoding apparatus and method
CN1220972C (en) * 2002-02-08 2005-09-28 株式会社Ntt都科摩 Decoding apparatus and coding apparatus, decoding method and coding method
CN101083076A (en) * 2006-06-03 2007-12-05 三星电子株式会社 Method and device for encoding and decoding signals using bandwidth extension techniques

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2008121724A (en) * 2005-11-30 2009-12-10 Панасоник Корпорэйшн (Jp) SUB-BAND CODING DEVICE AND SUB-BAND CODING METHOD

Also Published As

Publication number Publication date
WO2009076871A1 (en) 2009-06-25
CN101458930A (en) 2009-06-17

Similar Documents

Publication Publication Date Title
CN101458930B (en) Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus
JP5722437B2 (en) Method, apparatus, and computer readable storage medium for wideband speech coding
CN101676993B (en) Method and apparatus for artificially extending the bandwidth of a speech signal
JP4861196B2 (en) Method and device for low frequency enhancement during audio compression based on ACELP / TCX
CN101276587B (en) Audio encoding apparatus and method thereof, audio decoding device and method thereof
US8532998B2 (en) Selective bandwidth extension for encoding/decoding audio/speech signal
CN101662288B (en) Method, device and system for encoding and decoding audios
JP5833675B2 (en) Bandwidth expansion method and apparatus
JP6526704B2 (en) Method, apparatus and computer readable medium for processing an audio signal
JP5809066B2 (en) Speech coding apparatus and speech coding method
US20100063802A1 (en) Adaptive Frequency Prediction
CN105280190B (en) Bandwidth extension encoding and decoding method and device
US8380498B2 (en) Temporal envelope coding of energy attack signal by using attack point location
CN103366749B (en) A kind of sound codec devices and methods therefor
JP2009515212A (en) Audio compression
CN101430880A (en) Encoding/decoding method and apparatus for ambient noise
CN101281748B (en) Method for filling opening son (sub) tape using encoding index as well as method for generating encoding index
US9390722B2 (en) Method and device for quantizing voice signals in a band-selective manner
CN103366751A (en) Sound coding and decoding apparatus and sound coding and decoding method
CN103155035B (en) Audio signal bandwidth extension in CELP-based speech coder
CN105280189A (en) Method and apparatus for high-frequency generation during bandwidth extension coding and decoding
CN1371512A (en) Enhanced waveform interpolative coder
JP3598111B2 (en) Broadband audio restoration device
JP3598112B2 (en) Broadband audio restoration method and wideband audio restoration apparatus
JP2004046238A (en) Broadband audio restoration apparatus and wideband audio restoration method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110914

Termination date: 20171212