CN1930914B - Method and device for encoding and synthesizing multi-channel audio signals
- Publication number
- CN1930914B (application CN200580007036A)
- Authority
- CN
- China
- Prior art keywords
- frequency
- audio
- channel
- frequency range
- parametric
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Landscapes
- Stereophonic System (AREA)
Abstract
Description
Technical Field
The invention relates to the encoding of audio signals and the subsequent synthesis of auditory scenes from the encoded audio data.
Cross-Reference to Related Applications
This application claims the benefit of the filing date of U.S. Provisional Application No. 60/549,972, filed March 4, 2004, as Attorney Docket No. Faller 14-2. The subject matter of this application is related to that of U.S. Patent Application Serial No. 09/848,877, filed May 4, 2001, as Attorney Docket No. Faller 5 (the "'877 application"); U.S. Patent Application Serial No. 10/045,458, filed November 7, 2001, as Attorney Docket No. Baumgarte 1-6-8 (the "'458 application"); U.S. Patent Application Serial No. 10/155,437, filed May 24, 2002, as Attorney Docket No. Baumgarte 2-10 (the "'437 application"); and U.S. Patent Application Serial No. 10/815,591, filed April 1, 2004, as Attorney Docket No. Baumgarte 7-12 (the "'591 application"). The contents of all four applications are incorporated herein by reference.
Background
Multi-channel surround sound systems have been standard in movie theaters for many years. As technology has advanced, it has become feasible to build multi-channel surround systems for home use. Today, such systems are usually sold as "home theater systems." Following an ITU-R recommendation, most of these systems provide five regular audio channels and one low-frequency subwoofer channel (the low-frequency effects, or LFE, channel). Such a multi-channel system is referred to as a 5.1 surround system. Other surround systems exist, such as 7.1 (seven regular channels and one LFE channel) and 10.2 (ten regular channels and two LFE channels).
C. Faller and F. Baumgarte, "Efficient representation of spatial audio coding using perceptual parametrization," IEEE Workshop on Appl. of Sig. Proc. to Audio and Acoust., October 2001, and C. Faller and F. Baumgarte, "Binaural Cue Coding Applied to Stereo and Multi-Channel Audio Compression," Preprint 112th Conv. Aud. Eng. Soc., May 2002 (collectively, the "BCC papers"), the contents of both of which are incorporated herein by reference, describe a parametric multi-channel audio coding technique referred to as BCC coding.
FIG. 1 shows a block diagram of an audio processing system 100 that performs binaural cue coding (BCC) according to the BCC papers. BCC system 100 has a BCC encoder 102, which receives C audio input channels 108, one from each of, for example, C different microphones 106. BCC encoder 102 has a downmixer 110, which converts the C audio input channels into a mono audio sum signal 112.
In addition, BCC encoder 102 has a BCC analyzer 114, which generates a BCC cue code data stream 116 for the C input channels. The BCC cue codes (also called auditory scene parameters) include inter-channel level difference (ICLD) and inter-channel time difference (ICTD) data for each input channel. BCC analyzer 114 performs band-based processing to generate ICLD and ICTD data for each of one or more different sub-bands (e.g., different critical bands) of the audio input channels.
BCC encoder 102 transmits sum signal 112 and BCC cue code data stream 116 (e.g., as in-band or out-of-band side information with respect to the sum signal) to a BCC decoder 104 of BCC system 100. BCC decoder 104 has a side-information processor 118, which processes data stream 116 to recover the BCC cue codes 120 (e.g., ICLD and ICTD data). BCC decoder 104 also has a BCC synthesizer 122, which uses the recovered BCC cue codes 120 to synthesize C audio output channels 124 from sum signal 112 for playback by C loudspeakers 126, respectively.
Audio processing system 100 can be implemented in the context of multi-channel audio signals such as 5.1 surround sound. In particular, downmixer 110 of BCC encoder 102 converts the six input channels of conventional 5.1 surround sound (i.e., five regular channels plus one LFE channel) into sum signal 112. In addition, BCC analyzer 114 of encoder 102 transforms the six input channels into the frequency domain to generate the corresponding BCC cue codes 116. Analogously, side-information processor 118 of BCC decoder 104 recovers the BCC cue codes 120 from the received side-information stream 116, and BCC synthesizer 122 of decoder 104 then (1) transforms the received sum signal 112 into the frequency domain, (2) applies the recovered BCC cue codes 120 to the frequency-domain sum signal to generate six frequency-domain signals, and (3) transforms those frequency-domain signals into the six time-domain channels of synthesized 5.1 surround sound (i.e., five synthesized regular channels plus one synthesized LFE channel) for playback by loudspeakers 126.
Summary of the Invention
According to the present invention, there are provided:
A method for encoding a multi-channel audio signal having a plurality of audio input channels, the multi-channel audio signal having a plurality of regular channels and at least one low-frequency effects channel, the method comprising: applying a parametric audio coding technique to generate parametric audio codes for all of the audio input channels for a first frequency range, the first frequency range corresponding to one or more sub-bands below a specified cutoff frequency; and applying the parametric audio coding technique to generate parametric audio codes for only the regular channels for a second frequency range, the second frequency range corresponding to one or more sub-bands above the specified cutoff frequency, wherein: for the first frequency range, the parametric audio coding technique generates parametric audio codes corresponding to all of the audio input channels; and for the second frequency range, the parametric audio coding technique generates parametric audio codes corresponding only to the regular channels and not to the at least one low-frequency effects channel.
An apparatus for encoding a multi-channel audio signal having a plurality of audio input channels, the multi-channel audio signal having a plurality of regular channels and at least one low-frequency effects channel, the apparatus comprising: means for applying a parametric audio coding technique to generate parametric audio codes for all of the audio input channels for a first frequency range, the first frequency range corresponding to one or more sub-bands below a specified cutoff frequency; and means for applying the parametric audio coding technique to generate parametric audio codes for only the regular channels for a second frequency range, the second frequency range corresponding to one or more sub-bands above the specified cutoff frequency, wherein: for the first frequency range, the parametric audio coding technique generates parametric audio codes corresponding to all of the audio input channels; and for the second frequency range, the parametric audio coding technique generates parametric audio codes corresponding only to the regular channels and not to the at least one low-frequency effects channel.
A parametric audio encoder, comprising: a downmixer adapted to generate one or more combined channels from a plurality of audio input channels of a multi-channel audio signal, the multi-channel audio signal having a plurality of regular channels and at least one low-frequency effects channel; and an analyzer adapted to generate: (1) parametric audio codes for all of the audio input channels for a first frequency range, the first frequency range corresponding to one or more sub-bands below a specified cutoff frequency; and (2) parametric audio codes for only the regular channels for a second frequency range, the second frequency range corresponding to one or more sub-bands above the specified cutoff frequency, wherein: for the first frequency range, the analyzer generates parametric audio codes corresponding to all of the audio input channels; and for the second frequency range, the analyzer generates parametric audio codes corresponding only to the regular channels and not to the at least one low-frequency effects channel.
A method for synthesizing a multi-channel audio signal having a plurality of audio output channels, the multi-channel audio signal having a plurality of regular channels and at least one low-frequency effects channel, the method comprising: applying a parametric audio decoding technique to generate all of the audio output channels for a first frequency range, the first frequency range corresponding to one or more sub-bands below a specified cutoff frequency; and applying the parametric audio decoding technique to generate only the regular channels for a second frequency range, the second frequency range corresponding to one or more sub-bands above the specified cutoff frequency; wherein: the parametric audio decoding technique uses parametric audio codes to generate the audio output channels; for the first frequency range, the parametric audio codes correspond to all of the audio output channels; and for the second frequency range, the parametric audio codes correspond only to the regular channels and not to the at least one low-frequency effects channel.
An apparatus for synthesizing a multi-channel audio signal having a plurality of audio output channels, the multi-channel audio signal having a plurality of regular channels and at least one low-frequency effects channel, the apparatus comprising: means for applying a parametric audio decoding technique to generate all of the audio output channels for a first frequency range, the first frequency range corresponding to one or more sub-bands below a specified cutoff frequency; and means for applying the parametric audio decoding technique to generate only the regular channels for a second frequency range, the second frequency range corresponding to one or more sub-bands above the specified cutoff frequency; wherein: the parametric audio decoding technique uses parametric audio codes to generate the audio output channels; for the first frequency range, the parametric audio codes correspond to all of the audio output channels; and for the second frequency range, the parametric audio codes correspond only to the regular channels and not to the at least one low-frequency effects channel.
A parametric audio decoder for synthesizing a multi-channel audio signal having a plurality of audio output channels, the multi-channel audio signal having a plurality of regular channels and at least one low-frequency effects channel, the parametric audio decoder comprising: a side-information processor adapted to recover parametric audio codes; and a synthesizer adapted to: apply a parametric audio decoding technique to generate all of the audio output channels for a first frequency range, the first frequency range corresponding to one or more sub-bands below a specified cutoff frequency; and apply the parametric audio decoding technique to generate only the regular channels for a second frequency range, the second frequency range corresponding to one or more sub-bands above the specified cutoff frequency; wherein: the parametric audio decoding technique uses the parametric audio codes to generate the audio output channels; for the first frequency range, the parametric audio codes correspond to all of the audio output channels; and for the second frequency range, the parametric audio codes correspond only to the regular channels and not to the at least one low-frequency effects channel.
For surround sound applications, embodiments of the present invention are directed to BCC-based parametric audio coding techniques in which band-based BCC coding is not applied to the sub-bands of the low-frequency subwoofer (LFE) channel that lie above a cutoff frequency. For example, for 5.1 surround sound, BCC coding is applied to all six channels (i.e., five regular channels plus one LFE channel) for sub-bands below the cutoff frequency, while BCC coding is applied to only the five regular channels (i.e., not to the LFE channel) for sub-bands above the cutoff frequency. By avoiding BCC coding of the "high" frequencies of the LFE channel, these embodiments have (1) a reduced encoder and decoder processing load and (2) a smaller BCC code bitstream than a corresponding BCC-based system that processes all six channels at all frequencies.
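The size of the saving can be illustrated with a rough count. The sketch below is only an illustration under stated assumptions: it counts (channel, sub-band) pairs that receive cue codes in one frame as a proxy for side-information size, and the figures of 20 sub-bands with 2 of them at or below the cutoff are hypothetical values, not taken from the patent. The function name `cue_pairs` is likewise made up for this example.

```python
def cue_pairs(num_regular, num_lfe, num_subbands, num_low_subbands, restrict_lfe):
    """Count (channel, sub-band) pairs that receive cue codes in one frame.

    num_low_subbands is the number of sub-bands at or below the cutoff.
    With restrict_lfe=True, LFE channels contribute only to those sub-bands.
    """
    if not 0 <= num_low_subbands <= num_subbands:
        raise ValueError("num_low_subbands must lie between 0 and num_subbands")
    regular_pairs = num_regular * num_subbands
    lfe_bands = num_low_subbands if restrict_lfe else num_subbands
    return regular_pairs + num_lfe * lfe_bands


# Hypothetical 5.1 configuration: 20 sub-bands, 2 of them at or below the cutoff.
full = cue_pairs(5, 1, 20, 2, restrict_lfe=False)
reduced = cue_pairs(5, 1, 20, 2, restrict_lfe=True)
print(full, reduced)
```

The actual bit-rate saving depends on how the cue codes are quantized and entropy coded, which this count does not model.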
More generally, the present invention relates to the application of parametric audio coding techniques, such as but not limited to BCC coding, in which two or more different subsets of the input channels are processed in two or more different frequency ranges. As used in this specification, the term "subset" may refer to the set containing all of the input channels as well as to proper subsets containing fewer than all of the input channels. The application of the present invention to BCC coding of 5.1 and other surround sound signals is just one particular case of the invention.
Brief Description of the Drawings
Other aspects, features, and advantages of the present invention will become more apparent from the following detailed description, the appended claims, and the accompanying drawings, in which:
FIG. 1 shows a block diagram of an audio processing system that performs binaural cue coding (BCC); and
FIG. 2 shows a block diagram of an audio processing system that performs BCC coding according to one embodiment of the present invention.
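Detailed Description
FIG. 2 shows a block diagram of an audio processing system 200 that performs binaural cue coding (BCC) for 5.1 surround sound according to one embodiment of the present invention. BCC system 200 has a BCC encoder 202, which receives six audio input channels 208 (i.e., five regular channels and one LFE channel). BCC encoder 202 has a downmixer 210, which converts (e.g., averages) the audio input channels (including the LFE channel) into one or more, but fewer than six, combined channels 212.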
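In addition, BCC encoder 202 has a BCC analyzer 214, which generates a BCC cue code data stream 216 for the input channels. As indicated in FIG. 2, for sub-bands at or below a specified cutoff frequency fc, BCC analyzer 214 uses all six 5.1 surround sound input channels (including the LFE channel) when generating the BCC cue code data. For all other (i.e., high-frequency) sub-bands, BCC analyzer 214 uses only the five regular channels (and not the LFE channel) to generate the BCC cue code data. As a result, the LFE channel contributes BCC codes only for the BCC sub-bands at or below the cutoff frequency rather than for the entire BCC frequency range, which reduces the overall size of the side-information bitstream.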
The cutoff frequency is preferably chosen such that the effective audio bandwidth of the LFE channel is less than or equal to fc (i.e., the LFE channel has essentially no energy or no substantial audio content above the cutoff frequency). Unless the sub-bands are aligned with the cutoff frequency, the cutoff frequency will fall within a particular sub-band, so that part of that sub-band lies above the cutoff frequency. For purposes of this description, such a sub-band is referred to as being "at" the cutoff frequency. In preferred embodiments, that entire sub-band of the LFE channel is BCC coded, and the next higher-frequency sub-band is the first high-frequency sub-band that is not BCC coded.
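The "at or below the cutoff" rule can be sketched as a test on each sub-band's lower edge. This is a minimal illustration, assuming sub-bands are described by an ascending list of edge frequencies; the edge values and the 120 Hz cutoff are hypothetical, and the function names are invented for the example.

```python
def subbands_at_or_below(band_edges_hz, cutoff_hz):
    """Return indices of sub-bands that are at or below the cutoff.

    Sub-band i spans [band_edges_hz[i], band_edges_hz[i + 1]). A sub-band
    qualifies when its lower edge is below the cutoff, which covers both
    sub-bands wholly below the cutoff and the one that contains it.
    """
    return [
        i
        for i in range(len(band_edges_hz) - 1)
        if band_edges_hz[i] < cutoff_hz
    ]


def channels_analyzed(band_index, low_bands, num_regular=5, num_lfe=1):
    """Number of input channels used for cue generation in one sub-band."""
    return num_regular + (num_lfe if band_index in low_bands else 0)


edges = [0, 100, 200, 400, 800]   # hypothetical sub-band edges in Hz
fc = 120                          # hypothetical cutoff, inside sub-band 1
low = subbands_at_or_below(edges, fc)
counts = [channels_analyzed(i, low) for i in range(len(edges) - 1)]
print(low, counts)
```

Here sub-band 1 (100 to 200 Hz) contains the cutoff, so it is treated as "at" the cutoff and still analyzed with all six channels; sub-band 2 is the first one analyzed with five.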
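In one possible implementation, the BCC cue codes include inter-channel level difference (ICLD), inter-channel time difference (ICTD), and inter-channel correlation (ICC) data for the input channels. BCC analyzer 214 preferably performs band-based processing analogous to that described in the '877 and '458 applications to generate ICLD and ICTD data for the different sub-bands of the audio input channels. In addition, BCC analyzer 214 preferably generates coherence measures as the ICC data for the different sub-bands. These coherence measures are described in greater detail in the '437 and '591 applications.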
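BCC encoder 202 transmits the one or more combined channels 212 and the BCC cue code data stream 216 (e.g., as in-band or out-of-band side information with respect to the combined channels) to a BCC decoder 204 of BCC system 200. BCC decoder 204 has a side-information processor 218, which processes data stream 216 to recover the BCC cue codes 220 (e.g., ICLD, ICTD, and ICC data). BCC decoder 204 also has a BCC synthesizer 222, which uses the recovered BCC cue codes 220 to synthesize six audio output channels 224 from the one or more combined channels 212 for playback by six surround sound loudspeakers 226, respectively.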
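As indicated in FIG. 2, BCC synthesizer 222 performs six-channel BCC synthesis for sub-bands at or below the cutoff frequency fc to generate frequency content for all six 5.1 surround channels (i.e., including the LFE channel), and performs five-channel BCC synthesis for sub-bands above the cutoff frequency to generate frequency content for only the five regular channels of 5.1 surround sound. In particular, BCC synthesizer 222 decomposes the received combined channel(s) 212 into a number of sub-bands (e.g., critical bands). Different processing is applied in these sub-bands to obtain the corresponding sub-bands of the output audio channels. As a result, for the LFE channel, only sub-bands at or below the cutoff frequency are obtained. In other words, the LFE channel has frequency content only in sub-bands at or below the cutoff frequency. The higher sub-bands of the LFE channel (i.e., those above the cutoff frequency) can be filled with zero signals, if necessary.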
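The zero-fill step for the synthesized LFE channel can be sketched as follows. This is an illustrative fragment only: it assumes the synthesizer already holds one list of samples per sub-band, that sub-bands are described by ascending edge frequencies, and that a sub-band counts as at or below the cutoff when its lower edge is below the cutoff. All names and numeric values are hypothetical.

```python
def zero_fill_lfe(lfe_subbands, band_edges_hz, cutoff_hz):
    """Return LFE sub-band signals with everything above the cutoff zeroed.

    lfe_subbands[i] holds the samples of sub-band i, which spans
    [band_edges_hz[i], band_edges_hz[i + 1]). Sub-bands at or below the
    cutoff are copied; higher sub-bands are replaced by zero signals of
    the same length.
    """
    if len(lfe_subbands) != len(band_edges_hz) - 1:
        raise ValueError("need exactly one signal per sub-band")
    output = []
    for i, samples in enumerate(lfe_subbands):
        if band_edges_hz[i] < cutoff_hz:
            output.append(list(samples))
        else:
            output.append([0.0] * len(samples))
    return output


lfe = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # hypothetical sub-band samples
filled = zero_fill_lfe(lfe, [0, 100, 200, 400], 120)
print(filled)
```

In a real decoder the high sub-bands would typically never be computed for the LFE channel at all; the explicit zeros are only needed when a later stage expects a full set of sub-bands.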
Depending on the particular implementation, a BCC encoder could be designed to generate BCC cue codes for all frequencies and simply not transmit those cue codes for particular sub-bands (e.g., sub-bands above the cutoff frequency and/or sub-bands having essentially zero energy). Similarly, the corresponding BCC decoder could be designed to perform conventional BCC synthesis for all frequencies, where the BCC decoder applies suitable BCC cue code values for those sub-bands that have no explicitly transmitted codes.
Although the present invention has been described in the context of BCC decoders that apply the techniques of the '877 and '458 applications to synthesize auditory scenes, the invention can also be implemented in the context of BCC decoders that apply other techniques for synthesizing auditory scenes that do not necessarily rely on the techniques of the '877 and '458 applications. For example, the BCC processing of the present invention can be implemented without ICTD, ICLD, and/or ICC data, with or without other suitable cue codes, such as those associated with head-related transfer functions.
In the embodiment of FIG. 2, 5.1 surround sound is encoded by applying six-channel BCC analysis to sub-bands at or below the cutoff frequency and five-channel BCC analysis to sub-bands above the cutoff frequency. In another embodiment, the invention can be applied to 7.1 surround sound, where eight-channel BCC analysis is applied to sub-bands at or below a specified cutoff frequency and seven-channel BCC analysis (excluding the single LFE channel) is applied to sub-bands above the cutoff frequency.
The invention can also be applied to surround sound having more than one LFE channel. For example, for 10.2 surround sound, twelve-channel BCC analysis can be applied to sub-bands at or below a specified cutoff frequency, and ten-channel BCC analysis (excluding the two LFE channels) can be applied to sub-bands above the cutoff frequency. Alternatively, two different cutoff frequencies can be specified: a first cutoff frequency for the first LFE channel of 10.2 surround sound and a second cutoff frequency for the second LFE channel. In that case, assuming that the first cutoff frequency is lower than the second, twelve-channel BCC analysis can be applied to sub-bands at or below the first cutoff frequency, eleven-channel BCC analysis (excluding the first LFE channel) can be applied to sub-bands that are (1) above the first cutoff frequency and (2) at or below the second cutoff frequency, and ten-channel BCC analysis (excluding both LFE channels) can be applied to sub-bands above the second cutoff frequency.
Similarly, some consumer multi-channel equipment is purposely designed with different output channels having different frequency ranges. For example, some 5.1 surround sound equipment has two rear channels that are designed to reproduce only frequencies below 7 kHz. The invention can be applied to such systems by specifying two cutoff frequencies: one for the LFE channel and a higher one for the rear channels. In that case, six-channel BCC analysis can be applied to sub-bands at or below the LFE cutoff frequency, five-channel BCC analysis (excluding the LFE channel) can be applied to sub-bands that are (1) above the LFE cutoff frequency and (2) at or below the rear-channel cutoff frequency, and three-channel BCC analysis (excluding the LFE channel and the two rear channels) can be applied to sub-bands above the rear-channel cutoff frequency.
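Both multi-cutoff cases above follow from giving each channel its own optional cutoff. The sketch below is one possible way to express that, not the patent's implementation. The channel names, the 80 Hz and 120 Hz LFE cutoffs, and the sub-band lower edges are hypothetical; only the 7 kHz rear-channel limit comes from the text. A channel is taken to be active in a sub-band when it has no cutoff or when the sub-band's lower edge is below the channel's cutoff.

```python
def active_channels(channel_cutoffs_hz, band_lower_edge_hz):
    """Return names of channels analyzed in a sub-band.

    channel_cutoffs_hz maps a channel name to its cutoff in Hz, or to None
    for a full-bandwidth channel. A channel with a cutoff is included only
    in sub-bands at or below that cutoff (lower edge below the cutoff).
    """
    return [
        name
        for name, cutoff in channel_cutoffs_hz.items()
        if cutoff is None or band_lower_edge_hz < cutoff
    ]


# 10.2 surround: ten regular channels plus two LFE channels with
# hypothetical cutoffs of 80 Hz and 120 Hz.
surround_10_2 = {f"ch{i}": None for i in range(1, 11)}
surround_10_2.update({"LFE1": 80.0, "LFE2": 120.0})
counts_10_2 = [len(active_channels(surround_10_2, lo)) for lo in (0.0, 100.0, 200.0)]

# 5.1 surround with rear channels limited to 7 kHz (hypothetical LFE cutoff).
surround_5_1 = {"L": None, "R": None, "C": None,
                "Ls": 7000.0, "Rs": 7000.0, "LFE": 120.0}
counts_5_1 = [len(active_channels(surround_5_1, lo)) for lo in (0.0, 1000.0, 8000.0)]
print(counts_10_2, counts_5_1)
```

The same table-driven form extends to any combination of channels and cutoffs, including configurations in which no single sub-band contains every channel.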
The invention can be generalized further to the application of parametric audio coding to two or more different subsets of the input channels in two or more different frequency ranges, where the parametric audio coding may be other than BCC coding and the different frequency ranges are selected to reflect the frequency content of the different input channels. Depending on the particular application, different channels can be excluded from different frequency ranges in any suitable combination. For example, low-frequency channels can be excluded from high-frequency regions and/or high-frequency channels can be excluded from low-frequency regions. It may even be the case that no single frequency range contains all of the input channels.
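As described previously, although the input channels 208 can be downmixed to form a single combined (e.g., mono) channel 212, in alternative implementations the multiple input channels can be downmixed to form two or more different "combined" channels, depending on the particular audio processing application. More information on such techniques can be found in U.S. Patent Application No. 10/762,100, filed January 20, 2004, the contents of which are incorporated herein by reference.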
In some implementations, when downmixing generates multiple combined channels, the combined-channel data can be transmitted using conventional audio transmission techniques. For example, when two combined channels are generated, conventional stereo transmission techniques can be used. In that case, a BCC decoder can extract the BCC codes and use them to synthesize a multi-channel signal (e.g., 5.1 surround sound) from the two combined channels. Moreover, this can provide backward compatibility, in which the two BCC combined channels are played back by a conventional (i.e., non-BCC-based) stereo decoder that ignores the BCC codes. Analogous backward compatibility can be achieved for a conventional mono decoder when a single BCC combined channel is generated. Note that, in theory, when there are multiple "combined" channels, one or more of them may in fact be based on individual input channels.
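Although BCC system 200 can have the same number of audio input channels as audio output channels, in alternative embodiments the number of input channels can be greater than or less than the number of output channels, depending on the particular application. For example, the input audio could correspond to 7.1 surround sound while the synthesized output audio corresponds to 5.1 surround sound, or vice versa.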
In general, BCC encoders of the present invention can be implemented in the context of converting M input audio channels into N combined channels and one or more corresponding sets of BCC codes, where M>N≥1. Similarly, BCC decoders of the present invention can be implemented in the context of generating P output channels from the N combined audio channels and the corresponding sets of BCC codes, where P>N, and P may be the same as or different from M.
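The channel-count constraints M>N≥1 and P>N can be written as a small configuration check. This is merely an illustrative helper with a made-up name, not part of the described system.

```python
def check_channel_counts(m_inputs, n_combined, p_outputs):
    """Validate an (M, N, P) configuration against M > N >= 1 and P > N.

    M is the number of input channels, N the number of combined channels,
    and P the number of synthesized output channels. P may equal M or not.
    """
    if not (m_inputs > n_combined >= 1):
        raise ValueError("encoder requires M > N >= 1")
    if not (p_outputs > n_combined):
        raise ValueError("decoder requires P > N")
    return True


# 5.1 in, one combined channel, 5.1 out.
ok_mono = check_channel_counts(6, 1, 6)
# 7.1 in, two combined channels, 5.1 out (P differs from M).
ok_stereo = check_channel_counts(8, 2, 6)
print(ok_mono, ok_stereo)
```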
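Depending on the particular implementation, the various signals received and generated by both BCC encoder 202 and BCC decoder 204 of FIG. 2 can be any suitable combination of analog and/or digital signals, including all analog or all digital. Although not shown in FIG. 2, those skilled in the art will appreciate that the one or more combined channels 212 and the BCC cue code data stream 216 can be further encoded by BCC encoder 202 and correspondingly decoded by BCC decoder 204, for example, based on some suitable compression scheme (e.g., ADPCM), to further reduce the size of the transmitted data.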
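What constitutes transmission of data from BCC encoder 202 to BCC decoder 204 depends on the particular application of audio processing system 200. For example, in some applications, such as live broadcasts of music concerts, transmission may involve real-time transmission of the data for immediate playback at a remote location. In other applications, "transmission" may involve storage of the data onto CDs or other suitable storage media for later (i.e., non-real-time) playback. Other applications are, of course, also possible.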
Depending on the particular implementation, the transmission channel can be wired or wireless and can use customized or standardized protocols (e.g., IP). Media such as CDs, DVDs, digital tape recorders, and solid-state memories can be used for storage. In addition, transmission and/or storage may, but need not, include channel coding. Similarly, although the present invention has been described in the context of digital audio systems, those skilled in the art will understand that the invention can also be implemented in the context of analog audio systems, such as AM radio, FM radio, and the audio portion of analog television broadcasting, each of which supports the inclusion of an additional in-band low-bit-rate transmission channel.
The present invention can be implemented for many different applications, such as music reproduction, broadcasting, and telephony. For example, the invention can be implemented for digital radio/TV/Internet (e.g., webcast) broadcasting, such as Sirius Satellite Radio or XM Satellite Radio. Other applications include voice over IP, PSTN or other voice networks, analog radio broadcasting, and Internet radio.
Depending on the particular application, different techniques can be employed to embed the sets of BCC codes into the combined channel(s) to obtain a BCC signal of the present invention. The feasibility of any particular technique may depend, at least in part, on the particular transmission/storage medium used for the BCC signal. For example, protocols for digital radio broadcasting usually support the inclusion of additional "enhancement" bits (e.g., in the header portion of data packets) that are ignored by conventional receivers. These additional bits can be used to represent the sets of auditory scene parameters to provide a BCC signal. In general, the invention can be implemented using any suitable technique for watermarking audio signals, in which data corresponding to the sets of auditory scene parameters is embedded into the audio signal to form a BCC signal. For example, these techniques can involve data hidden under perceptual masking curves or data hidden in pseudo-random noise, where the pseudo-random noise is perceived as smooth background noise. Data embedding can also be implemented using methods similar to the "bit robbing" used in TDM (time-division multiplexing) transmission for in-band signaling. Another possible technique is mu-law LSB bit flipping, in which the least significant bits are used to carry the data.
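The least-significant-bit approach mentioned last can be sketched generically. The fragment below is a simplified illustration under stated assumptions: it treats the audio as a list of non-negative integer codewords (for example 8-bit values) and overwrites the LSB of the first codewords with payload bits. It does not perform mu-law companding or model any real broadcast format, and the function names are hypothetical.

```python
def embed_bits(codewords, bits):
    """Return a copy of codewords with payload bits written into the LSBs.

    Each payload bit replaces the least significant bit of one codeword,
    starting at the first codeword. Remaining codewords are left unchanged.
    """
    if len(bits) > len(codewords):
        raise ValueError("not enough codewords to carry the payload")
    out = list(codewords)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | (bit & 1)
    return out


def extract_bits(codewords, count):
    """Read back the first `count` payload bits from the codeword LSBs."""
    return [cw & 1 for cw in codewords[:count]]


audio = [0x10, 0x11, 0xFF, 0x80]     # hypothetical 8-bit codewords
payload = [1, 0, 1]                  # hypothetical side-information bits
marked = embed_bits(audio, payload)
recovered = extract_bits(marked, len(payload))
print(marked, recovered)
```

A receiver that ignores the embedded data simply plays the marked codewords, with each sample changed by at most one quantization step.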
The present invention may be implemented as circuit-based processes, including possible implementation on a single integrated circuit. As will be apparent to those skilled in the art, various functions of circuit elements may also be implemented as processing steps in a software program. Such software may be employed in, for example, a digital signal processor, a microcontroller, or a general-purpose computer.
The present invention can be embodied in the form of methods and apparatuses for practicing those methods. It can also be embodied in the form of program code contained in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of program code, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the scope of the invention as expressed in the following claims.
Claims (16)
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US54997204P | 2004-03-04 | 2004-03-04 | |
| US60/549,972 | 2004-03-04 | ||
| US10/827,900 US7805313B2 (en) | 2004-03-04 | 2004-04-20 | Frequency-based coding of channels in parametric multi-channel coding systems |
| US10/827,900 | 2004-04-20 | ||
| PCT/US2005/005605 WO2005094125A1 (en) | 2004-03-04 | 2005-02-23 | Frequency-based coding of audio channels in parametric multi-channel coding systems |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1930914A CN1930914A (en) | 2007-03-14 |
| CN1930914B CN1930914B (en) | 2012-06-27 |
Family
ID=37859620
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN2005800070361A Expired - Lifetime CN1930914B (en) | 2004-03-04 | 2005-02-23 | Method and device for encoding and synthesizing multi-channel audio signals |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN1930914B (en) |
| RU (1) | RU2323551C1 (en) |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2144229A1 (en) * | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Efficient use of phase information in audio encoding and decoding |
| US8311810B2 (en) * | 2008-07-29 | 2012-11-13 | Panasonic Corporation | Reduced delay spatial coding and decoding apparatus and teleconferencing system |
| EP2304975B1 (en) * | 2008-07-31 | 2014-08-27 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Signal generation for binaural signals |
| EP2351024A1 (en) | 2008-10-01 | 2011-08-03 | GVBB Holdings S.A.R.L | Decoding apparatus, decoding method, encoding apparatus, encoding method, and editing apparatus |
| EP2175670A1 (en) * | 2008-10-07 | 2010-04-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Binaural rendering of a multi-channel audio signal |
| US8892450B2 (en) | 2008-10-29 | 2014-11-18 | Dolby International Ab | Signal clipping protection using pre-existing audio gain metadata |
| EP2214161A1 (en) * | 2009-01-28 | 2010-08-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus, method and computer program for upmixing a downmix audio signal |
| US12002476B2 (en) | 2010-07-19 | 2024-06-04 | Dolby International Ab | Processing of audio signals during high frequency reconstruction |
| CA3027803C (en) | 2010-07-19 | 2020-04-07 | Dolby International Ab | Processing of audio signals during high frequency reconstruction |
| US8675719B2 (en) * | 2010-09-28 | 2014-03-18 | Tektronix, Inc. | Multi-domain test and measurement instrument |
| BR112015002228B1 (en) * | 2012-08-03 | 2021-12-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | DECODER AND METHOD FOR A PARAMETRIC CONCEPT OF SPATIAL AUDIO OBJECT ENCODING GENERALIZED FOR MULTI-CHANNEL DOWNMIX/UPMIX BOXES |
| US9607624B2 (en) * | 2013-03-29 | 2017-03-28 | Apple Inc. | Metadata driven dynamic range control |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4132859A (en) * | 1977-12-02 | 1979-01-02 | Egils Ranga | Sound reproducing apparatus |
| US4382157A (en) * | 1978-07-17 | 1983-05-03 | Kenneth P. Wert, Sr. | Multiple speaker type sound producing system |
| US5265166A (en) * | 1991-10-30 | 1993-11-23 | Panor Corp. | Multi-channel sound simulation system |
| DE4135977C2 (en) * | 1991-10-31 | 1996-07-18 | Fraunhofer Ges Forschung | Method for the simultaneous transmission of signals from N signal sources |
| RU2193827C2 (en) * | 1997-11-14 | 2002-11-27 | В. Вейвс (Сша) Инк. | Post-amplifying stereo-to-ambient sound decoding circuit |
2005
- 2005-02-23 CN CN2005800070361A (patent CN1930914B): Expired - Lifetime
- 2005-02-23 RU RU2006134979/09A (patent RU2323551C1): Active
Also Published As
| Publication number | Publication date |
|---|---|
| RU2323551C1 (en) | 2008-04-27 |
| CN1930914A (en) | 2007-03-14 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1101634; Country of ref document: HK |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1101634; Country of ref document: HK |
| | CX01 | Expiry of patent term | Granted publication date: 20120627 |