KR100932790B1

KR100932790B1 - Multitrack Downmixing Device Using Correlation Between Sound Sources and Its Method

Info

Publication number: KR100932790B1
Application number: KR1020080036085A
Authority: KR
Inventors: 장대영; 백승권; 김민제; 강경옥; 홍진우
Original assignee: 한국전자통신연구원
Priority date: 2007-12-18
Filing date: 2008-04-18
Publication date: 2009-12-21
Anticipated expiration: 2028-04-18
Also published as: KR20090066186A

Abstract

The present invention relates to a multitrack downmixing apparatus and method using cross-correlation between sound sources. Individual sound source signals are downmixed into a multitrack signal in which each track combines sources with high mutual cross-correlation values, so that the individual sound sources can be restored more faithfully from the downmixed multitrack signal.

To this end, the multitrack downmixing apparatus of the present invention comprises: signal conversion means for converting individual sound source signals into frequency bands; spatial information calculating means for calculating spatial information between the sound sources from the converted individual sound source signals; mixing combination determination means for determining mixing combination information according to cross-correlation values, using the calculated spatial information between the sound sources; and multitrack downmixing means for downmixing the converted individual sound source signals into a multitrack signal according to the determined mixing combination information.

Cross-correlation, multitrack, individual sound source signal, spatial information, mixing combination, closed loop, mixing matrix information, multitrack downmixing

Description

Multitrack downmixing apparatus and its method using cross-correlation between sound sources {APPARATUS AND METHOD OF MULTI-TRACK DOWN-MIXING USING CROSS CORRELATION BETWEEN VOICE SOURCE}

The present invention relates to a multitrack downmixing apparatus and method using cross-correlation between sound sources, and more particularly to an apparatus and method that downmix individual sound source signals into a multitrack signal in which each track combines sources with high cross-correlation values, so that the individual sound sources can be restored more faithfully from the downmixed multitrack signal.

The present invention is derived from research conducted as part of the IT Growth Engine Technology Development Program of the Ministry of Information and Communication and the Institute for Information Technology Advancement [Project No.: 2007-S-004-01, Project title: Development of glasses-free personal 3D broadcasting technology].

Once completed, today's audio content allows no interaction beyond very limited adjustments such as volume control and timbre conversion. These adjustments act on the signal as a whole; adjusting the individual sound sources contained in the audio content is currently impossible.

Recently, interactive music services have been introduced that treat the individual instruments and the singer's voice in a piece of music as separate individual sound sources, allowing the user to adjust each source in various ways (for example, selection and level adjustment). However, because such a service must handle each individual sound source separately, it requires large transmission and storage capacity.

To address this problem, the conventional multichannel audio coding technology known as "MPEG Surround" and the object-based audio coding technology SAOC (Spatial Audio Object Coding) perform compression using a mono or multichannel downmix.

Looking at conventional audio downmixing/upmixing in more detail, conventional downmixing analyzes the multichannel signal or the multiple object sound sources using binaural cues (for example, interaural level difference, interaural time delay, and interaural cross-correlation). Based on the analysis, it downmixes the multichannel signal or the multiple object sound sources into a single signal and transmits that signal together with the binaural cues.

In conventional audio upmixing, the user terminal then restores the original multichannel signal or object sound sources from the downmixed signal using the transmitted binaural cues.

In short, conventional audio downmixing/upmixing mixes multiple object sound sources into a single downmix signal and then restores them using the binaural cues. The binaural cues are calculated for each band obtained by splitting the signal with a nonlinear filter bank that reflects auditory characteristics. Decorrelating the mixed signal within each band according to the cross-correlation mitigates the tendency of the sound image to collapse toward the center.

However, because conventional audio downmixing/upmixing must restore the original sound from the downmix signal, the sound quality of the restored individual channels or sound source signals is degraded, making it impossible to offer them as an independent service.

That is, in the prior art described above, errors arise in the process of restoring the original sound from the downmix signal, so the sound quality of the restored individual channels or sound source signals is degraded and independent service is impossible. Solving this problem is the object of the present invention.

Accordingly, an object of the present invention is to provide a multitrack downmixing apparatus and method using cross-correlation between sound sources, in which individual sound source signals are downmixed into a multitrack signal whose tracks combine sources with high cross-correlation values, so that the individual sound sources can be restored more faithfully from the downmixed multitrack signal.

The objects of the present invention are not limited to the object mentioned above. Other objects and advantages of the present invention that are not mentioned can be understood from the following description and will become more apparent from the embodiments of the present invention. It will also be readily appreciated that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

To solve the above problem, the present invention is characterized in that individual sound source signals are downmixed, using the cross-correlation between the sound sources, into a multitrack signal whose tracks combine sources with high cross-correlation values.

More specifically, the present invention provides a multitrack downmixing apparatus comprising: signal conversion means for converting individual sound source signals into frequency bands; spatial information calculating means for calculating spatial information between the sound sources from the converted individual sound source signals; mixing combination determination means for determining mixing combination information according to cross-correlation values, using the calculated spatial information between the sound sources; and multitrack downmixing means for downmixing the converted individual sound source signals into a multitrack signal according to the determined mixing combination information.

The present invention also provides a multitrack downmixing method comprising: a signal conversion step of converting individual sound source signals into frequency bands; a spatial information calculating step of calculating spatial information between the sound sources from the converted individual sound source signals; a mixing combination determination step of determining mixing combination information according to cross-correlation values, using the calculated spatial information between the sound sources; and a multitrack downmixing step of downmixing the converted individual sound source signals into a multitrack signal according to the determined mixing combination information.

By downmixing individual sound source signals into a multitrack signal whose tracks combine sources with high cross-correlation values, the present invention allows the individual sound sources to be restored more faithfully from the downmixed multitrack signal. That is, compared with downmixing all of the sound sources into a single signal, combining sources with high cross-correlation values and downmixing them onto several tracks allows the mixed sources to be separated more faithfully.

Therefore, when an arbitrary sound source is removed or played back alone under user control, the present invention minimizes the leakage of other sound sources into it, and it requires less channel capacity than transmitting each individual sound source separately.

The above objects, features, and advantages will become more apparent from the following detailed description given with reference to the accompanying drawings, so that a person of ordinary skill in the art to which the present invention pertains can readily practice its technical idea. In describing the present invention, detailed descriptions of related known technology are omitted where they would unnecessarily obscure the gist of the invention. Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram of an embodiment of a multitrack downmixing apparatus using cross-correlation between sound sources according to the present invention.

As shown in FIG. 1, the multitrack downmixing apparatus 100 according to the present invention includes a signal converter 110, a spatial information calculator 120, a mixing combination determiner 130, and a multitrack downmixer 140. The signal converter 110 includes a plurality of filter banks 111.

The multitrack downmixing apparatus 100 according to the present invention receives a plurality of individual sound source signals and performs multitrack downmixing using the cross-correlation between the sound source signals in each frequency band. The purpose is multitrack audio compression. Here, an individual sound source is defined as a basic unit that produces sound, and it may also serve as an object sound source in an interactive music service. Where necessary, an individual sound source signal may be a sound signal recorded with a single microphone. An individual sound source signal may also be a mono or stereo sound signal generated by mixing some of the individual sound source signals.

In detail, the signal converter 110 receives a plurality of individual sound source signals (for example, sound source signal 1, sound source signal 2, through sound source signal n). Each individual sound source signal is converted by the filter bank 111 that corresponds to that source. That is, each of the filter banks 111 converts the individual sound source signal it receives into a plurality of frequency bands. The entire frequency bandwidth (for example, 20 kHz) is divided into a plurality of subbands, and each individual sound source signal is converted into frequency-band signals for the respective subbands.

Each filter bank 111 corresponds to one of the individual sound source signals and converts that signal. For example, a filter bank 111 receives sound source 1 and converts it into a plurality of frequency-band signals, one for each frequency bandwidth. That is, the filter bank 111 converts sound source 1 into a plurality of subband signals covering the entire frequency band (for example, 20 kHz).
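The subband conversion described above can be illustrated with a minimal sketch. The patent does not specify the filter bank design, so the sketch below assumes a plain DFT whose bins are grouped into uniform bands; the function names `dft`, `split_into_bands`, and `band_energies` are hypothetical and chosen only for illustration.

```python
import cmath

def dft(signal):
    """Naive DFT of a real signal, returning bins 0..N/2 (assumed analysis transform)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n // 2 + 1)
    ]

def split_into_bands(signal, n_bands):
    """Group DFT bins into n_bands contiguous, roughly equal-width subbands."""
    bins = dft(signal)
    edges = [i * len(bins) // n_bands for i in range(n_bands + 1)]
    return [bins[edges[i]:edges[i + 1]] for i in range(n_bands)]

def band_energies(signal, n_bands):
    """Energy of each subband, useful for checking where a source is active."""
    return [sum(abs(c) ** 2 for c in band) for band in split_into_bands(signal, n_bands)]
```

A real system would more likely use a perceptually motivated (nonuniform) filter bank; uniform bands are used here only to keep the sketch short.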

The spatial information calculator 120 analyzes the frequency-band signals produced by the signal converter 110 and calculates spatial information for each sound source in the corresponding bandwidth. The spatial information consists of the level difference values and cross-correlation values between the sound sources. That is, the spatial information calculator 120 calculates, for each sound source, the inter-source level difference values and cross-correlation values as its spatial information. For a stereo sound source, the spatial information calculator 120 obtains the cross-correlation values for both channels and takes their average as the final cross-correlation value.

Here, the cross-correlation value lies between 0 and 1. A value close to 0 indicates weak cross-correlation between the two sound sources, and a value close to 1 indicates strong cross-correlation. The inter-source level difference value is the difference in level between sound sources, which allows the multiple sound source signals in a given frequency band to be compared.
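As a rough illustration of the two quantities, the sketch below computes a level difference in dB and a cross-correlation value in the range 0 to 1 for two band signals. The patent does not give formulas, so the zero-lag normalized correlation, the dB energy ratio, and the way a stereo source is averaged against another source are all assumptions; the function names are hypothetical.

```python
import math

def band_energy(x):
    """Sum of squared samples of one band signal."""
    return sum(s * s for s in x)

def level_difference_db(x, y, eps=1e-12):
    """Level of x relative to y in dB (assumed definition: energy ratio)."""
    return 10.0 * math.log10((band_energy(x) + eps) / (band_energy(y) + eps))

def cross_correlation(x, y, eps=1e-12):
    """Absolute zero-lag normalized correlation, so the result lies in [0, 1]."""
    num = abs(sum(a * b for a, b in zip(x, y)))
    den = math.sqrt(band_energy(x) * band_energy(y)) + eps
    return num / den

def stereo_cross_correlation(stereo, other):
    """Average the correlation of each stereo channel with another source (assumed reading)."""
    left, right = stereo
    return 0.5 * (cross_correlation(left, other) + cross_correlation(right, other))
```

Taking the absolute value is one simple way to keep the result between 0 and 1 as the text requires; other normalizations would also satisfy that range.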

The mixing combination determiner 130 uses the spatial information calculated by the spatial information calculator 120, that is, the inter-source level difference values and cross-correlation values, to select highly cross-correlated sound sources and determine the mixing combination information. The mixing combination information specifies which of the sound sources are to be mixed together. Highly cross-correlated sources are chosen as a mixing combination so that, when the multitrack downmixed signal is upmixed, the sources can be separated without errors between them and without degrading the sound quality of the sound source signals.

The mixing combination determiner 130 groups the sound sources into a plurality of groups according to the magnitude of the cross-correlation values between them. These groups correspond to the tracks of the multitrack signal. The mixing combination determiner 130 groups sources in descending order of cross-correlation and thereby determines the source composition of each track to be downmixed, that is, the mixing combination.

The mixing combination determiner 130 processes mono and stereo sound sources separately, according to the channel format of the input source. A combination of stereo sources is downmixed to a stereo channel, while mono sources are mixed to a mono or stereo channel by the mixing matrix. When mono sources are downmixed to stereo, the mixing matrix contains the panning information between the two channels.
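To make the role of panning information in the mixing matrix concrete, here is a minimal sketch of mono sources being downmixed to a stereo track. The constant-power pan law and the `pan` parameter (0 for full left, 1 for full right) are assumptions for illustration; the patent only states that the mixing matrix carries the panning information.

```python
import math

def pan_gains(pan):
    """Constant-power (gain_left, gain_right) for pan in [0, 1] (assumed pan law)."""
    theta = pan * math.pi / 2.0
    return math.cos(theta), math.sin(theta)

def downmix_to_stereo(sources, pans):
    """Mix equal-length mono sources into one stereo track.

    Returns (left, right, matrix), where matrix holds one (gL, gR) row per
    source and stands in for the mixing matrix sent as side information.
    """
    matrix = [pan_gains(p) for p in pans]
    n = len(sources[0])
    left = [sum(g[0] * s[t] for g, s in zip(matrix, sources)) for t in range(n)]
    right = [sum(g[1] * s[t] for g, s in zip(matrix, sources)) for t in range(n)]
    return left, right, matrix
```

For example, `pan_gains(0.5)` gives equal left and right gains whose squares sum to 1, so a centered source keeps its overall power.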

The multitrack downmixer 140 receives the sound source signals converted by the signal converter 110 and downmixes them into a multitrack signal according to the mixing combination information determined by the mixing combination determiner 130 and, optionally, mixing matrix information supplied from outside. The mixing matrix information is supplementary information in the present invention; it is either input by the user or predetermined in accordance with various standards.

The multitrack downmixer 140 stores or transmits the downmixed multitrack signal together with side information (for example, level differences, cross-correlation values, mixing combination information, and mixing matrix information). The multitrack downmixer 140 may use a conventional compression codec as needed.
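A minimal sketch of this downmix-and-package step is shown below. It assumes a plain sample-wise sum within each group and a dictionary as the side-information container; both are illustrative choices, and the names `downmix_tracks` and `make_side_info` are hypothetical.

```python
def downmix_tracks(sources, combination):
    """Sum the sources listed in each group of `combination` into one track.

    sources: list of equal-length sample lists (one band signal per source).
    combination: list of groups, each a list of source indices.
    """
    n = len(sources[0])
    return [[sum(sources[i][t] for i in group) for t in range(n)]
            for group in combination]

def make_side_info(combination, level_diffs_db, correlations, mixing_matrix=None):
    """Bundle the side information that accompanies the downmixed tracks."""
    return {
        "mixing_combination": combination,
        "level_differences_db": level_diffs_db,
        "cross_correlations": correlations,
        "mixing_matrix": mixing_matrix,  # optional, as in the text
    }
```

For instance, with three sources and the combination `[[0, 1], [2]]`, the first track is the sum of sources 0 and 1 and the second track is source 2 unchanged.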

FIG. 2 is a block diagram of an embodiment of a multitrack upmixing apparatus according to the present invention.

As shown in FIG. 2, the multitrack upmixing apparatus 200 according to the present invention includes a multitrack upmixer 210 and a signal converter 220.

The multitrack signal downmixed by the multitrack downmixing apparatus 100 is delivered to the multitrack upmixing apparatus 200 through a medium. The multitrack upmixing apparatus 200 may be included in a user terminal.

In detail, the multitrack upmixer 210 receives the downmixed multitrack signal and the side information, and upmixes the downmixed multitrack signal using the received side information. That is, the multitrack upmixer 210 performs multitrack upmixing using the inter-source level difference values, the mixing combination information, and the mixing matrix information contained in the side information.

The signal converter 220 converts the signal upmixed by the multitrack upmixer 210 and restores the original individual sound source signals (for example, sound source 1, sound source 2, through sound source n). Here, the sound source signal in each band can be restored by applying, to the transmitted downmixed multitrack signal, a gain derived from the level differences between the sound sources.

Each track signal transmitted from the multitrack downmixing apparatus 100 was formed in the downmixing apparatus 100 by combining and downmixing sound source signals with large cross-correlation values. The multitrack upmixing apparatus 200 can therefore restore these highly similar sound source signals by applying only the gain of each sound source.
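The gain-only restoration can be sketched as follows. The sketch assumes that the sources in a track are in phase and strongly correlated, that the track is their plain sum, and that the level differences are given in dB relative to a reference source; under those assumptions each source is the track scaled by its share of the total amplitude. The function names are hypothetical, and the patent does not state this exact formula.

```python
def restoration_gains(level_diffs_db):
    """Per-source gains from level differences (dB relative to a reference source).

    Assumes in-phase, highly correlated sources, so amplitudes add in the
    downmix and each source's gain is its share of the summed amplitude.
    """
    amps = [10.0 ** (ld / 20.0) for ld in level_diffs_db]
    total = sum(amps)
    return [a / total for a in amps]

def restore_sources(track, level_diffs_db):
    """Estimate each source in one band by scaling the downmixed track."""
    return [[g * x for x in track] for g in restoration_gains(level_diffs_db)]
```

When the grouped sources are exact scaled copies of one another, this estimate is exact, which illustrates why grouping highly correlated sources makes gain-only restoration work well. For weakly correlated sources the same gains would leave other sources leaking into each estimate.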

In addition, the multitrack upmixing apparatus 200 may apply a decorrelation process between the sound sources to adjust their cross-correlation to match the original cross-correlation values, thereby preventing the direction of the restored sound sources from changing.

FIG. 3 is a diagram illustrating an embodiment of a method of grouping according to the magnitude of the cross-correlation values.

As described above, the spatial information calculator 120 calculates the cross-correlation values between the sound sources. FIG. 3 shows an example of such cross-correlation between sound sources.

As shown in FIG. 3, the sound sources (sound source 1 to sound source 6) 301 to 306 are arranged on a circle. The rank order (for example, 1, 2, ..., 15) of the cross-correlation values between the sound sources is indicated. These numbers are not the cross-correlation values themselves but their rank by magnitude.

The sound sources are grouped into tracks according to their cross-correlation values as follows: connections are made between sources in descending order of cross-correlation, and the sources that form a closed loop are downmixed into a single track.

Grouping ends when the cross-correlation value falls below a predetermined threshold (for example, 0.5). The remaining sound sources are then combined two at a time into the pairs with the largest cross-correlation values, and each pair is downmixed into one track. A single sound source left over at the end is transmitted as its own track without downmixing.

Through this grouping process, the sound sources (sound source 1 to sound source 6) 301 to 306 are grouped into a first track consisting of three sound sources, a second track consisting of two sound sources, and a third track consisting of one sound source. That is, three downmix tracks are generated.

FIG. 4 is a flowchart of an embodiment of a multitrack downmixing method using cross-correlation between sound sources according to the present invention.

FIG. 4 shows the processing procedure of the mixing combination determiner 130 in detail.

First, the signal converter 110 receives a plurality of individual sound source signals (for example, sound source signal 1, sound source signal 2, through sound source signal n) and converts each into a plurality of frequency bands (402). The entire frequency bandwidth (for example, 20 kHz) is divided into a plurality of subbands, and each individual sound source signal is converted into frequency-band signals for the respective subbands.

Next, the spatial information calculator 120 analyzes the frequency-band signals converted by the signal converter 110 and calculates spatial information for each sound source in the corresponding bandwidth (404). The spatial information consists of the level difference values and cross-correlation values between the sound sources. That is, the spatial information calculator 120 calculates, for each sound source, the inter-source level differences and cross-correlations as its spatial information.

Next, the mixing combination determiner 130 receives, for each sound source and each band, the cross-correlation values from the spatial information calculated by the spatial information calculator 120, and searches for the sound source pair with the largest cross-correlation value (406).

The mixing combination determiner 130 then compares the cross-correlation value of that pair with a threshold (for example, 0.5) to check whether the largest cross-correlation value is below the threshold (408).

If the check (408) finds that the cross-correlation value is at or above the threshold, the mixing combination determiner 130 checks whether the connection between the sound sources with the largest cross-correlation value is connected to a closed loop (410). The purpose is to determine whether the sound source signals joined into one track form a closed loop as the sources are connected in descending order of cross-correlation.

If the check (410) finds that the connection between the sound sources with the largest cross-correlation value is connected to an already registered closed loop, that track is already full, so the mixing combination determiner 130 discards the connection and returns to step 406, where it searches for the largest of the remaining cross-correlation values between sound sources. If, on the other hand, the connection is not connected to a registered closed loop, the mixing combination determiner 130 checks whether it is connected to an already registered track that is not a closed loop (412).

If the check (412) shows that the connection is connected to a registered track, the mixing combination determination unit 130 adds the current sound source to that track's configuration (414) and then checks whether the updated track configuration forms a closed loop (418). If the connection is not connected to a registered track, the mixing combination determination unit 130 registers a new track (416) and repeats the procedure from step "406".

If the check (418) shows that the updated track configuration forms a closed loop, the mixing combination determination unit 130 registers the closed loop (420) and repeats the procedure from step "406". If the updated track configuration does not form a closed loop, the mixing combination determination unit 130 likewise repeats the procedure from step "406". In other words, when the connection is connected neither to a closed loop nor to an already registered track, the mixing combination determination unit 130 registers a new track, adds the current sound source to that track's configuration, and repeats the procedure from step "406".

Meanwhile, if the check (408) shows that the largest cross-correlation value is below the threshold, the mixing combination determination unit 130 determines the mixing combination and generates mixing combination information according to the determined mixing combination (422).
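The grouping flow of steps 406 to 422 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `group_sources`, the dictionary layout of `corr`, and the example source names are hypothetical. The sketch assumes that a track becomes a closed loop when a new connection joins two sources already on the same track, and that leftover sources are combined into one final track once the threshold stops the search. The text does not say what happens when a connection joins two different open tracks; the sketch merges them, which is purely an assumption.

```python
def group_sources(corr, sources, threshold=0.5):
    """Group sources into tracks by descending cross-correlation.

    corr:    dict mapping (source_a, source_b) -> cross-correlation value
    sources: list of all source names
    Returns a list of tracks, each a list of source names.
    """
    tracks = []    # each track: {"members": [...], "closed": bool}
    track_of = {}  # source name -> the track it currently belongs to

    # Step 406: visit source pairs from the largest correlation downward.
    for (a, b), value in sorted(corr.items(), key=lambda kv: kv[1], reverse=True):
        if value < threshold:  # step 408: stop below the threshold
            break
        ta, tb = track_of.get(a), track_of.get(b)
        # Step 410: a connection touching a registered closed loop is dropped.
        if (ta is not None and ta["closed"]) or (tb is not None and tb["closed"]):
            continue
        if ta is None and tb is None:
            # Step 416: register a new track holding both sources.
            track = {"members": [a, b], "closed": False}
            tracks.append(track)
            track_of[a] = track_of[b] = track
        elif ta is tb:
            # Steps 418/420: both ends already share a track, so the
            # connection closes the loop and the track is marked full.
            ta["closed"] = True
        elif ta is not None and tb is not None:
            # ASSUMPTION (not in the source): merge two open tracks.
            ta["members"].extend(tb["members"])
            for name in tb["members"]:
                track_of[name] = ta
            tracks[:] = [t for t in tracks if t is not tb]
        else:
            # Step 414: add the unassigned source to the existing track.
            track, new = (ta, b) if ta is not None else (tb, a)
            track["members"].append(new)
            track_of[new] = track

    # Step 422: finalize; ungrouped sources share one remaining track.
    groups = [t["members"] for t in tracks]
    leftovers = [s for s in sources if s not in track_of]
    if leftovers:
        groups.append(leftovers)
    return groups


sources = ["vocal", "guitar", "bass", "drums", "piano", "flute"]
corr = {
    ("vocal", "guitar"): 0.9,
    ("guitar", "bass"): 0.8,
    ("vocal", "bass"): 0.7,   # closes the vocal-guitar-bass loop
    ("bass", "drums"): 0.65,  # dropped: touches a closed loop
    ("drums", "piano"): 0.6,
    ("vocal", "piano"): 0.3,  # below threshold: search stops here
}
print(group_sources(corr, sources))
```

With these illustrative values, the three mutually correlated sources end up on one track, `drums` and `piano` on a second, and the uncorrelated `flute` on a leftover track.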

The multitrack downmixing unit 140 then receives the sound source signals converted by the signal conversion unit 110 and downmixes them into a multitrack signal according to the mixing combination information determined by the mixing combination determination unit 130 and, additionally, mixing matrix information (424).
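The downmixing of step 424 can be sketched as follows, under stated assumptions. The function `downmix`, the per-source `gains` dictionary, and the sample coefficients are hypothetical. The source does not specify the form of the mixing matrix information, so the sketch reduces it to one scalar gain per source. Each track is simply the weighted sum of the frequency-domain coefficients of the sources assigned to it.

```python
def downmix(signals, groups, gains=None):
    """Sum grouped sources into tracks.

    signals: dict mapping source name -> list of frequency-domain coefficients
             (all lists are assumed to have the same length)
    groups:  list of tracks, each a list of source names (mixing combination)
    gains:   optional dict mapping source name -> gain; a simplified,
             hypothetical stand-in for the mixing matrix information
    Returns one coefficient list per track.
    """
    gains = gains or {}
    tracks = []
    for group in groups:
        n_bins = len(signals[group[0]])
        track = [0.0] * n_bins
        for name in group:
            gain = gains.get(name, 1.0)  # unity gain when none is given
            for k, coeff in enumerate(signals[name]):
                track[k] += gain * coeff
        tracks.append(track)
    return tracks


signals = {
    "vocal": [1.0, 2.0],
    "guitar": [0.5, 0.5],
    "drums": [3.0, 0.0],
}
groups = [["vocal", "guitar"], ["drums"]]
print(downmix(signals, groups))
print(downmix(signals, groups, gains={"guitar": 2.0}))
```

Keeping the grouping (`groups`) separate from the gains mirrors the text, where the mixing combination information is required and the mixing matrix information is an additional input.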

Meanwhile, the method of the present invention as described above can be written as a computer program. The code and code segments constituting the program can be readily inferred by a computer programmer skilled in the art. The program is stored in a computer-readable recording medium (information storage medium) and is read and executed by a computer to implement the method of the present invention. The recording medium includes all types of computer-readable recording media.

The present invention described above is not limited by the foregoing embodiments and the accompanying drawings, since those of ordinary skill in the art to which the present invention pertains may make various substitutions, modifications, and changes without departing from the technical spirit of the present invention.

FIG. 1 is a block diagram of an embodiment of a multitrack downmixing apparatus using cross-correlation between sound sources according to the present invention;

FIG. 2 is a block diagram of an embodiment of a multitrack upmixing apparatus according to the present invention;

FIG. 3 is an explanatory diagram of an embodiment of a method of grouping according to the magnitude of cross-correlation values;

* Explanation of reference numerals for the main parts of the drawings

100: multitrack downmixing apparatus 110: signal conversion unit

120: spatial information calculation unit 130: mixing combination determination unit

140: multitrack downmixing unit 200: multitrack upmixing apparatus

210: multitrack upmixing unit 220: signal conversion unit

Claims

A multitrack downmixing apparatus using cross-correlation between sound sources, comprising:

signal conversion means for converting individual sound source signals into a frequency band;

spatial information calculating means for calculating spatial information between the respective sound sources from the converted individual sound source signals;

mixing combination determination means for determining mixing combination information according to cross-correlation values using the calculated spatial information between the respective sound sources; and

multitrack downmixing means for downmixing the converted individual sound source signals into a multitrack signal according to the determined mixing combination information.

The apparatus of claim 1, wherein the mixing combination determination means groups sound sources having high cross-correlation values using the calculated spatial information between the respective sound sources.

The apparatus of claim 2, wherein the mixing combination determination means connects the sound sources in descending order of cross-correlation value and determines the sound sources that form a closed loop to be the sound sources grouped into one track.

The apparatus of claim 3, wherein the mixing combination determination means combines the remaining ungrouped sound sources when the cross-correlation value is less than a predetermined threshold.

The apparatus of any one of claims 1 to 4, wherein the multitrack downmixing means downmixes the converted individual sound source signals into a multitrack signal according to the determined mixing combination information and external mixing matrix information.

The apparatus of any one of claims 1 to 4, wherein the spatial information calculating means calculates an average cross-correlation value over the left and right channels when the converted individual sound source signal has stereo channels.

A multitrack downmixing method using cross-correlation between sound sources, comprising:

a signal conversion step of converting individual sound source signals into a frequency band;

a spatial information calculating step of calculating spatial information between the respective sound sources from the converted individual sound source signals;

a mixing combination determination step of determining mixing combination information according to cross-correlation values using the calculated spatial information between the respective sound sources; and

a multitrack downmixing step of downmixing the converted individual sound source signals into a multitrack signal according to the determined mixing combination information.

The method of claim 7, wherein, in the mixing combination determination step, sound sources having high cross-correlation values are grouped using the calculated spatial information between the respective sound sources.

The method of claim 8, wherein, in the mixing combination determination step, the sound sources are connected in descending order of cross-correlation value and the sound sources that form a closed loop are determined to be the sound sources grouped into one track.

The method of claim 9, wherein, in the mixing combination determination step, the remaining ungrouped sound sources are combined when the cross-correlation value is less than a predetermined threshold.