
CN104704855A - System and method for reducing latency in transposer-based virtual bass systems - Google Patents

System and method for reducing latency in transposer-based virtual bass systems

Info

Publication number
CN104704855A
Authority
CN
China
Prior art keywords
frequency
signal
time
cqmf
audio signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201380053450.0A
Other languages
Chinese (zh)
Other versions
CN104704855B (en)
Inventor
佩尔·埃克斯特兰德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dolby International AB
Original Assignee
Dolby International AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/652,023 (US8971551B2)
Application filed by Dolby International AB
Publication of CN104704855A
Application granted
Publication of CN104704855B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R3/04: Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038: Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00: Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03: Synergistic effects of band splitting and sub-band processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

A latency reduction system in a virtual bass processing system performs harmonic transposition on low frequency components of an audio signal to generate transposed data indicative of harmonics of the audio signal. The system uses a base transposition factor greater than two, and generates the harmonics in response to frequency-domain values determined by forward and inverse transform stages that use asymmetric analysis and synthesis windows. The system combines a virtual bass signal with the delayed wide band audio signal through analysis filter banks having filter coefficient truncated Nyquist filters. The virtual bass signal may lag the delayed wide band audio signal when combining with the audio signal to further reduce the latency caused by the harmonic transposition. The virtual bass input signal may be directly routed from a CQMF analysis filter bank of a preceding Hybrid filter bank stage, in order to avoid the delay associated with a Nyquist filter bank.

Description

System and method for reducing latency in transposer-based virtual bass systems
Cross-reference to related applications
This application claims priority to U.S. Patent Application No. 13/652,023, filed on October 15, 2012, which is incorporated herein by reference in its entirety.
Technical field
One or more embodiments relate generally to transform-based audio signal processing, and more specifically to reducing latency in transposer-based virtual bass synthesis systems.
Background
Bass synthesis refers to methods that add components to the low-frequency range of a signal in order to enhance the perceived bass. Among these methods, sub-bass synthesis techniques create low-frequency components below the existing partials of the signal, in order to extend and improve the lowest frequency range present in the audio content. Another approach uses a virtual pitch algorithm, which generates audible harmonics from an inaudible bass range (for example, low-pitched bass played over small loudspeakers), thereby making the harmonics, and ultimately the pitch, audible in order to improve the bass response.
Virtual bass synthesis is a virtual pitch method that raises the perceived level of bass content in audio played over small loudspeakers that cannot physically reproduce low-end bass frequencies. The method is based on the psychoacoustic observation of the "missing fundamental": the human auditory system can infer a low pitch from higher harmonics even when the fundamental frequency and the first harmonics themselves are missing. The basic approach is to analyze the bass frequencies present in the audio and to generate audible higher harmonics that aid perception of the missing lower frequencies. The key feature of virtual bass is that it enhances the perceived bass response on such devices by synthesizing higher harmonics of frequencies below the low-frequency roll-off (for example, below 150 Hz) of a device with small loudspeakers. After an energy adjustment, the inaudible signal components are transposed to higher, audible frequencies using multiple transposition factors (harmonics). Virtual bass synthesis can also increase the perceived bass for headphone playback or for playback over full-range loudspeakers. Figure 1A shows the frequency-amplitude spectrum of an audio signal having an inaudible range 10 of frequency components and an audible range of frequency components above the inaudible range. Harmonic transposition of the frequency components in the inaudible range 10 can generate transposed frequency components in a portion 11 of the audible range, which can enhance the perceived level of bass content of the audio signal during playback. Such harmonic transposition may include applying multiple transposition factors to each relevant frequency component of the input audio signal to generate multiple harmonics of that component.
In some audio processing systems that use a conventional virtual bass system, the latency or delay associated with the frequency transposition function is too large for certain applications. For example, a digital audio processing system with a latency of 1025 samples may use a conventional virtual bass system that adds another 3200 samples of latency. Assuming a sample frequency ($f_s$) of 48 kHz, this results in a total latency of more than 88 milliseconds. This amount of latency is often problematic, and may even be prohibitive for gaming and telecommunication applications, where a delay of about 100 milliseconds starts to become noticeable as audible signal lag.
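The 88 millisecond figure follows directly from the sample counts quoted above. The short Python sketch below reproduces the arithmetic; the function and variable names are illustrative only and do not come from the patent.

```python
def latency_ms(samples: int, sample_rate_hz: float) -> float:
    """Convert a latency expressed in samples to milliseconds."""
    return 1000.0 * samples / sample_rate_hz


base_latency = 1025          # latency of the surrounding audio system, in samples
virtual_bass_latency = 3200  # extra latency of the conventional virtual bass system
fs = 48_000                  # sample frequency in Hz

total_ms = latency_ms(base_latency + virtual_bass_latency, fs)
print(f"total latency: {total_ms:.2f} ms")
```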
The conventional transposer systems used in conventional virtual bass systems use symmetric time-domain windows for the analysis stage of the time-to-frequency transform and for the synthesis stage of the frequency-to-time transform, respectively. Figure 1B illustrates the delay associated with the symmetric windows used in conventional virtual bass systems, as known in the art. It shows graphically the delay imposed by a second-order transposer, that is, a transposer that generates the second harmonic. As shown in time plot 100, given an analysis window time stride $S_a$, the center of one of the symmetric analysis windows is chosen as the time-zero reference, and new input samples 104 can be added from time $t_0$ in the analysis stage 102. Time plot 110 shows the time-stretch duality of the transposer, in which $t_0$ is stretched to $2t_0$ in the synthesis stage 112.
For the example process shown in Figure 1B, the total analysis/synthesis chain delay $D_{ts}$ can be expressed as in Equation 1 below, where $L$ is the transposer window size and $S_a$ is the analysis time stride, or hop size:
$D_{ts} = L/2 + 2(L/2 - S_a) = 3L/2 - 2S_a$ (Equation 1)
In audio processing systems based on HQMF (hybrid quadrature mirror filter) banks, the input signal to the CQMF (complex quadrature mirror filter) analysis stage and the output signal from the CQMF synthesis stage typically have the same sample frequency $f_s$, which is usually set to 44.1 kHz or 48 kHz. Because the system typically processes only one CQMF signal from a 64-channel CQMF bank, the input sample rate for the virtual bass processing can be $f_s/64$. Note that CQMF sizes other than 64 channels may also be used. Because the combined transposition uses a base transposition factor of 2, the sample frequency of the transposed output from a conventional virtual bass processing system is $2f_s/64$, which produces a bandwidth expansion by a factor of 2. In a combined transposer, the base transposition factor is the factor for which source transform frequency bins (or bands) are mapped to target transform frequency bins (or bands) in a one-to-one relationship, that is, no interpolation or decimation is involved in the source-to-target bin mapping. The base transposition factor also governs the relationship between the time stride of the analysis window and the time stride of the synthesis window. More specifically, the synthesis time stride equals the analysis time stride multiplied by the base transposition factor. For the case of $L = 64$ and $S_a = 4$, the delay in output samples of a system based on a 64-channel CQMF becomes:
$D_{ts} = (3L/2 - 2S_a) \cdot 64/2 = 2816$ samples (Equation 2)
In addition to this delay, there is the delay of the Nyquist filter bank analysis stage that processes the two virtual bass output CQMF sub-band signals. This delay can be about 384 samples, giving a total delay of 2816 + 384 = 3200 samples for this example of a prior-art conventional virtual bass processing system.
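The delay budget above can be checked numerically. The Python sketch below evaluates Equation 1 for a second-order transposer, applies the 64/2 scaling of Equation 2, and adds the quoted Nyquist analysis delay. The function names are hypothetical, and the 384-sample figure is taken from the text as an approximate value.

```python
def transposer_delay(window_size: int, analysis_stride: int) -> int:
    """Equation 1: D_ts = L/2 + 2*(L/2 - S_a) = 3L/2 - 2*S_a (second-order transposer)."""
    return window_size // 2 + 2 * (window_size // 2 - analysis_stride)


def delay_in_output_samples(window_size: int, analysis_stride: int,
                            cqmf_channels: int = 64, base_factor: int = 2) -> int:
    """Equation 2: scale the Equation 1 delay by cqmf_channels / base_factor."""
    return transposer_delay(window_size, analysis_stride) * cqmf_channels // base_factor


transposer = delay_in_output_samples(window_size=64, analysis_stride=4)
nyquist_analysis = 384  # approximate Nyquist analysis delay quoted in the text
total = transposer + nyquist_analysis
print(transposer, total)  # 2816 3200
```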
One solution to the latency imposed by conventional virtual bass systems is to change the actual processing circuitry by using alternative components, such as harmonic oscillators, in place of the harmonic transposer. However, this potentially adds significant cost and complexity to the system, and also adversely affects audio quality.
The subject matter discussed in the background section should not be assumed to be prior art merely because it is mentioned in the background section. Similarly, a problem mentioned in the background section, or associated with its subject matter, should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which may themselves also be inventions.
Summary of the invention
Embodiments include a latency reduction system in a virtual bass processing system that performs harmonic transposition on low-frequency components of an audio signal to generate transposed data indicative of harmonics. The harmonic transposition process uses a base transposition factor greater than 2, and generates the harmonics in response to frequency-domain values determined by forward and inverse transform stages that use asymmetric analysis and synthesis windows. An enhanced audio signal is generated by combining the virtual bass signal with the delayed audio signal, using Nyquist analysis filter banks that include truncated prototype filters. When it is combined with the audio signal, the virtual bass signal may be allowed to lag the delayed audio signal by a defined time period, which further reduces the latency caused by the harmonic transposition process.
Embodiments include a method of reducing latency in a virtual bass generation system by performing harmonic transposition on low-frequency components of an input audio signal to generate transposed data indicative of harmonics, wherein the harmonic transposition uses an integer-valued base transposition factor greater than 2. The method generates the harmonics in response to frequency-domain values determined by a time-to-frequency-domain transform stage and a subsequent inverse frequency-to-time-domain transform stage, where both transforms use asymmetric analysis and synthesis windows. The input audio signal is a sub-band CQMF (complex-valued quadrature mirror filter) signal, and the samples of the input audio signal may be pre-processed to generate critically sampled audio indicative of the low-frequency components.
In one embodiment, the method processes the input audio signal through an analysis filter bank or transform to provide a set of analysis sub-band signals or frequency bins from the low-frequency components, computes a set of synthesis sub-band signals or frequency bins using the base transposition factor B and a transposition factor T, and processes the set of synthesis sub-band signals or frequency bins through a synthesis filter bank or transform to generate the high-frequency components. This represents the standard way of performing transposition: a forward FFT is performed before the nonlinear processing that includes the transform bin mapping, followed by an inverse FFT. The method may also include generating a virtual bass signal in response to the transposed data, and generating an enhanced audio signal by combining the virtual bass signal with the input audio signal, with one or more analysis filter banks applied to the virtual bass output signal, wherein the analysis filter banks include truncated prototype filters from which a defined number of filter coefficients have been removed. The method may further include lagging the virtual bass signal relative to the input audio signal by a predetermined time period: the virtual bass signal is combined with an input audio signal whose delay is shorter, by the predetermined time period, than the processing latency implied by the virtual bass system, so that the enhanced audio signal comprises time-lagged virtual bass processed sub-band samples combined with the delayed input sub-band samples.
According to some embodiments, the base transposition factor expands the input audio signal in the frequency domain by an amount matching the value of the base transposition factor to produce the transposed audio signal, and the base transposition factor may be an even integer value between 4 and 16. In one embodiment, the analysis filter banks operating on the transposer CQMF output sub-bands comprise an 8-channel Nyquist filter bank and a 4-channel Nyquist filter bank, and the defined number of removed prototype filter coefficients is 6. In another embodiment, the input CQMF signal is routed directly from the channel 0 output of a preceding CQMF analysis filter bank, thereby bypassing the subsequent Nyquist filter bank stage and avoiding the associated delay.
Embodiments of the method may also include performing a frequency-domain oversampled transform on the input audio signal to generate the low-frequency components, by generating windowed, zero-padded samples at a defined sample rate (using an analysis time stride). Because the virtual bass signal can be allowed to lag the wideband input audio signal by up to 20 ms without noticeable degradation of the enhanced audio signal, the predetermined time period may be selected from a range of 0 samples to 1000 samples when the virtual bass signal is combined with the delayed input audio signal. In one embodiment, the asymmetric analysis and synthesis windows are configured such that the longer portion of the analysis window extends toward past input samples and the longer portion of the synthesis window extends toward future output samples.
Embodiments are also directed to system or device elements configured to implement at least some of the methods described above.
Detailed description
Embodiments are described for systems and methods that reduce latency and algorithmic delay in transposer-based virtual bass systems. Such systems and methods use higher-order base transposition factors, low-latency asymmetric transform windows, truncated Nyquist prototype filters, a virtual bass signal that is time-lagged relative to the original audio signal, and bypassing of the Nyquist analysis filter bank in a preceding hybrid filter bank stage.
Throughout this disclosure, including the claims, the expression performing an operation "on" a signal or data (for example, filtering, scaling, transforming, or applying gain to the signal or data) is used in a broad sense to denote performing the operation directly on the signal or data, or on a processed version of the signal or data (for example, a version of the signal that has undergone preliminary filtering or pre-processing before the operation is performed). The expression "transposer" is used in a broad sense to denote an algorithmic unit or device that performs pitch shifting or time stretching of a real-valued or complex-valued input signal, for part or all of the available input signal spectrum. The expressions "transposer", "harmonic transposer", "phase vocoder", "frequency multiplier", and "harmonic generator" may be used interchangeably. The expression "system" is used in a broad sense to denote a device, system, or subsystem. For example, a subsystem that implements a decoder may be referred to as a decoder system, and a system including such a subsystem (for example, a system that generates X output signals in response to multiple inputs, in which the subsystem generates M of the inputs and the other X-M inputs are received from an external source) may also be referred to as a decoder system. The term "processor" is used in a broad sense to denote a system or device that is programmable or otherwise configurable (for example, with software or firmware) to perform operations on data (for example, audio, video, or other image data). Examples of processors include a field-programmable gate array (or other configurable integrated circuit or chipset), a digital signal processor programmed and/or otherwise configured to perform pipelined processing on audio or other sound data, a programmable general-purpose processor or computer, and a programmable microprocessor chip or chipset. The expressions "audio processor" and "audio processing unit" are used interchangeably, and in a broad sense denote a system configured to process audio data. Examples of audio processing units include, but are not limited to, encoders (for example, transcoders), decoders, vocoders, codecs, pre-processing systems, post-processing systems, and bitstream processing systems (sometimes referred to as bitstream processing tools).
Embodiments are directed to systems and methods for reducing virtual bass latency without requiring substantial changes to existing virtual bass processing components, such as the harmonic transposer used in a virtual bass processing system. Aspects of the virtual bass latency reduction systems and methods may be used in conjunction with a harmonic generator (transposer) in an audio codec (for example, a decoder). Aspects may also be used in conjunction with other transposers or phase vocoder systems, such as a conventional phase vocoder for general time stretching or pitch shifting of audio signals.
As generally shown in Figure 1A, virtual bass generation methods that use harmonic transposition transpose frequency components from an inaudible frequency range to an audible frequency range, in order to improve the playback of bass content on limited playback devices, such as small loudspeakers that cannot physically reproduce the missing lower frequencies. Embodiments of the virtual bass latency reduction systems and methods improve a virtual bass generation method that performs harmonic transposition on low-frequency components of an audio signal to generate transposed data indicative of harmonics expected to be audible during playback, generates a virtual bass signal in response to the transposed data, and generates an enhanced audio signal by combining the virtual bass signal with the (delayed) input audio signal. Typically, when the enhanced audio signal is played back over one or more loudspeakers that cannot physically reproduce the low-frequency components, it provides an increased perceived level of bass content.
The harmonic transposition performed by the virtual bass generation method uses combined transposition, in which a second-order transposer for each low-frequency component is combined with at least one higher-order transposer (typically a third-order and a fourth-order transposer, and optionally at least one further higher-order transposer) to generate the harmonics. All harmonics are thus generated in response to frequency-domain values determined by a common time-to-frequency-domain transform stage (for example, by performing phase multiplication or other phase operations on the frequency coefficients obtained from a single time-to-frequency-domain transform), followed by a common frequency-to-time-domain transform (in practice, the common frequency-to-time-domain transform is split into two smaller transforms to accommodate the bandwidth and sample frequency of the sub-bands of the CQMF framework).
Figure 2 is a block diagram of a virtual bass processing system that implements, or is used in conjunction with, certain latency reduction processes, according to some embodiments. In one embodiment, the virtual bass processing system 200 takes multiple complex-valued sub-band samples (HQMF samples) from a so-called hybrid filter bank as input 201 (input A). In one embodiment, a hybrid filter bank preceding the virtual bass processing has split the original time-domain audio input signal into such multiple hybrid sub-bands 201 (discussed in further detail below), which may be buffered by input buffer 206. The buffered input is then processed by a Nyquist synthesis filter bank 208, which performs a synthesis function to reconstruct a single complex-valued QMF (CQMF) domain signal 202 (signal C) representing the low-frequency audio content (for example, between 0 Hz and 375 Hz). In another embodiment, the virtual bass system includes a latency-saving mechanism that bypasses the Nyquist filter bank stage in the preceding hybrid filter bank. This allows the system to save the delay associated with the Nyquist analysis bank (for example, 384 samples) by feeding the CQMF channel 0 signal directly to the virtual bass module as input 203 (input B). As shown in Figure 2, one of the two inputs 202 or 203 is selected by a switch such as selector 204, and the selected signal constitutes the virtual bass input signal 205 (signal D), which is further processed by transposer 209.
A transposer (or phase vocoder) is typically the following combination: a time-to-frequency transform or filter bank, followed by a nonlinear stage (performing phase multiplication or phase shifting), followed by a frequency-to-time transform or filter bank. Thus, as shown in Figure 2, transposer 209 comprises a time-to-frequency transform stage 210, a nonlinear stage 212, and a frequency-to-time transform stage 214. The nonlinear stage 212 in transposer 209 is a processing block that modifies the phase and applies some gain (amplitude) control to the sub-bands or transform components of the signal. The transposed signal is then buffered by output buffer 216 and subsequently processed by a Nyquist analysis filter bank 218, which performs an analysis function that decomposes the virtual bass output CQMF signal into sub-bands corresponding to the hybrid sub-band (HQMF) samples of input signal 201. A delayed, unprocessed version 220 of the input A signal is mixed with the output of Nyquist filter bank 218 to produce an enhanced audio output signal 222, which comprises the virtual bass output signal added to the delayed input signal.
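The three-stage structure of the transposer can be illustrated with a deliberately simplified, single-frame Python sketch using NumPy. This is not the patented method: it assumes a plain FFT, applies only phase multiplication by one integer transposition factor `T`, and omits the bin mapping of combined transposition, overlap-add with a stretched synthesis stride, asymmetric windows, and gain control. The function name `transpose_frame` is hypothetical.

```python
import numpy as np


def transpose_frame(frame: np.ndarray, window: np.ndarray, T: int) -> np.ndarray:
    """Process one frame: forward FFT, phase multiplication by T, inverse FFT."""
    X = np.fft.fft(frame * window)                # time-to-frequency transform
    Y = np.abs(X) * np.exp(1j * T * np.angle(X))  # nonlinear stage: multiply each bin's phase by T
    return np.fft.ifft(Y)                         # frequency-to-time transform


# A constant complex frame with phase 0.3 rad only excites bin 0.
frame = np.exp(0.3j) * np.ones(16, dtype=complex)
out = transpose_frame(frame, np.ones(16), T=2)
print(np.angle(out[0]))  # phase is doubled to approximately 0.6 rad
```

Keeping the magnitude and multiplying only the phase by an integer is what lets such a stage avoid explicit phase unwrapping.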
Although embodiments may involve the use of Nyquist filter banks for certain functions, such as the synthesis 208 and analysis 218 stages, it should be noted that other types of filter banks, or frequency-splitting or partitioning circuits and techniques, may also be used. In other embodiments, the filter banks or frequency-splitting or partitioning circuits and techniques mentioned above may be absent.
Figures 3A to 3C are more detailed diagrams of the virtual bass processing system shown in Figure 2. Figure 3A shows a pre-processing hybrid filter bank stage 300, that is, a stage that is typically not part of the virtual bass system but optionally precedes it. A hybrid filter bank may be a combination of a CQMF bank and Nyquist filter banks, in which some of the lowest CQMF bands are processed by Nyquist filter banks of predetermined size to improve the frequency resolution of the low-frequency range. The combination of the low-frequency sub-band samples from the Nyquist analysis stage and the remaining CQMF channels is referred to as hybrid sub-band samples, or an HQMF (hybrid QMF) signal. As shown in Figure 3A, time-domain input signal 302 is input to a 64-channel CQMF analysis filter bank 304. In one embodiment, one output of this filter bank, CQMF channel 0 (denoted signal B) 306, is fed directly to the virtual bass module 330 of Figure 3C (this signal corresponds to input B 203 of Figure 2). Note that signal B 306 bypasses Nyquist analysis filter bank 307, thereby avoiding the associated delay. CQMF channels 0, 1, and 2 are also input to multiple Nyquist analysis filter banks 307 to 309. The outputs from the Nyquist analysis filter banks and the remaining CQMF sub-bands (3 to 63) produce hybrid sub-band samples 0 to 76 (denoted signal A) 310.
As shown in system 320 of Figure 3B, multiple complex-valued hybrid sub-band samples (signal A) 322 are input to a Nyquist synthesis filter bank stage 324. It is assumed that the virtual bass module 330 of Figure 3C is one of several modules in a system that operates on hybrid sub-band samples (HQMF samples). Signal A 310 of Figure 3A may therefore undergo processing by other modules after the pre-processing filter bank stage 300 and before becoming input A 322 of Figure 3B. In one example embodiment, the first 8 hybrid sub-bands are processed, that is, the sub-bands from the low-frequency 8-channel (8-ch) Nyquist filter bank 307 (which yields a signal bandwidth of roughly 344 Hz to 375 Hz, depending on the sample rate). Because the Nyquist filter banks, unlike the CQMF bank, are not downsampled, the Nyquist filter bank synthesis step is particularly simple: it is merely a summation of the sub-band samples for each CQMF (or HQMF) time slot. After summing the 8 lowest hybrid sub-band samples in stage 324, the system has reconstructed CQMF channel 0 as signal C 326, which becomes the input 332 of the virtual bass module 330 of Figure 3C.
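Because the Nyquist synthesis step reduces to a per-time-slot summation, it can be sketched in a few lines of Python with NumPy. The array layout (hybrid sub-bands along the first axis, time slots along the second) and the function name are assumptions made for illustration.

```python
import numpy as np


def nyquist_synthesis(hybrid: np.ndarray, num_bands: int = 8) -> np.ndarray:
    """Rebuild CQMF channel 0 by summing the lowest hybrid sub-bands in each time slot.

    hybrid: complex array of shape (num_hybrid_subbands, num_time_slots).
    """
    return hybrid[:num_bands].sum(axis=0)


# 77 hybrid sub-bands (0 to 76) and 5 time slots of dummy data.
hybrid = np.ones((77, 5), dtype=complex)
signal_c = nyquist_synthesis(hybrid)
print(signal_c.shape)  # (5,)
```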
Figure 3C shows a virtual bass system that implements, or is used in conjunction with, certain latency reduction processes, according to some embodiments. The virtual bass module 330 of Figure 3C takes signal D 332 as input. In an embodiment in which the preceding Nyquist analysis filter bank 307 is bypassed, signal D 332 may be routed from signal B 306 of Figure 3A. In another embodiment, signal D 332 may be fed from signal C 326 of the Nyquist synthesis stage 320 of Figure 3B. In both embodiments, signal D 332, the input signal to the virtual bass module, is a single complex-valued CQMF signal (that is, the first channel (channel 0) of a set of CQMF sub-band signals).
In a virtual bass application, an optional dynamics processing function may be performed by dynamics processor 336 to change the dynamics of the virtual bass input signal. Processor 336 can be used to reduce the level of weak bass and to preserve or enhance strong bass, that is, to act as an expander. This scheme follows the shape of the equal loudness contours (ELCs) in the bass range, where the contours are flatter over frequency for louder signals and steeper for weaker signals. Therefore, when harmonics are generated, weaker bass can be attenuated more than stronger bass in order to preserve the relative loudness between the fundamental components and the generated harmonics. The gain of dynamics processor 336 may be controlled by a running average energy signal, that is, the running average energy of a downmixed (mono) version of the first CQMF band signal 332.
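A running-average-energy expander of this kind might be sketched as follows in Python. The text does not specify the smoothing or the gain law, so the one-pole smoothing coefficient `alpha`, the reference energy `e_ref`, and the power-law gain with `exponent` are all assumptions chosen only to show the behavior (weak bass attenuated, strong bass preserved). The input is assumed to be an already downmixed (mono) complex CQMF signal.

```python
import numpy as np


def expander_gains(x: np.ndarray, alpha: float = 0.9,
                   e_ref: float = 1.0, exponent: float = 0.5) -> np.ndarray:
    """Return one gain per sample, driven by a running average of |x|^2."""
    energy = 0.0
    gains = np.empty(len(x))
    for n, sample in enumerate(x):
        # One-pole running average of the signal energy (assumed smoothing).
        energy = alpha * energy + (1.0 - alpha) * abs(sample) ** 2
        # Power-law expansion, capped at unity gain (assumed gain law).
        gains[n] = min(1.0, (energy / e_ref) ** exponent)
    return gains


loud = np.ones(200, dtype=complex)
weak = 0.1 * loud
print(expander_gains(loud)[-1], expander_gains(weak)[-1])
```

With these assumed settings, the loud signal settles near unity gain while the weak signal settles near a gain of 0.1.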
For an embodiment of system 330, before input to nonlinear processing block 344, a first windowing function using a window of size L (including zero padding up to length N), a forward FFT 340, and a modulation function 342 are applied to the (possibly dynamics-processed) CQMF signal. In embodiments of the invention, the window shape is asymmetric. In another embodiment, the transposer (comprising components 338 to 356) represents an improved phase vocoder that uses the same FFT analysis/synthesis chain as the base transposer to generate second-order, third-order, fourth-order, and possibly higher-order harmonics (transposition factors), using an interpolation technique referred to as "combined transposition". In general, such combined transposition saves computational complexity, although it compromises the quality of harmonics other than the base-order harmonic to some extent. When combined transposition is not used, at least the forward or the inverse transform needs to differ for different transposition factors. Nonlinear processing block 344 uses integer transposition factors, which makes certain phase estimation, phase unwrapping, and phase locking techniques redundant; these techniques are often unstable and inaccurate when used in many standard phase vocoders. In one embodiment, phase multiplier 344 uses a base transposition factor B greater than 2, such as 8 or any other suitable value.
Transposer 338 to 356 uses oversampling in the frequency domain (that is, zero-padded analysis and synthesis windows in blocks 338 and 356) to improve performance on impulsive (percussive) sounds, which are what it mainly handles in the bass frequency range. Without such oversampling, percussive drum sounds are likely to generate at least some pre-echo and post-echo artifacts, making the bass smeared and indistinct. In one embodiment, the oversampling factor F is chosen to be at least F = (B+1)/2, where B is the base transposition factor (for example, B = 8). This helps ensure that pre-echoes and post-echoes are suppressed for isolated transient sounds.
As shown in Fig. 3C, the transposer includes gain and slope compensation per FFT bin, applied by amplifier 346 after the phase multiplier circuit (nonlinear processing block 344). This allows the overall gain of the different transposition factors to be set independently. For example, the gains may be set to approximate certain equal loudness contours (ELCs). As an approximation, the ELCs can be modeled adequately by straight lines on a logarithmic scale for frequencies below 400 Hz. In this case, the odd-order harmonics (for example, the third and fifth) may be attenuated to a greater degree because, although they are very important to the resulting virtual bass effect, they are sometimes perceived as harsher than the even-order harmonics. Each transposed signal may additionally have a slope gain, that is, a roll-off attenuation factor measured in, for example, dB per octave. This attenuation is also applied per bin in the transform domain by amplifier 346.
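The per-bin gain and slope compensation described above can be illustrated with a minimal sketch. The function name `slope_gain`, the 100 Hz reference, and the -6 dB per octave slope are assumptions chosen only for illustration; the text does not state specific gain values.

```python
import numpy as np

def slope_gain(freqs_hz, ref_hz, overall_gain_db, slope_db_per_octave):
    """Linear per-bin gain: an overall gain plus a roll-off in dB per octave.

    Hypothetical helper; freqs_hz must be strictly positive.
    """
    freqs = np.asarray(freqs_hz, dtype=float)
    octaves = np.log2(freqs / ref_hz)                    # distance from the reference, in octaves
    gain_db = overall_gain_db + slope_db_per_octave * octaves
    return 10.0 ** (gain_db / 20.0)                      # dB to linear amplitude

# Assumed settings: 0 dB overall gain, -6 dB per octave, referenced to 100 Hz.
bin_freqs = np.array([100.0, 200.0, 400.0])
gains = slope_gain(bin_freqs, ref_hz=100.0, overall_gain_db=0.0, slope_db_per_octave=-6.0)
```

One gain vector of this kind could be built per transposition factor, so that, for instance, odd-order harmonics receive a lower overall gain than even-order ones.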
In a system not based on a hybrid filter bank, for example a time-domain system taking signal 302 of Fig. 3A as input, transposer 338 to 356 would operate directly on the time-domain signal at the full sampling rate (for example, 44.1 kHz or 48 kHz) and would then use an FFT size of roughly 4096 lines to provide adequate resolution in the low-frequency (bass) range. In one embodiment, however, all processing is performed on CQMF channel 0 subband samples (signal D 332 of system 330). This provides certain advantages over conventional practice, such as reduced computational complexity, because only the signal of interest, that is, a critically sampled (or maximally decimated) low-pass signal, is processed in the transposer. For example, with a fourth-order base transposer, the virtual bass system expands the bandwidth of the input signal by a factor of 4. Typically, the virtual bass system is not required to output signals with a bandwidth above roughly 500 Hz. This means that the first CQMF channel (channel 0), with a bandwidth of 375 Hz (at f_s = 48 kHz), is more than adequate for the virtual bass input, and the first two CQMF channels (channels 0 and 1) have sufficient bandwidth for the virtual bass output (750 Hz at f_s = 48 kHz). With CQMF channel 0 as input, the system can process complex-valued samples with an FFT of size 64 (4096/64) instead of a 4096-line transform. The reduction to 1/64 stems from the downsampling factor of the CQMF bank, which also equals the bandwidth reduction of one CQMF subband signal relative to the time-domain input signal. Because of the inherent bandwidth expansion, the output of the transposer needs to be converted to CQMF bands 0 and 1. This can be done approximately by splitting the 64-line FFT into four 16-line FFTs and then compensating for the CQMF prototype filter response in the transform domain before computing the inverse FFTs of the two 16-line FFTs that make up CQMF bands 0 and 1. Note that frequency-domain oversampling is not considered in the above example, since it would increase the forward and inverse transform sizes by the oversampling factor mentioned previously. In one implementation, the FFT spectrum may be split in module 348 of virtual bass module 330, and the CQMF filter response compensation may be performed by multiplier 350. In other embodiments, the CQMF filter response compensation may be applied to the full FFT spectrum (for example, the 64 lines in the example) before FFT splitting module 348.
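The sizing arithmetic in the paragraph above can be checked with a short sketch. The helper names are hypothetical, and the bank is assumed to divide the range 0 to f_s/2 uniformly into 64 bands, which is consistent with the 375 Hz figure quoted in the text.

```python
def cqmf_band_width_hz(fs_hz, num_bands=64):
    """Bandwidth of one band of a uniform bank covering 0..fs/2 (assumed layout)."""
    return fs_hz / 2.0 / num_bands

def subband_fft_size(full_rate_fft_size, num_bands=64):
    """FFT size needed on one subband to match the resolution of a full-rate FFT."""
    return full_rate_fft_size // num_bands

FS_HZ = 48_000
BASE_ORDER = 4                                       # fourth-order base transposer from the example

input_bw = cqmf_band_width_hz(FS_HZ)                 # one CQMF band (channel 0)
output_bw = 2 * input_bw                             # CQMF channels 0 and 1
fft_size = subband_fft_size(4096)                    # 4096 / 64
split_size = fft_size // BASE_ORDER                  # four smaller FFTs, one per output band
```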
As further shown in Fig. 3C, the output from CQMF filter response compensation block 350 is input to modulation step 352, followed by inverse FFT circuit 354 using a transform size of N/B points, and then by windowing and overlap/add step 356 using a window length of L/B. In an embodiment of the invention, the window shape is asymmetric. Modulation step 352 may also be applied before FFT splitting block 348 and CQMF filter response compensation block 350. The output from windowing and overlap/add circuit 356 consists of two CQMF signals that contain the virtual bass signal to be mixed with the delayed HQMF signal A 364. These two signals, however, first need to be filtered by the 8-channel and 4-channel Nyquist analysis filter banks 360, respectively, to conform to the hybrid domain. In an embodiment of the invention, Nyquist analysis filter banks 360 use truncated prototype filters. The HQMF output from filter banks 360 may be bandpass filtered and mixed in module 362 with the delayed input component A 364 to produce the enhanced audio output HQMF signal 366. In one embodiment, the delay applied to input A 364 of hybrid-domain mixing block 362 is less than the virtual bass system delay (minus the Nyquist analysis delay if signal B 306 is used as input), so as to accommodate a time-lagged virtual bass signal.
When FFT splitting is performed as outlined above, the phase relationships between the subband signals from the CQMF analysis bank may not be preserved. To alleviate this, in an embodiment, system 330 applies phase compensation 358 to CQMF channel 1 before Nyquist analysis block 360, by multiplication with exp(-jπ/2). The specific argument of phase compensation function 358 depends on the modulation scheme used by the preceding CQMF bank 304 of Fig. 3A and may differ between implementations. In addition, compensation factor 358 may be moved and absorbed into other processing blocks.
Virtual bass delay reduction
As described in the Background section, a virtual bass processing system introduces some delay when processing an input signal. With reference to Fig. 1B, the delay of a traditional transposer (measured at the transposer output sampling frequency) can be expressed as $D = 3L/2 - 2S_a$, where L is the transposer window size and $S_a$ is the analysis stride, or hop size. As stated previously, in a system with L = 64 and $S_a = 4$, the total delay of the transposer and the Nyquist filter bank analysis stage can be about 3200 samples.
In one embodiment, the virtual bass processing system includes components that perform certain steps to reduce the delay associated with virtual bass processing. Fig. 4 is a block diagram of the main functional components utilized by a virtual bass delay reduction process and system, according to an embodiment. As shown in diagram 400 of Fig. 4, the delay reduction process comprises the use of a higher-order base transposition factor 402, low-delay asymmetric transform windows 404, a truncated Nyquist prototype filter 406, and a time-lagged virtual bass signal 408. Each functional component of diagram 400 may be used alone or in combination with one or more of the other components to help reduce the delay of virtual bass processing. Diagram 400 may represent a system, for example when each of components 402 to 408 is implemented as a hardware component such as a circuit or processor. It may also represent a process, for example when each of components 402 to 408 is implemented as an act performed by a functional component, such as a computer-implemented process executed by one or more processors. Alternatively, diagram 400 may represent a hybrid system and method in which some components are implemented in hardware circuitry and others as executed method steps. Components 402 to 408 may be implemented as distinct individual components, or they may be combined into one or more merged delay reduction functions. The composition and operation of each component of system 400 are described in detail below.
Higher-order base transposition factor
For the higher-order base transposition factor 402 of Fig. 4, the traditional transposer delay equation $D_{ts} = \{3L/2 - 2S_a\} \cdot 64/2$ (Equation 2) can be generalized as shown in Equation 3:
$D_{ts} = \{(B+1)L/2 - B S_a\} \cdot 64/B$ (Equation 3)
In Equation 3, the base transposition factor 2 of the traditional system is replaced by an arbitrary integer base transposition factor B. Note that Equation 3 gives the delay in output samples of a CQMF-based framework with 64 channels. It can be verified that, for constant L and $S_a$, the delay decreases as B increases. For a virtual bass delay reduction system according to an embodiment, Fig. 5A shows the delays associated with a first hop size and Fig. 5B shows the delays associated with a second hop size. Table 1 of Fig. 5A shows the delays for various window sizes (L = 16 to 128) and base transposition factors (B = 2 to 16) at hop size $S_a = 4$. By contrast, Table 2 of Fig. 5B shows the delays for the same window sizes and base transposition factors at hop size $S_a = 2$. As can be seen in Figs. 5A and 5B, a significant delay reduction can be achieved by, for example, increasing the base transposition factor from 2 to 8 (for the nominal case of L = 64 and $S_a = 4$, from 2816 samples to 2048 samples).
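As a quick check of Equation 3 against the sample counts quoted above, the following sketch evaluates the delay for the nominal cases. The function name and the default of 64 bands are illustrative choices, not part of the original text.

```python
def symmetric_window_delay(L, S_a, B=2, num_bands=64):
    """Equation 3: transposer delay in output samples for a symmetric window.

    L is the window size, S_a the analysis stride, B the transposition factor.
    B = 2 reproduces Equation 2 for the traditional system.
    """
    return ((B + 1) * L / 2 - B * S_a) * num_bands / B

traditional = symmetric_window_delay(L=64, S_a=4, B=2)    # nominal traditional case
higher_order = symmetric_window_delay(L=64, S_a=4, B=8)   # factor raised from 2 to 8
half_stride = symmetric_window_delay(L=64, S_a=2, B=8)    # same factor, stride halved
```

Sweeping B with L and S_a held fixed shows the delay falling monotonically, which matches the statement in the text.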
With reference to Fig. 3C, in combined transposer 338 to 356, when a higher-order transposition factor T is generated, where T is greater than B (T > B), the transposer source range is smaller than the transposer target range in the analysis transform spectrum. The target frequency bins are produced by interpolation of the source frequency bins. When a higher-order base transposer is used to generate lower-order transposition factors, that is, when T is less than B (T < B), the source range is larger than the target range, and the target frequency bins are produced by decimation of the source frequency bins. Even in the case T < B, however, when T is odd, the source bin index derived as k = nB/T, where n is the target bin index, is generally not an integer, and the target bin is therefore derived by interpolation from two consecutive source bins.
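The mapping k = nB/T from target index n to source index k can be sketched as follows. The text only says that a non-integer k is handled by interpolating between two consecutive source indices; the linear weights used here are an assumption, and `source_bins` is a hypothetical helper name.

```python
from fractions import Fraction

def source_bins(n, B, T):
    """Return (source_index, weight) pairs for target index n, using k = n*B/T.

    An integer k maps to a single source index. Otherwise the two neighbouring
    source indices are returned with linear interpolation weights (assumed).
    """
    k = Fraction(n * B, T)                     # exact rational arithmetic, no rounding error
    lower = k.numerator // k.denominator       # floor of k
    frac = k - lower
    if frac == 0:
        return [(lower, 1.0)]
    return [(lower, float(1 - frac)), (lower + 1, float(frac))]

exact = source_bins(n=3, B=8, T=3)             # k = 8, a single source index
between = source_bins(n=1, B=8, T=3)           # k = 8/3, between indices 2 and 3
decimated = source_bins(n=5, B=8, T=4)         # k = 10, every second source index is used
```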
Increasing the order of the base transposition factor has certain implications for virtual bass processing. First, control needs to be established to force the transposer source range to stay within the analysis transform range (that is, within the range 0 to N-1). Second, compared with a system using base transposition factor 2, the size of the two synthesis transforms 354 is now N/B instead of N/2, where N is the analysis transform size. This means that the synthesis window is decimated by a factor of B instead of 2, and the gain vectors of spectrum splitting 348 and filter response compensation 350 are reduced accordingly. This is a consequence of the additional bandwidth expansion for higher values of B: the transposer output inherently covers the frequency range of B CQMF bands (assuming an input of one CQMF band), of which in practice only the first two CQMF bands are synthesized, which saves complexity. For base transposition factor B = 8 and frequency-domain oversampling factor F = 4, the size of the two synthesis transforms is $N_s = F \cdot L/B = 4 \cdot 64/8 = 32$, and synthesis transform window 356 has only L/B = 64/8 = 8 taps.
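The transform and window sizes quoted for B = 8 and F = 4 follow directly from N = FL, as this small sketch shows. The function name is hypothetical.

```python
def transform_sizes(L, B, F):
    """Return (analysis transform size N, synthesis transform size N/B, synthesis taps L/B)."""
    n_analysis = F * L              # zero-padded analysis transform size, N = F*L
    n_synthesis = n_analysis // B   # each synthesis transform covers 1/B of the spectrum
    synthesis_taps = L // B         # synthesis window decimated by the factor B
    return n_analysis, n_synthesis, synthesis_taps

sizes = transform_sizes(L=64, B=8, F=4)
```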
The quality of the transposed signal is governed by the base transposition factor and decreases for higher transposition orders, but it can be improved by using a reduced analysis hop size (increased oversampling in time). In addition, to maintain the quality of percussive sounds (transients), the amount of frequency-domain oversampling needs to be increased for higher base transposition factors. However, additional oversampling in both time and frequency may increase the computational complexity of the transposer. In one embodiment, the analysis hop size is halved compared with the traditional system. A base transposer of factor B = 8 would require a frequency-domain oversampling factor of at least F = (B+1)/2 = 4.5. In one embodiment, the system uses oversampling by a factor of 4 (F = 4); the missing 0.5 is usually insignificant in practice because the transform windows taper off toward their ends. In this embodiment, the computational complexity therefore increases by a factor of 2 in total, owing to the increased oversampling in time. Note that the increased time oversampling comes at the cost of a slightly increased delay: for L = 64, B = 8, and $S_a = 2$, the total delay ends up at 2176 samples, as shown in Table 2 of Fig. 5B.
Asymmetric transform windows
Given the contents of Tables 1 and 2 of Figs. 5A and 5B, the obvious way to reduce the transposer delay would appear to be to use shorter transform windows and therefore smaller analysis and synthesis transform sizes. However, this usually comes at the cost of reduced quality for dense tonal signals, because the shorter transform windows give a lower frequency resolution. It has been found that a more robust reduction of the algorithmic delay of the transposer can be achieved by using asymmetric analysis and synthesis windows in the forward and inverse transform stages. Thus, in one embodiment, for the low-delay asymmetric transforms 404 of Fig. 4, the delay reduction system uses asymmetric analysis and synthesis windows in the forward and inverse transform stages (for example, windowing stages 338 and 356 of Fig. 3C, respectively). This improves the frequency response of the finite-length asymmetric window by extending the "tail" of the window toward past samples, which essentially does not increase the transform delay. In an even more general embodiment, the length of the analysis window and the size of the forward transform may differ from the length of the synthesis window and the size of the inverse transform.
Fig. 5C is an example plot of the time response of an asymmetric window compared with conventional symmetric Hanning windows. Fig. 5C shows the time response, as signal amplitude (for example, in volts) versus sample index (x-axis), of a Hanning window of length 64 (plot 514), a Hanning window of length 41 (plot 516), and an asymmetric window of length 64 with a delay of 40 (plot 512), which is the same delay as that of the length-41 Hanning window. Fig. 5D is an example plot of the frequency response of the asymmetric window compared with the conventional symmetric Hanning windows. Fig. 5D shows the frequency response, as signal amplitude on a logarithmic scale (for example, dB) versus normalized frequency (x-axis), of the length-64 Hanning window (plot 524), the length-41 Hanning window (plot 526), and the length-64 asymmetric window with a delay of 40 (plot 522). As can be seen in Fig. 5D, the main lobe of the asymmetric window has a width between those of the two symmetric Hanning windows, indicating a frequency resolution or selectivity that lies between them.
To accommodate asymmetric window transform processing, the transposer algorithm needs to be partially altered compared with a traditional implementation, taking into account the reduced transform delay D of the analysis/synthesis chain. Instead of the frequency modulation by $e^{-j\pi k}$ applied after the forward transform and before the inverse transform in a traditional system, the asymmetric system requires the following frequency modulation 342 after the analysis transform:
$M_a(k) = e^{-i(2\pi/N)(D/2 - L + 1)k}, \quad 0 \le k < N$ (Equation 4)
The system also requires the following modulation before the synthesis FFT spectrum splitting:
$M_s(n) = e^{-i(\pi/N)Dn}, \quad 0 \le n < N$ (Equation 5)
In Equations 4 and 5 above, k and n are the transform frequency coefficient indices, N is the analysis transform size, that is, N = FL, where F is the frequency-domain oversampling factor, L is the analysis window size, and D is the transform delay. As indicated in Fig. 3C, the modulation of Equation 5 may also be applied in modulation stage 352 after FFT splitting module 348 and response compensation step 350.
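The two modulation vectors can be written out directly with NumPy, as in the sketch below. The values L = 64, F = 4, and D = 40 are taken from examples elsewhere in the text; the function names and the random test signal are illustrative assumptions. The sketch also shows that, under NumPy's FFT sign convention, each modulation is equivalent to a circular time shift of the corresponding time-domain block.

```python
import numpy as np

def analysis_modulation(N, L, D):
    """Equation 4: M_a(k) = exp(-i (2*pi/N) (D/2 - L + 1) k), 0 <= k < N."""
    k = np.arange(N)
    return np.exp(-1j * (2 * np.pi / N) * (D / 2 - L + 1) * k)

def synthesis_modulation(N, D):
    """Equation 5: M_s(n) = exp(-i (pi/N) D n), 0 <= n < N."""
    n = np.arange(N)
    return np.exp(-1j * (np.pi / N) * D * n)

L, F, D = 64, 4, 40
N = F * L                                             # analysis transform size

rng = np.random.default_rng(0)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # arbitrary complex test block
X = np.fft.fft(x)

# Multiplying the spectrum by a linear-phase term circularly shifts the block.
after_analysis = np.fft.ifft(X * analysis_modulation(N, L, D))
after_synthesis = np.fft.ifft(X * synthesis_modulation(N, D))
```

With an even D, the synthesis modulation corresponds to a circular shift by exactly D/2 samples in this convention, which is why the text can describe the same operations as time shifts.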
Fig. 6 illustrates the use of asymmetric windows and the associated delay imposed by a base transposer of order B, according to an embodiment. In a traditional virtual bass system, B is usually set to 2, but if asymmetric window processing 404 is used in combination with higher-order base transposition factor processing 402, B may be an integer greater than 2 (for example, B = 4, 8, or 16). Time plot 600 shows the time-zero reference as the group delay of the analysis window (approximately D/2). In analysis stage 602, new samples 604 are added from time $t_0$ onward. Time plot 610 shows that, owing to the time-stretch duality of the transposer, $t_0$ is moved to time $B t_0$ in synthesis stage 612 for the new time-stretched samples 614. When asymmetric windows such as those shown in Fig. 5C (plot 512) or Fig. 6 are used, the total delay of the analysis/synthesis chain is approximately $D/2 + B(D/2 - S_a)$.
Just as the frequency-domain modulation in the symmetric window case can be implemented as a circular time shift by N/2 samples, the modulations of Equations 4 and 5 above can similarly be implemented as a circular time shift by N-(D/2-(L-1)) (mod N) samples before the analysis transform and a circular time shift by N-D/2 samples after the (single) synthesis transform, respectively. However, when asymmetric windows are combined with a higher-order base transposition factor, such as B = 8, and with FFT splitting stage 348, the time shift after the synthesis transforms becomes (N-D/2)/B samples, which may not be an integer. In that case, the rounded value may be used as an approximation. In addition, to save complexity, the analysis modulation may be combined with the synthesis modulation into the merged modulation given by Equation 6:
$M_{ASC}(k) = e^{-i(2\pi/N)\left((D/2)(B+1) + (-L+1)B\right)k}, \quad 0 \le k < N$ (Equation 6)
The combined modulation of Equation 6 is exact only when the transposition factor T equals B. For other transposition factors, Equation 6 is only an approximation.
Alternatively, the modulation of Equation 6 may be implemented as a combined circular time shift after the synthesis transforms, as shown in Equation 7:
$f_x(m) = g_x(S + m), \quad 0 \le m < N/B - S$
$f_x(N/B - S + m) = g_x(m), \quad 0 \le m < S$ (Equation 7)
In Equation 7 above, $g_x(m)$ is the time-domain output from one of the synthesis inverse transforms, $f_x(m)$ is the shifted time sequence, and S equals:
Again, when the argument of the ceil function (which rounds up to the nearest integer) is not an exact integer, Equation 7 provides only an approximation of the frequency modulation implemented by Equation 6 (which may itself be an approximation). It should also be noted that Equations 5 and 6 above preferably need only be applied to the limited portion of the coefficients that are included in the two inverse Fourier transforms.
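Equation 7 describes a circular shift of each synthesis output block of length N/B by S samples. The sketch below implements the two index ranges literally. Because the expression for S is not reproduced in this text, the value S = 5 used here is purely hypothetical, as are the function name and the test block.

```python
import numpy as np

def circular_shift_eq7(g, S):
    """Apply Equation 7 to one synthesis output block g of length N/B.

    f(m)            = g(S + m),  0 <= m < N/B - S
    f(N/B - S + m)  = g(m),      0 <= m < S
    """
    M = len(g)                    # block length, N/B
    f = np.empty_like(g)
    f[:M - S] = g[S:]             # first range of Equation 7
    f[M - S:] = g[:S]             # second range: the first S samples wrap to the end
    return f

g = np.arange(32)                 # stand-in for a 32-point inverse transform output
S = 5                             # hypothetical shift, not taken from the text
f = circular_shift_eq7(g, S)
```

In NumPy terms this is the same operation as `np.roll(g, -S)`.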
With reference to Fig. 6, the exact expression for the total system delay of the asymmetric window transposer framework becomes as shown in Equation 8:
$D_{ta} = \{(B+1)D/2 - B(S_a - 1)\} \cdot 64/B$ (Equation 8)
Again, Equation 8 gives the delay in output samples of a framework based on a 64-channel CQMF.
For a virtual bass delay reduction system using asymmetric transform windows according to an embodiment, Fig. 7A shows a table of total delay values for a first hop size and Fig. 7B shows a table of total delay values for a second hop size. Table 3 of Fig. 7A shows the delays for various transform delay values (D = 15 to 127) and base transposition factors (B = 2 to 16) at hop size $S_a = 4$. By contrast, Table 4 of Fig. 7B shows the delays for the same transform delay values and base transposition factors at hop size $S_a = 2$. As can be seen in Table 4, the delay reduction from a symmetric 64-tap window (D = 63) to the asymmetric window (D = 40) is 828 samples (for the nominal case of $S_a = 2$ and B = 8, 2204 - 1376 = 828).
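The figures quoted from Table 4 can be reproduced from Equation 8 with a short sketch. The function name is an illustrative choice; D = 63 is the symmetric 64-tap window and D = 40 is the asymmetric window delay used in the examples of this description.

```python
def asymmetric_window_delay(D, S_a, B, num_bands=64):
    """Equation 8: total delay in output samples for transform delay D."""
    return ((B + 1) * D / 2 - B * (S_a - 1)) * num_bands / B

symmetric_case = asymmetric_window_delay(D=63, S_a=2, B=8)    # symmetric 64-tap window
asymmetric_case = asymmetric_window_delay(D=40, S_a=2, B=8)   # asymmetric window, delay 40
reduction = symmetric_case - asymmetric_case
```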
Comparing Equation 3 with Equation 8, it can be verified that setting $D_{ts} = D_{ta}$ gives:
$D = L - 2B/(B+1)$ (Equation 9)
For B = 1, Equation 9 above yields the expected transform delay of a symmetric window, D = L-1.
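The relation in Equation 9 can be verified numerically by substituting it into Equation 8 and comparing with Equation 3. The sketch below uses exact rational arithmetic so that the comparison is not affected by rounding; the function names and the swept parameter ranges are assumptions for illustration.

```python
from fractions import Fraction

def equivalent_transform_delay(L, B):
    """Equation 9: the transform delay D at which Equations 3 and 8 coincide."""
    return L - Fraction(2 * B, B + 1)

def symmetric_delay(L, S_a, B):
    """Equation 3, evaluated exactly."""
    return (Fraction(B + 1) * L / 2 - B * S_a) * 64 / B

def asymmetric_delay(D, S_a, B):
    """Equation 8, evaluated exactly."""
    return (Fraction(B + 1) * D / 2 - B * (S_a - 1)) * 64 / B

# Sweep a few window sizes, strides, and transposition factors.
all_match = all(
    symmetric_delay(L, S_a, B) == asymmetric_delay(equivalent_transform_delay(L, B), S_a, B)
    for L in (16, 32, 64, 128)
    for S_a in (2, 4)
    for B in (1, 2, 4, 8, 16)
)
```

The equality holds because substituting D = L - 2B/(B+1) turns (B+1)D/2 into (B+1)L/2 - B, and the extra -B cancels the +B that arises from -B(S_a - 1).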
The amount of asymmetry of the transform windows may vary according to the constraints and requirements of the system. In one embodiment and particular implementation, the group delay of the asymmetric window is chosen to be close to one half of the transform delay, in order to maintain adequate transposition quality. Thus, in this case, $G_d \approx D/2 = 20$. This can be accomplished by including a group delay constraint in the optimization stage of the asymmetric filter design.
Truncated Nyquist prototype filter
With reference to Fig. 4, the third delay reduction element comprises the use of a truncated Nyquist prototype filter 406. As shown in Fig. 3C, in order to mix the virtual bass signal in the hybrid domain, the 8-channel and 4-channel Nyquist analysis filter banks 360 are applied to the virtual bass output CQMF channels (these filter banks correspond to Nyquist filter banks 307 and 308 of Fig. 3A). In one embodiment, Nyquist analysis filter banks 360 use symmetric 13-tap prototype filters, which impose a delay of 6 CQMF samples (in this case, 6 · 64 = 384 output samples). By removing the 6 coefficients of the prototype filter that act on future samples, this entire delay (384 samples) can be eliminated. In general, the Nyquist analysis/synthesis chain still provides perfect reconstruction. However, the frequency responses of the Nyquist filter banks using the truncated filters are altered. Optimization of the remaining filter coefficients may improve the potentially poorer frequency responses of the Nyquist filter banks that use the truncated filters.
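The delay arithmetic and the truncation step can be sketched as follows. The coefficient values here are only a placeholder (a 13-point Hanning shape), not the prototype filter of the described system, and the tap ordering from lag -6 to lag +6 is an assumption of this sketch.

```python
import numpy as np

CQMF_BANDS = 64

def symmetric_fir_delay(num_taps):
    """Delay in samples of a symmetric FIR filter with an odd number of taps."""
    return (num_taps - 1) // 2

def truncate_future_taps(h):
    """Keep only the taps acting on the current and past samples.

    Assumes h is ordered from lag -(M-1)/2 (furthest future sample)
    to lag +(M-1)/2 (furthest past sample).
    """
    centre = (len(h) - 1) // 2
    return h[centre:]

prototype = np.hanning(13)                              # placeholder coefficients only
delay_cqmf = symmetric_fir_delay(len(prototype))        # delay in CQMF samples
delay_output = delay_cqmf * CQMF_BANDS                  # delay in output samples
truncated = truncate_future_taps(prototype)             # remaining taps after truncation
```

As the text notes, the remaining coefficients would normally be re-optimized, since plain truncation changes the frequency response.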
Time-lagged virtual bass signal
With reference to Fig. 4, the fourth delay reduction element comprises letting the virtual bass signal lag the original signal, 408. In this case, the delay of the overall system can be reduced when the wideband signal (that is, hybrid signal A 364 of Fig. 3C) is delayed by a shorter time period than the one actually implied by the virtual bass system delay. Informal listening tests indicate that a lag of less than 20 ms does not impair the virtual bass effect. For a 48 kHz audio signal, this lag corresponds to 960 samples.
In a particular implementation of an embodiment, the virtual bass signal is allowed to lag the wideband signal by a total of 352 samples (7.33 ms at 48 kHz). Of these 352 samples, 32 samples stem from the use of the asymmetric transform windows, because the resulting delay of 1376 samples is not evenly divisible by the CQMF filter bank size of 64. The delay of the asymmetric window transposer can therefore be divided into a wideband delay of 1344 samples plus a bass lag of 32 samples. The additional lag beyond these 32 samples is thus 320 samples (5 CQMF samples, corresponding to 6.67 ms at a 48 kHz sampling frequency).
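The lag budget in the paragraph above is simple integer arithmetic, sketched below. The helper names are hypothetical; the sample counts and the 48 kHz rate come from the text.

```python
FS_HZ = 48_000
CQMF_BANDS = 64

def split_into_cqmf_frames(delay_samples, bands=CQMF_BANDS):
    """Split a delay into a whole number of CQMF samples and the leftover samples."""
    frames, residual = divmod(delay_samples, bands)
    return frames * bands, residual

def to_ms(samples, fs_hz=FS_HZ):
    """Convert a sample count to milliseconds."""
    return 1000.0 * samples / fs_hz

wideband_delay, window_lag = split_into_cqmf_frames(1376)   # transposer delay from Equation 8
extra_lag = 5 * CQMF_BANDS                                  # five additional CQMF samples
total_lag = window_lag + extra_lag                          # total lag of the bass signal
```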
The different delay reduction elements 402 to 408 of Fig. 4 may be used in any practical combination to reduce the virtual bass system delay. In addition, each delay reduction method may be suitably varied to trade delay against any perceived degradation of the virtual bass signal quality. In one embodiment, all four delay reduction elements are implemented with the following values: base transposition factor B = 8, hop size $S_a = 2$, transform delay D = 40, truncated Nyquist filter banks, and an additional virtual bass lag of 320 samples. In this example, the resulting virtual bass system delay in output samples is as follows:
$D_{VB} = \{(B+1) \cdot D/2 - B \cdot (S_a - 1)\} \cdot 64/B - 32 + 0 - 320 = 1376 - 352 = 1024$
By bypassing the Nyquist analysis filters in the pre-processing stage as described above (for example, by using input B 203 of Fig. 2, that is, signal B 306 of Fig. 3A, as input D 332 of virtual bass module 330 of Fig. 3C), a further 384 samples of delay can be saved, yielding a virtual bass system delay of 1024 - 384 = 640 samples (corresponding to about 13 ms at a 48 kHz sampling frequency).
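The complete delay budget of this example can be tallied in a few lines. The function and parameter names below are illustrative assumptions; the numeric values are those stated in the text.

```python
def virtual_bass_system_delay(B=8, S_a=2, D=40, window_lag=32,
                              nyquist_delay=0, extra_lag=320, num_bands=64):
    """Delay in output samples: Equation 8 minus the lags, plus any Nyquist analysis delay.

    nyquist_delay is 0 here because the truncated prototype filter removes it.
    """
    transposer = ((B + 1) * D / 2 - B * (S_a - 1)) * num_bands / B
    return transposer - window_lag + nyquist_delay - extra_lag

PREPROCESSING_NYQUIST_DELAY = 384          # saved when the pre-processing Nyquist stage is bypassed
FS_HZ = 48_000

delay_all_elements = virtual_bass_system_delay()
delay_with_bypass = delay_all_elements - PREPROCESSING_NYQUIST_DELAY
delay_ms = 1000.0 * delay_with_bypass / FS_HZ
```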
The delay of 640 samples in this example is significantly lower than the nominal delay of 3200 samples of the traditional virtual bass system described previously. The delay can be reduced even further by adding a larger virtual bass lag, by increasing the hop size $S_a$ to 4 instead of 2, or by designing asymmetric transform windows with a resulting analysis/synthesis delay shorter than 40. However, although the delay can be reduced further, any such change of values may yield a slightly poorer virtual bass quality.
Embodiments of the virtual bass delay reduction system described herein may be used in conjunction with any suitable virtual bass generation system, such as those shown in Figs. 2 and 3. Fig. 8 is a block diagram of an audio processing system that includes a virtual bass generation system and a delay reduction system, according to an embodiment. As shown in Fig. 8, system 800 includes virtual bass generation system 330 as shown in Fig. 3C. Virtual bass system 330 receives input audio signal 801 and performs certain frequency transposition functions to produce enhanced audio content for playback through loudspeakers 806 that have limited frequency response capability. Some delay may be associated with the transposition functions performed by virtual bass system 330. In one embodiment, virtual bass delay reduction system 400 (as shown in Fig. 4) is provided as post-processing for virtual bass system 300 to reduce the delay associated with virtual bass processing. The reduced-delay audio signal from virtual bass systems 300 and 400 is then sent to rendering subsystem 802, which is configured to generate speaker feeds that may be fed to left and right (or multichannel) loudspeakers 806 through amplifier 804.
Although virtual bass delay reduction system 400 is illustrated as a separate post-processing component in system 800, it should be noted that such a delay reduction system may be implemented as part of virtual bass system 330 (as indicated), or as part of any other suitable element of system 800, such as a functional component within rendering subsystem 802. Likewise, virtual bass system 330 may be the traditional virtual bass generation system outlined in the Background section, or it may be any other virtual bass generation and processing system that uses harmonic transposition to increase the perceived level of the bass content of input audio signal 801 played back through loudspeakers 806.
Embodiments of the virtual bass delay reduction system may be used in any audio processing system that renders and plays back digital audio through a variety of playback devices and audio loudspeakers (transducers). These loudspeakers may be embodied in any of a variety of listening or playback devices, such as computers, televisions, stereo systems (home or cinema), mobile phones, tablets, and other portable playback devices. The loudspeakers may be of any suitable size and power rating, and may be provided in the form of free-standing drivers, speaker enclosures, surround-sound systems, soundbars, headphones, earbuds, and so on. The loudspeakers may be configured in any suitable array, and may include monophonic drivers, binaural speakers, surround-sound speaker arrays, or any other suitable array of audio drivers.
Aspects of the one or more embodiments described herein may be implemented in an audio system that processes audio signals for transmission over a network that includes one or more computers or processing devices executing software instructions. Any of the described embodiments may be used alone or together with one another in any combination. Although various embodiments may have been motivated by various deficiencies of the prior art, which may be discussed or alluded to in one or more places in the specification, the embodiments do not necessarily address any of these deficiencies. In other words, different embodiments may address different deficiencies that may be discussed in the specification. Some embodiments may only partially address some deficiencies, or just one deficiency, that may be discussed in the specification, and some embodiments may not address any of these deficiencies.
Aspects of the systems described herein may be implemented in a suitable computer-based sound processing network environment for processing digital or digitized audio files. Portions of the adaptive audio system may include one or more networks that comprise any desired number of individual machines, including one or more routers (not shown) that serve to buffer and route the data transmitted among the computers. Such a network may be built on various different network protocols, and may be the Internet, a wide area network (WAN), a local area network (LAN), or any combination thereof.
One or more of the components, blocks, processes, or other functional components may be implemented through a computer program that controls execution of a processor-based computing device of the system. It should also be noted that the various functions disclosed herein may be described using any number of combinations of hardware, firmware, and/or data and/or instructions embodied in various machine-readable or computer-readable media, in terms of their behavioral, register transfer, logic component, and/or other characteristics. Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, physical (non-transitory), non-volatile storage media in various forms, such as optical, magnetic, or semiconductor storage media.
Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise," "comprising," and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to." Words using the singular or plural number also include the plural or singular number, respectively. Additionally, the words "herein," "hereunder," "above," "below," and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word "or" is used in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.
Although by example and according to specific execution mode describe one or more realize, should be appreciated that one or more realization is not limited to disclosed execution mode.On the contrary, as obvious to those skilled in the art, it is intended to cover various amendment and similar layout.Therefore, the scope of claims should meet to be explained the most widely, to comprise all such amendments and similar layout.
Brief Description of the Drawings
In the following drawings, like reference numerals refer to like elements. Although the following figures depict various examples, the one or more implementations are not limited to the examples depicted in the figures.
Fig. 1A illustrates the transposition of frequency components from an inaudible frequency range to an audible frequency range in a known virtual bass processing system.
Fig. 1B illustrates the time delay associated with the symmetric windows used in conventional virtual bass systems known in the prior art.
Fig. 2 is a generalized block diagram of a virtual bass processing system that implements a delay reduction process, according to an embodiment.
Fig. 3A illustrates a pre-processing hybrid filter bank stage in an HQMF-based system, according to an embodiment.
Fig. 3B illustrates a preceding Nyquist synthesis filter bank stage of the virtual bass processing system, according to an embodiment.
Fig. 3C is a more detailed diagram of the virtual bass processing system shown in Fig. 2, according to an embodiment.
Fig. 4 is a block diagram of the main functional components utilized by the virtual bass delay reduction process and system, according to an embodiment.
Fig. 5A is a table showing the time delays associated with different orders of the fundamental transposition factor for a first hop size in the virtual bass delay reduction system, according to an embodiment.
Fig. 5B is a table showing the time delays associated with different orders of the fundamental transposition factor for a second hop size in the virtual bass delay reduction system, according to an embodiment.
Fig. 5C is an example plot of the time response of an asymmetric window compared with some conventional symmetric windows, and Fig. 5D is an example plot of the frequency response of the asymmetric window compared with some conventional symmetric windows.
Fig. 6 illustrates the use of asymmetric windows and the associated time delay imposed by a fundamental transposer of order B, according to an embodiment.
Fig. 7A is a table showing the total delay values for different orders of the fundamental transposition factor for a first hop size in the virtual bass delay reduction system using asymmetric transform windows, according to an embodiment.
Fig. 7B is a table showing the total delay values for different orders of the fundamental transposition factor for a second hop size in the virtual bass delay reduction system using asymmetric transform windows, according to an embodiment.
Fig. 8 is a block diagram of an audio processing system that includes a virtual bass generation system and a delay reduction system, according to an embodiment.
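By way of illustration only, and not as part of the disclosed embodiments, the contrast between symmetric and asymmetric windows referred to in Figs. 1B, 5C and 6 can be sketched as follows. The window shapes (Hann halves), the function names and the "lookahead" measure are assumptions made for this sketch; the windows actually plotted in Fig. 5C are not specified in this section. The distance from the window peak to the newest sample in the block is used here only as a rough proxy for the delay contributed by the window.

```python
import numpy as np

def symmetric_hann(n):
    """Periodic Hann window of length n; its peak sits at index n/2."""
    k = np.arange(n)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * k / n)

def asymmetric_hann(n, n_fall):
    """Hypothetical asymmetric window (an assumption, not the patent's window):
    a long Hann rise over n - n_fall samples, then a short Hann fall."""
    n_rise = n - n_fall
    rise = 0.5 - 0.5 * np.cos(np.pi * np.arange(n_rise) / n_rise)  # climbs toward 1
    fall = 0.5 + 0.5 * np.cos(np.pi * np.arange(n_fall) / n_fall)  # starts at 1, decays
    return np.concatenate([rise, fall])

def lookahead(window):
    """Number of samples between the window peak and the newest sample."""
    return len(window) - 1 - int(np.argmax(window))

sym = symmetric_hann(64)
asym = asymmetric_hann(64, n_fall=8)
print(lookahead(sym), lookahead(asym))
```

In this sketch both windows span 64 samples, but the asymmetric one places its peak much closer to the newest sample, which is the property a low-latency design would exploit.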

Claims (22)

1. A method for generating low-latency virtual bass, comprising:
receiving an input audio signal;
performing harmonic transposition on low-frequency components of the input audio signal to generate transposition data indicative of harmonics of the input audio signal;
generating a virtual bass signal in response to the transposition data; and
generating an enhanced audio signal by combining the virtual bass signal with a time-delayed version of the input audio signal, wherein the harmonic transposition uses combined transposition with a fundamental transposition factor B greater than 2, such that the harmonics include a second-order harmonic and at least one higher-order harmonic of each of the low-frequency components, and such that all of the harmonics are generated in response to frequency-domain values determined by a common time-to-frequency domain transform stage using an asymmetric analysis window, and by a subsequent inverse transform determined by a common frequency-to-time domain transform stage using an asymmetric synthesis window.
2. The method according to claim 1, wherein the fundamental transposition factor B is an integer value selected from 4, 8, 16 or 32.
3. The method according to claim 1, wherein the input audio signal is a sub-band CQMF signal indicative of critically sampled or near-critically sampled low-frequency audio from a set of complex-valued quadrature mirror filter (CQMF) sub-band signals.
4. The method according to claim 3, wherein the critically sampled or near-critically sampled low-frequency input audio is a CQMF channel 0 signal indicative of the lowest frequency band of a set of CQMF sub-band signals.
5. The method according to claim 4, further comprising:
generating the transposition data by: performing a frequency-domain oversampled transform on the input audio signal by generating asymmetrically windowed, zero-padded samples in response to the low-frequency components and performing a time-to-frequency domain transform on the asymmetrically windowed, zero-padded samples; and subsequently performing a nonlinear operation on the output of the time-to-frequency domain transform to generate the transposition data in response to the low-frequency components;
generating two sets of frequency components by splitting the frequency components processed by the nonlinear operation into a first set of frequency components in a first frequency band and a second set of frequency components in a second frequency band;
further performing a first frequency-to-time domain transform on the first set of frequency components and a second frequency-to-time domain transform on the second set of frequency components, wherein the transform size of each of the first frequency-to-time domain transform and the second frequency-to-time domain transform is 1/B of the transform size of the time-to-frequency domain transform; and
further applying an asymmetric zero-padded window to the samples from the frequency-to-time domain transforms, wherein the asymmetric zero-padded window is 1/B the length of the asymmetrically windowed, zero-padded samples generated in response to the input audio signal, thereby forming two sets of transposition data.
6. The method according to claim 5, wherein the first frequency band is the frequency band of CQMF channel 0 of a set of CQMF sub-band signals, and the second frequency band is the frequency band of CQMF channel 1 of the set of CQMF sub-band signals.
7. The method according to claim 6, wherein generating the virtual bass signal in response to the transposition data comprises applying an analysis filter bank to one or both of the two sets of transposition data, wherein the analysis filter bank comprises a truncated version of a symmetric filter.
8. The method according to claim 7, wherein the analysis filter bank is a Nyquist filter bank, and the truncated version of the symmetric filter is the filter with one of its symmetric halves removed.
9. The method according to claim 8, wherein the analysis filter bank comprises one of an 8-channel Nyquist filter bank or a 4-channel Nyquist filter bank, and wherein the removed one of the symmetric halves of the filter comprises 6 coefficients.
10. The method according to claim 1, wherein the time-delayed version of the input audio signal is delayed by a predetermined period of time that is shorter than the latency of the virtual bass signal, and the enhanced audio signal exhibits a time-lagged virtual bass signal.
11. The method according to claim 10, wherein the predetermined period of time is selected from values in the range of 0 samples to 1000 samples.
12. The method according to claim 4, wherein the input audio CQMF channel 0 is received directly from the analysis CQMF bank output of a pre-processing hybrid filter bank stage, thereby bypassing the Nyquist analysis filter bank of the pre-processing hybrid filter bank stage.
13. An apparatus for generating low-latency virtual bass, comprising:
a first component that receives an input audio signal and performs harmonic transposition on low-frequency components of the input audio signal to generate transposition data indicative of harmonics of the input audio signal; and
a second component that generates a virtual bass signal in response to the transposition data and combines the virtual bass signal with a time-delayed version of the input audio signal to generate an enhanced audio signal, wherein the harmonic transposition uses combined transposition with a fundamental transposition factor B greater than 2, such that the harmonics include a second-order harmonic and at least one higher-order harmonic of each of the low-frequency components, and such that all of the harmonics are generated in response to frequency-domain values determined by a common time-to-frequency domain transform stage using an asymmetric analysis window, and by a subsequent inverse transform determined by a common frequency-to-time domain transform stage using an asymmetric synthesis window.
14. The apparatus according to claim 13, wherein the fundamental transposition factor B is an integer value selected from 4, 8, 16 or 32.
15. The apparatus according to claim 13, wherein the input audio signal is a sub-band CQMF signal indicative of critically sampled or near-critically sampled low-frequency audio from a set of complex-valued quadrature mirror filter (CQMF) sub-band signals.
16. The apparatus according to claim 15, wherein the critically sampled or near-critically sampled low-frequency audio is a CQMF channel 0 signal indicative of the lowest frequency band of a set of CQMF sub-band signals.
17. The apparatus according to claim 16, further comprising:
a third component that generates the transposition data in response to the low-frequency components by: performing a frequency-domain oversampled transform on the input audio signal by generating asymmetrically windowed, zero-padded samples and performing a time-to-frequency domain transform on the asymmetrically windowed, zero-padded samples; and subsequently performing a nonlinear operation on the output of the time-to-frequency domain transform to generate the transposition data in response to the low-frequency components;
a fourth component that generates two sets of frequency components by splitting the frequency components processed by the nonlinear operation into a first set of frequency components in a first frequency band and a second set of frequency components in a second frequency band;
a fifth component that further performs a first frequency-to-time domain transform on the first set of frequency components and a second frequency-to-time domain transform on the second set of frequency components, wherein the transform size of each of the first frequency-to-time domain transform and the second frequency-to-time domain transform is 1/B of the transform size of the time-to-frequency domain transform; and
a sixth component that applies an asymmetric zero-padded window to the samples from the frequency-to-time domain transforms, wherein the asymmetric zero-padded window is 1/B the length of the asymmetrically windowed, zero-padded samples generated in response to the input audio signal, thereby forming two sets of transposition data.
18. The apparatus according to claim 17, wherein the first frequency band is the frequency band of CQMF channel 0 of a set of CQMF sub-band signals, the second frequency band is the frequency band of CQMF channel 1 of the set of CQMF sub-band signals, and wherein generating the virtual bass signal in response to the transposition data comprises applying an analysis filter bank to one or both of the two sets of transposition data, wherein the analysis filter bank comprises a truncated version of a symmetric filter.
19. The apparatus according to claim 18, wherein the analysis filter bank is a Nyquist filter bank, and the truncated version of the symmetric filter is the filter with one of its symmetric halves removed.
20. The apparatus according to claim 19, wherein the analysis filter bank comprises one of an 8-channel Nyquist filter bank or a 4-channel Nyquist filter bank, and wherein the removed one of the symmetric halves of the filter comprises 6 coefficients.
21. The apparatus according to claim 13, further comprising:
a timing component that generates a version of the input audio signal delayed by a predetermined period of time that is shorter than the latency of the virtual bass signal; and
a mixing component that combines the virtual bass signal with the time-delayed input audio signal to generate an enhanced audio signal exhibiting a time-lagged virtual bass signal.
22. The apparatus according to claim 16, further comprising an interface component that receives the CQMF channel 0 directly from the analysis CQMF bank output of a pre-processing hybrid filter bank stage, thereby bypassing the Nyquist analysis filter bank of the pre-processing hybrid filter bank stage.
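For illustration only, the combining step recited in claims 1, 10, 11 and 21 can be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name `mix_with_lagged_bass`, the `bass_gain` parameter and the use of equal-length NumPy arrays are hypothetical, and the virtual bass signal is assumed to arrive already carrying the latency of the transposer.

```python
import numpy as np

def mix_with_lagged_bass(x, virtual_bass, input_delay, bass_gain=1.0):
    """Delay x by input_delay samples and add the virtual bass signal.

    Assumptions of this sketch: virtual_bass has the same length as x and
    already includes the transposer latency; bass_gain is a hypothetical
    mixing gain that does not appear in the claims.
    """
    if not 0 <= input_delay <= 1000:
        raise ValueError("input_delay is outside the 0 to 1000 sample range of claim 11")
    x = np.asarray(x, dtype=float)
    vb = np.asarray(virtual_bass, dtype=float)
    if vb.shape != x.shape:
        raise ValueError("x and virtual_bass must have the same length")
    # Shift the input to the right by input_delay samples, keeping the length.
    delayed = np.concatenate([np.zeros(input_delay), x])[: len(x)]
    return delayed + bass_gain * vb

# Example: the transposer latency is 5 samples, but the input is delayed by only 3,
# so the virtual bass lags the direct signal by 2 samples in the output.
x = np.zeros(16)
x[0] = 1.0
vb = np.zeros(16)
vb[5] = 0.5
y = mix_with_lagged_bass(x, vb, input_delay=3)
```

Choosing an input delay shorter than the virtual bass latency trades exact time alignment for lower overall latency, which corresponds to the time-lagged virtual bass signal of claim 10.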
CN201380053450.0A 2012-10-15 2013-09-27 Systems and methods for reducing delay in transposer-based virtual bass systems Active CN104704855B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/652,023 US8971551B2 (en) 2009-09-18 2012-10-15 Virtual bass synthesis using harmonic transposition
US13/652,023 2012-10-15
PCT/EP2013/070262 WO2014060204A1 (en) 2012-10-15 2013-09-27 System and method for reducing latency in transposer-based virtual bass systems

Publications (2)

Publication Number Publication Date
CN104704855A true CN104704855A (en) 2015-06-10
CN104704855B CN104704855B (en) 2016-08-24

Family

ID=49293633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380053450.0A Active CN104704855B (en) 2012-10-15 2013-09-27 Systems and methods for reducing delay in transposer-based virtual bass systems

Country Status (4)

Country Link
EP (2) EP2907324B1 (en)
JP (1) JP5894347B2 (en)
CN (1) CN104704855B (en)
WO (1) WO2014060204A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105280189A (en) * 2015-09-16 2016-01-27 深圳广晟信源技术有限公司 Method and apparatus for high-frequency generation during bandwidth extension coding and decoding
CN114467313A (en) * 2019-08-08 2022-05-10 博姆云360公司 Non-linear adaptive filter bank for psycho-acoustic frequency range extension
CN115299075A (en) * 2020-03-20 2022-11-04 杜比国际公司 Bass boost for speakers
CN115802244A (en) * 2022-12-19 2023-03-14 上海艾为电子技术股份有限公司 Virtual bass generation method, medium, and electronic device

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4367906A1 (en) * 2021-07-09 2024-05-15 Soundfocus Aps Method and loudspeaker system for processing an input audio signal
WO2023280356A1 (en) * 2021-07-09 2023-01-12 Soundfocus Aps Method and transducer array system for directionally reproducing an input audio signal
JP2023130644A (en) * 2022-03-08 2023-09-21 アルプスアルパイン株式会社 Acoustic signal processing device, acoustic system, and method for enhancing low-pitched sound feeling

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070253576A1 (en) * 2006-04-27 2007-11-01 National Chiao Tung University Method for virtual bass synthesis
CN101505443A (en) * 2009-03-13 2009-08-12 北京中星微电子有限公司 Virtual supper bass enhancing method and system
CN102354500A (en) * 2011-08-03 2012-02-15 华南理工大学 Virtual bass boosting method based on harmonic control
TW201215172A (en) * 2010-07-09 2012-04-01 Conexant Systems Inc Systems and methods for generating phantom bass

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE0101175D0 (en) 2001-04-02 2001-04-02 Coding Technologies Sweden Ab Aliasing reduction using complex-exponential-modulated filter banks
US8036903B2 (en) * 2006-10-18 2011-10-11 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Analysis filterbank, synthesis filterbank, encoder, de-coder, mixer and conferencing system
JP4983694B2 (en) * 2008-03-31 2012-07-25 株式会社Jvcケンウッド Audio playback device
US8818541B2 (en) * 2009-01-16 2014-08-26 Dolby International Ab Cross product enhanced harmonic transposition
GB0906594D0 (en) * 2009-04-17 2009-05-27 Sontia Logic Ltd Processing an audio singnal
KR101613684B1 (en) * 2009-12-09 2016-04-19 삼성전자주식회사 Apparatus for enhancing bass band signal and method thereof
CA3027803C (en) * 2010-07-19 2020-04-07 Dolby International Ab Processing of audio signals during high frequency reconstruction
JP5375861B2 (en) * 2011-03-18 2013-12-25 ヤマハ株式会社 Audio reproduction effect adding method and apparatus
TWI575962B (en) * 2012-02-24 2017-03-21 杜比國際公司 Low delay real-to-complex conversion in overlapping filter banks for partially complex processing

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105280189A (en) * 2015-09-16 2016-01-27 深圳广晟信源技术有限公司 Method and apparatus for high-frequency generation during bandwidth extension coding and decoding
CN105280189B (en) * 2015-09-16 2019-01-08 深圳广晟信源技术有限公司 Method and apparatus for high-frequency generation during bandwidth extension coding and decoding
CN114467313A (en) * 2019-08-08 2022-05-10 博姆云360公司 Non-linear adaptive filter bank for psycho-acoustic frequency range extension
CN115299075A (en) * 2020-03-20 2022-11-04 杜比国际公司 Bass boost for speakers
CN115299075B (en) * 2020-03-20 2023-08-18 杜比国际公司 Bass Boost for Speakers
US12101613B2 (en) 2020-03-20 2024-09-24 Dolby International Ab Bass enhancement for loudspeakers
CN115802244A (en) * 2022-12-19 2023-03-14 上海艾为电子技术股份有限公司 Virtual bass generation method, medium, and electronic device

Also Published As

Publication number Publication date
JP5894347B2 (en) 2016-03-30
JP2015531575A (en) 2015-11-02
EP2720477A1 (en) 2014-04-16
EP2907324A1 (en) 2015-08-19
WO2014060204A1 (en) 2014-04-24
CN104704855B (en) 2016-08-24
EP2720477B1 (en) 2016-03-02
EP2907324B1 (en) 2016-11-09

Similar Documents

Publication Publication Date Title
US9407993B2 (en) Latency reduction in transposer-based virtual bass systems
US8175280B2 (en) Generation of spatial downmixes from parametric representations of multi channel signals
TWI484478B (en) Method for decoding M encoded audio channels representing N audio channels, means for decoding, and computer program
CN104704855A (en) System and method for reducing latency in transposer-based virtual bass systems
JP4664431B2 (en) Apparatus and method for generating an ambience signal
RU2666316C2 (en) Device and method of improving audio, system of sound improvement
EP3048815A1 (en) Method and apparatus for processing audio signals
EP1635611B1 (en) Audio signal processing apparatus and method
EP2939443B1 (en) System and method for variable decorrelation of audio signals
US9913036B2 (en) Apparatus and method and computer program for generating a stereo output signal for providing additional output channels
US20230085013A1 (en) Multi-channel decomposition and harmonic synthesis
CN111988726A (en) Method and system for synthesizing single sound channel by stereo
JP7533440B2 (en) Signal processing device, method, and program
JP5483813B2 (en) Multi-channel speech / acoustic signal encoding apparatus and method, and multi-channel speech / acoustic signal decoding apparatus and method
CN108182947B (en) Sound channel mixing processing method and device
EP2149876A1 (en) Reverberation applying device and corresponding program
KR102329707B1 (en) Apparatus and method for processing multi-channel audio signals
Pulikottil Virtual bass system by exploiting the rhythmic contents in music
JPH04104200A (en) Device and method for voice speed conversion
KR20090085887A (en) Audio signal decoding method
JP2000152398A (en) Audio signal processor and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant