WO2018193161A1 - Spatial extension in the elevation domain by spectral extension - Google Patents
Spatial extension in the elevation domain by spectral extension
- Publication number
- WO2018193161A1 (PCT/FI2018/050274)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio signal
- spectral
- spectral content
- content
- mixing
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/307—Frequency adjustment, e.g. tone control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/07—Synergistic effects of band splitting and sub-band processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/11—Application of ambisonics in stereophonic audio systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/13—Application of wave-field synthesis in stereophonic audio systems
Definitions
- the SAM system enables the creation of immersive sound scenes comprising "background spatial audio" or ambience and sound objects for Virtual Reality (VR) applications.
- the scene can be designed such that the overall spatial audio of the scene, such as a concert venue, is captured with a microphone array (such as one contained in the OZO virtual camera) and the most important sources captured using the 'external' microphones.
- volumetric virtual sound sources can be simply implemented by the creation of sounds with a perceived spatial extent, because listeners are not good at perceiving sounds at different distances. A sound with a perceived spatial extent may surround the listener or it may have a specific width.
- Because of a hearing effect called summing localization, the listener perceives simultaneously presented coherent audio signals as a virtual sound source between the original sources. If the coherence is lower, the signals may be perceived as separate audio objects or as a spatially extended auditory effect. Coherence can be measured with the interaural cross-correlation (IACC) value between signals. When identical signals are played from both headphone channels, the listener will perceive an auditory event in the center of the head. With identical signals the IACC value equals one. With an IACC value of zero, one auditory event will be perceived near each ear. When the IACC value is between one and zero, the listener may perceive a spatially extended or spread auditory event inside the head, with the extent varying according to the IACC value.
- IACC interaural cross-correlation value between signals
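The IACC described above can be sketched as a normalised cross-correlation between the two ear signals, maximised over a small lag range. This is an illustrative pure-Python sketch, not taken from the patent; the function name and the lag handling are assumptions.

```python
import math

def iacc(left, right, max_lag=0):
    """Interaural cross-correlation: the normalised correlation between
    the two ear signals, maximised over lags of up to max_lag samples."""
    def corr(a, b):
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        return num / den if den else 0.0

    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = corr(left[lag:], right[:len(right) - lag])
        else:
            c = corr(left[:lag], right[-lag:])
        best = max(best, abs(c))
    return best

# Identical ear signals give an IACC of one (auditory event in the center
# of the head); uncorrelated signals give an IACC near zero (events near
# each ear), matching the behaviour described above.
```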
- one approach is to divide the signal into non-overlapping frequency bands, and then present the frequency bands at distinct spatial positions around the listener.
- the area from which the frequency bands are presented may be used to control the perceived spatial extent. Special care needs to be taken in how the frequency bands are distributed, such that no degradation in the timbre of the sound occurs and the sound is perceived as a single spatially extended source rather than several sound objects.
- the audio is split into frequency bands (for example, 512). These are then rendered from a number of different directions (for example, nine) defined by the desired spatial extent.
- the frequency bands are divided into the different directions using what is called a low-discrepancy sequence, e.g., a Halton sequence. This provides random-looking, uniformly distributed frequency component sets for the different directions.
- a filter which selects frequency components of the original signal based on the Halton sequence.
- signals for the different directions that, ideally, have similar frequency content (shape) as the original signal, but do not contain common frequency components with each other. This results in the sound being heard as having spatial extent.
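The Halton-sequence band distribution described above can be sketched as follows. This is a hedged illustration: the helper names and the base-2 choice are assumptions, but it shows how a low-discrepancy sequence yields random-looking yet uniformly spread, mutually disjoint frequency-component sets for the different directions.

```python
def halton(index, base=2):
    """index-th element of the Halton (van der Corput) low-discrepancy
    sequence, a value in [0, 1)."""
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r

def assign_bands(n_bands=512, n_directions=9):
    """Assign each frequency band to one of n_directions via the Halton
    sequence: each direction receives a random-looking but uniformly
    spread set of bands, and no band is shared between directions."""
    assignment = {d: [] for d in range(n_directions)}
    for band in range(n_bands):
        direction = int(halton(band + 1) * n_directions)
        assignment[direction].append(band)
    return assignment
```

Because the sets are disjoint, the per-direction signals share no frequency components with each other, while each still roughly follows the spectral shape of the original signal.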
- an apparatus for generating a spatially extended audio signal configured to: analyse at least one audio signal to determine spectral content of the at least one audio signal; determine whether to spectrally extend the at least one audio signal based on the spectral content of the at least one audio signal, such that the at least one audio signal is to include a determined portion of frequencies above a defined frequency; and vertically spatially extend at least part of the at least one audio signal when the determined spectral content of the at least one audio signal is to be processed.
- the apparatus may be further configured to spectrally extend the at least one audio signal based on the spectral content of the at least one audio signal, such that the at least one audio signal includes a determined portion of frequencies above a defined frequency defined as a determined portion of energy of the audio signal above a defined frequency, wherein the at least part of the at least one audio signal is at least part of the spectrally extended at least one audio signal.
- the apparatus may be further configured to divide the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal, wherein the at least part of the at least one audio signal is the first part of the at least one audio signal.
- the first frequency value may be 3 kHz.
- the apparatus configured to spectrally extend the at least one audio signal based on the spectral content of the at least one audio signal may be configured to apply at least one of: add content with a frequency above the first frequency value to the at least one audio signal; apply spectral band replication to the at least one audio signal; select content from lower frequencies of the at least one audio signal, transpose the content to higher frequencies, and match a harmonic structure of the signal; add noise with a frequency above the first frequency value to the at least one audio signal; and apply a spectral tilt which amplifies higher frequencies above the first frequency value to the at least one audio signal.
- the apparatus configured to divide the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal may be configured to divide the at least one audio signal into the second part comprising the at least one audio signal without spectral extensions and the first part comprising the at least one audio signal with spectral extensions.
- the apparatus configured to divide the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal may be configured to divide the at least one audio signal using a 3 dB per octave mixing filter with a 50/50 centre-point at a determined frequency value, wherein the first part comprises a high pass filter version of the mixing filter and the second part comprises a low pass filter version of the mixing filter.
- the method may further comprise: horizontally spatially extending at least part of the at least one audio signal; and combining the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal to generate at least one spatially extended audio signal comprising horizontal and vertical spatial extent.
- the method may further comprise at least one of: receiving the at least one audio signal from a microphone; and generating the at least one audio signal in a synthetic sound generator.
- Analysing at least one audio signal to determine spectral content of the at least one audio signal may further comprise: determining a first energy content of the at least one audio signal below a first frequency value; and determining a second energy content of the at least one audio signal above the first frequency value.
- the first frequency value may be 3 kHz.
- Spectrally extending the at least one audio signal based on the spectral content of the at least one audio signal may comprise at least one of: adding content with a frequency above the first frequency value to the at least one audio signal; applying spectral band replication to the at least one audio signal; selecting content from lower frequencies of the at least one audio signal, transposing the content to higher frequencies, and matching a harmonic structure of the signal; adding noise with a frequency above the first frequency value to the at least one audio signal; and applying a spectral tilt which amplifies higher frequencies above the first frequency value to the at least one audio signal.
- Dividing the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal may comprise dividing the at least one audio signal using a 3 dB per octave mixing filter with a 50/50 centre-point at a determined frequency value, wherein the first part comprises a high pass filter version of the mixing filter and the second part comprises a low pass filter version of the mixing filter.
- Combining the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal to generate at least one spatially extended audio signal comprising vertically spatial extent may comprise mixing the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal based on a user input defining a head-pose parameter.
- the user input may define a head-pose parameter set of yaw, pitch and roll wherein mixing the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal based on a user input defining a head-pose parameter may comprise: analysing the at least one audio signal to determine whether there is elevation extent with a spectral band extension; updating a source position for mixing using all of the head-pose parameters yaw, pitch and roll parameters based on determining there is elevation extent and controlling the mixing based on the updated source position; updating a source position for mixing using only head-pose yaw parameter based on determining there is no elevation extent; determining whether the source position needs to be perceivable; controlling the mixing based on the updated positions based on the source actual position being determined as not being needed to be perceivable; and controlling the mixing based on a source position before updating based on the source actual position being determined as being needed to be perceivable.
- the method may further comprise receiving a user input for controlling the vertically spatially extending of the second part of the at least one audio signal.
- Analysing at least one audio signal to determine spectral content of the at least one audio signal may comprise: analysing at least one audio signal to determine spectral content of the at least one audio signal; and storing a result of the analysis as metadata associated with the at least one audio signal prior to the spectrally extending of the at least one audio signal based on the metadata spectral content of the at least one audio signal.
- an apparatus for generating a spatially extended audio signal comprising: means for analysing at least one audio signal to determine spectral content of the at least one audio signal; means for determining whether to spectrally extend the at least one audio signal based on the spectral content of the at least one audio signal, such that the at least one audio signal is to include a determined portion of frequencies above a defined frequency; and means for vertically spatially extending at least part of the at least one audio signal when the determined spectral content of the at least one audio signal is to be processed.
- the apparatus may further comprise means for spectrally extending the at least one audio signal based on the spectral content of the at least one audio signal, such that the at least one audio signal includes a determined portion of frequencies above a defined frequency defined as a determined portion of energy of the audio signal above a defined frequency, wherein the at least part of the at least one audio signal is at least part of the spectrally extended at least one audio signal.
- the apparatus may further comprise means for dividing the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal, wherein the at least part of the at least one audio signal is the first part of the at least one audio signal.
- the apparatus may further comprise: means for horizontally spatially extending at least part of the at least one audio signal; and means for combining the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal to generate at least one spatially extended audio signal comprising horizontal and vertical spatial extent.
- the apparatus may further comprise at least one of: means for receiving the at least one audio signal from a microphone; and means for generating the at least one audio signal in a synthetic sound generator.
- Means for analysing at least one audio signal to determine spectral content of the at least one audio signal may further comprise: means for determining a first energy content of the at least one audio signal below a first frequency value; and means for determining a second energy content of the at least one audio signal above the first frequency value.
- the first frequency value may be 3 kHz.
- the means for spectrally extending the at least one audio signal based on the spectral content of the at least one audio signal may comprise at least one of: means for adding content with a frequency above the first frequency value to the at least one audio signal; means for applying spectral band replication to the at least one audio signal; means for selecting content from lower frequencies of the at least one audio signal, means for transposing the content to higher frequencies, and means for matching a harmonic structure of the signal; means for adding noise with a frequency above the first frequency value to the at least one audio signal; and means for applying a spectral tilt which amplifies higher frequencies above the first frequency value to the at least one audio signal.
- the means for dividing the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal may comprise means for dividing the at least one audio signal into the second part below a determined frequency value and the first part above the determined frequency value.
- the means for dividing the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal may comprise means for dividing the at least one audio signal into the second part comprising the at least one audio signal without spectral extensions and the first part comprising the at least one audio signal with spectral extensions.
- the means for dividing the at least one audio signal into a first part and a second part based on the spectral content of the at least one audio signal may comprise means for dividing the at least one audio signal using a 3 dB per octave mixing filter with a 50/50 centre-point at a determined frequency value, wherein the first part comprises a high pass filter version of the mixing filter and the second part comprises a low pass filter version of the mixing filter.
- the means for combining the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal to generate at least one spatially extended audio signal comprising vertically spatial extent may comprise means for mixing the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal based on a user input defining a head-pose parameter.
- the user input may define a head-pose parameter set of yaw, pitch and roll wherein the means for mixing the horizontally spatially extended at least part of the at least one audio signal and the vertically spatially extended at least part of the at least one audio signal based on a user input defining a head-pose parameter may comprise: means for analysing the at least one audio signal to determine whether there is elevation extent with a spectral band extension; means for updating a source position for mixing using all of the head-pose parameters yaw, pitch and roll parameters based on determining there is elevation extent and controlling the mixing based on the updated source position; means for updating a source position for mixing using only head-pose yaw parameter based on determining there is no elevation extent; means for determining whether the source position needs to be perceivable; means for controlling the mixing based on the updated positions based on the source actual position being determined as not being needed to be perceivable; and means for controlling the mixing based on a source position before updating based on the source actual position being determined as being needed to be perceivable.
- the apparatus may further comprise means for receiving a user input for controlling the vertically spatially extending of the second part of the at least one audio signal.
- the means for analysing at least one audio signal to determine spectral content of the at least one audio signal may comprise: means for analysing at least one audio signal to determine spectral content of the at least one audio signal; and means for storing a result of the analysis as metadata associated with the at least one audio signal prior to the spectrally extending of the at least one audio signal based on the metadata spectral content of the at least one audio signal.
- a computer program product stored on a medium may cause an apparatus to perform the method as described herein.
- An electronic device may comprise apparatus as described herein.
- a chipset may comprise apparatus as described herein.
- Figure 1 shows schematically an example system for spatial audio mixing featuring original and spatially extended audio signals in the horizontal and vertical domain according to some embodiments
- Figure 3 shows schematically an example spatially extending synthesizer implementation shown in figure 1 in further detail according to some embodiments
- Figure 4 shows schematically an example mixer control system for the mixer shown in figure 1 according to some embodiments
- Figure 5 shows the operation of the example mixer shown in figure 4.
- Figure 6 shows schematically an example device suitable for implementing the apparatus shown in Figures 1, 3 and 5.
- a conventional approach to the capturing and mixing of sound sources with respect to an audio background or environment audio field signal would be for a professional producer to utilize an external microphone (a close or Lavalier microphone worn by the user, or a microphone attached to an instrument or some other microphone) to capture audio signals close to the sound source, and further utilize a 'background' microphone or microphone array to capture an environmental audio signal. These signals or audio tracks may then be manually mixed to produce an output audio signal such that the produced sound features the sound source coming from an intended (though not necessarily the original) direction.
- the concepts as discussed in detail hereafter relate to a system that implements spatially extending synthesis, which ensures that perceivable spatial extent in the elevation domain is also created. Synthesis of elevation extent increases the immersive effect of sound compared to spatially extending only in the azimuth. However, elevation-based spatial extending requires a different approach than azimuth-based spatial extending, as the listener perceives elevation strongly only with high frequency content. Thus, the embodiments as discussed herein show a system in two parts: firstly, the system is configured to analyse the input signal to determine whether there is sufficient high frequency content present, and to generate high frequency content when necessary. Secondly, the system is configured to perform the spatially extending synthesis in the elevation domain as well as the azimuth domain.
- With respect to figure 1 is shown an example system for generating a vertically and horizontally spatially extended audio signal according to some embodiments.
- the system in some embodiments may comprise an audio signal input 101.
- the audio signal input 101 is configured to receive or generate a mono audio signal.
- the mono audio signal may be one from a microphone such as an external microphone.
- the external microphone may be any microphone external or separate to a microphone array (for example a Lavalier microphone) which may capture a spatial audio signal.
- the external microphones can be worn/carried by persons or mounted as close-up microphones for instruments or a microphone in some relevant location which the designer wishes to capture accurately.
- a Lavalier microphone typically comprises a small microphone worn around the ear or otherwise close to the mouth.
- the audio signal may be provided either by a Lavalier microphone or by an internal microphone system of the instrument (e.g., pick-up microphones in the case of an electric guitar) or an internal audio output (e.g., an electric keyboard output).
- the close microphone may be configured to output the captured audio signals to a mixer.
- the external microphone may be connected to a transmitter unit (not shown), which wirelessly transmits the audio signal to a receiver unit (not shown).
- the positions of the external microphones, and thus of the performers and/or the instruments being played, may be tracked by using position tags located on or associated with the microphone source.
- the external microphone comprises or is associated with a microphone position tag.
- the microphone position tag may be configured to transmit a radio signal such that an associated receiver may determine information identifying the position or location of the close microphone. It is important to note that microphones worn by people can be moved freely in the acoustic space, and a system supporting location sensing of wearable microphones has to support continuous sensing of the user or microphone location.
- the close microphone position tag may be configured to output this signal to a position tracker.
- HAIP high accuracy indoor positioning
- the system comprises a spectral content analyser 103.
- the spectral content analyser 103 may be configured to receive the audio signal (for example from the microphone).
- the spectral content analyser 103 may be configured to analyse the audio signal for its spectral content distribution.
- the spectral content analyser 103 may be configured to check how much of the signal energy is located above a 3 kHz frequency boundary compared to the signal energy below the boundary.
- the frequency boundary may be any suitable frequency.
- the spectral content analyser 103 is configured to determine whether the energy content of the audio signal above the boundary is greater than a determined threshold value and control a spectral band extender 105, a vertical signal selector 107 and the horizontal and vertical spatially extending synthesizers 109, 111.
- the determined threshold value may be, for example, that at least 10% of the signal energy is located above 3 kHz. In some embodiments any other suitable threshold may be used, and the threshold may be adjustable or adjusted depending on a user or sound engineer's preferences.
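The spectral content analysis described above (comparing the signal energy above and below the 3 kHz boundary against a roughly 10% threshold) could be sketched as follows. The direct DFT, the function names and the parameter values are illustrative assumptions, not the patent's implementation.

```python
import cmath
import math

def energy_above_ratio(signal, fs, f_split=3000.0):
    """Fraction of the signal energy located above f_split, computed
    with a direct DFT over one analysis frame."""
    n = len(signal)
    hi = lo = 0.0
    for k in range(n // 2 + 1):
        X = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        if k * fs / n >= f_split:
            hi += abs(X) ** 2
        else:
            lo += abs(X) ** 2
    return hi / (hi + lo)

def needs_extension(signal, fs, threshold=0.10):
    """Spectral band extension is needed when less than `threshold`
    (e.g. 10%) of the energy sits above the 3 kHz boundary."""
    return energy_above_ratio(signal, fs) < threshold
```

A 500 Hz tone would trigger extension under this test, while a 5 kHz tone would not, mirroring the analyser's control of the spectral band extender.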
- the audio signal can be used as is without any spectral extending.
- the spectral content analyser is configured to control the spectral band extender 105.
- the system comprises a spectral band extender 105.
- the spectral band extender 105 may be configured to receive the audio signal and furthermore receive a control input from the spectral content analyser 103.
- the spectral band extender 105 in some embodiments may be configured to add high (or higher) frequency spectral content to the audio signal in order to create energy for the vertical spatially extending synthesis.
- the spectral band extender 105 is configured to add harmonic distortion to the signal with a specific distortion effect. However, to some listeners' ears this can be perceived as annoying.
- the spectral band extender 105 is configured to apply specific spectral bandwidth extension methods such as spectral band replication (SBR).
- SBR spectral band replication
- any suitable spectral bandwidth extension method may be implemented, such as for example those shown in Larsen, E.; Aarts, R.: Audio Bandwidth Extension, 2004; and Larsen, E.; Aarts, R.; Danessis, M.: Efficient high-frequency bandwidth extension of music and speech. These methods are normally used, for example, in audio coding to reduce the data rate used for high frequencies.
- the implementations create energy in the higher frequency regions by picking or selecting content from lower frequencies, transposing the content to higher frequencies (i.e., pitch shifting or moving it in the spectral domain), and matching the assumed harmonic structure of the signal.
- where the signal is more noise-like, noise is added instead of or in addition to the harmonic signal.
- These spectral bandwidth extension processes may be performed until the specified threshold for energy is met. In some circumstances this spectral bandwidth extension process can generate artefacts in the output audio signal.
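A minimal sketch of such a transposition-based extension, assuming a simple one-octave-up copy (bin k to bin 2k) of the band just below the split frequency. Real SBR-style methods shape the spectral envelope and noise floor far more carefully; the function names, gain and bin mapping here are illustrative assumptions.

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def extend_band(signal, fs, f_split=3000.0, gain=0.5):
    """Transpose content from the octave below f_split up one octave
    (bin k -> bin 2k), keeping conjugate symmetry so the output stays
    real. Octave transposition preserves harmonic relationships for
    harmonic input signals."""
    n = len(signal)
    X = dft(signal)
    k_split = int(f_split * n / fs)
    for k in range(max(1, k_split // 2), k_split):
        kt = 2 * k                        # one octave up
        if kt < n // 2:
            X[kt] += gain * X[k]          # positive-frequency copy
            X[n - kt] += gain * X[n - k]  # mirrored negative-frequency copy
    return idft(X)
```

Applied to a 2 kHz tone, this sketch adds a weaker component at 4 kHz, i.e. new energy above the 3 kHz boundary for the vertical extension to use.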
- the vertical signal selector 107 is configured to select or filter the audio signals and pass the selected audio signals to the horizontal spatially extending synthesizer (H-Spatially Extending synthesizer) 109 and to the vertical spatially extending synthesizer (V-Spatially Extending synthesizer) 111.
- the vertical signal selector 107 is configured such that, as the lower frequencies are generally important for azimuth perception, more of the lower-frequency energy is selected for the horizontal extension.
- the vertical signal selector 107 is configured such that, as higher frequencies are important for elevation perception, the higher frequency energy is selected for the vertical extension.
- the vertical signal selector 107 is configured to divide the audio signal using the 3 kHz border with a wide crossover bandwidth.
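The gentle division around 3 kHz can be sketched with complementary power weights that are 50/50 at the centre frequency and fall off at about 3 dB per octave far from it, in the spirit of the mixing filter described elsewhere in the document. The particular first-order power split used here is an assumption.

```python
def crossover_weights(freq, f_c=3000.0):
    """Complementary power weights with a 50/50 centre-point at f_c.
    Far from f_c each weight halves per octave (about 3 dB/octave),
    and the two weights always sum to one (energy preserving)."""
    r = freq / f_c
    lo = 1.0 / (1.0 + r)   # power weight for the horizontal (low) part
    hi = r / (1.0 + r)     # power weight for the vertical (high) part
    return lo, hi
```

Splitting a spectrum then amounts to scaling each bin's amplitude by the square root of the corresponding power weight, so more low-frequency energy goes to the horizontal extension and more high-frequency energy to the vertical extension.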
- the system may further comprise a spatial mixer 113.
- the spatial mixer 113 is configured to receive the horizontally (or azimuth) spatially extended signal 110 and vertically (or elevation) spatially extended signal 112 and generate a horizontally and vertically (azimuth and elevation) spatially extended audio signal. The operation of the spatial mixer 113 is shown in further detail later.
- the spectral content of the audio signal is analysed as shown in figure 2 by step 203.
- the vertical audio signal output may be spatially extended by the application of the vertical spatially extending synthesizer as shown in figure 2 by step 209.
- the horizontal spatially extended audio signal and the vertical spatially extended audio signal may then be combined using separate horizontal and vertical sources as shown in figure 2 by step 211.
- the spatially extending synthesizer may further comprise an object position input/determiner 402.
- the object position input/determiner 402 may be configured to determine the spatial position of sound sources. This information may be determined in some embodiments by a sound object processor.
- the spatially extending synthesizer may further comprise a series of multipliers.
- In figure 3 is shown one multiplier for each frequency band.
- the series of multipliers comprises multipliers 407₁ to 407₉; however, any suitable number of multipliers may be used.
- Each frequency domain band signal may be multiplied in the multiplier 407 with the determined VBAP gains.
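The per-band VBAP gain multiplication can be illustrated for a two-loudspeaker (2-D) pair. `vbap_pair_gains` and `pan_band` are hypothetical helper names, and the normalisation to unit energy follows common VBAP practice rather than anything stated in the patent.

```python
import math

def vbap_pair_gains(source_az, spk1_az, spk2_az):
    """2-D vector base amplitude panning: solve g1*l1 + g2*l2 (proportional
    to the source direction p) for the gains, then normalise so that
    g1**2 + g2**2 = 1. All angles are in radians."""
    p = (math.cos(source_az), math.sin(source_az))
    l1 = (math.cos(spk1_az), math.sin(spk1_az))
    l2 = (math.cos(spk2_az), math.sin(spk2_az))
    det = l1[0] * l2[1] - l1[1] * l2[0]
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

def pan_band(band_signal, gains):
    """Multiply one frequency-band signal by its direction's gains,
    producing one output channel per loudspeaker."""
    return [[g * s for s in band_signal] for g in gains]
```

Each frequency-band signal from the Halton assignment would be panned to its own direction this way before the per-channel sums are formed.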
- a spatial mixer may be implemented in some embodiments by combining the audio signals using a simple mixer. However, in some situations, for example those implementing a binaural representation where a normal head-tracking operation is used to change the directions of the signals, the vertically spatially extended signal may become the horizontally extended signal and vice versa. This happens when the elevation or tilt (also called pitch and roll) of the head is non-zero, and especially when either is ±90 degrees. In these situations the output audio signals may be incorrect.
- the system may comprise a head-locking mixer in which the direction compensation driven by head-tracking does not affect the extended signal when the orientation changes.
- Figure 4 shows the example spectral mixer comprising a spectral band extension determiner 501.
- the source position is updated using the yaw, pitch and roll parameters from the headpose and the mixing is controlled based on the updated position as shown in figure 5 by step 609 and the output as shown in figure 5 by step 610.
- the source position is updated using yaw parameters from the headpose as shown in figure 5 by step 611.
- the source is then analysed to determine whether the source actual position needs to be perceivable as shown in figure 5 by step 613.
- the mixing is controlled based on the updated positions and output as shown in figure 5 by step 614.
- the original signal position is updated using the yaw, pitch and roll parameters from the headpose and the mixing is controlled based on the updated positions as shown in figure 5 by step 615 and the output as shown in figure 5 by step 616.
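The Figure 5 decision flow described above might be sketched as follows, assuming a source direction represented as a unit vector and a Z-Y-X rotation convention; the function names and the rotation details are assumptions rather than the patent's implementation.

```python
import math

def rotate(v, yaw, pitch, roll):
    """Rotate a unit direction vector by the inverse head orientation so
    the source stays world-locked (Z-Y-X rotation order assumed)."""
    x, y, z = v
    c, s = math.cos(-yaw), math.sin(-yaw)        # inverse yaw (about z)
    x, y = c * x - s * y, s * x + c * y
    c, s = math.cos(-pitch), math.sin(-pitch)    # inverse pitch (about y)
    x, z = c * x + s * z, -s * x + c * z
    c, s = math.cos(-roll), math.sin(-roll)      # inverse roll (about x)
    y, z = c * y - s * z, s * y + c * z
    return (x, y, z)

def update_source_position(source, head_pose, has_elevation_extent,
                           must_be_perceivable):
    """Decision flow of the head-locking mixer: full yaw/pitch/roll
    compensation for sources with elevation extent (step 609); otherwise
    yaw-only compensation (step 611), unless the actual source position
    must stay perceivable, in which case the original position is
    updated with all three parameters (step 615)."""
    yaw, pitch, roll = head_pose
    if has_elevation_extent:
        return rotate(source, yaw, pitch, roll)
    if must_be_perceivable:
        return rotate(source, yaw, pitch, roll)
    return rotate(source, yaw, 0.0, 0.0)
```

With this sketch, a source without elevation extent ignores head pitch and roll, so tilting the head cannot swap the vertically and horizontally extended signals.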
- the system could receive a user input to control the spatially extending synthesis, for example where the user can decide whether to vertically spatially extend an audio signal and the extent of the vertical spatial extent. Furthermore, in some embodiments the user can also monitor the output signal, for example a binaurally rendered version, and determine how aggressive the vertical extent and/or high frequency content creation algorithms are. For example, the user can determine that less vertical extent (than the automatic algorithm produces) is enough, and then control the system manually, for example by forcing the system to use a less aggressive high frequency content creation scheme and/or by narrowing the extent.
- Where the parameters for controlling extent creation are dependent only on the input signal, it is possible to precompute the analysis beforehand and store it as separate metadata.
- This analysis may be integrated within an audio file input to the system, which can be very advantageous as the user can also tune the parameters beforehand.
- the parameters are fetched from the metadata and used to control the extent. Additionally, these parameters could be time dependent and thus change through time if the user so desires.
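- Such precomputed, time-dependent control could be sketched as follows; the keyframe format (time in seconds, vertical extent in degrees, high-frequency creation aggressiveness) and all names here are hypothetical, purely for illustration:

```python
import bisect

# hypothetical precomputed metadata:
# (time in seconds, vertical extent in degrees, high-frequency aggressiveness 0..1)
extent_metadata = [
    (0.0, 10.0, 0.2),
    (5.0, 30.0, 0.6),
    (12.0, 15.0, 0.3),
]

def extent_params(t):
    # fetch the extent-control parameters in effect at time t,
    # linearly interpolating between stored keyframes
    times = [m[0] for m in extent_metadata]
    i = bisect.bisect_right(times, t) - 1
    if i < 0:
        return extent_metadata[0][1:]
    if i >= len(extent_metadata) - 1:
        return extent_metadata[-1][1:]
    t0, e0, a0 = extent_metadata[i]
    t1, e1, a1 = extent_metadata[i + 1]
    w = (t - t0) / (t1 - t0)
    return (e0 + w * (e1 - e0), a0 + w * (a1 - a0))
```

Parameters between keyframes are linearly interpolated, so the extent can change smoothly through time as stored in the metadata.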
- Another example may be where the user wants to spatialize an electric guitar signal in a music mix for VR content. This is desired to be done in both the horizontal and vertical planes. As most of the spectral content of the guitar is below the 3 kHz mark, the system uses bandwidth extension to extend the frequencies for the vertical extension.
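- A crude sketch of this kind of bandwidth extension, assuming a simple spectral-band-replication scheme (the band below 3 kHz is copied up by one band width and attenuated); this illustrates the general idea only, not the specific scheme of the application:

```python
import numpy as np

def extend_bandwidth(x, fs, split_hz=3000.0, gain=0.3):
    # copy the spectral content below split_hz into the band just above it,
    # attenuated, so the signal gains high-frequency energy for elevation cues
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    k = np.searchsorted(freqs, split_hz)   # bin index of the split frequency
    Y = X.copy()
    hi = min(2 * k, len(X))
    Y[k:hi] += gain * X[:hi - k]           # replicate 0..split_hz into split_hz..2*split_hz
    return np.fft.irfft(Y, n=len(x))

fs = 48000
t = np.arange(4096) / fs
guitar_like = np.sin(2 * np.pi * 440 * t)  # stand-in for a low-band guitar signal
y = extend_bandwidth(guitar_like, fs)      # now has content above 3 kHz
```

The 440 Hz component is replicated around 3.4 kHz, giving the vertically extended copy energy in the range where elevation cues are perceived.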
- the device may be any suitable electronics device or apparatus.
- the device 1200 is a mobile device, user equipment, tablet computer, computer, audio playback apparatus, etc.
- the device 1200 may comprise a microphone 1201.
- the microphone 1201 may comprise a plurality (for example a number N) of microphones. However it is understood that there may be any suitable configuration of microphones and any suitable number of microphones.
- in some embodiments the microphone 1201 is separate from the apparatus and the audio signals are transmitted to the apparatus by a wired or wireless coupling.
- the microphone 1201 may in some embodiments be the microphone array as shown in the previous figures.
- the microphone may be a transducer configured to convert acoustic waves into suitable electrical audio signals.
- the microphone may be a solid-state microphone. In other words, the microphone may be capable of capturing audio signals and outputting a suitable digital format signal.
- the microphone 1201 can comprise any suitable microphone or audio capture means, for example a condenser microphone, capacitor microphone, electrostatic microphone, Electret condenser microphone, dynamic microphone, ribbon microphone, carbon microphone, piezoelectric microphone, or microelectrical-mechanical system (MEMS) microphone.
- the microphone can in some embodiments output the captured audio signal to an analogue-to-digital converter (ADC) 1203.
- the device 1200 may further comprise an analogue-to-digital converter 1203.
- the analogue-to-digital converter 1203 may be configured to receive the audio signals from the microphones 1201 and convert them into a format suitable for processing. In some embodiments, where the microphone is an integrated microphone, the analogue-to-digital converter is not required.
- the analogue-to-digital converter 1203 can be any suitable analogue-to-digital conversion or processing means.
- the analogue-to-digital converter 1203 may be configured to output the digital representations of the audio signal to a processor 1207 or to a memory 1211.
- the device 1200 comprises at least one processor or central processing unit 1207.
- the processor 1207 can be configured to execute various program codes such as the methods such as described herein.
- the device 1200 comprises a user interface 1205.
- the user interface 1205 can be coupled in some embodiments to the processor 1207.
- the processor 1207 can control the operation of the user interface 1205 and receive inputs from the user interface 1205.
- the user interface 1205 can enable a user to input commands to the device 1200, for example via a keypad.
- the user interface 1205 can enable the user to obtain information from the device 1200.
- the user interface 1205 may comprise a display configured to display information from the device 1200 to the user.
- the user interface 1205 can in some embodiments comprise a touch screen or touch interface capable of both enabling information to be entered to the device 1200 and further displaying information to the user of the device 1200.
- the user interface 1205 may be the user interface for communicating with the position determiner as described herein.
- the transceiver 1209 can communicate with further apparatus by any suitable known communications protocol.
- the transceiver 1209 or transceiver means can use a suitable universal mobile telecommunications system (UMTS) protocol, a wireless local area network (WLAN) protocol such as for example IEEE 802.X, a suitable short-range radio frequency communication protocol such as Bluetooth, or an infrared data communication pathway (IRDA).
- the device 1200 can comprise in some embodiments an audio subsystem output 1215.
- The example shown in Figure 6 shows the audio subsystem output 1215 as an output socket configured to enable a coupling with headphones 121.
- the audio subsystem output 1215 may be any suitable audio output or a connection to an audio output.
- the audio subsystem output 1215 may be a connection to a multichannel speaker system.
- the digital to analogue converter 1213 and audio subsystem 1215 may be implemented within a physically separate output device.
- the DAC 1213 and audio subsystem 1215 may be implemented as cordless earphones communicating with the device 1200 via the transceiver 1209.
- While the device 1200 is shown having audio capture, audio processing and audio rendering components, it would be understood that in some embodiments the device 1200 can comprise just some of these elements.
- the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
- some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto.
- While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
- Embodiments of the inventions may be practiced in various components such as integrated circuit modules.
- the design of integrated circuits is by and large a highly automated process.
- Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
Abstract
An apparatus for generating a spatially extended audio signal, the apparatus configured to: analyse at least one audio signal to determine a spectral distribution of the at least one audio signal; determine whether to spectrally extend the at least one audio signal based on the spectral distribution of the at least one audio signal, such that the at least one audio signal comprises a determined portion of frequencies above a defined frequency; and vertically spatially extend at least a part of the at least one audio signal when the determined spectral distribution of the at least one audio signal is to be processed.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1706287.8 | 2017-04-20 | ||
GB1706287.8A GB2561594A (en) | 2017-04-20 | 2017-04-20 | Spatially extending in the elevation domain by spectral extension |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018193161A1 true WO2018193161A1 (fr) | 2018-10-25 |
Family
ID=58795702
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FI2018/050274 WO2018193161A1 (fr) | 2017-04-20 | 2018-04-19 | Extension spatiale dans le domaine d'élévation par extension spectrale |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2561594A (fr) |
WO (1) | WO2018193161A1 (fr) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070127748A1 (en) * | 2003-08-11 | 2007-06-07 | Simon Carlile | Sound enhancement for hearing-impaired listeners |
WO2010086461A1 (fr) * | 2009-01-28 | 2010-08-05 | Dolby International Ab | Transposition améliorée d'harmonique |
US20100262427A1 (en) * | 2009-04-14 | 2010-10-14 | Qualcomm Incorporated | Low complexity spectral band replication (sbr) filterbanks |
US8804971B1 (en) * | 2013-04-30 | 2014-08-12 | Dolby International Ab | Hybrid encoding of higher frequency and downmixed low frequency content of multichannel audio |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8428269B1 (en) * | 2009-05-20 | 2013-04-23 | The United States Of America As Represented By The Secretary Of The Air Force | Head related transfer function (HRTF) enhancement for improved vertical-polar localization in spatial audio systems |
- JP2015163909A (ja) * | 2014-02-28 | 2015-09-10 | Fujitsu Ltd. | Sound reproduction device, sound reproduction method, and sound reproduction program |
-
2017
- 2017-04-20 GB GB1706287.8A patent/GB2561594A/en not_active Withdrawn
-
2018
- 2018-04-19 WO PCT/FI2018/050274 patent/WO2018193161A1/fr active Application Filing
Non-Patent Citations (1)
Title |
---|
HABIGT, T. ET AL.: "Enhancing 3D Audio Using Blind Bandwidth Extension", AES , 129TH CONVENTION, 4 November 2010 (2010-11-04) - 7 November 2010 (2010-11-07), San Francisco, CA , USA, pages 1 - 5, XP055548817, Retrieved from the Internet <URL:http:// mediatum.ub.tum.de/doc/1070615/597003.pdf> [retrieved on 20180830] * |
Also Published As
Publication number | Publication date |
---|---|
GB2561594A (en) | 2018-10-24 |
GB201706287D0 (en) | 2017-06-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18787261 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18787261 Country of ref document: EP Kind code of ref document: A1 |