
HK1188343A - Dynamic compensation of audio signals for improved perceived spectral imbalances - Google Patents


Info

Publication number
HK1188343A
Authority
HK
Hong Kong
Prior art keywords
audio signal
computer code
frequency
playback
frequency coefficients
Prior art date
Application number
HK14101405.1A
Other languages
Chinese (zh)
Other versions
HK1188343B (en
Inventor
M.沃尔什
E.斯特因
J-M.卓特
Original Assignee
Dts(英属维尔京群岛)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dts(英属维尔京群岛)有限公司 filed Critical Dts(英属维尔京群岛)有限公司
Publication of HK1188343A publication Critical patent/HK1188343A/en
Publication of HK1188343B publication Critical patent/HK1188343B/en

Description

Dynamic compensation of audio signals for improved perceived spectral imbalance
(Cross-reference to related applications)
This application claims the benefit of U.S. provisional application No.61/381831, filed on 9/10/2010, the entire contents of which are incorporated herein by reference.
Technical Field
The invention relates to a method for equalizing an audio signal for playback by using adaptive filtering.
Background
An audio signal may be described by its spectral balance or frequency response. When played in a playback device, the audio signal has an associated sound pressure level or "SPL". These two properties of the audio signal are logically independent: given a linear time-invariant reproduction system, changing the sound pressure level of an audio signal should not affect any objective measure of the spectral balance of the signal.
However, from a subjective psychoacoustic point of view, we observe that changes in sound pressure level produce significant changes in the perceived spectral balance of the signal. This is because the sensitivity of the human ear to differences in sound pressure level varies with frequency. For example, when we reduce the sound pressure level of an audio signal, the perceived loudness of low frequencies decreases at a rate much greater than mid-range frequencies.
This phenomenon can be described by equal loudness curves. Fig. 1 shows the equal loudness curves defined by ISO standard 226 (2003). Loudness level is measured in phons, where a loudness level of 1 phon is defined as that of a 1 kHz tone at a Sound Pressure Level (SPL) of 1 decibel (dB). Each curve in fig. 1 represents the SPL required across frequency to provide a consistent loudness level as perceived by an "average" individual. Fig. 1 shows 6 such curves, modeling loudness levels from the threshold of human hearing up to 100 phons in 20-phon increments. Note that, by the definition of the phon, a loudness level of 20 phons requires an SPL of 20dB at 1kHz, a loudness level of 40 phons requires an SPL of 40dB at 1kHz, and so on.
Loudness perception may also vary from person to person due to the environment and to physical attributes such as age-related hearing loss, also known as presbycusis. Fig. 2, adapted from data included in ISO standard 7029 (2000), shows how the hearing attenuation of "average" persons increases with age. The baseline is the hearing of an average 20 year old individual, represented by the line at 0 dB attenuation. As can be seen from fig. 2, on average, people 30 years old have only slightly worse hearing above about 1800Hz than people 20 years old. In contrast, on average, the hearing of a 60 year old person shows a significant decrease (hearing loss of more than 20 dB) for frequencies above 1000 Hz. Presbycusis is therefore particularly problematic at higher audible frequencies and is highly age-dependent.
Listeners often attempt to counteract the perceived loss of balance at high and low frequencies by applying an equalization function ("EQ") to their audio output. In the past, this EQ function was often applied by using a graphic equalizer to boost low and high frequencies, creating a smile shape across the octave-spaced sliders. Although the "smiley face" EQ complements the perceived spectrum well at low listening levels, it is generally applied regardless of sound pressure level. Thus, at higher sound pressure levels, the resulting equalized audio track may be perceived as too bass-heavy at low frequencies and too harsh at high frequencies.
Finally, audio that has been aggressively compressed for low bit rates using perceptual coding techniques (e.g., MP3) may be perceived as lacking clarity or sounding muffled as a result of the encoding process. This is often because higher frequencies have been filtered out to save bandwidth. Applying a high frequency EQ will not help in this situation, since the audio content is simply absent from the higher frequency bands.
The above-mentioned problems related to the perceived spectral balance of an audio signal played at a lower level can be summarized as follows:
The sensitivity of the human ear to differences in sound pressure level varies with frequency, producing a perceived spectral imbalance at lower listening levels.
Age-related hearing loss produces the perception of quieter high frequency content.
While applying a "smiley face" EQ curve can help correct the perceived spectral balance at lower listening levels, it can also overcompensate at higher listening levels (when less compensation is needed).
Lower bit rate perceptual audio coding can produce the perception of muffled audio.
Applying any type of high frequency EQ may not brighten low bit rate encoded material.
Disclosure of Invention
Various embodiments of the present invention address the above deficiencies of the prior art by dynamically compensating the played audio content for perceived spectral imbalance using a combination of SPL-dependent adaptive EQ, optional spectral bandwidth extension, and SPL-independent (but listener-dependent) EQ. As a result of continuous playback level and signal bandwidth analysis, the played audio is advantageously processed only when needed.
As mentioned above, human sensitivity to low frequencies (< 1000 Hz) differs from sensitivity to high frequencies, so that a reduction in output gain produces a much lower perceived bass level, often to the extent that bass frequencies are not audible at all when played at very low levels. SPL equalization works by continuously adapting the spectrum of the input audio signal to be output as a playback signal, such that the perceived spectral balance of the reproduction is maintained relative to the perceived spectral balance at some desired listening level. This is done by calculating the relative difference between the equal loudness curves generated for the desired listening level and for the actual listening level. The greater the difference between the desired and actual playback levels, the lower the perceived bass level and the greater the low frequency EQ required to restore the perceived balance of the lost bass. The basis for SPL equalization is known in the art, for example, as described in Holman et al., "Loudness Compensation: Use and Abuse", J. Audio Eng. Soc., vol. 26, pp. 526-536 (July-Aug. 1978). Various embodiments of the present invention modify this basic technique as explained in more detail below.
As shown in fig. 2, high frequency hearing loss produces a reduction in hearing acuity that grows as the frequency increases. To compensate for varying degrees of hearing impairment, we implement a listener dependent EQ based on the inverse of the trend described in fig. 2 rather than directly on samples of the audio signal. Thus, as the amount of compensation desired increases, we boost the high frequencies by a larger amount, starting at a lower cutoff frequency. The overall gain of the applied high frequency EQ also depends on the assumed actual playback level, to avoid applying too much high frequency boost at higher sound pressure levels, which would otherwise be perceived as disturbing or irritating.
Bandwidth extension techniques may be used in cases where listener dependent equalization is applied but has little audible effect because of limited high frequency content. Broadly speaking, typical audio bandwidth extension algorithms derive additional higher frequency audio content from existing lower frequency content by using techniques such as the nonlinear distortion described in Larsen et al., "Efficient High-Frequency Bandwidth Extension of Music and Speech," AES 112th Convention (May 2002) and the spectral band replication described in Dietz et al., "Spectral Band Replication, a Novel Approach in Audio Coding," AES 112th Convention (May 2002). To derive full benefit from the combination of bandwidth extension and loudness equalization, in some embodiments of the present invention, bandwidth extension is applied prior to high frequency loudness equalization. An optional bandwidth detection algorithm may be used to detect the amount of high frequency content present in the input signal so that bandwidth extension is only applied when needed.
Accordingly, in a first embodiment of the present invention, a method of equalizing an audio signal within a processing apparatus is provided. The method comprises the following steps: in a first process, frequency coefficients of a portion of an audio signal are divided into a plurality of subbands, where each subband includes one or more frequency coefficients. The method comprises the following steps: for one or more of the plurality of sub-bands, a processing device is used to perform a series of processes. First, the processing means determines at least one control signal magnitude based in part on (i) a predetermined control sound pressure level and (ii) frequency coefficients of one or more sub-bands. The processing device then determines at least one playback signal magnitude based in part on a control volume level of the playback device. The processing means then generates first equal loudness curve data based on the control signal magnitude. Then, the processing means generates second equal loudness curve data based on the playback signal magnitude. Once the curve is generated for a particular portion of the audio signal, the method continues by developing compensation data based on the first and second equal loudness curve data within one or more sub-bands and compensating for frequency coefficients of the portion of the audio signal by using the compensation data.
The associated method also includes transforming the compensated frequency coefficients within the sub-bands to produce an equalized audio signal that may be output to a playback device. The audio signal may comprise a plurality of sections and the steps of determining at least one control signal magnitude, determining at least one playback signal magnitude, generating first equal loudness curve data, generating second equal loudness curve data, developing compensation data and compensating for frequency coefficients of the sections may be repeated for each section. Generating the first equal loudness curve data (for an idealized listening setting) may include generating equal loudness curve data according to ISO226 for the control signal magnitude and normalizing the generated equal loudness curve data to have a gain of 0dB at 1 kHz. Similarly, generating second equal loudness curve data (for an idealized listening setting) may include generating equal loudness curve data according to ISO226 for playback signal size and normalizing the generated equal loudness curve data to have a gain of 0dB at 1 kHz.
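The normalization and comparison of the two equal loudness curves can be illustrated with a minimal Python sketch. It assumes that the compensation data is simply the per-frequency difference between the two curves after each has been shifted to 0 dB at 1 kHz; the function names are hypothetical, and the curve values are placeholders chosen for easy arithmetic, not data from ISO 226.

```python
def normalize_at_1khz(curve_db, freqs_hz):
    """Shift a curve so that its value at 1 kHz becomes 0 dB."""
    reference = curve_db[freqs_hz.index(1000)]
    return [value - reference for value in curve_db]


def compensation_gains_db(first_curve_db, second_curve_db, freqs_hz):
    """Per-frequency gain: normalized playback curve minus normalized control curve.

    Assumption: the compensation is the plain difference of the two
    normalized curves; the source does not spell out the exact formula.
    """
    first = normalize_at_1khz(first_curve_db, freqs_hz)
    second = normalize_at_1khz(second_curve_db, freqs_hz)
    return [s - f for s, f in zip(second, first)]


# Placeholder numbers for illustration only; NOT taken from ISO 226.
FREQS_HZ = [63, 125, 250, 500, 1000]
CONTROL_CURVE_DB = [95.0, 89.0, 85.0, 82.0, 80.0]   # first curve (louder level)
PLAYBACK_CURVE_DB = [85.0, 75.0, 67.0, 62.0, 60.0]  # second curve (quieter level)

GAINS_DB = compensation_gains_db(CONTROL_CURVE_DB, PLAYBACK_CURVE_DB, FREQS_HZ)
print(GAINS_DB)  # [10.0, 6.0, 2.0, 0.0, 0.0]
```

With these placeholder curves the gain is largest at the lowest frequency and falls to 0 dB at 1 kHz, which matches the qualitative behavior described for low frequency loudness compensation.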
With respect to these methods, the control level may be a peak level of a prescribed frequency occurring in the recording of the audio signal. Also, one or more sub-bands may be limited to frequencies below, for example, 1 kHz. Determining the compensation data may include deriving additional high frequency audio content from the low frequency audio content of the portion.
The method may further comprise: determining second compensation data based on the received data pertaining to the hearing characteristics of the listener; and increasing at least one of the frequency coefficients based on the second compensation data. In the extension method, increasing at least one of the frequency coefficients may be based in part on an assumed playback level. Also, determining the second compensation data may include calculating a boost level according to a function, and the data may have a predetermined maximum boost level.
A second embodiment provides a method for equalizing an audio signal for playback on a playback device. As before, the method includes dividing the audio signal into a plurality of subbands that include one or more frequency coefficients. The second method also requires dynamically adapting the frequency coefficients of one or more sub-bands based on the playback level of the playback device and the control sound pressure level. The method then calls for adapting the frequency coefficients of one or more of the plurality of sub-bands based on hearing loss data of the listener. Finally, the method requires transforming the adapted frequency coefficients into an equalized audio signal for playback on the playback device. According to the method, the dynamic adaptation and the adaptation for hearing loss result in an individualized and dynamically equalized audio signal that is close to the spectral balance of the audio signal at the time of control (mastering). It is assumed that the sound engineer who mastered the audio signal has excellent hearing acuity, so that the method provides a substantially equivalent listening experience for another individual.
In a related embodiment, dynamically adapting the audio magnitude of one or more sub-bands is limited to frequencies below 1 kHz. The dynamic adaptation may comprise four sub-processes for each sampling period of the audio signal. The first sub-process determines a desired signal magnitude at a predetermined frequency based in part on controlling the sound pressure level. The second sub-process is to determine at least one actual playback size based in part on a control volume adjustment of the playback device and a maximum sound pressure level of the playback device. The third sub-process is to generate equal loudness curve data based on the desired signal size and the actual playback size. The fourth sub-process is to apply equal loudness curve data to adapt one or more of the frequency coefficients.
In another related embodiment, the method further comprises adjusting the frequency coefficients based on the age of the user. Accordingly, the extended method includes receiving a user input identifying the age of the user. Adapting the one or more sub-bands based on the hearing loss data then comprises determining, based on the received age of the user, a function between a first frequency and a second frequency, and boosting frequency coefficients in one or more of the plurality of sub-bands based on the determined function. Adapting the subbands may also include receiving user input indicative of a variable of the function, such that the user modifies the function and causes an increase or decrease in the boost of one or more of the frequency coefficients.
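One way to picture an age-dependent boost function between a first and a second frequency is the Python sketch below. It is only an illustration: the ramp shape (linear on a logarithmic frequency axis), the function names, and every constant in `boost_parameters_for_age` are assumptions made for this example and are not derived from ISO 7029 or from the described embodiments.

```python
import math


def high_frequency_boost_db(freq_hz, first_hz, second_hz, max_boost_db):
    """Boost rising from 0 dB at first_hz to max_boost_db at second_hz.

    The ramp is linear on a log-frequency axis (an assumed shape) and is
    clamped to the predetermined maximum boost above second_hz.
    """
    if freq_hz <= first_hz:
        return 0.0
    if freq_hz >= second_hz:
        return max_boost_db
    fraction = math.log(freq_hz / first_hz) / math.log(second_hz / first_hz)
    return max_boost_db * fraction


def boost_parameters_for_age(age_years):
    """Hypothetical mapping from age to (first_hz, second_hz, max_boost_db).

    Older listeners get a lower starting frequency and a larger maximum
    boost. All constants are invented for illustration.
    """
    decades_past_20 = max(0, age_years - 20) / 10.0
    first_hz = max(1000.0, 8000.0 - 1500.0 * decades_past_20)
    max_boost_db = min(12.0, 3.0 * decades_past_20)
    return first_hz, 16000.0, max_boost_db


first_hz, second_hz, max_boost_db = boost_parameters_for_age(60)
for freq in (1000.0, 4000.0, 16000.0):
    print(freq, high_frequency_boost_db(freq, first_hz, second_hz, max_boost_db))
```

A user-adjustable variable of the function, as mentioned above, could be modeled by letting the user scale `max_boost_db` or shift `first_hz`.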
In another related embodiment, the method includes performing a hearing test by generating a series of frequency-based sounds for response by a user; adapting one or more of the plurality of sub-bands comprises determining a boost level for one or more of the frequency coefficients based on the user's response to the hearing test.
A third method for equalization of an audio signal is also provided. The method comprises the following steps: converting the audio signal to a digital representation; filtering the digital representation to dynamically adjust the audio signal based on data controlling sound pressure level and hearing characteristics pertaining to a given listener; and converting the filtered digital representation into a filtered audio signal for playback on a playback device.
There is also provided a computer program product comprising a non-transitory computer readable medium having computer code thereon for performing any or all of the above methods.
A system for equalization of an audio signal is also provided, wherein the audio signal is represented by frequency coefficients sampled at a plurality of sample times. The system comprises: a sound pressure level equalizer for (i) receiving an audio signal and (ii) dynamically adapting frequency coefficients for a sampling time based on a desired sound pressure level and an actual playback sound pressure level of the audio signal. The sound pressure level equalizer determines a frequency coefficient adjustment for adapting the frequency coefficient by using equal loudness curve data determined based on the actual playback sound pressure level and the desired sound pressure level. The system further comprises a listener dependent equalizer for adjusting the frequency content for the sampling time based on user input determining the hearing loss compensation data.
In a related embodiment, the system further includes a bandwidth detector for (i) detecting a bandwidth of the audio signal at each sample time based on the frequency coefficient for the sample time and (ii) outputting a bandwidth signal representative of the bandwidth. The related system also includes a logic switch for receiving the bandwidth signal and either (i) providing the audio signal to the bandwidth extension module if the bandwidth is determined to be below the predetermined frequency or (ii) bypassing the bandwidth extension module if the bandwidth is determined to be above the predetermined frequency for the sampling time. The bandwidth extension module adds additional frequency coefficients to the audio signal at frequencies above the predetermined bandwidth for a given sampling time based on information included in the audio signal.
The system may include a memory in communication with the listener dependent equalizer that includes a plurality of sets of listener dependent curve data and that provides particular listener dependent curve data to the listener dependent equalizer based on the user input. Similarly, the system further includes a memory in communication with the sound pressure level equalizer that includes a plurality of sets of equal loudness curve data and provides specific equal loudness curve data based on actual playback sound pressure levels or desired sound pressure levels. Finally, the system may include a hearing test module for producing a series of audible tones at different frequencies, receiving user input in response to the audible tones, and determining user-specific hearing data.
Drawings
The above features of the embodiments will be more readily understood by reference to the following detailed description, taken in conjunction with the accompanying drawings, in which,
fig. 1 shows an equal loudness curve defined by ISO standard 226 (2003);
fig. 2 shows a typical statistical distribution of age-dependent hearing thresholds adapted from data included in ISO standard 7029 (2000);
FIG. 3 shows the results of an equalization process performed by an embodiment of the present invention for filtering an input audio signal to produce an output audio signal;
fig. 4 is a block diagram representing an arrangement of functional blocks that may be used to dynamically equalize the loudness of an input audio signal in accordance with an embodiment of the invention;
fig. 5 is a flow chart representing an algorithm for equalizing loudness in both audible low and high frequencies in accordance with the embodiment of fig. 4;
FIG. 6 illustrates concepts related to calculating a dB offset between a peak signal level and an input audio signal level to dynamically equalize audible low frequencies, according to embodiments of the invention;
FIG. 7A shows how the concept of FIG. 6 can be applied to approximate control sounds;
FIG. 7B illustrates how the concepts of FIG. 6 can be applied to approximate the sound pressure level that a listener of an input audio signal wishes to hear;
fig. 8 shows fig. 2 modified to equalize the audio signal to compensate for the characteristics of the hearing loss of an individual listener according to an embodiment of the present invention.
Detailed Description
Definitions. As used in this specification and the appended claims, the following terms shall have the indicated meanings, unless the context requires otherwise:
A continuous (analog) audio signal may be digitally sampled at a "sampling frequency" to form a stream of digital data. Common sampling frequencies include: 44.1kHz for MPEG-1 audio including MP3; 48kHz used by various professional digital video standards such as SDI; and 96kHz for DVD audio, Blu-ray audio, and HD-DVD audio. The digital data represents a "sampling period", defined as the time between samples of the audio signal.
The digital data of the sampling period may be transformed from a time-based representation ("time domain") to a frequency-based representation ("frequency domain") by using well-known transforms such as the Discrete Cosine Transform (DCT). While the data values in the time domain may represent, for example, a sequence of voltage magnitudes, the data values in the frequency domain may represent the magnitudes of the frequencies present in the audio signal during a sampling period. Such data values in the frequency domain are referred to herein as "frequency coefficients".
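As a small illustration of how time domain samples become frequency coefficients, the following Python sketch implements an unnormalized DCT-II directly from its defining sum. The function name and the 8-sample block length are choices made for this example only; a practical implementation would use a fast transform rather than this quadratic-time loop.

```python
import math


def dct_ii(block):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi / N * (i + 0.5) * k)."""
    n = len(block)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(block))
        for k in range(n)
    ]


# A time domain block that contains exactly one DCT basis function (index 3).
N = 8
samples = [math.cos(math.pi / N * (i + 0.5) * 3) for i in range(N)]
coefficients = dct_ii(samples)

# All energy lands in coefficient 3; the other coefficients are numerically zero.
print([round(c, 6) for c in coefficients])
```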
Various embodiments of the present invention dynamically compensate audio content for perceived spectral imbalance by using a combination of a first process that is audio content dependent and a second process that is audio content independent. In the first process, an SPL-dependent EQ is adaptively applied to the audio signal to correct for the difference between the output SPL of the audio playback device and the SPL of the audio signal at an earlier time, preferably at the time of control (mastering). In the second process, a fixed equalization is applied to compensate for hearing characteristics of the listener, such as presbycusis, regardless of the specific SPL of the audio signal. Optionally, in a third process, the spectral bandwidth of the audio signal is extended to improve sound quality at higher frequencies before the listener dependent EQ is applied.
Fig. 3 shows the result of the equalization process performed by an embodiment of the present invention. The solid curve 301 represents a portion of the input audio signal in the frequency domain. The dashed curve 302 represents an optional bandwidth extension to the input audio signal 301. The dashed curve 303 represents the output audio signal produced by the embodiment. Note that due to the bandwidth extension 302, the output curve 303 extends to a higher frequency than the (non-extended) input signal 301.
The gap 304 on the left of the diagram represents the effect of SPL dependent filtering, which will be described more fully below in conjunction with FIGS. 4-7. In fig. 3, the gap represents a modest increase in SPL in some of the low frequencies used to dynamically compensate for the difference between the "studio" volume and the playback volume. The gap 305 on the right of the figure represents the effect of listener dependent (SPL independent) filtering, which will be described more fully below in connection with fig. 8. Listener dependent filtering is used to compensate mainly for hearing loss and other listener hearing characteristics independent of the input audio signal.
Although the curves 301, 303 shown in fig. 3 overlap significantly over the middle frequency range and differ over the low and high frequency ranges, this diagram is only used to illustrate the difference between SPL-dependent filtering and listener-dependent filtering. In particular, SPL-dependent filtering generally affects lower frequencies more than higher frequencies, and SPL-independent filtering generally affects higher frequencies more than lower frequencies. However, as described later, the two filtering effects may overlap over some or all of the audible spectrum, and fig. 3 should not be viewed as limiting the invention to non-overlapping filters.
A generalized diagram of the complete scheme outlined above is shown in fig. 4. To evaluate whether bandwidth extension is warranted, a bandwidth detection algorithm 402 is applied to the input audio signal 401. If warranted, an optional bandwidth extension 403 is applied, as shown by the dashed line. The bandwidth extension derives additional high frequency audio content from the low frequency audio content. The bandwidth extension algorithm may be one of many algorithms known in the art. For an excellent overview of published algorithms, reference is made to Larsen et al., Audio Bandwidth Extension: Application of Psychoacoustics, Signal Processing and Loudspeaker Design (Wiley, 2004). In other embodiments of the present invention, bandwidth extension is always performed, and in still other embodiments it is never performed.
Regardless of whether bandwidth extension 403 is applied, the signal is further processed through an SPL-dependent loudness equalization stage 404 and a listener-dependent loudness equalization stage 405. These stages apply separate equalization functions that are themselves functions of a predetermined difference between the assumed desired listening level and the actual listening level (assumed lower) in the SPL. The EQ curve may also be modified to be more or less active in the high and low frequency bands according to user preferences. The result of applying these equalization functions is an output audio signal 406 that can be supplied to a playback device for output. A control playback volume 410 from the playback device is used as an input to one or both of the equalization processes 404, 405.
Generally, the process of FIG. 4 may be implemented in a processing device or system comprising specialized hardware, computer hardware, software in the form of computer program code, or a combination thereof. As described above with respect to process 402, such a processing device may include a bandwidth detector for detecting a bandwidth of the input audio signal at each sample time based on the frequency coefficient for the sample time and outputting a bandwidth signal representative of the bandwidth. The processing device may also include a logic switch for receiving the bandwidth signal. The switch causes the audio signal to be provided to the bandwidth extension module if the bandwidth is determined to be below the predetermined frequency. The bandwidth extension module described above with respect to process 403 may add additional frequency coefficients to the audio signal at frequencies above the determined bandwidth based on information contained in the audio signal at the given sampling time. However, if the bandwidth is determined to be higher than the predetermined frequency for the sampling time, the switch causes the audio signal to bypass the bandwidth extension module.
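The bandwidth detector and logic switch described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: bandwidth is taken to be the upper edge of the highest frequency bin whose magnitude exceeds a noise floor, the -60 dB floor and all names are hypothetical, and `placeholder_extension` is a stand-in that performs no real bandwidth extension.

```python
def detect_bandwidth_hz(magnitudes_db, bin_width_hz, floor_db=-60.0):
    """Return the upper edge of the highest bin whose magnitude exceeds floor_db."""
    highest_active_bin = -1
    for index, magnitude in enumerate(magnitudes_db):
        if magnitude > floor_db:
            highest_active_bin = index
    return (highest_active_bin + 1) * bin_width_hz


def route_frame(magnitudes_db, bin_width_hz, cutoff_hz, extend):
    """Logic switch: extend the frame only if its bandwidth is below cutoff_hz.

    Returns the (possibly extended) frame and a flag telling which path was
    taken. A bandwidth exactly equal to cutoff_hz takes the bypass path here;
    the source does not specify that case.
    """
    bandwidth_hz = detect_bandwidth_hz(magnitudes_db, bin_width_hz)
    if bandwidth_hz < cutoff_hz:
        return extend(magnitudes_db), True
    return list(magnitudes_db), False


def placeholder_extension(magnitudes_db):
    """Stand-in only: a real module would synthesize high frequency content."""
    return list(magnitudes_db)


# Eight 3 kHz wide bins; values are illustrative magnitudes in dB.
NARROW_FRAME_DB = [-10.0, -20.0, -30.0, -90.0, -90.0, -90.0, -90.0, -90.0]
WIDE_FRAME_DB = [-10.0, -20.0, -30.0, -35.0, -40.0, -45.0, -50.0, -55.0]

print(route_frame(NARROW_FRAME_DB, 3000.0, 16000.0, placeholder_extension)[1])  # True
print(route_frame(WIDE_FRAME_DB, 3000.0, 16000.0, placeholder_extension)[1])    # False
```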
Systems utilizing the present invention may also include an SPL-dependent equalizer for receiving an audio signal and dynamically adapting the frequency coefficients of the sampling time based on a desired sound pressure level and an actual playback sound pressure level for the audio signal. The sound pressure equalizer determines a frequency coefficient adjustment for adapting the frequency coefficient by using equal loudness curve data determined based on the actual playback sound pressure level and the desired sound pressure level. The system may further comprise a listener dependent equalizer for adjusting the frequency content for the sampling time based on user input determining the hearing loss compensation data.
Such a system may be implemented by a memory in communication with a listener dependent equalizer that contains multiple sets of listener dependent curve data and provides the listener dependent equalizer with particular listener dependent curve data based on user input. Similarly, such a system may be implemented by a memory in communication with a sound pressure level equalizer containing multiple sets of equal loudness curve data and providing specific equal loudness curve data based on actual playback sound pressure levels or desired sound pressure levels. According to some alternative embodiments of the invention described below, the system may include a hearing test module for producing a series of audible tones at different frequencies, receiving user input in response to the audible tones, and determining user-specific hearing data. These data may include data pertaining to the equal loudness contours heard by the user or hearing loss data of the user, or both.
A flow chart for implementing loudness equalization in an embodiment of the present invention is shown in fig. 5. In short, the present embodiment works by determining the difference between the SPL of a target audio sample in the original environment, such as a control studio, and the maximum SPL of the environment. The difference is then replicated by generating an output signal for playback in the playback environment taking into account the playback device's own maximum SPL and any gain resulting from the master playback volume level.
We begin with assumed knowledge of the desired peak sound pressure level (e.g., the peak level of pink or brown noise played at the control level), the actual peak sound pressure level capability of the consumer's playback device, and the master volume level. This information can be obtained by any available means. For example, the peak control SPL may be encoded in the input audio data stream, or it may be manually entered into the playback device. As a non-limiting example, the peak control SPL may be determined by the recording engineer to be about 85dB SPL during recording of the audio signal. On the other hand, the peak SPL of the listener's playback device depends only on the capabilities of the device and is thus independent of any particular input audio signal. In one embodiment, the method of fig. 5 is performed within an amplifier or other device connected to an external speaker, and the peak SPL may be determined based on the hardware characteristics of the amplifier, including its power. In another embodiment, the method is performed within the playback device itself, for example in a laptop with integrated speakers, and thus the peak SPL of the playback device may be determined directly by querying a manufacturer preset value or by querying a database linking computer models with their speaker characteristics.
The method of fig. 5 begins at process 501, where a portion of an input audio signal is converted to a complex frequency domain representation using a 64-band oversampled polyphase analysis filterbank. Other types of filter banks may be used. A different number of filter banks may also be used. In the implementation described herein, the analysis filter bank extracts a block of 64 frequency-domain samples for each block of 64 time-domain input samples, thereby dividing the frequency coefficients to form a plurality of subbands.
In process 502, any known master volume gain applied to the input data is "cancelled". By doing so, we can better estimate the desired content-dependent control level. In process 503, the low frequency (< 1 kHz) spectrum is smoothed by averaging over time with, for example, a leaky integrator as known in the art.
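A leaky integrator of the kind mentioned for process 503 can be written as a one-pole recursive average, as in the Python sketch below. The smoothing constant `alpha` and the choice to initialize the state with the first frame are assumptions for illustration; the source does not give specific values.

```python
def leaky_integrator(frames, alpha=0.9):
    """Smooth each band over time: y[n] = alpha * y[n-1] + (1 - alpha) * x[n].

    frames is a sequence of per-band magnitude lists, one list per time step.
    The state starts at the first frame (an assumed initialization).
    """
    state = None
    smoothed = []
    for frame in frames:
        if state is None:
            state = list(frame)
        else:
            state = [alpha * s + (1.0 - alpha) * x for s, x in zip(state, frame)]
        smoothed.append(list(state))
    return smoothed


# One band whose magnitude steps from 0 to 1; the output rises gradually.
print(leaky_integrator([[0.0], [1.0], [1.0]], alpha=0.5))  # [[0.0], [0.5], [0.75]]
```

A larger `alpha` gives a longer memory and a smoother, slower-reacting estimate of the low frequency spectrum.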
In process 504, the desired content-dependent level is estimated by deriving the average low frequency magnitude of the current frame of data and calculating its offset from the assumed peak or "full-scale" magnitude. The effect of this process 504 is shown visually in fig. 6. The spectrum of a particular portion of the input audio signal is shown as curve 601. The low frequency spectrum of the portion is defined by the frequencies up to a cut-off frequency, which in this case is 1 kHz. The average magnitude 602 of these frequencies is the output of process 503. Fig. 6 also shows an assumed peak control SPL 603. The purpose of process 504 is to determine the size of the gap 604 between the low frequency average 602 and the peak control SPL 603 for a given portion of the audio signal.
Fig. 7A provides more detail of the implementation of this process. Fig. 7A shows a portion of the frequency spectrum of an input audio signal 601, a low frequency average magnitude 602, and a hypothetical peak control SPL 603. Process 504 assigns a value of "M" dB SPL to peak control SPL603 and assigns a value of "X" dB SPL to difference 604. Thus, a "desired" control level occurs over (M-X) dB SPL. The value of X may be determined by subtracting the low frequency average 602 from the assumed peak size 603.
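The calculation of X can be sketched as below. The function name and the sample band levels are hypothetical, and the sketch assumes the low frequency average is taken over per-band levels already expressed in dB; the text does not specify whether averaging is done on a linear or a logarithmic scale.

```python
import numpy as np

def control_level_offset(band_levels_db, band_freqs_hz, peak_control_db, cutoff_hz=1000.0):
    """Return X = M - (average level of the bands below the cut-off frequency)."""
    low = band_freqs_hz < cutoff_hz
    avg_low_db = float(np.mean(band_levels_db[low]))
    return peak_control_db - avg_low_db

# Hypothetical band centre frequencies and smoothed levels for one frame.
freqs = np.array([125.0, 250.0, 500.0, 2000.0])
levels = np.array([70.0, 72.0, 74.0, 60.0])

x_db = control_level_offset(levels, freqs, peak_control_db=85.0)
desired_control_db = 85.0 - x_db  # (M - X) dB SPL
```

With these illustrative numbers the three bands below 1 kHz average 72 dB, so X is 13 dB and the desired control level is 72 dB SPL.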
The desired playback SPL 701 is determined from the value of X, as shown in fig. 7B and now described. First, the peak playback device SPL 702 is assigned a value of "P" dB SPL, and any master volume gain 703 applied at the time of playback is assigned a value of "V" dB SPL. Note that the playback device peak SPL (P dB) is generally higher than the control peak SPL (M dB). The desired effective sound pressure level 701 of the output signal is calculated as (P-X-V) dB SPL. Thus, the desired output audio signal level 701 is selected such that, when it is boosted by the master volume gain 703, it sounds X dB below the maximum output level 702 of the playback device. The effect of these calculations is that the audio signal sounds "quieter" than the associated peak SPL by the same amount (X dB, element 604 in fig. 7A and 7B), both in an idealized recording studio and on the listener's playback device.
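As a worked example of the (P-X-V) calculation, the sketch below uses hypothetical values for P, X, and V; only the formula itself comes from the description.

```python
def desired_playback_spl(peak_playback_db, offset_db, master_volume_db):
    """Desired output level 701, computed as (P - X - V) dB SPL."""
    return peak_playback_db - offset_db - master_volume_db

# Hypothetical values: P = 95 dB SPL, X = 13 dB, V = 6 dB of master volume gain.
level_701 = desired_playback_spl(95.0, 13.0, 6.0)

# After the master volume gain is applied, the signal sits X dB below the device peak.
after_volume = level_701 + 6.0
```

Here the desired level is 76 dB SPL, and after the 6 dB master volume gain the signal reaches 82 dB SPL, which is 13 dB below the assumed device peak of 95 dB SPL.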
However, as described above, the sensitivity of the human ear to differences in sound pressure levels varies with frequency, creating a perceived spectral imbalance at lower listening levels. Simply reducing the sound pressure level equally across all frequencies according to these equations (e.g., by reducing the frequency coefficients equally in each of the various frequency bins) therefore produces an incorrect perceived spectral balance. This is advantageously avoided in the illustrated embodiment by applying processes 505-507.
Thus, returning to FIG. 5, in process 505, equal loudness curve data is generated for the desired SPL and the playback SPL, represented by (M-X) dB SPL and (P-X-V) dB SPL in FIGS. 7A and 7B. The generation of equal loudness curve data is generally done by referring to ISO 226, cited above. Data for sound pressure levels between the standard levels can be calculated, for example, by interpolation. However, in some embodiments, the processing device may be equipped with an equal loudness testing module that directly tests each listener's hearing. This alternative embodiment is able to produce an equal loudness curve that perfectly matches how a given listener perceives equal loudness, avoiding the use of the comparatively coarse standardized ISO data. Such an embodiment may be provided with different listener profiles, where each profile contains data relating to hearing characteristics of different listeners.
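Interpolation between tabulated levels might look like the following sketch. The function name is hypothetical, linear interpolation is only one possible choice, and the table values are toy numbers used to show the mechanics; they are not ISO 226 data.

```python
import numpy as np

def interpolate_contour(level_db, table_levels_db, table_contours_db):
    """Linearly interpolate, per frequency, between tabulated equal loudness contours.

    table_levels_db must be in increasing order; table_contours_db has one row
    per tabulated level and one column per frequency.
    """
    levels = np.asarray(table_levels_db, dtype=float)
    contours = np.asarray(table_contours_db, dtype=float)
    return np.array([np.interp(level_db, levels, contours[:, k])
                     for k in range(contours.shape[1])])

# Toy table (NOT ISO 226 values): two levels, two frequencies.
toy_levels = [40.0, 60.0]
toy_contours = [[50.0, 40.0],
                [66.0, 60.0]]

contour_50 = interpolate_contour(50.0, toy_levels, toy_contours)
```

The same routine would be called twice in process 505, once for the (M-X) level and once for the (P-X-V) level. Note that `np.interp` holds the end values for levels outside the tabulated range rather than extrapolating.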
In process 506, the values of the equal loudness contours are normalized to have a 0dB gain at 1 kHz. This process may be performed by scaling calculations known in the art. Also, in process 506, audio signal compensation data (e.g., frequency coefficients for each frequency bin) in the form of EQ values is developed based on the two equal loudness curves. In one embodiment, this is done by computing the difference (in dB) of the normalized equal loudness curves across each frequency bin. The EQ values resulting from process 506 are then transformed from a logarithmic decibel scale to a linear scale for direct application to the audio signal in process 507. These values now represent the desired linear EQ so that the audio played on the consumer's device has the same perceived low frequency balance heard at the control level.
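Process 506 can be sketched as follows. The function names and contour values are illustrative, and the sign convention (playback contour minus control contour, so that a steeper contour at the lower playback level produces a low frequency boost) is an assumption; the text only states that a difference of the normalized curves is taken.

```python
import numpy as np

def normalize_at_1khz(contour_db, freqs_hz):
    """Shift a contour so that its value at 1 kHz is 0 dB."""
    ref = np.interp(1000.0, freqs_hz, contour_db)
    return contour_db - ref

def eq_gains_linear(control_contour_db, playback_contour_db, freqs_hz):
    """Per-bin EQ from two contours: difference in dB, then dB -> linear."""
    control_norm = normalize_at_1khz(control_contour_db, freqs_hz)
    playback_norm = normalize_at_1khz(playback_contour_db, freqs_hz)
    eq_db = playback_norm - control_norm  # assumed sign convention
    return 10.0 ** (eq_db / 20.0)

# Toy contours (NOT ISO 226 values) sampled at 100 Hz and 1 kHz.
freqs = np.array([100.0, 1000.0])
control = np.array([80.0, 72.0])
playback = np.array([66.0, 52.0])

gains = eq_gains_linear(control, playback, freqs)
```

With these toy numbers the 100 Hz bin receives a 6 dB boost (a linear gain of about 2) and the 1 kHz bin is left at unity gain, as the normalization requires.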
The above adjustments are made dynamically in response to the input audio signal for the purpose of producing an output audio signal that is perceived by a listener with perfect hearing as being suitably loud. However, not all listeners have perfect hearing. Therefore, we now turn to the listener dependent EQ determined in process 508.
Referring to fig. 8, the listener dependent EQ is based on a straight-line graph that can be adjusted by the listener. The nature of this line mimics the curvilinear behavior required to compensate for the listener's hearing impairment and generally operates to boost sound levels at higher frequencies. Thus, for a 20 year old person with perfect hearing, no compensation is needed or applied. For a 30 year old person, a straight line curve 801 may be applied.
The EQ curve may be limited such that it has a maximum boost level 802 (e.g., 12 dB) and a minimum gain of 0 dB. For a 40 year old person, the EQ curve 803 may be applied up to the frequency at which it intersects the maximum gain line 802, after which a flat 12 dB gain is applied along the curve 802 for higher frequencies. For a 50 year old person, curve 804 may be applied in this manner along with a portion of curve 802. Also, for a 60 year old person, curve 805 and curve 802 may be applied together.
While the curves 801, 803-805 in fig. 8 are based on the ISO standard, the EQ curve characteristics can also be modified to be more or less aggressive by using user parameters that modify the frequency intercept and slope of the EQ. Thus, the straight line curve may be adjusted for a given hearing loss characteristic of the listener. Alternatively, the processing device may receive a user input identifying the age of the listener and calculate an appropriate curve based on the received age.
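A clamped straight-line boost of this kind can be sketched as below. The parameter names `start_hz` and `slope_db_per_octave` are hypothetical stand-ins for the frequency intercept and slope, the logarithmic frequency axis is an assumption, and the numeric values are not derived from any age-related ISO data.

```python
import numpy as np

def listener_eq_db(freqs_hz, start_hz, slope_db_per_octave, max_boost_db=12.0):
    """Straight-line boost above start_hz, limited to the range [0, max_boost_db] dB."""
    octaves_above = np.log2(np.asarray(freqs_hz, dtype=float) / start_hz)
    return np.clip(slope_db_per_octave * octaves_above, 0.0, max_boost_db)

# Hypothetical curve: boost starts at 2 kHz and rises 6 dB per octave.
boost = listener_eq_db([1000.0, 4000.0, 16000.0],
                       start_hz=2000.0, slope_db_per_octave=6.0)
```

For these values the boost is 0 dB at 1 kHz (below the intercept), 6 dB at 4 kHz, and 12 dB at 16 kHz, where the line would otherwise reach 18 dB but is held at the maximum boost level. Lowering `start_hz` or raising the slope makes the curve more aggressive.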
To achieve greater accuracy, the processing device may be equipped with a hearing loss test module to determine the exact hearing loss characteristics of the listener in a manner similar to determining the equal loudness hearing characteristics of the listener. The module performs a hearing test by producing a series of sounds at a given frequency to which the user responds when the sounds become audible. The EQ curve is then based on the user's response to the hearing test. Similarly, the processing means may comprise a series of listener profiles each containing hearing loss data relating to a particular listener.
Referring back to FIG. 5, in process 509, the SPL-dependent and listener-dependent compensation curves are combined to form combined compensation data. To avoid applying too high a gain at higher listening levels, the EQ curve is also subject to a frequency-independent gain that is a function of the assumed listening level. In process 510, the frequency coefficients of the input samples are compensated by using the combined compensation data. Thus, according to methods well known in the art, EQ (in the frequency domain) is applied to an input audio signal to produce an output audio signal. Generally, application of EQ involves increasing at least one of the frequency coefficients based on listener dependent compensation data. Finally, in process 511, the resulting complex band coefficients are recombined and transformed into a time-domain equalized block of output samples using a 64-band synthesis bank or equivalent frequency-to-time domain filter. The processes of fig. 5 may be repeated for each block of input samples. The equalized audio signal may then be output to a playback device for playback.
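Processes 509 and 510 can be sketched as follows. Combining the two curves by adding them in dB is an assumption, as are the function and parameter names; the text does not state how the curves are combined or how the frequency-independent gain is derived from the listening level.

```python
import numpy as np

def apply_combined_eq(coeffs, spl_eq_db, listener_eq_db, level_gain_db=0.0):
    """Add the two EQ curves and a frequency-independent gain in dB,
    convert to linear scale, and scale the complex subband coefficients."""
    total_db = np.asarray(spl_eq_db) + np.asarray(listener_eq_db) + level_gain_db
    return coeffs * 10.0 ** (total_db / 20.0)

# Hypothetical two-band example.
coeffs = np.array([1.0 + 1.0j, 0.0 + 2.0j])
spl_eq = np.array([6.0, 0.0])        # SPL-dependent EQ in dB
listener_eq = np.array([0.0, 6.0])   # listener-dependent EQ in dB

equalized = apply_combined_eq(coeffs, spl_eq, listener_eq)
attenuated = apply_combined_eq(coeffs, spl_eq, listener_eq, level_gain_db=-6.0)
```

Because the gains are real and positive, only the magnitudes of the coefficients change and their phases are preserved. In the second call the -6 dB frequency-independent gain exactly offsets the 6 dB boost in each band, leaving the coefficients unchanged.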
The embodiments of the invention described above are intended to be exemplary only; numerous variations and modifications will become apparent to those skilled in the art. All such variations and modifications are intended to be within the scope of the present invention as defined by any appended claims.
It should be noted that logic flow diagrams are used herein to describe aspects of the invention and should not be construed to limit the invention to any particular logic flow or logic implementation. The described logic may be partitioned into different logic blocks (e.g., programs, modules, functions, or subroutines) without changing the overall results or otherwise departing from the true scope of the invention. Oftentimes, logic elements may be added, modified, omitted, performed in a different order, or implemented using different logic constructs (e.g., logic gates, looping primitives, conditional logic, and other logic constructs) without changing the overall results or otherwise departing from the true scope of the invention.
The invention can be embodied in many different forms including, but not limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an application specific integrated circuit, ASIC), or any other means including any combination thereof.
Computer program logic implementing all or a portion of the functionality described herein may be embodied in various forms including, but in no way limited to, source code forms, computer executable forms, and various intermediate forms (e.g., forms produced by an assembler, compiler, linker, or locator). The source code may include a series of computer program instructions implemented in any of a variety of programming languages (e.g., object code, assembly language, or a high-level language such as Fortran, C++, JAVA, or HTML) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in computer-executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into computer-executable form.
The computer program and any programmable logic may be fixed in any form (e.g., source code form, computer executable form, or intermediate form) in a non-transitory storage medium such as a semiconductor memory device (e.g., RAM, ROM, PROM, EEPROM, or flash programmable RAM), a magnetic memory device (e.g., disk or fixed disk), an optical memory device (e.g., CD-ROM), a PC card (e.g., PCMCIA card), or other memory device. The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink-wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over a communication system (e.g., the internet or world wide web).
Hardware logic (including programmable logic for use with programmable logic devices) that implements all or a portion of the functions described herein may be designed, captured, simulated, or recorded electronically, using conventional manual methods, or using various tools such as computer-aided design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL).

Claims (45)

1. A method of equalizing an audio signal within a processing device, the method comprising:
dividing frequency coefficients of a portion of an audio signal into a plurality of sub-bands, wherein each sub-band comprises one or more frequency coefficients;
for one or more of the plurality of subbands, using a processing device to:
a) determining at least one control signal magnitude based in part on (i) a predetermined control sound pressure level and (ii) frequency coefficients of one or more sub-bands;
b) determining at least one playback signal magnitude based in part on a master volume level of a playback device;
c) generating first equal loudness curve data based on the control signal magnitude; and
d) generating second equal loudness curve data based on the playback signal magnitude;
developing compensation data based on the first and second equal loudness curve data within the one or more sub-bands; and
compensating frequency coefficients of the portion of the audio signal using the compensation data.
2. The method of claim 1, further comprising:
transforming the compensated frequency coefficients within the sub-bands to produce an equalized audio signal.
3. The method of claim 2, further comprising:
outputting the equalized audio signal to a playback device.
4. The method of claim 1, wherein the audio signal comprises a plurality of portions, the method further comprising:
repeating, for each of the plurality of portions, the steps of determining at least one control signal magnitude, determining at least one playback signal magnitude, generating first equal loudness curve data, generating second equal loudness curve data, developing compensation data, and compensating frequency coefficients of the portion.
5. The method of claim 1, wherein generating first equal loudness curve data comprises:
generating equal loudness curve data according to ISO 226 for the control signal magnitude; and
normalizing the generated equal loudness curve data to have a gain of 0 dB at 1 kHz.
6. The method of claim 1, wherein generating second equal loudness curve data comprises:
generating equal loudness curve data according to ISO 226 for the playback signal magnitude; and
normalizing the generated equal loudness curve data to have a gain of 0 dB at 1 kHz.
7. The method of claim 1, wherein the control level is a peak level of a prescribed frequency occurring during recording of the audio signal.
8. The method of claim 1, wherein one or more sub-bands are limited to frequencies below 1 kHz.
9. The method of claim 1, wherein determining the compensation data comprises deriving additional high frequency audio content from the low frequency audio content of the portion.
10. The method of claim 1, further comprising:
determining second compensation data based on the received data pertaining to the hearing characteristics of the listener; and
increasing at least one of the frequency coefficients based on the second compensation data.
11. The method of claim 10, wherein increasing at least one of the frequency coefficients is based in part on an assumed playback level.
12. The method of claim 10, wherein determining second compensation data comprises calculating a boost level according to a function.
13. The method of claim 12, wherein the second compensation data has a predetermined maximum boost level.
14. A method for equalizing an audio signal for playback on a playback device, the method comprising:
dividing the audio signal into a plurality of sub-bands containing one or more frequency coefficients;
dynamically adapting frequency coefficients of one or more sub-bands based on a playback level of the playback device and the control sound pressure level;
adapting frequency coefficients of one or more of the plurality of sub-bands based on hearing loss data of a listener;
transforming the adapted frequency coefficients into an equalized audio signal for playback on a playback device,
wherein the dynamic adaptation and the adaptation for hearing loss results in an individualized and dynamically equalized audio signal approaching the spectral balance of the audio signal when controlled.
15. The method of claim 14, wherein dynamically adapting the audio magnitude of one or more subbands is limited to frequencies below 1 kHz.
16. The method of claim 14, wherein dynamically adapting the audio magnitude comprises:
for each sampling period of the audio signal:
determining a desired signal magnitude at a predetermined frequency based in part on the control sound pressure level;
determining at least one actual playback magnitude based in part on a master volume adjustment of the playback device and a maximum sound pressure level of the playback device;
generating equal loudness curve data based on the desired signal magnitude and the actual playback magnitude; and
applying the equal loudness curve data to adapt one or more of the frequency coefficients.
17. The method of claim 14, further comprising:
receiving a user input identifying an age of a user;
wherein adapting one or more of the plurality of sub-bands based on the hearing loss data comprises:
determining a function between the first and second frequencies, wherein at least the first frequency and the function are based on the received age of the user; and
boosting frequency coefficients in one or more of the plurality of sub-bands based on the determined function.
18. The method of claim 17, wherein adapting one or more of the plurality of subbands comprises:
receiving user input representing a variable of the function, and,
wherein the user input modifies the function and causes an increase or decrease in the boost of at least one of the frequency coefficients.
19. The method of claim 14, further comprising:
performing a hearing test by generating a series of frequency-based sounds for response by a user, and,
wherein adapting one or more of the plurality of sub-bands comprises determining a boost level for one or more of the frequency coefficients based on the user's response to the hearing test.
20. A method for equalizing an audio signal, the method comprising:
converting the audio signal to a digital representation;
filtering the digital representation to dynamically adjust the audio signal based on both a control sound pressure level and data pertaining to hearing characteristics of a given listener; and
converting the filtered digital representation into a filtered audio signal for playback on a playback device.
21. A computer program product comprising a non-transitory computer-readable medium having thereon computer code for equalizing an audio signal, the computer code comprising:
computer code for dividing frequency coefficients of a portion of an audio signal into a plurality of sub-bands, wherein each sub-band comprises one or more frequency coefficients;
for one or more of the plurality of subbands, computer code for:
a) determining at least one control signal magnitude based in part on (i) a predetermined control sound pressure level and (ii) frequency coefficients of one or more sub-bands;
b) determining at least one playback signal magnitude based in part on a master volume level of a playback device;
c) generating first equal loudness curve data based on the control signal magnitude; and
d) generating second equal loudness curve data based on the playback signal magnitude;
computer code for developing compensation data based on the first and second equal loudness curve data within the one or more sub-bands; and
computer code for compensating frequency coefficients of a portion of the audio signal using compensation data.
22. The computer program product of claim 21, further comprising:
computer code for transforming the compensated frequency coefficients within the sub-bands to produce an equalized audio signal.
23. The computer program product of claim 22, further comprising:
computer code for outputting the equalized audio signal to a playback device.
24. The computer program product of claim 21, wherein the audio signal comprises a plurality of portions, the computer program product further comprising:
computer code for repeating the determining at least one control signal magnitude, the determining at least one playback signal magnitude, the generating first equal loudness curve data, the generating second equal loudness curve data, the developing compensation data, and the compensating for frequency coefficients of the portion for each of the plurality of portions.
25. A computer program product according to claim 21 wherein the computer code for generating first equal loudness curve data comprises:
computer code for generating equal loudness curve data according to ISO226 for a control signal magnitude; and
computer code for normalizing the generated equal loudness curve data to have a gain of 0dB at 1 kHz.
26. A computer program product according to claim 21 wherein the computer code for generating second equal loudness curve data comprises:
computer code for obtaining equal loudness curve data according to ISO 226 for a playback signal magnitude; and
computer code for normalizing the obtained equal loudness curve data to have a gain of 0dB at 1 kHz.
27. The computer program product of claim 21, wherein the control level is a peak level of a specified frequency occurring during recording of the audio signal.
28. The computer program product of claim 21, wherein one or more sub-bands are limited to frequencies below 1 kHz.
29. A computer program product according to claim 21 wherein the computer code for determining compensation data includes computer code for deriving additional high frequency audio content from the low frequency audio content of the portion.
30. The computer program product of claim 21, further comprising:
computer code for determining second compensation data based on the received data pertaining to the hearing characteristics of the listener; and
computer code for increasing at least one of the frequency coefficients based on the second compensation data.
31. The computer program product of claim 30, wherein the computer code for increasing at least one of the frequency coefficients is based in part on an assumed playback level.
32. A computer program product according to claim 30, wherein the computer code for determining second compensation data comprises computer code for calculating a level of boost between the first frequency and the second frequency according to a function.
33. The computer program product of claim 32, wherein the second compensation data has a predetermined maximum boost level.
34. A computer program product comprising a non-transitory computer-readable medium having thereon computer code for equalizing an audio signal for playback on a playback device, the computer code comprising:
computer code for dividing the audio signal into a plurality of sub-bands containing one or more frequency coefficients;
computer code for dynamically adapting frequency coefficients of one or more sub-bands based on a playback level of the playback device and a control sound pressure level;
computer code for adapting frequency coefficients of one or more of the plurality of sub-bands based on hearing loss data of a listener;
computer code for transforming the adapted frequency coefficients into an equalized audio signal for playback on a playback device;
wherein the computer code for dynamic adaptation and the computer code for hearing loss adaptation result in an individualized and dynamically equalized audio signal that approaches the spectral balance of the audio signal when controlled.
35. The computer program product of claim 34, wherein the computer code for dynamically adapting the audio magnitude of the one or more sub-bands is limited to frequencies below 1 kHz.
36. The computer program product of claim 34, wherein the computer code for dynamically adapting the audio magnitude comprises:
computer code for performing the following for each sample period of the audio signal:
determining a desired signal magnitude at a predetermined frequency based in part on the control sound pressure level;
determining at least one actual playback magnitude based in part on any master volume adjustment of the playback device and a maximum sound pressure level of the playback device;
generating equal loudness curve data based on the desired signal magnitude and the actual playback magnitude; and
applying the equal loudness curve data to adapt one or more of the frequency coefficients.
37. The computer program product of claim 34, further comprising:
computer code for receiving a user input identifying an age of a user, wherein adapting one or more of a plurality of sub-bands based on age-related hearing loss data comprises:
computer code for determining a function between the first and second frequencies, wherein at least the first frequency and the function are based on the received age of the user; and
computer code for boosting frequency coefficients in one or more of the plurality of subbands based on the determined function.
38. The computer program product of claim 37, wherein the computer code for adapting one or more of the plurality of subbands comprises:
computer code for receiving user input representing variables of a function, and,
wherein the user input modifies the function and causes an increase or decrease in the boost of at least one of the frequency coefficients.
39. The computer program product of claim 34, further comprising:
computer code for performing a hearing test by producing a series of frequency-based sounds for response by a user, and,
wherein the computer code for adapting one or more of the plurality of sub-bands includes computer code for determining a boost level for one or more of the frequency coefficients based on the user's response to the hearing test.
40. A computer program product comprising a non-transitory computer-readable medium having thereon computer code for equalizing an audio signal, the computer code comprising:
computer code for converting the audio signal into a digital representation;
computer code for filtering the digital representation to dynamically adjust the audio signal based on both a control sound pressure level and data pertaining to hearing characteristics of a given listener; and
computer code for converting the filtered digital representation into a filtered audio signal for playback on a playback device.
41. A system for equalizing an audio signal, wherein the audio signal is represented by frequency coefficients sampled at a plurality of sample times, the system comprising:
a sound pressure level equalizer for (i) receiving the audio signal and (ii) dynamically adapting the frequency coefficients for the sampling time based on an actual playback sound pressure level and a desired sound pressure level of the audio signal, wherein the sound pressure level equalizer determines frequency coefficient adjustments for adapting the frequency coefficients using equal loudness curve data determined based on the actual playback sound pressure level and the desired sound pressure level; and
a listener dependent equalizer for adjusting the frequency content for the sampling time based on user input determining hearing loss compensation data.
42. The system of claim 41, further comprising:
a bandwidth detector for (i) detecting a bandwidth of the audio signal at each sampling time based on the frequency coefficient for the sampling time and (ii) outputting a bandwidth signal representing the bandwidth;
a logic switch to receive the bandwidth signal and to provide, for a sample time, (i) an audio signal to the bandwidth extension module if the bandwidth is determined to be below a predetermined frequency or (ii) to bypass the bandwidth extension module if the bandwidth is determined to be above the predetermined frequency;
the bandwidth extension module adds additional frequency coefficients to the audio signal at frequencies above the determined bandwidth based on information contained within the audio signal for a given sampling time.
43. The system of claim 41, further comprising:
a memory in communication with the listener dependent equalizer, the memory containing a plurality of sets of listener dependent curve data and providing the listener dependent equalizer with particular listener dependent curve data based on the user input.
44. The system of claim 41, further comprising:
a memory in communication with the sound pressure level equalizer, the memory including a plurality of sets of equal loudness curve data and providing specific equal loudness curve data based on an actual playback sound pressure level or a desired sound pressure level.
45. The system of claim 41, further comprising:
a hearing test module for producing a series of audible tones at different frequencies, receiving user input in response to the audible tones, and determining hearing data specific to the user.
HK14101405.1A 2010-09-10 2011-09-08 Dynamic compensation of audio signals for improved perceived spectral imbalances HK1188343B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US61/381,831 2010-09-10

Publications (2)

Publication Number Publication Date
HK1188343A (en) 2014-04-25
HK1188343B HK1188343B (en) 2017-10-06
