Disclosure of Invention
The invention provides an audio data stream compression method based on OFDM modulation, which solves the problem in the prior art that using multi-channel audio coding to reduce dynamic range compression distortion in high dynamic range audio tends to cause considerable inter-channel crosstalk.
The invention is realized by the following technical scheme:
An audio data stream compression method based on OFDM modulation, the method comprising:
Step S1, sampling a high dynamic range part of an original audio signal to obtain a sampling signal, preprocessing the sampling signal by noise removal and quantization conversion to preliminarily reduce its dynamic range compression distortion, and dividing the sampling signal into a plurality of channels by multi-channel coding for further distortion reduction;
Step S2, dividing the sampling signal of each channel into a plurality of frames, processing the sampling signal frame by frame using a discrete cosine transform (DCT), and outputting DCT coefficients per channel; dividing the original audio signal into a plurality of subcarrier signals placed in the frequency domain using a frequency division multiplexing modulation method; and setting a quantization allocation strategy for randomly allocating all DCT coefficients to all subcarrier signals;
Step S3, performing an inverse discrete Fourier transform on the DCT coefficients carried on the subcarrier signals, so as to convert the frequency domain data of the original audio signal represented by the DCT coefficients into time domain OFDM symbols, adding a cyclic prefix to each OFDM symbol, and performing a second multi-channel coding in which each OFDM symbol is coded independently;
Step S4, presetting a reference audio signal representing a specification and setting a matching reference crosstalk attenuation ratio, converting the time domain OFDM symbols back into DCT coefficients representing frequency domain data using a frequency division multiplexing demodulation method, applying an inverse discrete cosine transform to the DCT coefficients, and marking the result as a corrected audio signal;
Step S5, calculating a corrected crosstalk attenuation ratio of the corrected audio signal and comparing it with the reference crosstalk attenuation ratio, judging that the signal has been corrected when the corrected crosstalk attenuation ratio is smaller than the reference crosstalk attenuation ratio, and re-executing step S2 when the corrected crosstalk attenuation ratio is larger than the reference crosstalk attenuation ratio.
Owing to limitations of coding techniques, the dynamic range of an audio signal is compressed during coding, causing a loss of detail and layering in the audio. The prior art mainly uses multi-channel audio coding to reduce this dynamic range compression distortion. However, existing multi-channel audio coding methods are prone to inter-channel crosstalk in high dynamic range audio: the audio contents of the channels affect one another during encoding or decoding, so that originally independent channel information is mixed together. The resulting signal aliasing and inter-channel interference noticeably degrade the spatial positioning, spatial impression, clarity and overall sound quality of the audio. Existing multi-channel audio coding technology lacks a method for reducing dynamic range compression distortion that also significantly reduces inter-channel crosstalk during high dynamic range processing. On this basis, the invention provides an audio data stream compression method based on OFDM modulation to solve this problem.
Further, as a possible implementation, the frequency division multiplexing modulation method comprises: applying a frequency domain window function to the subcarrier signals to smooth the subcarrier spectra and dynamically adjust the subcarrier spacing; obtaining the channel frequency response of each subcarrier signal by frequency domain analysis; identifying, from the channel frequency response, the spectrum spreading caused by multipath effects; converting all subcarrier signals to the frequency domain using a DFT; performing frequency domain filtering on each subcarrier signal; and compensating the filtered spectrum data of the subcarrier signals for the spectrum spreading caused by multipath effects.
Further, as a possible implementation manner, the process of using the frequency domain window function includes applying a Blackman window function to the frequency domain data of each subcarrier for windowing, adjusting the intervals of the subcarriers based on the characteristics of the Blackman window function, setting a width critical value for the main lobe width of the Blackman window, and when the window function width of each subcarrier is lower than the width critical value, doubling the subcarrier interval width of the subcarrier signal.
Further, as a possible embodiment, the sample index of the window function of a subcarrier is denoted as n, the width of the window function is denoted as N, the Blackman window function is denoted as ω, its value at the n-th sample point is denoted as ω(n), the constant term is denoted as A, the first coefficient is denoted as B, and the second coefficient is denoted as C,
The calculation formula of the Blackman window function is set as: $\omega(n) = A - B\cos\left(\frac{2\pi n}{N-1}\right) + C\cos\left(\frac{4\pi n}{N-1}\right)$, $0 \le n \le N-1$,
wherein the constant term A is used to ensure that the value of the window function is not equal to zero at the start and end points of the window,
The first coefficient B and the second coefficient C are used for adjusting the main lobe and side lobe characteristics of the window function.
Further, the process of compensating for spectrum spreading further comprises setting a compensation filter based on the channel frequency response, the compensation filter being used to adjust the spectrum so as to reduce distortion.
Further, the setting method of the width critical value is to construct a linear relation between the spectrum efficiency and the width critical value, and dynamically adjust the width critical value according to the spectrum efficiency.
Further, the corrected crosstalk attenuation ratio of the corrected audio signal is set to be a ratio between the crosstalk amplitude of the corrected audio signal and the main lobe amplitude of the corrected audio signal.
Further, the reference crosstalk attenuation ratio is calculated based on a spectral leakage value representing a component size of a desired spectral component of the original audio signal that is spread to a non-target frequency and a crosstalk component value representing an energy amount caused by occurrence of overlapping interference of the original audio signal.
Further, the spectral leakage value is set as a ratio of side lobe energy of the original audio signal to main lobe energy of the reference audio signal, and the crosstalk component value is set as a ratio of total crosstalk component energy of the original audio signal to total signal energy of the reference audio signal.
Further, the length of the cyclic prefix is dynamically adjusted according to the multipath effects and delay spread of the channel.
Compared with the prior art, the invention has the following advantages and beneficial effects: frequency division multiplexing modulation is adopted within multi-channel coding, and the signal is divided into a plurality of subcarriers and further processed using frequency domain DCT coefficients, so that dynamic range compression and optimization can be performed independently on each channel; a second, independent multi-channel coding is performed on each OFDM symbol, which reduces the crosstalk caused by inconsistent coding. Inter-channel crosstalk is thereby effectively controlled when multi-channel audio coding is used.
Detailed Description
For the purpose of making apparent the objects, technical solutions and advantages of the present invention, the present invention will be further described in detail with reference to the following examples and the accompanying drawings, wherein the exemplary embodiments of the present invention and the descriptions thereof are for illustrating the present invention only and are not to be construed as limiting the present invention.
Examples
As shown in fig. 1, the present embodiment is an audio data stream compression method based on OFDM modulation, which solves the problem in the prior art that using multi-channel audio coding to reduce dynamic range compression distortion in high dynamic range audio tends to cause considerable inter-channel crosstalk.
The invention is realized by the following technical scheme:
An audio data stream compression method based on OFDM modulation, the method comprising:
Step S1, sampling a high dynamic range part of an original audio signal to obtain a sampling signal, preprocessing the sampling signal by noise removal and quantization conversion to preliminarily reduce its dynamic range compression distortion, and dividing the sampling signal into a plurality of channels by multi-channel coding for further distortion reduction;
Step S2, dividing the sampling signal of each channel into a plurality of frames, processing the sampling signal frame by frame using a discrete cosine transform (DCT), and outputting DCT coefficients per channel; dividing the original audio signal into a plurality of subcarrier signals placed in the frequency domain using a frequency division multiplexing modulation method; and setting a quantization allocation strategy for randomly allocating all DCT coefficients to all subcarrier signals;
Step S3, performing an inverse discrete Fourier transform on the DCT coefficients carried on the subcarrier signals, so as to convert the frequency domain data of the original audio signal represented by the DCT coefficients into time domain OFDM symbols, adding a cyclic prefix to each OFDM symbol, and performing a second multi-channel coding in which each OFDM symbol is coded independently;
Step S4, presetting a reference audio signal representing a specification and setting a matching reference crosstalk attenuation ratio, converting the time domain OFDM symbols back into DCT coefficients representing frequency domain data using a frequency division multiplexing demodulation method, applying an inverse discrete cosine transform to the DCT coefficients, and marking the result as a corrected audio signal;
Step S5, calculating a corrected crosstalk attenuation ratio of the corrected audio signal and comparing it with the reference crosstalk attenuation ratio, judging that the signal has been corrected when the corrected crosstalk attenuation ratio is smaller than the reference crosstalk attenuation ratio, and re-executing step S2 when the corrected crosstalk attenuation ratio is larger than the reference crosstalk attenuation ratio.
The original audio signal is the audio signal to be processed, and multi-channel coding is the main way of reducing distortion in the prior art. The sampling signal is processed frame by frame with the discrete cosine transform (DCT) so that the different frequency components of the audio signal can be processed independently. Because the transform treats the spectral characteristics of the signal separately, it helps to reduce crosstalk between frequency components during encoding and thus inter-channel interference. Frame-by-frame DCT processing also allows the data of each frame to be optimized independently, so the crosstalk of each frame can be handled separately and the separation between channels can be optimized at the frame level. Dividing the original audio signal into a plurality of subcarrier signals arranged in the frequency domain by frequency division multiplexing modulation allows several signals to be transmitted simultaneously within the same spectrum resource, each occupying a different frequency band. This improves spectrum utilization, allowing the system to transmit more information within a limited bandwidth while keeping each channel signal independent. The frequency domain data of the original audio signal represented by the DCT coefficients is converted into time domain OFDM symbols in order to reduce frequency domain interference: because a time domain signal is processed and transmitted differently from a frequency domain one, interference and crosstalk that arise in the frequency domain can be mitigated after the conversion.
The inverse OFDM and inverse DCT operations are performed to convert the time domain signal back to the frequency domain and recover the original audio signal, which can effectively correct the interference caused by the crosstalk in the transmission process and further reduce the crosstalk between the multiple channels through proper demodulation and decoding steps. The reference audio signal refers to a standard audio signal for comparing, calibrating or evaluating the quality of other audio signals, i.e. an audio signal representing an industry specification.
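The modulation and demodulation round trip described above can be illustrated with a minimal sketch. It assumes an ideal, noise-free channel and a fixed cyclic prefix length, and the function names (idft, dft, ofdm_modulate, ofdm_demodulate) are hypothetical names chosen for illustration; it is not the claimed implementation.

```python
import cmath

def idft(freq):
    """Inverse DFT: frequency domain values -> time domain samples."""
    n_pts = len(freq)
    return [sum(freq[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k in range(n_pts)) / n_pts
            for n in range(n_pts)]

def dft(time):
    """Forward DFT: time domain samples -> frequency domain values."""
    n_pts = len(time)
    return [sum(time[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]

def ofdm_modulate(coeffs, cp_len):
    """Build one time domain OFDM symbol and prepend its cyclic prefix."""
    symbol = idft(coeffs)
    prefix = symbol[-cp_len:] if cp_len > 0 else []
    return prefix + symbol

def ofdm_demodulate(tx, cp_len):
    """Drop the cyclic prefix and return the frequency domain values."""
    return dft(tx[cp_len:])

# Assumed example data: eight DCT coefficients carried on eight subcarriers.
coeffs = [1.0, -0.5, 0.25, 0.0, 0.75, -0.125, 0.0, 0.5]
tx = ofdm_modulate(coeffs, cp_len=2)
recovered = [value.real for value in ofdm_demodulate(tx, cp_len=2)]
```

Over an ideal channel the recovered values match the transmitted coefficients up to floating point error; the cyclic prefix is simply a copy of the last samples of the symbol.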
The high dynamic range portion of the original audio signal is sampled, the sampled signal is pre-processed for noise removal and quantization conversion to reduce dynamic range compression distortion, and multi-channel coding is used to divide the sampled signal into multiple channels to further reduce distortion. The audio signal is cleaned and optimized for more efficient subsequent processing. Sampling of the high dynamic range portion ensures that the details of the audio signal are preserved. Noise removal and quantization conversion help to improve signal quality, and multi-channel coding provides more flexibility for subsequent processing. The sampled signal of each channel is divided into frames, the DCT is applied frame by frame and DCT coefficients are output for converting the audio signal from the time domain to the frequency domain, typically for data compression and removing redundant information. The division of the frames helps to gradually process the audio data, and improves the processing accuracy and efficiency.
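As an illustration of the framing and frame-by-frame DCT described above, the following sketch splits one channel into fixed-length frames and applies an orthonormal DCT-II to each frame. The frame length, the zero padding of the last frame and the orthonormal scaling are assumptions made for the example, and the function names are hypothetical.

```python
import math

def split_frames(samples, frame_len):
    """Split samples into consecutive frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame += [0.0] * (frame_len - len(frame))
        frames.append(frame)
    return frames

def dct2(frame):
    """Orthonormal DCT-II of one frame (time domain -> DCT coefficients)."""
    size = len(frame)
    coeffs = []
    for k in range(size):
        total = sum(frame[n] * math.cos(math.pi * (n + 0.5) * k / size)
                    for n in range(size))
        scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
        coeffs.append(scale * total)
    return coeffs

def idct2(coeffs):
    """Inverse of dct2 (DCT coefficients -> time domain frame)."""
    size = len(coeffs)
    frame = []
    for n in range(size):
        total = coeffs[0] * math.sqrt(1.0 / size)
        total += sum(math.sqrt(2.0 / size) * coeffs[k]
                     * math.cos(math.pi * (n + 0.5) * k / size)
                     for k in range(1, size))
        frame.append(total)
    return frame

# Assumed example: ten samples of one channel, frames of four samples.
channel = [float(v) for v in range(1, 11)]
frames = split_frames(channel, frame_len=4)
coeffs_per_frame = [dct2(f) for f in frames]
```

Each frame is transformed independently, so the coefficients of one frame can be quantized or re-processed without touching the others.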
Further, the original audio signal is divided into several subcarrier signals placed in the frequency domain, and the quantization allocation strategy randomly allocates the DCT coefficients to the subcarrier signals. Frequency division multiplexing allows signals to be transmitted on different subcarriers simultaneously, which improves bandwidth utilization. The quantization allocation strategy assigns the quantization results of the signal to different subcarriers or frequency ranges; common strategies include uniform quantization allocation, linear quantization allocation and gradient quantization allocation. In a specific implementation, the quantization allocation strategy is preferably based on gradient quantization allocation: the subcarrier signals and the DCT coefficients are each arranged by frequency from high to low, the whole frequency range is equally divided into a plurality of frequency gradient intervals, the subcarrier signals and DCT coefficients that fall in the same gradient interval are randomly allocated to one another, and at least one DCT coefficient is allocated to each subcarrier signal. The width of a frequency gradient interval is chosen by jointly considering the sparsity of the DCT coefficients and the subcarrier width. This strategy lets each subcarrier carry different DCT coefficients and optimizes frequency domain utilization. The DCT coefficients are then converted into time domain OFDM symbols, that is, the frequency domain data is converted back to the time domain in preparation for OFDM modulation, and a cyclic prefix is added to each OFDM symbol; the cyclic prefix helps to reduce inter-symbol interference in the channel and improves the robustness of the system. A second multi-channel coding is then performed on each OFDM symbol for further crosstalk optimization.
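One possible reading of the gradient quantization allocation is sketched below. It assumes that index order stands in for frequency order, that the gradient intervals are formed by splitting both index lists into the same number of contiguous groups, and that a fixed random seed is acceptable; the function names and the round-robin dealing inside each interval are illustrative choices, not details given in the text.

```python
import random

def split_even(items, parts):
    """Split a list into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def gradient_allocate(num_coeffs, num_subcarriers, num_intervals, seed=0):
    """Randomly map DCT coefficient indices to subcarriers per interval."""
    rng = random.Random(seed)
    # Assumption: a larger index means a higher frequency; list high to low.
    coeffs = list(range(num_coeffs - 1, -1, -1))
    carriers = list(range(num_subcarriers - 1, -1, -1))
    allocation = {c: [] for c in carriers}
    for coeff_grp, carrier_grp in zip(split_even(coeffs, num_intervals),
                                      split_even(carriers, num_intervals)):
        if not carrier_grp or len(coeff_grp) < len(carrier_grp):
            raise ValueError("each interval needs a subcarrier and enough coefficients")
        shuffled = coeff_grp[:]
        rng.shuffle(shuffled)  # random allocation inside the interval
        for i, k in enumerate(shuffled):
            # Round-robin dealing gives every subcarrier at least one coefficient.
            allocation[carrier_grp[i % len(carrier_grp)]].append(k)
    return allocation

# Assumed example: 16 coefficients, 4 subcarriers, 2 gradient intervals.
alloc = gradient_allocate(num_coeffs=16, num_subcarriers=4, num_intervals=2)
```

In this example the two highest subcarriers only ever receive coefficients from the upper half of the index range, and the two lowest only from the lower half.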
In demodulation, the time domain OFDM symbols are converted back into DCT coefficients in the frequency domain in preparation for the inverse DCT, and the inverse discrete cosine transform then restores the frequency domain data to the time domain, yielding the corrected audio signal. In step S5, the corrected crosstalk attenuation ratio of the corrected audio signal is calculated and compared with the reference crosstalk attenuation ratio in order to evaluate the quality of the corrected audio. Whether the signal has been corrected is judged from this comparison; if the correction does not meet the standard, the process returns to step S2 and the signal is reprocessed, so that the final audio signal reaches the desired quality.
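The accept-or-repeat decision of step S5 can be sketched as a small control loop. The callables run_s2_to_s4 and measure_ratio are hypothetical placeholders for steps S2 to S4 and for the ratio calculation, and the max_rounds limit is an added safeguard that the text does not specify. The text also does not say what happens when the two ratios are equal; this sketch treats equality as not yet corrected.

```python
def correct_until_accepted(run_s2_to_s4, measure_ratio, reference_ratio,
                           max_rounds=10):
    """Repeat steps S2 to S4 until the corrected ratio is below the reference."""
    for round_no in range(1, max_rounds + 1):
        corrected = run_s2_to_s4()
        if measure_ratio(corrected) < reference_ratio:
            return corrected, round_no  # signal judged as corrected
    raise RuntimeError("corrected ratio never fell below the reference ratio")

# Toy stand-ins: each pass yields a "signal" that is just its own ratio.
_ratios = iter([0.30, 0.12, 0.04])
signal, rounds = correct_until_accepted(
    run_s2_to_s4=lambda: next(_ratios),
    measure_ratio=lambda ratio: ratio,
    reference_ratio=0.10,
)
```

With the toy ratios above, the first two passes are rejected and the third is accepted.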
Further, as a possible implementation, the frequency division multiplexing modulation method comprises: applying a frequency domain window function to the subcarrier signals to smooth the subcarrier spectra and dynamically adjust the subcarrier spacing; obtaining the channel frequency response of each subcarrier signal by frequency domain analysis; identifying, from the channel frequency response, the spectrum spreading caused by multipath effects; converting all subcarrier signals to the frequency domain using a DFT; performing frequency domain filtering on each subcarrier signal; and compensating the filtered spectrum data of the subcarrier signals for the spectrum spreading caused by multipath effects.
The frequency domain window function helps to smooth the spectrum of the subcarriers and reduces spectral leakage. Spectral leakage refers to the leakage of frequency domain signal energy from one frequency component into others, which leads to signal interference and crosstalk. At the same time, the window function is used to adjust the subcarrier spacing to suit different spectral environments. This dynamic adjustment helps to optimize spectrum utilization, reduce interference between adjacent subcarriers and improve overall modulation efficiency. Analyzing the channel frequency response shows how the spectrum of each subcarrier changes during actual transmission, since the channel frequency response reveals the attenuation and frequency selective distortion of the signal. Multipath effects typically cause spectrum spreading, that is, signal overlap and interference resulting from transmission over multiple paths, which widens the spectrum and increases interference between signals. Identifying these effects is a precondition for processing and compensating for spectrum spreading. Frequency domain filtering of each subcarrier signal can effectively reduce or compensate for the spectrum spreading caused by multipath effects. In a specific implementation, the filter may be designed according to the characteristics of the multipath effects, thereby reducing the impact of spectrum spreading on signal quality.
Further, as a possible implementation manner, the process of using the frequency domain window function includes applying a Blackman window function to the frequency domain data of each subcarrier for windowing, adjusting the intervals of the subcarriers based on the characteristics of the Blackman window function, setting a width critical value for the main lobe width of the Blackman window, and when the window function width of each subcarrier is lower than the width critical value, doubling the subcarrier interval width of the subcarrier signal.
The Blackman window function is a window function that reduces spectral leakage and improves the accuracy of frequency domain analysis. Its spectrum typically has a wide main lobe and a number of side lobes, so it effectively reduces spectral leakage and smooths the spectrum data. Compared with other window functions such as the Hann and Hamming windows, the Blackman window has stronger sidelobe attenuation, which helps to reduce sidelobe interference in the spectrum. Applying the Blackman window function to the frequency domain data of the subcarriers therefore reduces frequency domain leakage and improves the clarity of the signal. The main lobe width of the Blackman window directly affects the subcarrier spacing in the frequency domain: to avoid spectral overlap between subcarriers, the subcarrier spacing must be adjusted according to the characteristics of the window function so that it is large enough to maintain effective isolation. The main lobe width also determines the resolution of the subcarriers in the frequency domain. Setting a width critical value makes it possible to adjust the subcarrier spacing according to the actual window function width. When the main lobe width of the Blackman window is below the width critical value, the spectral interference between subcarriers is small; to further optimize system performance, the subcarrier spacing is then increased, here by doubling it, to guard against interference caused by spectral leakage or multipath effects. This dynamic adjustment adapts to the requirements of different frequency domain environments and ensures a lower interference level and higher signal quality during frequency domain processing of the subcarrier signals.
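The spacing rule can be written as a short sketch: a subcarrier whose window width is below the width critical value has its spacing doubled, and all others are left unchanged. The function name, the list-based interface and the example numbers are assumptions made for illustration.

```python
def adjust_subcarrier_spacing(spacings, window_widths, width_threshold):
    """Double the spacing of every subcarrier whose window width is below the threshold."""
    return [2.0 * spacing if width < width_threshold else spacing
            for spacing, width in zip(spacings, window_widths)]

# Assumed example: three subcarriers with equal spacing and different widths.
new_spacings = adjust_subcarrier_spacing(
    spacings=[100.0, 100.0, 100.0],
    window_widths=[3.0, 5.0, 4.0],
    width_threshold=4.0,
)
```

Only the first subcarrier is below the threshold here, so only its spacing is doubled; a width equal to the threshold is treated as not below it.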
Further, the sample index of the window function of a subcarrier is denoted as n, the width of the window function is denoted as N, the Blackman window function is denoted as ω, its value at the n-th sample point is denoted as ω(n), the constant term is denoted as A, the first coefficient is denoted as B, and the second coefficient is denoted as C,
The calculation formula of the Blackman window function is set as: $\omega(n) = A - B\cos\left(\frac{2\pi n}{N-1}\right) + C\cos\left(\frac{4\pi n}{N-1}\right)$, $0 \le n \le N-1$,
wherein the constant term A is used to ensure that the value of the window function is not equal to zero at the start and end points of the window,
The first coefficient B and the second coefficient C are used for adjusting the main lobe and side lobe characteristics of the window function.
The constant term A is used to ensure that the value of the window function at the start and end points of the window is not zero. This reduces the influence of the window function on the signal edges and avoids an abrupt interruption of the signal at the window boundary; keeping the boundary value of the window function small but non-zero reduces the spectral artifacts the window produces at the signal edges, reduces spectral leakage and improves the accuracy of spectral analysis. The first coefficient B is used to adjust the width and shape of the main lobe of the window function. The main lobe width sets the width of the main peak of the window function in the frequency domain and therefore the spectral resolution; increasing B widens the main lobe and may also increase the sidelobe amplitude, so B is chosen to balance main lobe width against sidelobe attenuation. The second coefficient C is used to adjust the sidelobe characteristics of the window function. Increasing C helps to reduce the sidelobe amplitude and makes the side lobes decay faster, which improves the spectral dynamic range of the window function and helps to reduce noise and interference in the spectrum. In operation, the window function value ω(n) at each sample point is calculated according to the formula and the Blackman window function is applied to the frequency domain data: the constant term A keeps the endpoint values of the window non-zero and reduces spectral artifacts, while the coefficients B and C are adjusted to optimize the main lobe and side lobe characteristics, which in turn determine the spectral resolution and sidelobe interference. By selecting proper values of A, B and C, the smoothing of the spectrum data can be optimized, spectral leakage and interference can be reduced, and system performance can be improved.
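A minimal sketch of the window calculation follows. It assumes the common three-term form ω(n) = A - B·cos(2πn/(N-1)) + C·cos(4πn/(N-1)), and the default values A = 0.42, B = 0.5, C = 0.08 are those of the classic Blackman window rather than values stated in the text; the function name is hypothetical.

```python
import math

def blackman_window(width, a=0.42, b=0.5, c=0.08):
    """Return the window values w(0) .. w(width - 1) for coefficients a, b, c."""
    if width < 2:
        raise ValueError("width must be at least 2")
    return [a
            - b * math.cos(2.0 * math.pi * n / (width - 1))
            + c * math.cos(4.0 * math.pi * n / (width - 1))
            for n in range(width)]

def apply_window(values, window):
    """Multiply the data of one subcarrier by the window, sample by sample."""
    return [v * w for v, w in zip(values, window)]

# Classic coefficients: the endpoint value a - b + c is zero.
classic = blackman_window(5)
# A larger constant term lifts the endpoints away from zero.
raised = blackman_window(5, a=0.5)
```

At the endpoints both cosine terms equal 1, so the window value there is A - B + C; with the classic coefficients this is zero, and any larger A makes it non-zero, which is the role the text assigns to the constant term.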
Further, as a possible implementation manner, the compensating process of spectrum expansion further comprises setting a compensating filter based on the channel frequency response to adjust the spectrum, wherein the compensating filter is used for adjusting the spectrum to reduce distortion, the setting method of the width critical value is that a linear relation between the spectrum efficiency and the width critical value is constructed, the width critical value is dynamically adjusted according to the frequency spectrum efficiency, and the corrected crosstalk attenuation ratio of the corrected audio signal is set to be the ratio between the crosstalk amplitude of the corrected audio signal and the main lobe amplitude of the corrected audio signal.
The compensation filter is designed to adjust the spectrum according to the channel frequency response. It corrects the spectrum spreading caused by multipath effects or other channel characteristics, so that the spectrum of the signal is restored to the desired shape. Its main purpose is to reduce the distortion caused by spectrum spreading: with appropriate filtering, the signal spectrum can be brought closer to its ideal form, reducing the interference and distortion that spectrum spreading introduces. In practice, the filter is preferably designed around the specific frequency response of the channel so that spectral shifts and spreading of the signal are compensated accurately. An effective compensation filter improves the signal quality and the overall performance of the system. The width critical value is used to determine whether the main lobe width of the Blackman window function is sufficient. When the spectrum efficiency is high, the spectral overlap of the subcarriers is smaller and a smaller width critical value is needed; when the spectrum efficiency is low, the spectral overlap is larger and a larger width critical value is needed. The corrected crosstalk attenuation ratio is used to evaluate the crosstalk of the corrected audio signal. Calculating the ratio of the crosstalk amplitude to the main lobe amplitude quantifies the degree of crosstalk in the corrected signal: the smaller the corrected crosstalk attenuation ratio, the better the correction and the smaller the influence of crosstalk. The ratio is compared with the reference value to judge whether the correction succeeded and whether further adjustment is needed. If the ratio is not as expected, the correction process needs to be optimized or re-executed. This embodiment effectively reduces the distortion caused by spectrum spreading and improves the quality of the audio signal: the compensation filter corrects the spectral distortion, the dynamically adjusted width critical value optimizes spectrum efficiency, and the corrected crosstalk attenuation ratio is used to evaluate the correction.
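The three quantities discussed in this embodiment can be sketched as follows. The text does not specify the filter type, so the per-subcarrier division by the channel response (a zero-forcing style equalizer) is an assumed stand-in; taking peak magnitudes inside and outside a given set of main lobe bins as the two amplitudes, and the slope and intercept of the linear threshold, are likewise illustrative assumptions.

```python
def compensate(received, channel_response, floor=1e-6):
    """Assumed compensation filter: divide each subcarrier by its channel response."""
    return [y / h if abs(h) > floor else 0.0
            for y, h in zip(received, channel_response)]

def corrected_crosstalk_ratio(magnitudes, main_lobe_bins):
    """Ratio of the peak crosstalk amplitude to the peak main lobe amplitude."""
    main = max(magnitudes[i] for i in main_lobe_bins)
    if main == 0:
        raise ValueError("main lobe amplitude is zero")
    crosstalk = max((m for i, m in enumerate(magnitudes)
                     if i not in main_lobe_bins), default=0.0)
    return crosstalk / main

def width_threshold(spectral_efficiency, slope=-0.5, intercept=4.0):
    """Linear threshold; a negative slope gives a smaller threshold at higher efficiency."""
    return slope * spectral_efficiency + intercept

# Assumed example values.
equalized = compensate([2 + 0j, 1j], [2 + 0j, 0.5j])
ratio = corrected_crosstalk_ratio([0.1, 1.0, 0.2, 0.05], main_lobe_bins={1})
```

In the example, the equalizer restores the two subcarriers to 1 and 2, and the strongest component outside the main lobe is one fifth of the main lobe peak.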
Further, as a possible implementation manner, the reference crosstalk attenuation ratio is calculated based on a spectral leakage value and a crosstalk component value, wherein the spectral leakage value represents a component size of an expected spectral component of an original audio signal extending to a non-target frequency, the crosstalk component value represents an energy size caused by overlapping interference of the original audio signal, the spectral leakage value is set to be a ratio of side lobe energy of the original audio signal to main lobe energy of the reference audio signal, the crosstalk component value is set to be a ratio of total energy of crosstalk components of the original audio signal to total energy of signals of the reference audio signal, and the cyclic prefix adjusts the length according to multipath effects and delay extension dynamics of a channel.
The spectral leakage value represents the magnitude of the desired spectral components of the original audio signal that spread to non-target frequencies. It is estimated as the ratio of the side lobe energy of the original audio signal to the main lobe energy of the reference audio signal, where side lobe energy is the energy of the signal in the frequency range outside the main lobe; the lower the leakage value, the less spectral leakage there is. The crosstalk component value quantifies the energy caused by overlapping interference as the ratio of the total energy of the crosstalk components to the total signal energy; the smaller the crosstalk component value, the less interference there is. The length of the cyclic prefix should be dynamically adjusted according to the multipath effects and delay spread of the channel. Multipath refers to a signal reaching the receiver over multiple paths during propagation, which spreads the signal in the time domain. Delay spread refers to the spread in arrival times of these multipath copies at the receiving end. In a specific implementation, the reference crosstalk attenuation ratio may be set as a weighted average of the spectral leakage value and the crosstalk component value, or as another composite indicator, and is used to evaluate the overall crosstalk processing effect.
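The reference crosstalk attenuation ratio can be sketched as a weighted average of the two energy ratios defined above. Energy is taken here as the sum of squared magnitudes, and the equal weighting of 0.5 is an assumption; the text only states that a weighted average may be used. The function names and sample values are hypothetical.

```python
def energy(values):
    """Sum of squared magnitudes of a list of spectral components."""
    return sum(abs(v) ** 2 for v in values)

def spectral_leakage_value(original_side_lobe, reference_main_lobe):
    """Side lobe energy of the original over main lobe energy of the reference."""
    return energy(original_side_lobe) / energy(reference_main_lobe)

def crosstalk_component_value(original_crosstalk, reference_signal):
    """Crosstalk energy of the original over total energy of the reference."""
    return energy(original_crosstalk) / energy(reference_signal)

def reference_crosstalk_ratio(leakage, crosstalk, leakage_weight=0.5):
    """Weighted average of the two values (weight is an assumed parameter)."""
    return leakage_weight * leakage + (1.0 - leakage_weight) * crosstalk

# Assumed example components.
leakage = spectral_leakage_value([0.1, 0.1], [1.0])
crosstalk = crosstalk_component_value([0.2], [1.0, 1.0])
reference_ratio = reference_crosstalk_ratio(leakage, crosstalk)
```

Both example values come out at 0.02, so any weighting gives a reference ratio of 0.02; with different inputs the weight controls how strongly leakage counts relative to crosstalk.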
The foregoing description of the embodiments has been provided to illustrate the general principles of the invention. It is not intended to limit the scope of the invention to the particular embodiments described; any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of the invention are intended to be included within its scope.