
CN114866948A - Audio processing method and device, electronic equipment and readable storage medium - Google Patents


Info

Publication number
CN114866948A
CN114866948A
Authority
CN
China
Prior art keywords
audio signal
mapping relation
determining
monaural
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210451743.5A
Other languages
Chinese (zh)
Other versions
CN114866948B (en)
Inventor
黄为庆
邱音良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202210451743.5A priority Critical patent/CN114866948B/en
Publication of CN114866948A publication Critical patent/CN114866948A/en
Application granted granted Critical
Publication of CN114866948B publication Critical patent/CN114866948B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30 Control circuits for electronic adaptation of the sound field
    • H04S 7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S 7/303 Tracking of listener position or orientation
    • H04S 7/304 For headphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 5/00 Stereophonic arrangements
    • H04R 5/033 Headphones for stereophonic communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/01 Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
    • H04S 2400/15 Aspects of sound capture and related signal processing for recording or reproduction

Abstract

The application provides an audio processing method and apparatus, an electronic device, and a readable storage medium. The method includes: determining a current mapping relation group corresponding to a current position, where each preset position corresponds to one mapping relation group, each mapping relation group includes 2n mapping relation pairs, each mapping relation pair indicates the correspondence between a preset position in an open environment and the audio signal of one channel received by a single ear at that preset position, n is the number of channels, and n ≥ 1; determining, from the current mapping relation group, a first audio signal of each channel received by the single ear; obtaining a second audio signal of each channel received by the single ear according to the first audio signal and the received original audio signal; and performing weighted accumulation on the n second audio signals to obtain target audio signals of the n channels received by the single ear. The application improves the sound quality.

Description

Audio processing method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of audio technologies, and in particular, to an audio processing method and apparatus, an electronic device, and a readable storage medium.
Background
Earphones are a common product in daily life; users watch movies and TV series and listen to music through them. When worn, an earphone may press against or sit inside the user's ear, which can make the sound feel trapped in the ear and degrade the earphone's sound effect.
With the spread of 5G network transmission and HDR rendering technologies, how to improve the listening experience of wearing earphones, remove the sensation of sound trapped in the ears, and simulate the effect of loudspeakers in an open environment, so that the user feels immersed in a real environment, is a problem to be solved urgently.
No good solution currently exists for the poor sound effect of existing earphones.
Disclosure of Invention
In order to solve the technical problem or at least partially solve the technical problem, the present application provides an audio processing method, an apparatus, an electronic device and a readable storage medium.
In a first aspect, the present application provides an audio processing method, including:
determining a current mapping relation group corresponding to a current position, where each preset position corresponds to one mapping relation group, each mapping relation group includes 2n mapping relation pairs, each mapping relation pair indicates the correspondence between a preset position in an open environment and the audio signal of one channel received by a single ear at that preset position, n is the number of channels, and n ≥ 1;
determining, from the current mapping relation group, a first audio signal of each channel received by the single ear;
obtaining a second audio signal of each channel received by the single ear according to the first audio signal and the received original audio signal;
and performing weighted accumulation on the n second audio signals to obtain target audio signals of the n channels received by the single ear.
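The four steps above can be sketched in a few lines of Python with NumPy. This is a hedged illustration, not the patent's implementation: the function name, the use of convolution as the combining step, and the uniform default weights are all assumptions.

```python
import numpy as np

def process_monaural(first_signals, original, weights=None):
    """Sketch of the claimed four-step pipeline for one ear.

    first_signals: list of n arrays, the first audio signal per channel
    original:      array, the received original audio signal
    weights:       per-channel weights for the accumulation step
    """
    n = len(first_signals)
    weights = np.ones(n) / n if weights is None else np.asarray(weights)
    # Combine each channel's first signal with the original signal to get
    # the second audio signals (convolution is one combination the
    # description mentions; truncated to the original length here).
    seconds = [np.convolve(original, g)[: len(original)] for g in first_signals]
    # Weighted accumulation over the n second signals -> target signal.
    return sum(w * s for w, s in zip(weights, seconds))
```

Running this once per ear with that ear's n first signals yields the 2n target signals the claims describe.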
Optionally, before determining the current mapping relationship group corresponding to the current location, the method further includes:
in a preset three-dimensional range of head rotation, taking a horizontal area in the horizontal direction corresponding to each height area as a preset position, wherein the preset three-dimensional range is divided into at least one height area, and each height area is divided into at least one horizontal area in the horizontal direction;
determining the audio data of the n channels received by a single ear at the preset position;
taking the audio data of each sound channel at the preset position as a mapping relation pair;
and constructing a mapping relation group according to the n mapping relation pairs at the preset position.
Optionally, before taking a horizontal area in the horizontal direction corresponding to each height area as a preset position, the method further includes:
determining a maximum vertical range of rotation of the head in a vertical direction and a maximum horizontal range of rotation in a horizontal direction;
determining the preset three-dimensional range according to the maximum vertical range and the maximum horizontal range;
determining a vertical interval angle in a vertical direction and a horizontal interval angle in a horizontal direction;
and in the preset three-dimensional range, determining at least one height area according to the maximum vertical range and the vertical interval angle, and determining at least one horizontal area according to the horizontal range corresponding to each height area and the horizontal interval angle.
Optionally, the formula for calculating the number of mapping relationship pairs is:
S = (180/f) × (360/m) × 2n, where S is the number of mapping relation pairs, 180 is the maximum vertical range in degrees, f is the vertical interval angle, 360 is the maximum horizontal range in degrees, m is the horizontal interval angle, and 2n is the number of mapping relation pairs corresponding to each preset position.
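The count formula can be checked with a short function (a minimal sketch; the function name and the 5.1-channel example are illustrative, not from the patent):

```python
def mapping_pair_count(f_deg, m_deg, n_channels,
                       vertical_range=180, horizontal_range=360):
    """S = (180/f) * (360/m) * 2n: height regions times horizontal
    regions times two ears times n channels."""
    height_regions = vertical_range // f_deg
    horizontal_regions = horizontal_range // m_deg
    return height_regions * horizontal_regions * 2 * n_channels

# e.g. 5.1 channels (n = 6) sampled every 5 degrees in both directions:
# (180/5) * (360/5) * 2 * 6 = 31104 mapping relation pairs
```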
Optionally, the determining a current mapping relationship group corresponding to the current location includes:
detecting a channel type, wherein the channel type indicates the number of channels, and for different channel types, audio signals of each channel received by a single ear at the same preset position are not identical;
detecting whether a rotation sensor exists in the earphone or not, wherein the rotation sensor is used for detecting the rotation angle of the head;
selecting a fixed position from the preset positions according to the type of the sound channel under the condition that a rotation sensor does not exist in the earphone, and taking a mapping relation group of the fixed position as a current mapping relation group corresponding to the current position;
under the condition that a rotation sensor exists in the earphone, determining the current position of the head according to the detected rotation angle; and determining a current mapping relation group corresponding to the current position according to the current position and the sound channel type.
Optionally, before determining the current mapping relationship group corresponding to the current location, the method further includes:
determining a recording parameter corresponding to a current recording scene;
and determining the audio signal of one channel received by a single ear at the preset position according to the sound of the sound source and the recording parameters.
Optionally, after obtaining the target audio signals of the n channels received by a single ear, the method further includes:
and when the first audio signal is lower than a first audio threshold, performing gain and amplitude control on the target audio signal so that the target audio signal is greater than the first audio threshold and less than a second audio threshold.
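One way such a gain-and-limit step could look is sketched below. This is a hedged illustration only: the thresholds, the 10% headroom factor, and triggering on the first signal's peak are assumptions, not the patent's exact scheme.

```python
import numpy as np

def gain_control(target, first_peak, low_thresh, high_thresh):
    """If the first audio signal's peak is below low_thresh, boost the
    target signal past low_thresh (with a little headroom), then limit
    its amplitude below high_thresh."""
    target = np.asarray(target, dtype=float)
    peak = np.max(np.abs(target))
    if first_peak < low_thresh and peak > 0:
        target = target * (low_thresh / peak) * 1.1  # 10% headroom
    return np.clip(target, -high_thresh, high_thresh)
```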
In a second aspect, the present application provides an audio processing apparatus, the apparatus comprising:
a first determining module, configured to determine a current mapping relation group corresponding to a current position, where each preset position corresponds to one mapping relation group, each mapping relation group includes 2n mapping relation pairs, each mapping relation pair indicates the correspondence between a preset position in an open environment and the audio signal of one channel received by a single ear at that preset position, n is the number of channels, and n ≥ 1;
a second determining module, configured to determine, from the current mapping relation group, a first audio signal of each channel received by the single ear;
an obtaining module, configured to obtain a second audio signal of each channel received by the single ear according to the first audio signal and the received original audio signal;
and a processing module, configured to perform weighted accumulation on the n second audio signals to obtain target audio signals of the n channels received by the single ear.
In a third aspect, an electronic device is provided, which includes a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing any audio processing method when executing the program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, in which a computer program is stored; when executed by a processor, the program performs the steps of any of the above audio processing methods.
Compared with the prior art, the technical solution provided by the embodiments of the application has the following advantages:
according to the method provided by the embodiment of the application, the earphone acquires the first audio signal capable of simulating the open environment in the current position, and then the second audio signal is obtained according to the original audio signal and the first audio signal, so that each second audio signal can achieve the effect of simulating the sound effect in the open environment, the earphone carries out weighted accumulation processing on the n second audio signals, and the target audio signals of the n sound channels heard by a single ear can be obtained. This application can simulate the first audio signal in the wide environment through merging into at original audio signal for ultimate target audio signal also can simulate the wide environment, has improved the tone quality effect.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic hardware environment diagram of an audio processing method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of an audio processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a preset position;
FIG. 4 is a schematic illustration of 5.1 channels;
fig. 5 is a system block diagram of an audio processing method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an audio processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To solve the problems mentioned in the background, according to an aspect of embodiments of the present application, an embodiment of an audio processing method is provided.
Alternatively, in the embodiment of the present application, the audio processing method described above may be applied to a hardware environment formed by the earphone 101 and the server 103 as shown in fig. 1. As shown in fig. 1, the server 103 is connected to the earphone 101 through a network and provides services for it; a database 105 may reside on the server, or separately from it, to provide data storage services for the server 103. The network includes, but is not limited to, a wide area, metropolitan area, or local area network, and the earphone types include, but are not limited to, over-ear, earbud, in-ear, and behind-the-ear designs.
The embodiment of the application provides an audio processing method that can be applied to an earphone to process audio so that the earphone's sound simulates the sound effect of an open environment. It can be applied to scenes such as watching movies and TV series, listening to music, and recording audio through the earphone.
An audio processing method provided in the embodiments of the present application is described in detail below with reference to specific embodiments. As shown in fig. 2, the specific steps are as follows:
step 201: and determining a current mapping relation group corresponding to the current position.
Each preset position corresponds to one mapping relation group, each mapping relation group includes 2n mapping relation pairs, each mapping relation pair indicates the correspondence between a preset position in an open environment and the audio signal of one channel received by a single ear at that preset position, n is the number of channels, and n ≥ 1.
In the embodiment of the application, when audio is recorded in an open environment, a preset three-dimensional range of head rotation can be determined; this range is divided into at least one height area, and the horizontal direction corresponding to each height area is divided into at least one horizontal area, so that each horizontal area of each height area is one preset position. n sound sources are arranged in the preset three-dimensional range, each corresponding to one channel; when the user's head is at any preset position, each ear can hear the audio signals of the n channels, i.e. the two ears together hear 2n audio signals.
A mapping relation pair is constructed for a preset position in the open environment and the audio signal of one channel received by a single ear at that position. Since there are n channels and a typical user has two ears, the 2n mapping relation pairs at a preset position form one mapping relation group, i.e. each preset position corresponds to one mapping relation group.
In this embodiment of the present application, the headset may detect a current position of the head of the user, where the detection method is as follows: when a rotation sensor does not exist in the earphone, the earphone determines a fixed position according to the type of the sound channel, the fixed position is used as the current position, and then the current mapping relation group corresponding to the current position is selected from the mapping relation group corresponding to the preset position; when a rotation sensor exists in the earphone, the earphone determines the current position of the head according to the up-and-down rotation angle and the left-and-right rotation angle of the head of a user, and then selects a current mapping relation group corresponding to the current position from a mapping relation group corresponding to a preset position according to the current position and the sound channel type.
As an optional implementation, before determining the current mapping relation group corresponding to the current position, the method further includes: determining a recording parameter corresponding to the current recording scene; and determining the audio signal of one channel received by a single ear at the preset position according to the sound of the sound source and the recording parameters.
Each recording scene is stored in a database in advance with corresponding recording parameters, and the earphone acquires the recording parameters for the current recording scene from the database. The recording parameters include, but are not limited to, reverberation parameters, sound attenuation parameters, recording-room material parameters, and recording-room size parameters, all of which affect the sound received by the earphone. The earphone analyzes the sound of the sound source together with the recording parameters to obtain an audio signal. Therefore, for different recording scenes the received audio signals differ, and so do the mapping relation pairs and mapping relation groups. For example, if a movie and a piece of music are recorded in different rooms, the corresponding room material, room size, and sound attenuation parameters differ; in addition, because the amount of echo desired for a movie differs from that for music, the reverberation parameters also differ, so the resulting audio signals, and hence the mapping relation pairs, differ.
Step 202: for the current mapping relation group, determine the first audio signal of each channel received by a single ear.
In the embodiment of the present application, the earphone determines, from the current mapping relation group, the first audio signal of each channel received by one ear, and can thus acquire the first audio signals of each channel received by both ears.
Step 203: and obtaining a second audio signal of each sound channel received by the monaural according to the first audio signal and the received original audio signal.
In the embodiment of the application, the audio signal received by the earphone is an original audio signal, and the earphone processes the first audio signal and the original audio signal to obtain a second audio signal, wherein the first audio signal is used for enabling the second audio signal to simulate the sound effect in the open environment.
The calculation formula of the second audio signal is:
S1 = f(x) × g(x), where S1 is the second audio signal, x is the current position, f(x) is the decoded sequence of the original audio signal, and g(x) is the first audio signal.
Optionally, the earphone may instead convolve the first audio signal with the received original audio signal to obtain the second audio signal of each channel received by a single ear. The convolution can use a naive algorithm or the Fourier transform; this application uses the Fourier transform, which improves processing speed while reducing computational complexity.
The convolution, computed via the Fourier transform, follows the convolution theorem:
S1 = F⁻¹(F[f(x)] · F[g(x)]), where F denotes the Fourier transform and F⁻¹ its inverse.
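A minimal FFT-based linear convolution of the kind this passage describes can be written as follows (NumPy sketch; the zero-padding length and the real-FFT choice are implementation details, not specified by the patent):

```python
import numpy as np

def fft_convolve(f, g):
    """Linear convolution via the convolution theorem: transform,
    multiply the spectra, transform back."""
    n = len(f) + len(g) - 1          # full linear-convolution length
    F = np.fft.rfft(f, n)            # zero-padded real FFTs
    G = np.fft.rfft(g, n)
    return np.fft.irfft(F * G, n)
```

For long signals this runs in O(n log n) instead of the naive O(n²), which is the speed advantage the text mentions.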
step 204: and performing weighted accumulation processing on the n second audio signals to obtain target audio signals of n sound channels received by a single ear.
In the embodiment of the application, a second audio signal is one of the audio signals received by a single ear, and a single ear can receive n of them; the earphone therefore processes the n second audio signals to obtain a target audio signal that fuses all n, so the ear hears a better audio effect.
In the application, the earphone acquires a first audio signal that simulates an open environment at the current position, and then obtains a second audio signal from the original audio signal and the first audio signal, so that each second audio signal simulates the sound effect of an open environment. The earphone performs weighted accumulation on the n second audio signals to obtain the target audio signals of the n channels heard by a single ear. By merging into the original audio signal a first audio signal that simulates an open environment, the final target audio signal also simulates an open environment, improving the sound quality.
As an optional implementation, before determining the current mapping relation group corresponding to the current position, the method further includes: in a preset three-dimensional range of head rotation, taking each horizontal area of each height area as a preset position, where the preset three-dimensional range is divided into at least one height area and each height area is divided horizontally into at least one horizontal area; determining the audio data of the n channels received by a single ear at the preset position, where each channel corresponds to one sound source; taking the audio data of each channel at the preset position as a mapping relation pair; and constructing a mapping relation group from the n mapping relation pairs at the preset position.
In the embodiment of the application, recording in a simulated open environment can be performed in advance. The earphone determines the maximum vertical range of head rotation and the maximum horizontal range of head rotation, then determines the preset three-dimensional range of head rotation from the maximum vertical range and the maximum horizontal range, taking their centers as the origin. It then determines a vertical interval angle and a horizontal interval angle, and finally, within the preset three-dimensional range, determines at least one height area from the maximum vertical range and the vertical interval angle, and at least one horizontal area from each height area's horizontal range and the horizontal interval angle.
Fig. 3 is a schematic diagram of a preset position. It can be seen that the maximum vertical range of the rotation of the head in the vertical direction is-90 to 90 degrees, the maximum horizontal range of the rotation of the head in the horizontal direction is-180 to 180 degrees, the preset three-dimensional range is a sphere, one height area is arranged at intervals of f in the height direction, one horizontal area is arranged at intervals of m in the horizontal direction, and one horizontal area corresponding to each height area is a preset position.
Illustratively, the maximum vertical range of head rotation is -90° to 90° and the maximum horizontal range is -180° to 180°. Dividing the vertical direction at interval f about its center and the horizontal direction at interval m about its center yields 180/f height regions within the 180° vertical range and 360/m horizontal regions within the 360° horizontal range. Each horizontal region of each height region is a preset position, so each height region corresponds to 360/m preset positions, and there are 180/f × 360/m preset positions in total.
Fig. 4 is a schematic diagram of 5.1 channels. As shown in fig. 4, taking 5.1 channels as an example, a sound source is placed at each of the left channel L, right channel R, center channel C, left surround LS, right surround RS, and subwoofer FE. A user's head at any preset position then receives the audio data of 6 channels through a single ear, i.e. 12 audio signals through both ears. If the user rotates the head to another preset position, both ears still receive 12 audio signals, but their sound quality differs from that received before the rotation.
Each channel corresponds to one sound source, and the earphone can determine the audio data of the n channels that a single ear receives at each preset position; with the head at different positions, the sound quality of the received audio data differs. The earphone takes each piece of audio data as a mapping relation pair, and then constructs a mapping relation group from the n mapping relation pairs at a preset position, i.e. each preset position corresponds to one mapping relation group.
The formula for determining the number of the mapping relation pairs is as follows:
S = (180/f) × (360/m) × 2n, where S is the number of mapping relation pairs, f is the vertical interval angle, m is the horizontal interval angle, and n is the number of channels. 180/f is the number of height regions, 360/m is the number of horizontal regions per height region, 180/f × 360/m is the number of preset positions, and 2n is the number of audio signals received by both ears at a preset position (the number of mapping relation pairs per preset position).
As an optional implementation manner, determining the current mapping relationship group corresponding to the current location includes: detecting a channel type, wherein the channel type indicates the number of channels, and for different channel types, audio signals of each channel received by a single ear at the same preset position are not identical; detecting whether a rotation sensor exists in the earphone or not, wherein the rotation sensor is used for detecting the rotation angle of the head; under the condition that a rotation sensor does not exist in the earphone, selecting a fixed position from preset positions according to the type of a sound channel, and taking a mapping relation group of the fixed position as a current mapping relation group corresponding to the current position; under the condition that a rotation sensor exists in the earphone, determining the current position of the head according to the detected rotation angle; and determining a current mapping relation group corresponding to the current position according to the current position and the sound channel type.
The earphone may detect the decoded channel type, which may be mono, 5.1-channel, 7.1-channel, and so on. Because the channel types differ, the arrangement order and placement angles of the channels are not identical, so even at the same preset position the sound quality of the audio signal each ear receives per channel differs across channel types. That is, for different channel types, the mapping relation pairs corresponding to the channels at the same preset position are not identical.
Earphones fall into two types: those with a rotation sensor, which can detect the head position as the user's head rotates, and those without. If no rotation sensor is present, the earphone determines the arrangement order and placement angles of the channels from the channel type, selects a fixed position from the preset positions accordingly, and uses the mapping relation group of that fixed position as the current mapping relation group corresponding to the current position.
Illustratively, the arrangement positions and placement angles of the channels in 5.1-channel audio are:
C: straight ahead at 0 degrees;
R: deflected 30 degrees to the right;
L: deflected 30 degrees to the left;
RS: deflected 115 degrees to the right;
LS: deflected 115 degrees to the left;
LFE: deflected 15 degrees to the left.
the headset can determine a fixed position, which is one of the preset positions, according to the arrangement position and the placing angle.
If a rotation sensor is present in the earphone, the earphone can select the current position of the head from the preset positions according to the detected rotation angle. Because each channel at each preset position corresponds to two mapping relation pairs (one for the left ear and one for the right ear), the earphone can determine the mapping relation pair corresponding to each channel at the current position, and thereby obtain the mapping relation group at the current position.
For example, with the initial position of the head set to (0, 0), if the head rotates 3 degrees counterclockwise in the horizontal plane and tilts 20 degrees upward, the current position is (-3, 20). This current position falls in the preset region whose horizontal region is (0, -5) and whose height region is (15, 20), so the mapping relation group of that preset region is used as the mapping relation group of the current position.
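The lookup from a detected head position to its preset region can be sketched as follows. The 5-degree vertical and horizontal interval angles are taken from the example above and are only illustrative, as is the half-open bin convention.

```python
import math

def quantize_to_region(angle_h, angle_v, m=5, f=5):
    """Snap a (horizontal, vertical) head rotation, in degrees, to the
    preset region it falls in.  `m` and `f` are the horizontal and
    vertical interval angles; 5 degrees is an illustrative choice.
    Each region is returned as (lower edge, upper edge)."""
    h_hi = math.ceil(angle_h / m) * m   # upper edge of the horizontal bin
    v_hi = math.ceil(angle_v / f) * f   # upper edge of the height bin
    return (h_hi - m, h_hi), (v_hi - f, v_hi)
```

For the example above, `quantize_to_region(-3, 20)` yields the horizontal region between -5 and 0 degrees and the height region between 15 and 20 degrees.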
As an optional implementation manner, after obtaining target audio signals of n channels received by a monaural, the method further includes: under the condition that a rotation sensor exists in the earphone, denoising is carried out on the target audio signal through a preset denoising scheme.
In this embodiment of the application, if a rotation sensor is present in the earphone, the current position of the head changes during rotation, so the convolution kernel (mapping relation group) must be switched each time the head changes position. When the kernel is switched, the sound amplitude may be discontinuous, that is, noise is produced, so a preset denoising scheme is applied to the target audio signal to eliminate the abrupt change in sound. The preset denoising scheme may be the least squares method (LSM) or filtering, and is not specifically limited in this application.
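The application leaves the denoising scheme open (LSM or filtering). As one illustrative sketch, and not the patent's own method, a short linear crossfade from the signal convolved with the old kernel to the signal convolved with the new kernel masks the amplitude discontinuity at the switch:

```python
import numpy as np

def crossfade_switch(old_out, new_out, fade_len=256):
    """Blend the start of the block convolved with the new kernel with
    the block convolved with the old kernel, hiding the amplitude jump
    that would otherwise be heard as noise.  `fade_len` (in samples)
    is an illustrative choice."""
    out = new_out.copy()
    n = min(fade_len, len(old_out), len(new_out))
    ramp = np.linspace(0.0, 1.0, n)          # 0 -> fully old, 1 -> fully new
    out[:n] = (1.0 - ramp) * old_out[:n] + ramp * new_out[:n]
    return out
```

The fade begins exactly at the old kernel's output and ends exactly at the new kernel's output, so no sample-to-sample discontinuity is introduced.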
As an optional implementation, after obtaining target audio signals of n channels received by a monaural, the method further includes: and under the condition that the target audio signal is lower than the first audio threshold, performing gain and amplitude control on the target audio signal so that the target audio signal is larger than the first audio threshold and smaller than the second audio threshold.
In this embodiment of the application, when sound is recorded quietly, the resulting target audio signal is also small. Therefore, if the earphone detects that the target audio signal is lower than the first audio threshold, it applies a gain to the target audio signal. To avoid explosive sound caused by excessive loudness, amplitude control is performed at the same time as the gain so that the target audio signal stays below the second audio threshold. The final target audio signal is thus both larger than the first audio threshold and smaller than the second audio threshold, ensuring that the sound is loud without producing explosive sound. The sound gain may be implemented by automatic gain control (AGC) or Gaussian filtering; the gain method is not specifically limited in this application.
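A minimal sketch of the gain-plus-amplitude-control step, assuming normalized samples and hypothetical threshold values (a production AGC adapts its gain over time rather than applying one fixed factor):

```python
import numpy as np

def gain_with_limit(signal, low_thr=0.1, high_thr=0.9):
    """Boost a too-quiet target audio signal above the first threshold
    while clamping it below the second, so the louder result cannot
    produce explosive (clipped) sound.  The threshold values and the
    fixed-gain strategy are illustrative."""
    peak = float(np.max(np.abs(signal)))
    if peak == 0.0 or peak >= low_thr:
        return signal                        # silent, or already loud enough
    target = 0.5 * (low_thr + high_thr)      # aim midway between thresholds
    boosted = signal * (target / peak)       # gain
    return np.clip(boosted, -high_thr, high_thr)  # amplitude control
```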
After the earphone denoises, gains, and amplitude-controls the target audio signal, the signal is input into the renderer, converted from a digital signal into an electrical signal, the rendering process is completed, and the output sound is obtained.
Based on the same technical concept, an embodiment of the present application further provides a system block diagram of the audio processing method. As shown in Fig. 5, the system includes:
1. A laboratory-recorded mapping relation pair module, which uses a motion sensor.
2. A convolution kernel selection module (Get kernel).
3. A convolution module (Convolute) for calculating the second audio signal, which uses the fast Fourier transform (FFT).
4. A noise reduction module (Noise suppression), which uses the LSM method.
5. An audio gain control module (Audio gain), which uses the AGC method.
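The FFT-based convolution named in module 3 can be sketched as follows. This is a textbook single-block version, not the patent's implementation: transform, multiply in the frequency domain, transform back, and truncate to the linear-convolution length.

```python
import numpy as np

def fft_convolve(x, h):
    """Linear convolution of an audio block `x` with a kernel `h`
    via the FFT, as the convolution module in Fig. 5 does."""
    n = len(x) + len(h) - 1                 # length of the linear convolution
    nfft = 1 << (n - 1).bit_length()        # next power of two for speed
    X = np.fft.rfft(x, nfft)
    H = np.fft.rfft(h, nfft)
    return np.fft.irfft(X * H, nfft)[:n]    # frequency-domain product
```

For long streams a real renderer would use overlap-add or overlap-save rather than one transform per signal.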
Based on the same technical concept, an embodiment of the present application further provides an audio processing apparatus, as shown in fig. 6, the apparatus includes:
a first determining module 601, configured to determine a current mapping relation group corresponding to a current position, where each preset position corresponds to one mapping relation group, each mapping relation group includes 2n mapping relation pairs, and a mapping relation pair indicates the correspondence between a monaural ear, a preset position, and the audio signal of one channel received by the monaural ear at that preset position, where n is the number of channels and n ≥ 1;
A second determining module 602, configured to determine, for the current mapping relationship group, a first audio signal of each channel received by a monaural;
an obtaining module 603, configured to obtain, according to the first audio signal and the received original audio signal, a second audio signal of each channel received by a monaural;
the processing module 604 is configured to perform weighted accumulation processing on the n second audio signals to obtain target audio signals of n channels received by a monaural.
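The weighted accumulation performed by the processing module can be sketched as below. The default unit weights are an assumption, since the application does not fix the weighting:

```python
import numpy as np

def mix_channels(second_signals, weights=None):
    """Weighted accumulation of the n per-channel second audio signals
    into the single target signal the monaural ear receives."""
    sigs = np.asarray(second_signals, dtype=float)   # shape (n, samples)
    if weights is None:
        weights = np.ones(len(sigs))                 # assumed unit weights
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * sigs).sum(axis=0)           # sum over channels
```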
Optionally, the apparatus is further configured to:
in a preset three-dimensional range of head rotation, taking a horizontal area in the horizontal direction corresponding to each height area as a preset position, wherein the preset three-dimensional range is divided into at least one height area, and each height area is divided into at least one horizontal area in the horizontal direction;
determining audio data of n channels received by a monaural at a preset position;
taking the audio data of each sound channel at a preset position as a mapping relation pair;
and constructing a mapping relation group according to the n mapping relation pairs at the preset positions.
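Assembling one mapping relation group from the per-channel audio data at a preset position might look like the following; the dictionary layout and key names are purely illustrative, since the application does not prescribe a data structure:

```python
def build_mapping_group(position, per_channel_audio):
    """Build the mapping relation group for one preset position: one
    (left-ear, right-ear) mapping relation pair per channel, so a group
    of n channels holds 2n pairs in total."""
    group = {}
    for channel, (left_ir, right_ir) in per_channel_audio.items():
        group[(position, channel, "left")] = left_ir    # left-ear pair
        group[(position, channel, "right")] = right_ir  # right-ear pair
    return group
```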
Optionally, the apparatus is further configured to:
determining a maximum vertical range of rotation of the head in a vertical direction and a maximum horizontal range of rotation in a horizontal direction;
determining a preset three-dimensional range according to the maximum vertical range and the maximum horizontal range;
determining a vertical interval angle in a vertical direction and a horizontal interval angle in a horizontal direction;
in the preset three-dimensional range, at least one height area is determined according to the maximum vertical range and the vertical interval angle, and at least one horizontal area is determined according to the horizontal range corresponding to the height area and the horizontal interval angle.
Optionally, the formula for calculating the number of mapping relation pairs is:
S = (180 / f) × (360 / m) × 2n, where S is the total number of mapping relation pairs, 180 is the maximum vertical range, f is the vertical interval angle, 360 is the maximum horizontal range, m is the horizontal interval angle, and 2n is the number of mapping relation pairs corresponding to each preset position.
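A quick numeric check of the formula, using hypothetical 5-degree interval angles and 5.1-channel audio (n = 6):

```python
def num_mapping_pairs(f, m, n):
    """S = (180 / f) * (360 / m) * 2n: the number of height areas,
    times horizontal areas per height area, times the 2n mapping
    relation pairs stored per preset position."""
    return (180 // f) * (360 // m) * 2 * n

# With f = m = 5 and n = 6: 36 height areas * 72 horizontal areas * 12 pairs.
print(num_mapping_pairs(5, 5, 6))
```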
Optionally, the first determining module 601 includes:
detecting a channel type, wherein the channel type indicates the number of channels, and for different channel types, audio signals of each channel received by a single ear at the same preset position are not identical;
detecting whether a rotation sensor exists in the earphone or not, wherein the rotation sensor is used for detecting the rotation angle of the head;
under the condition that a rotation sensor does not exist in the earphone, selecting a fixed position from preset positions according to the type of a sound channel, and taking a mapping relation group of the fixed position as a current mapping relation group corresponding to the current position;
under the condition that a rotation sensor exists in the earphone, determining the current position of the head according to the detected rotation angle; and determining a current mapping relation group corresponding to the current position according to the current position and the sound channel type.
Optionally, the apparatus is further configured to:
determining a recording parameter corresponding to a current recording scene;
and determining the audio signal of one sound channel received by the monaural at the preset position according to the sound of the sound source and the recording parameters.
Optionally, the apparatus is further configured to:
and under the condition that the target audio signal is lower than the first audio threshold, performing gain and amplitude control on the target audio signal so that the target audio signal is larger than the first audio threshold and smaller than the second audio threshold.
Based on the same technical concept, an embodiment of the present invention further provides an electronic device, as shown in fig. 7, including a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 complete mutual communication through the communication bus 704,
a memory 703 for storing a computer program;
the processor 701 is configured to implement the above steps when executing the program stored in the memory 703.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
In a further embodiment provided by the present invention, there is also provided a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of any of the methods described above.
In a further embodiment provided by the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the methods of the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the invention to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that includes one or more available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of audio processing, the method comprising:
determining a current mapping relation group corresponding to the current position, wherein each preset position corresponds to one mapping relation group, each mapping relation group comprises 2n mapping relation pairs, a mapping relation pair indicates a correspondence between a monaural ear, a preset position, and an audio signal of a channel received by the monaural ear at the preset position, n is the number of channels, and n ≥ 1;
Determining, for the current set of mappings, a first audio signal for each channel received by the monaural;
obtaining a second audio signal of each sound channel received by the monaural according to the first audio signal and the received original audio signal;
and performing weighted accumulation processing on the n second audio signals to obtain target audio signals of the n sound channels received by the monaural.
2. The method of claim 1, wherein before determining the current mapping relationship set corresponding to the current location, the method further comprises:
in a preset three-dimensional range of head rotation, taking a horizontal area in the horizontal direction corresponding to each height area as a preset position, wherein the preset three-dimensional range is divided into at least one height area, and each height area is divided into at least one horizontal area in the horizontal direction;
determining audio data of n channels received by a monaural at the preset position;
taking the audio data of each sound channel at the preset position as a mapping relation pair;
and constructing a mapping relation group according to the n mapping relation pairs at the preset position.
3. The method of claim 2, wherein before taking a horizontal area in the horizontal direction corresponding to each height area as a preset position, the method further comprises:
determining a maximum vertical range of rotation of the head in a vertical direction and a maximum horizontal range of rotation in a horizontal direction;
determining the preset three-dimensional range according to the maximum vertical range and the maximum horizontal range;
determining a vertical interval angle in a vertical direction and a horizontal interval angle in a horizontal direction;
and in the preset three-dimensional range, determining at least one height area according to the maximum vertical range and the vertical interval angle, and determining at least one horizontal area according to the horizontal range corresponding to the height area and the horizontal interval angle.
4. The method of claim 3, wherein the formula for calculating the number of mapping relation pairs is:
S = (180 / f) × (360 / m) × 2n, wherein S is the number of the mapping relation pairs, 180 is the maximum vertical range, f is the vertical interval angle, 360 is the maximum horizontal range, m is the horizontal interval angle, and 2n is the number of mapping relation pairs corresponding to the preset position.
5. The method of claim 1, wherein the determining the current mapping relationship group corresponding to the current location comprises:
detecting a channel type, wherein the channel type indicates the number of channels, and for different channel types, audio signals of each channel received by a single ear at the same preset position are not identical;
detecting whether a rotation sensor exists in the earphone or not, wherein the rotation sensor is used for detecting the rotation angle of the head;
selecting a fixed position from the preset positions according to the type of the sound channel under the condition that a rotation sensor does not exist in the earphone, and taking a mapping relation group of the fixed position as a current mapping relation group corresponding to the current position;
under the condition that a rotation sensor exists in the earphone, determining the current position of the head according to the detected rotation angle; and determining a current mapping relation group corresponding to the current position according to the current position and the sound channel type.
6. The method of claim 1, wherein before determining the current mapping relationship set corresponding to the current location, the method further comprises:
determining a recording parameter corresponding to a current recording scene;
and determining the audio signal of one sound channel received by the monaural at the preset position according to the sound of the sound source and the recording parameters.
7. The method of claim 1, wherein after obtaining the n-channel target audio signals received by the monaural, the method further comprises:
and under the condition that the target audio signal is lower than a first audio threshold value, performing gain and amplitude control on the target audio signal so that the target audio signal is larger than the first audio threshold value and smaller than a second audio threshold value.
8. An audio processing apparatus, characterized in that the apparatus comprises:
the first determining module is used for determining a current mapping relation group corresponding to a current position, wherein each preset position corresponds to one mapping relation group, each mapping relation group comprises 2n mapping relation pairs, the mapping relation pairs indicate the corresponding relation between a preset position in a monaural environment and an audio signal of a sound channel received by the monaural environment at the preset position, the number of the sound channels is n, and n is larger than or equal to 1;
a second determining module, configured to determine, for the current mapping relationship group, a first audio signal of each channel received by the monaural;
the obtaining module is used for obtaining a second audio signal of each sound channel received by the monaural according to the first audio signal and the received original audio signal;
and the processing module is used for performing weighted accumulation processing on the n second audio signals to obtain target audio signals of the n sound channels received by the monaural.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1 to 7 when executing a program stored in a memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN202210451743.5A 2022-04-26 2022-04-26 Audio processing method, device, electronic equipment and readable storage medium Active CN114866948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210451743.5A CN114866948B (en) 2022-04-26 2022-04-26 Audio processing method, device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210451743.5A CN114866948B (en) 2022-04-26 2022-04-26 Audio processing method, device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN114866948A true CN114866948A (en) 2022-08-05
CN114866948B CN114866948B (en) 2024-07-05

Family

ID=82632463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210451743.5A Active CN114866948B (en) 2022-04-26 2022-04-26 Audio processing method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114866948B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1466401A (en) * 2002-07-02 2004-01-07 矽统科技股份有限公司 Method for generating stereo sound effect
US7167567B1 (en) * 1997-12-13 2007-01-23 Creative Technology Ltd Method of processing an audio signal
CN104010264A (en) * 2013-02-21 2014-08-27 中兴通讯股份有限公司 Method and apparatus for processing double-track audio signals
CN104464742A (en) * 2014-12-31 2015-03-25 武汉大学 System and method for carrying out comprehensive non-uniform quantitative coding on 3D audio space parameters
CN106804023A (en) * 2013-07-22 2017-06-06 弗朗霍夫应用科学研究促进协会 Input sound channel is to the mapping method of output channels, signal processing unit and audio decoder
WO2018193163A1 (en) * 2017-04-20 2018-10-25 Nokia Technologies Oy Enhancing loudspeaker playback using a spatial extent processed audio signal
CN111615044A (en) * 2019-02-25 2020-09-01 宏碁股份有限公司 Method and system for correcting energy distribution of sound signal
CN113689890A (en) * 2021-08-09 2021-11-23 北京小米移动软件有限公司 Method and device for converting multi-channel signal and storage medium
CN113889125A (en) * 2021-12-02 2022-01-04 腾讯科技(深圳)有限公司 Audio generation method and device, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SANJEEV MEHROTRA ET AL.: ""Interpolation of combined head and room impulse response for audio spatialization"", 《2011 IEEE 13TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING》, 19 October 2011 (2011-10-19), pages 1 - 6, XP032027534, DOI: 10.1109/MMSP.2011.6093794 *
LIAO, Chuanqi et al.: "Research on 3D Audio Coding Technology Based on Spatial Position Information", Computer Engineering, vol. 43, no. 1, 20 April 2016 (2016-04-20), pages 303 - 308 *
WANG, Chao et al.: "Virtual 3D Spatial Surround Sound Headphone Playback Based on HRTF", Information Technology, vol. 33, no. 1, 31 January 2009 (2009-01-31), pages 39 - 41 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117193702A (en) * 2023-08-10 2023-12-08 深圳市昂晖电子科技有限公司 Playing control method based on embedded player and related device
CN117193702B (en) * 2023-08-10 2024-06-11 深圳市昂晖电子科技有限公司 Playing control method based on embedded player and related device

Also Published As

Publication number Publication date
CN114866948B (en) 2024-07-05

Similar Documents

Publication Publication Date Title
JP6446068B2 (en) Determine and use room-optimized transfer functions
JP7705647B2 (en) Spatial relocation of multiple acoustic streams
EP3893523B1 (en) Audio signal processing method and apparatus
JP6939786B2 (en) Sound field forming device and method, and program
WO2018149275A1 (en) Method and apparatus for adjusting audio output by speaker
US11418903B2 (en) Spatial repositioning of multiple audio streams
JP2016502345A (en) Cooperative sound system
CN110809214B (en) Audio playing method, audio playing device and terminal equipment
CN109165005B (en) Sound effect enhancement method and device, electronic equipment and storage medium
US11863952B2 (en) Sound capture for mobile devices
CN116600242B (en) Audio sound and image optimization methods, devices, electronic equipment and storage media
US12192738B2 (en) Electronic apparatus for audio signal processing and operating method thereof
CN114866948B (en) Audio processing method, device, electronic equipment and readable storage medium
CN113553022A (en) Equipment adjusting method and device, mobile terminal and storage medium
US10440495B2 (en) Virtual localization of sound
CN119233188A (en) Spatial audio system, audio processor and virtual surround sound conversion method for stereo speaker playing device
CN117178568A (en) Error correction for head related filters
Cecchi et al. An efficient implementation of acoustic crosstalk cancellation for 3D audio rendering
US20170365272A1 (en) Device for detecting, monitoring, and cancelling ghost echoes in an audio signal
CN111107481A (en) Audio rendering method and device
CN115706895A (en) Immersive sound reproduction using multiple transducers
WO2022185725A1 (en) Information processing device, information processing method, and program
WO2022047606A1 (en) Method and system for authentication and compensation
US11722821B2 (en) Sound capture for mobile devices
CN119277259A (en) A method for calibrating audio and video, terminal equipment, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant