
TW202226230A - Method to mute and unmute a microphone signal - Google Patents


Info

Publication number
TW202226230A
Authority
TW
Taiwan
Prior art keywords
microphone signal
mute
input microphone
processor
level
Prior art date
Application number
TW110142936A
Other languages
Chinese (zh)
Inventor
啟昇 陳
倫階 曾
卡斯特羅 艾莉爾 阿雷拉諾 德
Original Assignee
新加坡商創新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 新加坡商創新科技有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0324 Details of processing therefor
    • G10L21/034 Automatic adjustment
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/08 Mouthpieces; Microphones; Attachments therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups
    • H04R2430/01 Aspects of volume control, not necessarily automatic, in sound systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

A method for muting and unmuting a microphone is provided. The method includes providing a processor, receiving an input microphone signal, measuring the input microphone signal for a loudness level at a sampling rate, calculating a mute threshold level, checking if the loudness level is higher than or equal to the mute threshold level, and resetting a mute delay timer upon determining that the loudness level is higher than or equal to the mute threshold level and obtaining the input microphone signal, or checking if the mute delay timer is running upon determining that the loudness level is not higher than or equal to the mute threshold level and attenuating the input microphone signal if the mute delay timer is not running or obtaining the input microphone signal if the mute delay timer is still running, and writing the input microphone signal or attenuated input microphone signal to an output buffer.

Description

Method to mute and unmute a microphone signal

The present invention relates generally to muting and unmuting microphone signals, and more particularly to muting and unmuting microphone signals using a voice activity detector.

Microphones are used during voice or video calls, such as telephone calls or Internet calls (with or without video) made with communication software such as Zoom, Skype and Microsoft Teams. Typically, the microphone is enabled for the whole call. However, an always-enabled microphone picks up unwanted background noise or unintended audio from the environment, which disturbs the far-end parties in the call. To avoid this problem, users typically mute their microphone while they are not speaking, so that background sound and noise are not heard by the far-end parties. The microphone may also be muted by default in order to minimize disruption to other users. However, users often forget to unmute the microphone when they start speaking.

Thus, what is needed is a method to mute the microphone signal when the user is not speaking and to unmute it when the user is speaking. Furthermore, other desirable features and characteristics will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and this background of the disclosure.

In one aspect of the present invention, a method for muting and unmuting a microphone is provided. The method includes providing a processor; receiving an input microphone signal; measuring a volume level of the input microphone signal at a sampling rate; calculating a mute threshold level; checking whether the volume level is higher than or equal to the mute threshold level and, upon determining that it is, resetting a mute delay timer and obtaining the input microphone signal, or, upon determining that it is not, checking whether the mute delay timer is running and attenuating the input microphone signal if the mute delay timer is not running or obtaining the input microphone signal if the mute delay timer is still running; and writing the input microphone signal or the attenuated input microphone signal to an output buffer.

In another aspect of the present invention, a software product is provided that includes a non-transitory storage medium readable by a processor, the non-transitory storage medium having stored thereon a set of instructions for muting and unmuting an input microphone signal. The software product includes a first sequence of instructions which, when executed by the processor, causes the processor to receive an input microphone signal; a second sequence of instructions which, when executed by the processor, causes the processor to measure a volume level of the input microphone signal at a sampling rate; a third sequence of instructions which, when executed by the processor, causes the processor to calculate a mute threshold level; a fourth sequence of instructions which, when executed by the processor, causes the processor to check whether the volume level is higher than or equal to the mute threshold level and, upon determining that it is, to reset a mute delay timer and obtain the input microphone signal, or, upon determining that it is not, to check whether the mute delay timer is running and to attenuate the input microphone signal if the mute delay timer is not running or obtain the input microphone signal if the mute delay timer is still running; and a fifth sequence of instructions which, when executed by the processor, causes the processor to write the input microphone signal or the attenuated input microphone signal to an output buffer.

The following detailed description is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background of the invention or the following detailed description. It is the intent of the various embodiments to present a method of muting and unmuting a microphone signal.

Referring to FIG. 1, a flowchart 100 depicting a method for muting and unmuting a microphone signal in accordance with various embodiments is shown. A device is provided with a processor. The processor receives an input microphone signal in step 110, and the volume level of the microphone signal is measured in step 120. In an embodiment, the input microphone signal may be in the frequency domain. The band amplitude of the microphone signal within a frequency band can be measured by taking the root mean square (RMS) of the complex input microphone signal and multiplying it by an amplitude scaling factor. The frequency band may extend from a lower limit frequency (e.g., 250 Hz) to an upper limit frequency (e.g., the Nyquist frequency of the input microphone signal or 8000 Hz, whichever is lower). The amplitude scaling factor takes the sampling rate of the audio into account. In one embodiment, the amplitude scaling factor may be the inverse of the square root of a function of the sampling rate and the frame size. Multiplying by the amplitude scaling factor makes the band amplitude invariant across different sampling rates and frame sizes.
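The band amplitude measurement described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the description does not specify the "function of the sampling rate and frame size", so the sketch assumes it is simply the frame size, and the function name `band_amplitude` and its parameters are hypothetical. The input is assumed to be a one-sided complex spectrum whose bin k lies at k × sample_rate / frame_size Hz.

```python
import math

def band_amplitude(spectrum, sample_rate, frame_size, f_lo=250.0, f_hi=8000.0):
    """RMS magnitude of the in-band bins, multiplied by an amplitude scaling factor."""
    # Upper limit: the Nyquist frequency or 8000 Hz, whichever is lower.
    f_hi = min(f_hi, sample_rate / 2.0)
    bin_hz = sample_rate / frame_size  # spacing between adjacent frequency bins
    # Squared magnitudes of the complex bins that fall inside the band.
    power = [abs(x) ** 2 for k, x in enumerate(spectrum)
             if f_lo <= k * bin_hz <= f_hi]
    rms = math.sqrt(sum(power) / len(power))
    # ASSUMPTION: the unspecified function of sampling rate and frame size
    # is taken to be the frame size itself.
    scale = 1.0 / math.sqrt(frame_size)
    return rms * scale
```

With a 16 kHz sampling rate and a frame size of 512, the bins are 31.25 Hz apart, so bins 8 through 256 fall inside the 250 Hz to 8000 Hz band.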

The current volume level is obtained by smoothing the band amplitude with a smoothing function that has a rise response time and a release response time. The rise response determines how quickly or slowly the smoothed value increases relative to its previous value, and the release response determines how quickly or slowly it decreases. In one embodiment, the rise response time and the release response time are both 16 msec. The volume level of the microphone signal is measured in real time with a sampling rate of 16 kHz, a frame size of 512, a rise response time of 16 msec and a release response time of 16 msec, so that the volume level can be determined every 32 msec. As will be described in further detail below, this advantageously allows the microphone signal to be unmuted almost immediately, without loss of speech. Depending on system resources and constraints, other suitable sampling rates and frame sizes may also be used. For example, a sampling rate of 48 kHz and a frame size of 512 allow the volume level of the microphone signal to be determined every 10.67 msec.
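The update interval follows directly from frame size divided by sampling rate (512 / 16000 = 32 msec, 512 / 48000 ≈ 10.67 msec). The sketch below reproduces that arithmetic and shows one possible smoother with separate rise and release response times. The description does not give the smoothing formula, so a one-pole exponential smoother is assumed here; `frame_period_ms`, `smoothing_coeff` and `LevelSmoother` are hypothetical names.

```python
import math

def frame_period_ms(sample_rate, frame_size):
    """Time covered by one frame, i.e. how often a new volume level is available."""
    return 1000.0 * frame_size / sample_rate

def smoothing_coeff(response_ms, period_ms):
    # ASSUMPTION: one-pole exponential smoothing; the source does not state the formula.
    return 1.0 - math.exp(-period_ms / response_ms)

class LevelSmoother:
    """Smooths a level with different speeds for increases (rise) and decreases (release)."""

    def __init__(self, rise_ms, release_ms, period_ms, initial=0.0):
        self.rise = smoothing_coeff(rise_ms, period_ms)
        self.release = smoothing_coeff(release_ms, period_ms)
        self.value = initial

    def update(self, x):
        # Use the rise coefficient when the input is above the current value,
        # otherwise the release coefficient.
        coeff = self.rise if x > self.value else self.release
        self.value += coeff * (x - self.value)
        return self.value
```

A smoother for the embodiment above would be created as `LevelSmoother(16.0, 16.0, frame_period_ms(16000, 512))` and updated once per frame with the measured band amplitude.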

In step 130, the processor calculates a mute threshold level according to the voice activity detection (VAD) settings. In one embodiment, the mute threshold level is set according to one of five different VAD modes: automatic calibration, manual calibration via presets, manual calibration via preset levels, manual calibration via custom values, and real-time auto-adjustment. In one embodiment, the default preferred mode is the real-time auto-adjustment mode.

When the mode is set to automatic calibration, the user is asked to remain silent for a certain duration (e.g., at least 3 seconds) while audio calibration is in progress. During automatic calibration, the processor measures the peak ambient noise level and adjusts the mute threshold level according to the measured level. The peak noise may be measured for every 200 ms of the microphone signal and stored in a circular buffer of size 8. This is equivalent to obtaining the peak noise over the last 1.6 sec (8 × 200 ms), updated every 200 ms. A button may be made available to the user to start the calibration. Although automatic calibration can give a more accurate measurement of the ambient sound and noise level, it requires the user to perform the calibration at the start of every call.
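The circular buffer of eight 200 ms peaks can be sketched with a bounded deque. This is an illustrative sketch only; the class name `PeakNoiseTracker` and its methods are hypothetical, and the description does not say how the mute threshold level is derived from the measured peak, so that step is left out.

```python
from collections import deque

class PeakNoiseTracker:
    """Keeps the peak level of each of the last `blocks` blocks (8 x 200 ms = 1.6 s)."""

    def __init__(self, blocks=8):
        # A deque with maxlen drops the oldest entry automatically,
        # which gives circular-buffer behaviour.
        self.peaks = deque(maxlen=blocks)

    def push_block(self, levels):
        """Store the peak of one 200 ms block of measured levels."""
        self.peaks.append(max(levels))

    def peak(self):
        """Peak noise over the buffered window, or 0.0 if nothing is buffered yet."""
        return max(self.peaks) if self.peaks else 0.0

    def clear(self):
        self.peaks.clear()
```

Calling `push_block` once per 200 ms keeps `peak()` equal to the maximum over the most recent 1.6 sec.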

When the mode is set to manual calibration via presets, the user is asked to select among different presets by considering their speech level (e.g., loud, moderate, soft), the ambient noise level (e.g., high, moderate or low background noise), and the microphone in use (e.g., headset microphone, earbud microphone, forehead microphone, far-field microphone). The mute threshold level is set according to predefined preset values that are pre-tuned locally to correspond to the selected preset.

When the mode is set to manual calibration via preset levels, the user is asked to select custom presets, such as but not limited to a specific microphone type and model and an environment type. For example, the user may be presented with microphone options such as "Creative Labs Live! Cam Sync HD 1080p Webcam Microphone", "Lewitt LCT 640 TS Microphone", "Audio Technica AE2300 Microphone", "Panasonic Dynamic Microphone WM-530", and so on. The user may also be presented with environment options such as "market", "shopping mall", "office", and so on. The mute threshold level is set according to the preset level that corresponds to the selected custom preset.

When the mode is set to manual calibration via custom values, the mute threshold level is set according to a noise floor and a user-defined fixed offset. A slider may be presented to the user so that the user can adjust the offset value for the VAD. The noise floor can be measured by tracking the minimum level of the band amplitude, using a smoothing function with a slow rise response and a fast release response. The rise response determines how quickly or slowly the smoothed value increases relative to its previous value, and the release response determines how quickly or slowly it decreases. In a preferred embodiment, the rise response time is 10 sec and the release response time is 50 msec.
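A slow rise and a fast release make the smoothed value follow the quiet parts of the signal while barely reacting to short loud passages, which is what tracking a noise floor requires. The sketch below illustrates this under stated assumptions: levels and the offset are expressed in dB, the smoother is a one-pole exponential filter (the description gives only the response times), and the names `track_noise_floor` and `mute_threshold` are hypothetical.

```python
import math

def track_noise_floor(levels_db, period_ms, rise_ms=10_000.0, release_ms=50.0):
    """Follow the minimum of a level sequence: rises slowly, falls quickly."""
    # ASSUMPTION: one-pole exponential smoothing coefficients.
    rise = 1.0 - math.exp(-period_ms / rise_ms)
    release = 1.0 - math.exp(-period_ms / release_ms)
    floor = levels_db[0]
    for level in levels_db[1:]:
        coeff = rise if level > floor else release
        floor += coeff * (level - floor)
    return floor

def mute_threshold(noise_floor_db, offset_db):
    """Mute threshold level = noise floor plus the user-defined fixed offset."""
    return noise_floor_db + offset_db
```

For example, ten 32 msec frames of speech at -20 dB on top of a -60 dB background raise the tracked floor by only a little over 1 dB, and the floor returns to -60 dB within a few frames once the speech stops.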

When the mode is set to real-time auto-adjustment, the mute threshold level is continuously updated in real time. During the muted condition, the mute threshold level is based on the instantaneous peak noise. In a preferred embodiment, smoothing is applied with a rise response time of 1 msec and a release response time of 2000 msec. The peak noise may be measured for every 200 ms of the microphone signal and stored in a circular buffer of size 8. This is equivalent to obtaining the peak noise over the last 1.6 sec (8 × 200 ms), updated every 200 ms. During a mute-to-unmute transition, the previous mute threshold level value may be stored as a mute threshold (minimum) value. During an unmute-to-mute transition, the circular buffer is cleared. During the unmuted condition, the mute threshold level is based on the average volume level. In a preferred embodiment, the average volume level is calculated by smoothing the measured band amplitude with a rise response time of 200 msec and a release response time of 200 msec. If the calculated average volume level plus a predefined fixed offset is lower than the stored mute threshold (minimum) value, the mute threshold (minimum) value is used instead. In a preferred embodiment, smoothing is applied with a rise response time of 2000 msec and a release response time of 2000 msec. Advantageously, the auto-adjustment mode has no configuration settings for the user to set or select.
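The state handling in the real-time auto-adjustment mode can be sketched as below. This is a simplified illustration under several assumptions: levels and the offset are in dB, the threshold in the muted condition is taken to be the buffered peak itself, and the smoothing stages are omitted. The class `AutoThreshold`, its method names and the example offset are hypothetical.

```python
from collections import deque

class AutoThreshold:
    """Simplified mute threshold logic for the auto-adjustment mode (smoothing omitted)."""

    def __init__(self, offset_db, blocks=8):
        self.offset_db = offset_db          # predefined fixed offset
        self.peaks = deque(maxlen=blocks)   # circular buffer of 200 ms peaks
        self.threshold_db = float("-inf")
        self.min_threshold_db = float("-inf")
        self.muted = True

    def set_muted(self, muted):
        if self.muted and not muted:
            # Mute-to-unmute: keep the previous threshold as the minimum value.
            self.min_threshold_db = self.threshold_db
        elif not self.muted and muted:
            # Unmute-to-mute: clear the circular buffer.
            self.peaks.clear()
        self.muted = muted

    def update_muted(self, block_peak_db):
        # ASSUMPTION: while muted, the threshold equals the buffered peak noise.
        self.peaks.append(block_peak_db)
        self.threshold_db = max(self.peaks)
        return self.threshold_db

    def update_unmuted(self, avg_level_db):
        # Average level plus offset, but never below the stored minimum.
        self.threshold_db = max(avg_level_db + self.offset_db, self.min_threshold_db)
        return self.threshold_db
```

With an offset of -15 dB and a stored minimum of -50 dB, an average level of -30 dB gives a threshold of -45 dB, while an average level of -60 dB falls back to the stored -50 dB.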

In step 140, the processor checks whether the volume level is greater than or equal to the mute threshold level. When the check indicates that the volume level is greater than or equal to the mute threshold level, the mute delay timer is reset in step 150. The mute delay timer controls when the microphone is automatically muted (by attenuating the microphone signal) once the measured volume level is below the mute threshold level. Whenever the measured volume level is higher than or equal to the mute threshold level, the mute delay timer is reset in step 150. When the measured volume level is lower than the mute threshold level, the mute delay timer continues to run in step 180 until it expires. When the timer expires, the input microphone signal is attenuated sufficiently in step 190 to achieve the effect of muting the microphone. The mute delay timer setting applies to all five modes, and the user is able to set a preferred value. In one embodiment, the default value is 1 second. A mute delay of 1 second means that the microphone is muted if no voice activity is detected for 1 second. The mute delay timer advantageously reduces unnecessary brief muting of the microphone caused by the user pausing briefly while speaking during a call, which greatly enhances the overall experience of the call. In step 160, the original microphone signal is obtained, and in step 170 it is written to an output buffer before the process returns to step 110. The effect of unmuting the microphone signal is achieved when the original microphone signal is written to the output buffer. The effect of muting the microphone signal is achieved when the microphone signal is attenuated in step 190 and written to the output buffer in step 170.

On the other hand, when the volume level is not greater than or equal to the mute threshold level in step 140, the processor checks in step 180 whether the mute delay timer is running. When the check indicates that the mute delay timer is running, the process proceeds to step 160 and then to step 170. When the mute delay timer is not running (it has expired), the microphone is muted in step 190 by attenuating the microphone signal, and the process proceeds to step 170 before returning to step 110. The method in flowchart 100 measures the input microphone signal continuously.
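The decision flow of steps 140 through 190 can be sketched as a small per-frame gate. This is an illustrative sketch, not the patented implementation: the timer is modelled as a countdown decremented by the frame period, and the names `MuteGate`, `process`, `period_s` and the `attenuate` callback are hypothetical.

```python
class MuteGate:
    """Per-frame mute/unmute decision with a mute delay timer (default 1 second)."""

    def __init__(self, mute_delay_s=1.0):
        self.mute_delay_s = mute_delay_s
        self.remaining_s = 0.0  # 0 means the timer is not running (expired)

    def process(self, frame, level, threshold, period_s, attenuate):
        if level >= threshold:                    # step 140
            self.remaining_s = self.mute_delay_s  # step 150: reset the timer
            out = frame                           # step 160: original signal
        elif self.remaining_s > 0.0:              # step 180: timer still running
            self.remaining_s -= period_s
            out = frame                           # step 160: original signal
        else:
            out = attenuate(frame)                # step 190: timer expired, mute
        return out  # the caller writes this to the output buffer (step 170)
```

The gate is called once per frame with the current volume level and mute threshold level. Speech resets the countdown, so short pauses pass through unchanged and attenuation starts only after the level has stayed below the threshold for the full delay.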

In one embodiment, a visual indicator is displayed so that the user knows the current mute status. While the user selects and adjusts the VAD settings, the microphone signal is analyzed at the same time and the mute status is displayed, so that the user can change the VAD settings accordingly. Because the environmental conditions when the user selected and adjusted the VAD settings may differ from the conditions during the actual call, displaying the current mute status for the entire duration of the call lets the user know the mute status in real time and, if necessary, change the VAD settings accordingly.

During the muted condition, the microphone signal is attenuated to produce an effect similar to muting the microphone. In a preferred embodiment, a dynamic attenuation technique is implemented that takes the strength of the real-time microphone signal into account, instead of applying a fixed attenuation to mute the microphone signal. When the user is not speaking and the microphone signal level is low (e.g., in a scenario where the background noise is low), less attenuation is applied to reach the target muted audio level of the microphone signal. When the user is not speaking and the microphone signal level is high (e.g., in a scenario where the background noise level is high), more attenuation is applied to reach the target muted audio level. The target muted audio level of the microphone signal is chosen to be low enough that the audio is inaudible to a person, yet high enough that a communication application can still detect the presence of the attenuated microphone signal. The attenuation level is calculated from the target audio level of the input microphone signal during the muted condition (e.g., about -66 dB) and the currently measured volume level, with smoothing applied using a rise response time of 2000 msec and a release response time of 200 msec. Advantageously, this prevents some communication applications from concluding that there may be a problem with the user's microphone or audio system setup because they cannot detect the presence of a microphone signal during the communication.
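The dynamic attenuation can be sketched as choosing the gain that moves the currently measured level to the target level. This is a minimal illustration under stated assumptions: levels are in dB, the gain is clamped so that a signal already below the target is never amplified (a choice made for this sketch, not stated in the description), and the smoothing of the attenuation is omitted. The names `attenuation_gain` and `attenuate` are hypothetical.

```python
def attenuation_gain(measured_db, target_db=-66.0):
    """Linear gain that brings a signal at measured_db down to target_db."""
    # ASSUMPTION: never amplify; a signal already below the target passes unchanged.
    gain_db = min(0.0, target_db - measured_db)
    return 10.0 ** (gain_db / 20.0)

def attenuate(frame, measured_db, target_db=-66.0):
    """Scale every sample in the frame by the computed gain."""
    gain = attenuation_gain(measured_db, target_db)
    return [sample * gain for sample in frame]
```

A background level of -46 dB needs 20 dB of attenuation (a linear gain of 0.1), whereas a background level of -60 dB needs only 6 dB, which matches the "more attenuation for louder backgrounds" behaviour described above.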

Although the steps in the flowchart are presented sequentially, it should be appreciated that certain steps may be performed concurrently or in a different order. The steps may be implemented in hardware, software, firmware, or any combination thereof.

Thus, it can be seen that a method has been presented to mute the microphone signal when the user is not speaking and to unmute it when the user is speaking. One advantage of the present invention is that it provides a way for a device to automatically mute the microphone when the user is not speaking and unmute it when the user is speaking. Advantageously, the microphone is unmuted almost immediately, without loss of speech.

While exemplary embodiments have been presented in the foregoing detailed description, it should be appreciated that a vast number of variations exist. It should further be appreciated that the exemplary embodiments are only examples and are not intended to limit the scope, applicability, operation, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention, it being understood that various changes may be made in the function and arrangement of the steps and methods of operation described in the exemplary embodiments without departing from the scope of the invention as set forth in the appended claims.

100: flowchart; 110: step; 120: step; 130: step; 140: step; 150: step; 160: step; 170: step; 180: step; 190: step

[FIG. 1] is a flowchart depicting a method for muting and unmuting a microphone signal according to various embodiments.

Claims (20)

一種用於將麥克風信號靜音及取消靜音之方法,其包括: 設置處理器; 接收輸入麥克風信號; 以取樣率來量測所述輸入麥克風信號的音量位準; 計算靜音臨界位準; 檢查所述音量位準是否高於或等於所述靜音臨界位準,並且在判斷所述音量位準高於或等於所述靜音臨界位準之後重置靜音延遲計時器並且獲得所述輸入麥克風信號、或是在判斷所述音量位準並未高於或等於所述靜音臨界位準之後檢查所述靜音延遲計時器是否正在運行,並且若所述靜音延遲計時器並未運行,則衰減所述輸入麥克風信號、或是若所述靜音延遲計時器仍在運行,則獲得所述輸入麥克風信號;以及 將所述輸入麥克風信號或是被衰減的所述輸入麥克風信號寫入輸出緩衝器。 A method for muting and unmuting a microphone signal, comprising: set the processor; Receive input microphone signal; measuring the volume level of the input microphone signal with a sampling rate; Calculate the mute critical level; Check whether the volume level is higher than or equal to the mute threshold level, and reset the mute delay timer and obtain the input microphone signal after judging that the volume level is higher than or equal to the mute threshold level , or check whether the mute delay timer is running after judging that the volume level is not higher than or equal to the mute threshold level, and if the mute delay timer is not running, attenuate the mute delay timer an input microphone signal, or if the mute delay timer is still running, obtaining the input microphone signal; and The input microphone signal or the attenuated input microphone signal is written to an output buffer. 如請求項1之方法,其中所述輸入麥克風信號在頻域中,並且量測所述輸入麥克風信號的音量位準的步驟藉由取所述輸入麥克風信號的均方根乘上幅度縮放因數。The method of claim 1, wherein the input microphone signal is in the frequency domain, and the step of measuring the volume level of the input microphone signal is by taking the root mean square of the input microphone signal and multiplying it by an amplitude scaling factor. 如請求項2之方法,其中所述幅度縮放因數是所述取樣率及音框尺寸的函數的平方根的倒數。The method of claim 2, wherein the amplitude scaling factor is the inverse of the square root of a function of the sample rate and frame size. 如請求項1之方法,其中所述輸入麥克風信號的所述音量位準每32毫秒被判斷出。The method of claim 1, wherein the volume level of the input microphone signal is determined every 32 milliseconds. 
如請求項1之方法,其中計算所述靜音臨界位準的步驟包括檢查語音活動偵測模式,並且獲得用於計算所述靜音臨界位準的一組參數。The method of claim 1, wherein the step of calculating the silence threshold includes examining a voice activity detection mode, and obtaining a set of parameters for calculating the silence threshold. 如請求項5之方法,其中所述語音活動偵測模式從由自動校準、透過預設的手動校準、透過預設位準的手動校準、透過客製值的手動校準、以及即時自動調整所構成的群組中所選出,並且其中所述即時自動調整模式由預設所選出。The method of claim 5, wherein the voice activity detection mode consists of automatic calibration, manual calibration through presets, manual calibration through preset levels, manual calibration through custom values, and real-time automatic adjustment is selected from the group of , and wherein the instant auto-adjustment mode is selected by default. 如請求項1之方法,其中所述靜音延遲計時器被配置為1秒。The method of claim 1, wherein the silence delay timer is configured to be 1 second. 如請求項1之方法,其進一步包括獲得用於使用者的所述靜音延遲計時器的較佳值、以及配置所述靜音延遲計時器至所述較佳值的步驟。The method of claim 1, further comprising the steps of obtaining a preferred value for the silence delay timer for a user, and configuring the silence delay timer to the preferred value. 如請求項1之方法,其進一步包括顯示視覺指示器以顯示目前靜音狀態的步驟。The method of claim 1, further comprising the step of displaying a visual indicator to show the current mute state. 如請求項1之方法,其中衰減所述輸入麥克風信號的步驟包括根據所述輸入麥克風信號的目標音訊位準以及所述音量位準來決定衰減值,並且其中所述輸入麥克風信號的所述目標音訊位準足夠低到人無法聽到述音訊,但是足夠高到通訊應用程式仍然偵測到被衰減的所述輸入麥克風信號的存在。The method of claim 1, wherein the step of attenuating the input microphone signal comprises determining an attenuation value based on a target audio level of the input microphone signal and the volume level, and wherein the target of the input microphone signal The audio level is low enough that a human cannot hear the audio, but high enough that the communication application still detects the presence of the attenuated input microphone signal. 
11. A software product comprising a non-transitory storage medium readable by a processor, the non-transitory storage medium having stored thereon a set of instructions for muting and unmuting an input microphone signal, the set of instructions comprising:
a first sequence of instructions that, when executed by the processor, causes the processor to receive an input microphone signal;
a second sequence of instructions that, when executed by the processor, causes the processor to measure a volume level of the input microphone signal at a sampling rate;
a third sequence of instructions that, when executed by the processor, causes the processor to calculate a mute threshold level;
a fourth sequence of instructions that, when executed by the processor, causes the processor to check whether the volume level is higher than or equal to the mute threshold level, and either (a) upon determining that the volume level is higher than or equal to the mute threshold level, to reset a mute delay timer and obtain the input microphone signal, or (b) upon determining that the volume level is not higher than or equal to the mute threshold level, to check whether the mute delay timer is running, and to attenuate the input microphone signal if the mute delay timer is not running, or to obtain the input microphone signal if the mute delay timer is still running; and
a fifth sequence of instructions that, when executed by the processor, causes the processor to write the input microphone signal or the attenuated input microphone signal to an output buffer.
12. The software product of claim 11, wherein the input microphone signal is in the frequency domain, and the volume level of the input microphone signal is measured by taking the root mean square of the input microphone signal multiplied by an amplitude scaling factor.
13. The software product of claim 12, wherein the amplitude scaling factor is the reciprocal of the square root of a function of the sampling rate and a frame size.
14. The software product of claim 11, wherein the volume level of the input microphone signal is determined every 32 milliseconds.
15. The software product of claim 11, wherein the mute threshold level is calculated based on a voice activity detection mode and a set of parameters.
16. The software product of claim 15, wherein the voice activity detection mode is selected from the group consisting of automatic calibration, manual calibration via presets, manual calibration via preset levels, manual calibration via custom values, and real-time automatic adjustment, and wherein the real-time automatic adjustment mode is selected by default.
17. The software product of claim 11, wherein the mute delay timer is configured to 1 second.
18. The software product of claim 11, further comprising a sequence of instructions that, when executed by the processor, causes the processor to obtain a preferred value of the mute delay timer for a user, and to configure the mute delay timer to the preferred value.
19. The software product of claim 11, further comprising a sequence of instructions that, when executed by the processor, causes the processor to display a visual indicator to show a current mute state.
20. The software product of claim 11, wherein the input microphone signal is attenuated by an attenuation value determined from a target audio level of the input microphone signal and the volume level, and wherein the target audio level of the input microphone signal is low enough that a person cannot hear the audio, but high enough that a communication application still detects the presence of the attenuated input microphone signal.
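The claims state that the attenuation value depends on a target audio level and the measured volume level, but they give no formula. The Python sketch below shows one plausible reading, in which the linear gain is the ratio of target level to measured level so that the attenuated frame lands at the target. The function names, the ratio formula, and the time-domain RMS measurement are assumptions for illustration. The frequency-domain measurement with an amplitude scaling factor recited in the claims is not reproduced here, because the scaling function is not specified.

```python
import math


def rms(frame):
    """Plain time-domain root mean square of a frame (illustrative level measure)."""
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def attenuation_gain(volume_level, target_level):
    """Assumed formula: linear gain that brings volume_level down to target_level."""
    if volume_level <= target_level:
        # Already at or below the target (including silence): never amplify.
        return 1.0
    return target_level / volume_level


def attenuate(frame, target_level):
    """Scale a frame so that its RMS level does not exceed target_level."""
    gain = attenuation_gain(rms(frame), target_level)
    return [sample * gain for sample in frame]
```

Deriving the gain from the measured level, rather than using a fixed gain, keeps the residual signal at a predictable level regardless of how loud the background is. This matches the stated aim of leaving a signal that is inaudible yet still detectable by the communication application.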
TW110142936A 2020-12-29 2021-11-18 Method to mute and unmute a microphone signal TW202226230A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063131424P 2020-12-29 2020-12-29
US63/131,424 2020-12-29

Publications (1)

Publication Number Publication Date
TW202226230A (en) 2022-07-01

Family

ID=78825128

Family Applications (1)

Application Number Publication Number Priority Date Filing Date Title
TW110142936A TW202226230A (en) 2020-12-29 2021-11-18 Method to mute and unmute a microphone signal

Country Status (4)

Country Link
US (1) US11947868B2 (en)
EP (1) EP4024893A1 (en)
CN (1) CN114697810A (en)
TW (1) TW202226230A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11589154B1 (en) * 2021-08-25 2023-02-21 Bose Corporation Wearable audio device zero-crossing based parasitic oscillation detection
CN117835108B (en) * 2024-03-05 2024-05-28 厦门乐人电子有限公司 Wireless microphone mute prediction method, system, terminal and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2546001B2 (en) * 1989-12-15 1996-10-23 三菱電機株式会社 Automatic gain control device
US5991718A (en) * 1998-02-27 1999-11-23 At&T Corp. System and method for noise threshold adaptation for voice activity detection in nonstationary noise environments
US20050014535A1 (en) * 2003-07-18 2005-01-20 Pratik Desai System and method for speaker-phone operation in a communications device
US8620653B2 (en) * 2009-06-18 2013-12-31 Microsoft Corporation Mute control in audio endpoints
US9210503B2 (en) * 2009-12-02 2015-12-08 Audience, Inc. Audio zoom
JP5575977B2 (en) * 2010-04-22 2014-08-20 クゥアルコム・インコーポレイテッド Voice activity detection
US8798283B2 (en) * 2012-11-02 2014-08-05 Bose Corporation Providing ambient naturalness in ANR headphones
EP3188495B1 (en) * 2015-12-30 2020-11-18 GN Audio A/S A headset with hear-through mode

Also Published As

Publication number Publication date
EP4024893A1 (en) 2022-07-06
CN114697810A (en) 2022-07-01
US11947868B2 (en) 2024-04-02
US20220206739A1 (en) 2022-06-30
