Nesta et al., 2011 - Google Patents
Robust Automatic Speech Recognition through On-line Semi Blind Signal ExtractionNesta et al., 2011
View PDF- Document ID
- 6039266553429425140
- Author
- Nesta F
- Matassoni M
- Publication year
- Publication venue
- Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME)
External Links
Snippet
This paper describes the system used to process the data of the CHiME Pascal 2011 competition, whose goal is to separate the desired speech and recognize the commands being spoken. The binaural recorded mixtures are processed by an on-line Semi-Blind …
- 238000000605 extraction 0 title abstract description 8
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6400218B2 (en) | Audio source isolation | |
| HK1244104A1 (en) | Audio source separation | |
| Mohammadiha et al. | Speech dereverberation using non-negative convolutive transfer function and spectro-temporal modeling | |
| Boeddeker et al. | Convolutive transfer function invariant SDR training criteria for multi-channel reverberant speech separation | |
| CN110998723B (en) | Signal processing device using neural network, signal processing method, and recording medium | |
| US10904688B2 (en) | Source separation for reverberant environment | |
| Nesta et al. | A flexible spatial blind source extraction framework for robust speech recognition in noisy environments | |
| US20230306980A1 (en) | Method and System for Audio Signal Enhancement with Reduced Latency | |
| Kubo et al. | Efficient full-rank spatial covariance estimation using independent low-rank matrix analysis for blind source separation | |
| Habets et al. | Dereverberation | |
| Baby et al. | Speech dereverberation using variational autoencoders | |
| Li et al. | An EM algorithm for audio source separation based on the convolutive transfer function | |
| Astudillo et al. | Integration of beamforming and uncertainty-of-observation techniques for robust ASR in multi-source environments | |
| Cornell et al. | Learning filterbanks for end-to-end acoustic beamforming | |
| Nesta et al. | Robust Automatic Speech Recognition through On-line Semi Blind Signal Extraction | |
| Leglaive et al. | Student's t Source and Mixing Models for Multichannel Audio Source Separation | |
| JP4946330B2 (en) | Signal separation apparatus and method | |
| Yoshioka et al. | Dereverberation by using time-variant nature of speech production system | |
| Nesta et al. | Unsupervised spatial dictionary learning for sparse underdetermined multichannel source separation | |
| Takatani et al. | High-fidelity blind separation of acoustic signals using SIMO-model-based independent component analysis | |
| Li et al. | Low complex accurate multi-source RTF estimation | |
| Nakatani et al. | Simultaneous denoising, dereverberation, and source separation using a unified convolutional beamformer | |
| Wang et al. | Real-Time Independent Vector Analysis Using Semi-Supervised Nonnegative Matrix Factorization as a Source Model. | |
| Gao et al. | A physical model-based self-supervised learning method for signal enhancement under reverberant environment | |
| Nishikawa et al. | Stable learning algorithm for blind separation of temporally correlated acoustic signals combining multistage ICA and linear prediction |