Soni et al., 2019 - Google Patents
Automatic audio event recognition schemes for context-aware audio computing devicesSoni et al., 2019
- Document ID
- 5127701885840509689
- Author
- Soni S
- Dey S
- Manikandan M
- Publication year
- Publication venue
- 2019 Seventh International Conference on Digital Information Processing and Communications (ICDIPC)
External Links
Snippet
Automatic audio event recognition (AER) plays a major role in designing and building intelligent location and context-aware applications including audio surveillance, audio indexing and content retrieval, highlight extraction, drone and robotic navigation, machine …
- 230000001537 neural 0 abstract description 20
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30755—Query formulation specially adapted for audio data retrieval
- G06F17/30758—Query by example, e.g. query by humming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30749—Audio data retrieval using information manually generated or using information not derived from the audio data, e.g. title and artist information, time and location information, usage information, user ratings
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Yang et al. | Combining temporal features by local binary pattern for acoustic scene classification | |
| Heittola et al. | Context-dependent sound event detection | |
| Bountourakis et al. | Machine learning algorithms for environmental sound recognition: Towards soundscape semantics | |
| Soni et al. | Automatic audio event recognition schemes for context-aware audio computing devices | |
| Fan et al. | Deep neural network based environment sound classification and its implementation on hearing aid app | |
| Ntalampiras | Universal background modeling for acoustic surveillance of urban traffic | |
| Waldekar et al. | Analysis and classification of acoustic scenes with wavelet transform-based mel-scaled features | |
| Shabbir et al. | Smart city traffic management: Acoustic-based vehicle detection using stacking-based ensemble deep learning approach | |
| Usaid et al. | Ambulance siren detection using artificial intelligence in urban scenarios | |
| Dosbayev et al. | Audio surveillance: detection of audio-based emergency situations | |
| Seker et al. | CNNsound: Convolutional neural networks for the classification of environmental sounds | |
| Parineh et al. | Acoustic Sensors and Audio Signal Processing in Intelligent Transportation Systems: A Survey | |
| Hajihashemi et al. | Novel time-frequency based scheme for detecting sound events from sound background in audio segments | |
| Mielke et al. | Smartphone application for automatic classification of environmental sound | |
| Segarceanu et al. | Forest monitoring using forest sound identification | |
| Luitel et al. | Sound event detection in urban soundscape using two-level classification | |
| Chinvar et al. | Ambulance siren detection using an MFCC based support vector machine | |
| Ferroudj | Detection of rain in acoustic recordings of the environment using machine learning techniques | |
| Chen et al. | An intelligent nocturnal animal vocalization recognition system | |
| Yang et al. | Sound event detection in real-life audio using joint spectral and temporal features | |
| El-metwally et al. | Optimized deep neural networks audio tagging framework for virtual business assistant | |
| Kakade et al. | Fast classification for identification of vehicles on the road from audio data of pedestrian’s mobile phone | |
| Ren et al. | Learning target template for acoustic event detection from low-SNR training data | |
| Olteanu et al. | Fusion of speech techniques for automatic environmental sound recognition | |
| Catanghal Jr | Sound detection for study room monitoring and evaluation |