Garnier-Rizet et al., 2008 - Google Patents
CallSurf: Automatic Transcription, Indexing and Structuration of Call Center Conversational Speech for Knowledge Extraction and Query by Content.Garnier-Rizet et al., 2008
View PDF- Document ID
- 5236679963594848599
- Author
- Garnier-Rizet M
- Adda G
- Cailliau F
- Gauvain J
- Guillemin-Lanne S
- Lamel L
- Vanni S
- Waast-Richard C
- et al.
- Publication year
- Publication venue
- LREC
External Links
Snippet
Being the client's first interface, call centres worldwide contain a huge amount of information of all kind under the form of conversational speech. If accessible, this information can be used to detect eg. major events and organizational flaws, improve customer relations and …
- 230000035897 transcription 0 title abstract description 34
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Garnier-Rizet et al. | CallSurf: Automatic Transcription, Indexing and Structuration of Call Center Conversational Speech for Knowledge Extraction and Query by Content. | |
| Fendji et al. | Automatic speech recognition using limited vocabulary: A survey | |
| Hansen et al. | Speechfind: Advances in spoken document retrieval for a national gallery of the spoken word | |
| Waibel et al. | Meeting browser: Tracking and summarizing meetings | |
| US8996371B2 (en) | Method and system for automatic domain adaptation in speech recognition applications | |
| US8583432B1 (en) | Dialect-specific acoustic language modeling and speech recognition | |
| Zissman et al. | Automatic dialect identification of extemporaneous conversational, Latin American Spanish speech | |
| Hain et al. | The 2005 AMI system for the transcription of speech in meetings | |
| Vaudable et al. | Negative emotions detection as an indicator of dialogs quality in call centers | |
| Lamel et al. | Speech processing for audio indexing | |
| Kopparapu | Non-linguistic analysis of call center conversations | |
| Furui | Recent progress in corpus-based spontaneous speech recognition | |
| Chuangsuwanich | Multilingual techniques for low resource automatic speech recognition | |
| Żelasko et al. | AGH corpus of Polish speech | |
| Moyal et al. | Phonetic search methods for large speech databases | |
| Ferraro et al. | Benchmarking open source and paid services for speech to text: an analysis of quality and input variety | |
| Walker et al. | Semi-supervised model training for unbounded conversational speech recognition | |
| Zhou et al. | Discriminative training of the hidden vector state model for semantic parsing | |
| Rao et al. | Language identification using excitation source features | |
| Lin et al. | Phoneme-less hierarchical accent classification | |
| Tarján et al. | Improved recognition of Hungarian call center conversations | |
| Håkansson et al. | Transfer learning for domain specific automatic speech recognition in Swedish: An end-to-end approach using Mozilla’s DeepSpeech | |
| Kepuska et al. | Speech corpus generation from DVDs of movies and tv series | |
| Kolar et al. | Speaker adaptation of language models for automatic dialog act segmentation of meetings | |
| Nguyen et al. | Improving Speech Recognition with Jargon Injection |