[go: up one dir, main page]

Horvitz et al., 2000 - Google Patents

Deeplistener: harnessing expected utility to guide clarification dialog in spoken language systems.

Horvitz et al., 2000

View PDF
Document ID
4377204221556194569
Author
Horvitz E
Paek T
Publication year
Publication venue
INTERSPEECH

External Links

Snippet

We describe research on endowing spoken language systems with the ability to consider the cost of misrecognition, and using that knowledge to guide clarification dialog about a user's intentions. Our approach relies on coupling utility-directed policies for dialog with the …
Continue reading at www.cs.cmu.edu (PDF) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/1822Parsing for meaning understanding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/183Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L2015/088Word spotting
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • G10L15/30Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065Adaptation
    • G10L15/07Adaptation to the speaker
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • G10L2015/0635Training updating or merging of old and new templates; Mean values; Weighting
    • G10L2015/0636Threshold criteria for the updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass
    • G06N99/005Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems

Similar Documents

Publication Publication Date Title
US20220122607A1 (en) Controlling an engagement state of an agent during a human-machine dialog
CN114127710B (en) Ambiguous solutions using conversational search history
US7580908B1 (en) System and method providing utility-based decision making about clarification dialog given communicative uncertainty
US10319381B2 (en) Iteratively updating parameters for dialog states
Smith et al. Interaction strategies for an affective conversational agent
US20220115001A1 (en) Method, System and Apparatus for Understanding and Generating Human Conversational Cues
Horvitz et al. Models of attention in computing and communication: from principles to applications
KR101622111B1 (en) Dialog system and conversational method thereof
US20030061029A1 (en) Device for conducting expectation based mixed initiative natural language dialogs
Trung Multimodal dialogue management-state of the art
EP4091161B1 (en) Synthesized speech audio data generated on behalf of human participant in conversation
US11437039B2 (en) Intelligent software agent
Horvitz et al. Harnessing models of users’ goals to mediate clarification dialog in spoken language systems
Horvitz et al. Deeplistener: harnessing expected utility to guide clarification dialog in spoken language systems.
CN116417003A (en) Voice interaction system, method, electronic device and storage medium
US11669697B2 (en) Hybrid policy dialogue manager for intelligent personal assistants
US12148417B1 (en) Label confidence scoring
CN114127694A (en) Error recovery for the session system
Feng et al. ASR-GLUE: A new multi-task benchmark for asr-robust natural language understanding
Paul et al. Intent based multimodal speech and gesture fusion for human-robot communication in assembly situation
US11430446B1 (en) Dialogue system and a dialogue method
CN111292749B (en) Session control method and device of intelligent voice platform
EP4089569A1 (en) A dialogue system and a dialogue method
EP4158621B1 (en) Enabling natural conversations with soft endpointing for an automated assistant
Hofmann Intuitive speech interface technology for information exchange tasks