Chaudhary et al., 2017 - Google Patents

Keyword based indexing of a multimedia file

Chaudhary et al., 2017

Document ID: 2487530723575259577
Author: Chaudhary A; Akshatha K; Kodlekere K; Prasad S
Publication year: 2017
Publication venue: 2017 IEEE International Symposium on Multimedia (ISM)

External Links

Cited by

Snippet

As of February 2014 there were around 10 million registered user for massive open online courses (MOOCS) in about 1500 courses. One of the many reasons for such a huge user base is the advancements in the media streaming and storage technologies. Multimedia has …

Continue reading at ieeexplore.ieee.org (other versions)

238000000034 method 0 abstract description 9

Classifications

- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/102—Programmed access in sequence to addressed parts of tracks of operating record carriers
- G11B27/105—Programmed access in sequence to addressed parts of tracks of operating record carriers of operating discs
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
- G06F17/30023—Querying
- G06F17/30038—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
- G06F17/30041—Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings using location information
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30817—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings
- G06F17/3082—Information retrieval; Database structures therefor; File system structures therefor of video data using information manually generated or using information not derived from the video content, e.g. time and location information, usage information, user ratings using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

Similar Documents

Publication	Publication Date	Title
US8200490B2 (en)	2012-06-12	Method and apparatus for searching multimedia data using speech recognition in mobile device
US8209171B2 (en)	2012-06-26	Methods and apparatus relating to searching of spoken audio data
Hauptmann et al.	1997	Informedia: News-on-demand multimedia information acquisition and retrieval
US10133538B2 (en)	2018-11-20	Semi-supervised speaker diarization
Glass et al.	2007	Recent progress in the MIT spoken lecture processing project.
Foote	1999	An overview of audio information retrieval
US20100299131A1 (en)	2010-11-25	Transcript alignment
WO2023168373A1 (en)	2023-09-07	Structured video documents
Alberti et al.	2009	An audio indexing system for election video material
CN113326387A (en)	2021-08-31	Intelligent conference information retrieval method
KR20210133667A (en)	2021-11-08	Server for providing corpus building service and method therefore
Chen et al.	2000	Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics
Chand et al.	2021	A framework for lecture video segmentation from extracted speech content
Nouza et al.	2012	Making czech historical radio archive accessible and searchable for wide public
GB2451938A (en)	2009-02-18	Methods and apparatus for searching of spoken audio data
Chaudhary et al.	2017	Keyword based indexing of a multimedia file
Nouza et al.	2011	Voice technology to enable sophisticated access to historical audio archive of the Czech radio
Pala et al.	2019	Real-time transcription, keyword spotting, archival and retrieval for telugu TV news using ASR
Bourlard et al.	2013	Processing and linking audio events in large multimedia archives: The eu inevent project
Nouza et al.	2012	Large-scale processing, indexing and search system for Czech audio-visual cultural heritage archives
Nouza et al.	2006	A system for information retrieval from large records of Czech spoken data
Soares et al.	2018	A framework for automatic topic segmentation in video lectures
Gareshma et al.	2022	Interactive Audio Indexing and Speech Recognition based Navigation Assist Tool for Tutoring Videos
Bertillo et al.	2023	Enhancing Accessibility of Parliamentary Video Streams: AI-Based Automatic Indexing Using Verbatim Reports.
Mishra et al.	2023	Indexing and segmentation of video contents: A review