Kacprzak et al., 2017 - Google Patents

Speech/music discrimination for analysis of radio stations

Document ID: 4109703264566954798
Author: Kacprzak S; Chwiećko B; Ziółko B
Publication year: 2017
Publication venue: 2017 International conference on systems, signals and image processing (IWSSIP)

Snippet

A computationally efficient feature called Minimum Energy Density (MED) was applied to discriminate between speech and music in audio signals from radio station programs. The presented binary classifier is based on testing two features: energy distribution and …
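The snippet does not define MED, so the following is only a rough sketch of how an energy-distribution test of this kind might look, not the authors' implementation. It assumes MED is the smallest share of a window's total energy held by any single short frame; the function names, the 10 ms frame length, and the threshold value are all hypothetical choices made for illustration. The second feature mentioned in the snippet is truncated in the source and is not modeled here.

```python
import numpy as np


def min_energy_density(signal, sample_rate, frame_ms=10.0):
    """Smallest share of the window's total energy held by any one frame.

    Assumed definition for illustration only; the paper's exact
    formulation of MED is not given in the snippet.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    # Drop the incomplete tail and split the rest into equal frames.
    samples = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = samples.reshape(n_frames, frame_len)
    energy = np.sum(frames ** 2, axis=1)
    total = energy.sum()
    if total == 0.0:
        return 0.0  # an all-silent window has no energy to distribute
    return float(np.min(energy / total))


def classify(signal, sample_rate, threshold=1e-3):
    """Label a window 'speech' when some frame is nearly silent.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    med = min_energy_density(signal, sample_rate)
    return "speech" if med < threshold else "music"


if __name__ == "__main__":
    rate = 16000
    t = np.arange(rate) / rate
    steady = np.sin(2 * np.pi * 440.0 * t)  # continuous tone, music-like
    gated = steady.copy()
    gated[4000:8000] = 0.0                  # inserted pause, speech-like
    print(classify(steady, rate), classify(gated, rate))
```

The intuition behind this sketch is that speech contains pauses between words and syllables, so at least one frame in a window tends to carry almost no energy, while music usually keeps energy spread more evenly across frames.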

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition

Similar Documents

Publication	Publication Date	Title
US7346516B2 (en)	2008-03-18	Method of segmenting an audio stream
Lu et al.	2001	A robust audio classification and segmentation method
Lu et al.	2002	Content analysis for audio classification and segmentation
Chou et al.	2001	Robust singing detection in speech/music discriminator design
Sukittanon et al.	2004	Modulation-scale analysis for content identification
Lavner et al.	2009	A decision-tree-based algorithm for speech/music classification and segmentation
US20070083365A1 (en)	2007-04-12	Neural network classifier for separating audio sources from a monophonic audio signal
Butko et al.	2011	Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion
Pinquier et al.	2002	Robust speech/music classification in audio documents
Zhou et al.	2008	Music onset detection based on resonator time frequency image
Thambi et al.	2014	Random forest algorithm for improving the performance of speech/non-speech detection
Kwon et al.	2002	Speaker change detection using a new weighted distance measure
Kumar et al.	2018	Music Source Activity Detection and Separation Using Deep Attractor Network
Kacprzak et al.	2017	Speech/music discrimination for analysis of radio stations
Li et al.	2006	Singing Voice Separation from Monaural Recordings
Guaus et al.	2004	A non-linear rhythm-based style classification for broadcast speech-music discrimination
Andersson	2004	Audio classification and content description
Zhu et al.	2007	SVM-based audio classification for content-based multimedia retrieval
Kacprzak et al.	2013	Speech/music discrimination via energy density analysis
Zhu et al.	2007	Automatic audio genre classification based on support vector machine
Li	2013	Nonexclusive audio segmentation and indexing as a pre-processor for audio information mining
Keum et al.	2005	Speech/music discrimination using spectral peak feature for speaker indexing
Karunarathna et al.	2019	Classification of voice content in the context of public radio broadcasting
US20250069592A1 (en)	2025-02-27	Method and System for Low-Complexity Real-Time Multiclass Hierarchical Audio Classification
Tao et al.	2008	A fuzzy logic based speech extraction approach for e-learning content production