Zhu et al., 2013 - Google Patents

Multi-stage non-negative matrix factorization for monaural singing voice separation

Zhu et al., 2013

Document ID: 11763288721423942278
Author: Zhu B; Li W; Li R; Xue X
Publication year: 2013
Publication venue: IEEE Transactions on audio, speech, and language processing

External Links

Cited by

Snippet

Separating singing voice from music accompaniment can be of interest for many applications such as melody extraction, singer identification, lyrics alignment and recognition, and content-based music retrieval. In this paper, a novel algorithm for singing …

Continue reading at ieeexplore.ieee.org (other versions)

238000000926 separation method 0 title abstract description 84

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/121—Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
- G10H2240/131—Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H3/00—Instruments in which the tones are generated by electromechanical means
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo

Similar Documents

Publication	Publication Date	Title
Zhu et al.	2013	Multi-stage non-negative matrix factorization for monaural singing voice separation
Bello et al.	2005	A tutorial on onset detection in music signals
Rafii et al.	2012	Repeating pattern extraction technique (REPET): A simple method for music/voice separation
Durrieu et al.	2011	A musically motivated mid-level representation for pitch estimation and musical audio source separation
Cho et al.	2013	On the relative importance of individual components of chord recognition systems
Sukittanon et al.	2004	Modulation-scale analysis for content identification
Ikemiya et al.	2016	Singing voice separation and vocal F0 estimation based on mutual combination of robust principal component analysis and subharmonic summation
Giannoulis et al.	2013	Musical instrument recognition in polyphonic audio using missing feature approach
Arora et al.	2015	Multiple F0 estimation and source clustering of polyphonic music audio using PLCA and HMRFs
JP5127982B2 (en)	2013-01-23	Music search device
Rosner et al.	2014	Classification of music genres based on music separation into harmonic and drum components
Cogliati et al.	2015	Piano music transcription with fast convolutional sparse coding
FitzGerald et al.	2010	Single channel vocal separation using median filtering and factorisation techniques
Genussov et al.	2013	Multiple fundamental frequency estimation based on sparse representations in a structured dictionary
Cañadas-Quesada et al.	2017	Harmonic-percussive sound separation using rhythmic information from non-negative matrix factorization in single-channel music recordings
Wan et al.	2015	Automatic piano music transcription using audio‐visual features
Dziubinski et al.	2005	Estimation of musical sound separation algorithm effectiveness employing neural networks
Yela et al.	2017	Interference reduction in music recordings combining kernel additive modelling and non-negative matrix factorization
Yoshii et al.	2005	Adamast: A drum sound recognizer based on adaptation and matching of spectrogram templates
Shiu et al.	2005	Musical structure analysis using similarity matrix and dynamic programming
Klapuri	2010	Pattern induction and matching in music signals
Han et al.	2007	Desoloing Monaural Audio Using Mixture Models.
Bellur et al.	2013	A cepstrum based approach for identifying tonic pitch in Indian classical music
Glazyrin et al.	2012	Chord recognition using Prewitt filter and self-similarity
Sajid et al.	2020	An Effective Framework for Speech and Music Segregation.