Zhu et al., 2013
Multi-stage non-negative matrix factorization for monaural singing voice separation
- Document ID
- 11763288721423942278
- Author
- Zhu B
- Li W
- Li R
- Xue X
- Publication year
- 2013
- Publication venue
- IEEE Transactions on Audio, Speech, and Language Processing
Snippet
Separating singing voice from music accompaniment can be of interest for many applications such as melody extraction, singer identification, lyrics alignment and recognition, and content-based music retrieval. In this paper, a novel algorithm for singing …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/121—Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
- G10H2240/131—Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H3/00—Instruments in which the tones are generated by electromechanical means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Zhu et al. | Multi-stage non-negative matrix factorization for monaural singing voice separation | |
| Bello et al. | A tutorial on onset detection in music signals | |
| Rafii et al. | Repeating pattern extraction technique (REPET): A simple method for music/voice separation | |
| Durrieu et al. | A musically motivated mid-level representation for pitch estimation and musical audio source separation | |
| Cho et al. | On the relative importance of individual components of chord recognition systems | |
| Sukittanon et al. | Modulation-scale analysis for content identification | |
| Ikemiya et al. | Singing voice separation and vocal F0 estimation based on mutual combination of robust principal component analysis and subharmonic summation | |
| Giannoulis et al. | Musical instrument recognition in polyphonic audio using missing feature approach | |
| Arora et al. | Multiple F0 estimation and source clustering of polyphonic music audio using PLCA and HMRFs | |
| JP5127982B2 (en) | Music search device | |
| Rosner et al. | Classification of music genres based on music separation into harmonic and drum components | |
| Cogliati et al. | Piano music transcription with fast convolutional sparse coding | |
| FitzGerald et al. | Single channel vocal separation using median filtering and factorisation techniques | |
| Genussov et al. | Multiple fundamental frequency estimation based on sparse representations in a structured dictionary | |
| Cañadas-Quesada et al. | Harmonic-percussive sound separation using rhythmic information from non-negative matrix factorization in single-channel music recordings | |
| Wan et al. | Automatic piano music transcription using audio‐visual features | |
| Dziubinski et al. | Estimation of musical sound separation algorithm effectiveness employing neural networks | |
| Yela et al. | Interference reduction in music recordings combining kernel additive modelling and non-negative matrix factorization | |
| Yoshii et al. | Adamast: A drum sound recognizer based on adaptation and matching of spectrogram templates | |
| Shiu et al. | Musical structure analysis using similarity matrix and dynamic programming | |
| Klapuri | Pattern induction and matching in music signals | |
| Han et al. | Desoloing Monaural Audio Using Mixture Models | |
| Bellur et al. | A cepstrum based approach for identifying tonic pitch in Indian classical music | |
| Glazyrin et al. | Chord recognition using Prewitt filter and self-similarity | |
| Sajid et al. | An Effective Framework for Speech and Music Segregation | |