
CN104077382A - Method for improving GDM (Global Data Manager) feature selection of audio classifier - Google Patents


Info

Publication number
CN104077382A
Authority
CN
China
Prior art keywords
feature
sigma
categories
separation
degree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410298526.2A
Other languages
Chinese (zh)
Inventor
王荣燕
戎丽霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dezhou University
Original Assignee
Dezhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dezhou University filed Critical Dezhou University
Priority to CN201410298526.2A priority Critical patent/CN104077382A/en
Publication of CN104077382A publication Critical patent/CN104077382A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a GDM feature selection method for improving the performance of an audio classifier. For each audio type c, c ∈ [1, C], a Gaussian mixture model G_fc is trained for every feature f. The first feature f_1 is chosen by computing, for each feature, the separability between every pair of categories and selecting the feature that maximizes the average separability over all categories. After f_1 is selected, it is removed from the candidate feature set, and the two classes c_1 and c_2 with the smallest separability under f_1 are identified. The second feature f_2 is then chosen as the feature that maximizes the separability between c_1 and c_2. After f_2 is selected, it is removed from the candidate feature set, and the selected features f_1 and f_2 are combined into a feature vector; iterating this procedure yields the final selected features f_1, f_2, …, f_L. The feature subset selected by the invention makes the most easily confused categories the most distinguishable and improves the overall classification accuracy of the classifier.

Description

A GDM Feature Selection Method for Improving an Audio Classifier

Technical Field

The invention belongs to the field of audio feature extraction, and in particular relates to a GDM feature selection method for improving an audio classifier.

Background Art

Audio features are another key factor affecting the performance of an audio classifier. A raw audio stream is itself only a non-semantic, unstructured binary stream; apart from limited information such as the sampling rate, quantization precision, and coding method, it contains no explicit structural or semantic information.

The human ear has a remarkable ability to discriminate sounds. Given an audio stream, a listener can not only immediately identify the type of audio, but also perceive hard-to-describe characteristics such as the speaker's emotion or the mood of a piece of music (excited, subdued, and so on). For a computer to classify and recognize audio the way the human ear does, the audio stream must first be converted from a sequence of binary symbols into feature parameters that reflect the differences between audio types, i.e. feature extraction. Feature extraction is the foundation of any classification problem.

Selecting features with clear discriminative power, according to the nature of the specific problem and domain, is a critical part of designing the classification process. With limited training samples, the goal is to design a classifier with good generalization using as few features as possible.

Traditional feature selection algorithms are based on the criterion of maximizing the average separability over all categories (GMM-based Mean Separability Maximization, GMSM). The performance of such algorithms is easily dominated by categories that are already easy to separate. In practice, however, the performance of a multi-class audio classifier is affected far more by the easily confused categories than by the easily separated ones; to improve the classifier, the key is to improve the classification accuracy between confusable categories. Feature selection should therefore favor features that make the confusable categories easier to distinguish.

Summary of the Invention

The purpose of the present invention is to provide a GDM feature selection method for improving an audio classifier, aiming at improving the classification accuracy between easily confused categories.

The invention is implemented as follows. The specific steps of the GDM feature selection method for improving an audio classifier are:

Step 1: train the models. For each audio type c, c ∈ [1, C], train a Gaussian mixture model G_fc for every feature f;

Step 2: select the first feature f_1. For each feature, compute the separability between every pair of categories, and choose the first feature as the one that maximizes the average separability over all categories, i.e.:

f_1 = \arg\max_{f \in [1, F]} \sum_{i=1}^{C} \sum_{j=1}^{C} S_f(G_{fi}, G_{fj});

Step 3: after selecting the first feature f_1, remove f_1 from the candidate feature set and find the two classes c_1 and c_2 with the smallest separability under f_1, i.e.:

(c_1, c_2) = \arg\min_{i,j} S_{f_1}(G_{f_1 i}, G_{f_1 j});

Step 4: select the second feature f_2 as the feature that maximizes the separability between the two classes c_1 and c_2 found in Step 3, i.e.:

f_2 = \arg\max_{f} S_f(G_{f c_1}, G_{f c_2});

Step 5: after selecting the second feature f_2, remove f_2 from the candidate feature set and combine the selected features f_1 and f_2 into a feature vector. Iterate Steps 3–4, replacing the two formulas in Steps 3 and 4 with the following, respectively:

\{c_l, c_{l+1}\} = \arg\min_{i,j} \sum_{m=1}^{l-1} S_{f_m}(i, j)

f_l = \arg\max_{f} S_f(G_{f c_l}, G_{f c_{l+1}})

where l denotes the iteration index;

Check whether the stopping condition is met: if l < L, return to Step 3; otherwise stop the iteration, and the selected features are f_1, f_2, …, f_L.

Further, the problem addressed by the GDM feature selection method for improving an audio classifier is formulated as follows:

Suppose there are C audio types and F features, from which L features are to be selected. First, for each feature f, train a Gaussian mixture model GMM_fc for every category c ∈ [1, C]. The probability density function of the Gaussian mixture distribution of class c is:

p(X \mid \Theta_c) = \sum_{i=1}^{K} \pi_i \, p(X \mid \theta_i)

where K is the number of mixture components; \Theta_c = (\pi_1, \ldots, \pi_K, \theta_1, \ldots, \theta_K) denotes the model parameters; \pi_i is the weight of the i-th mixture component and satisfies the constraint \sum_{i=1}^{K} \pi_i = 1; and \theta_i = \{\mu_i, \Sigma_i\} denotes the parameters of the i-th mixture component;

p(X \mid \theta_i) is an individual Gaussian component, expressed as:

p(X \mid \theta_i) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\!\left(-\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i)\right)

where \mu_i is the D-dimensional mean vector of the Gaussian component;

\Sigma_i is the D × D covariance matrix;

The separability between two categories is defined as:

S_f(\mathrm{GMM}_{fk}, \mathrm{GMM}_{fl}) = \mathrm{dis}(\mathrm{GMM}_{fk}, \mathrm{GMM}_{fl})

where dis(·,·) denotes the distance between two Gaussian mixture models, measured here by the improved symmetric K-L2 distance, computed as:

d(i, j) = \frac{1}{2}(\mu_i - \mu_j)^T (\Sigma_i^{-1} + \Sigma_j^{-1})(\mu_i - \mu_j) + \frac{|\Sigma_j|}{|\Sigma_i|} + \frac{|\Sigma_i|}{|\Sigma_j|}

Summary of Effects

The feature subset selected by the GDM feature selection method for improving an audio classifier of the present invention makes the most easily confused categories the most distinguishable and improves the overall classification accuracy of the classifier.

Brief Description of the Drawings

Fig. 1 is a flowchart of the GDM feature selection method for improving an audio classifier provided by an embodiment of the present invention.

Detailed Description of Embodiments

To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the invention, not to limit it.

The problem addressed by the GDM feature selection method for improving an audio classifier of the present invention is formulated as follows:

Suppose there are C audio types and F features, from which L features are to be selected. First, for each feature f, train a Gaussian mixture model GMM_fc for every category c ∈ [1, C]. The probability density function of the Gaussian mixture distribution of class c is:

p(X \mid \Theta_c) = \sum_{i=1}^{K} \pi_i \, p(X \mid \theta_i)

where K is the number of mixture components; \Theta_c = (\pi_1, \ldots, \pi_K, \theta_1, \ldots, \theta_K) denotes the model parameters; \pi_i is the weight of the i-th mixture component and satisfies the constraint \sum_{i=1}^{K} \pi_i = 1; and \theta_i = \{\mu_i, \Sigma_i\} denotes the parameters of the i-th mixture component;

p(X \mid \theta_i) is an individual Gaussian component, expressed as:

p(X \mid \theta_i) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\!\left(-\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i)\right)

where \mu_i is the D-dimensional mean vector of the Gaussian component;

\Sigma_i is the D × D covariance matrix. It may take several forms: a full matrix, a block-diagonal matrix, or a diagonal matrix. For computational convenience, the feature dimensions are usually assumed to be mutually independent, i.e. \Sigma_i is a diagonal matrix.
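For concreteness, the mixture density defined above can be evaluated as in the following sketch. This is a minimal NumPy illustration under the diagonal-covariance assumption; the array layout (`weights`, `means`, `variances`) is a hypothetical interface, not part of the patent.

```python
import numpy as np

def gmm_pdf(X, weights, means, variances):
    """Evaluate p(X | Theta_c) = sum_i pi_i * N(X; mu_i, Sigma_i)
    for a diagonal-covariance Gaussian mixture.

    X         : (D,)   sample vector
    weights   : (K,)   mixture weights pi_i, summing to 1
    means     : (K, D) component means mu_i
    variances : (K, D) diagonals of the covariance matrices Sigma_i
    """
    D = X.shape[0]
    # log of the normalizing constant of each component, (2*pi)^(D/2) |Sigma_i|^(1/2)
    log_norm = -0.5 * (D * np.log(2.0 * np.pi) + np.sum(np.log(variances), axis=1))
    # quadratic form (X - mu_i)^T Sigma_i^{-1} (X - mu_i) for diagonal Sigma_i
    log_quad = -0.5 * np.sum((X - means) ** 2 / variances, axis=1)
    components = np.exp(log_norm + log_quad)        # p(X | theta_i), shape (K,)
    return float(np.dot(weights, components))       # sum_i pi_i * p(X | theta_i)

# Example: a two-component mixture over D = 2 dimensions
X = np.array([0.2, -1.0])
weights = np.array([0.6, 0.4])
means = np.array([[0.0, 0.0], [1.0, -1.0]])
variances = np.array([[1.0, 1.0], [0.5, 0.5]])
print(gmm_pdf(X, weights, means, variances))
```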

The separability between two categories is defined as:

S_f(\mathrm{GMM}_{fk}, \mathrm{GMM}_{fl}) = \mathrm{dis}(\mathrm{GMM}_{fk}, \mathrm{GMM}_{fl})

where dis(·,·) denotes the distance between two Gaussian mixture models, for which many metrics are available. As shown in Table 1, A and B denote two Gaussian mixture models and d(i, j) denotes the distance between two Gaussian components; common choices include the Euclidean distance, the Mahalanobis distance, the K-L distance, and the Bhattacharyya distance. In the inter-class divergence distance formula between Gaussian mixture models, dis(A, B) may be any of these four distances.

Table 1 Commonly used distance metrics

Of the metrics in the table above, the K-L distance is asymmetric. For ease of computation, this embodiment uses the improved symmetric K-L2 distance, computed as follows:

d(i, j) = \frac{1}{2}(\mu_i - \mu_j)^T (\Sigma_i^{-1} + \Sigma_j^{-1})(\mu_i - \mu_j) + \frac{|\Sigma_j|}{|\Sigma_i|} + \frac{|\Sigma_i|}{|\Sigma_j|}
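As a concrete reading of the K-L2 formula, the sketch below computes d(i, j) for diagonal-covariance components and then a separability score between two GMMs. Because the inter-class divergence formula of Table 1 is not reproduced here, the aggregation of component distances into dis(A, B) uses a weight-averaged nearest-component matching as an assumed stand-in; the dictionary interface (`weights`, `means`, `variances`) is likewise hypothetical.

```python
import numpy as np

def kl2_distance(mu_i, var_i, mu_j, var_j):
    """K-L2 distance between two diagonal-covariance Gaussian components:
       d(i,j) = 1/2 (mu_i - mu_j)^T (Sigma_i^-1 + Sigma_j^-1) (mu_i - mu_j)
                + |Sigma_j|/|Sigma_i| + |Sigma_i|/|Sigma_j|
    """
    diff = mu_i - mu_j
    quad = 0.5 * np.sum(diff * (1.0 / var_i + 1.0 / var_j) * diff)
    # determinant of a diagonal covariance = product of variances;
    # work with log-determinants to keep the ratios numerically stable
    logdet_i = np.sum(np.log(var_i))
    logdet_j = np.sum(np.log(var_j))
    return quad + np.exp(logdet_j - logdet_i) + np.exp(logdet_i - logdet_j)

def separability(gmm_a, gmm_b):
    """Separability S_f(A, B) = dis(A, B) between two GMMs.

    gmm_a, gmm_b: dicts with 'weights' (K,), 'means' (K, D), 'variances' (K, D).
    Aggregation (an assumption, standing in for Table 1): each component of A
    is matched to its nearest component of B, weighted by the mixture weights,
    and the result is symmetrized.
    """
    def directed(a, b):
        total = 0.0
        for w, mu, var in zip(a['weights'], a['means'], a['variances']):
            nearest = min(kl2_distance(mu, var, mu_b, var_b)
                          for mu_b, var_b in zip(b['means'], b['variances']))
            total += w * nearest
        return total

    return 0.5 * (directed(gmm_a, gmm_b) + directed(gmm_b, gmm_a))
```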

From the separability formula S_f(GMM_fk, GMM_fl) = dis(GMM_fk, GMM_fl), it can be seen that the farther apart two Gaussian mixture models are, the larger their separability and, correspondingly, the easier the two categories are to distinguish. Easily distinguished categories already achieve high classification accuracy, so they need little consideration during feature selection. Conversely, the closer two Gaussian mixture models are, the smaller their separability and the more easily the two categories are confused. Confusable categories have very low classification accuracy, which drags down the performance of the whole classifier. Therefore, feature selection should pick features that yield large separability for the confusable categories, so that the final classifier's accuracy improves across all categories. In other words, when selecting features, first locate the confusable categories, then choose features that make those categories easy to distinguish.
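As a hypothetical numeric illustration (with made-up separability values, not taken from the patent): suppose three categories have pairwise separabilities S(1,2) = 0.1, S(1,3) = 6, and S(2,3) = 5 under feature f_a, but S(1,2) = 2, S(1,3) = 2, and S(2,3) = 2 under feature f_b. A GMSM-style average would rank f_a (mean ≈ 3.7) above f_b (mean = 2) even though categories 1 and 2 remain almost indistinguishable under f_a. The GDM procedure instead identifies (1, 2) as the least-separated pair once f_a has been chosen and then selects the next feature that maximizes S(1, 2), which favors f_b.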

Fig. 1 shows the flow of the GDM feature selection method for improving an audio classifier of the present invention. As shown in the figure, the invention is implemented as follows; the specific steps of the GDM feature selection method for improving an audio classifier are:

S101: train the models. For each audio type c, c ∈ [1, C], train a Gaussian mixture model G_fc for every feature f;

S102: select the first feature f_1. For each feature, compute the separability between every pair of categories, and choose the first feature as the one that maximizes the average separability over all categories, i.e.:

f_1 = \arg\max_{f \in [1, F]} \sum_{i=1}^{C} \sum_{j=1}^{C} S_f(G_{fi}, G_{fj});

S103: after selecting the first feature f_1, remove f_1 from the candidate feature set and find the two classes c_1 and c_2 with the smallest separability under f_1, i.e.:

(c_1, c_2) = \arg\min_{i,j} S_{f_1}(G_{f_1 i}, G_{f_1 j});

S104: select the second feature f_2 as the feature that maximizes the separability between the two classes c_1 and c_2 found in S103, i.e.:

f_2 = \arg\max_{f} S_f(G_{f c_1}, G_{f c_2});

S105: after selecting the second feature f_2, remove f_2 from the candidate feature set and combine the selected features f_1 and f_2 into a feature vector. Iterate S103–S104, replacing the two formulas in S103 and S104 with the following, respectively:

\{c_l, c_{l+1}\} = \arg\min_{i,j} \sum_{m=1}^{l-1} S_{f_m}(i, j)

f_l = \arg\max_{f} S_f(G_{f c_l}, G_{f c_{l+1}})

where l denotes the iteration index;

Check whether the stopping condition is met: if l < L, return to S103; otherwise stop the iteration, and the selected features are f_1, f_2, …, f_L.
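The procedure of S101–S105 can be summarized as a greedy loop. The sketch below is a minimal illustration under an assumed interface: a precomputed array `S[f, i, j]` holding the separability of classes i and j under feature f (for example, filled with the `separability` function sketched earlier over the per-feature, per-class GMMs trained in S101); it is not the patent's reference implementation.

```python
import numpy as np

def gdm_select(S, L):
    """Greedy GDM feature selection.

    S : (F, C, C) array; S[f, i, j] is the separability of classes i and j
        under feature f (assumed symmetric, with zeros on the diagonal).
    L : number of features to select.
    Returns the indices of the selected features [f_1, ..., f_L].
    """
    F, C, _ = S.shape
    remaining = list(range(F))

    # S102: the first feature maximizes the average separability over all class pairs
    f1 = max(remaining, key=lambda f: S[f].sum())
    selected = [f1]
    remaining.remove(f1)

    while len(selected) < L and remaining:
        # S103 / iteration: the least-separated (most confusable) class pair,
        # measured by the separability accumulated over the features chosen so far
        acc = sum(S[f] for f in selected).astype(float)
        np.fill_diagonal(acc, np.inf)               # exclude the trivial pair i == j
        c1, c2 = np.unravel_index(np.argmin(acc), acc.shape)

        # S104 / iteration: the remaining feature that best separates that pair
        f_next = max(remaining, key=lambda f: S[f, c1, c2])
        selected.append(f_next)
        remaining.remove(f_next)

    return selected

# Example with random symmetric separabilities for F = 5 features and C = 4 classes
rng = np.random.default_rng(0)
S = rng.random((5, 4, 4))
S = 0.5 * (S + S.transpose(0, 2, 1))
for f in range(5):
    np.fill_diagonal(S[f], 0.0)
print(gdm_select(S, L=3))
```

Note that, following the reconstructed iteration formulas, the confusable pair is located using the separability accumulated over all previously selected features, while each new feature is scored on that pair alone.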

Although specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that, on the basis of the technical solution of the invention, various modifications or variations that can be made without creative effort remain within the protection scope of the invention.

Claims (2)

1. A GDM feature selection method for improving an audio classifier, characterized in that the specific steps of the GDM feature selection method for improving an audio classifier are as follows:

Step 1: train the models. For each audio type c, c ∈ [1, C], train a Gaussian mixture model G_fc for every feature f;

Step 2: select the first feature f_1. For each feature, compute the separability between every pair of categories, and choose the first feature as the one that maximizes the average separability over all categories, i.e.:

f_1 = \arg\max_{f \in [1, F]} \sum_{i=1}^{C} \sum_{j=1}^{C} S_f(G_{fi}, G_{fj});

Step 3: after selecting the first feature f_1, remove f_1 from the candidate feature set and find the two classes c_1 and c_2 with the smallest separability under f_1, i.e.:

(c_1, c_2) = \arg\min_{i,j} S_{f_1}(G_{f_1 i}, G_{f_1 j});

Step 4: select the second feature f_2 as the feature that maximizes the separability between the two classes c_1 and c_2 found in Step 3, i.e.:

f_2 = \arg\max_{f} S_f(G_{f c_1}, G_{f c_2});

Step 5: after selecting the second feature f_2, remove f_2 from the candidate feature set and combine the selected features f_1 and f_2 into a feature vector. Iterate Steps 3–4, replacing the two formulas in Steps 3 and 4 with the following, respectively:

\{c_l, c_{l+1}\} = \arg\min_{i,j} \sum_{m=1}^{l-1} S_{f_m}(i, j)

f_l = \arg\max_{f} S_f(G_{f c_l}, G_{f c_{l+1}})

where l denotes the iteration index;

check whether the stopping condition is met: if l < L, return to Step 3; otherwise stop the iteration, and the selected features are f_1, f_2, …, f_L.

2. The GDM feature selection method for improving an audio classifier according to claim 1, characterized in that the problem addressed by the method is formulated as follows:

suppose there are C audio types and F features, from which L features are to be selected; first, for each feature f, train a Gaussian mixture model GMM_fc for every category c ∈ [1, C]; the probability density function of the Gaussian mixture distribution of class c is:

p(X \mid \Theta_c) = \sum_{i=1}^{K} \pi_i \, p(X \mid \theta_i)

where K is the number of mixture components; \Theta_c = (\pi_1, \ldots, \pi_K, \theta_1, \ldots, \theta_K) denotes the model parameters; \pi_i is the weight of the i-th mixture component and satisfies the constraint \sum_{i=1}^{K} \pi_i = 1; and \theta_i = \{\mu_i, \Sigma_i\} denotes the parameters of the i-th mixture component;

p(X \mid \theta_i) is an individual Gaussian component, expressed as:

p(X \mid \theta_i) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\!\left(-\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i)\right)

where \mu_i is the D-dimensional mean vector of the Gaussian component;

\Sigma_i is the D × D covariance matrix;

the separability between two categories is defined as:

S_f(\mathrm{GMM}_{fk}, \mathrm{GMM}_{fl}) = \mathrm{dis}(\mathrm{GMM}_{fk}, \mathrm{GMM}_{fl})

where dis(·,·) denotes the distance between two Gaussian mixture models, measured by the improved symmetric K-L2 distance, computed as:

d(i, j) = \frac{1}{2}(\mu_i - \mu_j)^T (\Sigma_i^{-1} + \Sigma_j^{-1})(\mu_i - \mu_j) + \frac{|\Sigma_j|}{|\Sigma_i|} + \frac{|\Sigma_i|}{|\Sigma_j|}.
CN201410298526.2A 2014-06-27 2014-06-27 Method for improving GDM (Global Data Manager) feature selection of audio classifier Pending CN104077382A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410298526.2A CN104077382A (en) 2014-06-27 2014-06-27 Method for improving GDM (Global Data Manager) feature selection of audio classifier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410298526.2A CN104077382A (en) 2014-06-27 2014-06-27 Method for improving GDM (Global Data Manager) feature selection of audio classifier

Publications (1)

Publication Number Publication Date
CN104077382A true CN104077382A (en) 2014-10-01

Family

ID=51598636

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410298526.2A Pending CN104077382A (en) 2014-06-27 2014-06-27 Method for improving GDM (Global Data Manager) feature selection of audio classifier

Country Status (1)

Country Link
CN (1) CN104077382A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111554273A (en) * 2020-04-28 2020-08-18 华南理工大学 Method for selecting amplified corpora in voice keyword recognition

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020037083A1 (en) * 2000-07-14 2002-03-28 Weare Christopher B. System and methods for providing automatic classification of media entities according to tempo properties
US7065416B2 (en) * 2001-08-29 2006-06-20 Microsoft Corporation System and methods for providing automatic classification of media entities according to melodic movement properties
CN102129456A (en) * 2011-03-09 2011-07-20 天津大学 Method for monitoring and automatically classifying music factions based on decorrelation sparse mapping

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020037083A1 (en) * 2000-07-14 2002-03-28 Weare Christopher B. System and methods for providing automatic classification of media entities according to tempo properties
US7065416B2 (en) * 2001-08-29 2006-06-20 Microsoft Corporation System and methods for providing automatic classification of media entities according to melodic movement properties
CN102129456A (en) * 2011-03-09 2011-07-20 天津大学 Method for monitoring and automatically classifying music factions based on decorrelation sparse mapping

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王荣燕: "Research on Key Issues in Complex Audio Classification", China Doctoral Dissertations Full-text Database (Information Science and Technology) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111554273A (en) * 2020-04-28 2020-08-18 华南理工大学 Method for selecting amplified corpora in voice keyword recognition
CN111554273B (en) * 2020-04-28 2023-02-10 华南理工大学 A method for selecting augmented corpus in speech keyword recognition

Similar Documents

Publication Publication Date Title
US9378742B2 (en) Apparatus for speech recognition using multiple acoustic model and method thereof
CN104795064B (en) The recognition methods of sound event under low signal-to-noise ratio sound field scape
Srinivas et al. Learning sparse dictionaries for music and speech classification
JP6556575B2 (en) Audio processing apparatus, audio processing method, and audio processing program
Salman et al. Machine learning inspired efficient audio drone detection using acoustic features
Nguyen et al. Comparison of two main approaches for handling imbalanced data in churn prediction problem
CN103229233B (en) Modeling device and method for speaker recognition, and speaker recognition system
Zeghidour et al. A deep scattering spectrum—Deep Siamese network pipeline for unsupervised acoustic modeling
CN108932950A (en) It is a kind of based on the tag amplified sound scenery recognition methods merged with multifrequency spectrogram
CN110120218A (en) Expressway oversize vehicle recognition methods based on GMM-HMM
CN104167208A (en) Speaker recognition method and device
CN102663432A (en) Kernel fuzzy c-means speech emotion identification method combined with secondary identification of support vector machine
CN105702251B (en) Speech emotion recognition method based on Top-k enhanced audio bag-of-words model
CN112784031B (en) A classification method and system of customer service dialogue text based on small sample learning
Ntalampiras A novel holistic modeling approach for generalized sound recognition
Ben-Harush et al. Initialization of iterative-based speaker diarization systems for telephone conversations
CN104077598A (en) Emotion recognition method based on speech fuzzy clustering
Yang et al. Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection.
CN113704479A (en) Unsupervised text classification method and device, electronic equipment and storage medium
Pianese et al. Training-free deepfake voice recognition by leveraging large-scale pre-trained models
Shivakumar et al. Simplified and supervised i-vector modeling for speaker age regression
CN116186524A (en) A self-supervised machine abnormal sound detection method
Chin et al. Music emotion detection using hierarchical sparse kernel machines
CN113344031B (en) Text classification method
CN104077382A (en) Method for improving GDM (Global Data Manager) feature selection of audio classifier

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20141001

WD01 Invention patent application deemed withdrawn after publication