Data enhancement method and system based on radio signal fusion
Technical Field
The invention relates to a data enhancement method, mainly applied to data enhancement for deep learning models, and in particular to a data enhancement method and system based on radio signal fusion.
Background
With the development of neural networks, deep learning models have progressed from simple fully connected layers to ultra-large-scale networks, but these networks usually require a large amount of training data to avoid overfitting. Many application scenarios cannot supply enough data to support such a large model. Data enhancement techniques were developed in response: they expand a data set by generating additional, equivalent samples from limited data, improving the size and quality of the training set so that a more accurate and robust deep learning model can be built.
In a standard wireless communication scenario, modulation information is shared between the transmitter and the receiver. In certain scenarios, such as the military field, the receiver does not know which modulation mode the transmitter uses and can only demodulate the signal blindly, relying on an automatic modulation classification scheme. Radio signal identification based on deep learning models has therefore been studied increasingly in recent years and has shown good classification performance. A well-performing model usually requires a large number of training samples, yet collecting many high-quality, reliable radio samples is sometimes difficult. How to rapidly enhance a data set starting from existing labeled data is the main motivation of the invention.
At present, the patent with application number 201910694936.1 discloses a data enhancement method in the signal field: a small-sample radio signal enhanced identification method based on ACGAN. That method classifies radio signals with a Long Short-Term Memory (LSTM) network, uses an Auxiliary Classifier Generative Adversarial Network (ACGAN) to expand the small-sample data set, and thereby strengthens the identification model to complete small-sample radio signal enhanced identification. Although that technique addresses radio signal data enhancement, the training of ACGAN is unstable and requires considerable training skill. The present invention instead provides an efficient and feasible data enhancement method based on radio signal fusion.
Disclosure of Invention
To address the limited data volume in the prior art and to further improve the accuracy and robustness of classification models, the invention provides a data enhancement method based on radio signal fusion. Experiments verify the effectiveness and generality of the proposed method: it can effectively expand data sets of different sizes, thereby improving model accuracy and generalization capability.
The technical concept of the invention is as follows: the method is suited to time-series data; it effectively expands the data set through the data fusion method described here, reduces the cost of collecting training samples, and thus provides more sufficient training samples for the neural network model. The experimental results demonstrate the feasibility and effectiveness of the proposed method.
The technical scheme adopted by the invention to achieve this aim is as follows:
A data enhancement method based on radio signal fusion, characterized in that the method comprises the following steps:
S1: preprocessing the data set and pre-training a deep learning model;
S2: applying the data enhancement method based on radio signal fusion;
S3: generating and screening an amplified data set using the data enhancement method;
S4: testing the classification accuracy of the model before and after enhancement.
Preferably, the step S1 specifically includes:
S1.1: radio signals are used as the data set $V$; the data set is first normalized, and the signals of different classes are then divided into respective sub data sets $\{v_1, v_2, \dots, v_N\}$, where $N$ is the number of classes in the data set;
S1.2: a classification model is pre-trained on the data set $V$. The classification model is a ResNet comprising a residual module (a 1 × 1 convolution, two residual blocks and a max pooling layer) followed by two fully connected layers, with SELU and softmax as the activation functions of the fully connected layers, so as to pre-classify the signal data set.
Preferably, the step S2 specifically includes:
For any two signals $s$, $s'$ in a sub data set $v_k$ ($k = 1, 2, \dots, N$), where $s = (s_1, s_2, s_3, \dots, s_L)$ is a signal sequence of length $L$, an arbitrary sample point $s_p$ ($p \in \{1, 2, \dots, L\}$) is selected as the breakpoint. The subsequences following the breakpoint, $(s_{p+1}, \dots, s_{L-1}, s_L)$ and $(s'_{p+1}, \dots, s'_{L-1}, s'_L)$, are exchanged with each other while the sequences before the breakpoint are kept unchanged. The resulting signals are $(s_1, s_2, \dots, s_p, s'_{p+1}, \dots, s'_L)$ and $(s'_1, s'_2, \dots, s'_p, s_{p+1}, \dots, s_L)$, respectively.
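As a minimal illustrative sketch of the breakpoint exchange described in step S2, the following Python function swaps the tails of two equal-length sequences. The function name `fuse_signals` and the list-based signal representation are assumptions made for illustration; each element may equally be an (I, Q) pair, since only slicing along the time axis is used.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def fuse_signals(s: Sequence[T], s_prime: Sequence[T], p: int) -> Tuple[List[T], List[T]]:
    """Exchange the samples after breakpoint p (1-based) between two signals.

    Hypothetical helper: samples s_1..s_p are kept, samples s_{p+1}..s_L are swapped.
    """
    if len(s) != len(s_prime):
        raise ValueError("both signals must have the same length L")
    if not 1 <= p <= len(s):
        raise ValueError("breakpoint p must satisfy 1 <= p <= L")
    # The 1-based sample s_p is the last one kept, so the 0-based split index is p.
    fused_a = list(s[:p]) + list(s_prime[p:])
    fused_b = list(s_prime[:p]) + list(s[p:])
    return fused_a, fused_b


a = [1, 2, 3, 4, 5, 6]
b = [10, 20, 30, 40, 50, 60]
new_a, new_b = fuse_signals(a, b, p=3)
print(new_a)  # [1, 2, 3, 40, 50, 60]
print(new_b)  # [10, 20, 30, 4, 5, 6]
```

Note that with $p = L$ nothing follows the breakpoint, so the two outputs are simply copies of the inputs; a practical implementation may prefer to restrict $p$ to $\{1, \dots, L-1\}$.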
Preferably, the step S3 specifically includes:
S3.1: for each sub data set $v_k$, the similarity $S$ is taken as the measurement index, and two similar samples $s$, $s'$ are selected for the signal fusion operation, so that samples with large differences do not produce interference terms that affect the stability of the model. The similarity $S$ is defined as follows:
where $s_i$ is the $i$-th sample point and $L$ is the length of the signal sequence;
s3.2: and (4) amplifying the data set, namely selecting a proper threshold value Y to judge whether two samples are subjected to signal fusion, randomly selecting two radio signal samples in each subdata set, performing the operation and storing the samples if the similarity S calculated by the two samples is more than or equal to Y, and abandoning the exchange if the S is less than Y. Iterating the process on each sub data set for multiple times until m new samples are generated; assuming that there are M samples per class of the original dataset, the synthesized new samples are equivalent to enhancing the dataset by M/M times, and then all M × N new samples are merged into V, and the resulting enhanced dataset DV is (N × M, L).
Preferably, the step S4 specifically includes:
The amplified data set DV is fed into the neural network for retraining, the model is tested on the same test data set, and the accuracy before and after enhancement is compared to verify the effectiveness of the method.
The system for implementing the data enhancement method based on radio signal fusion comprises a model pre-training module, a signal fusion module, a data screening module and a model testing module;
the model pre-training module normalizes the data set and then feeds it into a deep learning model for training;
the signal fusion module implements the signal-oriented data enhancement method, specifically: selecting signal samples of the same category, setting a breakpoint, exchanging the signal sequences after the breakpoint and keeping the sequences before the breakpoint unchanged, so that every two signal samples yield two new enhanced samples;
the data screening module selects the samples to be fused using a similarity function, specifically: judging against a threshold whether two samples are suitable for data enhancement, fusing them to amplify the original data set when the condition is met, and iterating until an enhanced data set of suitable size has been generated for training the deep learning model;
the model testing module tests the generalization ability of the deep learning model before and after enhancement, specifically: after the enhanced data set is generated, it is fed into the deep learning model for training, the accuracy of the model is then tested and compared with the test accuracy of the original model, and the effectiveness of the method is verified;
the model pre-training module, the signal fusion module, the data screening module and the model testing module are connected in sequence.
The invention has the beneficial effects that:
(1) A data enhancement method based on radio signal fusion is provided; it is simple to operate and greatly expands the scale of a signal data set;
(2) After a large number of samples are generated, the invention can effectively help to further improve model accuracy. The method has an enhancement effect on different radio signal data sets, and a notable effect on small radio signal data sets.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of data fusion in accordance with the present invention.
FIG. 3 is a structural diagram of the neural network model used in the present invention.
FIG. 4 is a schematic diagram of the system architecture of the present invention.
Detailed Description
The following detailed description of embodiments of the invention is provided in connection with the accompanying drawings.
Referring to FIG. 1 to FIG. 4, a data enhancement method and system based on radio signal fusion is described. The invention uses the public modulation signal data set RML2016.10a published by Bradley University, which contains 11 modulation types; for each modulation type the signal-to-noise ratio takes even values in the range [-20, 18] dB. Each radio signal sample has size 128 × 2. The training set contains 176000 samples and the test set contains 44000 samples. To test the effectiveness of the enhancement method, the method uses part of the 18 dB data in the data set (880 training samples) and trains a modified ResNet classification model. The method comprises the following steps:
S1: data set preprocessing and deep learning model pre-training, specifically:
S1.1: radio signals are used as the data set $V$; the data set is first normalized, and the signals of different classes are then divided into respective sub data sets $\{v_1, v_2, \dots, v_N\}$, where $N$ is the number of classes in the data set.
S1.2: a classification model is pre-trained on the data set $V$. The classification model is a ResNet comprising a residual module (a 1 × 1 convolution, two residual blocks and a max pooling layer) followed by two fully connected layers, with SELU and softmax as the activation functions of the fully connected layers, so as to pre-classify the signal data set.
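The preprocessing of step S1.1 can be sketched as below. The normalization scheme is not specified in the description, so per-sample scaling by the peak absolute value is assumed here for illustration only; the helper names `normalize` and `partition_by_class` and the example labels are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def normalize(signal: Sequence[float]) -> List[float]:
    """Scale one signal so its largest absolute value is 1 (assumed normalization scheme)."""
    peak = max(abs(x) for x in signal)
    if peak == 0:
        return [float(x) for x in signal]  # all-zero signal: nothing to scale
    return [x / peak for x in signal]


def partition_by_class(signals: Sequence[Sequence[float]],
                       labels: Sequence[Hashable]) -> Dict[Hashable, List[List[float]]]:
    """Normalize every signal and group the results into sub data sets v_1..v_N by label."""
    subsets: Dict[Hashable, List[List[float]]] = defaultdict(list)
    for signal, label in zip(signals, labels):
        subsets[label].append(normalize(signal))
    return dict(subsets)


signals = [[2.0, -4.0], [1.0, 0.5], [3.0, 3.0]]
labels = ["BPSK", "QPSK", "BPSK"]  # illustrative class labels
subsets = partition_by_class(signals, labels)
print(subsets["BPSK"])  # [[0.5, -1.0], [1.0, 1.0]]
print(subsets["QPSK"])  # [[1.0, 0.5]]
```

Grouping by class before any fusion guarantees that later steps only combine signals that share a label, so every generated sample inherits an unambiguous class.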
S2: the data enhancement method based on radio signal fusion, shown in FIG. 2, specifically:
For any two signals $s$, $s'$ in a sub data set $v_k$ ($k = 1, 2, \dots, N$), where $s = (s_1, s_2, s_3, \dots, s_L)$ is a signal sequence of length $L$, an arbitrary sample point $s_p$ ($p \in \{1, 2, \dots, L\}$) is selected as the breakpoint. The subsequences following the breakpoint, $(s_{p+1}, \dots, s_{L-1}, s_L)$ and $(s'_{p+1}, \dots, s'_{L-1}, s'_L)$, are exchanged with each other while the sequences before the breakpoint are kept unchanged. The resulting signals are $(s_1, s_2, \dots, s_p, s'_{p+1}, \dots, s'_L)$ and $(s'_1, s'_2, \dots, s'_p, s_{p+1}, \dots, s_L)$, respectively.
S3: generating and screening an amplified data set using the data enhancement method, specifically:
S3.1: for each sub data set $v_k$, the similarity $S$ is taken as the measurement index, and two similar samples $s$, $s'$ are selected for the signal fusion operation, so that samples with large differences do not produce interference terms that affect the stability of the model. The similarity $S$ is defined as follows:
where $s_i$ is the $i$-th sample point and $L$ is the length of the signal sequence.
S3.2: amplifying the data set. An appropriate threshold $Y$ is selected to decide whether two samples undergo signal fusion. Two radio signal samples are randomly selected from each sub data set; if their similarity satisfies $S \ge Y$, the fusion operation is performed and the resulting samples are stored, and if $S < Y$ the exchange is abandoned. This process is iterated on each sub data set until $m$ new samples are generated. Assuming the original data set has $M$ samples per class, the synthesized samples amount to enhancing the data set by a factor of $m/M$. All $m \times N$ new samples are then merged into $V$, giving an enhanced data set $DV$ of size $(N \times (M + m), L)$.
S4: testing the classification accuracy of the model before and after enhancement, specifically:
$Y = 0.3$ is selected as the threshold for deciding whether two samples undergo data fusion. Experiments with different exchange nodes then showed that the method works best when the midpoint of the sequence is chosen as the exchange node. For a small data set, therefore, the whole data set is usually traversed pairwise with the midpoint as the exchange node, so that all data are fused pairwise and the maximum enhancement factor is reached. The amplified data set DV is fed into the neural network for retraining, and the model is tested on the same test data set. The model used is shown in FIG. 3; it is built from residual blocks, which allow a deeper model while effectively mitigating vanishing gradients and performance degradation. Finally, the accuracy before and after enhancement is compared to verify the effectiveness of the method.
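The exhaustive pairwise traversal with a midpoint exchange node can be sketched as follows. The sketch omits the similarity screening in order to show the upper bound on the number of generated samples; the function name `fuse_all_pairs_at_midpoint` and the toy data are assumptions for illustration.

```python
from itertools import combinations
from typing import List, Sequence


def fuse_all_pairs_at_midpoint(subset: Sequence[Sequence[float]]) -> List[List[float]]:
    """Fuse every unordered pair of one class at the sequence midpoint (no screening applied)."""
    length = len(subset[0])
    p = length // 2  # midpoint breakpoint, e.g. p = 64 for signals of length 128
    fused: List[List[float]] = []
    for s, s_prime in combinations(subset, 2):
        fused.append(list(s[:p]) + list(s_prime[p:]))
        fused.append(list(s_prime[:p]) + list(s[p:]))
    return fused


# Toy class with M = 5 signals of length 8; signal k is constant with value k.
subset = [[float(k)] * 8 for k in range(5)]
fused = fuse_all_pairs_at_midpoint(subset)
print(len(fused))  # 20, i.e. M * (M - 1)
```

A class with $M$ samples has $M(M-1)/2$ unordered pairs and each pair yields two fused signals, so at most $M(M-1)$ new samples per class can be produced this way. If, for example, the 880 training samples were split evenly over the 11 classes (an assumption; the split is not stated), each class would hold 80 samples and could yield up to $80 \times 79 = 6320$ fused samples before screening.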
Table 1 Radio signal data enhancement results
Table 1 shows the test accuracy of the method on a small-sample radio signal data set, where 550 training samples, 2200 test samples and the ResNet model were used, and the mean over several sets of experiments is taken as the final result. The results show that the method greatly improves the accuracy and generalization capability of the deep learning model.
The system of the invention for implementing the data enhancement method based on radio signal fusion, as shown in FIG. 4, comprises a model pre-training module, a signal fusion module, a data screening module and a model testing module.
The model pre-training module normalizes the data set and then feeds it into a deep learning model for training;
the signal fusion module implements the signal-oriented data enhancement method, specifically: selecting signal samples of the same category, setting a breakpoint, exchanging the signal sequences after the breakpoint and keeping the sequences before the breakpoint unchanged, so that every two signal samples yield two new enhanced samples;
the data screening module selects the samples to be fused using a similarity function, specifically: judging against a threshold whether two samples are suitable for data enhancement, fusing them to amplify the original data set when the condition is met, and iterating until an enhanced data set of suitable size has been generated for training the deep learning model;
the model testing module tests the generalization ability of the deep learning model before and after enhancement, specifically: after the enhanced data set is generated, it is fed into the deep learning model for training, the accuracy of the model is then tested and compared with the test accuracy of the original model, and the effectiveness of the method is verified;
the model pre-training module, the signal fusion module, the data screening module and the model testing module are connected in sequence.
The method generates a large number of enhanced samples by interchanging segments of radio signals and then trains the deep learning model on them together with the original data set, thereby improving the generalization capability of the model.
The embodiments described in this specification merely illustrate implementations of the inventive concept. The scope of protection of the present invention should not be considered limited to the specific forms set forth in the embodiments; it also extends to equivalent technical means that those skilled in the art may conceive on the basis of the inventive concept.