CN114297640B - Attack detection method, device, medium and equipment - Google Patents
- Publication number
- CN114297640B (publication) · CN202111642850.8A (application)
- Authority
- CN
- China
- Prior art keywords
- attack
- fragment
- loss value
- hidden layer
- request sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The application belongs to the technical field of network technology and security, and particularly relates to an attack detection method, an attack detection device, a computer readable medium and electronic equipment. The method comprises the following steps: acquiring a request sequence to be detected, and performing word segmentation on the request sequence to be detected to obtain the segmented words of the request sequence to be detected; encoding the segmented words to obtain the sequence codes corresponding to the segmented words; performing feature extraction on the sequence codes to obtain a first hidden layer feature; acquiring a pre-trained fragment locator, and performing feature extraction on the first hidden layer feature and the fragment locator to obtain a second hidden layer feature; and performing regression processing on the second hidden layer feature to obtain the fragment starting point position, the fragment ending point position and the fragment type of each attack fragment of the request sequence to be detected. The application can predict the fragment position and fragment type of an attack fragment and can effectively detect attack fragments hidden in a request, thereby improving the effect of preventing network attacks.
Description
Technical Field
The application belongs to the technical field of network technology and security, and particularly relates to an attack detection method, an attack detection device, a computer readable medium and electronic equipment.
Background
With the development of information technology, security incidents targeting core information systems and key information infrastructure occur frequently. Many attack techniques may be hidden in a request in fragmented form and are difficult to detect with existing methods for preventing network attacks, so that protection fails.
In this context, how to effectively detect the attack fragment hidden in the request is a technical problem that needs to be solved at present.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the application, and may therefore include information that does not constitute prior art already known to those of ordinary skill in the art.
Disclosure of Invention
The application aims to provide an attack detection method, an attack detection device, a computer readable medium and electronic equipment, which at least to a certain extent solve the technical problem of how to effectively detect attack fragments hidden in a request.
Other features and advantages of the application will be apparent from the following detailed description, or may be learned by the practice of the application.
According to an aspect of an embodiment of the present application, there is provided an attack detection method including:
acquiring a request sequence to be detected, and performing word segmentation on the request sequence to be detected to obtain the segmented words of the request sequence to be detected;
encoding the segmented words to obtain sequence codes corresponding to the segmented words;
performing feature extraction on the sequence codes to obtain a first hidden layer feature;
obtaining a pre-trained fragment locator, wherein the fragment locator is used for locating a preset number of fragments with preset characteristics in the request sequence to be detected;
performing feature extraction on the first hidden layer feature and the fragment locator to obtain a second hidden layer feature;
and performing regression processing on the second hidden layer feature to obtain a fragment starting point position, a fragment ending point position and a fragment type of an attack fragment of the request sequence to be detected, wherein the fragment type is used for representing the attack type of the fragment.
According to an aspect of an embodiment of the present application, there is provided an attack detection device including:
the word segmentation module is configured to acquire a request sequence to be detected, and perform word segmentation on the request sequence to be detected to obtain the segmented words of the request sequence to be detected;
the encoding module is configured to encode the segmented words to obtain the sequence codes corresponding to the segmented words;
The first feature extraction module is configured to perform feature extraction on the sequence codes to obtain first hidden layer features;
the locator acquisition module is configured to acquire a pre-trained segment locator, and the segment locator is used for locating a preset number of segments with preset characteristics in the request sequence to be detected;
the second feature extraction module is configured to extract features of the first hidden layer feature and the fragment locator to obtain a second hidden layer feature;
the regression module is configured to carry out regression processing on the second hidden layer feature to obtain a fragment starting point position, a fragment ending point position and a fragment type of the attack fragment of the request sequence to be detected, wherein the fragment type is used for representing the attack type of the fragment.
In some embodiments of the present application, based on the above technical solution, the first feature extraction module includes:
a first feature extraction unit configured to input the sequence code into a code feature extractor of a pre-trained machine learning model, resulting in the first hidden layer feature;
the second feature extraction module includes:
and a second feature extraction unit configured to input the first hidden layer feature and the fragment locator into a decoding feature extractor of a pre-trained machine learning model to obtain a second hidden layer feature.
In some embodiments of the present application, based on the above technical solution, the attack detection device further includes:
the training unit is configured to input a training request sequence into the machine learning model, and train the machine learning model by taking the segment start position, the segment end position and the segment type of each attack segment in the training request sequence as output.
In some embodiments of the present application, based on the above technical solution, the attack detection device further includes:
a first loss value calculation unit configured to calculate a first loss value according to a position of an attack fragment output by a model and a position of an actual attack fragment in the training request sequence, where the first loss value is used to represent a difference between the position of the attack fragment output by the model and the position of the actual attack fragment in the training request sequence;
a second loss value calculation unit configured to calculate a second loss value according to a fragment type of the attack fragment output by the model and a fragment type of an actual attack fragment in the training request sequence, where the second loss value is used to represent a difference between the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence;
A comprehensive loss value calculation unit configured to perform weighted summation on the first loss value and the second loss value to obtain a comprehensive loss value;
and the training ending unit is configured to end the training of the machine learning model when the comprehensive loss value is smaller than a preset value.
In some embodiments of the present application, based on the above technical solution, the first loss value calculation unit includes:
a matching relationship obtaining subunit, configured to obtain a plurality of matching relationships between the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence when the corresponding relationship between the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence cannot be determined;
an alternative first loss value calculation subunit configured to calculate corresponding alternative first loss values according to the respective matching relationships, respectively;
a first loss value determination subunit configured to take as the first loss value a smallest value of the respective candidate first loss values.
In some embodiments of the present application, based on the above technical solution, the comprehensive loss value calculation unit includes:
An initialization subunit configured to initialize the first weight, the second weight, and the correction parameter;
a weight determination unit configured to take the first weight as a weight of the first loss value and the second weight as a weight of the second loss value;
an intermediate value obtaining subunit configured to multiply the first loss value by the first weight, multiply the second loss value by the second weight, and add a value obtained by multiplying the first loss value by the first weight to a value obtained by multiplying the second loss value by the second weight to obtain an intermediate value;
and the correction subunit is configured to correct the intermediate value by adding or multiplying the intermediate value with the correction parameter to obtain the comprehensive loss value.
In some embodiments of the present application, based on the above technical solution, the fragment locator includes a request number parameter, an attack fragment number parameter, and a vector dimension parameter, where the request number parameter, the attack fragment number parameter, and the vector dimension parameter are all optimized in a model training process, where the request number parameter is used to represent a predicted number of requests included in the request sequence to be detected, the attack fragment number parameter is used to represent a predicted number of attack fragments included in the request sequence to be detected, and the vector dimension parameter is used to represent a vector dimension of an operation vector of a machine learning model obtained through pre-training.
According to an aspect of the embodiments of the present application, there is provided a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements an attack detection method as in the above technical solution.
According to an aspect of an embodiment of the present application, there is provided an electronic apparatus including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the attack detection method as in the above technical solution via execution of the executable instructions.
According to an aspect of embodiments of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions so that the computer device performs the attack detection method as in the above technical solution.
In the technical scheme provided by the embodiments of the application, a request sequence to be detected is acquired and segmented into words; the segmented words are encoded to obtain the corresponding sequence codes; feature extraction is performed on the sequence codes to obtain a first hidden layer feature; a pre-trained fragment locator is acquired, and feature extraction is performed on the first hidden layer feature and the fragment locator to obtain a second hidden layer feature; regression processing is then performed on the second hidden layer feature to obtain the fragment starting point position, fragment ending point position and fragment type of each attack fragment of the request sequence to be detected. In this way, the fragment position and fragment type of an attack fragment can be predicted, attack fragments hidden in a request can be effectively detected, and the effect of preventing network attacks can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. It is evident that the drawings in the following description are only some embodiments of the present application and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
Fig. 1 schematically shows a block diagram of an exemplary device architecture to which the technical solution of the present application is applied.
Fig. 2 schematically illustrates a flow chart of steps of an attack detection method according to some embodiments of the present application.
FIG. 3 schematically illustrates a flow of steps that may be included in an embodiment of the present application in training a machine learning model.
Fig. 4 schematically shows a flowchart of the steps for calculating the first loss value according to the positions of the attack segments outputted by the model and the actual positions of the attack segments in the training request sequence according to an embodiment of the present application.
Fig. 5 schematically shows a flowchart of the steps for weighting and summing the first loss value and the second loss value to obtain a composite loss value in an embodiment of the application.
Fig. 6 schematically shows a specific flowchart of attack fragment detection for a request sequence to be detected in an embodiment of the present application.
Fig. 7 schematically shows a block diagram of an attack detection device according to an embodiment of the present application.
Fig. 8 schematically shows a block diagram of an electronic device for implementing an embodiment of the application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the application may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the application.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
Fig. 1 schematically shows a block diagram of an exemplary device architecture to which the technical solution of the present application is applied.
As shown in fig. 1, the apparatus architecture 100 may include a terminal device 110, a network 120, and a server 130. Terminal device 110 may include various electronic devices such as smart phones, tablet computers, notebook computers, desktop computers, and the like. The server 130 may be an independent physical server, a server cluster or a distributed device formed by a plurality of physical servers, or a cloud server providing cloud computing services. Network 120 may be a communication medium of various connection types capable of providing a communication link between terminal device 110 and server 130, and may be, for example, a wired communication link or a wireless communication link.
The apparatus architecture in the embodiments of the present application may have any number of terminal devices, networks, and servers, as desired for implementation. For example, the server 130 may be a server group composed of a plurality of server devices. In addition, the technical solution provided in the embodiment of the present application may be applied to the terminal device 110, or may be applied to the server 130, or may be implemented by the terminal device 110 and the server 130 together, which is not limited in particular.
For example, the server 130 may be provided with the attack detection method of the embodiments of the present application. The terminal device 110 interacts with the server 130 or other servers and clients through the internet to send messages, and the server 130 may implement the attack detection method as follows: acquire a request sequence to be detected and segment it into words; encode the segmented words to obtain the corresponding sequence codes; perform feature extraction on the sequence codes to obtain a first hidden layer feature; acquire a pre-trained fragment locator and perform feature extraction on the first hidden layer feature and the fragment locator to obtain a second hidden layer feature; and perform regression processing on the second hidden layer feature to obtain the fragment starting point position, fragment ending point position and fragment type of each attack fragment of the request sequence to be detected. The fragment position and fragment type of an attack fragment can thus be predicted, attack fragments hidden in a request can be effectively detected, and the effect of preventing network attacks can be improved.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligent platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
The attack detection method provided by the application is described in detail below with reference to the specific embodiments.
Fig. 2 schematically illustrates a flow chart of steps of an attack detection method according to some embodiments of the present application. The execution subject of the attack detection method may be a terminal device, a server, or the like, and the present application is not limited to this. As shown in fig. 2, the attack detection method may mainly include the following steps S210 to S260.
S210, acquiring a request sequence to be detected, and performing word segmentation on the request sequence to be detected to obtain word segmentation of the request sequence to be detected.
The sequence of requests to be detected may be a sequence of text constituting the network request to be detected. The word segmentation processing is carried out on the request sequence to be detected, so that subsequent word segmentation encoding is facilitated.
The request to be detected may include multiple attack fragments of multiple types. In WEB attacks on websites, the payload of a malicious request may contain multiple types of attacks, and related attack behaviors may be dispersed among multiple fragments in the payload.
S220, carrying out coding processing on the segmented words to obtain sequence codes corresponding to the segmented words.
The encoding of the segmented words may specifically be word-embedding encoding: a word vector is obtained for each segmented word, and the sequence formed by the word vectors of the request sequence to be detected is used as the sequence code corresponding to the segmented words.
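The segmentation-and-encoding steps (S210/S220) can be sketched as follows; the regex tokenizer, the tiny vocabulary, and the index-based encoding are illustrative assumptions rather than the patent's actual implementation.

```python
# Sketch of steps S210/S220: tokenize a request payload and encode the
# segmented words. The tokenizer and vocabulary are hypothetical.
import re

def tokenize(request):
    # Split the payload into word tokens on runs of non-alphanumeric characters.
    return [t for t in re.split(r"[^A-Za-z0-9]+", request) if t]

def encode(tokens, vocab):
    # Map each token to its vocabulary index; unknown tokens map to 0.
    return [vocab.get(t, 0) for t in tokens]

vocab = {"select": 1, "from": 2, "users": 3, "id": 4}
tokens = tokenize("GET /item?id=1 UNION select * from users")
codes = encode([t.lower() for t in tokens], vocab)
```

A real system would replace the bare indices with a learned word-embedding table, as described above.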
S230, extracting features of the sequence codes to obtain first hidden layer features.
And extracting the characteristics of the sequence codes to obtain first hidden layer characteristics, wherein the first hidden layer characteristics can be expressed by a first tensor.
On the basis of the above embodiment, the feature extraction of the sequence code in step S230 to obtain the first hidden layer feature may further include the following steps:
And inputting the sequence codes into a code feature extractor of a pre-trained machine learning model to obtain first hidden layer features.
Specifically, the pre-trained machine learning model may be a Transformer, BERT, or similar machine learning model; the present application is not limited in this respect. The encoding feature extractor may be a Transformer encoder feature extractor.
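The encoder-side feature extraction (S230) can be illustrated with a single scaled dot-product self-attention pass in pure Python; a real Transformer or BERT encoder uses learned projection matrices, multiple heads, and multiple layers, so the identity projections here are purely for illustration.

```python
# Sketch of step S230: one self-attention pass produces the "first hidden
# layer feature" from the sequence encoding. Q = K = V = x (identity
# projections) purely for illustration.
import math

def self_attention(x):
    dim = len(x[0])
    out = []
    for q in x:
        # Scaled dot-product scores against every position, then softmax.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in x]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        # Each output vector is a weighted average of all input vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, x)) for d in range(dim)])
    return out

sequence_code = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
first_hidden = self_attention(sequence_code)
```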
On the basis of the above embodiment, before the step of inputting the sequence code into the code feature extractor of the pre-trained machine learning model to obtain the first hidden layer feature, the following steps may be further included:
and inputting the training request sequence into a machine learning model, and training the machine learning model by taking the segment start position, the segment end position and the segment type of each attack segment in the training request sequence as output.
Therefore, the pre-trained machine learning model can be obtained by training the machine learning model, and the model is convenient to be used for detecting the position and the type of the attack fragment of the request to be detected.
FIG. 3 schematically illustrates a flow of steps that may be included in an embodiment of the present application in training a machine learning model. On the basis of the above embodiment, in the process of training the machine learning model, the following steps 310 to 340 may be further included:
310. Calculating a first loss value according to the positions of the attack fragments output by the model and the actual positions of the attack fragments in the training request sequence, wherein the first loss value is used for representing the difference between the positions of the attack fragments output by the model and the actual positions of the attack fragments in the training request sequence;
320. calculating a second loss value according to the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence, wherein the second loss value is used for representing the difference between the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence;
330. the first loss value and the second loss value are weighted and summed to obtain a comprehensive loss value;
340. and when the comprehensive loss value is smaller than the preset value, finishing training of the machine learning model.
The comprehensive loss value Loss can be obtained by weighted summation of the fragment position prediction loss LIoU (used to measure the difference between the N predicted sensitive fragment boundaries and the real sample boundaries) and the classification prediction loss Lcross-entropy (used to measure the difference between the predicted sensitive attack category and the real sample category). In this way both the accuracy of attack fragment localization and the accuracy of category judgment are taken into account, improving the overall accuracy of attack fragment detection and achieving a better detection effect.
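The two loss terms can be sketched as follows, with an IoU-based boundary loss and a cross-entropy classification loss; the weights w1, w2 and the exact loss formulations are illustrative assumptions, not values from the patent.

```python
# Sketch of the two loss terms: an IoU loss for fragment boundaries and
# cross-entropy for the fragment type. Weights w1, w2 are assumed values.
import math

def iou_loss(pred, true):
    # pred/true are (start, end) boundaries; loss = 1 - intersection/union.
    inter = max(0, min(pred[1], true[1]) - max(pred[0], true[0]))
    union = (pred[1] - pred[0]) + (true[1] - true[0]) - inter
    return 1.0 - inter / union if union > 0 else 1.0

def cross_entropy(probs, true_cls):
    # Negative log-likelihood of the true fragment type.
    return -math.log(probs[true_cls])

w1, w2 = 1.0, 0.5
loss = w1 * iou_loss((2, 8), (3, 8)) + w2 * cross_entropy([0.1, 0.8, 0.1], 1)
```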
Fig. 4 schematically shows a flowchart of the steps for calculating the first loss value according to the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence according to an embodiment of the present application. As shown in fig. 4, on the basis of the above embodiment, the calculation of the first loss value in step 310 may further include the following steps S410 to S430.
S410, when the corresponding relation between the positions of the attack fragments output by the model and the actual positions of the attack fragments in the training request sequence cannot be determined, obtaining a plurality of matching relations between the positions of the attack fragments output by the model and the actual positions of the attack fragments in the training request sequence;
s420, respectively calculating corresponding alternative first loss values according to each matching relation;
and S430, taking the smallest value in the alternative first loss values as the first loss value.
In this way, when the correspondence between the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence cannot be determined, the loss value of the best of the several candidate matching relations, i.e. the smallest candidate loss value, is taken as the first loss value. It can be understood that only one of the candidate matching relations is the correct one, and the correct matching relation generally yields the smallest loss value, so that good training and detection results can still be achieved in this case.
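The minimum-over-matchings rule of steps S410 to S430 can be sketched by brute-forcing all permutations (a Hungarian-style assignment would scale better for many fragments); the L1 boundary loss used here is a hypothetical stand-in for the actual first-loss computation.

```python
# Sketch of steps S410-S430: when the prediction-to-ground-truth pairing is
# unknown, evaluate every candidate matching and keep the smallest total loss.
from itertools import permutations

def boundary_loss(pred, true):
    # L1 distance between predicted and true (start, end) boundaries.
    return abs(pred[0] - true[0]) + abs(pred[1] - true[1])

def best_match_loss(preds, trues):
    candidate_losses = [
        sum(boundary_loss(preds[i], trues[j]) for i, j in enumerate(perm))
        for perm in permutations(range(len(trues)))
    ]
    return min(candidate_losses)  # smallest candidate = first loss value

predicted = [(0, 4), (10, 15)]
actual = [(11, 15), (0, 5)]
first_loss = best_match_loss(predicted, actual)
```

Here the second permutation pairs each prediction with its nearest ground-truth fragment, so the minimum is selected as the first loss value.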
Fig. 5 schematically shows a flowchart of the steps for weighting and summing the first loss value and the second loss value to obtain a comprehensive loss value in an embodiment of the application. As shown in fig. 5, on the basis of the above embodiment, the weighted summation of the first loss value and the second loss value in step 330 may further include the following steps S510 to S540.
S510, initializing a first weight, a second weight and a correction parameter;
s520, taking the first weight as the weight of the first loss value and taking the second weight as the weight of the second loss value;
s530, multiplying a first loss value by a first weight, multiplying a second loss value by a second weight, and adding a value obtained by multiplying the first loss value by the first weight and a value obtained by multiplying the second loss value by the second weight to obtain an intermediate value;
s540, correcting the intermediate value by adding or multiplying the intermediate value with the correction parameter to obtain the comprehensive loss value.
In this way, the intermediate value is corrected by adding or multiplying it with the correction parameter to obtain the comprehensive loss value, which can improve the accuracy of predicting the fragment position and fragment type of an attack fragment and thus the effect of preventing network attacks.
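Steps S510 to S540 can be sketched as a small function; all weight and correction-parameter values below are illustrative assumptions.

```python
# Sketch of steps S510-S540: weighted sum of the two losses, then a
# correction by addition or multiplication. All values are illustrative.
def comprehensive_loss(l1, l2, w1=1.0, w2=1.0, corr=0.01, mode="add"):
    intermediate = l1 * w1 + l2 * w2   # S520/S530: weighted sum
    if mode == "add":                  # S540: correct by addition...
        return intermediate + corr
    return intermediate * corr         # ...or by multiplication

total = comprehensive_loss(0.4, 0.2, w1=1.0, w2=0.5, corr=0.01)
```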
S240, acquiring a pre-trained fragment locator, wherein the fragment locator is used for locating a preset number of fragments with preset characteristics in a request sequence to be detected.
Based on the above embodiment, in some implementations, the fragment locator includes a request number parameter, an attack fragment number parameter, and a vector dimension parameter, where the request number parameter, the attack fragment number parameter, and the vector dimension parameter are all optimized in a model training process, where the request number parameter is used to represent a predicted number of requests included in a request sequence to be detected, the attack fragment number parameter is used to represent a predicted number of attack fragments included in the request sequence to be detected, and the vector dimension parameter is used to represent a vector dimension of an operation vector of a machine learning model obtained through pre-training.
S250, extracting the first hidden layer features and the segment locator features to obtain second hidden layer features.
Specifically, in a multi-layer Transformer decoder, the second hidden layer feature is obtained with the fragment locator as one input and the first hidden layer feature as the other. In a specific embodiment, both the fragment locator and the first hidden layer feature may be expressed as tensors, so that calculations can be performed between them.
On the basis of the above embodiment, the feature extraction of the first hidden layer feature and the segment locator in step S250 to obtain the second hidden layer feature may further include the following steps:
the first hidden layer feature and the fragment locator are input into a decoding feature extractor of a pre-trained machine learning model to obtain a second hidden layer feature.
In particular, the decoding feature extractor of the machine learning model may be the decoder feature extractor of a transformer. Therefore, the fragment locator can be used as one input and the first hidden layer feature as the other input to obtain the second hidden layer feature, from which the offset of the fragment position in the request sequence and the attack type of the fragment can conveniently be obtained.
S260, carrying out regression processing on the second hidden layer characteristics to obtain a fragment starting point position, a fragment ending point position and a fragment type of an attack fragment of the request sequence to be detected, wherein the fragment type is used for representing the attack type of the fragment.
Therefore, the positioning of a plurality of attack fragments and the detection of attack types of the fragments are realized by constructing a transformer encoder-decoder model with a multi-layer feature extraction function, so that the attack fragments hidden in the request can be effectively detected, and the prevention effect on network attacks can be improved.
Fig. 6 schematically shows a specific flowchart of attack fragment detection for a request sequence to be detected in an embodiment of the present application. As shown in fig. 6, the request sequence to be detected is first preprocessed: after the request payload sequence is acquired, word segmentation processing is performed on it, and Token & Position encoding and embedding are applied to the segmented words to form a vectorized sequence input, that is, the text encoding, where the tensor shape of the text encoding may be a first tensor shape P1:
P1: (batch_size, seq_length, vocab_size + pos_unit)
wherein batch_size is the number of training data samples in each batch, seq_length is the word-segmentation sequence length of the request payload to be detected, vocab_size is the vocabulary size (i.e. the one-hot vector coding length) of the word vector Token, and pos_unit is the coding length of the Position code.
Specifically, the data source of the request payload sequence may be traffic/log collection, http protocol parsing, decoding and filtering, etc. The word segmentation process may split the request sequence to be detected into characters or sub-words, which is not limited in the present application.
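The preprocessing above can be sketched as follows. This is a minimal NumPy illustration, not the implementation: the function name, the toy vocabulary, and the binary position code are all hypothetical stand-ins; the point is only the resulting tensor shape (seq_length, vocab_size + pos_unit), i.e. one sample of P1.

```python
import numpy as np

def encode_payload(tokens, vocab, max_len, pos_unit=16):
    """Hypothetical Token & Position encoding: one-hot token vector of length
    vocab_size concatenated with a pos_unit-bit binary position code, giving a
    (max_len, vocab_size + pos_unit) matrix per request payload."""
    vocab_size = len(vocab)
    out = np.zeros((max_len, vocab_size + pos_unit), dtype=np.float32)
    for pos, tok in enumerate(tokens[:max_len]):
        out[pos, vocab.get(tok, 0)] = 1.0                 # one-hot Token code
        bits = [(pos >> b) & 1 for b in range(pos_unit)]  # binary Position code
        out[pos, vocab_size:] = bits
    return out

# Toy vocabulary and a word-segmented payload (illustrative values only).
vocab = {"<unk>": 0, "select": 1, "*": 2, "from": 3, "users": 4}
enc = encode_payload(["select", "*", "from", "users"], vocab, max_len=8)
print(enc.shape)  # (8, 21) -> (seq_length, vocab_size + pos_unit)
```

Stacking batch_size such matrices yields the full first tensor shape P1.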
The text encoding is then input into the constructed multi-layer transformer encoder, and the first hidden layer features of the input request payload vectorized sequence are extracted through a self-attention mechanism; the first hidden layer feature tensor shape may be a second tensor shape P2:
P2: (batch_size, seq_length, h_encoder)
wherein batch_size is the number of training data samples in each batch, seq_length is the length of the request payload word-segmentation sequence, and h_encoder is the vector representation dimension of each word after the transformer encoder.
The first hidden layer features and the pre-trained fragment locator are then passed through the multi-layer transformer decoder to obtain the second hidden layer features. A given number N of sensitive fragments expected to be detected is set in the fragment locator (N is greater than the number of sensitive attack fragments that may occur in a single request). In the training process, the fragment locator can be randomly initialized; its parameters then change along with the optimization of the transformer, and fixed optimal parameters are finally obtained when training ends. Therefore, during online prediction of the model and the fragment locator, the model performs feature extraction on the first hidden layer features and the fragment locator in the decoder through an encoder-decoder attention mechanism to obtain the second hidden layer features. The tensor shape of the fragment locator may be a third tensor shape P3:
P3: (batch_size, N, h_encoder)
wherein batch_size is the number of training data samples in each batch, N is the number of sensitive fragments expected to be detected, and h_encoder is the vector representation dimension of each word after the transformer encoder. Since the batch_size of the second tensor shape P2 is the same as that of the third tensor shape P3, and the h_encoder of P2 is the same as that of P3, this setting of the tensor shapes ensures that the fragment locator tensor P3 can be calculated together with the tensor P2 output by the transformer encoder. Next, feature extraction is performed on the first hidden layer features and the pre-trained fragment locator by the multi-layer transformer decoder, and the tensor shape of the obtained second hidden layer features may be P4:
P4: (batch_size, N, h_decoder)
wherein batch_size is the number of training data samples in each batch, N is the number of sensitive fragments expected to be detected, and h_decoder is the vector representation dimension of each sensitive fragment after the transformer decoder.
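The encoder-decoder attention that maps the locator (shape P3) and the encoder output (shape P2) to shape P4 can be sketched per sample. Again an assumed simplification: one attention head, random stand-in weights, and no self-attention or feed-forward sublayers of the real decoder.

```python
import numpy as np

def encoder_decoder_attention(locator, memory, Wv):
    """The N fragment-locator queries attend over the first hidden layer
    features to produce the second hidden layer features."""
    scores = locator @ memory.T / np.sqrt(memory.shape[-1])  # (N, seq_length)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                       # softmax per query
    return w @ (memory @ Wv)                                 # (N, h_decoder)

rng = np.random.default_rng(1)
N, seq_length, h_encoder, h_decoder = 4, 10, 32, 32
locator = rng.standard_normal((N, h_encoder))     # one sample of shape P3
H1 = rng.standard_normal((seq_length, h_encoder)) # one sample of shape P2
Wv = rng.standard_normal((h_encoder, h_decoder))
H2 = encoder_decoder_attention(locator, H1, Wv)
print(H2.shape)  # (4, 32): (N, h_decoder), matching shape P4
```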
Finally, FFN regression is applied to the second hidden layer features to obtain, for each of the N fragments, the offset Start of the fragment start position, the offset End of the fragment end position, and the attack Type to which the fragment belongs.
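The FFN regression heads can be sketched as follows. This is a hypothetical single-layer version: the patent does not specify the head architecture, so the sigmoid position parameterization (offsets as fractions of the sequence length) and the linear type head are assumptions, with random weights standing in for trained ones.

```python
import numpy as np

def regression_heads(H2, W_pos, W_type):
    """Predict per-fragment (Start, End) offsets and attack-type logits from
    the second hidden layer features."""
    pos = 1.0 / (1.0 + np.exp(-(H2 @ W_pos)))  # (N, 2): Start, End in [0, 1]
    type_logits = H2 @ W_type                  # (N, num_types); argmax = Type
    return pos, type_logits

rng = np.random.default_rng(2)
N, h_decoder, num_types = 4, 32, 5
H2 = rng.standard_normal((N, h_decoder))       # one sample of shape P4
pos, type_logits = regression_heads(H2,
                                    rng.standard_normal((h_decoder, 2)),
                                    rng.standard_normal((h_decoder, num_types)))
print(pos.shape, type_logits.shape)  # (4, 2) (4, 5)
```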
In the training process of the model, the offsets Start of the fragment start positions, the offsets End of the fragment end positions, and the attack Types of the fragments need to be matched against the real attack fragments and the loss computed, so as to obtain the integrated loss value Loss.
The integrated loss value Loss may include a fragment-position prediction IoU loss L_IoU (used to measure the difference between the N predicted sensitive fragment boundaries and the real sample boundaries after optimal matching, i.e. the first loss value) and a classification-prediction cross-entropy loss L_cross-entropy (used to measure the difference between the predicted sensitive attack class and the true sample class, i.e. the second loss value).
The calculation formula of the integrated loss value Loss can, for example, take the following homoscedastic-uncertainty form:

Loss = L_IoU / (2σ1²) + L_cross-entropy / (2σ2²) + log σ1 + log σ2

wherein σ1 and σ2 are noise parameters that can, after initialization, be optimized in the multi-task learning process of model training, so as to balance the task-specific losses L_IoU and L_cross-entropy during training. Alternatively, in other embodiments, the first loss value and the second loss value may be combined by weighted summation with other weights to obtain the integrated loss value.
Accordingly, the integrated loss value Loss is set as a balanced loss over the two prediction tasks (fragment position prediction and fragment attack-type prediction), and the model is trained with it, so that the accuracy of the overall prediction task can be improved.
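The balance described above can be sketched numerically. This assumes the Kendall-style uncertainty weighting (weighted sum of the two task losses plus an additive log-sigma correction), which is consistent with the weighted-sum-plus-correction description in the text but is an assumed concrete form, not confirmed by the patent.

```python
import math

def integrated_loss(l_iou, l_ce, sigma1, sigma2):
    """Assumed homoscedastic-uncertainty weighting of the two task losses:
    each loss is scaled by its noise parameter, and log-sigma terms are added
    as the correction."""
    return (l_iou / (2 * sigma1 ** 2)
            + l_ce / (2 * sigma2 ** 2)
            + math.log(sigma1) + math.log(sigma2))

# With sigma1 = sigma2 = 1 the correction vanishes and each loss is halved.
loss = integrated_loss(l_iou=0.8, l_ce=0.5, sigma1=1.0, sigma2=1.0)
print(round(loss, 3))  # 0.65
```

During training, σ1 and σ2 would be optimized jointly with the model, letting the noisier task receive a smaller effective weight.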
It will be appreciated that in the related art, attack detection is regarded only as a classification task: the output merely indicates which attack type a request contains and cannot be located to a specific sensitive fragment. Alternatively, detecting attack-sensitive fragments may be treated as an extractive reading comprehension task in NLP, but that can output only one sensitive fragment for a given attack type. The attack detection method of the embodiment of the application can locate a plurality of attack fragments in a request and detect the attack type of each, thereby realizing complete information prediction of the complex attack behaviors that may exist in a single request.
It should be noted that although the steps of the methods of the present application are depicted in the accompanying drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all illustrated steps be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
The following describes an embodiment of the device of the present application. Fig. 7 schematically shows a block diagram of an attack detection device according to an embodiment of the present application. The attack detection device provided by the embodiment of the application can be used for executing the attack detection method in the embodiment of the application. As shown in fig. 7, the attack detection device 700 includes:
the word segmentation module 710 is configured to obtain a request sequence to be detected, and perform word segmentation processing on the request sequence to be detected to obtain a word segmentation of the request sequence to be detected;
the encoding module 720 is configured to encode the segmented word to obtain a sequence code corresponding to the segmented word;
a first feature extraction module 730 configured to perform feature extraction on the sequence code to obtain a first hidden layer feature;
a locator acquisition module 740 configured to acquire a pre-trained segment locator, where the segment locator is used to locate a preset number of segments with preset features in the request sequence to be detected;
a second feature extraction module 750 configured to perform feature extraction on the first hidden layer feature and the segment locator to obtain a second hidden layer feature;
the regression module 760 is configured to perform regression processing on the second hidden layer feature to obtain a segment start position, a segment end position, and a segment type of the attack segment of the request sequence to be detected, where the segment type is used to represent the attack type of the segment.
In some embodiments of the present application, based on the above embodiments, the first feature extraction module includes:
a first feature extraction unit configured to input a sequence code into a code feature extractor of a pre-trained machine learning model, resulting in a first hidden layer feature;
a second feature extraction module comprising:
and a second feature extraction unit configured to input the first hidden layer feature and the fragment locator into a decoded feature extractor of the pre-trained machine learning model to obtain a second hidden layer feature.
In some embodiments of the present application, based on the above embodiments, the attack detection device further includes:
the training unit is configured to input a training request sequence into the machine learning model, and train the machine learning model by taking the segment start position, the segment end position and the segment type of each attack segment in the training request sequence as outputs.
In some embodiments of the present application, based on the above embodiments, the attack detection device further includes:
the first loss value calculation unit is configured to calculate a first loss value according to the position of the attack fragment output by the model and the position of the actual attack fragment in the training request sequence, wherein the first loss value is used for representing the difference between the position of the attack fragment output by the model and the position of the actual attack fragment in the training request sequence;
The second loss value calculation unit is configured to calculate a second loss value according to the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence, wherein the second loss value is used for representing the difference between the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence;
the comprehensive loss value calculation unit is configured to carry out weighted summation on the first loss value and the second loss value to obtain a comprehensive loss value;
and the training ending unit is configured to end training of the machine learning model when the comprehensive loss value is smaller than a preset value.
In some embodiments of the present application, based on the above embodiments, the first loss value calculation unit includes:
the matching relation acquisition subunit is configured to obtain a plurality of matching relations between the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence when the corresponding relation between the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence cannot be determined;
an alternative first loss value calculation subunit configured to calculate corresponding alternative first loss values according to the respective matching relationships, respectively;
The first loss value determination subunit is configured to take the smallest value of the candidate first loss values as the first loss value.
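The matching subunits above can be sketched with a brute-force minimum over all candidate matchings. A practical implementation would use Hungarian (bipartite) matching rather than enumerating permutations; the 1-(IoU) pair cost is an assumption consistent with the IoU loss described earlier, and the helper names are hypothetical.

```python
from itertools import permutations

def interval_iou(a, b):
    """IoU of two 1-D fragments given as (start, end) offsets."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0

def first_loss(pred, truth):
    """Try every matching between predicted and actual fragment positions,
    compute a candidate first loss (sum of 1 - IoU over matched pairs) for
    each, and keep the minimum as the first loss value."""
    return min(sum(1.0 - interval_iou(pred[i], truth[j])
                   for i, j in enumerate(perm))
               for perm in permutations(range(len(truth))))

pred  = [(0, 10), (20, 30)]
truth = [(21, 30), (0, 10)]     # same fragments, listed in a different order
print(first_loss(pred, truth))  # small: the best matching pairs them correctly
```

Taking the minimum over matchings makes the loss invariant to the order in which the N fragments are predicted.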
In some embodiments of the present application, based on the above embodiments, the integrated loss value calculation unit includes:
an initialization subunit configured to initialize the first weight, the second weight, and the correction parameter;
a weight determination unit configured to take the first weight as a weight of the first loss value and the second weight as a weight of the second loss value;
an intermediate value obtaining subunit configured to multiply the first loss value by the first weight, multiply the second loss value by the second weight, and add a value obtained by multiplying the first loss value by the first weight to a value obtained by multiplying the second loss value by the second weight to obtain an intermediate value;
and a correction subunit configured to correct the intermediate value by adding or multiplying the correction parameter to obtain the integrated loss value.
In some embodiments of the present application, based on the above embodiments, the fragment locator includes a request number parameter, an attack fragment number parameter, and a vector dimension parameter, where the request number parameter, the attack fragment number parameter, and the vector dimension parameter are all optimized in a model training process, where the request number parameter is used to represent a predicted number of requests included in a request sequence to be detected, the attack fragment number parameter is used to represent a predicted number of attack fragments included in the request sequence to be detected, and the vector dimension parameter is used to represent a vector dimension of an operation vector of a machine learning model obtained through pre-training.
Specific details of the attack detection device provided in each embodiment of the present application have been described in the corresponding method embodiments, and are not described herein.
Fig. 8 schematically shows a block diagram of an electronic device for implementing an embodiment of the application.
It should be noted that, the electronic device 800 shown in fig. 8 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 8, the electronic apparatus 800 includes a central processing unit 801 (Central Processing Unit, CPU) which can execute various appropriate actions and processes according to a program stored in a Read-Only Memory 802 (ROM) or a program loaded from a storage section 808 into a random access Memory 803 (Random Access Memory, RAM). In the random access memory 803, various programs and data required for system operation are also stored. The central processing unit 801, the read only memory 802, and the random access memory 803 are connected to each other through a bus 804. An Input/Output interface 805 (i.e., an I/O interface) is also connected to the bus 804.
The following components are connected to the input/output interface 805: an input portion 806 including a keyboard, mouse, etc.; an output portion 807 including a Cathode Ray Tube (CRT), a liquid crystal display (Liquid Crystal Display, LCD), and the like, and a speaker, and the like; a storage section 808 including a hard disk or the like; and a communication section 809 including a network interface card such as a local area network card, modem, or the like. The communication section 809 performs communication processing via a network such as the internet. The drive 810 is also connected to the input/output interface 805 as needed. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as needed so that a computer program read out therefrom is mounted into the storage section 808 as needed.
In particular, the processes described in the various method flowcharts may be implemented as computer software programs according to embodiments of the application. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable media 811. The computer programs, when executed by the central processor 801, perform the various functions defined in the system of the present application.
It should be noted that, the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-Only Memory (ROM), an erasable programmable read-Only Memory (Erasable Programmable Read Only Memory, EPROM), flash Memory, an optical fiber, a portable compact disc read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, and includes several instructions to cause a computing device (may be a personal computer, a server, a touch terminal, or a network device, etc.) to perform the method according to the embodiments of the present application.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Claims (10)
1. An attack detection method, comprising:
acquiring a request sequence to be detected, and performing word segmentation processing on the request sequence to be detected to obtain word segmentation of the request sequence to be detected;
performing encoding processing on the segmented words to obtain sequence codes corresponding to the segmented words;
extracting features of the sequence codes to obtain first hidden layer features;
obtaining a pre-trained fragment locator, wherein the fragment locator is used for locating a preset number of fragments with preset characteristics in the request sequence to be detected;
performing feature extraction on the first hidden layer feature and the fragment locator to obtain a second hidden layer feature;
and carrying out regression processing on the second hidden layer feature to obtain a fragment starting point position, a fragment ending point position and a fragment type of the attack fragment of the request sequence to be detected, wherein the fragment type is used for representing the attack type of the fragment.
2. The attack detection method according to claim 1, wherein the feature extraction of the sequence code to obtain a first hidden layer feature includes:
inputting the sequence codes into a code feature extractor of a pre-trained machine learning model to obtain the first hidden layer features;
The feature extraction is performed on the first hidden layer feature and the fragment locator to obtain a second hidden layer feature, including:
and inputting the first hidden layer feature and the fragment locator into a decoding feature extractor of a pre-trained machine learning model to obtain a second hidden layer feature.
3. The attack detection method according to claim 2, wherein before the inputting the sequence code into the code feature extractor of the pre-trained machine learning model, the method further comprises:
and inputting a training request sequence into the machine learning model, and training the machine learning model by taking the segment starting point position, the segment ending point position and the segment type of each attack segment in the training request sequence as output.
4. The attack detection method according to claim 3, wherein in training the machine learning model, the method further comprises:
calculating a first loss value according to the positions of the attack fragments output by the model and the actual positions of the attack fragments in the training request sequence, wherein the first loss value is used for representing the difference between the positions of the attack fragments output by the model and the actual positions of the attack fragments in the training request sequence;
Calculating a second loss value according to the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence, wherein the second loss value is used for representing the difference between the fragment type of the attack fragment output by the model and the fragment type of the actual attack fragment in the training request sequence;
the first loss value and the second loss value are weighted and summed to obtain a comprehensive loss value;
and when the comprehensive loss value is smaller than a preset value, finishing training of the machine learning model.
5. The attack detection method according to claim 4, wherein calculating the first loss value according to the position of the attack segment outputted by the model and the position of the actual attack segment in the training request sequence includes:
when the corresponding relation between the positions of the attack fragments output according to the model and the positions of the actual attack fragments in the training request sequence cannot be determined, obtaining a plurality of matching relations between the positions of the attack fragments output by the model and the positions of the actual attack fragments in the training request sequence;
respectively calculating corresponding alternative first loss values according to each matching relation;
And taking the smallest value in each alternative first loss value as the first loss value.
6. The attack detection method according to claim 4, wherein in training the machine learning model, the weighted summation of the first loss value and the second loss value results in a composite loss value, comprising:
initializing a first weight, a second weight and a correction parameter;
taking the first weight as the weight of the first loss value and the second weight as the weight of the second loss value;
multiplying the first loss value by the first weight, multiplying the second loss value by the second weight, and adding a value obtained by multiplying the first loss value by the first weight and a value obtained by multiplying the second loss value by the second weight to obtain an intermediate value;
and correcting the intermediate value by adding or multiplying the intermediate value with the correction parameter to obtain the comprehensive loss value.
7. The attack detection method according to claim 3, wherein the fragment locator includes a request number parameter, an attack fragment number parameter, and a vector dimension parameter, wherein the request number parameter, the attack fragment number parameter, and the vector dimension parameter are all optimized in a model training process, wherein the request number parameter is used for representing a predicted number of requests included in the request sequence to be detected, the attack fragment number parameter is used for representing a predicted number of attack fragments included in the request sequence to be detected, and the vector dimension parameter is used for representing a vector dimension of an operation vector of a machine learning model obtained through pre-training.
8. An attack detection apparatus, comprising:
the word segmentation module is configured to acquire a request sequence to be detected, and segment the request sequence to be detected to obtain the word segmentation of the request sequence to be detected;
the coding module is configured to code the word segmentation to obtain a sequence code corresponding to the word segmentation;
the first feature extraction module is configured to perform feature extraction on the sequence codes to obtain first hidden layer features;
the locator acquisition module is configured to acquire a pre-trained segment locator, and the segment locator is used for locating a preset number of segments with preset characteristics in the request sequence to be detected;
the second feature extraction module is configured to extract features of the first hidden layer feature and the fragment locator to obtain a second hidden layer feature;
the regression module is configured to carry out regression processing on the second hidden layer feature to obtain a fragment starting point position, a fragment ending point position and a fragment type of the attack fragment of the request sequence to be detected, wherein the fragment type is used for representing the attack type of the fragment.
9. A computer readable medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, implements the method of any of claims 1 to 7.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any one of claims 1 to 7 via execution of the executable instructions.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111642850.8A CN114297640B (en) | 2021-12-29 | 2021-12-29 | Attack detection method, device, medium and equipment |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114297640A CN114297640A (en) | 2022-04-08 |
| CN114297640B true CN114297640B (en) | 2023-10-27 |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8949990B1 (en) * | 2007-12-21 | 2015-02-03 | Trend Micro Inc. | Script-based XSS vulnerability detection |
| CN109829299A (en) * | 2018-11-29 | 2019-05-31 | 电子科技大学 | A kind of unknown attack recognition methods based on depth self-encoding encoder |
| CN113141360A (en) * | 2021-04-21 | 2021-07-20 | 建信金融科技有限责任公司 | Method and device for detecting network malicious attack |
| CN113138913A (en) * | 2020-01-17 | 2021-07-20 | 深信服科技股份有限公司 | Java code injection detection method, device, equipment and storage medium |
| CN113315789A (en) * | 2021-07-29 | 2021-08-27 | 中南大学 | Web attack detection method and system based on multi-level combined network |
Non-Patent Citations (1)
| Title |
|---|
| Research on Automatic Extraction Technology of Attack Signatures Based on Multiple Sequence Alignment; Tang Yong et al.; Chinese Journal of Computers; Vol. 29, No. 09; pp. 1533-1540 * |
Similar Documents
| Publication | Title |
|---|---|
| CN109376234B (en) | Method and device for training abstract generation model |
| CN110705301B (en) | Entity relationship extraction method and device, storage medium and electronic equipment |
| CN109743311B (en) | WebShell detection method, device and storage medium |
| CN110188158B (en) | Keyword and topic label generation method, device, medium and electronic equipment |
| CN113343235B (en) | Application layer malicious effective load detection method, system, device and medium based on Transformer |
| CN114328814B (en) | Text summarization model training method, device, electronic device and storage medium |
| CN113362811A (en) | Model training method, speech recognition method, device, medium and equipment |
| CN111612635B (en) | Method and device for determining financial security risk of user based on relationship graph, and electronic equipment |
| CN113779277A (en) | Method and device for generating text |
| CN112417886B (en) | Method, device, computer equipment and storage medium for extracting intention entity information |
| CN115700548A (en) | Method, apparatus and computer program product for user behavior prediction |
| CN111368551A (en) | Method and device for determining event subject |
| CN111651985A (en) | Method and device for Chinese word segmentation |
| CN117953339A (en) | Visual model training and image processing method, device, equipment and storage medium |
| CN115309888A (en) | Method and device for generating chart abstract and method and device for training generated model |
| CN114297640B (en) | Attack detection method, device, medium and equipment |
| CN116127925B (en) | Text data enhancement method and device based on destruction processing of text |
| CN115238676B (en) | Method and device for identifying bidding requirement hot spot, storage medium and electronic equipment |
| CN115344869B (en) | Risk determination method, device, storage medium and electronic device |
| CN117972411A (en) | Data enhancement method, device, electronic equipment and storage medium |
| CN115204150B (en) | Information verification method, device, electronic device and computer readable medium |
| CN110674497A (en) | Malicious program similarity calculation method and device |
| CN113268997B (en) | Text translation method, device, computer equipment and storage medium |
| CN115860003A (en) | Semantic role analysis method and device, electronic equipment and storage medium |
| CN110209878B (en) | Video processing method and device, computer readable medium and electronic equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20220408; Assignee: Tianyiyun Technology Co.,Ltd.; Assignor: CHINA TELECOM Corp.,Ltd.; Contract record no.: X2024110000040; Denomination of invention: Attack detection methods, devices, media, and equipment; Granted publication date: 20231027; License type: Common License; Record date: 20240914 |