Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a network security dynamic early warning method and system based on multi-source data fusion, which are used for solving the problems in the background art.
To achieve the above purpose, the invention provides the following technical solution: a network security dynamic early warning method based on multi-source data fusion, comprising the following steps:
Step one, data collection: obtaining key indexes of network protection quality by collecting system configuration, security vulnerability information and security event records; acquiring historical attack records, network monitoring data and user behavior time-series data from security logs and an intrusion monitoring system; and identifying abnormal behavior parameters that are inconsistent with normal behavior patterns by analyzing user operation behaviors;
Step two, data preprocessing, namely removing repeated data, invalid data and noise data, converting data with different sources and different formats into a unified format, and carrying out normalization processing so as to facilitate subsequent analysis;
Step three, inputting the key indexes of network protection quality into a network protection quality analysis model, and outputting a network protection quality index NPQ;
Step four, constructing an attack probability prediction model based on a long short-term memory (LSTM) network, inputting the historical attack records, network monitoring data and user behavior time-series data into the attack probability prediction model, and outputting the attack probability in a future time period T;
Step five, inputting the abnormal behavior parameters into an operation behavior analysis model, and outputting an operation behavior abnormality index CY, wherein the operation behavior abnormality index reflects the degree of abnormality and the potential security risk of user behavior;
Step six, joint analysis: jointly analyzing the network protection quality index, the attack probability and the operation behavior abnormality index to obtain a network security coefficient Anx, wherein the network security coefficient reflects the overall level and potential risk of network security; performing dynamic early warning based on the network security coefficient, and triggering a network security early warning mechanism when Anx is lower than a preset security threshold.
Preferably, the time period T is dynamically adjusted according to the change of the network environment and the security condition, and the time period T represents a specific time length, and the selection of the time length depends on the requirements of an application scene, the availability of data and the prediction capability of a model.
Preferably, the operation behavior abnormality index CY is obtained by:
Setting the total number of the detected abnormal behaviors as n, and using i to represent the sequence number of the abnormal behaviors;
the severity of the ith abnormal behavior is noted as yci;
the frequency of occurrence of the ith abnormal behavior in a certain time window is recorded as pyi;
Acquiring identity information of a network visitor corresponding to the ith abnormal behavior, and searching a credit adjustment factor according to the identity credit of the network visitor, wherein the credit adjustment factor is used for amplifying or reducing the influence of the abnormal behavior;
Calculating the operation behavior abnormality index CY through the operation behavior analysis model.
Preferably, the credit adjustment factor is obtained by setting an evaluation index and a credit score initial value for identity credit, wherein the evaluation index comprises login success rate and operation compliance, distributing corresponding credit score weights for different evaluation indexes, wherein the credit score weights reflect the importance of the evaluation index in identity credit evaluation, updating the identity credit comprehensive score of a network visitor according to the evaluation index and the credit score weights, and determining the corresponding identity credit adjustment factor according to the identity credit comprehensive score.
Preferably, the network protection quality index NPQ is obtained by:
acquiring key indexes of network protection quality, wherein the key indexes at least comprise firewall efficiency, intrusion detection rate and encryption technology;
Setting the number of key indexes of network protection quality as m, using j to represent the sequence number of a key index, recording the actual measured value of the jth key index as sc-j, recording the optimum value of the jth key index as max_U, recording the worst value of the jth key index as max_N, using wj to represent the weight of the jth key index, and calculating the network protection quality index NPQ through the network protection quality analysis model, wherein Form (·) represents a linear normalization function for converting the value in brackets to the range of 0 to 1, and alpha represents an optional adjustment factor for adjusting the nonlinear influence of the index value on the NPQ.
Preferably, the construction process of the attack probability prediction model comprises the following steps:
Step S11, automatically collecting time sequence data from a plurality of data sources, wherein the time sequence data comprises a history attack record, network monitoring and user behavior, manual intervention is reduced through automatic collection, and the comprehensiveness and timeliness of the data are ensured;
Step S12, eliminating repeated, missing and error data, processing time stamps to ensure data continuity, and identifying and processing abnormal values to obtain preprocessed time sequence data;
Step S13, according to correlation analysis of the data, selecting basic features related to the predicted attack probability from the preprocessed time-series data, wherein the basic features at least comprise attack frequency, trend of specific attack types, rate of change of network traffic, proportion of abnormal user behavior and frequency of abnormal user behavior;
Step S14, data conversion and encoding, namely, converting categorical features into numerical values by encoding, and performing linear normalization on the numerical values;
Step S15, defining the prediction task as time-series prediction, selecting a long short-term memory (LSTM) network as the initial model, and tuning the hyperparameters of the LSTM network, such as learning rate, hidden layer size and number of iterations, by cross-validation;
Step S16, based on the verification set, acquiring an evaluation index of the attack probability prediction model, and adjusting parameters of the model through the evaluation index;
Step S17, output and deployment, namely deploying the trained attack probability prediction model into a production environment, receiving multi-source data input in real time, and predicting the future attack probability.
Preferably, the network protection quality index, the attack probability and the operation behavior abnormality index are jointly analyzed, and the network security coefficient Anx is calculated by formula, wherein gamma and delta represent adjustment parameters used respectively to control the sensitivity of the network security coefficient to the attack probability and to the operation behavior abnormality index; the network security coefficient is converted into a threat level, and corresponding measures are taken based on the threat level.
To achieve the above purpose, the invention further provides the following technical solution: a network security dynamic early warning system based on multi-source data fusion, comprising:
The data acquisition module, which obtains key indexes of network protection quality by collecting system configuration, security vulnerability information and security event records, acquires historical attack records, network monitoring data and user behavior time-series data from security logs and an intrusion monitoring system, and identifies abnormal behavior parameters that are inconsistent with normal behavior patterns by analyzing user operation behaviors;
The data preprocessing module is used for removing repeated data, invalid data and noise data, converting the data with different sources and different formats into a unified format, and carrying out normalization processing so as to facilitate subsequent analysis;
The network protection quality analysis module is used for inputting key indexes of network protection quality into the network protection quality analysis model and outputting network protection quality index NPQ;
The short-term attack probability prediction module, which builds an attack probability prediction model based on a long short-term memory (LSTM) network, inputs the historical attack records, network monitoring data and user behavior time-series data into the attack probability prediction model, and outputs the attack probability in a future time period T;
the user abnormal behavior analysis module is used for inputting abnormal behavior parameters into the operation behavior analysis model and outputting an operation behavior abnormal index CY, wherein the abnormal behavior parameters comprise severity degree and frequency of abnormal behaviors and identity information of network visitors corresponding to the abnormal behaviors;
And the network safety comprehensive evaluation module is used for jointly analyzing the network protection quality index, the attack probability and the operation behavior abnormality index to obtain a network safety coefficient Anx, and carrying out dynamic early warning based on the network safety coefficient.
Preferably, the process for obtaining the network security factor Anx includes:
Setting the total number of detected abnormal behaviors as n, and using i to represent the sequence number of an abnormal behavior; recording the severity of the ith abnormal behavior as yci, the value of which is obtained by evaluation based on a predefined rule or a machine learning model; recording the frequency of occurrence of the ith abnormal behavior within a certain time window as pyi; recording the credit adjustment factor of the network visitor corresponding to the ith abnormal behavior as xfi; and calculating the operation behavior abnormality index CY through the operation behavior analysis model;
Setting the number of key indexes of network protection quality as m, using j to represent the sequence number of a key index, recording the actual measured value of the jth key index as sc-j, recording the optimum value of the jth key index as max_U, recording the worst value of the jth key index as max_N, using wj to represent the weight of the jth key index, and calculating the network protection quality index NPQ through the network protection quality analysis model, wherein Form (·) represents a linear normalization function for converting the value in brackets to the range of 0 to 1, and alpha represents an optional adjustment factor for adjusting the nonlinear influence of the index value on the NPQ;
Deploying the trained attack probability prediction model to a production environment, receiving multi-source data input in real time, and outputting the predicted attack probability YP within the time period T;
Calculating the network security coefficient Anx by formula, wherein gamma and delta represent adjustment parameters used respectively to control the sensitivity of the network security coefficient to the attack probability and to the operation behavior abnormality index; converting the network security coefficient into a threat level, and taking corresponding measures based on the threat level.
The invention has the technical effects and advantages that:
(1) In the network security dynamic early warning method based on multi-source data fusion, key indexes of network protection quality are obtained by collecting system configuration, security vulnerability information and security event records, and are input into a network protection quality analysis model to output a network protection quality index NPQ, which reflects the strength and effectiveness of network protection. An attack probability prediction model is built based on a long short-term memory (LSTM) network; historical attack records, network monitoring data and user behavior time-series data are input into the model, and the attack probability in a future time period T is output. User operation behaviors are analyzed to identify abnormal behavior parameters that do not match normal behavior patterns, the abnormal behavior parameters comprising the severity and frequency of the abnormal behaviors and the identity information of the corresponding network visitors; these parameters are input into an operation behavior analysis model to output an operation behavior abnormality index CY, which reflects the degree of abnormality and the potential security risk of user behavior. The network protection quality index, the attack probability and the operation behavior abnormality index are jointly analyzed to obtain a network security coefficient Anx, which reflects the overall level and potential risk of network security. Dynamic early warning is performed based on the network security coefficient, and a network security early warning mechanism is triggered when Anx is lower than a preset security threshold, so that the network security situation is predicted more accurately and timely early warning support is provided.
(2) The network security dynamic early warning method based on multi-source data fusion dynamically adjusts the time period T of the attack probability prediction according to changes in the network environment and the security situation: in a security scenario requiring quick response, the time period T is set shorter so that the system can quickly identify and respond to potential attack threats, and when the system detects abnormal behavior or a potential threat, the time period T is shortened so that the attack probability prediction is updated more frequently and security events are responded to more quickly.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the application, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
Example 1
Referring to the flow chart of the network security dynamic early warning method in fig. 1, the invention provides a network security dynamic early warning method based on multi-source data fusion, comprising the following steps:
Step one, data collection: obtaining key indexes of network protection quality by collecting system configuration, security vulnerability information and security event records; acquiring historical attack records, network monitoring data and user behavior time-series data from security logs and an intrusion monitoring system; and identifying abnormal behavior parameters that are inconsistent with normal behavior patterns by analyzing user operation behaviors;
Step two, data preprocessing, namely removing repeated data, invalid data (such as data with wrong format and missing key fields) and noise data, converting the data with different sources and different formats into a unified format, and carrying out normalization processing so as to facilitate subsequent analysis;
Step three, inputting the key indexes of network protection quality into a network protection quality analysis model, and outputting a network protection quality index NPQ;
Step four, constructing an attack probability prediction model based on a long short-term memory (LSTM) network, inputting the historical attack records, network monitoring data and user behavior time-series data into the attack probability prediction model, and outputting the attack probability in a future time period T;
Step five, inputting the abnormal behavior parameters into an operation behavior analysis model, and outputting an operation behavior abnormality index CY, wherein the operation behavior abnormality index reflects the degree of abnormality and the potential security risk of user behavior;
Step six, joint analysis: jointly analyzing the network protection quality index, the attack probability and the operation behavior abnormality index to obtain a network security coefficient Anx, wherein the network security coefficient reflects the overall level and potential risk of network security; performing dynamic early warning based on the network security coefficient, and triggering a network security early warning mechanism when Anx is lower than a preset security threshold.
It should be further explained that, in the embodiment of the present invention, the time period T is not fixed but is dynamically adjusted according to changes in the network environment and the security situation. The time period T represents a specific length of time, such as the next 1 hour, the next 24 hours or the next week, and its selection depends on the requirements of the application scenario, the availability of data and the predictive capability of the model. For example, in a security scenario requiring quick response, T is set to a shorter time so that the system can quickly identify and respond to potential attack threats; when the system detects abnormal behavior or a potential threat, T is shortened so that the attack probability prediction is updated more frequently and security events are responded to more quickly; and in a scenario with lower real-time requirements, T is relatively longer.
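The adjustment of the time period T described above can be sketched as follows. This is only an illustrative sketch: the function name, the default durations, the halving factor and the lower bound are all assumptions introduced here and are not specified by the embodiment.

```python
def choose_prediction_period(realtime_critical, threat_detected,
                             short_hours=1.0, long_hours=24.0,
                             shrink=0.5, min_hours=0.25):
    """Return a prediction period T in hours (all defaults are hypothetical).

    - Scenarios requiring quick response start from a short period.
    - Scenarios with lower real-time requirements start from a longer period.
    - When abnormal behavior or a potential threat is detected, the period is
      shortened so that predictions are refreshed more often.
    """
    period = short_hours if realtime_critical else long_hours
    if threat_detected:
        # Shorten T, but never below the assumed minimum period.
        period = max(min_hours, period * shrink)
    return period


if __name__ == "__main__":
    print(choose_prediction_period(realtime_critical=False, threat_detected=False))
    print(choose_prediction_period(realtime_critical=False, threat_detected=True))
    print(choose_prediction_period(realtime_critical=True, threat_detected=True))
```

In practice the durations would be chosen according to the application scenario, data availability and the predictive capability of the model, as stated above.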
In the embodiment of the present invention, it is further explained that the operation behavior abnormality index CY is obtained by:
Setting the total number of the detected abnormal behaviors as n, and using i to represent the sequence number of the abnormal behaviors;
the severity of the ith abnormal behavior is recorded as yci, the value of which is obtained by evaluation based on a predefined rule or a machine learning model;
the frequency of occurrence of the ith abnormal behavior in a certain time window is recorded as pyi;
Acquiring identity information of a network visitor corresponding to the ith abnormal behavior, and searching a credit adjustment factor according to the identity credit of the network visitor, wherein the credit adjustment factor is used for amplifying or reducing the influence of the abnormal behavior;
Calculating the operation behavior abnormality index CY through the operation behavior analysis model.
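The expression used by the operation behavior analysis model is not reproduced in this text. As a minimal sketch of how the three quantities could be combined, the following assumes that CY is the mean over the n abnormal behaviors of severity × frequency × credit adjustment factor. That aggregation, the function name and the symbol xf for the credit adjustment factor as used here are assumptions for illustration, not the formula of the invention.

```python
def operation_behavior_abnormality_index(behaviors):
    """Assumed aggregation: mean of yc_i * py_i * xf_i over n abnormal behaviors.

    behaviors: list of (yc, py, xf) tuples, where
      yc = severity of the abnormal behavior,
      py = frequency of occurrence within the time window,
      xf = credit adjustment factor of the corresponding network visitor.
    """
    n = len(behaviors)
    if n == 0:
        # No abnormal behavior detected: treat the index as zero.
        return 0.0
    return sum(yc * py * xf for yc, py, xf in behaviors) / n


if __name__ == "__main__":
    sample = [(0.5, 0.4, 1.0), (0.2, 0.5, 1.5)]
    print(operation_behavior_abnormality_index(sample))
```

A factor above 1 amplifies the contribution of a behavior and a factor below 1 reduces it, which matches the role of the credit adjustment factor described above.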
The credit adjustment factor is obtained as follows: setting evaluation indexes and an initial credit score value for identity credit, wherein the evaluation indexes comprise login success rate and operation compliance; assigning corresponding credit score weights to the different evaluation indexes, wherein the credit score weights reflect the importance of each evaluation index in identity credit evaluation; updating the comprehensive identity credit score of the network visitor according to the evaluation indexes and the credit score weights; and determining the corresponding credit adjustment factor according to the comprehensive identity credit score, wherein the higher the comprehensive identity credit score, the smaller the adjustment factor (that is, the influence of the abnormal behavior is reduced), and the lower the comprehensive identity credit score, the larger the adjustment factor (that is, the influence of the abnormal behavior is amplified).
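The scoring procedure above can be illustrated with a short sketch. The update rule (a blend of the previous score with the weighted indexes), the smoothing value and the linear mapping from score to factor on the range 0.5 to 1.5 are hypothetical choices made here for illustration; the embodiment only requires that a higher score yields a smaller factor.

```python
def update_credit_score(indicators, weights, previous_score, smoothing=0.5):
    """Blend the previous composite score with the weighted evaluation indexes.

    indicators / weights: dicts keyed by evaluation index name, values in [0, 1].
    previous_score: the initial credit score on first use, else the last score.
    smoothing: assumed share of the previous score kept at each update.
    """
    weight_sum = sum(weights.values())
    weighted = sum(weights[k] * indicators[k] for k in weights) / weight_sum
    return smoothing * previous_score + (1.0 - smoothing) * weighted


def credit_adjustment_factor(score, low=0.5, high=1.5):
    """Map a composite score in [0, 1] to a factor: higher score, smaller factor."""
    score = min(max(score, 0.0), 1.0)
    return high - (high - low) * score


if __name__ == "__main__":
    score = update_credit_score(
        indicators={"login_success_rate": 0.9, "operation_compliance": 0.8},
        weights={"login_success_rate": 0.4, "operation_compliance": 0.6},
        previous_score=0.6,
    )
    print(score, credit_adjustment_factor(score))
```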
In the embodiment of the present invention, it needs to be further explained that the network protection quality index NPQ is obtained by:
acquiring key indexes of network protection quality, wherein the key indexes at least comprise firewall efficiency, intrusion detection rate and encryption technology;
Setting the number of key indexes of network protection quality as m, using j to represent the sequence number of a key index, recording the actual measured value of the jth key index as sc-j, recording the optimum value of the jth key index as max_U, recording the worst value of the jth key index as max_N, using wj to represent the weight of the jth key index, and calculating the network protection quality index NPQ through the network protection quality analysis model, wherein Form (·) represents a linear normalization function used for converting the value in brackets to the range of 0 to 1, and alpha represents an optional adjustment factor used for adjusting the nonlinear influence of the index value on the NPQ: alpha = 1 represents a linear relation, alpha > 1 emphasizes the penalty for deviation from the optimum value, and 0 < alpha < 1 mitigates the penalty for deviation from the optimum value. The adjustment factor is adjusted according to actual conditions so as to better reflect the degree of influence of different indexes on the NPQ.
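The expression of the network protection quality analysis model is not reproduced in this text. The sketch below assumes one plausible form, a weighted sum of linearly normalized index values raised to the power alpha, chosen because it reproduces the stated behavior of alpha. The function names and the exact form are assumptions, not the formula of the invention.

```python
def linear_normalize(measured, worst, best):
    """Linearly map a measured value to [0, 1], with worst -> 0 and best -> 1."""
    if best == worst:
        raise ValueError("best and worst values must differ")
    x = (measured - worst) / (best - worst)
    return min(max(x, 0.0), 1.0)


def network_protection_quality(measured, best, worst, weights, alpha=1.0):
    """Assumed form: sum_j w_j * Norm(sc_j)^alpha, with weights rescaled to sum to 1."""
    weight_sum = sum(weights)
    total = 0.0
    for sc, u, n, w in zip(measured, best, worst, weights):
        total += w * linear_normalize(sc, n, u) ** alpha
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical values for firewall efficiency, intrusion detection rate
    # and an encryption-related score.
    measured = [0.9, 0.8, 0.5]
    best = [1.0, 1.0, 1.0]
    worst = [0.0, 0.0, 0.0]
    weights = [0.5, 0.3, 0.2]
    print(network_protection_quality(measured, best, worst, weights, alpha=1.0))
    print(network_protection_quality(measured, best, worst, weights, alpha=2.0))
```

With alpha > 1 every index that falls short of its optimum contributes less than in the linear case, so the same measurements yield a lower NPQ; with 0 < alpha < 1 the opposite holds.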
In the embodiment of the present invention, it is further explained that the process of constructing the attack probability prediction model includes the following steps:
Step S11, automatically collecting time sequence data from a plurality of data sources, wherein the time sequence data comprises a history attack record, network monitoring and user behavior, manual intervention is reduced through automatic collection, and the comprehensiveness and timeliness of the data are ensured;
The historical attack records refer to records of network security attack events that have occurred in the past, including attack type, time, scope of impact and source IP address information. The network monitoring data refers to data collected through network monitoring tools, including network traffic, session information and connection attempt records, and can reflect both normal and abnormal patterns of network activity. The user behavior data refers to data on the behavior of users in the network, including login activities, file access and system operations; abnormal user behavior may be a sign of attack activity.
Step S12, eliminating repeated, missing and error data, processing time stamps to ensure data continuity, and identifying and processing abnormal values to obtain preprocessed time sequence data;
Step S13, according to correlation data analysis, selecting basic features related to predicted attack probability from the preprocessed time sequence data, wherein the basic features at least comprise attack frequency, trend of specific attack types, network flow change rate, abnormal user behavior duty ratio and abnormal user behavior frequency;
illustratively, the correlation data analysis is used to analyze the correlation between individual features and target variables (i.e., attack probabilities), and to determine which features are most helpful in predicting attack probabilities by statistical methods (e.g., correlation coefficients, chi-square test).
Step S14, data conversion and encoding, namely, converting categorical features into numerical values by encoding (such as one-hot encoding, label encoding or target encoding), and performing linear normalization on the numerical values;
Step S15, defining the prediction task as time-series prediction (such as predicting the probability of an attack occurring within the next few hours), selecting a long short-term memory (LSTM) network as the initial model, and tuning the hyperparameters of the LSTM network, such as learning rate, hidden layer size and number of iterations, by cross-validation;
It should be explained that, during training, early stopping or regularization is used to prevent overfitting, and the loss function is the cross-entropy loss between the predicted attack probability YP and the actual attack probability SP;
Step S16, based on the validation set, acquiring evaluation indexes of the attack probability prediction model, such as precision, recall, F1 score and area under the ROC curve (AUC-ROC);
Step S17, output and deployment, namely deploying the trained attack probability prediction model into a production environment, receiving multi-source data input in real time, and predicting the future attack probability.
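Three of the building blocks named in steps S14 and S15 (one-hot encoding, linear normalization and the cross-entropy loss between YP and SP) can be sketched in plain Python as follows. The helper names and the example category list are hypothetical, and the LSTM model itself is not shown; a real implementation would normally rely on a deep learning framework.

```python
import math


def one_hot(value, categories):
    """Encode a categorical value as a one-hot vector over known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max_normalize(values):
    """Linearly rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binary_cross_entropy(predicted, actual, eps=1e-12):
    """Mean cross-entropy between predicted probabilities YP and labels SP."""
    total = 0.0
    for yp, sp in zip(predicted, actual):
        yp = min(max(yp, eps), 1.0 - eps)  # avoid log(0)
        total += -(sp * math.log(yp) + (1.0 - sp) * math.log(1.0 - yp))
    return total / len(predicted)


if __name__ == "__main__":
    attack_types = ["scan", "ddos", "phishing"]  # hypothetical categories
    print(one_hot("ddos", attack_types))
    print(min_max_normalize([10.0, 20.0, 30.0]))
    print(binary_cross_entropy([0.9, 0.2], [1.0, 0.0]))
```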
In the embodiment of the present invention, the network protection quality index, the attack probability and the operation behavior abnormality index are jointly analyzed, and the network security coefficient Anx is calculated by formula, wherein gamma and delta represent adjustment parameters used respectively to control the sensitivity of the network security coefficient to the attack probability and to the operation behavior abnormality index, and are adjusted according to actual conditions so as to reflect the importance of each factor in different network environments. The larger the attack probability, the smaller the network security coefficient Anx; the larger the operation behavior abnormality index, the smaller the network security coefficient Anx; and the higher the network protection quality, the larger the network security coefficient Anx.
In the embodiment of the present invention, it is further explained that the network security coefficient is converted into threat levels, and the value range [0,1] of the network security coefficient is divided into several intervals, each interval corresponds to a threat level, for example:
when the network security coefficient belongs to [0.8,1.0], corresponding to the low threat level, indicating that the network security condition is good in the future time T;
When the network security coefficient belongs to [0.4, 0.8), corresponding to the medium threat level, indicating that the network security condition in the future time T needs attention;
When the network security coefficient belongs to [0.2, 0.4), corresponding to the high threat level, indicating that protective measures need to be taken for the network security condition in the future time T;
When the network security coefficient belongs to [0.0, 0.2), corresponding to the extremely high threat level, indicating that the network security condition in the future time T is in an emergency state and network protection resources need to be increased immediately.
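The interval division above can be expressed as a simple lookup. The function name and the returned labels are illustrative; the interval boundaries are those given in the example.

```python
def threat_level(anx):
    """Map a network security coefficient in [0, 1] to a threat level."""
    if not 0.0 <= anx <= 1.0:
        raise ValueError("network security coefficient must lie in [0, 1]")
    if anx >= 0.8:
        return "low"             # [0.8, 1.0]: condition is good
    if anx >= 0.4:
        return "medium"          # [0.4, 0.8): needs attention
    if anx >= 0.2:
        return "high"            # [0.2, 0.4): protective measures needed
    return "extremely high"      # [0.0, 0.2): emergency state


if __name__ == "__main__":
    for value in (0.95, 0.6, 0.3, 0.1):
        print(value, threat_level(value))
```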
For example, assume that the network protection quality index NPQ = 0.7 (medium protection quality), the predicted attack probability YP = 0.3 (lower attack probability) and the operation behavior abnormality index CY = 0.2 (lower operation behavior abnormality), with γ = 2 and δ = 1.5, where γ and δ are adjusted according to actual conditions. The network security coefficient Anx is then calculated by the formula. According to the threat level classification, the resulting network security coefficient corresponds to the medium threat level, and the network administrator needs to monitor the network security condition in real time.
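The joint analysis formula is not reproduced in this text, so the worked example cannot be recomputed exactly. As a sketch only, the code below assumes the hypothetical form Anx = NPQ / (1 + YP^γ + CY^δ), which satisfies the monotonic relationships stated in the embodiment (Anx decreases as YP or CY grows and increases with NPQ). With the example inputs this assumed form gives a value of about 0.59, which lies in the medium interval [0.4, 0.8); that agreement does not establish that this is the formula of the invention.

```python
def network_security_coefficient(npq, yp, cy, gamma, delta):
    """Assumed form: Anx = NPQ / (1 + YP**gamma + CY**delta).

    npq, yp and cy are expected in [0, 1]; gamma and delta are positive
    adjustment parameters controlling sensitivity to YP and CY.
    """
    for name, value in (("npq", npq), ("yp", yp), ("cy", cy)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    if gamma <= 0 or delta <= 0:
        raise ValueError("gamma and delta must be positive")
    return npq / (1.0 + yp ** gamma + cy ** delta)


if __name__ == "__main__":
    anx = network_security_coefficient(npq=0.7, yp=0.3, cy=0.2, gamma=2.0, delta=1.5)
    print(round(anx, 4))
    print("medium" if 0.4 <= anx < 0.8 else "other")
```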
Example 2
Referring to fig. 2, which is a block diagram of a network security dynamic early warning system, an embodiment of the present invention provides a network security dynamic early warning system based on multi-source data fusion, which includes:
The data acquisition module, which obtains key indexes of network protection quality by collecting system configuration, security vulnerability information and security event records, acquires historical attack records, network monitoring data and user behavior time-series data from security logs and an intrusion monitoring system, and identifies abnormal behavior parameters that are inconsistent with normal behavior patterns by analyzing user operation behaviors;
The data preprocessing module is used for removing repeated data, invalid data and noise data, converting the data with different sources and different formats into a unified format, and carrying out normalization processing so as to facilitate subsequent analysis;
The network protection quality analysis module is used for inputting key indexes of network protection quality into the network protection quality analysis model and outputting network protection quality index NPQ;
The short-term attack probability prediction module, which builds an attack probability prediction model based on a long short-term memory (LSTM) network, inputs the historical attack records, network monitoring data and user behavior time-series data into the attack probability prediction model, and outputs the attack probability in a future time period T;
the user abnormal behavior analysis module is used for inputting abnormal behavior parameters into the operation behavior analysis model and outputting an operation behavior abnormal index CY, wherein the abnormal behavior parameters comprise severity degree and frequency of abnormal behaviors and identity information of network visitors corresponding to the abnormal behaviors;
And the network safety comprehensive evaluation module is used for jointly analyzing the network protection quality index, the attack probability and the operation behavior abnormality index to obtain a network safety coefficient Anx, and carrying out dynamic early warning based on the network safety coefficient.
In the embodiment of the present invention, it should be further explained that the process of obtaining the network security factor Anx includes:
Setting the total number of detected abnormal behaviors as n, and using i to represent the sequence number of an abnormal behavior; recording the severity of the ith abnormal behavior as yci, the value of which is obtained by evaluation based on a predefined rule or a machine learning model; recording the frequency of occurrence of the ith abnormal behavior within a certain time window as pyi; recording the credit adjustment factor of the network visitor corresponding to the ith abnormal behavior as xfi; and calculating the operation behavior abnormality index CY through the operation behavior analysis model;
Setting the number of key indexes of network protection quality as m, using j to represent the sequence number of a key index, recording the actual measured value of the jth key index as sc-j, recording the optimum value of the jth key index as max_U, recording the worst value of the jth key index as max_N, using wj to represent the weight of the jth key index, and calculating the network protection quality index NPQ through the network protection quality analysis model, wherein Form (·) represents a linear normalization function for converting the value in brackets to the range of 0 to 1, and alpha represents an optional adjustment factor for adjusting the nonlinear influence of the index value on the NPQ;
Deploying the trained attack probability prediction model to a production environment, receiving multi-source data input in real time, and outputting the predicted attack probability YP within the time period T;
Calculating the network security coefficient Anx by formula, wherein gamma and delta represent adjustment parameters used respectively to control the sensitivity of the network security coefficient to the attack probability and to the operation behavior abnormality index; converting the network security coefficient into a threat level, and taking corresponding measures based on the threat level.
Finally, the foregoing description of the preferred embodiment of the invention is provided for the purpose of illustration only, and is not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.