Disclosure of Invention
In order to solve the technical problems, the disclosure provides a threat analysis method and device based on multi-level information fusion.
The disclosure provides a threat analysis method based on multi-level information fusion, comprising the following steps:
acquiring a behavior data set corresponding to a target network entity;
performing label fusion on the behavior data set and preset resource data to obtain a data comprehensive information table;
analyzing the data comprehensive information table from a plurality of dimensions respectively, according to an analytic hierarchy process, to obtain atomic characterization data results corresponding to the dimensions;
combining atomic characterization data results corresponding to different dimensions to obtain a plurality of data combination results;
performing suspicious behavior analysis on each data combination result to obtain an initial threat score value corresponding to each data combination result;
and obtaining the threat total score of the target network entity according to the initial threat score values corresponding to the data combination results.
Optionally, the acquiring the behavior data set corresponding to the target network entity includes:
acquiring network communication data, wherein the network communication data describes communication behaviors among different network entities;
grouping the network communication data based on the network entities to obtain a plurality of object combinations, wherein each object combination comprises the network entities at the two ends of a communication behavior;
extracting a behavior data set corresponding to each object combination from the network communication data;
and determining a target network entity from the object combination, and associating the target network entity with the behavior data set.
Optionally, the analyzing the data comprehensive information table from a plurality of dimensions includes:
obtaining a plurality of atomic characterization models, wherein each atomic characterization model corresponds to at least one dimension;
inputting the data comprehensive information table into each atomic characterization model respectively;
and analyzing, by each atomic characterization model, the data comprehensive information table according to the corresponding dimension.
Optionally, the performing suspicious behavior analysis on each data combination result includes:
constructing a plurality of comprehensive threat analysis models;
inputting the data combination results into the comprehensive threat analysis models one by one;
and carrying out suspicious behavior analysis on the data combination result through the comprehensive threat analysis model.
Optionally, the constructing multiple comprehensive threat analysis models includes at least one of the following:
constructing the comprehensive threat analysis model according to a preset abnormal communication behavior;
extracting indicators of compromise (IOC) from threat information, and constructing the comprehensive threat analysis model according to the IOC indicators;
constructing the comprehensive threat analysis model according to a preset key port;
and constructing the comprehensive threat analysis model according to a preset rule scene.
Optionally, the method further comprises:
generating an alarm event when the threat total score is greater than a preset score threshold.
Optionally, the method further comprises:
sorting the alarm events according to the threat total score and a preset reference factor, wherein the reference factor comprises an industry type and a project security level.
Optionally, the resource data comprises threat information, WHOIS (domain name query protocol) data, IP portrait data, knowledge base data and organization portrait data.
Optionally, the dimension comprises a peer IP, a peer domain name, communication time, an organization number, a peer geographic location attribute, peer IP port data, a peer domain name higher than a specified ranking, IP record information, a corresponding threat information label and threat information IOC aggregation.
The disclosure also provides a threat analysis apparatus based on multi-level information fusion, comprising:
a data acquisition module, configured to acquire a behavior data set corresponding to a target network entity;
a data fusion module, configured to perform label fusion on the behavior data set and preset resource data to obtain a data comprehensive information table;
a first analysis module, configured to analyze the data comprehensive information table from a plurality of dimensions respectively, according to an analytic hierarchy process, to obtain atomic characterization data results corresponding to the dimensions;
a data combination module, configured to combine atomic characterization data results corresponding to different dimensions to obtain a plurality of data combination results;
a second analysis module, configured to perform suspicious behavior analysis on each data combination result to obtain an initial threat score value corresponding to each data combination result;
and a threat determination module, configured to obtain the threat total score of the target network entity according to the initial threat score values corresponding to the plurality of data combination results.
Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has the following advantages:
In the threat analysis method and device based on multi-level information fusion provided by the embodiments of the disclosure, a behavior data set corresponding to a target network entity is acquired; label fusion is performed on the behavior data set and preset resource data to obtain a data comprehensive information table; the data comprehensive information table is analyzed from a plurality of dimensions respectively, according to an analytic hierarchy process, to obtain atomic characterization data results corresponding to the dimensions; atomic characterization data results corresponding to different dimensions are combined to obtain a plurality of data combination results; suspicious behavior analysis is performed on each data combination result to obtain the corresponding initial threat score value; and the threat total score of the target network entity is obtained according to the initial threat score values corresponding to the data combination results.
This threat analysis process has two aspects. On the one hand, the data comprehensive information table is analyzed from a plurality of dimensions, which realizes multi-dimensional data analysis. On the other hand, atomic characterization data results corresponding to different dimensions are combined and the combined results are analyzed, so that the information-fused atomic characterization data results can be analyzed comprehensively and from an overall view. This reduces the difficulty of assessment and addresses the data fusion problem. Because the atomic characterization data results are shared, the degree of reuse of common data increases, which in turn improves the efficiency and accuracy of threat analysis.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein, and it is apparent that the embodiments in the specification are only some, rather than all, of the embodiments of the present disclosure.
At present, advanced threat events are analyzed using a single data source. This approach is inefficient and yields inaccurate results, so the analysis goal cannot be achieved. The likely causes are: (1) the communication between entities shown in monitoring data is only local information, and fused analysis over a long period and from an overall view is lacking, which makes assessment difficult; and (2) common data is rarely reused, which lowers analysis efficiency. To address these problems, the embodiments of the disclosure provide a threat analysis method and device based on multi-level information fusion.
Fig. 1 is a flowchart of a threat analysis method based on multi-level information fusion, which is provided in an embodiment of the disclosure, and the method may be applicable to an analysis scenario of a threat event. The method can be executed by a threat analysis device based on multi-level information fusion, which is configured at the terminal and can be realized by software and/or hardware.
As shown in fig. 1, the threat analysis method based on multi-level information fusion provided in this embodiment may include the following steps.
Step S102, a behavior data set corresponding to the target network entity is obtained.
In this embodiment, the behavior data set is derived from network communication data, and the acquiring manner of the behavior data set may include:
First, network communication data is acquired. All of the network communication data describes communication behaviors among different network entities, in the form of the graph shown in fig. 2. The network communication data is then grouped based on the network entities to obtain a plurality of object combinations, where each object combination comprises the network entities at the two ends of a communication behavior. Specifically, taking the network entities at the two ends of a session as the basic unit, all network communication data in the network is grouped, so that all object combinations can be obtained at the session level, such as the combination of network entities A and B, the combination of network entities A and H, and the combination of network entities C and D.
All communication logs are then extracted from the network communication data for each object combination, and information such as the communication behaviors and entity attributes of the network entities at the two ends of the object combination is extracted in a multi-dimensional, fine-grained manner, yielding the behavior data set corresponding to each object combination. A target network entity is determined from the object combination and associated with the behavior data set. Either network entity in the object combination can serve as the target network entity to be analyzed; accordingly, the behavior data set of the object combination is the behavior data set of the target network entity.
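The grouping described above can be illustrated with a minimal Python sketch. The record fields (`src`, `dst`, `dport`, `bytes`) and the function names are assumptions made for illustration only; the embodiment does not prescribe a record format. The sketch treats an object combination as the unordered pair of entities at the two ends of a communication.

```python
from collections import defaultdict

# Hypothetical communication records; the field names are assumptions.
records = [
    {"src": "A", "dst": "B", "dport": 443, "bytes": 1200},
    {"src": "B", "dst": "A", "dport": 51000, "bytes": 300},
    {"src": "A", "dst": "H", "dport": 22, "bytes": 9000},
    {"src": "C", "dst": "D", "dport": 53, "bytes": 80},
]


def group_by_object_combination(records):
    """Group records by the unordered pair of entities at the two ends."""
    groups = defaultdict(list)
    for rec in records:
        pair = tuple(sorted((rec["src"], rec["dst"])))
        groups[pair].append(rec)
    return dict(groups)


def behavior_data_sets(groups, target):
    """Return the behavior data set of every object combination containing the target."""
    return {pair: recs for pair, recs in groups.items() if target in pair}


groups = group_by_object_combination(records)
target_sets = behavior_data_sets(groups, "A")
```

Here entity A is chosen as the target network entity, so `target_sets` holds the behavior data sets of the combinations (A, B) and (A, H).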
Step S104, label fusion is performed on the behavior data set and preset resource data to obtain a data comprehensive information table, wherein the resource data comprises threat information, WHOIS (domain name query protocol) data, IP portrait data, knowledge base data and organization portrait data.
In this embodiment, label fusion of the behavior data set and the resource data can be understood as fusing, by label, data such as the source IP and destination IP of each network entity in the behavior data set with resource data such as knowledge base data, threat information and organization portrait data, so as to form the data comprehensive information table. Label fusion may also be called label association: the behavior data set and the resource data are associated into one data comprehensive information table through shared labels. For each network entity in the object combination, the label-fused data comprehensive information table supplements its activity under three possible identities, namely attack entity, suspected victim entity and unknown entity, and enriches the entity with threat information, portrait and other information, which assists the analysis of advanced persistent threat events.
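As a rough illustration of label fusion, the following Python sketch joins behavior records with resource data on a shared label, here the destination IP. The table layouts, field names and the `fuse_labels` helper are hypothetical; the IP addresses come from documentation-reserved ranges.

```python
# Hypothetical behavior data set for one object combination.
behavior = [
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.7", "dport": 4444, "hour": 3},
    {"src_ip": "10.0.0.5", "dst_ip": "198.51.100.2", "dport": 443, "hour": 14},
]

# Hypothetical resource data, each table keyed by IP address.
threat_intel = {"203.0.113.7": {"intel_label": "c2-server"}}
ip_portrait = {
    "203.0.113.7": {"geo": "XX", "org": "unknown"},
    "198.51.100.2": {"geo": "YY", "org": "example-cdn"},
}


def fuse_labels(behavior, *resources, key="dst_ip"):
    """Attach every resource attribute found for the record's label to the record."""
    table = []
    for rec in behavior:
        row = dict(rec)  # copy so the behavior data set is left unchanged
        for resource in resources:
            row.update(resource.get(rec[key], {}))
        table.append(row)
    return table


info_table = fuse_labels(behavior, threat_intel, ip_portrait)
```

Each row of `info_table` keeps the original behavior fields and gains whatever attributes the resource tables hold for the same label, which is one simple reading of "associating by the same label".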
Step S106, the data comprehensive information table is analyzed from a plurality of dimensions respectively, according to an analytic hierarchy process, to obtain atomic characterization data results corresponding to the dimensions.
The analytic hierarchy process in this embodiment is a hierarchical, weight-based decision analysis method. It is mainly used for complex multi-objective problems that are difficult to quantify, which are neither easy to analyze quantitatively nor suited to purely qualitative analysis. The core of the analytic hierarchy process is decomposition: the complex multi-objective decision problem is broken down, according to the objectives of the system, into small sub-modules (also called atomic models); each sub-module is analyzed; and the results of all sub-modules are finally integrated into a comprehensive result, thereby realizing comprehensive analysis of the target problem.
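For readers unfamiliar with the weighting side of the analytic hierarchy process, the sketch below shows one common way to derive priority weights from a pairwise comparison matrix, the row geometric mean approximation. The embodiment does not state how, or whether, such weights are computed, so the matrix values and the `ahp_weights` function are purely illustrative assumptions.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical pairwise judgements for three sub-modules:
# entry [i][j] states how much more important sub-module i is than sub-module j.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
```

The resulting weights sum to 1 and preserve the ordering expressed by the judgements, so the first sub-module receives the largest weight.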
Basic-layer characterization analysis is performed on the data comprehensive information table from a plurality of dimensions to form the atomic characterization data results. The characterized dimensions may include, but are not limited to, peer IP, peer domain name, communication time, organization number, peer geographic location attribute, peer IP port data, peer domain names above a specified ranking, IP record information, corresponding threat intelligence labels, and threat intelligence IOC (Indicators of Compromise) aggregation.
Step S108, atomic characterization data results corresponding to different dimensions are combined to obtain a plurality of data combination results. In this embodiment, combining atomic characterization data results of different dimensions brings together the communication behaviors and entity attributes of the network entities. Different data combination results can generally be mapped to different attack scenarios, so that high-value threat analysis events are generated.
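One simple way to form data combination results is to enumerate subsets of the available dimensions. The sketch below builds every two-dimension combination with `itertools.combinations`. Enumerating all pairs is an assumption made for illustration; an implementation may instead select only the combinations that correspond to the attack scenarios of interest.

```python
from itertools import combinations

# Hypothetical atomic characterization data results, keyed by dimension.
atomic_results = {
    "peer_ip": {"203.0.113.7": 2},
    "comm_time": {3: 1, 4: 1},
    "intel_label": {"203.0.113.7": "c2-server"},
}


def combine(atomic_results, size=2):
    """Build one data combination result per subset of `size` dimensions."""
    combos = []
    for dims in combinations(sorted(atomic_results), size):
        combos.append({dim: atomic_results[dim] for dim in dims})
    return combos


data_combinations = combine(atomic_results)
```

With three dimensions this yields three data combination results, each holding the atomic results of two dimensions.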
Step S110, suspicious behavior analysis is performed on each data combination result to obtain an initial threat score value corresponding to each data combination result.
According to the embodiment, the atomic characterization data results are combined and analyzed, so that data sharing in a multi-level analysis process is realized, and the data analysis efficiency can be accelerated.
Step S112, the threat total score of the target network entity is obtained according to the initial threat score values corresponding to the data combination results. For example, the sum of the plurality of initial threat score values may be determined as the threat total score of the target network entity.
In the threat analysis method based on multi-level information fusion provided by this embodiment, a behavior data set corresponding to a target network entity is acquired; label fusion is performed on the behavior data set and preset resource data to obtain a data comprehensive information table; the data comprehensive information table is analyzed from a plurality of dimensions respectively, according to an analytic hierarchy process, to obtain atomic characterization data results corresponding to the dimensions; atomic characterization data results corresponding to different dimensions are combined to obtain a plurality of data combination results; suspicious behavior analysis is performed on each data combination result to obtain the corresponding initial threat score value; and the threat total score of the target network entity is obtained according to the initial threat score values corresponding to the data combination results. On the one hand, the data comprehensive information table is analyzed from a plurality of dimensions, which realizes multi-dimensional data analysis. On the other hand, atomic characterization data results corresponding to different dimensions are combined and the combined results are analyzed, so that the information-fused atomic characterization data results can be analyzed comprehensively and from an overall view. This reduces the difficulty of assessment and addresses the data fusion problem. Because the atomic characterization data results are shared, the degree of reuse of common data increases, which in turn improves the efficiency and accuracy of threat analysis.
The threat analysis method based on multi-level information fusion is described in detail below with reference to the application architecture shown in fig. 3.
The application architecture in the embodiment comprises a multi-source data fusion module, a threat analysis module and a result output module, wherein the threat analysis module is a double-layer model system and comprises a basic analysis layer and a comprehensive analysis layer.
The multi-source data fusion module is used for acquiring network communication data, acquiring a behavior data set corresponding to a target network entity based on the network communication data, and carrying out label fusion on the behavior data set and resource data such as threat information, WHOIS, IP portrait, knowledge base data and organization portrait data to form a data comprehensive information table.
The basic analysis layer performs basic analysis on the data comprehensive information table according to the analytic hierarchy process, from a plurality of dimensions such as peer IP, peer domain name, communication time, organization number, peer geographic location attribute, peer IP port data, peer domain name Top-k, IP record information, corresponding threat information labels and threat information IOC aggregation, and obtains the atomic characterization data result corresponding to each dimension.
Referring to fig. 4, in one embodiment, a plurality of atomic characterization models may first be acquired, such as atomic characterization model 1, atomic characterization model 2, atomic characterization model 3, and atomic characterization model n. Each atomic characterization model corresponds to at least one dimension; in other words, either a single dimension or a combination of several dimensions corresponds to one atomic characterization model. The input of each atomic characterization model is the data comprehensive information table output by the multi-source data fusion module, and the model outputs the atomic characterization data of its corresponding dimension. In fig. 4, the atomic characterization data corresponding to the dimensions are atomic characterization data 1 for the peer IP, atomic characterization data 2 for the communication time, and atomic characterization data 3 for the threat information label.
Each atomic characterization model analyzes the data comprehensive information table according to its corresponding dimension to obtain the atomic characterization data corresponding to that dimension.
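The basic analysis layer can be pictured as a registry of small functions, one per dimension, that all receive the same data comprehensive information table. The following Python sketch is a minimal illustration under that assumption; the three model functions, the field names and the chosen statistics (simple counts) are hypothetical.

```python
from collections import Counter

# Hypothetical data comprehensive information table.
info_table = [
    {"dst_ip": "203.0.113.7", "hour": 3, "intel_label": "c2-server"},
    {"dst_ip": "203.0.113.7", "hour": 4, "intel_label": "c2-server"},
    {"dst_ip": "198.51.100.2", "hour": 14, "intel_label": None},
]


def peer_ip_model(table):
    """Atomic characterization of the peer IP dimension: contacts per peer."""
    return Counter(row["dst_ip"] for row in table)


def comm_time_model(table):
    """Atomic characterization of the communication time dimension: contacts per hour."""
    return Counter(row["hour"] for row in table)


def intel_label_model(table):
    """Atomic characterization of the threat information label dimension."""
    return {row["dst_ip"]: row["intel_label"] for row in table if row["intel_label"]}


# One atomic characterization model per dimension (names are assumptions).
ATOMIC_MODELS = {
    "peer_ip": peer_ip_model,
    "comm_time": comm_time_model,
    "intel_label": intel_label_model,
}

atomic_results = {dim: model(info_table) for dim, model in ATOMIC_MODELS.items()}
```

Adding a dimension then amounts to registering one more function, without touching the existing models.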
The comprehensive analysis layer is used for combining atomic characterization data results corresponding to different dimensions to obtain a plurality of data combination results, and comprehensively analyzing the data combination results to obtain initial threat score values corresponding to the data combination results.
In this embodiment, the comprehensive analysis layer performs a further, higher-level analysis on top of the atomic characterization data results. It combines the behaviors and attributes captured by the atomic characterization data results of the different monitored dimensions, maps the data combination results to different attack scenarios, and generates high-value threat analysis events.
To perform suspicious behavior analysis on each data combination result, a plurality of comprehensive threat analysis models may first be constructed; the constructed models can comprehensively analyze highly suspicious behaviors. Construction examples include at least one of the following:
(a) Constructing a comprehensive threat analysis model according to preset abnormal communication behaviors, where the abnormal communication behaviors include communication at suspicious times, communication that persists over a long period, and the like.
(b) Extracting IOC indicators from threat information, and constructing a comprehensive threat analysis model according to the IOC indicators.
(c) Constructing a comprehensive threat analysis model according to preset key ports, where the key ports include high-risk ports, ports of specific tools, and the like.
(d) Constructing a comprehensive threat analysis model according to preset rule scenes, where the rule scenes include theft of host information, download of suspicious files, large outbound traffic, and the like.
After the comprehensive threat analysis model is constructed, each data combination result is input into each comprehensive threat analysis model one by one, and suspicious behavior analysis is carried out on the data combination result through the comprehensive threat analysis model.
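The four construction examples can be sketched as four small scoring functions that each receive a data combination result. Everything concrete below is an assumption made for illustration: the port list, the IOC set, the time window, the traffic limit and the score values are hypothetical and are not taken from the embodiment.

```python
HIGH_RISK_PORTS = {23, 3389, 4444}    # hypothetical preset key ports
IOC_IPS = {"203.0.113.7"}             # hypothetical IOC indicators from threat information
SUSPICIOUS_HOURS = set(range(0, 6))   # hypothetical "suspicious time" window


def abnormal_comm_model(combo):
    """(a) Abnormal communication behavior: any contact in the suspicious window."""
    hours = combo.get("comm_time", {})
    return 30 if any(h in SUSPICIOUS_HOURS for h in hours) else 0


def ioc_model(combo):
    """(b) IOC match: any peer IP listed as an indicator of compromise."""
    peers = combo.get("peer_ip", {})
    return 50 if any(ip in IOC_IPS for ip in peers) else 0


def key_port_model(combo):
    """(c) Key port: any peer port in the preset high-risk list."""
    ports = combo.get("peer_port", {})
    return 20 if any(p in HIGH_RISK_PORTS for p in ports) else 0


def large_outflow_model(combo, limit=10_000_000):
    """(d) Rule scene: outbound traffic volume above a preset limit."""
    return 40 if combo.get("bytes_out", 0) > limit else 0


MODELS = [abnormal_comm_model, ioc_model, key_port_model, large_outflow_model]


def initial_scores(combo):
    """Feed one data combination result into every model, one by one."""
    return {model.__name__: model(combo) for model in MODELS}


combo = {
    "comm_time": {3: 1},
    "peer_ip": {"203.0.113.7": 2},
    "peer_port": {443: 2},
    "bytes_out": 1200,
}
scores = initial_scores(combo)
total = sum(scores.values())
```

In this example only the abnormal communication model and the IOC model fire, so the summed score is 80.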
Each comprehensive threat analysis model takes data combination results built from the atomic characterization data results as input, forming a multi-level analysis system. Each comprehensive threat analysis model can reuse the atomic characterization data results as raw material, which realizes data sharing among models (namely, the atomic characterization models and/or the comprehensive threat analysis models) and speeds up analysis. As shown in fig. 4, comprehensive threat analysis model 1 uses atomic characterization data 1 and atomic characterization data 2, and comprehensive threat analysis model 2 uses atomic characterization data 2 and atomic characterization data 3. Atomic characterization data 2 is therefore a data source shared by the two models and does not need to be recomputed, which improves analysis efficiency.
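The sharing of atomic characterization data between the two models of fig. 4 can be sketched with a small cache that computes each atomic result at most once. The `AtomicResultStore` class and the dimension keys are hypothetical names introduced for this illustration.

```python
class AtomicResultStore:
    """Computes each atomic characterization result once and serves it to every model."""

    def __init__(self, info_table, atomic_models):
        self._table = info_table
        self._models = atomic_models
        self._cache = {}
        self.computations = 0  # counts how many atomic models actually ran

    def get(self, dimension):
        if dimension not in self._cache:
            self._cache[dimension] = self._models[dimension](self._table)
            self.computations += 1
        return self._cache[dimension]


# Hypothetical data comprehensive information table and atomic models.
info_table = [
    {"peer_ip": "203.0.113.7", "hour": 3, "label": "c2-server"},
    {"peer_ip": "198.51.100.2", "hour": 14, "label": None},
]
atomic_models = {
    "data_1_peer_ip": lambda t: {r["peer_ip"] for r in t},
    "data_2_comm_time": lambda t: {r["hour"] for r in t},
    "data_3_intel_label": lambda t: {r["label"] for r in t if r["label"]},
}
store = AtomicResultStore(info_table, atomic_models)

# Comprehensive model 1 consumes data 1 and data 2; model 2 consumes data 2 and data 3.
model_1_input = (store.get("data_1_peer_ip"), store.get("data_2_comm_time"))
model_2_input = (store.get("data_2_comm_time"), store.get("data_3_intel_label"))
```

Four lookups are made but only three computations run, because atomic characterization data 2 is served from the cache on its second use.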
And summing the initial threat score values obtained by all the comprehensive threat analysis models to obtain the threat total score of the target network entity.
Based on the analytic hierarchy process, this embodiment builds a double-layer threat analysis model system consisting of basic analysis and comprehensive analysis. It provides an overall view through fusion analysis across multiple dimensions and over a longer period, realizes comprehensive analysis of the atomic characterization data results of multiple dimensions, improves the accuracy of comprehensive threat analysis by making the fullest use of the available data, and improves threat analysis efficiency by sharing and reusing the atomic characterization data results.
The method provided by the embodiment may further include generating an alarm event when the threat total score is greater than a preset score threshold.
The alarm events are sorted according to the threat total score and preset reference factors, where the reference factors include, but are not limited to, industry type and project security level. In practical applications, advanced persistent threat events affect projects of different industries and different work units to different degrees, and different industries and projects pay different degrees of attention to such events. For this reason, the alarm events can be sorted by the threat total score together with the preset reference factors, so that the user can handle the top-ranked alarm events first and threat analysis has a clearer target.
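Alarm generation and ranking can be sketched as follows. The embodiment does not specify how the threat total score and the reference factors are combined, so the multiplicative `priority`, the industry weights and the threshold value below are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass

SCORE_THRESHOLD = 60  # hypothetical preset score threshold
INDUSTRY_WEIGHT = {"finance": 1.5, "energy": 1.3, "retail": 1.0}  # hypothetical weights


@dataclass
class Alarm:
    entity: str
    total_score: int
    industry: str
    security_level: int

    @property
    def priority(self):
        # Assumed ranking rule: total score scaled by both reference factors.
        weight = INDUSTRY_WEIGHT.get(self.industry, 1.0)
        return self.total_score * weight * self.security_level


def make_alarms(entities):
    """Create an alarm for every entity above the threshold, highest priority first."""
    alarms = []
    for e in entities:
        total = sum(e["initial_scores"])  # threat total score as the sum of initial scores
        if total > SCORE_THRESHOLD:
            alarms.append(Alarm(e["name"], total, e["industry"], e["security_level"]))
    return sorted(alarms, key=lambda a: a.priority, reverse=True)


entities = [
    {"name": "A", "initial_scores": [30, 50], "industry": "retail", "security_level": 1},
    {"name": "C", "initial_scores": [40, 30], "industry": "finance", "security_level": 2},
    {"name": "H", "initial_scores": [20, 10], "industry": "energy", "security_level": 3},
]
alarms = make_alarms(entities)
```

Entity H stays below the threshold and raises no alarm. Entity C has a lower threat total score than entity A but ranks first, because its industry weight and security level are higher under the assumed rule.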
The result output module outputs the threat total score and the alarm events of the target network entity as results, and sends the results to the user terminal in forms such as message notification, mail and pop-up window.
The threat analysis method based on multi-level information fusion provided by this embodiment is generally applicable to big data security analysis products and to the situation awareness projects of supervision units, and covers processes such as alarm assessment, advanced threat analysis and threat hunting. Based on these scenarios, a specific embodiment that applies the method to business analysis is described below.
Alongside the network communication data, the big data security analysis platform is connected to resource data such as threat information, knowledge base data and basic information.
In the process of data processing, a behavior data set corresponding to a target network entity is extracted from network communication data, and the behavior data set and preset resource data are subjected to label fusion to obtain a data comprehensive information table.
In the threat analysis process, basic analysis is performed on the data comprehensive information table according to the analytic hierarchy process: the table is analyzed from a plurality of dimensions respectively to obtain the atomic characterization data results corresponding to the dimensions.
The atomic characterization data results corresponding to different dimensions are combined into a plurality of data combination results. Suspicious behavior analysis is performed on each data combination result through the comprehensive threat analysis models to obtain the initial threat score value corresponding to each data combination result, and the sum of the initial threat score values is determined as the threat total score of the target network entity. When the threat total score is greater than a preset score threshold, an alarm event is generated. The alarm events are sorted and output.
Fig. 5 is a block diagram of a threat analysis apparatus based on multi-level information fusion according to an embodiment of the disclosure, where the apparatus may be used to implement the threat analysis method based on multi-level information fusion. Threat analysis apparatus 500 based on multi-level information fusion includes:
A data acquisition module 502, configured to acquire a behavior data set corresponding to a target network entity;
The data fusion module 504 is configured to perform tag fusion on the behavior data set and preset resource data to obtain a data comprehensive information table;
The first analysis module 506 is configured to analyze the data comprehensive information table from multiple dimensions according to an analytic hierarchy process, to obtain atomic characterization data results corresponding to the dimensions;
The data combination module 508 is configured to combine atomic characterization data results corresponding to different dimensions to obtain a plurality of data combination results;
The second analysis module 510 is configured to perform suspicious behavior analysis on each data combination result, so as to obtain an initial threat score value corresponding to each data combination result;
The threat determination module 512 is configured to obtain a total threat score of the target network entity according to initial threat score values corresponding to the multiple data combination results.
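The relationship between the six modules can be sketched as a small pipeline object in which each module is a pluggable callable. This is only one possible software arrangement; the class name, field names and the stand-in lambdas are hypothetical, and the reference numerals in the comments simply point back to the modules listed above.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ThreatAnalysisApparatus:
    acquire: Callable[[str], Any]          # data acquisition module 502
    fuse: Callable[[Any], Any]             # data fusion module 504
    analyze_atomic: Callable[[Any], dict]  # first analysis module 506
    combine: Callable[[dict], list]        # data combination module 508
    score: Callable[[Any], float]          # second analysis module 510
    total: Callable[[list], float]         # threat determination module 512

    def run(self, target):
        """Chain the six modules for one target network entity."""
        table = self.fuse(self.acquire(target))
        combos = self.combine(self.analyze_atomic(table))
        return self.total([self.score(c) for c in combos])


# Trivial stand-ins so the sketch runs; real modules would replace them.
apparatus = ThreatAnalysisApparatus(
    acquire=lambda target: [{"peer": "203.0.113.7", "hour": 3}],
    fuse=lambda rows: [dict(r, label="c2-server") for r in rows],
    analyze_atomic=lambda table: {
        "peer": [r["peer"] for r in table],
        "label": [r["label"] for r in table],
    },
    combine=lambda atomic: [atomic],
    score=lambda combo: 50.0 if "c2-server" in combo["label"] else 0.0,
    total=sum,
)
result = apparatus.run("A")
```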
The device provided in this embodiment has the same implementation principle and technical effects as those of the foregoing method embodiment, and for brevity, reference may be made to the corresponding content of the foregoing method embodiment where the device embodiment is not mentioned.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 6, the electronic device 600 includes one or more processors 601 and memory 602.
The processor 601 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities and may control other components in the electronic device 600 to perform desired functions.
The memory 602 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer readable storage medium that can be executed by the processor 601 to implement the multi-level information fusion-based threat analysis method and/or other desired functions of the embodiments of the disclosure described above. Various contents such as an input signal, a signal component, a noise component, and the like may also be stored in the computer-readable storage medium.
In one example, the electronic device 600 may also include an input device 603 and an output device 604, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
In addition, the input device 603 may also include, for example, a keyboard, a mouse, and the like.
The output device 604 may output various information to the outside, such as the threat total score and the alarm events. The output device 604 may include, for example, a display, a speaker, a printer, and a communication network and the remote output devices connected to it.
Of course, only some of the components of the electronic device 600 that are relevant to the present disclosure are shown in fig. 6, with components such as buses, input/output interfaces, etc. omitted for simplicity. In addition, the electronic device 600 may include any other suitable components depending on the particular application.
Further, the embodiment also provides a computer readable storage medium, and the storage medium stores a computer program, and the computer program is used for executing the threat analysis method based on multi-level information fusion.
The embodiments of the disclosure provide a threat analysis method, apparatus, electronic device and medium based on multi-level information fusion, including a computer readable storage medium storing program code. The program code includes instructions for executing the method described in the foregoing method embodiment. For the specific implementation, refer to the method embodiment; details are not repeated here.
It should be noted that in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown and described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.