CN103746867B - A kind of network protocol analysis method based on basic function - Google Patents
A kind of network protocol analysis method based on basic function Download PDFInfo
- Publication number
- CN103746867B CN103746867B CN201310718896.2A CN201310718896A CN103746867B CN 103746867 B CN103746867 B CN 103746867B CN 201310718896 A CN201310718896 A CN 201310718896A CN 103746867 B CN103746867 B CN 103746867B
- Authority
- CN
- China
- Prior art keywords
- protocol
- combination
- basis function
- target
- function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Landscapes
- Communication Control (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a network protocol analysis method based on basis functions, comprising: establishing a basis function library and a library of basis function combination patterns for protocols of known structure; when data sent by a target network is received, characterizing the structure of the target protocol corresponding to that network using the data and the existing basis function combination patterns; and making a judgment according to the structure of the target protocol: if the data is protocol data of known structure, the target protocol is analyzed with a layered method; if the data is protocol data of unknown structure, the basis function combination pattern corresponding to the target protocol is generated using existing or new basis functions. The invention addresses the problems of rapid protocol identification and accurate analysis and processing.
Description
Technical field
The invention relates to the field of computer technology, and in particular to a network protocol analysis method based on basis functions.
Background
Network protocol analysis is a core concern for many researchers in network management and network security. It captures the data packets on a network and analyzes their header and data fields to produce detailed protocol information and statistics, then classifies and analyzes the data. This helps uncover potential security risks in the network and provides fault analysis information when a failure occurs. Network protocol analysis lets network administrators locate the cause of a fault quickly and accurately, identify the network node, network protocol, and network link responsible, and restore normal operation as fast as possible. In addition, by analyzing network communication and connection conditions, it supports the rational allocation of network performance and resources and provides a reliable basis for planning and adjusting the network.
However, because existing network protocols are increasingly diverse and proprietary, protocol analysts face a growing number of protocol types and increasingly complex protocol state spaces.
Most existing protocol analysis methods rely on string matching; because they perform a large number of matching operations, they are slow. Other protocol analysis methods based on statistics suffer from low accuracy. Moreover, none of these methods can analyze the private protocols of individual network companies.
Summary of the invention
In view of the above analysis, the invention aims to provide a network protocol analysis method based on basis functions, to solve the slow and inaccurate protocol identification that results from the wide variety of protocols, the complex protocol state space, and the non-disclosure of private protocols in the current field of network analysis.
The purpose of the invention is achieved mainly through the following technical solution:
The invention provides a network protocol analysis method based on basis functions, comprising:
Establishing a basis function library and a library of basis function combination patterns for protocols of known structure;
When data sent by the target network is received as input data, characterizing the structure of the target protocol corresponding to the target network using that data and the existing basis function combination patterns;
Making a judgment according to the structure of the target protocol: if the target protocol is a protocol of known structure, analyzing it with a layered method; if the target protocol is a protocol of unknown structure, generating the basis function combination pattern corresponding to the target protocol using existing or new basis functions.
Further, Walsh functions are used to establish the basis function library and the library of basis function combination patterns for protocols of known structure. Specifically:
The Walsh functions are defined as follows:
If $\mathrm{wal}(k,t)$ $(k=0,1,\ldots)$ denotes the Walsh function on the interval $t\in[0,1)$, it is defined by:
$$\mathrm{wal}(2k,t)=\mathrm{wal}(k,2t)+(-1)^{k}\,\mathrm{wal}(k,2t-1),\quad k=1,2,\ldots$$
$$\mathrm{wal}(2k+1,t)=\mathrm{wal}(k,2t)+(-1)^{k+1}\,\mathrm{wal}(k,2t-1),\quad k=0,1,\ldots$$
A transformation $f(x)$ is defined that converts the ±1 values of the Walsh functions into a 0/1 bit stream.
Applying $f(x)$ then yields a set of orthogonal basis functions $\mathrm{base}(k,t)$ $(k=0,1,\ldots)$:
$$\mathrm{base}(2k,t)=f(\mathrm{wal}(2k,t)),\quad k=1,2,\ldots$$
$$\mathrm{base}(2k+1,t)=f(\mathrm{wal}(2k+1,t)),\quad k=0,1,\ldots$$
$$\mathrm{base}(0,t)=f(\mathrm{wal}(0,t))$$
This set of orthogonal basis functions base is the basis function library. Expressing every known protocol as a different combination pattern of these basis functions yields the library of basis function combination patterns for protocols of known structure.
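The recursion above can be evaluated directly. The following is a minimal Python sketch, not the patented implementation. It assumes a base case that the text does not state explicitly, namely that wal(0, t) = 1 on [0, 1) and that every Walsh function is 0 outside [0, 1). The helper names `sample_wal` and `to_bits` are hypothetical, and `to_bits` uses one possible choice of the transformation f(x) (+1 maps to 0, -1 maps to 1), since the formula for f(x) is not reproduced in this text.

```python
def wal(k: int, t: float) -> int:
    """Evaluate wal(k, t) with the recursion given in the text.

    Assumption: wal(0, t) = 1 on [0, 1), and all functions are 0 outside [0, 1).
    """
    if not 0.0 <= t < 1.0:
        return 0
    if k == 0:
        return 1
    half, odd = divmod(k, 2)                 # k = 2*half + odd
    # Even index uses (-1)^half, odd index uses (-1)^(half + 1).
    sign = -1 if (half + odd) % 2 else 1
    return wal(half, 2 * t) + sign * wal(half, 2 * t - 1)


def sample_wal(k: int, n: int) -> list:
    """Sample wal(k, t) at the midpoints of n equal sub-intervals of [0, 1)."""
    return [wal(k, (i + 0.5) / n) for i in range(n)]


def to_bits(values: list) -> list:
    """Hypothetical choice of f(x): map +1 to 0 and -1 to 1."""
    return [(1 - v) // 2 for v in values]


if __name__ == "__main__":
    for k in range(4):
        print(k, sample_wal(k, 4), to_bits(sample_wal(k, 4)))
```

Sampling at interval midpoints with n a power of two keeps every argument exactly representable in floating point, so the interval tests in `wal` are not affected by rounding.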
Further, the step of characterizing the structure of the target protocol using the existing basis function combination patterns specifically includes:
Using the existing basis function combination patterns and the data received from the target network, plotting the time-structure distribution diagram of the target protocol in a time-sliding manner;
Deriving the basis function combination pattern versus matching rate distribution diagram from the time-structure distribution diagram of the target protocol, and characterizing the structure of the target protocol by this pattern versus matching rate distribution.
Further, the time-structure distribution diagram is obtained as follows:
Let $C$ be the set of basis function combination patterns. Each existing pattern $c_1,c_2,\ldots,c_{cn}\in C$ is combined with the received data by an XOR-add operation in a time-sliding manner, giving the structure value under each pattern. That is, for a given combination pattern and input data $I=\{b_1,b_2,\ldots,b_n\}$, the structure value is calculated; if $n>cf$, the index $i$ cycles repeatedly from 1 to $cf$ (that is, $i=1\sim cf,\ 1\sim cf,\ldots$). This yields the time-structure distribution diagram of the input data.
Further, the basis function combination pattern versus matching rate distribution diagram is obtained as follows:
For each basis function combination pattern $c_i$, all values in the time-structure distribution diagram of the input data are summed, giving the structure matching value $(c_i,m_i)$ of each pattern, $i\in\{1,2,\ldots,cn\}$;
The structure matching values of all known basis function combination patterns are plotted as the pattern versus matching rate distribution diagram, where the abscissa is the known basis function combination pattern and the ordinate is the structure matching value.
Further, the step of making a judgment according to the structure of the target protocol specifically includes:
Comparing the maximum matching rate value $m$ in the pattern versus matching rate distribution diagram with a first predetermined threshold $t_1$:
If $m$ is greater than or equal to $t_1$, the target protocol is considered a protocol of known structure and is analyzed with a layered method;
If $m$ is less than $t_1$, the target protocol is considered a private protocol, and the basis function combination pattern corresponding to it is generated using existing or new basis functions.
Further, the step of analyzing the target protocol with a layered method specifically includes:
(0) Taking the data sent by the target network as input data d;
(1) For input data d, using its basis function combination pattern to extract the outermost protocol feature fields $f_1,f_2,\ldots,f_{fn}$ of d, and splitting the received protocol information into the protocol header field H and the upper-layer protocol data supD;
(2) Parsing all feature fields contained in the header field H;
(3) Checking whether the length of the upper-layer protocol data supD is 0, and stopping if it is;
(4) Characterizing the protocol structure of the upper-layer protocol data supD obtained from the split;
(5) Taking the structurally characterized upper-layer protocol data as the input data and returning to (1);
This continues until all the data has been processed.
Further, the step of generating the basis function combination pattern corresponding to the target protocol using existing or new basis functions specifically includes:
Let $F$ be the set of basis functions and $C$ the set of basis function combination patterns.
1) First, using the existing basis functions $f_1,f_2,\ldots,f_{cf}\in F$, adopt a new combination pattern that differs from those in the combination pattern library, that is, randomly choose new values $k_i$ such that the new pattern $c$ does not belong to $C$, and plot the structure distribution diagram of the target protocol;
2) Plot the pattern versus matching rate distribution diagram from the structure matching values of all the new basis function combination patterns $c$;
3) Compare the maximum matching rate value $m$ with a second predetermined threshold $t_2$:
If $m$ is greater than or equal to $t_2$, add the new basis function combination pattern $c_n$ corresponding to this protocol to the combination pattern library $C$ and stop;
If $m$ is less than $t_2$, increase the dimension of the basis functions by 1, add the resulting new basis function $f_{cf+1}$ to the basis function library $F$, and return to step 1).
The beneficial effects of the invention are as follows:
The invention provides a network protocol analysis method based on basis functions that addresses the problems of rapid protocol identification and accurate analysis and processing.
Other features and advantages of the invention are set forth in the description that follows; some will be apparent from the description, and others may be learned by practicing the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description, the claims, and the accompanying drawings.
Description of the drawings
Figure 1 is a flow chart of the method described in the embodiment of the invention;
Figure 2 is the time-structure distribution diagram of the input data in the method described in the embodiment of the invention;
Figure 3 is the basis function combination pattern versus matching rate distribution diagram in the method described in the embodiment of the invention;
Figure 4 is a schematic diagram of the layer-by-layer analysis of a protocol in the method described in the embodiment of the invention;
Figure 5 is a schematic diagram of the results of protocol data analysis in the method described in the embodiment of the invention.
Detailed description
The method of the invention mainly includes: establishing a basis function library and a library of basis function combination patterns for protocols of known structure; when data sent by the target network is received, characterizing the structure of the target protocol corresponding to the target network using that data and the existing basis function combination patterns; and making a judgment according to the structure of the target protocol: if the target protocol is a protocol of known structure, analyzing it with a layered method; if the target protocol is a protocol of unknown structure, generating the basis function combination pattern corresponding to it using existing or new basis functions.
Preferred embodiments of the invention are described in detail below with reference to the accompanying drawings, which form part of this application and, together with the embodiments, serve to explain the principles of the invention.
Figure 1 is a flow chart of the method described in the embodiment of the invention, which may include the following steps:
Step 101: Use Walsh functions to establish the basis function library and the library of basis function combination patterns for protocols of known structure;
Because the feature fields of existing protocols are highly varied, a small amount of information must be able to express the feature fields of a large number of protocols, so as to provide a foundation for protocol analysis.
Because the Walsh function system takes the values ±1 and is orthogonal, the embodiment of the invention uses Walsh functions as the basis functions for characterizing protocol structure; other similar functions may of course also be used.
The Walsh functions are defined as follows:
If $\mathrm{wal}(k,t)$ $(k=0,1,\ldots)$ denotes the Walsh function on the interval $t\in[0,1)$, it is defined by:
$$\mathrm{wal}(2k,t)=\mathrm{wal}(k,2t)+(-1)^{k}\,\mathrm{wal}(k,2t-1),\quad k=1,2,\ldots$$
$$\mathrm{wal}(2k+1,t)=\mathrm{wal}(k,2t)+(-1)^{k+1}\,\mathrm{wal}(k,2t-1),\quad k=0,1,\ldots$$
Given the Walsh functions, the embodiment of the invention defines a transformation $f(x)$ that converts their ±1 values into a 0/1 bit stream.
Applying $f(x)$ then yields a set of orthogonal basis functions $\mathrm{base}(k,t)$ $(k=0,1,\ldots)$:
$$\mathrm{base}(2k,t)=f(\mathrm{wal}(2k,t)),\quad k=1,2,\ldots$$
$$\mathrm{base}(2k+1,t)=f(\mathrm{wal}(2k+1,t)),\quad k=0,1,\ldots$$
$$\mathrm{base}(0,t)=f(\mathrm{wal}(0,t))$$
Because the protocol data of a protocol of known structure can be represented as a string of 0/1 codes, every known protocol can be expressed as a different combination pattern of the orthogonal basis functions base. That is, for protocol data of known structure $d=\{x_1x_2\ldots x_n\}$, $x_i\in\{0,1\}$, real numbers $c_i$ can always be found that express $d$ as a combination of the basis functions, where $M$ is the number of orthogonal basis functions base. The basis function library and the library of basis function combination patterns for protocols of known structure can be established on this basis.
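The claim that a 0/1 code string can always be written as a real-weighted combination of orthogonal functions can be illustrated with a small worked example. The Python sketch below is an illustration under stated assumptions, not the patent's formula: it uses the four ±1 Walsh functions sampled at four points (rather than the 0/1 functions base, because the transformation f(x) is not reproduced in this text), and the names `WALSH_4`, `coefficients`, and `reconstruct` are hypothetical.

```python
# Four sequency-ordered Walsh functions sampled at four points (values +1/-1).
WALSH_4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def coefficients(d: list) -> list:
    """Project d onto each Walsh row; orthogonality makes c_i = <w_i, d> / n."""
    n = len(d)
    return [sum(w * x for w, x in zip(row, d)) / n for row in WALSH_4]


def reconstruct(c: list) -> list:
    """Rebuild the data as the weighted sum of the Walsh rows."""
    return [sum(ci * row[j] for ci, row in zip(c, WALSH_4)) for j in range(4)]


if __name__ == "__main__":
    d = [1, 0, 1, 1]
    c = coefficients(d)
    print(c)               # real coefficients c_i
    print(reconstruct(c))  # recovers the original bits
```

For d = [1, 0, 1, 1] the coefficients are multiples of 1/4, so the reconstruction is exact in floating point.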
Step 102: Receive the data sent by the target network as input data;
Step 103: Characterize the structure of the target protocol corresponding to the target network using that data and the existing basis function combination patterns;
Specifically, on the basis of the basis functions and the transformation $f(x)$, the existing basis function combination patterns and the data received from the target network are used to plot the time-structure distribution diagram of the target protocol in a time-sliding manner. The pattern versus matching rate distribution diagram is then derived from the time-structure distribution diagram, which makes it possible to judge whether the data is protocol data of known structure, that is, whether the protocol used by the target network is a protocol of known structure.
Let $C$ be the set of basis function combination patterns.
(1) Each existing pattern $c_1,c_2,\ldots,c_{cn}\in C$ is combined, in a time-sliding manner, with the input data received in step 102 by an XOR-add operation, giving the structure value under each pattern. That is, for a given combination pattern and input data $I=\{b_1,b_2,\ldots,b_n\}$, the structure value is calculated (if $n>cf$, the index $i$ cycles repeatedly from 1 to $cf$, that is, $i=1\sim cf,\ 1\sim cf,\ldots$). This yields the time-structure distribution diagram of the input data, as shown in Figure 2;
(2) For each basis function combination pattern $c_i$, all values in the time-structure distribution diagram of the input data are summed, giving the structure matching value $(c_i,m_i)$ of each pattern, $i\in\{1,2,\ldots,cn\}$;
(3) The structure matching values of all known basis function combination patterns are plotted as the pattern versus matching rate distribution diagram, as shown in Figure 3, where the abscissa is the known basis function combination pattern and the ordinate is the structure matching value.
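Steps (1) to (3) can be sketched in Python as follows. The exact structure-value formula is not reproduced in this text, so the sketch rests on an assumed reading: each input bit is XOR-ed with the cyclically repeated pattern bit at the same position, a position scores 1 when the two bits agree, and the matching value is the sum of the scores normalized by the input length so that it can be compared with a threshold such as 0.8. The function names and the example patterns are hypothetical.

```python
def time_structure(bits: list, pattern: list) -> list:
    """Per-position structure values (assumed reading of the XOR step):
    1 where the input bit equals the cyclically repeated pattern bit, else 0."""
    return [1 - (b ^ pattern[i % len(pattern)]) for i, b in enumerate(bits)]


def match_rate(bits: list, pattern: list) -> float:
    """Sum the time-structure values and normalize by the input length."""
    values = time_structure(bits, pattern)
    return sum(values) / len(values)


def match_profile(bits: list, patterns: dict) -> dict:
    """Matching value for every known combination pattern (the data of Figure 3)."""
    return {name: match_rate(bits, p) for name, p in patterns.items()}


if __name__ == "__main__":
    data = [0, 0, 1, 1, 0, 0, 1, 1]
    library = {"A": [0, 0, 1, 1], "B": [0, 1]}   # hypothetical patterns
    print(match_profile(data, library))
```

The cyclic index `i % len(pattern)` mirrors the remark that the index repeats from 1 to cf when the input is longer than the pattern.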
Step 104: Find the maximum matching rate value in the pattern versus matching rate distribution diagram and compare this maximum value $m$ with the first predetermined threshold $t_1$:
If $m$ is greater than or equal to $t_1$, the data is considered to be data of a protocol of known structure, that is, the target protocol corresponding to the target network is a protocol of known structure; go to step 105, after which the flow ends. If $m$ is less than $t_1$, the data is considered to be data of a private protocol, that is, the target protocol corresponding to the target network is a private protocol; go to step 106 for self-learning;
The first predetermined threshold $t_1$ can be specified manually. It affects the accuracy of protocol analysis and can generally be set to 0.8.
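The decision in step 104 reduces to a maximum and a comparison. A minimal Python sketch, assuming the matching values are already available as a mapping from pattern name to a rate in [0, 1]; the function name `classify` and the return format are hypothetical.

```python
def classify(profile: dict, t1: float = 0.8) -> tuple:
    """Return ("known", pattern, m) if the best matching rate m reaches t1,
    otherwise ("private", None, m), which would trigger self-learning (step 106)."""
    best = max(profile, key=profile.get)
    m = profile[best]
    if m >= t1:
        return ("known", best, m)
    return ("private", None, m)


if __name__ == "__main__":
    print(classify({"A": 1.0, "B": 0.5}))
    print(classify({"A": 0.6, "B": 0.5}))
```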
Step 105: Layered protocol analysis based on basis function combination patterns;
Specifically, because existing communication protocols are encapsulated in layers, the embodiment of the invention applies the basis function combination pattern approach to analyze the received protocol data layer by layer. The parsing itself may use mature existing techniques in the field. The steps are as follows:
105-0: Take the data sent by the target network as input data d;
105-1: For input data d, use its basis function combination pattern to extract the outermost protocol feature fields $f_1,f_2,\ldots,f_{fn}$ of d, and split the received protocol information into the protocol header field H and the upper-layer protocol data supD;
105-2: Parse all feature fields contained in the header field H;
105-3: Check whether the length of the upper-layer protocol data supD is 0, and stop if it is;
105-4: Use the method of step 103 to characterize the protocol structure of the upper-layer protocol data supD obtained from the split;
105-5: Take the structurally characterized upper-layer protocol data as the input data and return to 105-1;
In other words, steps 105-1 to 105-4 process one protocol layer. Starting from the outermost protocol, each iteration processes one layer, until all the data has been processed.
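The loop in steps 105-0 to 105-5 can be sketched as below. This is an illustrative outline under assumptions: the structure characterization of step 103 is abstracted into a caller-supplied function `identify` that returns a layer name and header length, and `toy_identify` with its marker bytes 0xA1 and 0xB2 is an invented stand-in, not a real protocol.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical layer description: (layer name, header length in bytes).
Layer = Tuple[str, int]


def analyze_layers(data: bytes,
                   identify: Callable[[bytes], Optional[Layer]]) -> List[Tuple[str, bytes]]:
    """Peel protocol layers from the outside in, one layer per iteration."""
    layers = []
    d = data                                   # 105-0: input data d
    while len(d) > 0:                          # 105-3: stop when supD is empty
        layer = identify(d)                    # structure characterization (103 / 105-4)
        if layer is None or layer[1] <= 0:
            break                              # unrecognized layer: nothing more to split
        name, header_len = layer
        header, sup_d = d[:header_len], d[header_len:]   # 105-1: split into H and supD
        layers.append((name, header))          # 105-2: header H is kept for field parsing
        d = sup_d                              # 105-5: supD becomes the next input
    return layers


def toy_identify(d: bytes) -> Optional[Layer]:
    """Invented lookup keyed on the first byte, for demonstration only."""
    return {0xA1: ("outer", 2), 0xB2: ("inner", 3)}.get(d[0])


if __name__ == "__main__":
    print(analyze_layers(bytes([0xA1, 0x00, 0xB2, 0x01, 0x02]), toy_identify))
```

Each iteration only inspects the current outermost header, which reflects the point that a layered analysis touches a small part of the data at a time.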
Figure 4 shows a schematic diagram of the layer-by-layer analysis of a protocol. As Figure 4 illustrates, when the target protocol is analyzed layer by layer, each pass uses the basis functions to extract only part of the information of the whole protocol, which reduces the amount of data processed and speeds up protocol analysis.
Analyzing the structure of the target protocol in layers by the above method reduces the complexity of protocol analysis and refines the result level by level, increasing the accuracy of the analysis.
Step 106: Self-learning extension of the basis functions and their combination patterns
Specifically, because the structure of a private protocol is uncertain, a method is needed that can extend the basis functions and their combination patterns by self-learning. The self-learning analysis method used in the embodiment of the invention can be divided into the following steps:
Let $F$ be the set of basis functions and $C$ the set of basis function combination patterns.
106-1: First, using the existing basis functions $f_1,f_2,\ldots,f_{cf}\in F$, adopt a new combination pattern that differs from those in the combination pattern library, that is, randomly choose new values $k_i$ such that the new pattern $c$ does not belong to $C$, and plot the structure distribution diagram of the target protocol;
106-2: Plot the pattern versus matching rate distribution diagram from the structure matching values of all the new basis function combination patterns $c$;
106-3: Compare the maximum matching rate value $m$ with the second predetermined threshold $t_2$:
If $m$ is greater than or equal to $t_2$, the target protocol can be characterized by a new combination pattern $c_n$ of the existing basis functions; add this new pattern $c_n$ to the combination pattern library $C$ and stop;
If $m$ is less than $t_2$, the target protocol cannot be characterized by the existing basis functions; increase the dimension of the basis functions by 1, add the resulting new basis function $f_{cf+1}$ to the basis function library $F$, and return to 106-1;
The second predetermined threshold $t_2$ can be specified manually. It affects the accuracy of protocol characterization and can generally be set to 0.8.
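The self-learning loop of steps 106-1 to 106-3 can be outlined as follows. This is a sketch under several assumptions, since the formula for forming a new combination pattern is not reproduced in this text: a combination pattern is modeled as a tuple of basis-function indices whose bit sequences are concatenated, candidates are enumerated exhaustively up to length 2 instead of being drawn at random, the matching rate is the normalized agreement count, and `new_basis` is a caller-supplied function that produces the next basis function. All names are hypothetical.

```python
from itertools import product


def match_rate(bits, pattern):
    """Fraction of positions where the input agrees with the cyclically repeated pattern."""
    hits = sum(1 - (b ^ pattern[i % len(pattern)]) for i, b in enumerate(bits))
    return hits / len(bits)


def learn_mode(bits, basis, known_modes, new_basis, t2=0.8, max_rounds=4):
    """basis: list of bit lists (F); known_modes: set of index tuples (C).

    Returns (mode, m) when a new mode reaches t2, else (None, last best m).
    """
    m = 0.0
    for _ in range(max_rounds):
        # 106-1 / 106-2: score every candidate mode that is not already in C.
        candidates = [c for r in (1, 2)
                      for c in product(range(len(basis)), repeat=r)
                      if c not in known_modes]
        scored = [(match_rate(bits, [bit for k in c for bit in basis[k]]), c)
                  for c in candidates]
        m, best = max(scored) if scored else (0.0, None)
        # 106-3: accept the best new mode, or extend the basis library and retry.
        if best is not None and m >= t2:
            known_modes.add(best)
            return best, m
        basis.append(new_basis(len(basis)))
    return None, m


if __name__ == "__main__":
    F = [[0, 0], [0, 1]]
    C = {(0,), (1,)}
    print(learn_mode([0, 1, 1, 0, 0, 1, 1, 0], F, C, lambda k: [1, 0]))
```

In the usage above no combination of the two initial basis functions reaches 0.8, so one basis function is added and the second round finds a mode that matches the input exactly.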
The self-learning extension method described above produces new basis functions and new combination patterns, which are used to establish a representation of the private protocol. If the same protocol is encountered again later, it can be characterized and analyzed quickly through the basis function combination pattern library.
The effectiveness of the basis-function-based protocol analysis method proposed in the embodiment of the invention is illustrated below. Figure 5 shows the results of analyzing protocol data with the matching method, the statistical method, and the proposed method, where the abscissa is the amount of input protocol data and the ordinate is the analysis time. The figure shows that the proposed method outperforms the matching method and the statistical method.
In summary, the embodiment of the invention provides a network protocol analysis method based on basis functions. The network protocol data to be analyzed is first examined using basis-function-based protocol characterization to determine whether it is data of a protocol of known structure or data of a private protocol. Protocols of known structure and private protocols are then handled separately according to their characteristics. For protocols of known structure, because today's networks make extensive use of nested protocol layers, the embodiment uses a layered protocol analysis method. For private protocols, the embodiment uses self-learning to construct new basis functions and characterize the protocol data, thereby converting the protocol into one of known structure for analysis. The embodiment of the invention addresses the problems of rapid protocol identification and accurate analysis and processing.
The above is only a preferred embodiment of the invention, and the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention is therefore defined by the claims.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201310718896.2A CN103746867B (en) | 2013-12-23 | 2013-12-23 | A kind of network protocol analysis method based on basic function |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN103746867A | 2014-04-23 |
| CN103746867B | 2016-09-21 |
Family
ID=50503858
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201310718896.2A Active CN103746867B (en) | 2013-12-23 | 2013-12-23 | A kind of network protocol analysis method based on basic function |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN103746867B (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107689899A (en) * | 2017-09-01 | 2018-02-13 | 南京南瑞集团公司 | A kind of unknown protocol recognition methods and system based on bit stream |
| CN110445750A (en) * | 2019-06-18 | 2019-11-12 | 国家计算机网络与信息安全管理中心 | A kind of car networking protocol traffic recognition methods and device |
| CN116032809B (en) * | 2022-12-28 | 2024-04-30 | 上海天旦网络科技发展有限公司 | Network protocol analysis method and system using Wasm |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102045347A (en) * | 2010-11-30 | 2011-05-04 | 华为技术有限公司 | Method and device for identifying protocol |
| CN103281291A (en) * | 2013-02-19 | 2013-09-04 | 电子科技大学 | Application layer protocol identification method based on Hadoop |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW587239B (en) * | 1999-11-30 | 2004-05-11 | Semiconductor Energy Lab | Electric device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN103746867A (en) | 2014-04-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant |