
CN119854325A - Distributed storage method and system based on AI flow identification - Google Patents

Distributed storage method and system based on AI flow identification

Info

Publication number
CN119854325A
Authority
CN
China
Prior art keywords
node
gnn
cdn network
network node
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510340158.1A
Other languages
Chinese (zh)
Inventor
杨建军
高欣
陆政
陈叶华
周洪印
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HANGZHOU TRINET INFORMATION TECHNOLOGY CO LTD
Original Assignee
HANGZHOU TRINET INFORMATION TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HANGZHOU TRINET INFORMATION TECHNOLOGY CO LTD filed Critical HANGZHOU TRINET INFORMATION TECHNOLOGY CO LTD
Priority to CN202510340158.1A priority Critical patent CN119854325A
Publication of CN119854325A publication Critical patent CN119854325A
Pending legal-status Critical Current

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • H04L45/03 Topology update or discovery by updating link state protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/02 Topology update or discovery
    • H04L45/08 Learning-based routing, e.g. using neural networks or artificial intelligence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/38 Flow based routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/44 Distributed routing

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a distributed storage method and system based on AI traffic identification. The method comprises: obtaining state information of a current CDN network node and constructing a first GNN node in a GNN model by simulation; constructing adjacent nodes of the GNN model according to communication relationships between the current CDN network node and different CDN network nodes; obtaining communication traffic information between the first GNN node and the corresponding adjacent nodes; identifying traffic feature data in the communication traffic information with an AI model and constructing a dynamic adjacency matrix between the first GNN node and the adjacent nodes; updating the states of the first GNN node and the adjacent nodes with the dynamic adjacency matrix; executing the caching policy or traffic processing policy of the corresponding CDN network node according to the updated states of the first GNN node and the adjacent nodes; and routing the current CDN network node to the optimal CDN network node according to the caching policy or traffic processing policy.

Description

Distributed storage method and system based on AI flow identification
Technical Field
The invention relates to the technical field of distributed storage, in particular to a distributed storage method and system based on AI flow identification.
Background
Existing conventional distributed storage methods achieve data redundancy and backup by storing data on servers in multiple physical locations: even if one server fails, the copies stored on other servers remain available, ensuring data persistence and system stability. However, traditional distributed storage does not comprehensively analyze traffic; for example, data traffic carrying network threats or illegal content is still stored according to the distributed data slicing method, so the stored data is low in security and cleanliness, and storage resources are easily occupied maliciously when a server node is attacked. Meanwhile, in the process of scheduling storage resources, traditional distributed storage does not effectively analyze data traffic, so the distributed storage efficiency is low.
Disclosure of Invention
One object of the invention is to provide a distributed storage method and system based on AI traffic identification, which use an AI model to identify traffic data of different storage nodes. The AI model identifies network attack data and illegal content data that may exist in the traffic data, and the traffic is intercepted or scheduled based on the identified network attack data and illegal content data, thereby achieving targeted handling of network attack traffic and illegal content traffic, improving the data security of distributed storage, reducing the occupation of distributed storage resources by illegal content data, and improving the storage effect of the distributed data.
Another object of the invention is to provide a distributed storage method and system based on AI traffic identification, which use a CDN (content delivery network) as the distributed storage system, use the AI model to identify and analyze the traffic of different CDN network nodes to obtain the network attack data and illegal content data in the traffic of the different CDN network nodes, and carry out the corresponding CDN network node caching policies based on the network attack data and illegal content data.
A further object of the invention is to provide a distributed storage method and system based on AI traffic identification, which use a GNN model (graph neural network model) to simulate the node behaviors of CDN network nodes, including but not limited to traffic data. The relevant data of a CDN network node serve as the node features in the GNN model, and the results of AI analysis of the traffic data between different CDN network nodes serve as the edge features (adjacency matrix) between different nodes in the GNN model; the GNN model constructs a dynamic adjacency matrix based on the traffic features identified by the AI model in order to update the data of the corresponding CDN network nodes or execute a caching policy.
In order to achieve at least one of the above objects, the present invention further provides a distributed storage method based on AI traffic identification, the method comprising:
Acquiring current CDN network node state information, constructing a first GNN node in a GNN model by simulation according to the current CDN network node state information, and constructing adjacent nodes of the GNN model according to communication relationships between the current CDN network node and different CDN network nodes;
Acquiring communication traffic information between the first GNN node and the corresponding adjacent nodes, identifying traffic feature data in the communication traffic information by using an AI model, and constructing a dynamic adjacency matrix between the first GNN node and the adjacent nodes according to the traffic feature data;
Updating the states of the first GNN node and the adjacent nodes by using the dynamic adjacency matrix, and executing a caching policy or traffic processing policy of the corresponding CDN network node according to the updated states of the first GNN node and the adjacent nodes;
Acquiring a current CDN network node routing link table, and routing the current CDN network node to the optimal CDN network node according to the caching policy or the traffic processing policy.
According to one preferred embodiment of the invention, the current CDN network node state information comprises a hardware resource state, a network resource state and a cache state. The hardware resource state comprises the CPU utilization, memory utilization and read-write performance data of the cache disk; the network resource state comprises the uplink and downlink bandwidth utilization of the current CDN network node, the packet loss rate of the current CDN network node, the network request rate, the number of concurrent connections and the number of URLs; the cache state comprises the cache hit rate, cache eviction rate, cache warm-up state and cached content type. The hardware resource state, network resource state and cache state are each preprocessed to obtain the feature matrix of the first GNN node, and the dynamic adjacency matrix is constructed according to the traffic feature data between the first GNN node and the corresponding adjacent nodes identified by the AI model.
According to another preferred embodiment of the invention, the construction of the dynamic adjacency matrix comprises: identifying and obtaining the network traffic feature data between the first GNN node and the corresponding adjacent nodes by using AI models including an image recognition AI model, a text recognition AI model, a speech recognition AI model and a network attack recognition AI model, wherein the traffic feature data comprise illegal video frame data, illegal voice data, illegal text data, and the type and number of network attacks; preprocessing the different traffic feature data and converting them into the dynamic adjacency matrix; and updating the states of the first GNN node and the corresponding adjacent nodes according to the dynamic adjacency matrix.
According to another preferred embodiment of the present invention, the method for updating the states of the first GNN node and the corresponding adjacent nodes includes constructing a multi-head attention coefficient from the traffic feature data obtained by AI-model traffic identification, α_ij = Softmax_j(LeakyReLU(a^T [W h_i ∥ W h_j])), wherein i denotes the current first GNN node, j denotes an adjacent node having a communication relationship with the first GNN node, h_i denotes the first GNN node feature, a and W denote different learnable parameters, W being a weight matrix and a being an action parameter converted from the traffic feature data between nodes, T denotes the matrix transpose, LeakyReLU denotes the activation function, Softmax denotes the linear classification function, and the attention coefficient α_ij denotes the similar-feature weight of the first GNN node and the corresponding adjacent node.
According to another preferred embodiment of the present invention, the method for updating the states of the first GNN node and the corresponding adjacent nodes further includes computing the multi-head attention coefficients α_ij in parallel to capture the relationship types between the first GNN node i and the adjacent nodes j. Computing the update of the first GNN node from the multi-head attention coefficients α_ij includes constructing the dynamic adjacency matrix A_dynamic[i,j]_k = α_ij^k, wherein k denotes the corresponding attention head, replacing the original static adjacency matrix A_static with the dynamic adjacency matrix A_dynamic[i,j]_k, and, after the replacement, updating the first GNN node according to h_i' = σ((1/K) Σ_{k=1}^{K} Σ_{j∈N(i)} A_dynamic[i,j]_k W^k h_j), wherein h_i' denotes the updated predicted first GNN node state, N denotes the node relationship function, K denotes the total number of attention heads, and σ is an activation function; the caching policy or traffic processing policy of the corresponding current CDN network node is executed based on the updated predicted first GNN node state h_i'.
According to another preferred embodiment of the present invention, the caching policy or traffic processing policy includes: based on the updated predicted first GNN node state h_i', updating the predicted cache state of the current CDN network node so as to allow incoming caching of data from the corresponding adjacent node; rate-limiting network attack data that the current CDN network node predicts to originate from the corresponding adjacent node, or diverting such data to a specific CDN network node; and applying policies including rate limiting, access prohibition and cache prohibition to violating data that the current CDN network node predicts to originate from the corresponding adjacent node.
According to another preferred embodiment of the present invention, the CDN network nodes include a central CDN network node and edge CDN network nodes. After the current CDN network node routing link table is obtained, the central CDN network node determines the edge CDN network node closest to the current CDN network node according to the routing link table, and transmits the data of the adjacent node to that closest edge CDN network node through route conversion according to the caching policy, so as to achieve efficient processing of the cached data.
According to another preferred embodiment of the present invention, the method for calculating the distance L from the current CDN network node to the closest edge CDN network node includes obtaining the round-trip time RTT from the current CDN network node to any one edge CDN network node, calculating the hop count Hops to that edge CDN network node, and calculating the bandwidth utilization P of the current CDN network node, whereupon the distance L is computed as L = β·RTT + λ·Hops + γ·P, wherein β, λ and γ are respective weight coefficients.
In order to achieve at least one of the above objects, the present invention further provides a distributed storage system based on AI traffic identification, which performs the above-described distributed storage method based on AI traffic identification.
The present invention further provides a computer-readable storage medium storing a computer program that is executed by a processor to implement the above-described AI-traffic-identification-based distributed storage method.
Drawings
Fig. 1 shows a flow chart of a distributed storage method based on AI flow identification of the invention.
Detailed Description
The following description is presented to enable one of ordinary skill in the art to make and use the invention. The preferred embodiments in the following description are by way of example only and other obvious variations will occur to those skilled in the art. The basic principles of the invention defined in the following description may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
It will be understood that the terms "a" and "an" should be interpreted as referring to "at least one" or "one or more," i.e., in one embodiment, the number of elements may be one, while in another embodiment, the number of elements may be plural, and the term "a" should not be interpreted as limiting the number.
Referring to fig. 1, the invention discloses a distributed storage method and system based on AI traffic identification. In the method, CDN network node state information is first acquired, and GNN nodes (graph neural network nodes) are constructed by feature-conversion simulation from the CDN network node state information. The CDN network nodes are content delivery network nodes and include a central CDN network node and edge CDN network nodes; the central CDN network node is equivalent to a master control node and is used to monitor a plurality of edge CDN network nodes and to perform content delivery caching for them, and it can also perform traffic control among the plurality of edge CDN network nodes or directionally deliver the content of corresponding edge CDN network nodes to a specific CDN network node. The invention uses the CDN network nodes for distributed storage and carries out caching policies and traffic control policies in different modes according to the traffic characteristics between different CDN network nodes, in particular caching policies and traffic control policies aimed at traffic containing network attacks and violation data. The invention uses a GNN model (graph neural network model) as the driving model of the distributed storage strategy: the GNN model simulates the states of the CDN network nodes and the relationships between different CDN network nodes, the adjacency relationships of the GNN model (equivalent to the connecting edges between GNN model nodes) are constructed from the traffic feature data between different CDN network nodes, the state changes of the GNN model nodes are judged according to these adjacency relationships, and the corresponding caching policies and traffic control policies of the CDN network nodes are executed based on the updated predicted states of the GNN model nodes. Therefore, the invention can effectively improve the security of the distributed cache, effectively reduce the malicious occupation of distributed storage resources caused by abnormal traffic, and improve the resource utilization of the distributed storage.
The method for acquiring the CDN network node state information to construct the GNN model node specifically comprises acquiring the CDN network node state information, which includes a hardware resource state, a network resource state and a cache state. The hardware resource state includes, but is not limited to, the CPU utilization, GPU utilization, memory utilization and cache-disk read-write performance data of the current CDN network node, and is used to characterize the overall performance of the edge device on which the current CDN network node resides; in general, the CDN network nodes of edge devices serve as the distributed cache hardware resources, and the hardware deployed on different edge devices differs in performance. The network resource state of the CDN network node is automatically identified according to changes in the network environment of the current CDN network node; for example, during high-concurrency interaction among multiple CDN network nodes, network bandwidth may be insufficient and the network request rate may drop, so identifying the network resource state provides, to a certain extent, reference data for the CDN network node caching policy and traffic limiting policy and thereby improves the distributed storage efficiency of the CDN network nodes as a whole. The cache state includes, but is not limited to, the cache hit rate, cache eviction rate, cache warm-up state and cached content type. The cache hit rate describes the performance of the cache system; in a specific service scenario, a higher cache hit rate means the cache system of the CDN network node distributes the current service content requests more effectively, so that the CDN network node of the edge device is highly adapted to the corresponding service content. The cache hit rate can therefore serve as a state parameter of the GNN graph node to better describe the adaptability of different CDN network nodes to the related service traffic. Similarly, the state parameters converted from the cache eviction rate, cache warm-up state and cached content type also describe how well the CDN network node of the corresponding edge device is adapted to the corresponding service content, which is not described in detail in the present invention.
Furthermore, after the relevant state data of the hardware resource state, network resource state and cache state of the current CDN network node are obtained, data preprocessing including, but not limited to, normalization, standardization and feature transformation is applied so that the relevant state data of the current CDN network node become more regular, and the normalized state data are classified by label encoding (LabelEncoder) to construct different features. Label encoding (LabelEncoder) is an existing method for constructing classification features, which is not described in detail in the present invention. The GNN graph node feature matrix is constructed from the label-encoded state data of the hardware resource state, network resource state and cache state of the current CDN network node; this feature matrix is defined as the feature matrix of the first GNN node and describes the relevant state data of the current CDN network node.
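As an illustrative, non-limiting sketch of the feature construction described above (the field names, value ranges and the use of scikit-learn's LabelEncoder are assumptions of this example, not requirements of the invention), the state of one CDN network node could be converted into a first-GNN-node feature vector as follows:

    import numpy as np
    from sklearn.preprocessing import LabelEncoder

    # Hypothetical raw state of one CDN network node (illustrative names only).
    node_state = {
        "cpu_usage": 0.62,          # hardware resource state
        "mem_usage": 0.48,
        "disk_read_mbps": 410.0,
        "bandwidth_usage": 0.71,    # network resource state
        "packet_loss": 0.013,
        "cache_hit_rate": 0.87,     # cache state
        "cache_content_type": "video",
    }

    def build_feature_vector(state, content_types=("video", "audio", "text", "other")):
        """Normalize numeric state values and label-encode the categorical one."""
        encoder = LabelEncoder().fit(list(content_types))
        numeric = np.array([v for k, v in state.items() if k != "cache_content_type"],
                           dtype=float)
        # Simple min-max style scaling; real preprocessing would use fitted statistics.
        numeric = (numeric - numeric.min()) / (numeric.max() - numeric.min() + 1e-9)
        content_code = encoder.transform([state["cache_content_type"]]).astype(float)
        return np.concatenate([numeric, content_code])

    h_i = build_feature_vector(node_state)   # feature vector of the first GNN node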
It should be noted that one of the core technical points of the present invention is the construction of the adjacency matrix of the first GNN node. The adjacency matrix of the first GNN node is a matrix describing the relationship between the first GNN node and its adjacent nodes, where the adjacent nodes and the first GNN node are defined as CDN network nodes having a communication relationship, in particular CDN network nodes having a content delivery scheduling relationship. The adjacency matrix contains the traffic feature data between the first GNN node and the adjacent nodes, and the adjacency matrix constructed from these traffic feature data is used to update and predict the state changes of the first GNN node and the adjacent nodes. It should also be noted that the traffic data may flow from the first GNN node to an adjacent node or from an adjacent node to the first GNN node; the above designations may be swapped according to the actual direction of traffic transmission, which is not specifically limited in the present invention.
In the invention, the traffic feature data present in the traffic data need to be identified. Since the traffic data may include, but are not limited to, network attack data and illegal video, audio and text, the invention uses existing AI models, including an image recognition AI model, a text recognition AI model, a speech recognition AI model and a network attack recognition AI model, to identify and obtain the network traffic feature data between the first GNN node and the corresponding adjacent nodes. The traffic features identified by these existing AI models are output as traffic feature labels, and the traffic feature matrix is then constructed from these labels in the same label encoding (LabelEncoder) manner. The traffic feature labels include, but are not limited to, illegal video frame data, illegal voice data, illegal text data, and the type and number of network attacks.
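As a similarly hedged sketch, the labels produced by the traffic-recognition AI models for one pair of communicating nodes could be flattened into an edge feature vector before being converted into the dynamic adjacency matrix; the category names and the log compression below are illustrative assumptions:

    import numpy as np

    # Hypothetical per-category findings of the AI models for traffic between node i and node j.
    traffic_labels = {
        "illegal_video_frames": 3,
        "illegal_audio_segments": 0,
        "illegal_text_messages": 1,
        "network_attacks": {"ddos": 12, "sql_injection": 0},
    }

    def edge_features(labels):
        """Flatten the per-category traffic findings into one edge feature vector."""
        attack_total = sum(labels["network_attacks"].values())
        vec = np.array([
            labels["illegal_video_frames"],
            labels["illegal_audio_segments"],
            labels["illegal_text_messages"],
            attack_total,
        ], dtype=float)
        return np.log1p(vec)  # compress heavy-tailed counts before feeding the GNN

    e_ij = edge_features(traffic_labels)  # edge features between GNN nodes i and j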
It should be noted that, in a conventional GNN model, the adjacency matrix of the GNN network nodes is generally a static matrix used for message passing to update the GNN network node states; for example, the conventional GNN adjacency-matrix message passing mechanism can be written as h_i^(l+1) = σ(Σ_{j∈N(i)} A_static[i,j] W h_j^(l)), wherein A_static is the conventional static adjacency matrix, h_i^(l) is the GNN node state before the update and h_i^(l+1) is the updated predicted GNN node state (both in the form of node state feature matrices), N denotes the node relationship function, i and j denote different nodes, and W is a learnable parameter, generally the weight matrix corresponding to the node state features.
In order to adapt to dynamic traffic characteristics and to dynamically update and drive the GNN network node states, the invention replaces the conventional static adjacency matrix A_static with a dynamic adjacency matrix A_dynamic[i,j]_k based on the traffic feature data. The dynamic adjacency matrix A_dynamic[i,j]_k is realized as follows: a multi-head attention coefficient is constructed from the traffic feature data obtained by AI-model traffic identification, α_ij = Softmax_j(LeakyReLU(a^T [W h_i ∥ W h_j])), wherein i denotes the current first GNN node, j denotes an adjacent node having a communication relationship with the first GNN node, h_i denotes the first GNN node feature, a and W denote different learnable parameters, W being a weight matrix and a being an action parameter converted from the traffic feature data between nodes, T denotes the matrix transpose, LeakyReLU denotes the activation function, Softmax denotes the linear classification function, and the attention coefficient α_ij denotes the similar-feature weight of the first GNN node and the corresponding adjacent node. The multi-head attention coefficients α_ij constructed from the traffic feature data are further used to construct the dynamic adjacency matrix A_dynamic[i,j]_k; since the multi-head attention coefficients α_ij express the influence of different types of traffic features on the update of the first GNN node and the corresponding adjacent nodes, the dynamic adjacency matrix A_dynamic[i,j]_k can be constructed directly through parameter conversion.
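A minimal sketch of the attention-coefficient computation, written against the formula above in plain NumPy (the matrix shapes and the 0.2 LeakyReLU slope are assumptions of the example):

    import numpy as np

    def leaky_relu(x, slope=0.2):
        return np.where(x > 0, x, slope * x)

    def attention_coefficients(h, W, a, neighbors_of_i, i):
        """GAT-style coefficients alpha_ij over the neighbors of node i.

        h: (num_nodes, F) node feature matrix; W: (F, F') weight matrix;
        a: (2*F',) attention vector converted from traffic feature data.
        """
        Wh = h @ W
        scores = np.array([
            leaky_relu(a @ np.concatenate([Wh[i], Wh[j]]))
            for j in neighbors_of_i
        ])
        scores -= scores.max()                       # numerical stability for Softmax
        alpha = np.exp(scores) / np.exp(scores).sum()
        return dict(zip(neighbors_of_i, alpha))      # {j: alpha_ij}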
Furthermore, the update of the first GNN node is computed from the multi-head attention coefficients α_ij as follows: the dynamic adjacency matrix A_dynamic[i,j]_k = α_ij^k is constructed from the multi-head attention coefficients, wherein k denotes the corresponding attention head; the dynamic adjacency matrix A_dynamic[i,j]_k replaces the original static adjacency matrix A_static; after the replacement, the first GNN node is updated according to h_i' = σ((1/K) Σ_{k=1}^{K} Σ_{j∈N(i)} A_dynamic[i,j]_k W^k h_j), wherein h_i' denotes the updated predicted first GNN node state, N denotes the node relationship function, K denotes the total number of attention heads and σ is an activation function; the caching policy or traffic processing policy of the corresponding current CDN network node is executed based on the updated predicted first GNN node state h_i'.
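The corresponding node-state update, again as a hedged sketch of the formula above (tanh is used here as one possible choice of the activation σ):

    import numpy as np

    def update_node_state(h, W_heads, A_dynamic, i, neighbors_of_i):
        """Average-over-heads update of node i using A_dynamic[i, j, k].

        h: (N, F) node features; W_heads: list of K weight matrices of shape (F, F');
        A_dynamic: (N, N, K) dynamic adjacency matrix built from the attention coefficients.
        """
        K = len(W_heads)
        agg = np.zeros(W_heads[0].shape[1])
        for k, Wk in enumerate(W_heads):
            for j in neighbors_of_i:
                agg += A_dynamic[i, j, k] * (h[j] @ Wk)
        return np.tanh(agg / K)   # h_i': updated predicted state of the first GNN node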
The caching policy or traffic processing policy includes the following: based on the updated predicted first GNN node state h_i', updating the predicted cache state of the current CDN network node so as to allow incoming caching of data from the corresponding adjacent node; rate-limiting network attack data that the current CDN network node predicts to originate from the corresponding adjacent node, or diverting such data to a specific CDN network node; and applying policies including rate limiting, access prohibition and cache prohibition to violating data that the current CDN network node predicts to originate from the corresponding adjacent node. The above caching policies and traffic control policies are merely exemplary, and the invention is not limited thereto.
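One purely illustrative way to map the updated predicted node state onto the policies listed above; the score names and thresholds are assumptions of the example, not values defined by the invention:

    def apply_node_policy(predicted_state, attack_thr=0.7, violation_thr=0.5):
        """Map the updated node state to a caching / traffic-handling action."""
        if predicted_state["attack_score"] > attack_thr:
            return "rate_limit_or_redirect"      # divert suspected network attack traffic
        if predicted_state["violation_score"] > violation_thr:
            return "block_access_and_cache"      # refuse to cache violating content
        return "allow_incoming_cache"            # normal content: admit into the cache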
In one preferred embodiment of the present invention, in order to find the CDN network node closest to the current CDN network node, after the current CDN network node routing link table is obtained, the central CDN network node determines the edge CDN network node closest to the current CDN network node according to the routing link table, and transmits the data of the adjacent node to that closest edge CDN network node through route conversion according to the caching policy, thereby achieving efficient processing of the cached data. The distance L from the current CDN network node to the closest edge CDN network node is calculated as follows: the round-trip time RTT from the current CDN network node to any one edge CDN network node is obtained, the hop count Hops to that edge CDN network node is calculated, and the bandwidth utilization P of the current CDN network node is calculated; the distance L is then computed as L = β·RTT + λ·Hops + γ·P, wherein β, λ and γ are respective weight coefficients.
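A short sketch of the distance computation and of selecting the closest edge CDN network node, assuming the weighted-sum form of L given above and illustrative weight values:

    def node_distance(rtt_ms, hops, bandwidth_util, beta=0.5, lam=0.3, gamma=0.2):
        """L = beta*RTT + lambda*Hops + gamma*P (weights are example values)."""
        return beta * rtt_ms + lam * hops + gamma * bandwidth_util

    def closest_edge_node(candidates):
        """candidates: list of (node_id, rtt_ms, hops, bandwidth_util) tuples."""
        return min(candidates, key=lambda c: node_distance(c[1], c[2], c[3]))[0]

    # Example: three edge nodes as seen from the current CDN network node.
    edges = [("edge-a", 18.0, 3, 0.40), ("edge-b", 25.0, 2, 0.55), ("edge-c", 12.0, 5, 0.80)]
    best = closest_edge_node(edges)   # edge node with the smallest weighted distance L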
The processes described above with reference to flowcharts may be implemented as computer software programs in accordance with the disclosed embodiments of the application. Embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via a communication portion, and/or installed from a removable medium. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU). The computer readable medium of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wire segments, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless segments, radio lines, fiber optic cables, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be understood by those skilled in the art that the embodiments of the present invention described above and shown in the drawings are merely illustrative and not restrictive of the current invention, and that this invention has been shown and described with respect to the functional and structural principles thereof, without departing from such principles, and that any modifications or adaptations of the embodiments of the invention may be possible and practical.

Claims (10)

1. A distributed storage method based on AI traffic identification, characterized in that the method comprises: obtaining current CDN network node state information, constructing a first GNN node in a GNN model by simulation according to the current CDN network node state information, and constructing adjacent nodes of the GNN model according to communication relationships between the current CDN network node and different CDN network nodes; obtaining communication traffic information between the first GNN node and the corresponding adjacent nodes, identifying traffic feature data in the communication traffic information by using an AI model, and constructing a dynamic adjacency matrix between the first GNN node and the adjacent nodes according to the traffic feature data; updating the states of the first GNN node and the adjacent nodes by using the dynamic adjacency matrix, and executing a caching policy or traffic processing policy of the corresponding CDN network node according to the updated states of the first GNN node and the adjacent nodes; and obtaining a current CDN network node routing link table, and routing the current CDN network node to the optimal CDN network node according to the caching policy or traffic processing policy.
2. The distributed storage method based on AI traffic identification according to claim 1, characterized in that the current CDN network node state information comprises a hardware resource state, a network resource state and a cache state, wherein the hardware resource state comprises CPU utilization, memory utilization and read-write performance data of the cache disk; the network resource state comprises the uplink and downlink bandwidth utilization of the current CDN network node, the packet loss rate of the current CDN network node, the network request rate, the number of concurrent connections and the number of URLs; the cache state comprises the cache hit rate, cache eviction rate, cache warm-up state and cached content type; the hardware resource state, the network resource state and the cache state are each preprocessed to obtain the feature matrix of the first GNN node, and the dynamic adjacency matrix is constructed according to the traffic feature data between the first GNN node and the corresponding adjacent nodes identified by the AI model.
3. The distributed storage method based on AI traffic identification according to claim 2, characterized in that the construction of the dynamic adjacency matrix comprises: identifying and obtaining the network traffic feature data between the first GNN node and the corresponding adjacent nodes by using AI models including an image recognition AI model, a text recognition AI model, a speech recognition AI model and a network attack recognition AI model, wherein the traffic feature data comprise illegal video frame data, illegal voice data, illegal text data, and the type and number of network attacks; preprocessing the different traffic feature data and converting them into the dynamic adjacency matrix; and updating the states of the first GNN node and the corresponding adjacent nodes according to the dynamic adjacency matrix.
4. The distributed storage method based on AI traffic identification according to claim 1, characterized in that the method for updating the states of the first GNN node and the corresponding adjacent nodes comprises: constructing a multi-head attention coefficient from the traffic feature data obtained by AI-model traffic identification, α_ij = Softmax_j(LeakyReLU(a^T [W h_i ∥ W h_j])), wherein i denotes the current first GNN node, j denotes an adjacent node having a communication relationship with the first GNN node, h_i denotes the first GNN node feature, a and W denote different learnable parameters, W being a weight matrix and a being an action parameter converted from the traffic feature data between nodes, T denotes the matrix transpose, LeakyReLU denotes the activation function, Softmax denotes the linear classification function, and the attention coefficient α_ij denotes the similar-feature weight of the first GNN node and the corresponding adjacent node.
5. The distributed storage method based on AI traffic identification according to claim 4, characterized in that the method for updating the states of the first GNN node and the corresponding adjacent nodes further comprises: computing the multi-head attention coefficients α_ij in parallel to capture the relationship types between the first GNN node i and the adjacent nodes j; computing the update of the first GNN node from the multi-head attention coefficients α_ij comprises constructing the dynamic adjacency matrix A_dynamic[i,j]_k = α_ij^k, wherein k denotes the corresponding attention head, replacing the original static adjacency matrix A_static with the dynamic adjacency matrix A_dynamic[i,j]_k, and after the replacement updating the first GNN node according to h_i' = σ((1/K) Σ_{k=1}^{K} Σ_{j∈N(i)} A_dynamic[i,j]_k W^k h_j), wherein h_i' denotes the updated predicted first GNN node state, N denotes the node relationship function, K denotes the total number of attention heads and σ is an activation function; and executing the caching policy or traffic processing policy of the corresponding current CDN network node based on the updated predicted first GNN node state h_i'.
6. The distributed storage method based on AI traffic identification according to claim 5, characterized in that the caching policy or traffic processing policy comprises: based on the updated predicted first GNN node state h_i', updating the predicted cache state of the current CDN network node so as to allow incoming caching of data from the corresponding adjacent node; based on the updated predicted first GNN node state h_i', rate-limiting network attack data that the current CDN network node predicts to originate from the corresponding adjacent node, or diverting such data to a specific CDN network node; and based on the updated predicted first GNN node state h_i', applying policies including rate limiting, access prohibition and cache prohibition to violating data that the current CDN network node predicts to originate from the corresponding adjacent node.
7. The distributed storage method based on AI traffic identification according to claim 1, characterized in that the CDN network nodes comprise a central CDN network node and edge CDN network nodes; after the current CDN network node routing link table is obtained, the central CDN network node determines the edge CDN network node closest to the current CDN network node according to the current CDN network node routing link table, and transmits the data of the adjacent node to the closest edge CDN network node through route conversion according to the caching policy, thereby achieving efficient processing of the cached data.
8. The distributed storage method based on AI traffic identification according to claim 7, characterized in that the method for calculating the distance L from the current CDN network node to the closest edge CDN network node comprises: obtaining the round-trip time RTT from the current CDN network node to any one edge CDN network node, calculating the hop count Hops to the edge CDN network node, and calculating the bandwidth utilization P of the current CDN network node, whereupon the distance L is computed as L = β·RTT + λ·Hops + γ·P, wherein β, λ and γ are respective weight coefficients.
9. A distributed storage system based on AI traffic identification, characterized in that the system executes the distributed storage method based on AI traffic identification according to any one of claims 1-8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the distributed storage method based on AI traffic identification according to any one of claims 1-8.
CN202510340158.1A 2025-03-21 2025-03-21 Distributed storage method and system based on AI flow identification Pending CN119854325A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510340158.1A CN119854325A (en) 2025-03-21 2025-03-21 Distributed storage method and system based on AI flow identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510340158.1A CN119854325A (en) 2025-03-21 2025-03-21 Distributed storage method and system based on AI flow identification

Publications (1)

Publication Number Publication Date
CN119854325A true CN119854325A (en) 2025-04-18

Family

ID=95367670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510340158.1A Pending CN119854325A (en) 2025-03-21 2025-03-21 Distributed storage method and system based on AI flow identification

Country Status (1)

Country Link
CN (1) CN119854325A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102640472A (en) * 2009-12-14 2012-08-15 瑞典爱立信有限公司 Dynamic cache selection method and system
CN113705959A (en) * 2021-05-11 2021-11-26 北京邮电大学 Network resource allocation method and electronic equipment
US20210383228A1 (en) * 2020-06-05 2021-12-09 Deepmind Technologies Limited Generating prediction outputs using dynamic graphs
US20240023028A1 (en) * 2022-11-22 2024-01-18 Intel Corporation Wireless network energy saving with graph neural networks
CN118678377A (en) * 2024-08-16 2024-09-20 深圳市大数据研究院 Spectrum coverage map construction method and device based on heterogeneous graph neural network
CN119561738A (en) * 2024-11-21 2025-03-04 吉林大学 A graph neural network-based intrusion detection method, device and medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102640472A (en) * 2009-12-14 2012-08-15 瑞典爱立信有限公司 Dynamic cache selection method and system
US20210383228A1 (en) * 2020-06-05 2021-12-09 Deepmind Technologies Limited Generating prediction outputs using dynamic graphs
CN113705959A (en) * 2021-05-11 2021-11-26 北京邮电大学 Network resource allocation method and electronic equipment
US20240023028A1 (en) * 2022-11-22 2024-01-18 Intel Corporation Wireless network energy saving with graph neural networks
CN118678377A (en) * 2024-08-16 2024-09-20 深圳市大数据研究院 Spectrum coverage map construction method and device based on heterogeneous graph neural network
CN119561738A (en) * 2024-11-21 2025-03-04 吉林大学 A graph neural network-based intrusion detection method, device and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIASEN WANG et al., "Energy Saving Based on Transformer Models with LeakyRelu Activation Function", 2023 13th ICIST, 29 December 2023 (2023-12-29) *
LUO Heng, "Network Traffic Intrusion Detection Method Based on Dynamic Spatio-Temporal Graph Neural Networks", Computer Engineering, 6 March 2025 (2025-03-06) *

Similar Documents

Publication Publication Date Title
CN113037687B (en) Traffic identification method and electronic equipment
US12273275B2 (en) Dynamic allocation of network resources using external inputs
US20210303984A1 (en) Machine-learning based approach for classification of encrypted network traffic
US10592578B1 (en) Predictive content push-enabled content delivery network
CN110417903A (en) Information processing method and system based on cloud computing
CN113642700A (en) Cross-platform multimodal public opinion analysis method based on federated learning and edge computing
US20210029052A1 (en) Methods and apparatuses for packet scheduling for software- defined networking in edge computing environment
CN113535348A (en) A resource scheduling method and related device
CN117560433A (en) DPU (digital versatile unit) middle report Wen Zhuaifa order preserving method and device, electronic equipment and storage medium
CN114742166B (en) A communication network field maintenance model migration method based on delay optimization
CN113395183B (en) Virtual node scheduling method and system for network simulation platform VLAN interconnection
US12284209B2 (en) Bridging between client and server devices using proxied network metrics
CN119854325A (en) Distributed storage method and system based on AI flow identification
CN118714567A (en) Device identification and access method and device based on radio frequency information and network traffic
CN117596207A (en) Network flow control method and device, electronic equipment and storage medium
CN117215772A (en) Task processing method and device, computer equipment, storage medium and product
CN113194071B (en) Method, system and medium for detecting DDoS (distributed denial of service) based on unsupervised deep learning in SDN (software defined network)
CN116915432A (en) Method, device, equipment and storage medium for arranging calculation network security
CN110324354B (en) Method, device and system for network tracking long chain attack
CN115664826A (en) Data file encryption method and device, computing equipment and storage medium
CN114978585A (en) Deep learning symmetric encryption protocol identification method based on flow characteristics
Magaia et al. An edge-based smart network monitoring system for the Internet of Vehicles
CN119629179B (en) An information processing system based on mimicry decision method
CN119520435B (en) Communication scheduling method, apparatus, computer equipment, computer-readable storage medium, and computer program product
CN118797266B (en) Training method of prediction model and prediction method of information propagation path

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination