CN119179965B

CN119179965B - Microservice cluster global call chain construction method and system based on edge computing

Info

Publication number: CN119179965B
Application number: CN202411686965.0A
Authority: CN
Inventors: 朱韶松; 赵伟廷; 曲延盛; 马超; 刘荫; 徐彬泰; 刘函; 汤琳琳; 徐浩; 孟令震; 范少华; 王有昕; 孙文昌; 姜悦悦
Original assignee: Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Current assignee: Information and Telecommunication Branch of State Grid Shandong Electric Power Co Ltd
Priority date: 2024-11-25
Filing date: 2024-11-25
Publication date: 2025-03-21
Anticipated expiration: 2044-11-25
Also published as: CN119179965A

Abstract

The present invention proposes a method and system for constructing a global call chain of a microservice cluster based on edge computing, and relates to the field of communication technology. The method includes: obtaining a microservice cluster topology map, locating edge nodes based on data sources; collecting call chain data in the microservice cluster topology map, and transmitting it to edge nodes; the edge nodes use a trained decision tree model to filter out high-priority data in the call chain data for edge computing, build local call chains, and upload them to the central node; the high-priority data includes abnormal requests and high-latency requests; the central node aggregates local call chains to form a global call chain. The present invention deploys a lightweight machine learning model on the edge node, and based on historical data analysis, filters the data that constitutes the local call chain, and then the central node receives and merges the local call chains from each edge node to form a complete global call chain view, effectively reducing the burden of data transmission.

Description

Method and system for constructing a micro-service cluster global call chain based on edge computing

Technical Field

The invention relates to the technical field of communications, and in particular to a method and a system for constructing a micro-service cluster global call chain based on edge computing.

Background

The micro-service architecture meets the electric power information system's demanding requirements for scalability, flexibility and stability by decomposing the system into highly cohesive, loosely coupled service units. However, as the number of micro-services increases, the calling relationships between the services within the system become increasingly complex. To keep the system running efficiently, the call links between these services must be tracked in real time, and the relationships among the services must be displayed intuitively by constructing a service topology. This not only supports operation and maintenance of the system, but also helps locate the root cause quickly when an abnormality occurs.

The existing solution mainly tracks call information on the micro-service links, transmits it to an information collection module over HTTP or gRPC, and then connects the micro-services involved in upstream and downstream calls in real time to form a directed graph. However, such methods rely on centralized processing and analysis during construction, which concentrates the consumption of computing resources, introduces significant network delay during data transmission, and greatly increases transmission costs.

Edge computing is an emerging computing paradigm that can reduce the burden on the central node of the electric power information system and improve response speed and data processing capacity. However, the inventors found that current methods for constructing a micro-service call chain with edge computing lack an effective data filtering mechanism, so a large amount of invalid and redundant data exists on the edge nodes, which greatly hinders data transmission and increases the load on the central node.

Disclosure of Invention

To solve the above problems, the invention provides a method and a system for constructing a micro-service cluster global call chain based on edge computing. A lightweight machine learning model is deployed at the edge nodes and, based on analysis of historical data, screens the data that forms the local call chains; the central node then receives and merges the local call chains from all edge nodes to form a complete global call chain view. This effectively reduces the data transmission burden, provides a comprehensive, real-time view of service call relationships for operation and maintenance of the electric power information system, and greatly alleviates the problems of computing resource consumption and network delay.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

In a first aspect, the present invention provides a method for constructing a micro-service cluster global call chain based on edge computing, including:

acquiring a micro-service cluster topology map, and locating edge nodes based on data sources;

collecting call chain data in the micro-service cluster topology map and transmitting the call chain data to the edge nodes;

screening out, by the edge nodes using a trained decision tree model, high-priority data in the call chain data for edge computing, constructing local call chains, and uploading the local call chains to the central node;

aggregating, by the central node, the local call chains to form a global call chain.

Preferably, locating the edge nodes based on data sources specifically means deploying the edge nodes inside the data sources, wherein the data sources include Internet of Things devices, user terminals and local data centers.

Preferably, collecting the call chain data in the micro-service cluster topology map and transmitting it to the edge nodes specifically includes:

acquiring micro-service call relationships based on call chain data collected by the central node and the edge nodes in the micro-service cluster topology map, and constructing a preliminary directed graph;

according to the preliminary directed graph, collecting call chain data between micro-service units by using an agent tool or tracing code embedded in the micro-services, and transmitting the call chain data to the edge nodes.

Preferably, the call chain data includes logs, requests, response times and error information.

Preferably, the step in which the edge node uses a trained decision tree model to screen out high-priority data in the call chain data for edge computing, constructs a local call chain, and uploads the local call chain to the central node specifically includes:

deploying the trained decision tree model at the edge node;

using, by the edge node, the trained decision tree model to analyze and classify the call chain data, performing edge computing on the data predicted to be high priority, constructing a local call chain, and uploading the local call chain to the central node.

Preferably, the training process of the decision tree model includes:

acquiring historical call chain data of an edge node, wherein the historical call chain data includes high-priority data and other request data;

labeling the historical call chain data to distinguish high-priority data from other request data;

at each decision tree node, selecting a preferred feature from the features contained in the call chain data and splitting the data on it, wherein the preferred feature is the one that partitions the data best according to a preset evaluation criterion; constructing a decision tree by recursively selecting the best feature according to the evaluation criterion, wherein each node in the decision tree represents a test on a feature, each branch represents a value of that feature, and each leaf node contains a prediction result;

training the constructed decision tree with the labeled historical call chain data to obtain a trained decision tree model, so that the model can predict, from the features of call chain data, whether the data is high-priority data.

Preferably, the step in which the central node aggregates the local call chains to form a global call chain specifically includes:

receiving, by the central node, local call chains from different edge nodes, wherein the local call chains contain Parent Span ID information;

identifying the calling relationships between services or operations according to the Parent Span ID information, and generating corresponding directed edges, wherein each directed edge points from the calling party to the called party;

for each directed edge, constructing a corresponding call chain path according to the identified calling relationship;

integrating the data of multiple directed edges to form a global call chain;

dynamically updating, by the central node, the global call chain when a new local call chain arrives.

In a second aspect, the present invention provides a micro-service cluster global call chain construction system based on edge computing, including:

an initialization module, used for acquiring a micro-service cluster topology map and locating edge nodes based on data sources;

a call chain data acquisition module, used for collecting call chain data in the micro-service cluster topology map and transmitting the call chain data to the edge nodes;

a local call chain construction module, used for the edge nodes to screen out high-priority data in the call chain data by using a trained decision tree model for edge computing, construct local call chains, and upload the local call chains to the central node, wherein the high-priority data includes abnormal requests and high-latency requests;

a global call chain construction module, used for the central node to aggregate the local call chains to form a global call chain.

In a third aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a method for constructing a micro service cluster global call chain based on edge computation according to the first aspect.

In a fourth aspect, the present invention provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps in the method for constructing a micro service cluster global call chain based on edge computation according to the first aspect when the processor executes the program.

Compared with the prior art, the invention has the beneficial effects that:

(1) In the invention, the edge nodes are deployed close to where the data is generated and are configured appropriately, so that local data is processed efficiently and resource utilization is greatly improved. Call chain data between micro-service instances, including key information such as logs, requests and response times, is collected comprehensively by using agents such as Jaeger or Zipkin or by embedding tracing code, which provides a solid foundation for subsequent analysis. A lightweight machine learning model (a decision tree) is deployed on the edge nodes to analyze the call chain data and construct preliminary local call chains, which reduces the volume and cost of data transmission; screening high-priority data with the decision tree also allows possible performance bottlenecks or abnormal conditions to be identified quickly. The central node forms a complete global call chain view by merging the data from the edge nodes and can update it in real time, so that the global view always reflects the latest system state. In addition, the persistently stored Span data and service topology map support subsequent operation and maintenance work, further improve the response speed and stability of the system, and bring significant benefits to the operation and maintenance of the power information system.

(2) In the invention, the decision tree model can accurately distinguish high-priority data (such as abnormal requests and high-latency requests) from other data, so that key information is focused on and irrelevant data does not interfere with edge computing; this improves the efficiency and pertinence of edge computing and reduces unnecessary resource waste. Furthermore, the model is trained on historical call chain data, so it captures the data characteristics well and its predictions are more reliable. In a complex micro-service cluster environment, this screening approach can quickly locate links that may have problems, helps handle abnormal conditions in time, keeps the micro-services running stably, and improves the performance and reliability of the whole micro-service cluster. Meanwhile, local call chains are constructed and uploaded to the central node, which supports accurate construction of the global call chain.

Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.

FIG. 1 is a schematic diagram of a method for constructing a micro-service cluster global call chain based on edge computing according to an embodiment of the present invention;

FIG. 2 is a main flowchart of a method for constructing a micro-service cluster global call chain based on edge computing according to an embodiment of the present invention.

Detailed Description

The invention will be further described with reference to the drawings and examples.

Example 1

As shown in FIG. 1, this embodiment discloses a method for constructing a micro-service cluster global call chain based on edge computing, which includes the following steps:

S1, acquiring a micro-service cluster topology map, and locating edge nodes based on data sources;

S2, collecting call chain data in the micro-service cluster topology map and transmitting the call chain data to the edge nodes;

S3, the edge nodes using a trained decision tree model to screen out high-priority data in the call chain data for edge computing, constructing local call chains, and uploading the local call chains to the central node, wherein the high-priority data includes abnormal requests and high-latency requests;

S4, the central node aggregating the local call chains to form a global call chain.

Next, a detailed description will be given of a method for constructing a micro service cluster global call chain based on edge computation disclosed in this embodiment with reference to fig. 2.

In S1, the micro-service cluster topology map specifically includes information such as the locations of the micro-service units in the cluster, the connection relationships between the micro-service units, and any hierarchical structure of the entire cluster (such as the arrangement of the central node and the edge nodes).

The edge nodes are deployed inside the data sources, such as Internet of Things devices, user terminals or local data centers. Resources such as processors, storage and network bandwidth are allocated to the edge nodes according to the load at each location, to ensure that the edge nodes can process local data efficiently.

S2 specifically includes the following steps:

S201, acquiring micro-service call relationships based on call chain data collected by the central node and the edge nodes in the micro-service cluster topology map, and constructing a preliminary directed graph;

S202, according to the preliminary directed graph, collecting call chain data between micro-service units by using an agent tool or tracing code embedded in the micro-services, and transmitting the call chain data to the edge nodes.

In S201, the micro-service call relationships are constructed based on call chain data collected by the central node and the edge nodes, and a preliminary directed graph is generated, in which each node represents a micro-service unit (in practice, a micro-service may consist of multiple container groups deployed on multiple nodes).
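As an illustration only, the preliminary directed graph of S201 could be held as an adjacency map keyed by micro-service unit. The sketch below assumes the call relationships have already been reduced to (caller, callee) pairs; the function name `build_preliminary_graph` and the service names are hypothetical and do not come from the embodiment.

```python
from collections import defaultdict


def build_preliminary_graph(call_records):
    """Build a directed graph from (caller, callee) pairs.

    Each key is a micro-service unit; its value is the set of units it calls.
    """
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        # Make sure services that only receive calls also appear as nodes.
        graph.setdefault(callee, set())
    return dict(graph)


# Hypothetical call records; the repeated pair collapses into one edge.
records = [
    ("gateway", "order"),
    ("order", "inventory"),
    ("order", "payment"),
    ("gateway", "order"),
]
graph = build_preliminary_graph(records)
```

Using sets for the adjacency lists means repeated calls between the same two units do not create duplicate edges, which is sufficient for a topology view; call counts would need a different structure.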

In S202, call chain data between the corresponding micro-service instances at run time, including logs, requests, response times, error information and the like, may be collected by agents such as Jaeger or Zipkin, or by tracing code embedded in the micro-service code.

Specifically, a suitable agent, such as OpenTelemetry, Jaeger or Zipkin, may be selected according to the system architecture and requirements. These agents are typically capable of capturing call chain data automatically or semi-automatically. Edge computing applications are packaged and deployed onto the edge nodes using container technology (e.g., Docker), and automation tools (e.g., Kubernetes) are used to manage the deployment, scaling and upgrade of the edge nodes. A suitable operating system and middleware are installed so that the edge nodes can run the data acquisition and processing tools.

As a specific implementation, Jaeger's tracing library (tracer) embedded in each micro-service is used to collect the call chain data of that service. Each service call is treated as a separate unit and labeled as a "Span". A Span is like a record of one service invocation and contains the key information about that invocation, including the service name, start time, duration, and context information (e.g., Trace ID, Span ID, Parent Span ID).

Where Trace ID is an ID that is used to uniquely identify a complete distributed transaction or business process. In a complex micro-service architecture, there may be collaboration among multiple services to complete a complete business operation, such as a user ordering may involve collaboration of multiple micro-services such as order service, inventory service, payment service, etc. Trace ID is like a globally unique label to the whole business process, and all service calls related to the business process can be connected in series through the ID, so that the execution condition of the whole business process can be conveniently tracked and analyzed.

Each "Span" has a unique Span ID that is used to uniquely identify a particular service call within the same "Trace" (the complete business process identified by the Trace ID). That is, in a business process that is composed of multiple service calls at a time, each individual service call has its own unique Span ID, through which the corresponding service call activity can be accurately found among the numerous service calls in the business process.

Jaeger uses parent-child relationships to represent calling and called parties. Each Span has a Span ID and may have one or more child Spans. Specifically, the Parent Span ID works as follows: when there is a call relationship between services, for example service A calls service B, the Span generated by the call operation of service A is the parent Span, and the Span generated by the corresponding operation of service B is the child Span. The child Span carries a Parent Span ID that points to the Span ID of the parent Span, which clearly indicates which parent Span initiated the service call that the child Span represents. This establishes a hierarchical relationship between the calls and makes the structure and flow direction of the entire call chain easier to understand.
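The relationship between Trace ID, Span ID and Parent Span ID can be sketched with a small record type. This is a simplified illustration, not Jaeger's actual data model or client API; the `Span` class, its field names and the helper `children_by_parent` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """Simplified span record (hypothetical field names)."""
    trace_id: str                   # identifies the whole business process
    span_id: str                    # identifies one service call in the trace
    parent_span_id: Optional[str]   # None for the root span of a trace
    service_name: str
    start_time_ms: int
    duration_ms: int


def children_by_parent(spans: List[Span]) -> Dict[str, List[Span]]:
    """Index child spans by the Span ID of the parent that initiated them."""
    index: Dict[str, List[Span]] = {}
    for span in spans:
        if span.parent_span_id is not None:
            index.setdefault(span.parent_span_id, []).append(span)
    return index


# Service A calls service B within trace "t1": span "a" is the parent of "b".
spans = [
    Span("t1", "a", None, "service-a", 0, 120),
    Span("t1", "b", "a", "service-b", 10, 80),
]
children = children_by_parent(spans)
```

Looking up `children["a"]` returns the spans that service A's call initiated, which is the parent-to-child direction later used to draw directed edges from the calling party to the called party.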

The tracking library for each service reports the collected Span data to Jaeger's Agent. This agent is typically deployed alongside each node or service instance.

In this embodiment, call chain data between micro-service units is collected by using an agent tool or tracing code embedded in the micro-services. This enables comprehensive and accurate collection: calls of any type can be recorded and no link is omitted, which ensures the integrity and accuracy of the data. The data can also be collected in real time, so new calls are captured while the micro-services are running and the operating state can be tracked dynamically. When a problem occurs, the calling process can be traced quickly from the collected call chain data and the fault point can be located accurately, which supports micro-service performance optimization and troubleshooting.

S3 specifically includes:

S301, deploying the trained decision tree model at the edge node;

S302, the edge node using the trained decision tree model to analyze and classify the call chain data, performing edge computing on the data predicted to be high priority, constructing a local call chain, and uploading the local call chain to the central node.

The training process of the decision tree model includes the following steps:

S3021, acquiring historical call chain data of an edge node, wherein the historical call chain data includes high-priority data and other request data;

S3022, labeling the historical call chain data to distinguish high-priority data from other request data;

S3023, at each decision tree node, selecting a preferred feature from the features contained in the call chain data and splitting the data on it, wherein the preferred feature is the one that partitions the data best according to a preset evaluation criterion; constructing a decision tree by recursively selecting the best feature according to the evaluation criterion, wherein each node in the decision tree represents a test on a feature, each branch represents a value of that feature, and each leaf node contains a prediction result;

S3024, training the constructed decision tree with the labeled historical call chain data to obtain a trained decision tree model, so that the model can predict, from the features of call chain data, whether the data is high-priority data.

Specifically, in the process of constructing a micro-service call chain by edge calculation, a trained decision tree model plays an important role.

First, the decision tree model is trained. Historical call chain data containing high-priority data (e.g., abnormal requests and high-latency requests) and other request data is obtained from the edge nodes and labeled to distinguish the different types of data. At each node of the decision tree, the preferred feature, i.e., the feature that partitions the data best according to a preset evaluation criterion (such as information gain), is selected from the features of the call chain data and used to split the data. The decision tree is built by recursively selecting the best feature, where each node represents a test on a feature, each branch corresponds to a feature value, and each leaf node contains a prediction result. The model is trained with the labeled historical data so that it can predict, from the features of the data, whether the data is high priority.
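The training procedure can be sketched as a minimal ID3-style tree that uses information gain as the evaluation criterion. This is an illustrative sketch only: the feature names (`has_error`, `slow`, `method`), the discretization of response time into a boolean `slow` flag, the tiny labeled dataset and the nested-dictionary tree format are all assumptions, and a production system would more likely rely on an established library.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by partitioning the samples on one feature."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)
    remainder = sum(len(p) / len(labels) * entropy(p)
                    for p in partitions.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, features):
    """Recursively split on the highest-gain feature; leaves hold a label."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return {"label": majority}
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    if information_gain(rows, labels, best) <= 0:
        return {"label": majority}
    node = {"feature": best, "default": majority, "branches": {}}
    remaining = [f for f in features if f != best]
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["branches"][value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining)
    return node


def predict(tree, row):
    """Walk from the root to a leaf; fall back to the majority label
    of the current node when a feature value was never seen in training."""
    while "label" not in tree:
        branch = tree["branches"].get(row[tree["feature"]])
        if branch is None:
            return tree["default"]
        tree = branch
    return tree["label"]


# Hypothetical labeled history: "high" = abnormal or high-latency request.
rows = [
    {"has_error": True,  "slow": False, "method": "GET"},
    {"has_error": True,  "slow": True,  "method": "POST"},
    {"has_error": False, "slow": True,  "method": "GET"},
    {"has_error": False, "slow": False, "method": "GET"},
    {"has_error": False, "slow": False, "method": "POST"},
    {"has_error": False, "slow": False, "method": "GET"},
    {"has_error": True,  "slow": False, "method": "POST"},
    {"has_error": False, "slow": False, "method": "POST"},
]
labels = ["high", "high", "high", "other", "other", "other", "high", "other"]
tree = build_tree(rows, labels, ["has_error", "slow", "method"])
```

On this toy data the root test is `has_error`, because it yields the largest information gain, while `method` carries no information about priority and is never selected.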

The trained decision tree model is then deployed at the edge node. The edge node uses the model to analyze the call chain data and predict which data is high priority. The edge node then performs edge computing on the high-priority data and builds a local call chain from the processed high-priority data. It should be appreciated that building a local call chain based on edge computing is within the capabilities of those skilled in the art.

In this process, the edge node captures the Span data generated by the local micro-services through the agent, parses it to extract key information (such as the Trace ID and Span ID), and constructs a local call chain in the form of a directed graph according to the parent-child relationships. The edge node can also preprocess the data, including filtering irrelevant data, compressing the data and aggregating statistics, which reduces the data transmission burden, and it can independently complete a preliminary analysis of the local call chain, such as detecting anomalies or performance bottlenecks. Priority levels are defined according to service requirements, and specific events (such as exceptions, timeouts and system faults) are marked as high priority. The edge node uploads high-priority or abnormal call chain data to the central node, while low-priority data may be withheld or uploaded later. The central node is responsible for building and maintaining the global view, which enhances system stability.
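The edge-side flow described above can be sketched as follows. The sketch is a simplification under stated assumptions: spans are plain dictionaries with hypothetical keys, and `is_high_priority` is a hand-written stand-in for the trained decision tree, with a 500 ms latency threshold chosen only for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

Span = Dict[str, Any]


def is_high_priority(span: Span) -> bool:
    """Stand-in for the trained decision tree (threshold is an assumption)."""
    return bool(span["error"]) or span["duration_ms"] > 500


def build_local_chain(
    spans: List[Span], classify: Callable[[Span], bool]
) -> Tuple[Dict[str, list], List[Span]]:
    """Split spans by predicted priority and derive caller -> callee edges.

    Returns the local call chain to upload and the spans whose upload
    may be skipped or delayed.
    """
    by_id = {span["span_id"]: span for span in spans}
    upload, deferred = [], []
    for span in spans:
        (upload if classify(span) else deferred).append(span)
    edges = []
    for span in upload:
        parent = by_id.get(span["parent_span_id"])
        if parent is not None:
            # Directed edge from the calling service to the called service.
            edges.append((parent["service"], span["service"]))
    return {"spans": upload, "edges": edges}, deferred


# Hypothetical spans captured on one edge node for a single trace.
captured = [
    {"span_id": "a", "parent_span_id": None, "service": "gateway",
     "duration_ms": 620, "error": False},
    {"span_id": "b", "parent_span_id": "a", "service": "order",
     "duration_ms": 580, "error": False},
    {"span_id": "c", "parent_span_id": "b", "service": "inventory",
     "duration_ms": 12, "error": False},
    {"span_id": "d", "parent_span_id": "b", "service": "payment",
     "duration_ms": 30, "error": True},
]
local_chain, deferred = build_local_chain(captured, is_high_priority)
```

Here the fast, error-free inventory call is held back, while the slow gateway and order calls and the failed payment call form the local call chain that is uploaded to the central node.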

This embodiment uses a trained lightweight machine learning model, namely a decision tree model, to screen high-priority data, which effectively filters out the large amount of invalid and redundant data on the edge nodes. The decision tree model is trained on historical call chain data, has learned the characteristics of high-priority data, can accurately distinguish key data such as abnormal requests and high-latency requests, and keeps irrelevant data out of the subsequent flow. Furthermore, because only high-priority data is screened out for edge computing and transmission, the burden of transmitting invalid and redundant data is removed, transmission is no longer obstructed by large volumes of invalid data, and data transmission efficiency improves. In addition, because the edge nodes screen the data effectively, the amount of data transmitted to the central node is greatly reduced and the central node does not need to process large amounts of useless data, which reduces its load and improves the performance and stability of the entire edge-computing-based micro-service call chain system.

S4 specifically includes:

S401, the central node receiving local call chains from different edge nodes, wherein the local call chains contain Parent Span ID information;

S402, identifying the calling relationships between services or operations according to the Parent Span ID information, and generating corresponding directed edges, wherein each directed edge points from the calling party to the called party;

S403, for each directed edge, constructing a corresponding call chain path according to the identified calling relationship;

S404, integrating the data of multiple directed edges to form a global call chain;

S405, when a new local call chain arrives, the central node dynamically updating the global call chain.

After receiving call chain data from multiple edge nodes, the central node merges the local data to form a complete global call chain view, and updates the global view promptly when the data uploaded by the edge nodes indicates that the topology has changed. The Span data and the service topology map are persisted.

The central node integrates the Span data from different service instances. According to the Parent Span ID information in the Span data, the central node identifies the calling relationships between services or operations and generates the corresponding directed edges, each pointing from the calling party to the called party. The call chain corresponding to each Trace usually forms a path, and the data of multiple Traces can be integrated into a complete topology that shows all possible call paths among the services. When new Span data arrives, the central node dynamically updates the topology, so that the global call chain view always reflects the latest system state. If the service structure of the system changes (e.g., services are added or deleted), the topology map is updated accordingly.
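A minimal sketch of this aggregation is given below. It assumes spans arrive as dictionaries with hypothetical keys, that a span and its parent may be reported by different edge nodes in any order, and that the global view is a map from (caller, callee) service pairs to call counts; the class name `GlobalCallChain` is illustrative and not taken from the embodiment.

```python
from collections import defaultdict


class GlobalCallChain:
    """Merges local call chains into a service-level directed graph."""

    def __init__(self):
        self.spans = {}                    # (trace_id, span_id) -> span
        self.edges = defaultdict(int)      # (caller, callee) -> call count
        self._waiting = defaultdict(list)  # parent key -> children seen first

    def add_local_chain(self, spans):
        """Incorporate newly uploaded spans and update the edges in place."""
        for span in spans:
            key = (span["trace_id"], span["span_id"])
            if key in self.spans:
                continue  # duplicate upload, already accounted for
            self.spans[key] = span
            self._link(span)
            # Link children that arrived before this (parent) span.
            for child in self._waiting.pop(key, []):
                self._link(child)

    def _link(self, span):
        """Create the directed edge from the caller to this span's service."""
        if span["parent_span_id"] is None:
            return  # root span has no caller
        parent_key = (span["trace_id"], span["parent_span_id"])
        parent = self.spans.get(parent_key)
        if parent is None:
            self._waiting[parent_key].append(span)
        else:
            self.edges[(parent["service"], span["service"])] += 1


chain = GlobalCallChain()
# Edge node 1 reports the order and payment spans before their root arrives.
chain.add_local_chain([
    {"trace_id": "t1", "span_id": "b", "parent_span_id": "a", "service": "order"},
    {"trace_id": "t1", "span_id": "d", "parent_span_id": "b", "service": "payment"},
])
# Edge node 2 later reports the root span, which completes the path.
chain.add_local_chain([
    {"trace_id": "t1", "span_id": "a", "parent_span_id": None, "service": "gateway"},
])
```

Keeping unmatched children in a waiting list lets the view be updated incrementally as each local call chain arrives, instead of rebuilding the whole graph on every upload. Removing services that disappear from the system is not covered by this sketch.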

After integration, the central node obtains the Span data and the corresponding service topology map. This key information must be persisted in a database to provide data support for subsequent adjustment of the dynamic monitoring strategy and for fault tracing when the information system behaves abnormally.
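The embodiment does not name a particular database. As one possible illustration, the sketch below persists spans and topology edges with Python's built-in `sqlite3` module; the table names, column names and helper functions are assumptions made for the example.

```python
import sqlite3


def persist(conn, spans, edges):
    """Store span rows and service-topology edges, creating tables if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spans ("
        "trace_id TEXT, span_id TEXT, parent_span_id TEXT, "
        "service TEXT, duration_ms INTEGER, "
        "PRIMARY KEY (trace_id, span_id))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS topology ("
        "caller TEXT, callee TEXT, PRIMARY KEY (caller, callee))"
    )
    # Re-uploaded spans overwrite the stored row; known edges are kept once.
    conn.executemany("INSERT OR REPLACE INTO spans VALUES (?, ?, ?, ?, ?)", spans)
    conn.executemany("INSERT OR IGNORE INTO topology VALUES (?, ?)", edges)
    conn.commit()


def spans_for_trace(conn, trace_id):
    """Fetch the stored spans of one trace, e.g. for fault tracing."""
    cursor = conn.execute(
        "SELECT span_id, service FROM spans WHERE trace_id = ? ORDER BY span_id",
        (trace_id,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")  # a file path would be used in practice
persist(
    conn,
    spans=[("t1", "a", None, "gateway", 620), ("t1", "b", "a", "order", 580)],
    edges=[("gateway", "order")],
)
trace_rows = spans_for_trace(conn, "t1")
```

Querying by Trace ID retrieves every stored call of one business process, which is the lookup needed when tracing a fault back through the call chain.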

This embodiment discloses a dynamic service topology construction method that can be used in a micro-service cluster of a power information system. By using Jaeger's agent in conjunction with the edge nodes, distributed tracing data can be collected and preliminarily processed efficiently. The edge nodes not only reduce the pressure of data transmission, but also improve the response speed of the system through local analysis, thereby reducing the burden on the central node and ensuring that the tracing data of the system is processed in a timely and effective manner.

The embodiment creatively utilizes the decision tree model to screen the high-priority data, and the decision tree is trained based on historical call chain data, so that the high-priority data (such as abnormal requests and high-delay requests) and other data can be accurately distinguished. Therefore, invalid data interference can be avoided, key information is focused, edge calculation efficiency is improved, and resource waste is reduced. Meanwhile, the edge node calculates the high-priority data and constructs a local call chain, so that the load of the center node is reduced. The central node generates a global call chain by processing the local call chain, so that the high efficiency and stability of the system are ensured. The method can rapidly locate the problem link and improve the performance of the whole micro-service cluster.

Example 2

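This embodiment provides a micro-service cluster global call chain construction system based on edge computing, which comprises the following modules:

an initialization module, used for acquiring a micro-service cluster topology map and locating edge nodes based on data sources;

a call chain data acquisition module, used for collecting call chain data in the micro-service cluster topology map and transmitting the call chain data to the edge nodes;

a local call chain construction module, used for the edge nodes to screen out high-priority data in the call chain data by using a trained decision tree model for edge computing, construct local call chains, and upload the local call chains to the central node, wherein the high-priority data includes abnormal requests and high-latency requests;

a global call chain construction module, used for the central node to aggregate the local call chains to form a global call chain.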

Example 3

The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in a method for constructing a micro service cluster global call chain based on edge computation as described in the above embodiment.

Example 4

The present embodiment provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the steps in the method for constructing a micro service cluster global call chain based on edge computation according to the above embodiment when executing the program.

The steps or modules in the second to fourth embodiments correspond to those of the first embodiment; for details, refer to the relevant description of the first embodiment. The term "computer-readable storage medium" shall be taken to include a single medium or multiple media that include one or more sets of instructions, and shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the processor and that causes the processor to perform any one of the methods of the present invention.

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for constructing a micro-service cluster global call chain based on edge computing, characterized by comprising the following steps:

the central node aggregates the local call chains to form a global call chain.

2. The method for constructing a micro service cluster global call chain based on edge calculation according to claim 1, wherein the edge node is positioned based on a data source, specifically, the edge node is deployed inside the data source, and the data source comprises Internet of things equipment, a user terminal and a local data center.

3. The method for constructing a global call chain of a micro service cluster based on edge computation according to claim 1, wherein the collecting call chain data in a micro service cluster topology map and transmitting the call chain data to an edge node is specifically as follows:

4. The method for constructing a micro service cluster global call chain based on edge computation according to claim 1, wherein the call chain data includes log, request, response time and error information.

5. The method for constructing a global call chain of a micro service cluster based on edge calculation according to claim 1, wherein the edge node uses a trained decision tree model to screen out high priority data in call chain data for edge calculation, and constructs a local call chain, and uploads the local call chain to a central node, and the method specifically comprises:

deploying a trained decision tree model at the edge node;

6. The method for constructing a global call chain of a micro service cluster based on edge computation according to claim 5, wherein the training process of the decision tree model comprises:

7. The method for constructing a global call chain of a micro service cluster based on edge calculation according to claim 1, wherein the central node aggregating the local call chains to form a global call chain specifically comprises:

Integrating a plurality of directed edge data to form a global call chain;

8. A micro-service cluster global call chain construction system based on edge computation, comprising:

9. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of a method for constructing a micro service cluster global call chain based on edge computation according to any one of claims 1-7.

10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of a method for constructing a micro-service cluster global call chain based on edge computation according to any one of claims 1-7 when the program is executed by the processor.