CN114840392B - Task scheduling anomaly monitoring method, device, medium and program product

Info

Publication number: CN114840392B
Application number: CN202210646351.4A
Authority: CN
Inventors: 刘林; 王志远
Original assignee: WeBank Co Ltd
Current assignee: WeBank Co Ltd
Priority date: 2022-06-09
Filing date: 2022-06-09
Publication date: 2025-08-26
Anticipated expiration: 2042-06-09
Also published as: CN114840392A

Abstract

The present application provides a task scheduling anomaly monitoring method, device, medium and program product, which obtains historical task data of one or more historical task cycles according to the work plan of the current task cycle; determines multiple task time-consuming types and scheduling relationship maps based on the historical task data, and the ratio of the amount of data contained in each task time-consuming type to the total amount of time-consuming data meets the preset proportion requirement; determines at least one timed detection task based on multiple task time-consuming types and scheduling relationship maps; and predicts whether the probability of anomalies in the task scheduling of the target system meets the preset warning requirements based on the detection results of each timed detection task at each detection time point; if so, determines and outputs one or more warning information. This solves the technical problems of existing anomaly monitoring, such as poor response time, inflexible configuration, and only logical-level monitoring, which is not highly coupled with actual business.

Description

Task scheduling abnormality monitoring method, device, medium and program product

Technical Field

The present application relates to the field of financial science and technology (Fintech), and in particular, to a method, apparatus, medium, and program product for monitoring task scheduling anomalies.

Background

With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). At present, the offline data that the finance and internet industries must process every day is large in scale and subject to strict timeliness requirements. Financial enterprises in particular must process a large amount of regulatory reporting data; if a data processing task is not completed, the relevant data cannot be delivered on time, and the enterprise may even be held accountable by regulators, which affects its rating and reputation. Monitoring the task scheduling system and responding to its anomalies is therefore particularly important.

Currently, open-source task scheduling frameworks and tools such as Azkaban, Airflow, and Oozie are mature, but most monitoring functions configured or developed with them are based on the abnormal state of a task or on fixed parameters configured from experience. As a result, by the time a monitoring alarm is raised, the task is usually already in an abnormal state or has already had an actual impact.

Existing anomaly monitoring therefore suffers from poor response timeliness, inflexible configuration, monitoring only at the logic level, and low coupling with the actual business, all of which increase the difficulty and workload of operation and maintenance.

Disclosure of Invention

The application provides a task scheduling anomaly monitoring method, device, medium and program product, which are used to solve the technical problems of existing anomaly monitoring: poor response timeliness, inflexible configuration, monitoring only at the logic level, and low coupling with the actual business.

In a first aspect, the present application provides a method for monitoring task scheduling abnormality, including:

acquiring, according to the work plan of the current task period, historical task data of one or more historical task periods, wherein the similarity between the historical work plan of each historical task period and the work plan of the current task period meets a preset requirement, and the historical task data comprises configuration data of each historical task and time-consuming data for executing each historical task;

cyclically performing clustering processing on the historical tasks according to the time-consuming data by using a preset clustering model until a plurality of task time-consuming types are determined, wherein the ratio of the amount of data contained in each task time-consuming type to the total amount of the time-consuming data meets a preset proportion requirement;

determining a scheduling relation graph according to the configuration data, and determining at least one timing detection task according to the task time-consuming types and the scheduling relation graph, wherein the scheduling relation graph is used to represent the dependency relations by which the historical tasks call one another's processing results;

and pre-judging, according to the detection results of each timing detection task at each detection time point, whether the probability of a task scheduling anomaly in the target system meets a preset early warning requirement, and if so, determining and outputting one or more pieces of early warning information.
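As an illustration of the scheduling relation graph named in the third step, the following minimal Python sketch builds upstream and downstream dependency maps from task configuration data. The configuration layout (a mapping from task name to the tasks whose results it consumes), the task names, and the function name `build_scheduling_graph` are all assumptions made for this example; the application does not specify a configuration format.

```python
from graphlib import TopologicalSorter

# Hypothetical configuration data: task name -> upstream tasks whose
# processing results it consumes. This layout is an assumption.
task_config = {
    "load_raw": [],
    "clean": ["load_raw"],
    "aggregate": ["clean"],
    "report": ["aggregate", "clean"],
}

def build_scheduling_graph(config):
    """Return (upstream, downstream) adjacency maps for the task graph."""
    upstream = {task: set(deps) for task, deps in config.items()}
    downstream = {task: set() for task in config}
    for task, deps in config.items():
        for dep in deps:
            downstream.setdefault(dep, set()).add(task)
    return upstream, downstream

upstream, downstream = build_scheduling_graph(task_config)

# One valid execution order: every task appears after all of its upstream tasks.
execution_order = list(TopologicalSorter(upstream).static_order())
```

The downstream map is what a monitor would consult to find which tasks are affected when an upstream task is delayed.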

In one possible design, the one or more historical task cycles include a last task cycle that is closest to the current cycle, or a consecutive plurality of task cycles that is closest to the current cycle.

In one possible design, cyclically performing clustering processing on the historical tasks according to the time-consuming data by using a preset clustering model until a plurality of task time-consuming types are determined includes:

randomly extracting the time-consuming data of a plurality of historical tasks from all the historical tasks to serve as clustering centers;

performing first clustering processing on the historical tasks according to the clustering centers by using the preset clustering model to determine one or more first time-consuming types;

judging whether the data volume proportion of each first time-consuming type meets the preset proportion requirement;

if yes, determining that the first time-consuming type is a task time-consuming type;

if not, re-determining the clustering centers and performing the clustering processing again to re-determine the first time-consuming types, until the data volume proportion of each first time-consuming type meets the preset proportion requirement;

wherein the data volume proportion characterizes the ratio of the amount of data contained in a first time-consuming type to the total amount of the time-consuming data.

In one possible design, the preset proportion requirement includes the data volume proportion being greater than or equal to a first proportion threshold and less than or equal to a second proportion threshold.

Optionally, the first value range of the first proportion threshold includes 1%-10%, and the second value range of the second proportion threshold includes 40%-60%.

In one possible design, re-determining the clustering centers and performing the clustering processing again includes:

deleting each first time-consuming type whose data volume proportion is less than the first proportion threshold, and/or,

randomly extracting at least two historical tasks from each first time-consuming type whose data volume proportion is greater than the second proportion threshold to serve as new clustering centers;

for each first time-consuming type that meets the preset proportion requirement, re-determining a new clustering center in a preset manner;

and performing the clustering processing again according to the new clustering centers by using the preset clustering model to determine new first time-consuming types.

In one possible design, for a first time-consuming type that meets the preset proportion requirement, re-determining a new clustering center in a preset manner includes:

when the first time-consuming type meets the preset proportion requirement, taking the average time consumption of the first time-consuming type as the new clustering center.
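The cyclic clustering described in the designs above can be sketched as follows. This is a minimal illustration, not the claimed implementation: one-dimensional nearest-center assignment stands in for the unspecified "preset clustering model", and the names `assign` and `cluster_until_balanced`, the default thresholds, the round limit, and the `None` return value when no balanced split is found are all assumptions.

```python
import random
from statistics import mean

def assign(durations, centers):
    """Group every duration with its nearest clustering center."""
    clusters = {c: [] for c in centers}
    for d in durations:
        nearest = min(centers, key=lambda c: abs(d - c))
        clusters[nearest].append(d)
    return [members for members in clusters.values() if members]

def cluster_until_balanced(durations, k=3, low=0.05, high=0.5,
                           max_rounds=100, seed=0):
    """Re-cluster until every type's share of the data lies in [low, high].

    Returns the clusters, or None if no balanced split is found within
    max_rounds (the source does not say what happens in that case).
    """
    rng = random.Random(seed)
    centers = rng.sample(durations, k)           # random initial centers
    total = len(durations)
    for _ in range(max_rounds):
        clusters = assign(durations, sorted(set(centers)))
        shares = [len(c) / total for c in clusters]
        if all(low <= s <= high for s in shares):
            return clusters
        centers = []
        for members, share in zip(clusters, shares):
            if share < low:
                continue                          # delete the undersized type
            if share > high:
                # split the oversized type: draw two members as new centers
                centers += rng.sample(members, min(2, len(members)))
            else:
                centers.append(mean(members))     # keep it, centered on its mean
        if not centers:                           # assumed fallback: start over
            centers = rng.sample(durations, k)
    return None
```

Calling `cluster_until_balanced(durations)` on a list of task durations returns one list per task time-consuming type. Convergence is not guaranteed for arbitrary data, which is why the sketch caps the number of rounds.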

In one possible design, determining at least one timing detection task based on a plurality of task time consuming types and a scheduling relationship graph includes:

according to preset screening requirements, determining a first target type and a second target type from the time-consuming types of each task;

determining a first fluctuation range and a second fluctuation range according to time-consuming data in the first target type and the second target type by using a preset fluctuation algorithm;

And determining detection objects and detection time of each timing detection task according to the scheduling relation map, the first fluctuation range, the second fluctuation range and the starting time of the execution of the historical task in the time-consuming data.

In one possible design, determining the first fluctuation range and the second fluctuation range according to the time-consuming data in the first target type and the second target type by using a preset fluctuation algorithm includes:

determining the first fluctuation range according to the first average time consumption and the first standard deviation of all time-consuming data in the first target type;

and determining the second fluctuation range according to the second average time consumption and the second standard deviation of all time-consuming data in the second target type.

In one possible design, determining the first fluctuation range according to the first average time consumption and the first standard deviation of all time-consuming data in the first target type includes:

the first fluctuation range is equal to the sum of the first average time consumption and N times the first standard deviation;

determining the second fluctuation range according to the second average time consumption and the second standard deviation of all time-consuming data in the second target type includes:

the second fluctuation range is equal to the difference between the second average time consumption and M times the second standard deviation.

In one possible design, the detection time includes a first detection time and a second detection time, wherein the first detection time is obtained by adding the first fluctuation range to the start time, and the second detection time is obtained by adding the second fluctuation range to the start time.
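The two fluctuation ranges and the detection times derived from them can be worked through in a short Python sketch. Several details are assumptions made for illustration: durations are taken to be in minutes, the population standard deviation (`statistics.pstdev`) is used because the text does not say which form applies, and the function names and the default values of N and M are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

def fluctuation_ranges(first_type, second_type, n=2.0, m=2.0):
    """first range = mean + N * std; second range = mean - M * std."""
    first_range = mean(first_type) + n * pstdev(first_type)
    second_range = mean(second_type) - m * pstdev(second_type)
    return first_range, second_range

def detection_times(start, first_range, second_range):
    """Add each fluctuation range (in minutes, by assumption) to the start time."""
    return (start + timedelta(minutes=first_range),
            start + timedelta(minutes=second_range))

# Hypothetical durations (minutes) for the two target types.
first_range, second_range = fluctuation_ranges([20, 40], [50, 70])
first_check, second_check = detection_times(
    datetime(2022, 6, 9, 1, 0), first_range, second_range)
```

With these sample values the first type has mean 30 and standard deviation 10, so the first fluctuation range is 30 + 2 × 10 = 50 minutes; the second type has mean 60 and standard deviation 10, so the second fluctuation range is 60 - 2 × 10 = 40 minutes.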

In one possible design, according to the detection results of each timing detection task at each detection time point, it is pre-determined whether the probability of abnormal task scheduling of the target system meets a preset early warning requirement, including:

if it is determined from the detection result that the execution of the detection object is not complete at the first detection time, determining that a first probability of an anomaly in the task execution progress meets the early warning requirement;

if it is determined from the detection result that the execution of the detection object is already complete at the second detection time, determining that a second probability of an anomaly in the data magnitude of the tasks scheduled by the target system meets the early warning requirement.
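The two checks above reduce to a small decision function, sketched here in Python. The names `Alert` and `evaluate` and the string labels for the two detection times are hypothetical. The alert descriptions reflect one plausible reading, namely that a task still running at the first detection time is late and a task already finished at the second detection time has completed unusually early, which may point to an unusual data volume.

```python
from enum import Enum

class Alert(Enum):
    PROGRESS_ANOMALY = "task is still running at the first detection time"
    DATA_MAGNITUDE_ANOMALY = "task has already finished at the second detection time"

def evaluate(check, finished):
    """Map one detection result to an alert, or None if nothing is flagged.

    check    -- "first" or "second", naming the detection time (assumed labels)
    finished -- whether the detection object had completed at that time
    """
    if check == "first" and not finished:
        return Alert.PROGRESS_ANOMALY
    if check == "second" and finished:
        return Alert.DATA_MAGNITUDE_ANOMALY
    return None
```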

In one possible design, determining and outputting one or more alert information includes:

Calculating the association degree of the previous task and the next task in the scheduling relation map according to a preset association model;

if the association degree is in a first association interval, determining that the early warning information comprises first early warning information and second early warning information, wherein the early warning level of the first early warning information is the same as that of the second early warning information, the first early warning information indicates that the previous task has a scheduling anomaly that affects the scheduling of the next task, and the second early warning information indicates that the scheduling anomaly of the next task derives from the delay of the previous task;

and outputting the first early warning information to the previous task and outputting the second early warning information to the next task;

if the association degree is in a second association interval, determining that the early warning information comprises first early warning information and second early warning information, wherein the first early warning level of the first early warning information is higher than the second early warning level of the second early warning information, the first early warning information indicates that the previous task has a scheduling anomaly that affects the scheduling of the next task, and the second early warning information indicates that the scheduling anomaly of the next task derives from the delay of the previous task;

if the association degree is in a third association interval, outputting early warning information to the previous task, wherein the early warning information indicates that the previous task has a scheduling anomaly.
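The three association intervals can be sketched as a simple dispatch in Python. The interval boundaries (0.7 and 0.3), the assumption that the association degree lies in [0, 1] with the first interval being the strongest, the level labels, and the function name `build_warnings` are all hypothetical; the text above fixes only the relative warning levels and the recipients.

```python
def build_warnings(previous, following, degree, strong=0.7, weak=0.3):
    """Return (recipient, level, message) tuples for a delayed previous task."""
    cause = f"{previous} has a scheduling anomaly that may affect {following}"
    effect = f"{following} is at risk because {previous} is delayed"
    if degree >= strong:
        # first association interval: both tasks warned at the same level
        return [(previous, "HIGH", cause), (following, "HIGH", effect)]
    if degree >= weak:
        # second association interval: previous task warned at a higher level
        return [(previous, "HIGH", cause), (following, "LOW", effect)]
    # third association interval: only the previous task is warned
    return [(previous, "HIGH", f"{previous} has a scheduling anomaly")]
```

For example, `build_warnings("clean", "report", 0.5)` yields a high-level warning for `clean` and a low-level warning for `report`.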

In one possible design, the pre-warning information includes a weight feedback link;

after determining and outputting the one or more pieces of early warning information, the method further comprises:

Receiving adjustment information input by a user through a weight feedback link;

and adjusting the pre-warning weight of the detection object corresponding to the timing detection task according to the adjustment information.

In one possible design, the method further comprises: when it is detected at the first detection time that the detection object has a scheduling anomaly, determining a third detection time for the detection object according to a preset delay time, wherein the detection object is the currently executing task;

when it is detected at the third detection time that the currently executing task still has a scheduling anomaly, determining a first early warning level of the currently executing task according to a first preset early warning weight and the number of times early warning has been triggered for the currently executing task;

judging whether the first early warning level meets a preset early warning condition;

if yes, sending the early warning information to the currently executing task again.

In one possible design, when it is detected at the third detection time that the current execution task still has a scheduling exception, the method further includes:

determining the association degree of the currently executed task and the next task according to the scheduling relation graph by utilizing a preset association model;

Determining a second early warning level of the next task according to the second early warning weight, the association degree and the early warning triggering times of the next task;

judging whether the second early warning level meets preset early warning conditions or not;

if yes, sending early warning information to the next task, wherein the early warning information is used to prompt that the scheduling anomaly of the next task derives from the scheduling delay of the currently executing task.
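The follow-up check at the third detection time can be illustrated with the Python sketch below. The text states which inputs determine each early warning level but not how they are combined, so the product used here is purely an assumption, as are the threshold value, the dictionary layout of a task, the incrementing of the trigger count on each check, and the names `warning_level` and `follow_up_warnings`.

```python
def warning_level(weight, trigger_count, association=1.0):
    """Assumed combination: weight x association degree x trigger count."""
    return weight * association * trigger_count

def follow_up_warnings(current, following, association, threshold=2.0):
    """Decide who is warned when the current task is still abnormal
    at the third detection time. Tasks are dicts with the hypothetical
    keys 'name', 'weight' and 'triggers'."""
    recipients = []
    current["triggers"] += 1
    if warning_level(current["weight"], current["triggers"]) >= threshold:
        recipients.append(current["name"])
    following["triggers"] += 1
    level = warning_level(following["weight"], following["triggers"], association)
    if level >= threshold:
        recipients.append(following["name"])
    return recipients
```

Under these assumptions, a weakly associated downstream task is warned only after repeated triggers, while the currently executing task is warned again as soon as its own level crosses the threshold.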

In a second aspect, the present application provides a task scheduling abnormality monitoring apparatus, including:

The acquisition module is used for acquiring historical task data of one or more historical task periods according to the working plan of the current task period, wherein the similarity between the historical working plan of the historical task period and the working plan of the current task period meets the preset requirement, and the historical task data comprises configuration data of each historical task and time-consuming data for executing each historical task;

A processing module for:

according to the detection results of each timing detection task at each detection time point, pre-judging whether the probability of abnormal task scheduling of the target system meets the preset early warning requirement or not;

and the output module is used for outputting early warning information to the detection object of the timing detection task.

In a third aspect, the present application provides an electronic device comprising:

A memory for storing program instructions;

a processor for calling and executing program instructions in said memory, performing any one of the possible methods provided in the first aspect.

In a fourth aspect, the present application provides a storage medium having stored therein a computer program for executing any one of the possible task scheduling anomaly monitoring methods provided in the first aspect.

In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements any one of the possible task scheduling anomaly monitoring methods provided in the first aspect.

The application provides a task scheduling anomaly monitoring method, device, medium and program product. Historical task data of one or more historical task periods is acquired according to the work plan of the current task period, wherein the similarity between the historical work plan of each historical task period and the work plan of the current task period meets a preset requirement, and the historical task data comprises configuration data of each historical task and time-consuming data for executing each historical task. Clustering processing is cyclically performed on the historical tasks according to the time-consuming data by using a preset clustering model until a plurality of task time-consuming types are determined, wherein the ratio of the amount of data contained in each task time-consuming type to the total amount of the time-consuming data meets a preset proportion requirement. A scheduling relation graph is determined according to the configuration data, and at least one timing detection task is determined according to the task time-consuming types and the scheduling relation graph, wherein the scheduling relation graph represents the dependency relations by which the historical tasks call one another's processing results. Whether the probability of a task scheduling anomaly in the target system meets a preset early warning requirement is pre-judged according to the detection results of each timing detection task at each detection time point, and if so, one or more pieces of early warning information are determined and output. This solves the technical problems of existing anomaly monitoring: poor response timeliness, inflexible configuration, monitoring only at the logic level, and low coupling with the actual business. It achieves the technical effects of ensuring response timeliness through advance warning, reserving sufficient time for handling problems, and pushing early warning information according to the dependencies among tasks, which helps to locate problems quickly and coordinate resources.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

FIG. 1 is a schematic diagram of an application scenario of a task scheduling anomaly monitoring method according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of a task scheduling anomaly monitoring method provided by the present application;

FIG. 3 is a schematic flow chart of cyclically determining a plurality of task time-consuming types in step S202 of the embodiment shown in FIG. 2;

FIG. 4 is a schematic flow chart of determining at least one timing detection task in step S203 of the embodiment shown in FIG. 2;

FIG. 5 is a schematic flow chart of another task scheduling anomaly monitoring method according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a task scheduling anomaly monitoring device according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an electronic device provided by the present application.

Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.

Detailed Description

To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art without inventive effort on the basis of these embodiments, including but not limited to combinations of embodiments, fall within the scope of protection of the application.

The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The following explains the terms related to the present application:

MQ (Message Queue): at the underlying level, a "first in, first out" data structure. The data to be transmitted (also called messages) is placed in a queue, and the queue mechanism is used to deliver it: a producer generates messages and puts them into the queue, and a consumer then processes them. A consumer can pull messages from a designated queue, or subscribe to a queue and have the MQ server push messages to it.

Acquisition period: the number of days of data that the early warning model needs to collect for analysis and comparison; it can be adjusted according to the data scale.

DATACHECK (data check): checking the integrity of the dependent data before a scheduled task processes the data.

Job Server: the server that receives a scheduled task and executes its specific work content.
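The first-in, first-out behaviour and the two delivery modes (pull and push) described in the MQ definition above can be shown with a toy in-memory queue in Python. The class `MessageQueue` and its method names are hypothetical and stand in for a real MQ server, which would also handle persistence, acknowledgement and networking.

```python
from collections import deque

class MessageQueue:
    """Toy in-memory FIFO queue supporting pull and push delivery."""

    def __init__(self):
        self._messages = deque()
        self._subscribers = []

    def subscribe(self, callback):
        # Push mode: the queue calls the consumer for every new message.
        self._subscribers.append(callback)

    def put(self, message):
        # Producer side: deliver to subscribers if any, otherwise store.
        if self._subscribers:
            for callback in self._subscribers:
                callback(message)
        else:
            self._messages.append(message)

    def pull(self):
        # Pull mode: the consumer fetches the oldest stored message.
        return self._messages.popleft() if self._messages else None

queue = MessageQueue()
queue.put("job-1 finished")
queue.put("job-2 delayed")
oldest = queue.pull()            # messages leave in the order they arrived

pushed = []
queue.subscribe(pushed.append)   # from now on, new messages are pushed
queue.put("job-3 started")
```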

The offline data currently processed in the finance and internet industries is large in scale and subject to strict timeliness requirements; financial enterprises in particular must process a large amount of regulatory reporting data. Monitoring the task scheduling system and responding to its anomalies is therefore particularly important.

At present, open-source task scheduling frameworks and tools such as Azkaban, Airflow, and Oozie are mature, but most monitoring functions configured or developed with them are based on the abnormal state of a task or on fixed parameters configured from experience. As a result, by the time a monitoring alarm is raised, the task is usually already abnormal or has already had an actual impact. Response timeliness is poor, configuration is inflexible, and the difficulty and workload of operation and maintenance increase; moreover, the monitoring works only at the logic level and is not closely coupled with the actual business.

To address these situations, the existing solution deploys an engine on the big data cluster and, for each task's scheduled time, collects cluster health data before and after that time in order to give early warning. The commonly used early warning algorithm predicts with a normal distribution over the whole sample. It does not differentiate between samples and needs a large amount of historical data, so the computation time of the early warning system increases while the prediction results remain inaccurate.

It should be noted that a large amount of historical data is required because the normal distribution is established over all historical periods as a whole. Furthermore, the tasks executed in each historical period, or in several consecutive historical periods, have periodic characteristics of their own over a given span of time, so using the whole sample fails to differentiate between them.

In summary, the existing anomaly monitoring has the following technical problems:

(1) The existing monitoring implementation scheme has poor response time efficiency, inflexible configuration, large operation and maintenance workload and low coupling degree with actual service;

(2) The large amount of data collection required for early warning places an additional burden on the big data cluster;

(3) The existing algorithm has high complexity, low data discrimination, high calculation cost and inaccurate prediction result.

In seeking to improve the existing anomaly monitoring methods, the inventors of the present application found through analysis that the following technical obstacles exist:

(1) Most existing monitoring schemes are based on abnormal scheduling of big data cluster resources and on task states. When a large amount of business data is scheduled across tasks, downstream tasks are slow to perceive that a single task has entered an abnormal state, and the impact on the downstream tasks cannot be analyzed accurately.

(2) Real-time data analysis and monitoring requires frequent interaction with the system, which occupies system resources and increases the computing load on the system.

(3) Existing algorithms are applied without regard to the actual scenario, which causes extra resource consumption and increases data errors.

To solve the above problems, the inventive concept of the present application is:

(1) Without changing the underlying logic of the task scheduling system, the lineage relationships among tasks (i.e. their interdependence during scheduling), the expected normal completion time interval of each task, and the degree of correlation among lineage tasks (i.e. tasks that have a scheduling order relationship with one another), hereinafter called the association degree, are analyzed from the task configuration information. Early warnings are then pushed separately to the upstream task and the downstream task (i.e. two tasks adjacent in execution order), which improves the efficiency of responding to anomalies and the timeliness of handling them.

(2) The historical sample data within the period is analyzed and the early warning detection time is adjusted dynamically, which reduces the manual configuration effort of operation and maintenance personnel and increases the flexibility of early warning configuration.

(3) An early warning level weight module is added so that operation and maintenance personnel can feed back the degree of business attention. This feedback is combined with the early warning detection results to generate dynamic early warning pushes, which couples the monitoring to actual business requirements and prevents the scope of an anomaly's impact from escalating.

(4) Existing task configuration information and historical information are used, which the scheduling platform can provide directly. The monitoring is thus decoupled from big data cluster resources and places no burden on the cluster.

FIG. 1 is a schematic diagram of an application scenario of the task scheduling anomaly monitoring method provided by the present application. As shown in FIG. 1, an anomaly monitoring system 200 is provided independently, outside the task scheduling system 100. By executing the task scheduling anomaly monitoring method provided by the present application, the anomaly monitoring system 200 determines a plurality of timing detection tasks, and it re-determines the detection times of the timing detection tasks in every task period. The anomaly monitoring system 200 does not wait until a task scheduling anomaly has occurred to raise an alarm. Instead, it monitors the execution progress of each task and sends early warning information to both the preceding task and the following task where an execution order is required. As far as possible before a task scheduling anomaly occurs, it uses the execution progress of the tasks to find in advance that the probability of an anomaly exceeds the early warning requirement, and then sends the corresponding early warning information.

The task scheduling anomaly monitoring method is described in detail below:

FIG. 2 is a schematic flow chart of a task scheduling anomaly monitoring method according to an embodiment of the present application. As shown in FIG. 2, the specific steps of the task scheduling anomaly monitoring method include:

S201, according to a work plan of a current task period, historical task data of one or more historical task periods are obtained.

In the step, the similarity between the historical work plan of the historical task period and the work plan of the current task period meets the preset requirement, and the historical task data comprises configuration data of each historical task and time-consuming data for executing each historical task.

It should be noted that, unlike the prior art, which predicts with a normal distribution over the whole sample of all historical task periods, the anomaly monitoring method of this embodiment compares the work plan of each historical task period with the work plan of the current task period. When the current task period starts, or before it starts, the historical task data of the one or more historical task periods whose work plan similarity meets the preset requirement is acquired. The execution progress of each task can thus be monitored in a more targeted and flexible way: the time points of the timed checks can change rather than being fixed, and the early warning is more discriminating and more flexible.

Because the work plan of a financial enterprise is stable over a period of time, for example over several task periods, the historical task periods include a historical task period of the previous year whose position within the year is the same as or similar to that of the current task period, or the last task period closest to the current period, or a plurality of consecutive task periods closest to the current period.
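Step S201 leaves the similarity measure between work plans unspecified. The Python sketch below shows one possible choice, made purely for illustration: each work plan is modelled as a set of task names, similarity is the Jaccard index, and the preset requirement is a threshold of 0.8. The function names, the period identifiers, and the threshold are all assumptions.

```python
def jaccard(plan_a, plan_b):
    """Jaccard similarity between two work plans given as sets of task names."""
    a, b = set(plan_a), set(plan_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def select_history(current_plan, history, threshold=0.8):
    """Return the ids of historical periods whose plan is similar enough."""
    return [period for period, plan in history.items()
            if jaccard(current_plan, plan) >= threshold]

# Hypothetical work plans keyed by period id.
current_plan = {"load_raw", "clean", "aggregate", "report"}
history = {
    "2022-05": {"load_raw", "clean", "aggregate", "report"},
    "2022-04": {"load_raw", "clean", "aggregate", "report", "audit"},
    "2022-01": {"load_raw", "year_end_close"},
}
selected = select_history(current_plan, history)
```

Here the first two periods share at least four of five tasks with the current plan and are selected, while the third shares only one task and is skipped.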

S202, clustering the historical tasks according to the time-consuming data in the historical task data by using a preset clustering model, until a plurality of task time-consuming types are determined.

In this step, the ratio of the amount of data contained in each task time-consuming type to the total amount of time-consuming data satisfies a preset duty ratio requirement.

Specifically, a preset number of cluster centers are first extracted from the time-consuming data of the historical tasks, as required by the preset clustering model. Different preset clustering models may correspond to different numbers of initial cluster centers. The time-consuming data of all historical tasks are then clustered with the preset clustering model to obtain a first clustering result, that is, at least one task time-consuming type obtained for the first time. Next, it is determined whether the ratio of the amount of data contained in each task time-consuming type to the total amount of time-consuming data meets the preset duty ratio requirement. If so, the method proceeds to step S203. Otherwise, the cluster centers are reset as required by the preset clustering model, clustering is performed again, and the ratio is checked again. This cycle repeats until the ratio for every task time-consuming type meets the preset duty ratio requirement.

Notably, resetting the cluster centers involves two aspects: the number of cluster centers, and which time-consuming data serve as the cluster centers. The number of cluster centers may be increased, decreased, or kept unchanged, and those skilled in the art may set it according to the needs of the actual application scenario.

It should be noted that, in this embodiment, the preset clustering model used in each round of clustering may be the same or different. That is, a single preset clustering model may be used for every round of the cyclic clustering, or a different preset clustering model may be used in each round.

S203, determining a scheduling relation graph according to the configuration data, and determining at least one timing detection task according to the time consumption types of the tasks and the scheduling relation graph.

In this step, the scheduling relationship graph is used to characterize the dependency relationships in which the historical tasks call one another's processing results, or the execution order among the historical tasks.

Specifically, the upstream and downstream calling relations of the historical tasks, which include interaction types such as DATACHECK and MQ, are split out according to the task configuration information, and a lineage (blood relationship) map of task scheduling, namely the scheduling relationship graph, is then generated.
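As an illustrative sketch only, the Python below shows one way such a graph could be derived from task configuration records. The record layout (`task`, `upstream`, `type`), the task names, and the function name `build_schedule_graph` are hypothetical; the description only states that upstream and downstream relations such as DATACHECK and MQ are split out of the configuration.

```python
from collections import defaultdict


def build_schedule_graph(configs):
    """Return {upstream_task: [(downstream_task, relation_type), ...]}.

    `configs` uses an assumed format: each record names a task and the
    upstream tasks it waits on, with the interaction type of each link.
    """
    graph = defaultdict(list)
    for record in configs:
        for dep in record.get("upstream", []):
            graph[dep["task"]].append((record["task"], dep["type"]))
    return dict(graph)


# Hypothetical configuration for three tasks in one task period.
configs = [
    {"task": "load_raw", "upstream": []},
    {"task": "clean", "upstream": [{"task": "load_raw", "type": "DATACHECK"}]},
    {"task": "report", "upstream": [{"task": "clean", "type": "MQ"}]},
]
graph = build_schedule_graph(configs)
```

Keying the mapping by upstream task makes it cheap to look up the next task of any delayed task, which is the lookup the early warning steps rely on.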

Determining at least one timing detection task according to the plurality of task time-consuming types and the scheduling relationship map includes the following steps:

According to preset screening requirements, determining a first target type and a second target type from the task time-consuming types, wherein the first target type comprises a type with longer time consumption, and the second target type comprises a type with shorter time consumption;

Determining a first fluctuation range and a second fluctuation range according to the time-consuming data in the first target type and the second target type by using a preset fluctuation algorithm, wherein the first fluctuation range and the second fluctuation range can be determined according to the normal distributions corresponding to the first target type and the second target type, respectively.

There is no strict ordering requirement between step S202 and the "determining a scheduling relationship map" part of step S203; the two may be performed simultaneously, or either one may be performed first.

S204, predicting, according to the detection results of each timing detection task at each detection time point, whether the probability of an abnormality in the task scheduling of the target system meets the preset early warning requirement.

In this step, if so, S205 is executed; if not, no abnormality has been detected, and the method waits for the next timing detection task to perform detection and analysis.

S205, determining and outputting one or more pieces of early warning information.

In this step, at least three possible embodiments are included.

1. A first possible embodiment is as follows:

First, calculating the association degree between a previous task and a next task in the scheduling relationship map according to a preset association model;

Then, if the association degree is in a first association interval, determining that the early warning information comprises first early warning information and second early warning information, wherein the early warning level of the first early warning information is the same as that of the second early warning information, the first early warning information indicates that the previous task has a scheduling abnormality that affects the scheduling of the next task, and the second early warning information indicates that the scheduling abnormality of the next task is derived from the delay of the previous task;

Finally, outputting the first early warning information to the previous task and outputting the second early warning information to the next task.

2. A second possible embodiment is as follows:

After the association degree is calculated as in the first embodiment, if the association degree is in a second association interval, determining that the early warning information comprises first early warning information and second early warning information, wherein the first early warning level of the first early warning information is higher than the second early warning level of the second early warning information, the first early warning information indicates that the previous task has a scheduling abnormality that affects the scheduling of the next task, and the second early warning information indicates that the scheduling abnormality of the next task is derived from the delay of the previous task;

3. A third possible embodiment is as follows:

After the association degree is calculated as in the first embodiment, if the association degree is in a third association interval, outputting early warning information to the previous task, wherein the early warning information indicates that the previous task has a scheduling abnormality.

In the above three embodiments, the association degree between the previous task and the next task in the scheduling relationship graph is calculated according to the preset association model. The preset association model may be selected according to the actual situation; for example, in one embodiment it may be represented by formula (x):

$$r = \frac{\operatorname{cov}(X, Y)}{S_x S_y} \tag{x}$$

Where r represents the association degree between the previous task X (also referred to as the upstream task) and the next task Y (also referred to as the downstream task), that is, their correlation coefficient; S_x is the standard deviation of the sample data (i.e., the time-consuming data in each piece of historical data) of task X in the historical task period; S_y is the standard deviation of the sample data (i.e., the time-consuming data in each piece of historical data) of task Y in the historical task period; and cov(X, Y) is the covariance of the sample data of task X and task Y in the acquisition period.

It should be noted that the sample data in this embodiment is time-consuming data, and the historical data obtained is offline data. Because the volume of offline data is very large, most processing is based on map/reduce, and apart from fluctuations in data volume, the most intuitive indicator is the time a task takes to execute. This data is necessarily recorded by the scheduling system, so it can be obtained directly, which avoids consuming cluster resources by deploying an additional acquisition module. The time-consuming data here includes not only the data processing time but also the time spent waiting for upstream data (i.e., the processing result of the task immediately preceding the current task). In this embodiment, every task (including tasks in the current task period and historical tasks in the historical task periods) is a task in a lineage (blood) relationship, that is, the tasks depend on one another's processing results or have an execution order, so they are necessarily related. On this basis, the correlation of time-consuming fluctuations is reflected mainly in the dependency of waiting for upstream data between tasks, which is what delay early warning targets.

Specifically, for the three embodiments, when the association degree is in the first association interval, for example 0.6 < r < 1, the two tasks are considered strongly correlated. The task-level weight configuration is read (taking the first early warning with the default weight of 1 as an example), first early warning information about task x is generated, and content about the affected task y is added to it. At the same time, a delay early warning of the same level is generated for task y with respect to task x, namely the second early warning information.

When the association degree is in the second association interval, for example 0.3 < r ≤ 0.6, the two tasks are considered moderately correlated. The first early warning information is generated in the manner described above, and a second early warning for task y, namely the second early warning information, is generated at a lower early warning level than the first.

When the association degree is in the third association interval, for example 0 < r ≤ 0.3, the two tasks are considered weakly correlated. The first early warning information is generated in the manner described above, and no early warning processing is performed for task y. If an abnormality is later detected for task y, the related influence content is generated and added to the first early warning information.
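The association degree and the three example intervals can be sketched in Python as follows. This is a minimal illustration, assuming the sample standard deviation and sample covariance (dividing by n − 1); the function names and the returned labels are hypothetical. Treating r > 0.6 (including r = 1) as strongly correlated and r ≤ 0 as weakly correlated is also an assumption, since the example intervals above do not cover those values.

```python
from statistics import mean, stdev


def association_degree(xs, ys):
    """Correlation r = cov(X, Y) / (S_x * S_y) of two time-consuming series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    sx, sy = stdev(xs), stdev(ys)
    if sx == 0 or sy == 0:
        raise ValueError("a constant series has no defined correlation")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (sx * sy)


def warning_plan(r):
    """Map r to who is warned, using the example thresholds 0.3 and 0.6."""
    if r > 0.6:
        return "both_same_level"        # first association interval
    if r > 0.3:
        return "both_downstream_lower"  # second association interval
    return "upstream_only"              # third association interval


# Hypothetical execution times (minutes) of tasks x and y over four periods.
r = association_degree([30, 42, 35, 50], [61, 80, 66, 95])
plan = warning_plan(r)
```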

It should also be noted that the upstream and downstream relationships are based on lineage (blood relationship) analysis. In one possible design, because the early warning analysis targets the complete set of tasks, only one layer of separation needs to be considered between two tasks, or between a task and the upstream or downstream system it interacts with. The nodes within a single task may be separated by multiple layers, but within the time consumption of the overall task, the delays of upper-layer and lower-layer nodes count equally toward the overall correlation calculation. Even for a check node in a lower layer, the time consumption is measured from the start of the overall task scheduling, so the longer the waiting time, the closer the fluctuation curve of the node's time consumption is to that of the overall task, that is, the higher the probability that upstream and downstream are affected at the same time.

The embodiment of the application provides a task scheduling abnormality monitoring method. Historical task data of one or more historical task periods are obtained according to the work plan of the current task period. The historical tasks are clustered according to the time-consuming data by using a preset clustering model until a plurality of task time-consuming types are determined, where the ratio of the amount of data contained in each task time-consuming type to the total amount of time-consuming data meets the preset duty ratio requirement. A scheduling relationship graph, which characterizes the dependency relationships in which the historical tasks call one another's processing results, is determined according to the configuration data, and at least one timing detection task is determined according to the task time-consuming types and the scheduling relationship graph. Whether the probability of an abnormality in the task scheduling of the target system meets the preset early warning requirement is then predicted according to the detection result of each timing detection task at each detection time point, and if so, one or more pieces of early warning information are determined and output. This solves the technical problems that existing anomaly monitoring responds slowly, is inflexible to configure, monitors only at the logical level, and is weakly coupled to actual business. It achieves the technical effects of ensuring timely response through early warning, reserving sufficient time for handling problems, and pushing early warning information along the dependencies among tasks, which helps locate problems quickly and coordinate resources.

To facilitate an understanding of several possible embodiments corresponding to S202, a specific description is provided below.

Fig. 3 is a schematic flow chart of cyclically determining a plurality of task time-consuming types in step S202 of the embodiment shown in fig. 2 according to the present application. As shown in fig. 3, the specific steps include:

S301, randomly extracting time-consuming data of a plurality of historical tasks from all the historical tasks to serve as cluster centers.

S302, performing a first clustering process on the historical tasks according to the cluster centers by using the preset clustering model, to determine one or more first time-consuming types.

In this embodiment, the clustering process of the preset clustering model may be represented by formula (1):

Wherein C represents a first time-consuming type, k is the number of initial cluster centers, C_i is the time-consuming data recorded when the historical task corresponding to each of the k cluster centers was executed, that is, the execution time of the initial cluster center, and x_j is the execution time of each sample, that is, of each historical task in the one or more historical periods.

Notably, formula (1) takes the characteristics of task scheduling into account: by reducing the dimensionality of conventional k-means, it lowers the algorithm complexity while preserving data accuracy, reduces the burden of deploying early warning, and improves operating efficiency.

S303, judging whether the data volume duty ratio in each first time-consuming type meets the preset duty ratio requirement.

In this step, the data volume ratio is used to characterize the ratio of the volume of data contained in the first time-consuming type to the total volume of data of the time-consuming data. If yes, step S304 is executed, and if no, step S305 is executed.

In this embodiment, the preset duty ratio requirement includes the data volume duty ratio being greater than or equal to a first duty ratio threshold and less than or equal to a second duty ratio threshold. Optionally, the first duty ratio threshold takes a value in the range of 1% to 10%, and the second duty ratio threshold takes a value in the range of 40% to 60%. Preferably, the first duty ratio threshold is 5% and the second duty ratio threshold is 50%.

S304, determining that the first time-consuming type is a task time-consuming type.

S305, the clustering center is redetermined, and clustering processing is conducted again, so that the first time consumption types are redetermined, and the data volume duty ratio corresponding to each first time consumption type meets the preset duty ratio requirement.

In this embodiment, the method specifically includes:

S3051, deleting each first time-consuming type whose data volume duty ratio is smaller than the first duty ratio threshold, and/or randomly extracting at least two historical tasks from each first time-consuming type whose data volume duty ratio is larger than the second duty ratio threshold to serve as new cluster centers.

Specifically, for example, a classification whose number of samples is less than 5% of the total number of samples, that is, a first time-consuming type whose data volume duty ratio is less than 5%, is deleted, and 2 samples are randomly extracted as new cluster centers from each classification whose number of samples is more than 50% of the total.

S3052, for each first time-consuming type meeting the preset duty ratio requirement, redetermining a new cluster center in a preset manner.

In one possible embodiment, when a first time-consuming type meets the preset duty ratio requirement, the average time consumption of that first time-consuming type is taken as the new cluster center.

Specifically, the time-consuming data corresponding to the new cluster center can be represented by formula (2):

$$a_i = \frac{1}{|C_i|} \sum_{x \in C_i} x \tag{2}$$

Wherein C_i is a classification set, i.e., the set corresponding to the first time-consuming type, |C_i| is the number of samples in the set, x is the time-consuming data corresponding to each sample in the set, and a_i is the time-consuming data corresponding to the new cluster center.

After this step is performed, the method returns to step S302, that is, clustering is performed again according to the new cluster centers by using the preset clustering model to determine new first time-consuming types, until the data volume duty ratio corresponding to every first time-consuming type meets the preset duty ratio requirement.
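The loop of steps S301 to S305 can be sketched in Python as below. This is a hedged illustration rather than the claimed implementation: it assumes one-dimensional nearest-center assignment as the clustering rule, uses the preferred thresholds of 5% and 50%, and adds a hypothetical `max_rounds` limit and `seed` parameter so that the loop always terminates and is repeatable. The function names and sample values are invented for the example.

```python
import random


def assign(samples, centers):
    """Assign each 1-D sample to its nearest center (ties go to the first)."""
    clusters = [[] for _ in centers]
    for x in samples:
        nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        clusters[nearest].append(x)
    return clusters


def cluster_time_costs(samples, k=3, low=0.05, high=0.50, max_rounds=50, seed=0):
    """Return (clusters, ok); ok is True when every share lies in [low, high]."""
    if len(samples) < max(k, 2):
        raise ValueError("need at least max(k, 2) samples")
    rng = random.Random(seed)
    centers = rng.sample(samples, k)                       # S301
    clusters = assign(samples, centers)                    # S302
    for _ in range(max_rounds):
        shares = [len(c) / len(samples) for c in clusters]
        if all(low <= s <= high for s in shares):          # S303
            return clusters, True                          # S304
        new_centers = []                                   # S305
        for members, share in zip(clusters, shares):
            if share < low:
                continue                                   # S3051: delete undersized type
            if share > high:                               # S3051: split oversized type
                new_centers.extend(rng.sample(members, min(2, len(members))))
            else:                                          # S3052: mean as new center
                new_centers.append(sum(members) / len(members))
        if not new_centers:
            break
        clusters = assign(samples, new_centers)            # back to S302
    ok = all(low <= len(c) / len(samples) <= high for c in clusters)
    return clusters, ok


# Hypothetical execution times (minutes) of historical tasks.
samples = [10, 11, 12, 13, 14, 50, 52, 54, 56, 58, 100, 103, 106, 109, 112]
clusters, ok = cluster_time_costs(samples, k=3)
```

Samples of a deleted type are not discarded: they are reassigned to the remaining centers in the next round, so every returned clustering still covers all samples.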

The method for cyclically determining a plurality of task time-consuming types provided by this embodiment takes the characteristics of task scheduling into account and reduces the dimensionality of conventional k-means, which lowers the algorithm complexity while preserving data accuracy, reduces the burden of deploying early warning, and improves operating efficiency. Classifying the historical data reduces the complexity and cost of the early warning algorithm and improves early warning accuracy. Because existing task configuration information and historical information are used, the scheduling platform can directly provide both the data for constructing the scheduling relationship map and the classified sample data for the task time-consuming types, so the method is decoupled from big data cluster resources and avoids adding to the cluster burden.

To facilitate understanding of possible implementations of the "determine at least one timing detection task from a plurality of task time-consuming types and a scheduling relationship map" in step S203 in the embodiment shown in fig. 2, a specific embodiment is described below.

Fig. 4 is a schematic flow chart of determining at least one timing detection task in step S203 in the embodiment shown in fig. 2 according to an embodiment of the present application. As shown in fig. 4, the specific steps include:

S401, determining a first target type and a second target type from the time-consuming types of each task according to preset screening requirements.

In this embodiment, from the task time-consuming types, the type whose cluster center has the largest time-consuming value is selected as the first target type, and the type whose cluster center has the smallest time-consuming value is selected as the second target type.

S402, determining a first fluctuation range and a second fluctuation range according to time-consuming data in the first target type and the second target type by using a preset fluctuation algorithm.

In this step, specifically, the method includes:

S4021, determining the first fluctuation range according to the first average time consumption and the first standard deviation of all time-consuming data in the first target type.

In one possible design, the first fluctuation range is equal to the sum of the first average time consumption and N times the first standard deviation, and the first fluctuation range B₁ is shown in formula (3):

$$B_1 = \bar{x}_1 + N S_1 \tag{3}$$

Wherein x̄₁ represents the first average time consumption, and S₁ represents the first standard deviation.

Preferably, if the sample fluctuation interval as a whole follows a normal distribution, about 99.7% of samples fall within three standard deviations of the mean, so the value of N can be set to 3. It is understood that those skilled in the art can set the value of N according to the distribution that the sample fluctuation interval follows, which is not limited herein.

S4022, determining the second fluctuation range according to the second average time consumption and the second standard deviation of all time-consuming data in the second target type.

In one possible design, the second fluctuation range is equal to the difference between the second average time consumption and M times the second standard deviation, and the second fluctuation range B₂ is shown in formula (4):

$$B_2 = \bar{x}_2 - M S_2 \tag{4}$$

Wherein x̄₂ represents the second average time consumption, and S₂ represents the second standard deviation.

Preferably, if the sample fluctuation interval as a whole follows a normal distribution, about 99.7% of samples fall within three standard deviations of the mean, so the value of M can be set to 3. It is understood that those skilled in the art can set the value of M according to the distribution that the sample fluctuation interval follows, which is not limited herein.

In this embodiment, the clusters with the maximum and the minimum time consumption are selected and a time-consuming fluctuation interval is calculated for each. This reduces the extra error caused by differences among sample categories and, compared with fitting a normal distribution to the whole sample, reduces the amount of calculation, which makes the approach better suited to big data cluster environments with a large number of scheduled tasks.
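A minimal Python sketch of steps S401 and S402 follows. It assumes the sample standard deviation, uses each cluster's mean as a stand-in for its cluster center when picking the slowest and fastest types, and requires at least two samples per selected cluster; the function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def fluctuation_ranges(clusters, n=3, m=3):
    """Return (B1, B2): upper bound of the slowest type, lower bound of the fastest."""
    slowest = max(clusters, key=mean)   # first target type (S401)
    fastest = min(clusters, key=mean)   # second target type (S401)
    b1 = mean(slowest) + n * stdev(slowest)   # mean plus N standard deviations
    b2 = mean(fastest) - m * stdev(fastest)   # mean minus M standard deviations
    return b1, b2


# Hypothetical time-consuming types (minutes).
b1, b2 = fluctuation_ranges([[10, 12, 14], [50, 52, 54], [100, 110, 120]])
```

With these sample values the slowest type has mean 110 and standard deviation 10, and the fastest has mean 12 and standard deviation 2, so B₁ is 140 minutes and B₂ is 6 minutes.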

S403, determining the detection object and detection time of each timing detection task according to the scheduling relationship map, the first fluctuation range, the second fluctuation range, and the start time of historical task execution in the time-consuming data.

Specifically, according to the result obtained in S402, the execution start time T of the currently scheduled task is read, and a timing detection job (timing detection task) is generated, where the detection times T₁ and T₂ are shown in formula (5):

$$T_1 = T + B_1, \qquad T_2 = T + B_2 \tag{5}$$

Wherein B₁ represents the first fluctuation range and B₂ represents the second fluctuation range.

It is noted that, on the basis of the embodiment shown in fig. 4, step S204 of the embodiment shown in fig. 2, namely predicting, according to the detection results of each timing detection task at each detection time point, whether the probability of an abnormality in the task scheduling of the target system meets the preset early warning requirement, includes two aspects:

If it is determined from the detection result that the detection object has not completed execution at the first detection time, determining that the first probability of an abnormality in the execution progress of the task meets the preset early warning requirement;

If it is determined from the detection result that the detection object has already completed execution at the second detection time, determining that the second probability of an abnormality in the data magnitude of the task scheduled by the target system meets the preset early warning requirement.
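The two checks can be sketched in Python as follows. The sketch assumes that each detection time is the task's start time offset by the corresponding fluctuation range, and that the ranges are expressed in minutes; both are assumptions made for illustration, and the function names and warning labels are hypothetical.

```python
from datetime import datetime, timedelta


def detection_times(start, b1_minutes, b2_minutes):
    """Return (T1, T2), the start time offset by each fluctuation range."""
    t1 = start + timedelta(minutes=b1_minutes)
    t2 = start + timedelta(minutes=b2_minutes)
    return t1, t2


def evaluate(check_point, finished):
    """Decide which warning, if any, a detection result calls for."""
    if check_point == "T1" and not finished:
        return "delay_warning"          # still running past the upper bound
    if check_point == "T2" and finished:
        return "data_volume_warning"    # finished before the lower bound
    return None


start = datetime(2022, 6, 9, 1, 0)
t1, t2 = detection_times(start, b1_minutes=140, b2_minutes=6)
```

A task that finishes before T₂ is suspiciously fast, which is why that case is read as a possible abnormality in data magnitude rather than as a delay.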

Fig. 5 is a flowchart of another task scheduling anomaly monitoring method according to an embodiment of the present application. As shown in fig. 5, the specific steps of the method include:

S501, acquiring historical task data of one or more historical task periods according to a work plan of a current task period.

S502, clustering the historical tasks according to the time-consuming data by using a preset clustering model, until a plurality of task time-consuming types are determined.

S503, determining a scheduling relation graph according to the configuration data, and determining at least one timing detection task according to the time consumption types of the tasks and the scheduling relation graph.

In this step, the scheduling relationship map is used to characterize the dependency relationship of the inter-calling processing results between each history task.

S504, predicting, according to the detection results of each timing detection task at each detection time point, whether the probability of an abnormality in the task scheduling of the target system meets the preset early warning requirement.

If not, this step is executed cyclically, that is, the next timing detection task is executed. If so, the judgment covers the two detection results at the first detection time T₁ and the second detection time T₂:

(1) If it is determined from the detection result that the detection object has not completed execution at the first detection time, the first probability of an abnormality in the execution progress of the task is determined to meet the early warning requirement.

(2) If it is determined from the detection result that the detection object has already completed execution at the second detection time, the second probability of an abnormality in the data magnitude of the task scheduled by the target system is determined to meet the early warning requirement.

In this embodiment, the first detection time T₁ and the second detection time T₂ are shown in formula (5).

It should be noted that this embodiment expands on and illustrates only the subsequent processing for the first case, i.e., steps S505 to S512. The second case may be handled in the same way as the first, or a separate early warning mode may be adopted, for example sending the early warning information only once.

S505, when the existence of scheduling abnormality of the detection object is detected at the first detection time, early warning information is sent to the current task.

In this step, the detection object is the currently executing task. When a scheduling abnormality of the detection object is detected at the first detection time, the early warning information is of the delay type.

S506, determining a third detection time of the detection object according to the preset delay time.

In this step, a delay early warning might be ignored, in which case the expected effect, namely correcting the abnormality in time before a large number of tasks are scheduled abnormally, would not be achieved. To avoid this, after the delay early warning is sent, that is, after the first detection time T₁, a preset delay time is added to obtain the third detection time: T₃ = T₁ + Td, where Td is the preset delay time. Optionally, Td is equal to S₁ in formula (3), that is, T₃ = T₁ + S₁.

Similarly, if the current task is detected again at the third detection time and a delay-type early warning is still sent, steps S506 to S512 are executed repeatedly.
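The repeated re-detection can be sketched in Python as below. The callback `is_abnormal`, the `max_checks` limit, and the concrete times are hypothetical; the sketch only illustrates that each further detection time is the previous one plus the preset delay, and that the trigger count grows by one each time.

```python
from datetime import datetime, timedelta


def delayed_rechecks(t1, delay, is_abnormal, max_checks=5):
    """Return [(trigger_count, check_time), ...] while the task stays abnormal."""
    checks = []
    check_time, count = t1, 1
    while count <= max_checks and is_abnormal(check_time):
        checks.append((count, check_time))   # an early warning is sent here
        check_time += delay                  # next detection time, e.g. T3 = T1 + Td
        count += 1
    return checks


t1 = datetime(2022, 6, 9, 3, 20)
finish = datetime(2022, 6, 9, 3, 45)         # hypothetical actual completion time
checks = delayed_rechecks(t1, timedelta(minutes=10), lambda t: t < finish)
```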

S507, when the fact that the current execution task still has the scheduling abnormality is detected at the third detection time, determining a first early warning level of the current execution task according to a first preset early warning weight and early warning trigger times of the current execution task.

In this step, if the current task is again found at the third detection time to meet the preset early warning requirement, its early warning level needs to be recalculated. This prevents a situation in which early warnings of a lower level are never sent, a small problem grows into a large one, and a serious scheduling accident results.

In this embodiment, assuming that the current task (also referred to as the previous task) is task x, the reference value f(x) of the early warning level of task x can be calculated by formula (6):

$$f(x) = u_x \, t \tag{6}$$

Wherein u_x is the initial early warning weight of task x, which defaults to 1, and t is the number of early warnings triggered: the first early warning occurs at the first detection time T₁, the second at the third detection time T₃, and so on.

S508, judging whether the first early warning level meets the preset early warning condition.

In this step, if so, step S505 is executed again, that is, early warning information is sent to the current task, and steps S509 to S512 are also executed; if not, the present flow ends directly.

Specifically, if f(x) ≤ 0.3, the first early warning level is judged to be a low-level early warning and no early warning processing is performed; if 0.3 < f(x) ≤ 0.6, the first early warning level is judged to be a medium-level early warning; and if f(x) > 0.6, the first early warning level is judged to be a high-level early warning.

After the first early warning level is determined, the corresponding early warning information may be generated with reference to S205, which is not described again here.

S509, determining the association degree of the current execution task and the next task according to the scheduling relation graph by using a preset association model.

In this step, the specific calculation manner of the association degree r between the currently executed task (i.e. task x) and the next task (i.e. task y) may refer to the formula (x) in S205, which is not described herein.

S510, determining a second early warning level of the next task according to the second early warning weight of the next task, the association degree and the early warning trigger times.

In this embodiment, the next task corresponding to the current task (task x) in the scheduling relationship map is task y, and the reference value f(y) of the early warning level of task y can be calculated by formula (7):

$$f(y) = u_y \, t \, r \tag{7}$$

Wherein u_y is the initial early warning weight of task y, which defaults to 1, r is the association degree (correlation coefficient) between task x and task y, and t is the number of early warnings triggered: the first early warning occurs at the first detection time T₁, the second at the third detection time T₃, and so on.

S511, judging whether the second early warning level meets the preset early warning condition.

In this step, if so, step S512 is executed.

Specifically, if f(y) ≤ 0.3, the second early warning level is judged to be a low-level early warning and no early warning processing is performed; if 0.3 < f(y) ≤ 0.6, the second early warning level is judged to be a medium-level early warning; and if f(y) > 0.6, the second early warning level is judged to be a high-level early warning.
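As a hedged sketch, the reference values of formulas (6) and (7) and the level thresholds can be expressed in Python as follows. The function names, the returned dictionary, and the sample weights are hypothetical, and the sketch evaluates both tasks independently rather than reproducing the full control flow of S505 to S512.

```python
def level(score):
    """Classify a reference value with the thresholds 0.3 and 0.6."""
    if score <= 0.3:
        return "low"       # no early warning processing
    if score <= 0.6:
        return "medium"
    return "high"


def upstream_score(u_x, t):
    """f(x): weight of task x times the number of warnings triggered."""
    return u_x * t


def downstream_score(u_y, t, r):
    """f(y): weight of task y times trigger count times association degree."""
    return u_y * t * r


def plan_warnings(u_x, u_y, r, t):
    """Return {task: level} for the tasks that should receive a warning."""
    plan = {}
    for task, score in (("x", upstream_score(u_x, t)),
                        ("y", downstream_score(u_y, t, r))):
        if level(score) != "low":
            plan[task] = level(score)
    return plan


first = plan_warnings(u_x=0.2, u_y=1.0, r=0.25, t=1)   # first warning round
second = plan_warnings(u_x=0.2, u_y=1.0, r=0.25, t=2)  # after one re-detection
```

Because the trigger count multiplies the weight, a task whose first warning falls below the threshold can still be escalated once the abnormality persists through a later detection time.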

S512, sending early warning information to the next task.

In this step, the early warning information includes a prompt that the scheduling abnormality of the next task is derived from the scheduling delay of the currently executing task.

Specifically, after the early warning level is determined, the corresponding early warning information may be generated with reference to the specific steps of S205.

It should be noted that S509 to S510 may be executed synchronously with S507, and S508 and S511 may also be executed synchronously.

It should be further noted that, in the task scheduling anomaly monitoring method provided in each embodiment, the early warning information may include a weight feedback link, and after one or more pieces of early warning information are determined and output, the method further includes:

Specifically, the early warning information provides an early warning level weight feedback link. If the early warning level does not match the level of response the business requires, the weight value to be adjusted is fed back through the link, and the feedback value is updated and recorded in the early warning system database for use in subsequent early warning generation.
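One way the feedback value could be recorded and read back is sketched below with Python's built-in `sqlite3` module. The table layout, the function names, and the use of an in-memory SQLite database are assumptions for illustration; the description does not specify how the early warning system database is organized.

```python
import sqlite3


def open_store():
    """Create an in-memory store for per-task early warning weights."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE weights (task TEXT PRIMARY KEY, weight REAL NOT NULL)"
    )
    return conn


def record_feedback(conn, task, weight):
    """Store the weight fed back through the link, replacing any earlier value."""
    if weight <= 0:
        raise ValueError("weight must be positive")
    conn.execute(
        "INSERT OR REPLACE INTO weights (task, weight) VALUES (?, ?)",
        (task, weight),
    )
    conn.commit()


def weight_for(conn, task, default=1.0):
    """Return the recorded weight, or the default initial weight of 1."""
    row = conn.execute(
        "SELECT weight FROM weights WHERE task = ?", (task,)
    ).fetchone()
    return row[0] if row else default


conn = open_store()
before = weight_for(conn, "report")        # no feedback yet, default applies
record_feedback(conn, "report", 0.5)
after = weight_for(conn, "report")
```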

In general, the task scheduling abnormality monitoring method provided by the embodiments of the present application has at least the following advantages:

(1) Early warning ensures a timely response and leaves enough time to handle problems;

(2) Associated warnings are pushed to dependent tasks, which helps locate problems quickly and coordinate resources;

(3) The early warning strategy is adjusted dynamically according to the data within the period, so that problems can be found in time;

(4) A feedback interface is provided, which couples the business's level of concern to the scheduled tasks;

(5) The method does not need to share computing resources with the scheduling Server or the Job Server, and has essentially no impact on the system;

(6) The method follows the dependency inversion principle and directly invokes the system's interface implementations.

Fig. 6 is a schematic structural diagram of a task scheduling abnormality monitoring device according to an embodiment of the present application. The task scheduling anomaly monitoring device 600 may be implemented in software, hardware, or a combination of both.

As shown in fig. 6, the task scheduling abnormality monitoring apparatus 600 includes:

the acquiring module 601 is configured to acquire historical task data of one or more historical task periods according to a work plan of a current task period, where similarity between the historical work plan of the historical task period and the work plan of the current task period meets a preset requirement, and the historical task data includes configuration data of each historical task and time-consuming data for executing each historical task;

a processing module 602, configured to:

and the output module 603 is configured to output early warning information to a detection object of the timing detection task.

In one possible design, the processing module 602 is configured to:

In one possible design, the processing module 602 is further configured to:

In one possible design, the processing module 602 is configured to calculate a first fluctuation range equal to the sum of the first average time consumption and N times the first standard deviation, and to calculate a second fluctuation range equal to the difference between the second average time consumption and M times the second standard deviation.
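As a worked illustration of the two fluctuation ranges, the sketch below computes mean + N·σ for the first target type and mean − M·σ for the second. The function name, the default values of N and M, and the use of the population standard deviation are assumptions of this sketch; the text does not fix them.

```python
from statistics import mean, pstdev


def fluctuation_ranges(first_type, second_type, n=2.0, m=2.0):
    """Return (first_range, second_range) for two lists of durations.

    first_range  = mean(first_type)  + n * std(first_type)
    second_range = mean(second_type) - m * std(second_type)
    pstdev (population standard deviation) is an assumption.
    """
    first_range = mean(first_type) + n * pstdev(first_type)
    second_range = mean(second_type) - m * pstdev(second_type)
    return first_range, second_range
```

For example, the durations 2, 4, 4, 4, 5, 5, 7, 9 have a mean of 5 and a population standard deviation of 2, so N = 2 gives a first fluctuation range of 9 and M = 1 gives a second fluctuation range of 3.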

In one possible design, the processing module 602 is configured to:

In one possible design, the output module 603 is configured to:

the acquiring module 601 is further configured to receive adjustment information input by a user through a weight feedback link;

The processing module 602 is further configured to adjust an early warning weight of a detection object corresponding to the timing detection task according to the adjustment information.

In one possible design, the processing module 602 is further configured to:

when a scheduling abnormality of the detection object is detected at the first detection time, determine a third detection time for the detection object according to the preset delay time, wherein the detection object is the currently executed task;

In one possible design, the processing module 602 is further configured to:

It should be noted that, the apparatus provided in the embodiment shown in fig. 6 may perform the method provided in any of the above method embodiments, and the specific implementation principles, technical features, explanation of terms, and technical effects are similar, and are not repeated herein.

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 7, the electronic device 700 may include at least one processor 701 and a memory 702. Fig. 7 shows an electronic device with one processor as an example.

A memory 702 for storing programs. In particular, the program may include program code including computer-operating instructions.

The memory 702 may comprise high-speed RAM, and may further comprise non-volatile memory, such as at least one disk memory.

The processor 701 is configured to execute computer-executable instructions stored in the memory 702 to implement the methods described in the above method embodiments.

The processor 701 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application.

Alternatively, the memory 702 may be separate or integrated with the processor 701. When the memory 702 is a device separate from the processor 701, the electronic device 700 may further include:

A bus 703 for connecting the processor 701 and the memory 702. The bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on, but this does not mean that there is only one bus or only one type of bus.

Alternatively, in a specific implementation, if the memory 702 and the processor 701 are integrated on a single chip, the memory 702 and the processor 701 may communicate through an internal interface.

The embodiment of the present application also provides a computer-readable storage medium, which may include any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. Specifically, the computer-readable storage medium stores program instructions, and the program instructions are used to implement the methods in the above method embodiments.

The embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the method of the above-described method embodiments.

Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims

1. A method for monitoring task scheduling anomalies, comprising:

According to the work plan of the current task cycle, historical task data of one or more historical task cycles are obtained, where the similarity between the historical work plans of the historical task cycles and the work plan of the current task cycle meets a preset requirement, and the historical task data includes: configuration data of each historical task and time-consuming data of executing each historical task;

Using a preset clustering model, clustering is performed on each of the historical task cycles according to the time-consuming data until a plurality of task time-consuming types are determined, and the ratio of the amount of data contained in each task time-consuming type to the total amount of the time-consuming data meets a preset ratio requirement;

Determining a scheduling relationship map based on the configuration data, and determining at least one timed detection task based on the multiple task time-consuming types and the scheduling relationship map, wherein the scheduling relationship map is used to represent the dependency relationship between the processing results of each of the historical tasks;

Based on the detection results of each of the scheduled detection tasks at each detection time point, it is predicted whether the probability of abnormality in the task scheduling of the target system meets the preset warning requirements; if so, one or more warning information is determined and output;

The method of using a preset clustering model to cluster each of the historical task cycles according to the time consumption data until multiple task time consumption types are determined includes:

Randomly extracting the time-consuming data of a plurality of the historical tasks from all the historical tasks as cluster centers;

Using the preset clustering model, and according to the cluster centers, performing the first clustering process on each of the historical tasks to determine one or more first time-consuming types;

Determine whether the proportion of the data volume in each of the first time-consuming types meets the preset proportion requirement;

If so, determining that the first time-consuming type is a task time-consuming type;

If not, re-determine the cluster center and re-perform the clustering process to re-determine the first time-consuming type until the data volume ratio corresponding to each first time-consuming type meets the preset ratio requirement;

The data volume ratio is used to represent the ratio of the data volume contained in the first time-consuming type to the total data volume of the time-consuming data.
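The re-clustering loop recited in claim 1, refined by claims 3 to 6, can be sketched for one-dimensional duration data as follows. This is a simplified illustration under stated assumptions, not the claimed clustering model: each round performs a single nearest-center assignment, the members of a deleted undersized type are reassigned to the remaining centers, the defaults `low=0.05` and `high=0.5` are merely picked from the value ranges in claim 4, and all function names are hypothetical.

```python
import random


def assign(durations, centers):
    """Group every duration with its nearest center; drop empty groups."""
    groups = [[] for _ in centers]
    for d in durations:
        nearest = min(range(len(centers)), key=lambda i: abs(d - centers[i]))
        groups[nearest].append(d)
    return [g for g in groups if g]


def shares_ok(groups, total, low, high):
    """True if every group's share of the data lies in [low, high]."""
    return all(low <= len(g) / total <= high for g in groups)


def cluster_until_balanced(durations, k=3, low=0.05, high=0.5,
                           max_rounds=50, seed=0):
    """Return (groups, balanced) after at most max_rounds re-clusterings."""
    rng = random.Random(seed)
    total = len(durations)
    # Randomly extracted durations serve as the initial cluster centers.
    groups = assign(durations, rng.sample(durations, k))
    for _ in range(max_rounds):
        if shares_ok(groups, total, low, high):
            break
        centers = []
        for g in groups:
            share = len(g) / total
            distinct = sorted(set(g))
            if share < low:
                continue  # delete the undersized type (members get reassigned)
            if share > high and len(distinct) >= 2:
                centers.extend(rng.sample(distinct, 2))  # split the oversized type
            else:
                centers.append(sum(g) / len(g))  # otherwise use the mean as center
        if not centers:
            break
        groups = assign(durations, centers)
    return groups, shares_ok(groups, total, low, high)
```

The second return value reports whether the proportion requirement was actually met, since a bounded number of rounds cannot guarantee it for every data set.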

2. The task scheduling anomaly monitoring method according to claim 1 is characterized in that one or more of the historical task cycles include: the previous task cycle closest to the current task cycle, or multiple consecutive task cycles closest to the current task cycle.

3. The task scheduling anomaly monitoring method according to any one of claims 1-2 is characterized in that the preset proportion requirement includes: the data volume proportion is greater than or equal to a first proportion threshold and less than or equal to a second proportion threshold.

4. The task scheduling anomaly monitoring method according to claim 3 is characterized in that a first value range of the first proportion threshold includes: 1%~10%, and a second value range of the second proportion threshold includes: 40%~60%.

5. The method for monitoring task scheduling anomalies according to claim 3, wherein the re-determining the cluster center and re-performing the clustering process comprises:

Deleting the first time-consuming type whose data volume accounts for less than the first proportion threshold; and/or,

Randomly selecting at least two of the historical tasks from each of the first time-consuming types whose data volume accounts for more than the second proportion threshold as new cluster centers;

For the first time-consuming type that meets the preset proportion requirement, re-determine a new cluster center according to a preset method;

The clustering process is re-performed according to each new cluster center by using a preset clustering model to determine a new first time-consuming type.

6. The method for monitoring task scheduling anomalies according to claim 5, wherein for the first time-consuming type that meets the preset proportion requirement, a new cluster center is re-determined according to a preset method, comprising:

When the first time-consuming type meets the preset proportion requirement, the average time-consuming of the first time-consuming type is used as the new cluster center.

7. The method for monitoring task scheduling anomalies according to claim 1, wherein determining at least one scheduled detection task based on the plurality of task time-consuming types and the scheduling relationship map comprises:

Determining a first target type and a second target type from each of the task time-consuming types according to preset screening requirements;

Determine a first fluctuation range and a second fluctuation range according to each of the time-consuming data in the first target type and the second target type using a preset fluctuation algorithm;

The detection object and detection time of each of the scheduled detection tasks are determined according to the scheduling relationship map, the first fluctuation range, the second fluctuation range, and the start time of the execution of the historical tasks in the time-consuming data.

8. The method for monitoring task scheduling anomalies according to claim 7, wherein the step of determining the first fluctuation range and the second fluctuation range based on the time-consuming data of each of the first target type and the second target type by using a preset fluctuation algorithm comprises:

determining the first fluctuation range according to a first average duration and a first standard deviation of all the duration data in the first target type;

The second fluctuation range is determined according to a second average duration and a second standard deviation of all the duration data in the second target type.

9. The method for monitoring task scheduling anomalies according to claim 8, wherein determining the first fluctuation range based on the first average duration and the first standard deviation of all the duration data in the first target type comprises:

The first fluctuation range is equal to the sum of the first average time consumption and N times the first standard deviation;

The determining the second fluctuation range according to the second average time consumption and the second standard deviation of all the time consumption data in the second target type includes:

The second fluctuation range is equal to the difference between the second average time consumption and M times the second standard deviation.

10. The task scheduling anomaly monitoring method according to any one of claims 7 to 9 is characterized in that the detection time includes: a first detection time and a second detection time, the first detection time includes: superimposing the first fluctuation range on the basis of the start time, and the second detection time includes: superimposing the second fluctuation range on the basis of the start time.

11. The method for monitoring task scheduling anomalies according to claim 10, wherein the step of predicting whether the probability of anomalies in task scheduling of the target system meets preset warning requirements based on the detection results of each of the scheduled detection tasks at each detection time point comprises:

If it is determined according to the detection result that the execution progress of the detection object at the first detection time is incomplete, then it is determined that a first probability that there is an abnormality in the execution progress of the task meets the preset warning requirement;

If it is determined according to the detection result that the execution progress of the detection object at the second detection time is completed, then it is determined that the second probability that there is an abnormality in the data level of the task scheduled by the target system meets the preset warning requirement.
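The two detection times of claim 10 and the two checks of claim 11 can be illustrated together. In this sketch the fluctuation ranges are expressed in seconds, and the names `Finding`, `detection_times`, and `evaluate` are hypothetical. The comments read the first range as an upper bound and the second as a lower bound on the expected duration, which follows from the mean + N·σ and mean − M·σ definitions in claim 9.

```python
from datetime import datetime, timedelta
from enum import Enum


class Finding(Enum):
    NORMAL = "normal"
    PROGRESS_ANOMALY = "execution progress anomaly"
    DATA_ANOMALY = "data-level anomaly"


def detection_times(start, first_range_s, second_range_s):
    """Superimpose each fluctuation range (in seconds) on the start time."""
    return (start + timedelta(seconds=first_range_s),
            start + timedelta(seconds=second_range_s))


def evaluate(which, finished):
    """Interpret one check; `which` is "first" or "second"."""
    if which == "first" and not finished:
        return Finding.PROGRESS_ANOMALY  # still incomplete past the upper bound
    if which == "second" and finished:
        return Finding.DATA_ANOMALY      # already complete before the lower bound
    return Finding.NORMAL
```

A task that starts at 02:00 with ranges of 5400 s and 1200 s would be checked at 03:30 (first detection time) and at 02:20 (second detection time).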

12. The method for monitoring task scheduling anomalies according to any one of claims 1 to 2 and 7 to 9, wherein determining and outputting one or more warning information comprises:

Calculating the correlation between the previous task and the next task in the scheduling relationship graph according to a preset correlation model;

If the correlation degree is within a first correlation interval, it is determined that the warning information includes first warning information and second warning information, and the warning levels of the first warning information and the second warning information are the same, the first warning information is used to indicate that a scheduling anomaly exists in the previous task and has an associated impact on the scheduling of the subsequent task, and the second warning information is used to indicate that the scheduling anomaly of the subsequent task is caused by a delay of the previous task;

The first warning information is output to the preceding task, and the second warning information is output to the succeeding task.

13. The method for monitoring task scheduling anomalies according to any one of claims 1 to 2 and 7 to 9, wherein determining and outputting one or more warning information comprises:

If the correlation degree is within a second correlation interval, it is determined that the warning information includes first warning information and second warning information, and the first warning level of the first warning information is greater than the second warning level of the second warning information, the first warning information is used to indicate that a scheduling anomaly exists in the previous task and has an associated impact on the scheduling of the subsequent task, and the second warning information is used to indicate that the scheduling anomaly of the subsequent task is caused by a delay of the previous task;

14. The method for monitoring task scheduling anomalies according to any one of claims 1 to 2 and 7 to 9, wherein determining and outputting one or more warning information comprises:

If the correlation degree is within a third correlation interval, the warning information is output to the previous task, where the warning information is used to indicate that a scheduling anomaly exists in the previous task.
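The routing of warnings by correlation interval in claims 12 to 14 can be sketched as below. The interval boundaries are not stated in this part of the text, so `strong=0.7` and `weak=0.3` are purely hypothetical, as is the assumption that the first interval is the one with the strongest correlation. Levels are modeled as integers, with a larger value meaning a more severe warning.

```python
def route_warnings(r, previous_task, next_task, level=3, strong=0.7, weak=0.3):
    """Return a list of (recipient, level, message) tuples.

    `strong` and `weak` are HYPOTHETICAL boundaries of the correlation
    intervals; the source does not give numeric values for them.
    """
    first_msg = (f"{previous_task} has a scheduling anomaly "
                 f"that affects the scheduling of {next_task}")
    second_msg = (f"scheduling anomaly of {next_task} is caused "
                  f"by a delay of {previous_task}")
    if r >= strong:   # first interval: both warnings at the same level
        return [(previous_task, level, first_msg),
                (next_task, level, second_msg)]
    if r >= weak:     # second interval: previous task gets the higher level
        return [(previous_task, level, first_msg),
                (next_task, level - 1, second_msg)]
    # third interval: only the previous task is warned
    return [(previous_task, level, f"{previous_task} has a scheduling anomaly")]
```

With these placeholder boundaries, `route_warnings(0.5, "A", "B")` yields two warnings, with the one for "A" at a higher level than the one for "B".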

15. The method for monitoring task scheduling anomalies according to any one of claims 1-2 and 7-9, wherein the warning information includes: a weight feedback link;

After determining and outputting one or more warning information, the method further includes:

receiving adjustment information input by the user through the weight feedback link;

The warning weight of the detection object corresponding to the scheduled detection task is adjusted according to the adjustment information.

16. The method for monitoring task scheduling anomalies according to claim 10, further comprising: when the scheduling anomaly is detected in the detection object at the first detection time, determining a third detection time for the detection object according to a preset delay time, wherein the detection object is a currently executing task;

When it is detected at the third detection time that the scheduling anomaly still exists in the currently executed task, determining a first warning level of the currently executed task according to the first preset warning weight of the currently executed task and the number of warning triggers;

Determining whether the first warning level meets the preset warning conditions;

If so, the warning information is sent again to the currently executing task.

17. The method for monitoring task scheduling anomalies according to claim 16, wherein when it is detected at the third detection time that the scheduling anomaly still exists in the currently executed task, the method further comprises:

Determine the correlation between the currently executed task and the next task according to the scheduling relationship map using a preset correlation model;

determining a second warning level of the next task according to the second warning weight of the next task, the correlation degree, and the number of warning triggering times;

Determining whether the second warning level meets the preset warning condition;

If so, the warning information is sent to the next task, where the warning information includes a prompt indicating that the scheduling anomaly of the next task is caused by the scheduling delay of the currently executed task.

18. A task scheduling anomaly monitoring device, comprising:

An acquisition module is configured to acquire historical task data of one or more historical task cycles based on a work plan of a current task cycle, wherein the similarity between the historical work plans of the historical task cycles and the work plan of the current task cycle meets a preset requirement, and the historical task data includes configuration data of each historical task and time-consuming data for executing each historical task;

Processing module for:

Determine a scheduling relationship map based on the configuration data, and determine at least one timed detection task based on the multiple task time-consuming types and the scheduling relationship map, wherein the scheduling relationship map is used to represent the dependency relationship between the processing results of each of the historical tasks;

Based on the detection results of each of the scheduled detection tasks at each detection time point, it is predicted whether the probability of abnormality in the task scheduling of the target system meets the preset warning requirements; if so, one or more warning information is determined;

An output module, configured to output the warning information to the detection object of the scheduled detection task;

The processing module is specifically used to:

19. An electronic device, comprising:

processor; and,

a memory for storing a computer program for the processor;

The processor is configured to execute the task scheduling anomaly monitoring method according to any one of claims 1 to 17 by executing the computer program.

20. A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the method for monitoring task scheduling anomalies according to any one of claims 1 to 17 is implemented.

21. A computer program product, comprising a computer program, wherein when the computer program is executed by a processor, the method for monitoring task scheduling anomalies according to any one of claims 1 to 17 is implemented.