
CN113407429B - Task processing method and device - Google Patents

Task processing method and device

Info

Publication number
CN113407429B
CN113407429B
Authority
CN
China
Prior art keywords
task
processing
batch
data
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110700169.8A
Other languages
Chinese (zh)
Other versions
CN113407429A (en)
Inventor
陈兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202110700169.8A
Publication of CN113407429A
Application granted
Publication of CN113407429B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00: Error detection; Error correction; Monitoring
    • G06F 11/30: Monitoring
    • G06F 11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F 11/3466: Performance evaluation by tracing or monitoring
    • G06F 11/3476: Data logging
    • G06F 11/36: Prevention of errors by analysis, debugging or testing of software
    • G06F 11/362: Debugging of software
    • G06F 11/3636: Debugging of software by tracing the execution of the program
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/54: Interprogram communication
    • G06F 9/542: Event management; Broadcasting; Multicasting; Notifications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a task processing method and device, and relates to the technical field of big data. One embodiment of the method comprises the following steps: determining a batch task to be executed, acquiring a slicing parameter from the attributes of the batch task, and judging whether the slicing parameter is set to a preset value; if so, determining a plurality of devices to process the task's data, distributing different data to the corresponding devices for processing, and aggregating the processing results to obtain an execution result; if not, determining a single device to process all the data together, and distributing the data to that device for processing to obtain the execution result. This embodiment enables rapid handling of the varied batch-processing demands of institutions (such as banking systems) and simplifies batch-processing development, operation, and maintenance.

Description

Task processing method and device
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a task processing method and apparatus.
Background
With the development of technology and the continuous promotion of banking business, the banking system needs to process batch business data in a large number, and the data types and the processing flows are complex. How to perform batch processing on data rapidly, stably and flexibly and smoothly transition the existing batch processing is a problem to be considered currently.
At present, various technical solutions provide batch data processing capabilities, such as Kettle and DataStage, with DataStage being the most widely used; however, these solutions have certain disadvantages, such as complex development flows, poor maintainability, and inconvenient configuration.
Taking DataStage as an example, developing a new batch job requires adding the job to DataStage and then adding a schedule to Control-M that invokes it. Control-M supports schedules configured on a daily basis, but is weak for batch jobs that must run several times a day. Data is processed in the traditional ETL mode, so transaction support is weak; and when a job flow comprises multiple steps, it is difficult to trace the specific cause of an error if job execution fails. In addition, existing batch jobs have complex processing chains and are difficult to modify when optimization is required.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a task processing method and apparatus, which can at least solve problems of existing task batch processing such as complex development flows, poor maintainability, and inconvenient configuration.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a task processing method including:
determining a batch task to be executed, acquiring a slicing parameter from the attribute of the batch task, and judging whether the slicing parameter is set to a preset value or not; wherein, a batch processing task comprises a plurality of data to be processed;
If the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, and summarizing the processing results to obtain an execution result; or
If the judgment result is negative, determining one device for processing the plurality of data together, and distributing the plurality of data to the one device together for processing to obtain an execution result.
Optionally, the determining a plurality of devices for processing the plurality of data includes:
and calculating the allocated data quantity of each device according to the current processor utilization rate and memory utilization rate of each device and the data quantity to be processed in the batch processing task by adopting a hash algorithm.
Optionally, the determining a device that processes the plurality of data together includes:
And screening one device with the minimum current load from a plurality of devices, and taking the one device as a target device for processing the plurality of data together.
Optionally, the determining a device that processes the plurality of data together includes:
Determining one or more devices for historically processing the batch task, counting the historical execution time of the batch task and the utilization rate of a processor when each batch task is processed, and further calculating the weight value of each device;
and screening one device with the maximum weight value from the one or more devices, and taking the one device as a target device for processing the plurality of data together.
Optionally, the allocating different data to the corresponding device for processing, or allocating the plurality of data to the one device together for processing, includes:
Acquiring a serial parameter from the attributes of the batch task, and judging whether the serial parameter is set to a first preset value; the serial parameter governs the multiple execution instances of the batch task, an instance being generated each time the batch task is executed;
if the judgment result is yes, processing data when the task execution period of the batch task is reached; or
if the judgment result is negative, acquiring the execution state of the instance of the batch task executed the previous time, and processing the data once that execution state indicates completion.
Optionally, the processing of the data once the execution state indicates completion further includes:
In the case of processing the plurality of data together, if the execution state is not completed when the execution cycle of the batch task is reached, another device that processes the plurality of data together is newly determined.
Optionally, the method further comprises: and pulling a plurality of batch processing tasks to be executed from the database/operation path, and arranging the batch processing tasks according to the pulling order to generate the task queue to be executed.
Optionally, the method further comprises: and when the data to be processed is a file, checking whether the file exists in the operation path based on a checking period, and if the file does not exist, executing the operation of waiting for the file.
Optionally, the attribute further includes a task number, a name, an executable period, and a task parameter; wherein the task parameters include specific parameters required for task execution.
Optionally, the method further comprises: and judging whether the current time period is within the executable time period, and if not, executing a waiting operation.
Optionally, the method further comprises: in the process of executing the batch processing task, monitoring an execution state by utilizing a monitoring mechanism, and if abnormality occurs, recording abnormal data, abnormality reasons and operation equipment to generate a task execution abnormal log.
Optionally, after the generating the task execution exception log, the method further includes:
Responding to the opening of the task execution exception log, and positioning an exception execution step according to the exception data and the exception reason; wherein a batch task comprises a plurality of execution steps.
Optionally, the method further comprises: if the abnormality occurs, a notification message or a pop-up reminding message is sent.
To achieve the above object, according to another aspect of an embodiment of the present invention, there is provided a task processing device including:
The judging module is used for determining a batch processing task to be executed, acquiring a slicing parameter from the attribute of the batch processing task and judging whether the slicing parameter is set to a preset value or not; wherein, a batch processing task comprises a plurality of data to be processed;
the slicing module is used for determining a plurality of devices for processing the plurality of data if the judgment result is yes, respectively distributing different data to the corresponding devices for processing, and summarizing the processing results to obtain an execution result; or
And the non-segmentation module is used for determining one device for processing the plurality of data together if the judgment result is negative, and distributing the plurality of data to the one device for processing together to obtain an execution result.
Optionally, the slicing module is configured to: and calculating the allocated data quantity of each device according to the current processor utilization rate and memory utilization rate of each device and the data quantity to be processed in the batch processing task by adopting a hash algorithm.
Optionally, the non-slicing module is configured to: and screening one device with the minimum current load from a plurality of devices, and taking the one device as a target device for processing the plurality of data together.
Optionally, the non-slicing module is configured to:
Determining one or more devices for historically processing the batch task, counting the historical execution time of the batch task and the utilization rate of a processor when each batch task is processed, and further calculating the weight value of each device;
and screening one device with the maximum weight value from the one or more devices, and taking the one device as a target device for processing the plurality of data together.
Optionally, the device further comprises a serial module for:
Acquiring a serial parameter from the attributes of the batch task, and judging whether the serial parameter is set to a first preset value; the serial parameter governs the multiple execution instances of the batch task, an instance being generated each time the batch task is executed;
if the judgment result is yes, processing data when the task execution period of the batch task is reached; or
if the judgment result is negative, acquiring the execution state of the instance of the batch task executed the previous time, and processing the data once that execution state indicates completion.
Optionally, the serial module is further configured to:
In the case of processing the plurality of data together, if the execution state is not completed when the execution cycle of the batch task is reached, another device that processes the plurality of data together is newly determined.
Optionally, the system further comprises a task pulling module for:
and pulling a plurality of batch processing tasks to be executed from the database/operation path, and arranging the batch processing tasks according to the pulling order to generate the task queue to be executed.
Optionally, the task pulling module is further configured to: and when the data to be processed is a file, checking whether the file exists in the operation path based on a checking period, and if the file does not exist, executing the operation of waiting for the file.
Optionally, the attribute further includes a task number, a name, an executable period, and a task parameter; wherein the task parameters include specific parameters required for task execution.
Optionally, the method further comprises an execution module for: and judging whether the current time period is within the executable time period, and if not, executing a waiting operation.
Optionally, the device further comprises an anomaly monitoring module for: in the process of executing the batch processing task, monitoring an execution state by utilizing a monitoring mechanism, and if abnormality occurs, recording abnormal data, abnormality reasons and operation equipment to generate a task execution abnormal log.
Optionally, the system further comprises an exception handling module, configured to: responding to the opening of the task execution exception log, and positioning an exception execution step according to the exception data and the exception reason; wherein a batch task comprises a plurality of execution steps.
Optionally, the method further comprises: if the abnormality occurs, a notification message or a pop-up reminding message is sent.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided a task processing electronic device.
The electronic equipment of the embodiment of the invention comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the task processing methods described above.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements any of the task processing methods described above.
According to the solution provided by the present invention, the above embodiment has the following advantages or beneficial effects: to make batch processing faster, more flexible, and more controllable, a private-cloud framework is combined with a general processing flow developed for the system, so that some batch-processing requirements can be developed and configured rapidly; the development efficiency and maintainability of complex batch flows are improved; and monitoring and notification of task and job execution states are supported.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic flow diagram of a task processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of an alternative task processing method according to an embodiment of the invention;
FIG. 3 is a flow chart of another alternative task processing method according to an embodiment of the invention;
FIG. 4 is a flow chart of yet another alternative task processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the main modules of a task processing device according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
Fig. 7 is a schematic diagram of a computer system suitable for use in implementing a mobile device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The words related to the scheme are explained as follows:
batch processing: an IT system processes data in two modes, online transactions and batch processing. An online transaction is initiated through various terminal devices and accessed through various channels; it requires a timely response, so processing is fast and timeliness is high. Batch processing refers to processing the data accumulated by a system over a certain period according to service requirements and a technical plan; the processing time is relatively long but the processing efficiency is high, and it can meet requirements such as end-of-day processing, statistical settlement, and report analysis.
Batch Task (Task): a job chain consisting of one or more batch jobs.
Batch Job (Job): a single batch job is one step in Task.
Batch processing Step (Step): one step in a single batch Job.
ETL: the abbreviation of Extract-Transform-Load, i.e., the process of data extraction, transformation, loading.
A detailed comparison of the prior art (Kettle and DataStage) with the present solution follows:
1. Kettle is an open-source ETL tool for processing, converting, and migrating various kinds of data. Kettle is developed on the client side, and it is difficult to complete configuration, development, and management when facing a large number of jobs. Its transaction support is weak and insufficient for complex job flows that require transaction control. The present solution provides three-level Task-Job-Step flow control, can be configured flexibly, inherits the complete transaction-control functionality of Spring Batch, and can meet the transaction-processing requirements of complex flows.
2. DataStage supports the functionality, flexibility, and scalability required by data integration, including various complex data transformations and processing. However, DataStage has high hardware requirements and a weak built-in scheduler, so additional scheduling and monitoring programs generally need to be developed when there are many jobs. Its transaction support is also weak and insufficient for complex job flows that require transaction control, and it is difficult to locate problems when handling job exceptions. The present solution integrates job scheduling and batch processing, can run in a single-machine environment, and also supports cluster mode. It supports flexible scheduling configuration and complete transaction processing, provides built-in batch tools, and makes job flows easy to customize. Its exception-handling flow can quickly locate the cause of a problem, which facilitates system operation and maintenance.
Referring to fig. 1, a main flowchart of a task processing method provided by an embodiment of the present invention is shown, including the following steps:
S101: determining a batch task to be executed, acquiring a slicing parameter from the attribute of the batch task, and judging whether the slicing parameter is set to a preset value or not; wherein, a batch processing task comprises a plurality of data to be processed;
S102: if the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, and summarizing the processing result to obtain an execution result;
S103: if the judgment result is negative, determining one device for processing the plurality of data together, and distributing the plurality of data to the one device together for processing to obtain an execution result.
In the above embodiment, for step S101, a task scheduling period, that is, the interval between task executions, is set in advance: for example, once per day at 22:00 each night, once per month, once every 5 minutes, or once every minute. A plurality of batch tasks to be executed are pulled from the database/job path according to the configured task scheduling period and arranged in pull order to generate the task queue to be executed, which contains the plurality of batch tasks. The job path here may be a specific location on disk, typically used for storing files. When the data to be processed is a file, whether the file exists in the job path must be checked based on a checking period; if the file does not exist, the system waits, and once it exists the data can be processed.
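The queue-generation step described above can be sketched as follows. This is an illustrative sketch only (the patent does not specify an implementation); the task names and the shape of the pulled-task list are invented for the example:

```python
from collections import deque

def build_task_queue(pulled_tasks):
    """Arrange batch tasks into a FIFO queue in the order they were
    pulled from the database/job path."""
    queue = deque()
    for task in pulled_tasks:
        queue.append(task)  # preserve pull order
    return queue

# Tasks pulled in one scheduling cycle are executed in pull order.
q = build_task_queue(["taskA", "taskB", "taskC"])
first = q.popleft()  # "taskA" is executed first
```

A plain FIFO deque matches the "arranged in pull order" behavior the text describes.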
A batch Task (Task) consists of one or more batch jobs (Job), each batch Job (Job) containing one or more batch steps (Step). Each batch Step (Step) includes specific operations and data, i.e., specific workflow logic processes, whereby a batch task contains a plurality of data to be processed.
The attributes pre-configured for the batch task to be executed are acquired, including a task number, name, task execution state, execution period, serial parameter, slicing parameter, executable period, task parameters (the specific parameters required for task execution), and so on. Before executing the task, it is first judged whether the current time is within the executable period; if not, a waiting operation is performed.
Slicing parameter: if the value is 1, the data to be processed may be distributed to a plurality of hosts for execution when the task runs; if the value is 0, the data to be processed must be distributed to a single host for execution.
Serial parameter: if the value is 0, each instance of the same task executes independently of the state of the previous instance; if the value is 1, each instance of the same task must wait for the previous instance to finish executing (whether successfully or not) before it runs; if the value is 2, each instance of the same task must wait for the previous instance to execute successfully.
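The attribute semantics above can be summarized in a small sketch; the field names and the dataclass shape are assumptions made for illustration, not the patent's actual data model:

```python
from dataclasses import dataclass

@dataclass
class BatchTaskAttrs:
    task_no: str
    shard_flag: int    # 1: data may be spread across multiple hosts; 0: single host
    serial_flag: int   # 0: instances independent; 1: wait for previous instance to
                       #    finish (success or failure); 2: wait for previous success

def describe(attrs: BatchTaskAttrs):
    """Translate the two flags into the behaviors described in the text."""
    shard = "multi-host" if attrs.shard_flag == 1 else "single-host"
    serial = {0: "independent", 1: "wait-finish", 2: "wait-success"}[attrs.serial_flag]
    return shard, serial
```

For example, a task with slicing parameter 1 and serial parameter 2 is dispatched to multiple hosts, and each new instance waits for the previous one to succeed.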
If within the executable period of the task, the method continues by judging the slicing flag in the attributes. If the judgment is yes, the data in the batch task supports sliced processing, and different data can be distributed to different hosts (also called devices) for processing; if the judgment is negative, slicing is not supported, and the data to be processed for the same batch task must be distributed to a single host. Assuming a batch cluster of 3 hosts executing a task once: in sliced mode, all 3 hosts execute simultaneously, each processing about one third of the data, which is mainly used when the data volume is large and processing must be relatively quick; in non-sliced mode, all the data must be distributed to a single host for execution.
In the method provided in the above embodiment, a slicing flag is set in the task attributes to determine whether to process a task's multiple data items in sliced mode. This enables rapid handling of the varied batch-processing demands of institutions (such as banking systems) and simplifies batch-processing development, operation, and maintenance.
Referring to fig. 2, a flowchart of an alternative task processing method according to an embodiment of the present invention is shown, including the following steps:
S201: determining a batch task to be executed, acquiring a slicing parameter from the attribute of the batch task, and judging whether the slicing parameter is set to a preset value or not; wherein, a batch processing task comprises a plurality of data to be processed;
S202: if the judgment result is yes, calculating each device's allocated data volume using a hash algorithm, according to each device's current processor utilization and memory utilization and the amount of data to be processed in the batch task, and summarizing the processing results to obtain an overall result;
S203: if the judgment result is negative, screening out the device with the smallest current load from the plurality of devices, and using it as the target device for processing the plurality of data together to obtain an execution result;
S204: if the judgment result is negative, determining one or more devices that historically processed the batch task, counting the batch task's historical execution times and the processor utilization during each processing, and then calculating a weight value for each device;
S205: screening out the device with the largest weight value from the one or more devices, and using it as the target device for processing the plurality of data together to obtain an execution result.
In the above embodiment, for step S201, reference may be made to the description of step S101 shown in fig. 1, and the description is omitted here.
For step S202, assume that the batch task to be executed contains 20 pieces of data to be processed and there are 10 hosts in total. When the task supports sliced execution, a hash algorithm is used: the allocation proportion of each host is calculated from its current CPU utilization and memory utilization, and that proportion is multiplied by the amount of data to be processed to obtain each host's allocated data volume. The algorithm ensures that the data is spread out, with the 10 hosts receiving substantially equal shares.
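The proportional split in step S202 can be sketched as below. The patent names a hash algorithm; this sketch only shows the proportional allocation by idle CPU and memory capacity, which is the part the text actually describes, and the capacity formula is an assumption for illustration:

```python
def allocate_counts(total_items, hosts):
    """hosts: list of (cpu_utilization, mem_utilization) pairs in [0, 1].
    Each host's share of the data is proportional to its idle capacity."""
    capacity = [(1 - cpu) + (1 - mem) for cpu, mem in hosts]
    total_capacity = sum(capacity)
    counts = [int(total_items * c / total_capacity) for c in capacity]
    # hand any rounding remainder to the most idle hosts
    remainder = total_items - sum(counts)
    for i in sorted(range(len(hosts)), key=lambda i: -capacity[i])[:remainder]:
        counts[i] += 1
    return counts
```

With 20 items and 10 equally loaded hosts, each host receives 2 items, matching the example in the text; a mostly idle host receives proportionally more than a busy one.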
For step S203, when the task does not support sliced execution, the host with the smallest current load may be selected from the 10 hosts to process the 20 pieces of data together. When there are many tasks to be processed, assignments such as task A to host 2 and task B to host 3 may result: each time, after determining which host handled the previous task, the host with the smallest current load is chosen, so one host may sometimes process multiple tasks.
Dynamic planning algorithms are typically used to solve problems with certain optimal properties. In such a problem, there may be many possible solutions, each corresponding to a value, for which it is desirable to find a solution with an optimal value. The basic idea is to decompose the problem to be solved into a number of sub-problems, solve the sub-problems first, and then obtain the solution of the original problem from the solutions of the sub-problems.
If there are 3 hosts and task A can feasibly be distributed to any of them, there are 3 possible solutions. A dynamic programming algorithm distributes the task to the host with the smallest load, that is, the optimal solution. Each time a new task execution instance is added, the optimal solution is searched for again. If 10 tasks are to be executed, this approach finally balances the task allocation across the 3 hosts.
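The balanced outcome described in the last two paragraphs can be sketched with a simple least-loaded assignment. Note this greedy sketch is a simplification of the dynamic-programming approach the text mentions; it is offered only to show the balanced-allocation result, and the unit load per task is an assumption:

```python
def assign_tasks(tasks, n_hosts):
    """Assign each incoming task to the currently least-loaded host."""
    load = [0] * n_hosts
    assignment = {}
    for task in tasks:
        host = min(range(n_hosts), key=lambda i: load[i])  # least-loaded host wins
        assignment[task] = host
        load[host] += 1  # each task adds one unit of load (assumption)
    return assignment, load
```

With 10 tasks and 3 hosts, this converges to the balanced split the text describes: loads of 4, 3, and 3.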
For steps S204 to S205, in addition to the above manner, when the task does not support sliced execution, the historical execution times of the task, the devices that historically processed it, and each device's CPU utilization while processing it may be counted to calculate a weight value for each device; the device with the largest weight value is then selected to process the task.
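The history-based weighting in steps S204 to S205 could look like the following sketch. The weight formula (favoring short historical execution times and low CPU utilization, balanced by `alpha`) is an assumption, since the patent does not specify one:

```python
def device_weight(exec_times, cpu_utils, alpha=0.5):
    """Weight a device by its history: faster past executions and lower
    CPU utilization both raise the weight (assumed formula)."""
    avg_time = sum(exec_times) / len(exec_times)
    avg_cpu = sum(cpu_utils) / len(cpu_utils)
    return alpha / avg_time + (1 - alpha) * (1 - avg_cpu)

def pick_device(history):
    """history maps device name -> (historical exec times, CPU utilizations);
    pick the device with the largest weight value."""
    return max(history, key=lambda d: device_weight(*history[d]))
```

For example, a host that historically ran the task in 5 to 6 seconds at 30 to 40 percent CPU outweighs one that took 10 to 12 seconds at 80 to 90 percent CPU.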
The method provided by this embodiment uses different strategies to determine the processing devices for sliced and non-sliced task execution, thereby achieving efficient data processing.
Referring to fig. 3, a flowchart of an alternative task processing method according to an embodiment of the present invention is shown, including the following steps:
S301: acquiring a serial parameter from the attributes of the batch processing task, and judging whether the serial parameter is set to a first preset value; the serial parameter corresponds to the multiple execution instances of the batch processing task, and each execution of the batch processing task generates one instance;
S302: if the judgment result is yes, processing the data when the task execution period of the batch processing task is reached;
S303: if the judgment result is no, acquiring the execution state of the instance from the previous execution of the batch processing task, and processing the data only when the execution state is "completed";
S304: in the case where the plurality of data are processed together, if the execution state is still not "completed" when the execution cycle of the batch processing task is reached, re-determining another device to process the plurality of data together.
In the above embodiment, for steps S301 to S304, if the serial flag in the batch task attributes is determined to be yes, the batch task does not support parallel execution; otherwise, parallel execution is supported. Parallel means that a task may have multiple instances at the same time; non-parallel means that a task has only one instance at a time.
Each execution of a task generates an instance, and each execution depends on the state result of the previous instance. For example, suppose task A is set to execute every 5 minutes in serial mode and is triggered at 19:55, 20:00 and 20:05, producing 3 instances. If the instance started at 19:55 has not finished by 20:00, another device to process task A is re-determined at 20:00. Likewise, if the instance started at 20:00 has not finished by 20:05, another device to process task A is re-determined at 20:05; and if it still has not finished by 20:10, another device is re-determined at 20:10.
For example, suppose task A is preset to execute at 22:00 every night, but the instance from the 1st has not finished by 22:00 on the 2nd. In the serial processing mode, the instance for the 2nd can only start after the instance for the 1st has finished, so it must keep waiting even though it is already 22:00 on the 2nd. In the parallel processing mode, the instance for the 2nd starts as soon as 22:00 on the 2nd arrives, regardless of whether or when the instance for the 1st finishes.
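The serial-versus-parallel trigger rule illustrated above can be sketched as a small gate. This is an illustrative simplification (the names are hypothetical): in parallel mode a new instance always starts on schedule, while in serial mode it starts only if the previous instance has completed:

```java
// Sketch of the serial-execution rule: when the serial flag is set, a new
// instance may start at its scheduled time only if the previous instance has
// finished; otherwise the trigger waits (and, per the patent, the processing
// device may be re-determined). Names are illustrative.
class SerialGate {
    enum State { RUNNING, COMPLETED, FAILED }

    // previous == null means this is the first instance of the task
    static boolean mayStart(boolean serialFlag, State previous) {
        if (!serialFlag) return true;  // parallel mode: always start on schedule
        return previous == null || previous == State.COMPLETED;
    }
}
```

For the 22:00 example: with the serial flag set and the previous instance still RUNNING, `mayStart` returns false, so the instance for the 2nd keeps waiting.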
In the method provided by this embodiment, the device processes the data in the task differently for the serial and parallel modes. In particular, in the serial mode the execution state of the previous instance is considered, and if that state does not meet the requirement, the processing device is determined anew, so that the task is processed in a timely manner.
Referring to fig. 4, a flow chart of another alternative task processing method according to an embodiment of the present invention is shown, including the following steps:
S401: determining a batch processing task to be executed, acquiring a slicing parameter from the attributes of the batch processing task, and judging whether the slicing parameter is set to a preset value; wherein one batch processing task comprises a plurality of data to be processed;
S402: if the judgment result is yes, determining a plurality of devices to process the plurality of data, allocating the different data to the corresponding devices for processing, and summarizing the processing results to obtain an execution result;
S403: if the judgment result is no, determining one device to process the plurality of data together, and allocating the plurality of data to that device for processing together to obtain an execution result;
S404: monitoring the execution state during execution of the batch processing task, and if an abnormality occurs, recording the abnormal data, the operating device and the abnormality reason to generate a task execution exception log;
S405: in response to the task execution exception log being opened, locating the abnormal execution step according to the abnormal data and the abnormality reason; wherein one batch processing task comprises a plurality of execution steps.
In the above embodiment, steps S401 to S403 may be understood with reference to the descriptions of fig. 1 to 3, and are not repeated here.
For steps S404 and S405, during task execution the event monitor monitors the execution state at each execution step and sends a notification message when an error occurs, which helps optimize the task configuration. A developer can later open the task execution exception log and locate the abnormal execution step from the exception data, exception reason and operating device recorded in the log, so as to pinpoint and analyze the specific problem.
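The patent builds this on Spring Batch's listener mechanism; the framework-free sketch below only mimics the record-on-error behavior described above (it is not the Spring Batch API, and all names are illustrative):

```java
import java.util.*;

// Framework-free sketch of the "enhanced event monitor": when processing a
// record fails, the monitor records the abnormal data, the abnormality
// reason, and the operating device, producing entries for a task-execution
// exception log that a developer can open later to locate the failure.
class ExceptionLog {
    static String entry(String taskId, String step, String badRecord,
                        String reason, String device) {
        return String.format("task=%s step=%s data=%s reason=%s device=%s",
                taskId, step, badRecord, reason, device);
    }

    static List<String> runSteps(String taskId, String device, List<String> records) {
        List<String> log = new ArrayList<>();
        for (String r : records) {
            try {
                if (r.isEmpty()) throw new IllegalArgumentException("empty record");
                // ... normal processing of the record would happen here ...
            } catch (RuntimeException e) {
                // on error: record data, reason, and device for the log
                log.add(entry(taskId, "process", r, e.getMessage(), device));
            }
        }
        return log;
    }
}
```

Each log entry carries exactly the three pieces the patent names — abnormal data, abnormality reason, and operating device — so the abnormal step can be located afterwards.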
In addition, the scheme can encapsulate job task parameters according to the task configuration and the job steps. Besides the task configuration attributes described earlier, the job task parameters include parameters generated only when the task starts to execute, such as the current business date, the task queue number, the data file path used by the task, the detection interval when waiting for a data file, the rollback file path, and the commit size for batch writes to the database. Since these can only be determined when the task starts to execute, encapsulation processing is required.
The task parameters are encapsulated in a first stage. In a second stage, because the parameters needed by each batch processing step may differ and each step has its own parameters, the required parameters are taken out of the task parameters, combined with the step's own parameters, and then assembled.
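The two-stage assembly described above can be sketched with plain maps. The parameter keys (`businessDate`, `taskNo`, ...) are hypothetical examples, not names taken from the patent:

```java
import java.util.*;

// Two-stage parameter assembly: stage one merges the static task-configuration
// attributes with runtime values (business date, queue number, file paths, ...)
// that are only known at launch time; stage two extracts, for each batch step,
// just the parameters that step actually needs.
class TaskParams {
    // Stage one: encapsulate config attributes plus runtime parameters.
    static Map<String, String> encapsulate(Map<String, String> configAttrs,
                                           Map<String, String> runtime) {
        Map<String, String> all = new HashMap<>(configAttrs);
        all.putAll(runtime); // runtime values override/extend the static config
        return all;
    }

    // Stage two: take out only the parameters a given step requires.
    static Map<String, String> forStep(Map<String, String> all, Set<String> needed) {
        Map<String, String> step = new HashMap<>();
        for (String k : needed) {
            if (all.containsKey(k)) step.put(k, all.get(k));
        }
        return step;
    }
}
```

A step then receives its extracted subset merged with its own step-level parameters, which is the "assembled" result the text refers to.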
In the method provided by this embodiment, the enhanced event monitor records the erroneous data and the error reason in a log file during task execution, making it convenient to quickly locate the specific problem from the reason and the data, and improving the runtime monitoring and the operation and maintenance of batch processing tasks.
Compared with the prior art, the method provided by the embodiment of the invention has at least the following beneficial effects:
1. The state of each stage of the job life cycle is easily mastered using the multidirectional event monitoring mechanism provided by Spring Batch. With the event-based exception handling mechanism, when task execution is abnormal the abnormal position can be accurately located, so that developers can handle it in a timely manner.
2. Based on the open-source batch processing framework Spring Batch and the Quartz scheduling framework, functions such as flow configuration, job monitoring and event notification are developed in a targeted manner in combination with a private cloud system. Development is in the Java language, so developers get up to speed quickly, the development flow is simple, and operation and maintenance are convenient, reflecting a strong technical innovation.
3. The whole scheme is easy to implement, convenient to configure, simple to operate and maintain, highly usable and quick to integrate. It addresses the problems that existing batch processing frameworks are too heavyweight, require professionally skilled developers, and cope poorly with large-volume requirements, thereby enabling rapid development of batch processing jobs and improving the development efficiency and maintainability of complex batch processing flows.
Referring to fig. 5, a schematic diagram of main modules of a task processing device 500 according to an embodiment of the present invention is shown, including:
a judging module 501, configured to determine a batch task to be executed, obtain a slicing parameter from an attribute of the batch task, and judge whether the slicing parameter is set to a preset value; wherein, a batch processing task comprises a plurality of data to be processed;
The slicing module 502 is configured to determine a plurality of devices for processing the plurality of data if the determination result is yes, respectively allocate different data to corresponding devices for processing, and summarize the processing result to obtain an execution result; or
The non-slicing module 503 is configured to determine one device that processes the plurality of data together if the determination result is negative, and allocate the plurality of data to the one device together for processing, so as to obtain an execution result.
In the embodiment of the present invention, the slicing module 502 is configured to:
a hash algorithm is adopted to calculate the amount of data allocated to each device according to each device's current processor utilization and memory utilization and the amount of data to be processed in the batch processing task.
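The patent names the inputs (CPU utilization, memory utilization, total data volume) but not the allocation formula. The sketch below assumes one plausible rule — each device's share of the records is proportional to its free capacity — purely as an illustration; the formula and names are not from the patent:

```java
// Hypothetical allocation rule (the patent mentions a hash algorithm but gives
// no formula): each device's share of the records is proportional to its free
// capacity, taken here as (1 - cpuUtil) * (1 - memUtil).
class SliceAllocator {
    static int[] allocate(int totalRecords, double[] cpuUtil, double[] memUtil) {
        int n = cpuUtil.length;
        double[] cap = new double[n];
        double sum = 0;
        for (int i = 0; i < n; i++) {
            cap[i] = (1 - cpuUtil[i]) * (1 - memUtil[i]);
            sum += cap[i];
        }
        int[] counts = new int[n];
        int assigned = 0;
        for (int i = 0; i < n; i++) {
            counts[i] = (int) Math.floor(totalRecords * cap[i] / sum);
            assigned += counts[i];
        }
        // hand any rounding remainder to the device with the most free capacity
        int best = 0;
        for (int i = 1; i < n; i++) if (cap[i] > cap[best]) best = i;
        counts[best] += totalRecords - assigned;
        return counts;
    }
}
```

For 20 records and two equally loaded devices this yields a 10/10 split; a device with more free capacity would receive proportionally more records.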
In the embodiment of the present invention, the non-slicing module 503 is configured to:
screen out the device with the smallest current load from the plurality of devices, and take it as the target device for processing the plurality of data together.
In the embodiment of the present invention, the non-slicing module 503 is configured to:
determine the one or more devices that have historically processed the batch processing task, count the historical execution times of the batch processing task and the processor utilization when each batch processing task was processed, and then calculate a weight value for each device;
screen out the device with the largest weight value from the one or more devices, and take it as the target device for processing the plurality of data together.
The apparatus in this embodiment of the invention further comprises a serial module, configured to:
acquire a serial parameter from the attributes of the batch processing task, and judge whether the serial parameter is set to a first preset value; the serial parameter corresponds to the multiple execution instances of the batch processing task, and each execution of the batch processing task generates one instance;
if the judgment result is yes, process the data when the task execution period of the batch processing task is reached; or
if the judgment result is no, acquire the execution state of the instance from the previous execution of the batch processing task, and process the data only when the execution state is "completed".
In this embodiment of the invention, the serial module is further configured to:
in the case where the plurality of data are processed together, if the execution state is still not "completed" when the execution cycle of the batch processing task is reached, re-determine another device to process the plurality of data together.
The apparatus in this embodiment of the invention further comprises a task pulling module, configured to:
pull a plurality of batch processing tasks to be executed from the database/job path, and arrange the batch processing tasks in pull order to generate the task queue to be executed.
In this embodiment of the invention, the task pulling module is further configured to: when the data to be processed is a file, check whether the file exists in the job path based on a detection period, and if the file does not exist, perform a wait-for-file operation.
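The wait-for-file operation amounts to polling the job path on a fixed detection interval. A minimal sketch (class and method names are illustrative):

```java
import java.nio.file.*;

// Sketch of the "wait for file" operation: check the job path on a fixed
// detection interval until the data file appears or a deadline passes.
class FileWaiter {
    static boolean waitForFile(Path file, long intervalMillis, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (Files.exists(file)) return true;  // file has arrived
            Thread.sleep(intervalMillis);         // wait one detection interval
        }
        return Files.exists(file);                // final check at the deadline
    }
}
```

A real task would typically keep waiting until the file arrives; the timeout here just keeps the sketch bounded.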
In this embodiment of the invention, the attributes further comprise a task number, a name, an executable period and task parameters; wherein the task parameters include the specific parameters required for task execution.
The apparatus in this embodiment of the invention further comprises an execution module, configured to:
judge whether the current time falls within the executable period, and if not, perform a waiting operation.
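The executable-period check can be sketched as a time-window test. The midnight-crossing case (e.g. a 22:00–02:00 window) is my assumption about how such a window would be interpreted, not something the patent specifies:

```java
import java.time.LocalTime;

// Sketch of the executable-period check: the task runs only inside its
// configured window; otherwise it waits. Also handles windows that cross
// midnight (an assumed behavior, e.g. 22:00-02:00).
class ExecWindow {
    static boolean inWindow(LocalTime now, LocalTime start, LocalTime end) {
        if (!start.isAfter(end)) {                        // e.g. 09:00-17:00
            return !now.isBefore(start) && !now.isAfter(end);
        }
        return !now.isBefore(start) || !now.isAfter(end); // e.g. 22:00-02:00
    }
}
```

The execution module would call a check like this and enter its waiting operation whenever the result is false.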
The apparatus in this embodiment of the invention further comprises an anomaly monitoring module, configured to:
monitor the execution state with a monitoring mechanism during execution of the batch processing task, and if an abnormality occurs, record the abnormal data, the abnormality reason and the operating device to generate a task execution exception log.
The apparatus in this embodiment of the invention further comprises an exception handling module, configured to:
in response to the task execution exception log being opened, locate the abnormal execution step according to the abnormal data and the abnormality reason; wherein one batch processing task comprises a plurality of execution steps.
The apparatus in this embodiment of the invention is further configured to: if an abnormality occurs, send a notification message or pop up a reminder message.
In addition, since the implementation of the apparatus has been described in detail in the above method embodiments, the description is not repeated here.
Fig. 6 shows an exemplary system architecture 600 in which embodiments of the invention may be applied, including terminal devices 601, 602, 603, a network 604, and a server 605 (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices that have a display screen and support web browsing, with various communication client applications installed; a user may use them to interact with the server 605 through the network 604, for example to receive or send messages.
The network 604 is used as a medium to provide communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The server 605 may be a server providing various services, for example determining the batch processing tasks to be executed and performing a slicing or non-slicing operation according to the task's slicing attribute.
It should be noted that, the method provided by the embodiment of the present invention is generally performed by the server 605, and accordingly, the apparatus is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, there is illustrated a schematic diagram of a computer system 700 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the system 700 are also stored. The CPU 701, ROM 702, and RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output portion 707 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is installed into the storage section 708 as necessary.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 701.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, for example as: a processor comprising a judging module, a slicing module and a non-slicing module. The names of these modules do not in any way constitute a limitation on the modules themselves; for example, the slicing module may also be described as a "sliced execution module".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be present alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to include:
determining a batch task to be executed, acquiring a slicing parameter from the attribute of the batch task, and judging whether the slicing parameter is set to a preset value or not; wherein, a batch processing task comprises a plurality of data to be processed;
If the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, and summarizing the processing result to obtain an execution result; or
If the judgment result is negative, determining one device for processing the plurality of data together, and distributing the plurality of data to the one device together for processing to obtain an execution result.
According to the technical scheme of the embodiments of the invention, a slicing flag is set in the task attributes to judge whether a plurality of data in the task are processed in sliced fashion. This implementation enables rapid handling of the various batch processing demands of institutions (such as banking systems) and simplifies the development, operation and maintenance of batch processing.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (16)

1. A method of task processing, comprising:
Determining a batch task to be executed, acquiring a slicing parameter from the attribute of the batch task, and judging whether the slicing parameter is set to a preset value or not; wherein, a batch task is composed of one or more batch jobs, each batch job comprises one or more batch steps, each batch step comprises a specific data group containing a plurality of data to be processed by the batch task; the attribute also comprises serial parameters, the serial parameters correspond to multiple execution examples of the batch processing task, each time the batch processing task is executed, an example is generated, and each time the batch processing task is executed, the execution depends on the state result of the previous example; the serial parameters are also used to represent: each instance of the same task must wait for the previous instance to execute successfully;
If the judgment result is yes, determining a plurality of devices for processing the plurality of data, respectively distributing different data to corresponding devices for processing, and summarizing the processing result to obtain an execution result; or
If the judgment result is negative, determining one device for processing the plurality of data together, and distributing the plurality of data to the one device together for processing to obtain an execution result.
2. The method of claim 1, wherein the determining a plurality of devices that process the plurality of data comprises:
and calculating the allocated data quantity of each device according to the current processor utilization rate and memory utilization rate of each device and the data quantity to be processed in the batch processing task by adopting a hash algorithm.
3. The method of claim 1, wherein said determining a device that processes said plurality of data together comprises:
And screening one device with the minimum current load from a plurality of devices, and taking the one device as a target device for processing the plurality of data together.
4. The method of claim 1, wherein said determining a device that processes said plurality of data together comprises:
Determining one or more devices for historically processing the batch task, counting the historical execution time of the batch task and the utilization rate of a processor when each batch task is processed, and further calculating the weight value of each device;
and screening one device with the maximum weight value from the one or more devices, and taking the one device as a target device for processing the plurality of data together.
5. The method of claim 1, wherein the assigning the different data to the respective device for processing or the plurality of data together to the one device for processing comprises:
Acquiring serial parameters from the attribute of the batch task, and judging whether the serial parameters are set to a first preset value or not;
if the judgment result is yes, processing data when the task execution period of the batch processing task is reached; or
If the judgment result is negative, the execution state of the instance of the batch processing task executed at the previous time is obtained, and the data is processed again when the execution state is the execution completion.
6. The method of claim 5, wherein the reprocessing the data if the execution status is complete, further comprises:
In the case of processing the plurality of data together, if the execution state is not completed when the execution cycle of the batch task is reached, another device that processes the plurality of data together is newly determined.
7. The method as recited in claim 1, further comprising:
and pulling a plurality of batch processing tasks to be executed from the database/operation path, and arranging the batch processing tasks according to the pulling order to generate a task queue to be executed.
8. The method as recited in claim 7, further comprising: and when the data to be processed is a file, checking whether the file exists in the operation path based on a checking period, and if the file does not exist, executing the operation of waiting for the file.
9. The method of any of claims 1-8, wherein the attributes further comprise a task number, a name, an executable period, and a task parameter; wherein the task parameters include specific parameters required for task execution.
10. The method as recited in claim 9, further comprising:
And judging whether the current time period is within the executable time period, and if not, executing a waiting operation.
11. The method as recited in claim 1, further comprising:
in the process of executing the batch processing task, monitoring an execution state by utilizing a monitoring mechanism, and if abnormality occurs, recording abnormal data, abnormality reasons and operation equipment to generate a task execution abnormal log.
12. The method of claim 11, further comprising, after the generating the task execution exception log:
Responding to the opening of the task execution exception log, and positioning an exception execution step according to the exception data and the exception reason; wherein a batch task comprises a plurality of execution steps.
13. The method as recited in claim 11, further comprising: if the abnormality occurs, a notification message or a pop-up reminding message is sent.
14. A task processing device, comprising:
The judging module is used for determining a batch processing task to be executed, acquiring a slicing parameter from the attribute of the batch processing task and judging whether the slicing parameter is set to a preset value or not; wherein, a batch task is composed of one or more batch jobs, each batch job comprises one or more batch steps, each batch step comprises a specific data group containing a plurality of data to be processed by the batch task; the attribute also comprises serial parameters, the serial parameters correspond to multiple execution examples of the batch processing task, each time the batch processing task is executed, an example is generated, and each time the batch processing task is executed, the execution depends on the state result of the previous example; the serial parameters are also used to represent: each instance of the same task must wait for the previous instance to execute successfully;
the slicing module is used for determining a plurality of devices for processing the plurality of data if the judgment result is yes, respectively distributing different data to the corresponding devices for processing, and summarizing the processing result to obtain an execution result; or
And the non-segmentation module is used for determining one device for processing the plurality of data together if the judgment result is negative, and distributing the plurality of data to the one device for processing together to obtain an execution result.
15. An electronic device, comprising:
One or more processors;
storage means for storing one or more programs,
When executed by the one or more processors, causes the one or more processors to implement the method of any of claims 1-13.
16. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-13.
CN202110700169.8A 2021-06-23 2021-06-23 Task processing method and device Active CN113407429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110700169.8A CN113407429B (en) 2021-06-23 2021-06-23 Task processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110700169.8A CN113407429B (en) 2021-06-23 2021-06-23 Task processing method and device

Publications (2)

Publication Number Publication Date
CN113407429A CN113407429A (en) 2021-09-17
CN113407429B true CN113407429B (en) 2024-07-19

Family

ID=77682691

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110700169.8A Active CN113407429B (en) 2021-06-23 2021-06-23 Task processing method and device

Country Status (1)

Country Link
CN (1) CN113407429B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113495784B (en) * 2021-07-27 2024-03-19 中国银行股份有限公司 Method and device for data batch processing
CN115048199A (en) * 2022-05-25 2022-09-13 上海浦东发展银行股份有限公司 Batch processing method based on distributed architecture
CN116107713A (en) * 2023-01-17 2023-05-12 江铃汽车股份有限公司 Batch processing task execution method, system, storage medium and equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929585A (en) * 2012-09-25 2013-02-13 上海证券交易所 Batch processing method and system supporting multi-master distributed data processing
CN110308980A (en) * 2019-06-27 2019-10-08 深圳前海微众银行股份有限公司 Data batch processing method, device, equipment and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010176303A (en) * 2009-01-28 2010-08-12 Nippon Yunishisu Kk Batch processing system, information terminal apparatus for use in the same, and method for recovering batch processing
CN109376004B (en) * 2018-08-20 2024-10-18 中国平安人寿保险股份有限公司 Cluster calculation-based data batch processing method and device, electronic equipment and medium
CN109144731B (en) * 2018-08-31 2024-08-09 中国平安人寿保险股份有限公司 Data processing method, device, computer equipment and storage medium
CN110008018B (en) * 2019-01-17 2023-08-29 创新先进技术有限公司 Batch task processing method, device and equipment
CN110113387A (en) * 2019-04-17 2019-08-09 深圳前海微众银行股份有限公司 A kind of processing method based on distributed batch processing system, apparatus and system
CN111061762A (en) * 2019-11-08 2020-04-24 京东数字科技控股有限公司 Distributed task processing method, related device, system and storage medium
CN111400012A (en) * 2020-03-20 2020-07-10 中国建设银行股份有限公司 Data parallel processing method, device, equipment and storage medium
CN111679920A (en) * 2020-06-08 2020-09-18 中国银行股份有限公司 Method and device for processing batch equity data
CN111767126A (en) * 2020-06-15 2020-10-13 中国建设银行股份有限公司 System and method for distributed batch processing
CN112148711B (en) * 2020-09-21 2023-04-25 建信金融科技有限责任公司 Batch processing task processing method and device

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN102929585A (en) * 2012-09-25 2013-02-13 上海证券交易所 Batch processing method and system supporting multi-master distributed data processing
CN110308980A (en) * 2019-06-27 2019-10-08 深圳前海微众银行股份有限公司 Data batch processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113407429A (en) 2021-09-17

Similar Documents

Publication Publication Date Title
US11005969B2 (en) Problem solving in a message queuing system in a computer network
CN113407429B (en) Task processing method and device
CN108595316B (en) Lifecycle management method, manager, device, and medium for distributed application
CN107729139B (en) Method and device for concurrently acquiring resources
US8140591B2 (en) Enabling workflow awareness within a business process management (BPM) system
CN109446274B (en) Method and device for managing BI metadata of big data platform
EP3901773A1 (en) Dynamically allocated cloud worker management system and method therefor
US9497096B2 (en) Dynamic control over tracing of messages received by a message broker
CN111045911B (en) Performance test method, performance test device, storage medium and electronic equipment
US11803421B2 (en) Monitoring health status of a large cloud computing system
CN113127057B (en) Method and device for parallel execution of multiple tasks
US11861397B2 (en) Container scheduler with multiple queues for special workloads
CN114418403A (en) Order distribution method, apparatus, equipment and storage medium
CN113360368A (en) Method and device for testing software performance
CN112269672B (en) File downloading exception handling method and device and electronic equipment
CN109213743B (en) Data query method and device
CN114816477A (en) Server upgrading method, device, equipment, medium and program product
CN113760482B (en) Task processing method, device and system
CN111212112B (en) Information processing method and device
US20130145004A1 (en) Provisioning using presence detection
WO2020047390A1 (en) Systems and methods for hybrid burst optimized regulated workload orchestration for infrastructure as a service
CN119444125B (en) Document process management method, device, storage medium and server
CN114090206B (en) Method, device, storage medium and equipment for adjusting workflow task executor
US20240020171A1 (en) Resource and workload scheduling
CN118426933A (en) Method, device, electronic equipment and storage medium for processing business data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant