CN109981731B

CN109981731B - Data processing method and equipment

Info

Publication number: CN109981731B
Application number: CN201910117769.4A
Authority: CN
Inventors: 余健伟; 王耀晖; 杨帆; 张成松
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2019-02-15
Filing date: 2019-02-15
Publication date: 2021-06-15
Anticipated expiration: 2039-02-15
Also published as: CN109981731A

Abstract

The embodiment of the invention discloses a data processing method, which comprises the following steps: acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks; acquiring the load of each server in a server cluster, and distributing the acquisition task to each server based on the load of each server; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform. The embodiment of the invention also discloses a data processing device.

Description

Data processing method and equipment

Technical Field

The present invention relates to data processing technologies in the field of communications, and in particular, to a data processing method and device.

Background

At present, different property data are stored in a distributed manner in a corresponding data warehouse, for example, the data may include Device Control System (DCS) real-time data, Laboratory Information Management System (LIMS) data, Quality Management System (QMS) data, and the like. With the gradual and extensive application of big data, how to efficiently pull each mass data dispersed in different warehouses to a big data platform in real time and adopt different acquisition strategies to treat different weight data simultaneously becomes a big problem of a big data platform data acquisition link gradually.

In order to solve the above problems, the following technical solutions appear in the related art: and according to the scheme A, different acquisition services are developed aiming at different data, and a plurality of acquisition services are deployed for acquisition by splitting a data block in a warehouse of mass data. Each acquisition service is independent, and the main and standby acquisition services are deployed aiming at the same data block acquisition service, so that high availability is realized. And B, deploying a plurality of services in a distributed mode, and collecting the services according to the set collection tasks. And in the scheme C, the active pulling of the data is changed into the active pushing of the data to the big data platform by different data sources. However, the above scheme has the problems that there is no correlation between services, the required resources are excessive, the acquisition cannot be balanced, and the acquisition cannot be highly available.

Disclosure of Invention

In order to solve the above technical problems, embodiments of the present invention are intended to provide a data processing method and device, which solve the problem that there is no association between services in the relative technologies, reduce required resources, and implement balanced acquisition; meanwhile, high availability of service is realized in the acquisition process.

The technical scheme of the invention is realized as follows:

a method of data processing, the method comprising:

acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks;

acquiring the load of each server in a server cluster, and distributing the acquisition task to each server based on the load of each server; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform.

Optionally, the obtaining a load of each server in the server cluster, and distributing the collection task to each server based on the load of each server includes:

acquiring a first server which is in an abnormal state and is distributed with a first acquisition task in the server cluster through the heartbeat information of each server;

if the first acquisition task is matched with the acquisition task of the data block, setting the service state of the first server as a first state, and setting the identification of the acquisition state of the first acquisition task as a first identification; wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.

acquiring a second acquisition task which needs to be redistributed to the server and is not acquired in the acquisition tasks of the data blocks;

acquiring a second server which has a normal service state and a task number smaller than a rated task number and runs on the server from the server cluster; the rated task number refers to the maximum operable task number on the server;

distributing the second collection task to the second server according to the rated task number of the second server;

if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server to be a second state; and the second state indicates that the service state of the server is normal and the number of running tasks is equal to the rated number of tasks.

acquiring a third acquisition task which is abnormal in acquisition and is not acquired in acquisition tasks of the data block;

acquiring a second server with the number of tasks running on the server being smaller than the rated number of tasks and the service state being normal from the server cluster, and calculating the number of remaining executable tasks on the second server;

and distributing the third acquisition task to the second server based on the number of the tasks which remain to be operated on the second server.

Optionally, the method further includes:

if the third acquisition task comprises a first sub-acquisition task which is not allocated to the server and the number of tasks running on the second server is equal to the rated number of tasks, setting the service state of the second server to be a second state, and acquiring a third server which has the number of tasks running on the server equal to the rated number of tasks and has a normal service state from the server cluster; the third server comprises a second server;

and calculating the number of the remaining executable tasks on the third server, and sequentially distributing the first sub-collection tasks to the third server based on the number of the remaining executable tasks on the third server.

Optionally, the method further includes:

if a second sub-collection task which is not distributed to the server exists in the first sub-collection tasks and the number of tasks running on the third server is larger than the rated number of tasks, setting the service state of the third server to be a third state, and acquiring a fourth server which has the normal service state and the number of tasks running on the server larger than the rated number of tasks from the server cluster; the fourth server comprises a third server, and the third state indicates that the service state of the server is normal and the number of running tasks is greater than the rated number of tasks;

and sequentially distributing the second sub-collection tasks to the fourth server.

acquiring a fourth server with normal service state and the number of tasks running on the server being greater than the rated number of tasks from the server cluster, and acquiring a second server with normal service state and the number of tasks running on the server being less than the rated number of tasks;

calculating the number of tasks which are remained and can be operated on the second server based on the rated number of tasks of the second server and the number of tasks which are operated on the second server;

stopping the thread of the collection task exceeding the rated task in the fourth server;

distributing a first number of acquisition tasks exceeding the rated tasks in the fourth server to the second server, and setting the acquisition state identifiers of the first number of acquisition tasks exceeding the rated tasks as second identifiers; wherein the second identifier indicates that the collection task reallocates servers and has been currently collected; the first number is the number of tasks remaining operable on the second server.

acquiring a fourth acquisition task which is distributed with a server and is not acquired in the acquisition tasks of the data block;

creating and executing an acquisition task thread, and setting an acquisition state identifier of the fourth acquisition task as a third identifier; wherein the third identifier indicates that an acquisition task is currently being acquired;

acquiring a fifth acquisition task which needs to be reallocated to the server and is acquired in the acquisition tasks of the data blocks;

suspending a thread for acquiring the fifth acquisition task, and setting an identifier of an acquisition state of the fifth acquisition task as a fourth identifier; wherein the fourth identification indicates that the collection task needs to reallocate servers and is not currently collected.

Optionally, the method further includes:

setting a task name of each data block acquisition task and a service address corresponding to each server;

defining a first relation among a task name of each data block acquisition task, an acquisition state of each data block acquisition task, a service address corresponding to each data block acquisition task and a meaning indicating a condition about the acquisition task;

defining a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server and the meaning indicating the operation condition of the server; wherein the first relationship and the second relationship are used when distributing the collection task to the server.

A data processing apparatus, the apparatus comprising: a processor, a memory, and a communication bus;

the communication bus is used for realizing communication connection between the processor and the memory;

the processor is configured to execute a data processing program stored in the memory to implement the steps of:

The data processing method and the data processing device provided by the embodiment of the invention are used for acquiring data to be uploaded, converting the data to be uploaded into acquisition tasks of a plurality of data blocks, acquiring the load of each server in a server cluster, and distributing the acquisition tasks to each server based on the load of each server; the server cluster comprises at least two servers for acquiring data to be uploaded and uploading the data to the data processing platform, so that acquisition tasks are distributed to the servers based on the load of each server without deploying one server for each data block, the problem that services in the relative technology are not related is solved, required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.

Drawings

Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention;

FIG. 2 is a flow chart illustrating another data processing method according to an embodiment of the present invention;

FIG. 3 is a flow chart illustrating a further data processing method according to an embodiment of the present invention;

fig. 4 is a flowchart illustrating a data processing method according to another embodiment of the present invention;

FIG. 5 is a flow chart illustrating another data processing method according to another embodiment of the present invention;

FIG. 6 is a flowchart illustrating a data processing method according to another embodiment of the present invention;

fig. 7 is a flowchart illustrating a data processing method according to another embodiment of the present invention;

fig. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.

Detailed Description

The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

An embodiment of the present invention provides a data processing method, which is shown in fig. 1 and includes the following steps

Step 101, acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks.

The acquisition task of acquiring the data to be uploaded and converting the data to be uploaded into a plurality of data blocks in the step 101 can be realized by data processing equipment; the data to be uploaded may include Device Control System (DCS) real-time data, Laboratory Information Management System (LIMS) data, Quality Management System (QMS) data, and the like. In the embodiment of the application, the data to be uploaded is abstracted into a plurality of data blocks.

And 102, acquiring the load of each server in the server cluster, and distributing an acquisition task to each server based on the load of each server.

The server cluster comprises at least two servers used for collecting data to be uploaded and uploading the data to the data processing platform.

Step 102, acquiring the load of each server in the server cluster, and distributing the acquisition task to each server based on the load of each server can be realized by data processing equipment; the data processing device may determine a service state of the server according to a load of the server and allocate the acquisition task to the server according to the service state of the server.

The data processing method provided by the embodiment of the invention comprises the steps of acquiring data to be uploaded, converting the data to be uploaded into acquisition tasks of a plurality of data blocks, acquiring the load of each server in a server cluster, and distributing the acquisition tasks to each server based on the load of each server; the server cluster comprises at least two servers for acquiring data to be uploaded and uploading the data to the data processing platform, so that acquisition tasks are distributed to the servers based on the load of each server without deploying one server for each data block, the problem that services in the relative technology are not related is solved, required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.

Based on the foregoing embodiments, an embodiment of the present invention provides a data processing method, which is shown in fig. 2 and includes the following steps:

step 201, the data processing device obtains data to be uploaded and converts the data to be uploaded into a collection task of a plurality of data blocks.

Step 202, the data processing device obtains a first server in an abnormal state and allocated with a first acquisition task in the server cluster through the heartbeat information of each server.

The data processing equipment can determine the service state of the server according to the heartbeat information of the server by monitoring the heartbeat information of each server in real time; acquiring servers which are in abnormal states and distributed with collection tasks from a server cluster after the service state of each server is determined; and, the first collection task may be any task that collects data to be uploaded.

Step 203, if the first collection task is matched with the collection task of the data block, the data processing device sets the service state of the first server to be a first state, and sets the identification of the collection state of the first collection task to be a first identification.

Wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.

In other embodiments of the present invention, if the first acquisition task belongs to an acquisition task of a data block, it is said that the first acquisition task matches the acquisition task of the data block; at this time, the data processing apparatus may set the service state of the first server to an abnormal state, and set the collection state of the first collection task to be not collected.

In other embodiments of the present invention, as shown with reference to fig. 3, the method may further comprise the steps of:

and step 204, the data processing equipment acquires a second acquisition task which needs to be redistributed to the server and is not acquired in the acquisition tasks of the data blocks.

And step 205, the data processing device acquires a second server from the server cluster, wherein the number of the tasks running on the server is smaller than the rated number of the tasks and the service state of the second server is normal.

The rated task number refers to the maximum number of tasks that can be run on the server.

And step 206, the data processing equipment distributes the second acquisition task to the second server according to the rated task number of the second server.

And allocating the second acquisition task to the second server according to the maximum number of tasks which can be operated on the second server.

And step 207, if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server to be a second state by the data processing equipment.

And the second state indicates that the service state of the server is normal and the number of running tasks is equal to the rated number of tasks.

In another embodiment of the present invention, if the number of tasks running on the second server reaches the maximum number of tasks that can be run on the second server, the data processing apparatus may set the service state of the second server to be the normal state, and the number of tasks running on the second server is the rated number of tasks.

Wherein, the step 204-207 can also be executed after the step 201.

It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.

According to the data processing method provided by the embodiment of the invention, when the acquisition task is distributed to the servers, the acquisition task is carried out based on the load of each server, and a server does not need to be deployed for each data block, so that the problem that the services in the relative technology are not related is solved, the required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.

Based on the foregoing embodiments, an embodiment of the present invention provides a data processing method, which is shown in fig. 4 and includes the following steps:

step 301, the data processing device obtains data to be uploaded and converts the data to be uploaded into a collection task of a plurality of data blocks.

Step 302, the data processing device obtains, through the heartbeat information of each server, a first server in an abnormal state and allocated with a first acquisition task in the server cluster.

Step 303, if the first collection task is matched with the collection task of the data block, the data processing device sets the service state of the first server to be a first state, and sets the identification of the collection state of the first collection task to be a first identification.

And step 304, acquiring a third acquisition task which is abnormal and is not acquired in the acquisition tasks of the data blocks by the data processing equipment.

The third collection task may refer to a collection task that has not been collected by any server and collects an exception.

And 305, the data processing equipment acquires a second server which has the number of tasks running on the server smaller than the rated number of tasks and is in a normal service state from the server cluster, and calculates the number of remaining executable tasks on the second server.

The second server is a server which can run other acquisition tasks on the servers in the server cluster; the number of the tasks which can be operated on the second server is obtained by subtracting the number of the tasks which are operated on the second server from the rated number of the tasks of the second server.

And step 306, the data processing device allocates the third collection task to the second server based on the number of the remaining operable tasks on the second server.

And the number of the acquisition tasks which can be distributed on the second server is less than or equal to the number of the remaining operable tasks on the second server.

Step 307, if the first sub-collection task which is not allocated to the server exists in the third collection task and the number of tasks running on the second server is equal to the rated number of tasks, the data processing device sets the service state of the second server to be the second state, and obtains the third server which has the number of tasks running on the server equal to the rated number of tasks and has a normal service state from the server cluster.

Wherein the third server comprises a second server.

In other embodiments of the present invention, if there is a collection task that has not been allocated to a server in the third collection task and the number of tasks running on the second server reaches the maximum number of tasks that can run on the second server, the data processing device may set the service state of the second server to be a normal state and the number of tasks running on the second server to be a rated number of tasks.

And 308, calculating the number of the remaining executable tasks on the third server by the data processing equipment, and sequentially distributing the first sub-collection tasks to the third server based on the number of the remaining executable tasks on the third server.

And the residual operable task number on the third server is obtained by subtracting the current operating task number on the second server from the rated task number of the third server. When the acquisition tasks are allocated to the third server, the first sub-acquisition tasks can be sequentially allocated to the third server until the first sub-acquisition tasks are allocated and the number of tasks running on the third server is less than or equal to the rated number of tasks; or sequentially distributing the first sub-collection tasks to the third server until the number of tasks running on the third server is equal to the rated number of tasks.

In other embodiments of the present invention, as illustrated with reference to FIG. 5, the method further comprises the steps of:

step 309, if a second sub-collection task which is not allocated to the server exists in the first sub-collection tasks and the number of tasks running on the third server is greater than the rated number of tasks, the data processing device sets the service state of the third server to be a third state, and obtains a fourth server which has the normal service state and the number of tasks running on the server is greater than the rated number of tasks from the server cluster.

The fourth server comprises a third server, and the third state indicates that the service state of the server is normal and the number of running tasks is greater than the rated number of tasks.

In other embodiments of the present invention, if there is a collection task that has not been allocated to a server in the first sub-collection task, and the number of tasks running on the third server exceeds the maximum number of tasks that can run on the third server, the data processing device may set the service state of the third server to be a normal state, and the number of tasks running on the third server is greater than the rated number of tasks.

And 310, the data processing equipment sequentially distributes the second sub-collection tasks to the fourth server.

When the acquisition tasks are allocated to the fourth server, the second sub-acquisition tasks can be sequentially allocated to the fourth server until the second sub-acquisition tasks are allocated and the number of tasks running on the fourth server is less than or equal to the rated number of tasks; or sequentially distributing the second sub-collection tasks to the fourth server until the number of tasks running on the fourth server is equal to the rated number of tasks.

Wherein, the step 304 and the step 310 can also be executed after the step 301.

Based on the foregoing embodiments, an embodiment of the present invention provides a data processing method, which is shown in fig. 6 and includes the following steps:

step 401, the data processing device obtains data to be uploaded and converts the data to be uploaded into a collection task of a plurality of data blocks.

Step 402, the data processing device obtains a first server in an abnormal state and allocated with a first acquisition task in the server cluster through the heartbeat information of each server.

Step 403, if the first collection task matches with the collection task of the data block, the data processing device sets the service state of the first server to be the first state, and sets the identification of the collection state of the first collection task to be the first identification.

Step 404, the data processing device obtains a fourth server with normal service status and the number of tasks running on the server being greater than the rated number of tasks from the server cluster, and obtains a second server with normal service status and the number of tasks running on the server being less than the rated number of tasks.

Step 405, the data processing device calculates the number of remaining executable tasks on the second server based on the rated number of tasks of the second server and the number of tasks running on the second server.

And the number of the tasks which can be operated on the second server is obtained by subtracting the number of the tasks which are operated on the second server from the rated number of the tasks of the second server.

Step 406, the data processing device stops the thread of the collection task in the fourth server that exceeds the rated task.

After the thread of the collection task exceeding the rated task in the fourth server is stopped, the fourth server can not provide any service for the collection task exceeding the rated task in the fourth server.

Step 407, the data processing device allocates the first number of collection tasks exceeding the rated tasks in the fourth server to the second server, and sets the identifiers of the collection states of the first number of collection tasks exceeding the rated tasks as second identifiers.

Wherein the second identifier indicates that the collection task has reallocated servers and has been currently collected; the first number is the number of tasks remaining executable on the second server.

In other implementations of the invention, a first number of collection tasks in the fourth server that exceed the rated task may be assigned to other servers that may serve them; the first number is the number of tasks remaining operable on the other servers; alternatively, the first number is the number of acquisition tasks of the acquisition tasks in the fourth server that exceed the rated task.

In other embodiments of the present invention, as shown in fig. 7, the following steps may also be performed after step 403:

and step 408, the data processing equipment acquires a fourth acquisition task which is distributed with the server and is not acquired in the acquisition tasks of the data blocks.

Step 409, the data processing device creates and executes an acquisition task thread, and sets the acquisition state identifier of the fourth acquisition task as a third identifier.

Wherein the third identification indicates that the collection task is currently being collected.

And step 410, the data processing device acquires a fifth acquired task which needs to be reallocated to the server and is acquired in the acquisition tasks of the data blocks.

The fifth collection task is a collection task to which a server needs to be newly allocated for some reason and to which a server has already been allocated.

Step 411, the data processing device suspends the thread for acquiring the fifth acquisition task, and sets the identifier of the acquisition state of the fifth acquisition task as the fourth identifier.

Wherein the fourth flag indicates that the collection task needs to reallocate servers and is not currently collected.

In other embodiments of the present invention, the data processing apparatus may terminate the thread of the fifth collection task and set the collection state of the fifth collection task to a state in which the server needs to be reallocated and is not currently collected.

Wherein, the

step

404 and 411 can also be executed after the step 401.

Based on the foregoing embodiments, in other embodiments of the present invention, the method further comprises the steps of:

A. and setting the task name of each data block acquisition task and the corresponding service address of each server.

B. A first relationship is defined between a task name of each data block collection task, a collection status of each data block collection task, a service address to which each data block collection task corresponds, and a meaning indicating a condition about the collection task.

Here, the task name of each data block collection task and the service address corresponding to each server may be set according to the example shown in table 1 shown below, and a first relationship between the task name of each data block collection task, the collection state of each data block collection task, the service address corresponding to each data block collection task, and a meaning indicating a situation regarding the collection task may also be set according to the example shown in table 1, and a meaning indicating a situation regarding the collection task may be shown as the meaning indicating the situation regarding the collection task shown in table 1. Of course, table 1 is only for illustration and is not limited to table 1.

TABLE 1

C. A second relationship is defined between the service address of each server, the rated number of tasks for the server, the service status of each server, and a meaning indicative of an operational condition of the server.

The first relation and the second relation are used when the acquisition task is distributed to the server.

In other embodiments of the present invention, a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server, and the meaning indicating the behavior with respect to the server may be set according to an example as shown in table 2 below, and the meaning indicating the behavior with respect to the server may be as shown in the meaning indicating table 2. Of course, table 2 is only for illustration and is not limited to the illustration shown in table 2.

TABLE 2

Based on the foregoing embodiments, an embodiment of the present invention provides a data processing apparatus, which may be applied to the data processing method provided in the embodiments corresponding to fig. 1 to 7, and as shown in fig. 8, the apparatus may include: a processor 51, a memory 52, and a communication bus 53, wherein:

the communication bus 53 is used for realizing communication connection between the processor 51 and the memory 52;

the processor 51 is configured to execute a data processing program stored in the memory 52 to implement the following steps:

acquiring the load of each server in a server cluster, and distributing an acquisition task to each server based on the load of each server;

In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52 to obtain the load of each server in the server cluster, and allocate the collection task to each server based on the load of each server, so as to implement the following steps:

acquiring a first server which is in an abnormal state and is distributed with a first acquisition task in a server cluster through the heartbeat information of each server;

if the first acquisition task is matched with the acquisition task of the data block, setting the service state of the first server as a first state, and setting the identification of the acquisition state of the first acquisition task as a first identification;

acquiring a second server which has a normal service state and a task number smaller than a rated task number and runs on the server from the server cluster;

the rated task number refers to the maximum operable task number on the server;

distributing the second acquisition task to the second server according to the rated task number of the second server;

if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server as a second state;

acquiring a second server with the number of tasks running on the server being smaller than the rated number of tasks and the service state being normal from the server cluster, and calculating the number of remaining tasks running on the second server;

In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52, and may further implement the following steps:

if the third acquisition task comprises a first sub-acquisition task which is not distributed to the server and the number of tasks running on the second server is equal to the rated number of tasks, setting the service state of the second server to be a second state, and acquiring a third server which has the number of tasks running on the server equal to the rated number of tasks and is in a normal service state from the server cluster;

the third server comprises a second server;

if a second sub-collection task which is not distributed to the server exists in the first sub-collection tasks and the number of tasks running on the third server is larger than the rated number of tasks, setting the service state of the third server to be a third state, and acquiring a fourth server which has the number of tasks running on the server larger than the rated number of tasks and has a normal service state from the server cluster;

the fourth server comprises a third server, and the third state indicates that the service state of the server is normal and the number of running tasks is greater than the rated number of tasks;

and sequentially distributing the second sub-collection tasks to a fourth server.

calculating the number of the tasks which are remained and can be operated on the second server based on the rated task number of the second server and the number of the tasks which are operated on the second server;

distributing a first number of acquisition tasks exceeding the rated tasks in the fourth server to the second server, and setting the acquisition state identifiers of the first number of acquisition tasks exceeding the rated tasks as second identifiers;

acquiring a fourth acquisition task which is distributed with a server and is not acquired in the acquisition tasks of the data blocks;

creating and executing an acquisition task thread, and setting an acquisition state identifier of a fourth acquisition task as a third identifier;

wherein the third identifier indicates that the collection task is currently being collected;

acquiring a fifth acquisition task which needs to be redistributed to the server and is acquired in the acquisition tasks of the data blocks;

suspending a thread for acquiring a fifth acquisition task, and setting an identifier of an acquisition state of the fifth acquisition task as a fourth identifier;

defining a first relationship between a task name of each data block acquisition task, an acquisition state of each data block acquisition task, a service address corresponding to each data block acquisition task, and a meaning indicating a condition about the acquisition task;

defining a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server and the meaning indicating the operation condition of the server;

wherein the first relationship and the second relationship are for use in assigning the collection task to the server.

It should be noted that, for a specific implementation process of the steps executed by the processor in this embodiment, reference may be made to the implementation processes in the data processing method provided in the embodiments corresponding to fig. 1 to 7, and details are not described here again.

The data processing equipment provided by the embodiment of the invention is carried out based on the load of each server when the acquisition task is allocated to the server, and a server does not need to be deployed for each data block, so that the problem that the services in the relative technology are not related is solved, the required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.

Based on the foregoing embodiments, embodiments of the present invention provide a computer-readable storage medium storing one or more programs, which can be executed by a processor to implement the steps in the data processing method provided in the embodiments corresponding to fig. 1 to 7.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims

1. A method of data processing, the method comprising:

setting a task name of an acquisition task of each data block and a service address corresponding to each server;

defining a first relation among a task name of a collection task of each data block, a collection state of the collection task of each data block, a service address corresponding to the collection task of each data block, and a meaning indicating a condition about the collection task;

acquiring the load of each server in the server cluster, and distributing the acquisition task to each server based on the load of each server, the first relation and the second relation; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform.

2. The method of claim 1, further comprising:

3. The method according to claim 1 or 2, characterized in that the method further comprises:

4. The method according to claim 1 or 2, characterized in that the method further comprises:

5. The method according to claim 1 or 2, characterized in that the method further comprises:

6. The method according to claim 1 or 2, characterized in that the method further comprises:

7. A data processing apparatus, characterized in that the apparatus comprises: a processor, a memory, and a communication bus;