CN109981731B - Data processing method and equipment - Google Patents
Data processing method and equipment Download PDFInfo
- Publication number
- CN109981731B CN109981731B CN201910117769.4A CN201910117769A CN109981731B CN 109981731 B CN109981731 B CN 109981731B CN 201910117769 A CN201910117769 A CN 201910117769A CN 109981731 B CN109981731 B CN 109981731B
- Authority
- CN
- China
- Prior art keywords
- server
- task
- tasks
- acquisition
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000003672 processing method Methods 0.000 title abstract description 24
- 238000000034 method Methods 0.000 claims description 31
- 230000002159 abnormal effect Effects 0.000 claims description 18
- 238000004891 communication Methods 0.000 claims description 10
- 238000010586 diagram Methods 0.000 description 9
- 238000005516 engineering process Methods 0.000 description 8
- 238000004590 computer program Methods 0.000 description 7
- 238000003326 Quality management system Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 230000006399 behavior Effects 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/60—Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
- H04L67/63—Routing a service request depending on the request content or context
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Cardiology (AREA)
- General Health & Medical Sciences (AREA)
- Hardware Redundancy (AREA)
- Computer And Data Communications (AREA)
Abstract
The embodiment of the invention discloses a data processing method, which comprises the following steps: acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks; acquiring the load of each server in a server cluster, and distributing the acquisition task to each server based on the load of each server; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform. The embodiment of the invention also discloses a data processing device.
Description
Technical Field
The present invention relates to data processing technologies in the field of communications, and in particular, to a data processing method and device.
Background
At present, different property data are stored in a distributed manner in a corresponding data warehouse, for example, the data may include Device Control System (DCS) real-time data, Laboratory Information Management System (LIMS) data, Quality Management System (QMS) data, and the like. With the gradual and extensive application of big data, how to efficiently pull each mass data dispersed in different warehouses to a big data platform in real time and adopt different acquisition strategies to treat different weight data simultaneously becomes a big problem of a big data platform data acquisition link gradually.
In order to solve the above problems, the following technical solutions appear in the related art: and according to the scheme A, different acquisition services are developed aiming at different data, and a plurality of acquisition services are deployed for acquisition by splitting a data block in a warehouse of mass data. Each acquisition service is independent, and the main and standby acquisition services are deployed aiming at the same data block acquisition service, so that high availability is realized. And B, deploying a plurality of services in a distributed mode, and collecting the services according to the set collection tasks. And in the scheme C, the active pulling of the data is changed into the active pushing of the data to the big data platform by different data sources. However, the above scheme has the problems that there is no correlation between services, the required resources are excessive, the acquisition cannot be balanced, and the acquisition cannot be highly available.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present invention are intended to provide a data processing method and device, which solve the problem that there is no association between services in the relative technologies, reduce required resources, and implement balanced acquisition; meanwhile, high availability of service is realized in the acquisition process.
The technical scheme of the invention is realized as follows:
a method of data processing, the method comprising:
acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks;
acquiring the load of each server in a server cluster, and distributing the acquisition task to each server based on the load of each server; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform.
Optionally, the obtaining a load of each server in the server cluster, and distributing the collection task to each server based on the load of each server includes:
acquiring a first server which is in an abnormal state and is distributed with a first acquisition task in the server cluster through the heartbeat information of each server;
if the first acquisition task is matched with the acquisition task of the data block, setting the service state of the first server as a first state, and setting the identification of the acquisition state of the first acquisition task as a first identification; wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.
Optionally, the obtaining a load of each server in the server cluster, and distributing the collection task to each server based on the load of each server includes:
acquiring a second acquisition task which needs to be redistributed to the server and is not acquired in the acquisition tasks of the data blocks;
acquiring a second server which has a normal service state and a task number smaller than a rated task number and runs on the server from the server cluster; the rated task number refers to the maximum operable task number on the server;
distributing the second collection task to the second server according to the rated task number of the second server;
if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server to be a second state; and the second state indicates that the service state of the server is normal and the number of running tasks is equal to the rated number of tasks.
Optionally, the obtaining a load of each server in the server cluster, and distributing the collection task to each server based on the load of each server includes:
acquiring a third acquisition task which is abnormal in acquisition and is not acquired in acquisition tasks of the data block;
acquiring a second server with the number of tasks running on the server being smaller than the rated number of tasks and the service state being normal from the server cluster, and calculating the number of remaining executable tasks on the second server;
and distributing the third acquisition task to the second server based on the number of the tasks which remain to be operated on the second server.
Optionally, the method further includes:
if the third acquisition task comprises a first sub-acquisition task which is not allocated to the server and the number of tasks running on the second server is equal to the rated number of tasks, setting the service state of the second server to be a second state, and acquiring a third server which has the number of tasks running on the server equal to the rated number of tasks and has a normal service state from the server cluster; the third server comprises a second server;
and calculating the number of the remaining executable tasks on the third server, and sequentially distributing the first sub-collection tasks to the third server based on the number of the remaining executable tasks on the third server.
Optionally, the method further includes:
if a second sub-collection task which is not distributed to the server exists in the first sub-collection tasks and the number of tasks running on the third server is larger than the rated number of tasks, setting the service state of the third server to be a third state, and acquiring a fourth server which has the normal service state and the number of tasks running on the server larger than the rated number of tasks from the server cluster; the fourth server comprises a third server, and the third state indicates that the service state of the server is normal and the number of running tasks is greater than the rated number of tasks;
and sequentially distributing the second sub-collection tasks to the fourth server.
Optionally, the obtaining a load of each server in the server cluster, and distributing the collection task to each server based on the load of each server includes:
acquiring a fourth server with normal service state and the number of tasks running on the server being greater than the rated number of tasks from the server cluster, and acquiring a second server with normal service state and the number of tasks running on the server being less than the rated number of tasks;
calculating the number of tasks which are remained and can be operated on the second server based on the rated number of tasks of the second server and the number of tasks which are operated on the second server;
stopping the thread of the collection task exceeding the rated task in the fourth server;
distributing a first number of acquisition tasks exceeding the rated tasks in the fourth server to the second server, and setting the acquisition state identifiers of the first number of acquisition tasks exceeding the rated tasks as second identifiers; wherein the second identifier indicates that the collection task reallocates servers and has been currently collected; the first number is the number of tasks remaining operable on the second server.
Optionally, the obtaining a load of each server in the server cluster, and distributing the collection task to each server based on the load of each server includes:
acquiring a fourth acquisition task which is distributed with a server and is not acquired in the acquisition tasks of the data block;
creating and executing an acquisition task thread, and setting an acquisition state identifier of the fourth acquisition task as a third identifier; wherein the third identifier indicates that an acquisition task is currently being acquired;
acquiring a fifth acquisition task which needs to be reallocated to the server and is acquired in the acquisition tasks of the data blocks;
suspending a thread for acquiring the fifth acquisition task, and setting an identifier of an acquisition state of the fifth acquisition task as a fourth identifier; wherein the fourth identification indicates that the collection task needs to reallocate servers and is not currently collected.
Optionally, the method further includes:
setting a task name of each data block acquisition task and a service address corresponding to each server;
defining a first relation among a task name of each data block acquisition task, an acquisition state of each data block acquisition task, a service address corresponding to each data block acquisition task and a meaning indicating a condition about the acquisition task;
defining a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server and the meaning indicating the operation condition of the server; wherein the first relationship and the second relationship are used when distributing the collection task to the server.
A data processing apparatus, the apparatus comprising: a processor, a memory, and a communication bus;
the communication bus is used for realizing communication connection between the processor and the memory;
the processor is configured to execute a data processing program stored in the memory to implement the steps of:
acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks;
acquiring the load of each server in a server cluster, and distributing the acquisition task to each server based on the load of each server; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform.
The data processing method and the data processing device provided by the embodiment of the invention are used for acquiring data to be uploaded, converting the data to be uploaded into acquisition tasks of a plurality of data blocks, acquiring the load of each server in a server cluster, and distributing the acquisition tasks to each server based on the load of each server; the server cluster comprises at least two servers for acquiring data to be uploaded and uploading the data to the data processing platform, so that acquisition tasks are distributed to the servers based on the load of each server without deploying one server for each data block, the problem that services in the relative technology are not related is solved, required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.
Drawings
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating another data processing method according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating a further data processing method according to an embodiment of the present invention;
fig. 4 is a flowchart illustrating a data processing method according to another embodiment of the present invention;
FIG. 5 is a flow chart illustrating another data processing method according to another embodiment of the present invention;
FIG. 6 is a flowchart illustrating a data processing method according to another embodiment of the present invention;
fig. 7 is a flowchart illustrating a data processing method according to another embodiment of the present invention;
fig. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
An embodiment of the present invention provides a data processing method, which is shown in fig. 1 and includes the following steps
The acquisition task of acquiring the data to be uploaded and converting the data to be uploaded into a plurality of data blocks in the step 101 can be realized by data processing equipment; the data to be uploaded may include Device Control System (DCS) real-time data, Laboratory Information Management System (LIMS) data, Quality Management System (QMS) data, and the like. In the embodiment of the application, the data to be uploaded is abstracted into a plurality of data blocks.
And 102, acquiring the load of each server in the server cluster, and distributing an acquisition task to each server based on the load of each server.
The server cluster comprises at least two servers used for collecting data to be uploaded and uploading the data to the data processing platform.
The data processing method provided by the embodiment of the invention comprises the steps of acquiring data to be uploaded, converting the data to be uploaded into acquisition tasks of a plurality of data blocks, acquiring the load of each server in a server cluster, and distributing the acquisition tasks to each server based on the load of each server; the server cluster comprises at least two servers for acquiring data to be uploaded and uploading the data to the data processing platform, so that acquisition tasks are distributed to the servers based on the load of each server without deploying one server for each data block, the problem that services in the relative technology are not related is solved, required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.
Based on the foregoing embodiments, an embodiment of the present invention provides a data processing method, which is shown in fig. 2 and includes the following steps:
The data processing equipment can determine the service state of the server according to the heartbeat information of the server by monitoring the heartbeat information of each server in real time; acquiring servers which are in abnormal states and distributed with collection tasks from a server cluster after the service state of each server is determined; and, the first collection task may be any task that collects data to be uploaded.
Wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.
In other embodiments of the present invention, if the first acquisition task belongs to an acquisition task of a data block, it is said that the first acquisition task matches the acquisition task of the data block; at this time, the data processing apparatus may set the service state of the first server to an abnormal state, and set the collection state of the first collection task to be not collected.
In other embodiments of the present invention, as shown with reference to fig. 3, the method may further comprise the steps of:
and step 204, the data processing equipment acquires a second acquisition task which needs to be redistributed to the server and is not acquired in the acquisition tasks of the data blocks.
And step 205, the data processing device acquires a second server from the server cluster, wherein the number of the tasks running on the server is smaller than the rated number of the tasks and the service state of the second server is normal.
The rated task number refers to the maximum number of tasks that can be run on the server.
And step 206, the data processing equipment distributes the second acquisition task to the second server according to the rated task number of the second server.
And allocating the second acquisition task to the second server according to the maximum number of tasks which can be operated on the second server.
And step 207, if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server to be a second state by the data processing equipment.
And the second state indicates that the service state of the server is normal and the number of running tasks is equal to the rated number of tasks.
In another embodiment of the present invention, if the number of tasks running on the second server reaches the maximum number of tasks that can be run on the second server, the data processing apparatus may set the service state of the second server to be the normal state, and the number of tasks running on the second server is the rated number of tasks.
Wherein, the step 204-207 can also be executed after the step 201.
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
According to the data processing method provided by the embodiment of the invention, when the acquisition task is distributed to the servers, the acquisition task is carried out based on the load of each server, and a server does not need to be deployed for each data block, so that the problem that the services in the relative technology are not related is solved, the required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.
Based on the foregoing embodiments, an embodiment of the present invention provides a data processing method, which is shown in fig. 4 and includes the following steps:
Wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.
And step 304, acquiring a third acquisition task which is abnormal and is not acquired in the acquisition tasks of the data blocks by the data processing equipment.
The third collection task may refer to a collection task that has not been collected by any server and collects an exception.
And 305, the data processing equipment acquires a second server which has the number of tasks running on the server smaller than the rated number of tasks and is in a normal service state from the server cluster, and calculates the number of remaining executable tasks on the second server.
The second server is a server which can run other acquisition tasks on the servers in the server cluster; the number of the tasks which can be operated on the second server is obtained by subtracting the number of the tasks which are operated on the second server from the rated number of the tasks of the second server.
And step 306, the data processing device allocates the third collection task to the second server based on the number of the remaining operable tasks on the second server.
And the number of the acquisition tasks which can be distributed on the second server is less than or equal to the number of the remaining operable tasks on the second server.
Wherein the third server comprises a second server.
In other embodiments of the present invention, if there is a collection task that has not been allocated to a server in the third collection task and the number of tasks running on the second server reaches the maximum number of tasks that can run on the second server, the data processing device may set the service state of the second server to be a normal state and the number of tasks running on the second server to be a rated number of tasks.
And 308, calculating the number of the remaining executable tasks on the third server by the data processing equipment, and sequentially distributing the first sub-collection tasks to the third server based on the number of the remaining executable tasks on the third server.
And the residual operable task number on the third server is obtained by subtracting the current operating task number on the second server from the rated task number of the third server. When the acquisition tasks are allocated to the third server, the first sub-acquisition tasks can be sequentially allocated to the third server until the first sub-acquisition tasks are allocated and the number of tasks running on the third server is less than or equal to the rated number of tasks; or sequentially distributing the first sub-collection tasks to the third server until the number of tasks running on the third server is equal to the rated number of tasks.
In other embodiments of the present invention, as illustrated with reference to FIG. 5, the method further comprises the steps of:
The fourth server comprises a third server, and the third state indicates that the service state of the server is normal and the number of running tasks is greater than the rated number of tasks.
In other embodiments of the present invention, if there is a collection task that has not been allocated to a server in the first sub-collection task, and the number of tasks running on the third server exceeds the maximum number of tasks that can run on the third server, the data processing device may set the service state of the third server to be a normal state, and the number of tasks running on the third server is greater than the rated number of tasks.
And 310, the data processing equipment sequentially distributes the second sub-collection tasks to the fourth server.
When the acquisition tasks are allocated to the fourth server, the second sub-acquisition tasks can be sequentially allocated to the fourth server until the second sub-acquisition tasks are allocated and the number of tasks running on the fourth server is less than or equal to the rated number of tasks; or sequentially distributing the second sub-collection tasks to the fourth server until the number of tasks running on the fourth server is equal to the rated number of tasks.
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
Wherein, the step 304 and the step 310 can also be executed after the step 301.
According to the data processing method provided by the embodiment of the invention, when the acquisition task is distributed to the servers, the acquisition task is carried out based on the load of each server, and a server does not need to be deployed for each data block, so that the problem that the services in the relative technology are not related is solved, the required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.
Based on the foregoing embodiments, an embodiment of the present invention provides a data processing method, which is shown in fig. 6 and includes the following steps:
Wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.
And the number of the tasks which can be operated on the second server is obtained by subtracting the number of the tasks which are operated on the second server from the rated number of the tasks of the second server.
After the thread of the collection task exceeding the rated task in the fourth server is stopped, the fourth server can not provide any service for the collection task exceeding the rated task in the fourth server.
Wherein the second identifier indicates that the collection task has reallocated servers and has been currently collected; the first number is the number of tasks remaining executable on the second server.
In other implementations of the invention, a first number of collection tasks in the fourth server that exceed the rated task may be assigned to other servers that may serve them; the first number is the number of tasks remaining operable on the other servers; alternatively, the first number is the number of acquisition tasks of the acquisition tasks in the fourth server that exceed the rated task.
In other embodiments of the present invention, as shown in fig. 7, the following steps may also be performed after step 403:
and step 408, the data processing equipment acquires a fourth acquisition task which is distributed with the server and is not acquired in the acquisition tasks of the data blocks.
Wherein the third identification indicates that the collection task is currently being collected.
And step 410, the data processing device acquires a fifth acquired task which needs to be reallocated to the server and is acquired in the acquisition tasks of the data blocks.
The fifth collection task is a collection task to which a server needs to be newly allocated for some reason and to which a server has already been allocated.
Wherein the fourth flag indicates that the collection task needs to reallocate servers and is not currently collected.
In other embodiments of the present invention, the data processing apparatus may terminate the thread of the fifth collection task and set the collection state of the fifth collection task to a state in which the server needs to be reallocated and is not currently collected.
It should be noted that, for the descriptions of the same steps and the same contents in this embodiment as those in other embodiments, reference may be made to the descriptions in other embodiments, which are not described herein again.
Wherein, the step 404 and 411 can also be executed after the step 401.
According to the data processing method provided by the embodiment of the invention, when the acquisition task is distributed to the servers, the acquisition task is carried out based on the load of each server, and a server does not need to be deployed for each data block, so that the problem that the services in the relative technology are not related is solved, the required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.
Based on the foregoing embodiments, in other embodiments of the present invention, the method further comprises the steps of:
A. and setting the task name of each data block acquisition task and the corresponding service address of each server.
B. A first relationship is defined between a task name of each data block collection task, a collection status of each data block collection task, a service address to which each data block collection task corresponds, and a meaning indicating a condition about the collection task.
Here, the task name of each data block collection task and the service address corresponding to each server may be set according to the example shown in table 1 shown below, and a first relationship between the task name of each data block collection task, the collection state of each data block collection task, the service address corresponding to each data block collection task, and a meaning indicating a situation regarding the collection task may also be set according to the example shown in table 1, and a meaning indicating a situation regarding the collection task may be shown as the meaning indicating the situation regarding the collection task shown in table 1. Of course, table 1 is only for illustration and is not limited to table 1.
TABLE 1
C. A second relationship is defined between the service address of each server, the rated number of tasks for the server, the service status of each server, and a meaning indicative of an operational condition of the server.
The first relation and the second relation are used when the acquisition task is distributed to the server.
In other embodiments of the present invention, a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server, and the meaning indicating the behavior with respect to the server may be set according to an example as shown in table 2 below, and the meaning indicating the behavior with respect to the server may be as shown in the meaning indicating table 2. Of course, table 2 is only for illustration and is not limited to the illustration shown in table 2.
TABLE 2
Based on the foregoing embodiments, an embodiment of the present invention provides a data processing apparatus, which may be applied to the data processing method provided in the embodiments corresponding to fig. 1 to 7, and as shown in fig. 8, the apparatus may include: a processor 51, a memory 52, and a communication bus 53, wherein:
the communication bus 53 is used for realizing communication connection between the processor 51 and the memory 52;
the processor 51 is configured to execute a data processing program stored in the memory 52 to implement the following steps:
acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks;
acquiring the load of each server in a server cluster, and distributing an acquisition task to each server based on the load of each server;
the server cluster comprises at least two servers used for collecting data to be uploaded and uploading the data to the data processing platform.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52 to obtain the load of each server in the server cluster, and allocate the collection task to each server based on the load of each server, so as to implement the following steps:
acquiring a first server which is in an abnormal state and is distributed with a first acquisition task in a server cluster through the heartbeat information of each server;
if the first acquisition task is matched with the acquisition task of the data block, setting the service state of the first server as a first state, and setting the identification of the acquisition state of the first acquisition task as a first identification;
wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52 to obtain the load of each server in the server cluster, and allocate the collection task to each server based on the load of each server, so as to implement the following steps:
acquiring a second acquisition task which needs to be redistributed to the server and is not acquired in the acquisition tasks of the data blocks;
acquiring a second server which has a normal service state and a task number smaller than a rated task number and runs on the server from the server cluster;
the rated task number refers to the maximum operable task number on the server;
distributing the second acquisition task to the second server according to the rated task number of the second server;
if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server as a second state;
and the second state indicates that the service state of the server is normal and the number of running tasks is equal to the rated number of tasks.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52 to obtain the load of each server in the server cluster, and allocate the collection task to each server based on the load of each server, so as to implement the following steps:
acquiring a third acquisition task which is abnormal in acquisition and is not acquired in acquisition tasks of the data block;
acquiring a second server with the number of tasks running on the server being smaller than the rated number of tasks and the service state being normal from the server cluster, and calculating the number of remaining tasks running on the second server;
and distributing the third acquisition task to the second server based on the number of the tasks which remain to be operated on the second server.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52, and may further implement the following steps:
if the third acquisition task comprises a first sub-acquisition task which is not distributed to the server and the number of tasks running on the second server is equal to the rated number of tasks, setting the service state of the second server to be a second state, and acquiring a third server which has the number of tasks running on the server equal to the rated number of tasks and is in a normal service state from the server cluster;
the third server comprises a second server;
and calculating the number of the remaining executable tasks on the third server, and sequentially distributing the first sub-collection tasks to the third server based on the number of the remaining executable tasks on the third server.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52, and may further implement the following steps:
if a second sub-collection task which is not distributed to the server exists in the first sub-collection tasks and the number of tasks running on the third server is larger than the rated number of tasks, setting the service state of the third server to be a third state, and acquiring a fourth server which has the number of tasks running on the server larger than the rated number of tasks and has a normal service state from the server cluster;
the fourth server comprises a third server, and the third state indicates that the service state of the server is normal and the number of running tasks is greater than the rated number of tasks;
and sequentially distributing the second sub-collection tasks to a fourth server.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52 to obtain the load of each server in the server cluster, and allocate the collection task to each server based on the load of each server, so as to implement the following steps:
acquiring a fourth server with normal service state and the number of tasks running on the server being greater than the rated number of tasks from the server cluster, and acquiring a second server with normal service state and the number of tasks running on the server being less than the rated number of tasks;
calculating the number of the tasks which are remained and can be operated on the second server based on the rated task number of the second server and the number of the tasks which are operated on the second server;
stopping the thread of the collection task exceeding the rated task in the fourth server;
distributing a first number of acquisition tasks exceeding the rated tasks in the fourth server to the second server, and setting the acquisition state identifiers of the first number of acquisition tasks exceeding the rated tasks as second identifiers;
wherein the second identifier indicates that the collection task has reallocated servers and has been currently collected; the first number is the number of tasks remaining executable on the second server.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52 to obtain the load of each server in the server cluster, and allocate the collection task to each server based on the load of each server, so as to implement the following steps:
acquiring a fourth acquisition task which is distributed with a server and is not acquired in the acquisition tasks of the data blocks;
creating and executing an acquisition task thread, and setting an acquisition state identifier of a fourth acquisition task as a third identifier;
wherein the third identifier indicates that the collection task is currently being collected;
acquiring a fifth acquisition task which needs to be redistributed to the server and is acquired in the acquisition tasks of the data blocks;
suspending a thread for acquiring a fifth acquisition task, and setting an identifier of an acquisition state of the fifth acquisition task as a fourth identifier;
wherein the fourth flag indicates that the collection task needs to reallocate servers and is not currently collected.
In other embodiments of the present invention, the processor 51 is configured to execute the data processing program stored in the memory 52, and may further implement the following steps:
setting a task name of each data block acquisition task and a service address corresponding to each server;
defining a first relationship between a task name of each data block acquisition task, an acquisition state of each data block acquisition task, a service address corresponding to each data block acquisition task, and a meaning indicating a condition about the acquisition task;
defining a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server and the meaning indicating the operation condition of the server;
wherein the first relationship and the second relationship are for use in assigning the collection task to the server.
It should be noted that, for a specific implementation process of the steps executed by the processor in this embodiment, reference may be made to the implementation processes in the data processing method provided in the embodiments corresponding to fig. 1 to 7, and details are not described here again.
The data processing equipment provided by the embodiment of the invention is carried out based on the load of each server when the acquisition task is allocated to the server, and a server does not need to be deployed for each data block, so that the problem that the services in the relative technology are not related is solved, the required resources are reduced, and balanced acquisition can be realized; meanwhile, high availability of service is realized in the acquisition process.
Based on the foregoing embodiments, embodiments of the present invention provide a computer-readable storage medium storing one or more programs, which can be executed by a processor to implement the steps in the data processing method provided in the embodiments corresponding to fig. 1 to 7.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.
Claims (7)
1. A method of data processing, the method comprising:
acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks;
setting a task name of an acquisition task of each data block and a service address corresponding to each server;
defining a first relation among a task name of a collection task of each data block, a collection state of the collection task of each data block, a service address corresponding to the collection task of each data block, and a meaning indicating a condition about the collection task;
defining a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server and the meaning indicating the operation condition of the server;
acquiring the load of each server in the server cluster, and distributing the acquisition task to each server based on the load of each server, the first relation and the second relation; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform.
2. The method of claim 1, further comprising:
acquiring a first server which is in an abnormal state and is distributed with a first acquisition task in the server cluster through the heartbeat information of each server;
if the first acquisition task is matched with the acquisition task of the data block, setting the service state of the first server as a first state, and setting the identification of the acquisition state of the first acquisition task as a first identification; wherein the first state indicates that the server is abnormal; the first identification indicates that the first collection task was not collected.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
acquiring a second acquisition task which needs to be redistributed to the server and is not acquired in the acquisition tasks of the data blocks;
acquiring a second server which has a normal service state and a task number smaller than a rated task number and runs on the server from the server cluster; the rated task number refers to the maximum operable task number on the server;
distributing the second collection task to the second server according to the rated task number of the second server;
if the number of the tasks running on the second server is equal to the rated number of the tasks, setting the service state of the second server to be a second state; and the second state indicates that the service state of the server is normal and the number of running tasks is equal to the rated number of tasks.
4. The method according to claim 1 or 2, characterized in that the method further comprises:
acquiring a third acquisition task which is abnormal in acquisition and is not acquired in acquisition tasks of the data block;
acquiring a second server with the number of tasks running on the server being smaller than the rated number of tasks and the service state being normal from the server cluster, and calculating the number of remaining executable tasks on the second server;
and distributing the third acquisition task to the second server based on the number of the tasks which remain to be operated on the second server.
5. The method according to claim 1 or 2, characterized in that the method further comprises:
acquiring a fourth server with normal service state and the number of tasks running on the server being greater than the rated number of tasks from the server cluster, and acquiring a second server with normal service state and the number of tasks running on the server being less than the rated number of tasks;
calculating the number of tasks which are remained and can be operated on the second server based on the rated number of tasks of the second server and the number of tasks which are operated on the second server;
stopping the thread of the collection task exceeding the rated task in the fourth server;
distributing a first number of acquisition tasks exceeding the rated tasks in the fourth server to the second server, and setting the acquisition state identifiers of the first number of acquisition tasks exceeding the rated tasks as second identifiers; wherein the second identifier indicates that the collection task reallocates servers and has been currently collected; the first number is the number of tasks remaining operable on the second server.
6. The method according to claim 1 or 2, characterized in that the method further comprises:
acquiring a fourth acquisition task which is distributed with a server and is not acquired in the acquisition tasks of the data block;
creating and executing an acquisition task thread, and setting an acquisition state identifier of the fourth acquisition task as a third identifier; wherein the third identifier indicates that an acquisition task is currently being acquired;
acquiring a fifth acquisition task which needs to be reallocated to the server and is acquired in the acquisition tasks of the data blocks;
suspending a thread for acquiring the fifth acquisition task, and setting an identifier of an acquisition state of the fifth acquisition task as a fourth identifier; wherein the fourth identification indicates that the collection task needs to reallocate servers and is not currently collected.
7. A data processing apparatus, characterized in that the apparatus comprises: a processor, a memory, and a communication bus;
the communication bus is used for realizing communication connection between the processor and the memory;
the processor is configured to execute a data processing program stored in the memory to implement the steps of:
acquiring data to be uploaded, and converting the data to be uploaded into a collection task of a plurality of data blocks;
setting a task name of an acquisition task of each data block and a service address corresponding to each server;
defining a first relation among a task name of a collection task of each data block, a collection state of the collection task of each data block, a service address corresponding to the collection task of each data block, and a meaning indicating a condition about the collection task;
defining a second relationship between the service address of each server, the rated number of tasks of the server, the service status of each server and the meaning indicating the operation condition of the server;
acquiring the load of each server in the server cluster, and distributing the acquisition task to each server based on the load of each server, the first relation and the second relation; the server cluster comprises at least two servers used for collecting the data to be uploaded and uploading the data to the data processing platform.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910117769.4A CN109981731B (en) | 2019-02-15 | 2019-02-15 | Data processing method and equipment |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910117769.4A CN109981731B (en) | 2019-02-15 | 2019-02-15 | Data processing method and equipment |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109981731A CN109981731A (en) | 2019-07-05 |
| CN109981731B true CN109981731B (en) | 2021-06-15 |
Family
ID=67077026
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910117769.4A Active CN109981731B (en) | 2019-02-15 | 2019-02-15 | Data processing method and equipment |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109981731B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111124673A (en) * | 2019-12-11 | 2020-05-08 | 中盈优创资讯科技有限公司 | Data acquisition system and method |
| CN115509717A (en) * | 2022-10-14 | 2022-12-23 | 北京百度网讯科技有限公司 | Task scheduling method, device and equipment for distributed storage |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101873005A (en) * | 2010-06-17 | 2010-10-27 | 深圳市科陆电子科技股份有限公司 | A method for realizing balanced collection of electric energy |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5493479B2 (en) * | 2008-10-03 | 2014-05-14 | 富士通株式会社 | Service providing system, method, uniqueness assurance information setting management program, load distribution program, and management apparatus |
| CN102158386B (en) * | 2010-02-11 | 2015-06-03 | 威睿公司 | Distributed load balance for system management program |
| US8954587B2 (en) * | 2011-07-27 | 2015-02-10 | Salesforce.Com, Inc. | Mechanism for facilitating dynamic load balancing at application servers in an on-demand services environment |
| CN102436399A (en) * | 2011-07-29 | 2012-05-02 | 青岛海信网络科技股份有限公司 | Load balancing acquisition method |
| CN105096181A (en) * | 2015-07-23 | 2015-11-25 | 浪潮软件集团有限公司 | A big data e-commerce transaction method and e-commerce transaction system |
| CN108337275A (en) * | 2017-01-19 | 2018-07-27 | 百度在线网络技术(北京)有限公司 | Task distribution method, device and equipment for Distributor |
| CN107800768B (en) * | 2017-09-13 | 2020-01-10 | 平安科技(深圳)有限公司 | Open platform control method and system |
-
2019
- 2019-02-15 CN CN201910117769.4A patent/CN109981731B/en active Active
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101873005A (en) * | 2010-06-17 | 2010-10-27 | 深圳市科陆电子科技股份有限公司 | A method for realizing balanced collection of electric energy |
Non-Patent Citations (2)
| Title |
|---|
| A Multiqueue Interlacing Peak Scheduling Method Based on Tasks’ Classification in Cloud Computing;Liyun Zuo;《IEEE》;20160429;全文 * |
| 基于负载均衡的Multi-UAV任务分配算法的研究;杨媛琦;《中国优秀硕士学位论文全文数据库》;20160601;全文 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109981731A (en) | 2019-07-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110515704B (en) | Resource scheduling method and device based on Kubernetes system | |
| CN108768877B (en) | Distribution method and device of burst traffic and proxy server | |
| CN108632365B (en) | Service resource adjusting method, related device and equipment | |
| EP3567829B1 (en) | Resource management method and apparatus | |
| CN111182061B (en) | Task distribution processing method, system, computer device and storage medium | |
| CN110764908B (en) | A load adjustment method, device, server and storage medium | |
| CN114625533B (en) | Distributed task scheduling method, device, electronic device and storage medium | |
| CN107168777B (en) | Method and device for scheduling resources in distributed system | |
| KR20130136449A (en) | Controlled automatic healing of data-center services | |
| CN106648900B (en) | Supercomputing method and system based on smart television | |
| CN109766172A (en) | A kind of asynchronous task scheduling method and device | |
| CN109981731B (en) | Data processing method and equipment | |
| CN105491150A (en) | Load balance processing method based on time sequence and system | |
| CN105893497A (en) | Task processing method and device | |
| CN116340005B (en) | Container cluster scheduling method, device, equipment and storage medium | |
| CN106572137A (en) | Distributed service resource management method and apparatus | |
| US20170075713A1 (en) | Dispatching the Processing of a Computer Process Amongst a Plurality of Virtual Machines | |
| CN113301087A (en) | Resource scheduling method, device, computing equipment and medium | |
| CN113626173B (en) | Scheduling method, scheduling device and storage medium | |
| CN111258710B (en) | System maintenance method and device | |
| CN105933136A (en) | Resource scheduling method and system | |
| CN117435324B (en) | Task scheduling method based on containerization | |
| US20170235288A1 (en) | Process control program, process control device, and process control method | |
| CN114157569A (en) | Cluster system and construction method and construction device thereof | |
| CN111796934A (en) | Task issuing method and device, storage medium and electronic equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |