CN112463411A - Data processing method, device, server and storage medium

Info

Publication number: CN112463411A
Application number: CN202011455610.2A
Authority: CN
Inventors: 贺宁; 魏程琛; 傅浩; 张智鹏; 李�诚; 张瑞
Original assignee: Chongqing Unisinsight Technology Co Ltd
Current assignee: Chongqing Unisinsight Technology Co Ltd
Priority date: 2020-12-10
Filing date: 2020-12-10
Publication date: 2021-03-09

Abstract

The present invention relates to the technical field of big data and provides a data processing method, device, server and storage medium. The method is applied to a first server that is communicatively connected to a second server running message middleware, and the first server includes a data table. The method includes: obtaining data to be processed through the message middleware, wherein the data to be processed includes an identifier that uniquely characterizes the data to be processed; if a check table related to the data table exists, using the check table to judge whether the data to be processed already exists in the data table, wherein the data table includes an identification field and the primary key of the check table is related to the identification field of the data table; and if the data to be processed does not exist in the data table, processing the data to be processed. Compared with the prior art, the present invention can judge from the check table whether the data to be processed duplicates data in the data table, thereby improving the efficiency of duplicate-data checking and ultimately the efficiency of data processing.

Description

Data processing method, device, server and storage medium

Technical Field

The invention relates to the technical field of big data, in particular to a data processing method, a data processing device, a server and a storage medium.

Background

The message middleware utilizes an efficient and reliable message transfer mechanism for platform-independent data communication and integration of a distributed system based on data communication. By providing the message transmission and message queuing models, a producer producing data and a consumer processing the data can be decoupled, a certain data buffering function is provided, and the reliability of data processing is greatly improved.

When the message middleware recovers from an exception, the rollback of the abnormal data may cause the consumer to fetch historical data again and process it a second time.

To avoid processing data repeatedly, a consumer usually first determines whether the data is a duplicate and, if so, does not process it again.

Disclosure of Invention

The invention aims to provide a data processing method, a data processing device, a server and a storage medium, which can improve the verification efficiency of repeated data and further improve the data processing efficiency.

In order to achieve the purpose, the technical scheme adopted by the invention is as follows:

In a first aspect, the present invention provides a data processing method applied to a first server, where the first server is communicatively connected to a second server running message middleware and includes a data table. The method includes: obtaining data to be processed through the message middleware, wherein the data to be processed includes an identifier that uniquely characterizes the data to be processed; if a check table related to the data table exists, judging whether the data to be processed exists in the data table by using the check table, wherein the data table includes an identification field, and the primary key of the check table is related to the identification field of the data table; and if the data to be processed does not exist in the data table, processing the data to be processed.

In a second aspect, the present invention provides a data processing apparatus applied to a first server, where the first server is communicatively connected to a second server running message middleware and includes a data table. The apparatus includes: an obtaining module, configured to obtain data to be processed through the message middleware, wherein the data to be processed includes an identifier that uniquely characterizes the data to be processed; a determining module, configured to judge, if a check table related to the data table exists, whether the data to be processed exists in the data table by using the check table, wherein the data table includes an identification field, and the primary key of the check table is related to the identification field of the data table; and a processing module, configured to process the data to be processed if the data to be processed does not exist in the data table.

In a third aspect, the present invention provides a server comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the data processing method as described above when executing the computer program.

In a fourth aspect, the invention provides a computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, realizes the data processing method as described above.

Compared with the prior art, the primary key of the check table is determined according to the identification field of the data table, and whether the data to be processed duplicates data in the data table can be judged from the check table, which improves the efficiency of duplicate-data checking and ultimately the efficiency of data processing.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.

Fig. 1 shows a schematic view of an application scenario provided in an embodiment of the present invention.

Fig. 2 is a block diagram illustrating a server according to an embodiment of the present invention.

Fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present invention.

Fig. 4 is a flowchart illustrating another data processing method according to an embodiment of the present invention.

Fig. 5 is a flowchart illustrating another data processing method according to an embodiment of the present invention.

Fig. 6 is a flowchart illustrating another data processing method according to an embodiment of the present invention.

Fig. 7 is a flowchart illustrating another data processing method according to an embodiment of the present invention.

Fig. 8 is a flowchart illustrating another data processing method according to an embodiment of the present invention.

Fig. 9 is a block diagram illustrating a data processing apparatus according to an embodiment of the present invention.

Reference numerals: 10-first server; 11-processor; 12-memory; 13-bus; 14-communication interface; 20-second server; 30-third server; 100-data processing apparatus; 110-obtaining module; 120-determining module; 130-processing module; 140-recovery module; 150-updating module; 160-clearing module.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

In the description of the present invention, it should be noted that if the terms "upper", "lower", "inside", "outside", etc. indicate an orientation or a positional relationship based on that shown in the drawings or that the product of the present invention is used as it is, this is only for convenience of description and simplification of the description, and it does not indicate or imply that the device or the element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention.

Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.

It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.

At present, against the background of big data and with the development of distributed computing, the drawbacks of synchronous processing are gradually becoming apparent. Because the producer responsible for generating data and the consumer responsible for processing it do not run in step with each other, synchronous processing often causes data loss. Asynchronous processing temporarily stores the data generated by the producer and thereby solves the data-loss problem of the synchronous mode. To decouple the producer from the consumer, message middleware is typically employed to temporarily store the data generated by the producer. Referring to fig. 1, fig. 1 is a schematic view illustrating an application scenario provided by an embodiment of the present invention. In fig. 1, the first server 10 is a consumer responsible for data processing, the second server 20 is a server that runs the message middleware and is responsible for temporary storage of data, and the third server 30 is a producer responsible for data generation. The producer generates data and sends it to the message middleware for temporary storage; the consumer obtains the data from the message middleware and processes it, for example by storing it in a preset database.

In this embodiment, the message middleware may be, but is not limited to, Kafka, ActiveMQ, RabbitMQ, and the like.

In this embodiment, the first server 10, the second server 20, and the third server 30 may be physical computers or virtual machines capable of implementing the same functions as the physical computers, one server, or a server cluster composed of multiple servers.

It should be noted that the message middleware may run on the second server 20 independently, or may run on the first server 10 or the third server 30 as a software functional module.
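As a rough illustration of the decoupling shown in fig. 1, the sketch below uses Python's in-process `queue.Queue` as a stand-in for the message middleware. The message layout and the function names `produce` and `consume` are assumptions made for illustration only and are not part of the described system.

```python
import queue

# Stand-in for the message middleware running on the second server (assumption).
middleware = queue.Queue()


def produce(payloads):
    """Producer side: attach an identifier to each payload and hand it to the middleware."""
    for i, payload in enumerate(payloads):
        middleware.put({"id": f"msg-{i}", "payload": payload})


def consume():
    """Consumer side: take everything currently buffered and 'store' it in a list."""
    stored = []
    while not middleware.empty():
        stored.append(middleware.get())
    return stored


produce(["a", "b"])
stored = consume()
```

The producer never calls the consumer directly; the queue buffers the data between them, which is the role the text assigns to the message middleware.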

On the basis of fig. 1, a block schematic diagram of the first server 10 in fig. 1 is provided in the embodiment of the present invention, please refer to fig. 2, and fig. 2 shows a block schematic diagram of the first server 10 provided in the embodiment of the present invention.

The first server 10 comprises a processor 11, a memory 12, a bus 13, a communication interface 14. The processor 11 and the memory 12 are connected by a bus 13, and the processor 11 communicates with the second server 20 through a communication interface 14.

The processor 11 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 11. The processor 11 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.

The memory 12 is used for storing a program, such as the data processing apparatus 100 in the embodiment of the present invention, the data processing apparatus 100 includes at least one software functional module which can be stored in the memory 12 in a form of software or firmware (firmware), and the processor 11 executes the program after receiving an execution instruction to implement the data processing method in the embodiment of the present invention.

The Memory 12 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile Memory (non-volatile Memory). Alternatively, the memory 12 may be a storage device built in the processor 11, or may be a storage device independent of the processor 11.

The bus 13 may be an ISA bus, a PCI bus, an EISA bus, or the like. For ease of illustration, the bus is represented in fig. 2 by a single double-headed arrow, which does not mean that there is only one bus or only one type of bus.

Referring to fig. 3, fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present invention, where the method is applied to the first server 10 in fig. 1 and fig. 2, and the method includes the following steps:

Step S100, obtaining data to be processed through the message middleware, wherein the data to be processed includes an identifier that uniquely characterizes the data to be processed.

In this embodiment, when the producer generates data, an identifier for uniquely characterizing the data is generated, the data and the identifier thereof are sent to the message middleware for temporary storage, and the consumer takes the data out of the middleware, processes the data, and stores the data into the data table. The data table may be stored in advance on the first server 10, or may be stored in advance in an external database server communicatively connected to the first server 10.

In this embodiment, the identifier of the data to be processed may be composed of two or more of the generation time, the service name, and the module number.

Step S110, if a check table related to the data table exists, judging whether the data to be processed exists in the data table by using the check table, wherein the data table includes an identification field, and the primary key of the check table is related to the identification field of the data table.

In this embodiment, the check table is provided with a primary key. A primary key is a candidate field selected to uniquely identify a record, and it may consist of one field or of several fields. A primary key typically serves four functions: first, it ensures entity integrity; second, it speeds up database operations; third, when a new record is added to the table, the primary key value of the new record is checked automatically and is not allowed to duplicate the primary key value of any other record; and fourth, records in the table are automatically displayed in the order of their primary key values. Therefore, once the check table has a primary key, storing the identifier of a piece of data in the check table first triggers a primary key check, that is, a check of whether the primary key value of the data to be stored already exists. If it already exists, the primary key check fails, which means that the data is duplicate data, and storing the identifier in the check table fails. Otherwise, the primary key check succeeds, which means that the data is not duplicate data, and the identifier is stored in the check table successfully.

In this embodiment, the primary key of the check table is related to the identification field of the data table. For example, the primary key of the check table may be the identification field of the data table, or a combination of the identification field of the data table and other fields. For example, the data table includes a field A, a field B, and a field C, where field A is the identification field, that is, no two records in the data table have the same value of field A; the check table related to the data table includes field A, and field A is set as its primary key.

In this embodiment, the check table may have fewer fields than the data table, for example the check table includes only field A while the data table includes fields A, B and C. Alternatively, the check table may hold fewer records than the data table, for example the data table stores the data of the last year, amounting to ten thousand records, while the check table stores only the data of the last day, amounting to only thousands of records. Because the check table holds less data than the data table, the check table has a primary key while the data table need not have one, and that primary key is related to the identification field of the data table, the check table can be used to judge quickly whether the data to be processed duplicates data in the data table.

Step S120, if the data to be processed does not exist in the data table, processing the data to be processed.

In this embodiment, the absence of the data to be processed from the data table means that it is not duplicate data. The data to be processed then needs to be processed and, after processing, stored in the data table. It should be noted that the identifier of the data to be processed is stored in the identification field of the corresponding record in the data table.

According to the method provided by the embodiment of the invention, setting up a check table for the data table ensures the completeness of the data and increases its reliability. The primary key of the check table is determined according to the identification field of the data table, so whether the data to be processed duplicates data in the data table can be judged from the primary key of the check table, which improves the efficiency of duplicate-data checking and ultimately the efficiency of data processing.
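The relationship between the two tables can be pictured with a minimal sketch using Python's built-in `sqlite3` module. The database engine, the table names `data_table` and `check_table`, and the column names are assumptions chosen for illustration; the text does not prescribe any of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data table: all fields (A, B, C); no primary key is required here.
conn.execute("CREATE TABLE data_table (a TEXT, b TEXT, c TEXT)")
# Check table: only the identification field A, declared as the primary key.
conn.execute("CREATE TABLE check_table (a TEXT PRIMARY KEY)")

# Storing an identifier for the first time passes the primary key check.
conn.execute("INSERT INTO check_table (a) VALUES ('id-001')")

# Storing the same identifier again fails the primary key check.
try:
    conn.execute("INSERT INTO check_table (a) VALUES ('id-001')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows_in_check_table = conn.execute("SELECT COUNT(*) FROM check_table").fetchone()[0]
```

In this sketch the database itself reports the duplicate, so no separate lookup against the larger data table is needed.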

On the basis of fig. 3, an embodiment of the present invention further provides a specific implementation manner for determining whether there is data to be processed in the data table, please refer to fig. 4, where fig. 4 shows a flowchart of another data processing method provided in the embodiment of the present invention, and step S110 includes the following sub-steps:

Substep S1101, performing a primary key check on the identifier of the data to be processed by using the check table.

In this embodiment, because the primary key of the check table is related to the identification field of the data table, and the identification field of the data table is the field that uniquely characterizes the data, a primary key check can be performed on the identifier of the data to be processed through the check table, and whether the data to be processed duplicates data in the data table is determined from the result of the primary key check.

Substep S1102, if the identifier of the data to be processed passes the primary key check, judging that the data to be processed does not exist in the data table, and storing the identifier of the data to be processed in the check table.

Substep S1103, if the identifier of the data to be processed does not pass the primary key check, judging that the data to be processed exists in the data table.

In this embodiment, if the identifier of the to-be-processed data does not pass the primary key verification, the identifier of the to-be-processed data is not stored in the verification table.

According to the method provided by the embodiment of the invention, a primary key check is performed on the identifier of the data to be processed to judge whether the data to be processed is duplicate data, which improves the efficiency of the judgment.
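The substeps above, followed by the processing of non-duplicate data, might be sketched as follows. This is an illustrative sketch only: it assumes SQLite via Python's `sqlite3`, hypothetical table names `check_table` and `data_table`, and it treats "processing" as simply storing the payload.

```python
import sqlite3


def handle_message(conn, ident, payload):
    """Return True if the message was processed, False if it was a duplicate."""
    try:
        # S1101/S1102: storing the identifier triggers the primary key check.
        conn.execute("INSERT INTO check_table (a) VALUES (?)", (ident,))
    except sqlite3.IntegrityError:
        # S1103: the identifier already exists, so the data is a duplicate
        # and nothing is written to the check table.
        return False
    # S120: process the data (here: store it together with its identifier).
    conn.execute("INSERT INTO data_table (a, b) VALUES (?, ?)", (ident, payload))
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check_table (a TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE data_table (a TEXT, b TEXT)")

messages = [("id-1", "x"), ("id-2", "y"), ("id-1", "x")]  # third one is a repeat
results = [handle_message(conn, ident, payload) for ident, payload in messages]
stored_rows = conn.execute("SELECT COUNT(*) FROM data_table").fetchone()[0]
```

The repeated message is rejected by the check table and never reaches the data table.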

In this embodiment, when the system has just started, no check table has been created yet. So that duplicate data can subsequently be judged from the check table, the check table must first be created and a primary key set for it. An embodiment of the present invention therefore provides another data processing method. Referring to fig. 5, fig. 5 shows a flowchart of another data processing method provided in this embodiment of the present invention, and the method further includes:

Step S130, if no check table related to the data table exists, creating the check table and determining the primary key of the check table according to the identification field of the data table.

According to the method provided by the embodiment of the invention, the primary key of the check table is set when the check table is created, which minimizes the impact of creating the check table on subsequent data processing efficiency.
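One way to picture step S130 is a helper that creates the check table, with its primary key, only when it is missing. The sketch assumes SQLite (queried through its `sqlite_master` catalog); the function name `ensure_check_table` and the column name `a` are hypothetical.

```python
import sqlite3


def ensure_check_table(conn, name="check_table"):
    """Create the check table with its primary key if it does not exist yet.

    Returns True when the table was created by this call, False otherwise.
    """
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    if row is not None:
        return False
    # The table name is interpolated into the statement, so it must come
    # from trusted code rather than from user input.
    conn.execute(f'CREATE TABLE "{name}" (a TEXT PRIMARY KEY)')
    return True


conn = sqlite3.connect(":memory:")
created_first = ensure_check_table(conn)   # no table yet, so it is created
created_again = ensure_check_table(conn)   # already present, so it is skipped
```

Declaring the primary key in the same statement that creates the table means no later schema change is needed before the first duplicate check.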

In this embodiment, to record the position from which the consumer fetches data from the message middleware, the consumer usually sets a second offset in the message middleware that represents the position of the data currently to be processed. After the consumer finishes processing the data, it usually sets a first offset that represents the position of the data currently processed. Under normal conditions the first offset and the second offset are synchronized or nearly so. However, when an exception occurs, for example the consumer fails to write the second offset, fails to process the data, or fails to write the first offset, the data must be rolled back: the stored data is restored to a previous state, and the first offset and the second offset must be rolled back to that state as well. To avoid omissions during data recovery, an embodiment of the present invention further provides a specific implementation of exception recovery. Referring to fig. 6, fig. 6 shows a flowchart of another data processing method provided by the embodiment of the present invention, where the method includes:

Step S200, when it is detected that the message middleware is performing exception recovery, taking the smaller of the first offset and the second offset as the start position.

In this embodiment, the first offset and the second offset are continuously increased along with the processing of the data, so that the smaller position of the first offset and the second offset is used as the starting position of the recovery, the starting position corresponds to the earlier processed data, and in order to avoid the omission of the data, the data recovery is performed from the data at the starting position (i.e., the earlier data). For example, if the first offset is 1000 and the second offset is 998, the start position is 998, and data is restored from the data at the position 998.

Step S210, updating the data table and the check table from the data at the start position of the message middleware to perform data exception recovery.

In this embodiment, the process of performing data exception recovery on data from the start position is similar to the foregoing steps S100 to S120, and data to be recovered is sequentially taken out from the start position, and a check table is used to determine whether the data to be recovered already exists in the data table, if so, the data to be recovered is ignored, and the next data is continuously recovered, otherwise, the data to be recovered is processed and stored in the data table.
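The recovery procedure can be sketched in plain Python. In this illustrative sketch the message middleware is modeled as a list, the check table as a set of identifiers, and offsets as list indices; all of these, and the function name `recover`, are simplifying assumptions.

```python
def recover(log, first_offset, second_offset, check_ids, data_table):
    """Replay the log from the smaller offset, skipping identifiers already recorded.

    log        -- list of (identifier, payload) tuples standing in for the middleware
    check_ids  -- set of identifiers standing in for the check table
    data_table -- list of stored (identifier, payload) records
    """
    start = min(first_offset, second_offset)  # step S200
    for ident, payload in log[start:]:        # step S210
        if ident in check_ids:
            continue  # already in the data table: ignore and move to the next item
        check_ids.add(ident)
        data_table.append((ident, payload))
    return start


log = [(f"id-{i}", f"payload-{i}") for i in range(5)]
data_table = log[:3]  # positions 0..2 were stored before the failure
check_ids = {ident for ident, _ in data_table}

start = recover(log, first_offset=3, second_offset=2,
                check_ids=check_ids, data_table=data_table)
```

Starting from the smaller offset replays one item that was already stored; the check against `check_ids` is what keeps that replay from producing a duplicate record.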

In this embodiment, to keep the first offset and the second offset at the correct positions at all times, an embodiment of the present invention further provides a specific implementation for updating the first offset and the second offset. Referring to fig. 7, fig. 7 shows a flowchart of another data processing method provided in the embodiment of the present invention, where the method includes step S101 and step S121.

Step S101, controlling the message middleware to update the second offset.

In this embodiment, the updating of the second offset may be performed after the first server 10 obtains the data to be processed through the message middleware, and the first server 10 controls the second server 20 to update the second offset in the second server 20, so that the second offset points to the next data to be processed, which needs to be obtained.

Step S121, store the data to be processed in the data table, and update the first offset.

In this embodiment, the first offset may be updated after the data to be processed has been processed. In an application scenario where the data to be processed needs to be stored, the data also needs to be stored in the data table after processing, and once it has been stored successfully, the first offset is updated to point to the currently processed data.

Note that, when data is restored, the second offset and the first offset may not be updated in step S101 and step S121, so that the second offset and the first offset point to the correct positions after data restoration.
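The ordering of the two offset updates can be illustrated with a small in-memory model. The class `Consumer` and its attributes are hypothetical; in the text the second offset is kept in the message middleware, whereas here both offsets are plain attributes for simplicity.

```python
class Consumer:
    """Toy consumer that tracks the two offsets described in the text."""

    def __init__(self, log):
        self.log = log            # stand-in for the data held by the middleware
        self.second_offset = 0    # position of the next data to be obtained
        self.first_offset = 0     # advanced only after data has been stored
        self.data_table = []

    def fetch(self):
        message = self.log[self.second_offset]
        self.second_offset += 1   # step S101: updated right after obtaining the data
        return message

    def store(self, message):
        self.data_table.append(message)
        self.first_offset += 1    # step S121: updated after the data is stored


consumer = Consumer(["m0", "m1"])
message = consumer.fetch()
offsets_after_fetch = (consumer.first_offset, consumer.second_offset)
consumer.store(message)
offsets_after_store = (consumer.first_offset, consumer.second_offset)
```

Between `fetch` and `store` the two offsets differ by one; a failure in that window is the kind of situation in which the smaller offset is used as the recovery start position.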

In this embodiment, in order to avoid that the data amount in the check table is too large to affect the efficiency of determining the repeated data, it is necessary to properly clear the data in the check table, so that only a preset amount of data is stored in the check table, and therefore, an embodiment of the present invention further provides a method for clearing the data in the check table, please refer to fig. 8, where fig. 8 shows a flowchart of another data processing method provided by the embodiment of the present invention, where the method includes the following steps:

Step S300, parsing the generation time of each data record from the identification field of each data record in the check table.

In this embodiment, the check table includes a plurality of data records. The value of the identification field of each data record is the identifier of that record, and the identifier is determined from the record's generation time. For example, the identifier may consist of the generation time and a serial number. Where several types of service exist, a unique code may also be assigned to each service type, in which case the identifier may consist of the generation time, a service name, the code of the service type, and a serial number. For example, each service type is given a unique 2-digit code, such as person: 01 and face: 02. Services of the same type are further numbered by pod with a 2-digit service number, for example 00 for person-pod-0 and 01 for person-pod-1, so that the service number is unique within a type. The 4-digit serial numbers are partitioned among the pods by service number: 4 digits give 10,000 serial numbers in total, so if there are N services of the same type, each service receives 10000/N serial numbers, and any remainder is assigned to the last service number. The overall identifier generation rule is: time (for example 20201024000000) + type number (for example 01) + service number (for example 01) + 4-digit serial number. The identifier generation rule provided by this embodiment avoids primary key collisions as far as possible and thereby prevents data loss.
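The identifier generation rule can be written out as a short sketch. The function names `build_identifier` and `serial_ranges` are hypothetical, and the assumption that each service receives one contiguous block of serial numbers is an illustrative reading of the rule rather than something the text states.

```python
from datetime import datetime


def build_identifier(generated_at, type_code, service_no, serial):
    """Concatenate time (14 digits) + type number (2) + service number (2) + serial (4)."""
    if len(type_code) != 2 or not type_code.isdigit():
        raise ValueError("type_code must be a 2-digit string")
    if not 0 <= service_no <= 99:
        raise ValueError("service_no must fit in 2 digits")
    if not 0 <= serial <= 9999:
        raise ValueError("serial must fit in 4 digits")
    return f"{generated_at:%Y%m%d%H%M%S}{type_code}{service_no:02d}{serial:04d}"


def serial_ranges(n_services, total=10000):
    """Split serial numbers 0..total-1 among n services; the last one takes the remainder."""
    size = total // n_services
    ranges = []
    for i in range(n_services):
        start = i * size
        end = total if i == n_services - 1 else start + size
        ranges.append(range(start, end))
    return ranges


example_id = build_identifier(datetime(2020, 10, 24, 0, 0, 0), "01", 1, 0)
sizes_for_three = [len(r) for r in serial_ranges(3)]
```

With three services of the same type, the first two receive 3333 serial numbers each and the last receives 3334, matching the remainder rule in the text.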

In this embodiment, the producer generates an identifier for the produced data according to the identifier generation rule, sends the identifier and the corresponding data to the message middleware for temporary storage, and the consumer takes the data and the corresponding identifier out of the message middleware, stores the identifier of the data in the check table, and stores the data and the identifier of the data in the data table.

Step S310, clearing the data records in the check table whose generation time falls outside the preset time period.

In this embodiment, a time period may be preset so that the check table stores only the data within that period, and the data in the check table may be cleared at a preset cycle to keep it that way. For example, the preset cycle is one day and the preset time period is the last 3 days: at 3 a.m. each day, the generation time of each data record is parsed from the identification field of each data record in the check table, and all data from more than 2 days ago is deleted, ensuring that the check table stores only the data of the last 3 days.

According to the method provided by the embodiment of the invention, the data in the check table is deleted periodically so that the check table is kept at a relatively small data volume, which speeds up the checking of duplicate data.
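The clearing steps can be sketched as follows, assuming that every identifier begins with a 14-digit generation time as in the identifier generation rule, and that "the last 3 days" means today plus the two preceding calendar days. The check table is modeled as a set, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def generation_time(ident):
    """Step S300: parse the generation time from the first 14 characters."""
    return datetime.strptime(ident[:14], "%Y%m%d%H%M%S")


def purge_expired(check_ids, now, keep_days=3):
    """Step S310: keep only identifiers generated within the last keep_days days."""
    midnight_today = datetime(now.year, now.month, now.day)
    cutoff = midnight_today - timedelta(days=keep_days - 1)
    return {ident for ident in check_ids if generation_time(ident) >= cutoff}


check_ids = {
    "2020102223595901000000",  # Oct 22, 23:59:59 -> more than 2 days old
    "2020102300000001000000",  # Oct 23, 00:00:00 -> kept
    "2020102503000001000001",  # Oct 25, 03:00:00 -> kept
}
kept = purge_expired(check_ids, now=datetime(2020, 10, 25, 3, 0, 0))
```

Run at 3 a.m. on October 25, the sketch keeps records from the 23rd onward and drops anything older.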

To illustrate the data processing scheme provided by the above embodiments more clearly, the embodiment of the present invention further provides a specific example, described with the message middleware Kafka as an example.

For example, the type number of the vehicle consumption service is 01, the number of services is 4, the preset period of the check table is 1 day, and the current time is 05:04:02 on October 24, 2020.

The data processing procedure is as follows:

firstly, the built-in initialization data is loaded in the service initialization process, the vehicle 01 is used for loading, the number is from 00 to 03 because the number is 4, and the processing serial numbers of each service are 00-03: 0000-; when the data is pod0 and only this piece of data is processed this second, the token generated according to the token generation rule of the above embodiment is 2020102405040201000000. And after the data generates the identification and completes the corresponding service processing, the data is stored in kafka.

Secondly, the consumer service monitors that the second offset in the kakfa is updated, the data in the kafka are consumed, the second offset on the kafka side is updated immediately after the data are consumed in the service memory, whether a check table for 24 days exists is determined, and if the second offset does not exist, the check table is created; if the data is consistent with the identification in the current check table, the data is discarded, if the data is not consistent with the identification, other data are continuously compared, after comparison processing is completed, the data is uniformly stored in the data table through a copy method, and whether new data is written into the check table or not, the first offset needs to be updated.

The data in the check table is cleared as follows. The check table stores only the identification field of the data table, whereas the data table stores all fields of the data. At 3 a.m. on October 25, 2020, new check tables for the 25th, 26th and 27th are created; any check table that already exists is skipped. After the new check tables are created, the expired data in the sub-table is cleaned up. Because the retention period is only 1 day, the expired data is the data from before the 24th, including all data from the 23rd.
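The expiry rule in this example can be sketched as below. The sketch assumes that the first 14 digits of an identifier are its generation time in the form YYYYMMDDhhmmss, as the example token suggests, and that the cutoff is midnight of the current day minus the retention period; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def generation_time(identifier: str) -> datetime:
    """Parse the generation time from the leading 14 digits (assumed layout)."""
    return datetime.strptime(identifier[:14], "%Y%m%d%H%M%S")


def expired_identifiers(identifiers, now: datetime, retention_days: int = 1):
    """Return the identifiers generated before the retention cutoff."""
    midnight = datetime(now.year, now.month, now.day)
    cutoff = midnight - timedelta(days=retention_days)
    return [i for i in identifiers if generation_time(i) < cutoff]


ids = [
    "2020102323595901000000",  # generated on the 23rd
    "2020102405040201000000",  # generated on the 24th
]
# Cleanup runs at 3 a.m. on October 25, 2020 with a 1-day retention period.
print(expired_identifiers(ids, datetime(2020, 10, 25, 3, 0, 0)))
```

With these inputs the cutoff is 00:00 on October 24, so the record from the 23rd is selected for clearing and the record from the 24th is kept, which matches the example above.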

To carry out the corresponding steps of the data processing method in the above embodiments and their various possible implementations, an implementation of the data processing apparatus 100 is given below. Referring to fig. 9, fig. 9 is a block diagram of a data processing apparatus 100 according to an embodiment of the invention. It should be noted that the basic principle and the technical effect of the data processing apparatus 100 provided in this embodiment are the same as those of the above embodiments; for brevity, matters not mentioned in this embodiment are covered by the corresponding content of the above embodiments.

The data processing apparatus 100 includes an obtaining module 110, a determining module 120, a processing module 130, a recovering module 140, an updating module 150, and a clearing module 160.

The obtaining module 110 is configured to obtain data to be processed through message middleware, where the data to be processed includes an identifier for uniquely characterizing the data to be processed.

The determining module 120 is configured to determine whether to-be-processed data exists in the data table by using the check table if the check table related to the data table exists, where the data table includes an identification field, and a primary key of the check table is related to the identification field of the data table.

As a specific implementation, the determining module 120 is specifically configured to: perform a primary key check on the identifier of the data to be processed by using the check table; if the identifier passes the primary key check, determine that the data to be processed does not exist in the data table and store the identifier in the check table; and if the identifier does not pass the primary key check, determine that the data to be processed exists in the data table.

As a specific implementation, the determining module 120 is further configured to: if no check table related to the data table exists, create the check table and determine the primary key of the check table according to the identification field of the data table.
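The primary key check performed by the determining module 120 can be illustrated with a relational database. The following is a minimal sketch using SQLite from the Python standard library as a stand-in for the first server's database; the table name `check_table`, the column name `id` and the function names are assumptions made for illustration only.

```python
import sqlite3


def ensure_check_table(conn: sqlite3.Connection) -> None:
    """Create the check table if it does not exist; its primary key is the identifier."""
    conn.execute("CREATE TABLE IF NOT EXISTS check_table (id TEXT PRIMARY KEY)")


def passes_primary_key_check(conn: sqlite3.Connection, identifier: str) -> bool:
    """Try to insert the identifier; a primary key conflict means it is a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO check_table (id) VALUES (?)", (identifier,))
        return True   # not seen before; the identifier is now stored
    except sqlite3.IntegrityError:
        return False  # already present, so the data exists in the data table


conn = sqlite3.connect(":memory:")
ensure_check_table(conn)
print(passes_primary_key_check(conn, "2020102405040201000000"))
print(passes_primary_key_check(conn, "2020102405040201000000"))
```

Letting the insert itself act as the check means the lookup and the store happen in one statement against the primary key index, rather than scanning the full data table for a matching record.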

The processing module 130 is configured to process the data to be processed if the data to be processed does not exist in the data table.

The recovering module 140 is configured to: when it is detected that the message middleware is recovering from an exception, take the smaller of the first offset and the second offset as the start position; and update the data table and the check table starting from the data at the start position in the message middleware, so as to recover from the data exception.
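The recovery step can be sketched as follows. This is an illustrative in-memory model under stated assumptions: a list of (identifier, payload) pairs stands in for the message middleware, a set stands in for the check table, and the names `recovery_start` and `replay` are hypothetical.

```python
def recovery_start(first_offset: int, second_offset: int) -> int:
    """Pick the smaller offset so that no record is skipped during recovery."""
    return min(first_offset, second_offset)


def replay(log, start: int, check_table: set, data_table: list) -> int:
    """Re-read the log from start, storing only records not yet in the check table."""
    for identifier, payload in log[start:]:
        if identifier in check_table:
            continue  # already stored before the exception
        check_table.add(identifier)
        data_table.append((identifier, payload))
    return len(log)  # position reached after recovery


log = [("a", 1), ("b", 2), ("c", 3)]
check_table = {"a"}
data_table = [("a", 1)]
# The service had stored up to offset 1; the middleware reports offset 3.
start = recovery_start(first_offset=1, second_offset=3)
print(start, replay(log, start, check_table, data_table), data_table)
```

Starting from the smaller offset may re-read records that were already stored; in this sketch the check table is what prevents those records from being written a second time.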

An updating module 150, configured to control the message middleware to update the second offset.

As a specific implementation, the updating module 150 is further configured to: store the data to be processed in the data table and update the first offset.

The clearing module 160 is configured to: parse the generation time of each data record from the identification field of that data record in the check table; and clear from the check table the data records whose generation time falls within the preset time period.

An embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the data processing method described above.

In summary, embodiments of the present invention provide a data processing method, an apparatus, a server, and a storage medium, which are applied to a first server, where the first server is communicatively connected to a second server running message middleware, and the first server includes a data table. The method includes: acquiring data to be processed through the message middleware, wherein the data to be processed includes an identifier that uniquely characterizes the data to be processed; if a check table related to the data table exists, using the check table to determine whether the data to be processed exists in the data table, wherein the data table includes an identification field and the primary key of the check table is related to the identification field of the data table; and if the data to be processed does not exist in the data table, processing the data to be processed. Compared with the prior art, the embodiments of the invention determine the primary key of the check table according to the identification field of the data table and can use the check table to determine whether duplicate data to be processed exists in the data table, thereby improving the efficiency of checking for duplicate data and ultimately the efficiency of data processing.

The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. A data processing method, characterized in that it is applied to a first server, the first server is communicatively connected with a second server running message middleware, the first server comprises a data table, and the method comprises:

Obtaining data to be processed through the message middleware, wherein the data to be processed includes an identifier for uniquely characterizing the data to be processed;

If there is a check table related to the data table, use the check table to determine whether the data to be processed exists in the data table, wherein the data table includes an identification field, and the primary key of the check table is related to the identification field of the data table;

If the data to be processed does not exist in the data table, the data to be processed is processed.

2. The data processing method according to claim 1, wherein the step of judging whether the data to be processed exists in the data table by using a check table comprises:

Use the check table to perform primary key check on the identifier of the data to be processed;

If the identifier of the data to be processed passes the primary key check, it is determined that the data to be processed does not exist in the data table, and the identifier of the data to be processed is stored in the check table;

If the identifier of the data to be processed fails the primary key check, it is determined that the data to be processed exists in the data table.

3. The data processing method according to claim 1, wherein the method further comprises:

If there is no check table related to the data table, a check table is created, and the primary key of the check table is determined according to the identification field of the data table.

4. The data processing method according to claim 1, wherein the first server stores a first offset, the first offset is used to represent the position of the currently processed data, the second server stores a second offset, the second offset is used to represent the current position of the data to be processed, and the method further comprises:

When it is detected that the message middleware performs abnormal recovery, the smaller of the first offset and the second offset is used as the starting position;

The data table and the check table are updated from the data at the starting position of the message middleware, so as to perform abnormal data recovery.

5. The data processing method according to claim 4, wherein after the step of acquiring the data to be processed through the message middleware, the method further comprises:

The message middleware is controlled to update the second offset.

6. The data processing method according to claim 4, wherein the step of processing the data to be processed further comprises:

The data to be processed is stored in the data table, and the first offset is updated.

7. The data processing method according to claim 1, wherein the primary key of the check table is the identification field of the data table, the check table comprises a plurality of data records, the value of the identification field of each of the data records is determined according to the generation time of that data record, and the method further comprises:

Parse out the generation time of each of the data records from the identification field of each of the data records in the check table;

Clearing data records whose generation time is within a preset time period in the check table.

8. A data processing device, characterized in that it is applied to a first server, the first server is communicatively connected to a second server running message middleware, the first server comprises a data table, and the device comprises:

an acquisition module, configured to acquire data to be processed through the message middleware, wherein the data to be processed includes an identifier for uniquely characterizing the data to be processed;

a judgment module, configured to use the check table to judge whether the data to be processed exists in the data table if there is a check table related to the data table, wherein the data table includes an identification field, and the primary key of the check table is related to the identification field of the data table;

a processing module, configured to process the data to be processed if the data to be processed does not exist in the data table.

9. A server comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the data processing method according to any one of claims 1-7 when executing the computer program.

10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the data processing method according to any one of claims 1-7 is implemented.