
CN114138564B - Fault handling method, processing device, electronic device and readable storage medium - Google Patents

Fault handling method, processing device, electronic device and readable storage medium

Info

Publication number
CN114138564B
Authority
CN
China
Prior art keywords
user data
white list
fault
service table
whitelist
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111487776.7A
Other languages
Chinese (zh)
Other versions
CN114138564A (en)
Inventor
李承文
陈志国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202111487776.7A
Publication of CN114138564A
Application granted
Publication of CN114138564B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1479 Generic software techniques for error detection or fault masking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/142 Reconfiguring to eliminate the error

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present disclosure provides a fault handling method, which can be applied to the field of computer technology, and specifically relates to the field of system fault handling. The fault handling method includes: in the case of a fault in a first system, determining user data associated with the fault in the first system; marking the user data as whitelist user data, wherein the first system can respond to requests from other users except the whitelist user data; transferring the whitelist user data to a second system so as to perform fault isolation on the whitelist user data in the first system; and receiving the whitelist user data from the second system when the fault in the first system is repaired. The present disclosure also provides a processing device, an electronic device, and a readable storage medium.

Description

Fault handling method, processing device, electronic device and readable storage medium
Technical Field
The present disclosure relates to the field of computer technology, and in particular, to the field of system fault handling, and more particularly, to a fault handling method, a processing apparatus, an electronic device, a readable storage medium, and a program product.
Background
As a system is iteratively updated, operated, and maintained, it inevitably encounters various faults during operation. How those faults are handled directly affects the experience of the system's users and reflects the availability and reliability of the system.
In the related art, when a system fails, the whole system has to be stopped so that the fault can be handled, which reduces fault handling efficiency and degrades the user experience.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a fault handling method, a processing apparatus, an electronic device, a readable storage medium, and a program product.
According to a first aspect of the present disclosure, there is provided a fault handling method, comprising: in the event of a fault in a first system, determining user data associated with the fault in the first system; marking the user data as whitelist user data, wherein the first system is capable of responding to requests from users other than those corresponding to the whitelist user data; transferring the whitelist user data to a second system so as to perform fault isolation on the whitelist user data in the first system; and, once the fault in the first system has been repaired, receiving the whitelist user data from the second system.
According to an embodiment of the present disclosure, the second system comprises a temporary service table and a target service table, all initial user data in the first system is transferred from the second system, and the target service table stores all the initial user data of the first system. Transferring the whitelist user data to the second system so as to perform fault isolation on the whitelist user data in the first system comprises: transferring the whitelist user data to the temporary service table; based on the whitelist user data transferred into the temporary service table, updating the initial user data in the target service table that has the same primary key as data in the temporary service table to the whitelist user data; and modifying the transaction route of the whitelist user data to the second system so as to perform fault isolation on the whitelist user data in the first system.
According to an embodiment of the present disclosure, receiving the whitelist user data from the second system once the fault in the first system has been repaired comprises: clearing the user data associated with the fault in the first system; and receiving the whitelist user data from the second system based on a preset logic rule.
According to an embodiment of the present disclosure, the method further comprises, after receiving the whitelist user data from the second system, modifying the transaction route of the whitelist user data to the repaired first system.
According to an embodiment of the disclosure, determining user data associated with the fault in the first system includes analyzing a transaction log in the first system to obtain the user data associated with the fault.
According to the embodiment of the disclosure, marking the user data as the white list user data comprises modifying the state of the user data associated with the fault in the first system, and marking the modified user data as the white list user data.
According to an embodiment of the present disclosure, the method further comprises, after marking the user data as whitelist user data, controlling the business transactions of the whitelist user data.
A second aspect of the present disclosure provides a fault handling apparatus, comprising: a determining module configured to determine, in the event of a fault in a first system, user data associated with the fault in the first system; a marking module configured to mark the user data as whitelist user data, wherein the first system is capable of responding to requests from users other than those corresponding to the whitelist user data; a transferring module configured to transfer the whitelist user data to a second system so as to perform fault isolation on the whitelist user data in the first system; and a receiving module configured to receive the whitelist user data from the second system once the fault in the first system has been repaired.
A third aspect of the present disclosure provides an electronic device comprising one or more processors and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above-described fault handling method.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described fault handling method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above-described fault handling method.
According to the embodiments of the present disclosure, when a fault occurs in the first system, the user data associated with the fault in the first system is determined; the user data is marked as whitelist user data, while the first system remains able to respond to requests from users other than those corresponding to the whitelist user data; the whitelist user data is transferred to the second system so as to perform fault isolation on the whitelist user data in the first system; and the whitelist user data is received from the second system once the fault in the first system has been repaired. This solves the technical problem in the related art that a single fault affects the normal operation of the whole system and degrades the user experience. With whitelist-based processing, the data associated with a fault is isolated and later recovered, so that the operation of the rest of the system is not affected by the fault, fault handling efficiency is improved, the impact range of the fault is reduced, and the user experience is improved. In addition, the processing method is implemented in the Python scripting language, which makes it convenient to deploy and operate when a fault occurs and improves data migration efficiency.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a fault handling method and handling apparatus according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a fault handling method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a method of whitelist user data transfer to a second system, in accordance with an embodiment of the disclosure;
FIG. 4 schematically illustrates a schematic diagram of a fault handling method according to an embodiment of the present disclosure;
FIG. 5 schematically shows a block diagram of a fault handling apparatus according to an embodiment of the present disclosure; and
FIG. 6 schematically illustrates a block diagram of an electronic device adapted to implement a fault handling method according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where a convention analogous to "at least one of A, B, and C, etc." is used, such a convention should in general be interpreted in the sense in which one of skill in the art would commonly understand it (e.g., "a system having at least one of A, B, and C" would include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together).
During a system upgrade, a stage of parallel verification of the new and old systems is indispensable for some important systems. In this stage, both the new system and the old system provide services externally. The old system has been verified over a long period in real service scenarios, so its running state is stable. The new system, which is still in the process of going online, has not yet undergone such long-term verification, so the probability of faults is high. When a fault occurs, it needs to be handled efficiently and stably so as to reduce its impact range and improve the experience of system users.
To this end, embodiments of the present disclosure provide a fault handling method, a processing apparatus, an electronic device, a readable storage medium, and a program product. The fault handling method comprises: in the event of a fault in a first system, determining user data associated with the fault in the first system; marking the user data as whitelist user data, wherein the first system is capable of responding to requests from users other than those corresponding to the whitelist user data; transferring the whitelist user data to a second system so as to perform fault isolation on the whitelist user data in the first system; and receiving the whitelist user data from the second system once the fault in the first system has been repaired.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of users' personal information all comply with the relevant laws and regulations, necessary security measures are taken, and public order and good morals are not violated.
In the technical solution of the present disclosure, the user's authorization or consent is obtained before the user's personal information is obtained or collected.
Fig. 1 schematically illustrates an application scenario diagram of a fault handling method and a handling apparatus according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first system 101 and a second system 102. The first system 101 includes a plurality of servers, i.e., servers 1011, 1012, 1013, etc., and the second system 102 includes a plurality of servers, i.e., servers 1021, 1022, 1023, etc.
The servers in the first system 101 and the second system 102 are servers that provide various services, for example (by way of example only) a background management server that supports the services handled by users. The background management server may analyze and process received data such as user requests, and feed the processing result (e.g., a web page, information, a service handling result, or data obtained or generated according to the user request) back to the user.
The first system 101 and the second system 102 operate in parallel. The first initial user data in the first system 101 is transferred from a first part of all the user data in the second system 102. After the first system 101 runs stably, a second part of the remaining user data in the second system 102 is transferred to the first system 101 as second initial user data. In this way, each time the first system 101 reaches stable operation, further user data is transferred, until all the user data in the second system 102 has been transferred to the first system 101, so that the first system 101 is verified against real service scenarios and reaches a stable running state.
While the first system 101 and the second system 102 operate in parallel, if a fault occurs in the first system 101, the user data associated with the fault in the first system 101 may be transferred to the second system 102 as whitelist user data. After the fault in the first system 101 is repaired, the whitelist user data that was transferred to the second system 102 is transferred back to the first system 101.
It should be noted that, the fault handling method provided in the embodiments of the present disclosure may be generally executed by a certain server in the first system 101 and the second system 102, or may be executed by a server cluster in the first system 101 and the second system 102. Accordingly, the fault handling apparatus provided in the embodiments of the present disclosure may be generally disposed in a server in the first system 101 and the second system 102, or may be disposed in a server cluster in the first system 101 and the second system 102.
It should be understood that the number of systems and servers in fig. 1 is merely illustrative. There may be any number of systems and servers, as desired for implementation.
The fault handling method of the disclosed embodiment will be described in detail with reference to fig. 2 to 4 based on the scenario described in fig. 1.
In the technical solution of the present disclosure, the acquisition, collection, storage, use, processing, transmission, provision, disclosure, and application of the data all comply with the relevant laws and regulations, necessary security measures are taken, and public order and good morals are not violated.
Fig. 2 schematically illustrates a flow chart of a fault handling method according to an embodiment of the present disclosure.
As shown in fig. 2, the fault handling method of this embodiment includes operations S210 to S240.
In operation S210, in case of a failure of the first system, user data associated with the failure in the first system is determined.
According to embodiments of the present disclosure, the first system failure may include a hardware failure of the system, a software program failure, a network failure, a subsystem operation failure, and so forth.
According to an embodiment of the present disclosure, the user data may include user identification information and service information corresponding to the user identification information.
It should be noted that, in the technical solution of the present disclosure, the authorization or consent of the user is obtained before the personal information of the user is obtained or collected.
According to an embodiment of the present disclosure, when the first system fails during operation, the fault needs to be detected and repaired. To that end, the user data associated with the fault is determined first, so that the fault can be located and repaired after that user data has been isolated.
In operation S220, the user data is marked as white list user data, wherein the first system is able to respond to requests of other users than the white list user data.
According to an embodiment of the present disclosure, whitelist user data refers to the user data that is associated with the fault and therefore needs fault handling. Other user data, which is not associated with the fault, does not belong to the whitelist user data.
According to an embodiment of the present disclosure, for user requests that do not involve the whitelist user data, the first system operates normally and responds to the requests.
According to an embodiment of the present disclosure, after the user data is marked as whitelist user data, the business transactions of the whitelist user data are controlled. That is, transactions on the whitelist user data are rejected and, at the same time, a status prompt indicating that transaction operations are prohibited is presented externally through an external interface.
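The transaction control described above can be sketched as a small gate placed in front of the first system. This is only an illustration under assumptions: the class `FirstSystemGate`, the exception `TransactionRejected`, and the user identifiers are hypothetical names that do not appear in the disclosure, and the whitelist is held in memory rather than in a database.

```python
from dataclasses import dataclass, field


class TransactionRejected(Exception):
    """Raised when a request belongs to a whitelisted (fault-isolated) user."""


@dataclass
class FirstSystemGate:
    # Hypothetical in-memory whitelist; a real deployment would persist it.
    whitelist: set = field(default_factory=set)

    def mark(self, user_id: str) -> None:
        """Add a fault-associated user to the whitelist."""
        self.whitelist.add(user_id)

    def handle(self, user_id: str, request: str) -> str:
        """Serve ordinary users; reject whitelisted users with a status prompt."""
        if user_id in self.whitelist:
            raise TransactionRejected(
                f"transactions are suspended for user {user_id}"
            )
        return f"processed {request} for {user_id}"


gate = FirstSystemGate()
gate.mark("u42")  # u42 is assumed to be associated with the fault

ok_result = gate.handle("u7", "balance-query")  # non-whitelisted user is served
try:
    gate.handle("u42", "transfer")
    rejected = False
except TransactionRejected:
    rejected = True  # whitelisted user receives the "prohibited" status
```

The point of the sketch is that only requests touching whitelist user data are refused, while every other request continues to be served by the first system.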
In operation S230, the whitelist user data is transferred to the second system so as to perform fault isolation on the whitelist user data in the first system.
According to an embodiment of the present disclosure, when a problem occurs in some applications or subsystems of the first system, the affected part is isolated and repaired independently, so that the other applications or subsystems of the first system keep operating normally and are not affected by the isolated fault.
According to an embodiment of the present disclosure, the whitelist user data is a part of the user data in the first system, namely the user data associated with the fault.
The transfer of the whitelist user data to the second system may include migration of the whitelist user data and modification of the transaction route of the whitelist user data, in accordance with embodiments of the present disclosure.
In operation S240, the whitelist user data is received from the second system once the fault in the first system has been repaired.
According to an embodiment of the present disclosure, after the whitelist user data is transferred to the second system, it is served by the second system while the fault in the first system is located and repaired. After the repair is completed, the whitelist user data that was transferred to the second system is transferred back to the first system, so that service for the whitelist user data is restored on the first system.
According to an embodiment of the present disclosure, fault recovery may refer to the strategies, methods, and techniques by which, after the first system detects a fault and isolates it, a preset method is selected so that, once the fault is repaired, the first system returns to the task point before the fault and continues to work.
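Operations S210 to S240 can be summarized in a compact sketch. It makes simplifying assumptions that are not part of the disclosure: each system is modeled as a plain dictionary keyed by a hypothetical user ID, the route table is a dictionary, and the function names `isolate` and `recover` are illustrative only.

```python
def isolate(first, second, routes, fault_user_ids):
    """S210-S230: whitelist the fault-associated users and move their data."""
    whitelist = sorted(uid for uid in fault_user_ids if uid in first)
    for uid in whitelist:
        # Transfer the record and clear it from the first system.
        second[uid] = dict(first.pop(uid))
        routes[uid] = "second"
    return whitelist


def recover(first, second, routes, whitelist):
    """S240: after the repair, receive the whitelist data back."""
    for uid in whitelist:
        first[uid] = dict(second[uid])
        routes[uid] = "first"


# Toy data: the second system retains the initial data of the first system.
first = {"u1": {"balance": 100}, "u2": {"balance": 200}}
second = {"u1": {"balance": 100}, "u2": {"balance": 200}}
routes = {"u1": "first", "u2": "first"}

whitelist = isolate(first, second, routes, {"u2"})
during_fault = dict(routes)      # snapshot of routing while the fault is open

second["u2"]["balance"] = 260    # u2 keeps transacting on the second system
recover(first, second, routes, whitelist)
```

In this toy run, user `u1` stays on the first system throughout, while `u2` is served by the second system during the repair and returns with the balance it accumulated there.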
According to the embodiments of the present disclosure, when a fault occurs in the first system, the user data associated with the fault in the first system is determined; the user data is marked as whitelist user data, while the first system remains able to respond to requests from users other than those corresponding to the whitelist user data; the whitelist user data is transferred to the second system so as to perform fault isolation on the whitelist user data in the first system; and the whitelist user data is received from the second system once the fault in the first system has been repaired. This solves the technical problem in the related art that a single fault affects the normal operation of the whole system and degrades the user experience. With whitelist-based processing, the data associated with a fault is isolated and later recovered, so that the operation of the rest of the system is not affected by the fault, fault handling efficiency is improved, the impact range of the fault is reduced, and the user experience is improved. In addition, the processing method is implemented in the Python scripting language, which makes it convenient to deploy and operate when a fault occurs and improves data migration efficiency.
According to an embodiment of the present disclosure, the second system includes a temporary service table and a target service table, initial user data in the first system is transferred from the second system, and the target service table stores the initial user data in the first system.
According to an embodiment of the present disclosure, the first system and the second system are parallel systems in a parallel operation state. The initial user data in the first system is transferred from a part of all the user data in the second system, and that initial user data is also retained in the second system. The initial user data in the first system constitutes all of the user data in the first system.
According to an embodiment of the present disclosure, the temporary service table may be used to store the whitelist user data when it is transferred from the first system to the second system.
Fig. 3 schematically illustrates a flow chart of a method of whitelist user data transfer to a second system, in accordance with an embodiment of the disclosure.
As shown in FIG. 3, the method may include operations S310-S330.
In operation S310, the whitelist user data is transferred to the temporary service table.
According to an embodiment of the present disclosure, transferring the whitelist user data to the temporary service table may be accomplished using Structured Query Language (SQL).
According to an embodiment of the present disclosure, SQL is a database query and programming language used to access data and to query, update, and manage relational database systems.
In operation S320, based on the whitelist user data transferred into the temporary service table, the initial user data in the target service table that has the same primary key as data in the temporary service table is updated to the whitelist user data.
According to an embodiment of the present disclosure, a primary key consists of one or more fields of a data table and uniquely identifies a record in that table. For example, in a relationship between two tables, the primary key is used to reference a particular record in one table from the other table.
According to an embodiment of the present disclosure, for example, records with the same primary key in the temporary service table and the target service table are joined, so that the whitelist user data in the temporary service table is referenced from the target service table and the corresponding initial user data in the target service table is updated to the whitelist user data.
According to an embodiment of the present disclosure, for whitelist user data in the temporary service table whose primary key does not exist in the target service table, the data may be inserted directly into the target service table, which completes the update of the target service table.
According to an embodiment of the present disclosure, because the whitelist user data has been in use in the first system for some time, some meaningless or dirty data has been generated; such data may be deleted while the whitelist user data is transferred to the second system.
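The update-then-insert step of operation S320 can be illustrated with Python's standard `sqlite3` module. This is a sketch under assumptions: the table names `temp_service` and `target_service`, the columns `user_id` and `balance`, and the use of SQLite are all hypothetical, since the disclosure does not name a database product or a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target_service (user_id TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE temp_service   (user_id TEXT PRIMARY KEY, balance INTEGER);
    -- Initial user data retained in the second system.
    INSERT INTO target_service VALUES ('u1', 100), ('u2', 200);
    -- Whitelist user data just transferred from the first system.
    INSERT INTO temp_service VALUES ('u2', 250), ('u3', 300);
""")


def merge_whitelist(conn: sqlite3.Connection) -> None:
    """Apply the temporary table to the target table in one transaction."""
    with conn:  # commits on success, rolls back on error
        # Rows whose primary key exists in both tables: overwrite the target.
        conn.execute("""
            UPDATE target_service
               SET balance = (SELECT t.balance
                                FROM temp_service AS t
                               WHERE t.user_id = target_service.user_id)
             WHERE user_id IN (SELECT user_id FROM temp_service)
        """)
        # Rows present only in the temporary table: insert them directly.
        conn.execute("""
            INSERT INTO target_service (user_id, balance)
            SELECT t.user_id, t.balance
              FROM temp_service AS t
             WHERE t.user_id NOT IN (SELECT user_id FROM target_service)
        """)


merge_whitelist(conn)
rows = conn.execute(
    "SELECT user_id, balance FROM target_service ORDER BY user_id"
).fetchall()
```

Running both statements in a single transaction means the target service table never exposes a half-applied whitelist. A filter for dirty data could be added to the `SELECT` clauses; it is omitted here because the disclosure does not say how such data is recognized.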
In operation S330, the transaction route of the whitelist data is modified to the second system so as to perform fault isolation on the whitelist user data in the first system.
According to embodiments of the present disclosure, transaction routing refers to controlling a transaction path of user data, i.e., whether the user data is operating on a first system or a second system.
According to an embodiment of the present disclosure, configuration is performed using a preset programming language and a file in a preset format, so as to migrate the whitelist user data and modify its transaction route from the first system to the second system, thereby performing fault isolation on the whitelist user data of the first system.
According to an embodiment of the present disclosure, after the transaction route of the whitelist user data is modified, the control on transactions of the whitelist user data is lifted, so that the whitelist user data is served normally by the second system.
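Transaction routing as described for operation S330 might be modeled as a per-user override on top of a default route. The sketch below is an assumption-laden illustration: `TransactionRouter`, its methods, and the route constants are hypothetical names, and the routing state lives in memory rather than in the configuration file that the disclosure mentions.

```python
ROUTE_FIRST = "first_system"
ROUTE_SECOND = "second_system"


class TransactionRouter:
    """Decides which system serves a user's transactions."""

    def __init__(self, default_route: str = ROUTE_FIRST) -> None:
        self.default_route = default_route
        self.overrides = {}     # user_id -> system that serves the user
        self.suspended = set()  # users whose transactions are rejected

    def suspend(self, user_id: str) -> None:
        """Reject transactions while the user's data is being migrated."""
        self.suspended.add(user_id)

    def reroute(self, user_id: str, target: str) -> None:
        """Point the user at `target`, then lift the transaction control."""
        self.overrides[user_id] = target
        self.suspended.discard(user_id)

    def resolve(self, user_id: str):
        """Return the serving system, or None while transactions are rejected."""
        if user_id in self.suspended:
            return None
        return self.overrides.get(user_id, self.default_route)


router = TransactionRouter()
router.suspend("u42")                 # marked as whitelist, migration running
while_migrating = router.resolve("u42")
router.reroute("u42", ROUTE_SECOND)   # migration done, route modified
after_isolation = router.resolve("u42")
unaffected = router.resolve("u7")
```

Lifting the suspension inside `reroute` mirrors the order given in the text: the route is modified first, and only then are transactions on the whitelist user data allowed again.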
According to an embodiment of the present disclosure, receiving the whitelist user data from the second system once the fault in the first system has been repaired includes: clearing the user data associated with the fault in the first system; and receiving the whitelist user data from the second system based on a preset logic rule.
According to an embodiment of the present disclosure, after the whitelist user data is transferred to the second system, repairing the fault in the first system may include accurately locating the fault in the first system and repairing the located fault.
According to an embodiment of the present disclosure, after the fault in the first system is repaired, the user data associated with the fault in the first system, that is, the whitelist user data as it stood before being transferred to the second system, is cleared.
It should be noted that the clearing of the user data associated with the fault in the first system may be performed when the whitelist user data is transferred to the second system, or may be performed before the whitelist user data is received from the second system. The embodiments of the present disclosure place no particular limitation on this.
According to an embodiment of the present disclosure, the user data in the target service table of the second system is filtered to obtain the whitelist user data. The business transactions corresponding to the whitelist user data in the second system are then controlled: transactions on the whitelist user data are rejected, and a status prompt indicating that transaction operations are prohibited is presented externally through an external interface.
According to an embodiment of the disclosure, the whitelist user data in the second system is transferred to the first system with the repaired fault through the structured query language.
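The transfer back to the repaired first system can be sketched with two SQLite connections standing in for the two systems. Everything named here is an assumption made for illustration: the `service` table, its columns, the helper `make_db`, and the function `fail_back` do not come from the disclosure, and the "preset logic rule" is reduced to a filter on hypothetical user IDs.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory stand-in for one system's service table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE service (user_id TEXT PRIMARY KEY, balance INTEGER)"
    )
    conn.executemany("INSERT INTO service VALUES (?, ?)", rows)
    conn.commit()
    return conn


def fail_back(first, second, whitelist):
    """Clear stale whitelist rows in `first`, then copy them from `second`."""
    ids = sorted(whitelist)
    if not ids:
        return 0
    marks = ",".join("?" * len(ids))  # one placeholder per user ID
    rows = second.execute(
        f"SELECT user_id, balance FROM service WHERE user_id IN ({marks})",
        ids,
    ).fetchall()
    with first:  # delete and insert succeed or fail together
        first.execute(f"DELETE FROM service WHERE user_id IN ({marks})", ids)
        first.executemany("INSERT INTO service VALUES (?, ?)", rows)
    return len(rows)


# u2's row in the first system is assumed to be stale after the fault.
first = make_db([("u1", 100), ("u2", -999)])
second = make_db([("u1", 100), ("u2", 260)])

moved = fail_back(first, second, {"u2"})
first_rows = first.execute(
    "SELECT user_id, balance FROM service ORDER BY user_id"
).fetchall()
```

Deleting the fault-associated rows and inserting the received rows in one transaction keeps the first system from exposing either the stale data or a gap for the whitelisted users.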
According to an embodiment of the present disclosure, the method further includes, after receiving the whitelist user data from the second system, modifying the transaction route of the whitelist user data to the repaired first system.
According to an embodiment of the present disclosure, after the whitelist user data has been transferred to the first system and its transaction route has been modified from the second system to the first system, the rejection of transactions on the whitelist user data is lifted, so that the whitelist user data is served normally by the repaired first system.
Determining user data associated with the fault in the first system, according to embodiments of the present disclosure, includes analyzing a transaction log in the first system to obtain the user data associated with the fault.
According to an embodiment of the present disclosure, when determining the user data associated with the fault in the first system, the transaction log information of all user data in the first system may be acquired first; the log information is then analyzed to screen out the error messages associated with the fault, and the user data associated with the fault is determined from those error messages.
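The log screening described above could look like the following sketch. The log line layout, the `ERROR` level, the `user=` field, and the marker string `ledger-service` are all invented for the example; the disclosure does not specify a log format.

```python
import re

# Assumed line layout: "<timestamp> <LEVEL> user=<id> <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<level>[A-Z]+) user=(?P<user>\S+) (?P<msg>.*)$"
)


def users_associated_with_fault(lines, fault_marker):
    """Collect users whose ERROR entries mention the given fault marker."""
    users = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        if match["level"] == "ERROR" and fault_marker in match["msg"]:
            users.add(match["user"])
    return users


# Made-up sample log for illustration only.
sample_log = [
    "2021-12-07T10:00:01 INFO user=u1 transfer ok",
    "2021-12-07T10:00:02 ERROR user=u2 ledger-service timeout",
    "2021-12-07T10:00:03 ERROR user=u3 invalid password",
    "2021-12-07T10:00:04 ERROR user=u2 ledger-service timeout",
    "malformed line",
]

affected = users_associated_with_fault(sample_log, "ledger-service")
```

Filtering on a fault marker rather than on every error keeps unrelated failures, such as the wrong password of `u3`, out of the whitelist.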
According to an embodiment of the present disclosure, marking the user data as white list user data includes modifying the state of the user data associated with the fault in the first system and marking the modified user data as white list user data.
According to embodiments of the present disclosure, the state of user data may include an operable state and an operation-refused state; for example, the operable state is represented by "Y" and the operation-refused state by "N".
According to an embodiment of the present disclosure, before the first system fails, the user data in the first system is in the operable state. When the first system fails, and after the user data associated with the fault has been determined, the state of that user data is changed from operable ("Y") to operation refused ("N"), and the user data is marked as white list user data.
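The state change can be pictured with a small sketch. The "Y" and "N" values come from the description above; the `UserRecord` class, its field names and the `mark_whitelist` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass

OPERABLE = "Y"  # operation may be performed
REFUSED = "N"   # operation is refused

@dataclass
class UserRecord:
    user_id: str
    status: str = OPERABLE
    whitelisted: bool = False

def mark_whitelist(records, fault_user_ids):
    """Set fault-associated records to 'N' and flag them as white list data."""
    whitelist = []
    for record in records:
        if record.user_id in fault_user_ids:
            record.status = REFUSED
            record.whitelisted = True
            whitelist.append(record)
    return whitelist
```

Records that are not associated with the fault keep the state "Y", which matches the requirement that the first system continues to serve other users.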
According to embodiments of the present disclosure, the fault handling method may be implemented as a script written in the Python language.
According to an embodiment of the present disclosure, implementing the fault handling method as a Python script makes configuration and deployment convenient and flexible when a fault occurs, improves the efficiency of data migration, and allows the method to be adapted to a variety of migration scenarios, which widens its range of application.
Fig. 4 schematically illustrates a fault handling method according to an embodiment of the present disclosure.
As shown in fig. 4, when the first system 401 fails, the user data 4011 associated with the fault in the first system 401 is determined and marked as white list user data 402. The white list user data 402 is transferred to the second system 403 so that it is fault-isolated, and once the fault of the first system 401 has been repaired, the white list user data in the second system 403 is transferred back to the first system 401.
Based on the fault handling method described above, the present disclosure further provides a fault handling apparatus. The apparatus is described in detail below in connection with fig. 5.
Fig. 5 schematically shows a block diagram of a fault handling apparatus according to an embodiment of the present disclosure.
As shown in fig. 5, the fault handling apparatus 500 of this embodiment may include a determination module 510, a tagging module 520, a transfer module 530, and a receiving module 540.
The determining module 510 is configured to determine, in the event of a fault in the first system, the user data associated with the fault in the first system. In an embodiment, the determining module 510 may be configured to perform operation S210 described above, which is not repeated here.
The tagging module 520 is configured to mark the user data as white list user data, wherein the first system is capable of responding to requests of users other than those corresponding to the white list user data. In an embodiment, the tagging module 520 may be configured to perform operation S220 described above, which is not repeated here.
The transfer module 530 is configured to transfer the white list user data to the second system, so as to perform fault isolation on the white list user data in the first system. In an embodiment, the transfer module 530 may be configured to perform operation S230 described above, which is not repeated here.
The receiving module 540 is configured to receive the white list user data from the second system once the fault of the first system has been repaired. In an embodiment, the receiving module 540 may be configured to perform operation S240 described above, which is not repeated here.
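One possible way to picture how the four modules cooperate is a small orchestration class that calls them in the order S210, S220, S230, S240. The class name `FaultHandler`, its method names and the use of plain callables for the modules are assumptions made for illustration, not the structure required by the disclosure.

```python
class FaultHandler:
    """Chains four pluggable modules in the order S210, S220, S230, S240."""

    def __init__(self, determine, mark, transfer, receive):
        self.determine = determine  # role of determining module 510
        self.mark = mark            # role of tagging module 520
        self.transfer = transfer    # role of transfer module 530
        self.receive = receive      # role of receiving module 540
        self.whitelist = []

    def on_fault(self, fault):
        """Run S210 to S230 when the first system reports a fault."""
        users = self.determine(fault)
        self.whitelist = list(self.mark(users))
        self.transfer(self.whitelist)
        return self.whitelist

    def on_repaired(self):
        """Run S240 after the fault of the first system has been repaired."""
        self.receive(self.whitelist)
        restored, self.whitelist = self.whitelist, []
        return restored
```

Passing the modules in as callables keeps the sketch independent of how each module is implemented, which is in line with the statement that the modules may be combined or split.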
According to an embodiment of the present disclosure, when the first system fails, the user data associated with the fault in the first system is determined and marked as white list user data, while the first system remains able to respond to requests of users other than those corresponding to the white list user data. The white list user data is transferred to the second system so that it is fault-isolated from the first system, and once the fault of the first system has been repaired, the white list user data is received back from the second system. This addresses the technical problem in the related art that a single system fault affects the normal operation of the whole system and degrades the user experience. Because fault isolation and fault recovery are applied, on a white list basis, only to the data associated with the fault, the operation of the system as a whole is not affected by the fault, fault handling efficiency is improved, the scope of impact of the fault is reduced, and the user experience is improved. In addition, because the method is implemented as a Python script, it is convenient to deploy and operate when a fault occurs, and data migration efficiency is improved.
According to an embodiment of the present disclosure, the second system includes a temporary service table and a target service table, all initial user data in the first system is transferred from the second system, and all initial user data in the first system is stored in the target service table.
According to embodiments of the present disclosure, the transfer module 530 may include a transfer sub-module, an update sub-module, a first modification sub-module.
The transfer sub-module is configured to transfer the white list user data to the temporary service table.
The update sub-module is configured to update, based on the white list data transferred into the temporary service table, the initial user data in the target service table whose primary key matches a primary key in the temporary service table to the white list data.
The first modification sub-module is configured to modify the transaction route of the white list data to the second system, so as to perform fault isolation on the white list user data in the first system.
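The three sub-modules above can be sketched with SQLite as a stand-in database. The table names `temp_service` and `target_service`, the `user_id` primary key, the `payload` column and the `route_table` dictionary are hypothetical; the disclosure only states that a temporary service table, a target service table, a primary-key match and a transaction route exist.

```python
import sqlite3

def isolate_whitelist(second_db, whitelist_rows, route_table):
    """Stage white list rows, merge them into the target table, then reroute."""
    cur = second_db.cursor()
    # Step 1: transfer the white list user data to the temporary service table.
    cur.executemany(
        "INSERT OR REPLACE INTO temp_service (user_id, payload) VALUES (?, ?)",
        whitelist_rows,
    )
    # Step 2: overwrite target rows whose primary key also appears
    # in the temporary service table.
    cur.execute(
        """
        UPDATE target_service
           SET payload = (SELECT t.payload FROM temp_service AS t
                           WHERE t.user_id = target_service.user_id)
         WHERE user_id IN (SELECT user_id FROM temp_service)
        """
    )
    second_db.commit()
    # Step 3: route transactions of the white list users to the second system.
    for user_id, _ in whitelist_rows:
        route_table[user_id] = "second"
```

Staging the rows in a temporary table first means the target service table is only touched by a single keyed update, which is one plausible reason for the two-table design.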
According to an embodiment of the present disclosure, the receiving module 540 may include a clearing sub-module and a receiving sub-module.
The clearing sub-module is configured to clear the user data associated with the fault in the first system.
The receiving sub-module is configured to receive the white list user data from the second system based on a preset logic rule.
According to an embodiment of the present disclosure, after the white list user data has been received from the second system, the transaction route of the white list user data is further modified to point to the repaired first system.
According to an embodiment of the present disclosure, the determination module 510 may include an analysis sub-module.
The analysis sub-module is configured to analyze the transaction log in the first system to obtain the user data associated with the fault.
According to an embodiment of the present disclosure, the tagging module 520 may include a second modification sub-module and a marking sub-module.
The second modification sub-module is configured to modify the state of the user data associated with the fault in the first system.
The marking sub-module is configured to mark the modified user data as white list user data.
According to an embodiment of the present disclosure, after the user data has been marked as white list user data, the business transactions of the white list user data are further controlled.
According to an embodiment of the present disclosure, any number of the determining module 510, the marking module 520, the transferring module 530, and the receiving module 540 may be combined into a single module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of other modules and implemented in a single module. According to embodiments of the present disclosure, at least one of the determining module 510, the marking module 520, the transferring module 530, and the receiving module 540 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC); may be implemented in hardware or firmware in any other reasonable way of integrating or packaging circuitry; or may be implemented in any one of software, hardware, and firmware, or in a suitable combination of the three. Alternatively, at least one of the determining module 510, the marking module 520, the transferring module 530, and the receiving module 540 may be implemented at least in part as a computer program module which, when executed, performs the corresponding function.
Fig. 6 schematically illustrates a block diagram of an electronic device adapted to implement a fault handling method according to an embodiment of the disclosure.
As shown in fig. 6, an electronic device 600 according to an embodiment of the present disclosure includes a processor 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 601 may also include on-board memory for caching purposes. The processor 601 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. The processor 601 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or the RAM 603. Note that the program may be stored in one or more memories other than the ROM 602 and the RAM 603. The processor 601 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in one or more memories.
According to an embodiment of the present disclosure, the electronic device 600 may also include an input/output (I/O) interface 605, which is also connected to the bus 604. The electronic device 600 may also include one or more of the following components connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
The present disclosure also provides a computer-readable storage medium that may be included in the apparatus/device/system described in the above embodiments, or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 602 and/or RAM 603 and/or one or more memories other than ROM 602 and RAM 603 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. The program code, when executed in a computer system, causes the computer system to implement the fault handling methods provided by embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 601. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of signals over a network medium, and downloaded and installed via the communication section 609, and/or installed from the removable medium 611. The computer program may comprise program code that is transmitted using any appropriate network medium, including but not limited to wireless, wireline, etc., or any suitable combination of the preceding.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 601. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for performing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, and similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or integrated without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. These examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (12)

1. A fault handling method, comprising:
in the event of a failure of a first system, determining user data associated with the failure in the first system;
marking the user data as whitelisted user data, wherein the first system is capable of responding to requests of users other than those corresponding to the whitelisted user data;
transferring the white list user data to a second system so as to perform fault isolation on the white list user data in the first system, wherein the second system and the first system operate in parallel, the second system comprises a temporary service table and a target service table, and all initial user data in the first system is stored in the target service table; and
receiving the whitelist user data from the second system once the fault of the first system has been repaired;
wherein transferring the whitelist user data to a second system for fault isolation of the whitelist user data in the first system comprises:
transferring the white list user data to the temporary service table;
updating, based on the white list data transferred into the temporary service table, the initial user data in the target service table that has the same primary key as data in the temporary service table to the white list data; and
modifying the transaction route of the white list data to the second system, so that the white list user data operates normally in the second system and fault isolation is performed on the white list user data in the first system.
2. The method of claim 1, wherein all initial user data in the first system is transferred from the second system.
3. The method of claim 1, wherein receiving the whitelist user data from the second system once the fault of the first system has been repaired comprises:
clearing user data associated with the fault in the first system; and
receiving the white list user data from the second system based on a preset logic rule.
4. The method of claim 1, further comprising, after said receiving the whitelisted user data from the second system:
modifying the transaction route of the white list user data to the repaired first system.
5. The method of claim 1, wherein determining user data associated with the failure in the first system comprises:
analyzing the transaction log in the first system to obtain user data associated with the fault.
6. The method of claim 1, wherein the tagging the user data as whitelisted user data comprises:
modifying a state of user data associated with the fault in the first system; and
marking the modified user data as white list user data.
7. The method of claim 1, further comprising, after said marking said user data as whitelisted user data:
controlling the business transaction of the white list user data.
8. The method according to any one of claims 1-7, wherein the fault handling method is implemented as a script written in the Python language.
9. A fault handling apparatus comprising:
a determining module, configured to determine, in case of a failure of a first system, user data associated with the failure in the first system;
a tagging module, configured to tag the user data as whitelisted user data, wherein the first system is capable of responding to requests of users other than those corresponding to the whitelisted user data;
a transfer module, configured to transfer the white list user data to a second system so as to perform fault isolation on the white list user data in the first system, wherein the second system and the first system operate in parallel, the second system comprises a temporary service table and a target service table, and all initial user data in the first system is stored in the target service table; and
a receiving module, configured to receive the whitelist user data from the second system once the fault of the first system has been repaired;
wherein the transfer module comprises:
a transferring sub-module, configured to transfer the whitelist user data to the temporary service table;
an updating sub-module, configured to update, based on the whitelist data transferred into the temporary service table, the initial user data in the target service table that has the same primary key as data in the temporary service table to the whitelist data; and
a first modification sub-module, configured to modify the transaction route of the white list data to the second system, so that the white list user data operates normally in the second system and fault isolation is performed on the white list user data in the first system.
10. An electronic device, comprising:
one or more processors; and
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
11. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-8.
12. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-8.
CN202111487776.7A 2021-12-07 2021-12-07 Fault handling method, processing device, electronic device and readable storage medium Active CN114138564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111487776.7A CN114138564B (en) 2021-12-07 2021-12-07 Fault handling method, processing device, electronic device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111487776.7A CN114138564B (en) 2021-12-07 2021-12-07 Fault handling method, processing device, electronic device and readable storage medium

Publications (2)

Publication Number Publication Date
CN114138564A CN114138564A (en) 2022-03-04
CN114138564B true CN114138564B (en) 2025-06-06

Family

ID=80385000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111487776.7A Active CN114138564B (en) 2021-12-07 2021-12-07 Fault handling method, processing device, electronic device and readable storage medium

Country Status (1)

Country Link
CN (1) CN114138564B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693324A (en) * 2012-01-09 2012-09-26 西安电子科技大学 Distributed database synchronization system, synchronization method and node management method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130124916A1 (en) * 2011-11-16 2013-05-16 Microsoft Corporation Layout of mirrored databases across different servers for failover
CN104750757B (en) * 2013-12-31 2018-05-08 中国移动通信集团公司 A kind of date storage method and equipment based on HBase
US9436553B2 (en) * 2014-08-04 2016-09-06 Microsoft Technology Licensing, Llc Recovering usability of cloud based service from system failure

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于CTMO模型的数据库损坏数据隔离技术";戴华 等;《计算机学报》;20110228;第34卷(第2期);第275-290 *

Also Published As

Publication number Publication date
CN114138564A (en) 2022-03-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant