CN109525434B - Redundancy backup method based on onboard equipment board card - Google Patents

Info

Publication number
CN109525434B
Authority
CN
China
Prior art keywords
board card
equipment
equipment board
board
card
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811528706.XA
Other languages
Chinese (zh)
Other versions
CN109525434A (en
Inventor
符腾飞
李春芳
柏晓平
黄小亮
李曦雅
周长红
解小刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AVIC Shanghai Aeronautical Measurement Controlling Research Institute
Original Assignee
AVIC Shanghai Aeronautical Measurement Controlling Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AVIC Shanghai Aeronautical Measurement Controlling Research Institute
Priority to CN201811528706.XA
Publication of CN109525434A
Application granted
Publication of CN109525434B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Environmental & Geological Engineering (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention discloses a redundancy backup method based on airborne equipment board cards. When a system comprises two equipment board cards, one of them is randomly selected as the main board and the other serves as the standby board. The main board performs system service processing; after the main board fails, the standby board switches to the main role and continues the system service processing. The invention switches the redundant backup automatically under conditions such as single-board start-up, a network storm, an abnormal heartbeat pin, or a main board failure, which prolongs the service life of the main board and improves the robustness of the system.

Description

Redundancy backup method based on onboard equipment board card
Technical Field
The invention relates to a main/standby switching technology of communication equipment, in particular to a redundancy backup method based on an onboard equipment board card.
Background
Airborne equipment has high reliability requirements, so it generally uses two equipment board cards, one serving as the redundant backup of the other. The main board normally handles the equipment's tasks; when the main board fails, the standby board must switch to the main role within a specified time and take over those tasks. However, using the same fixed main board and standby board at every start-up shortens the service life of the board cards.
In addition, airborne equipment places strict requirements on switching time: the switchover must generally be completed within milliseconds, otherwise the normal operation of the service software is affected. The patent "1+1 redundancy backup method and system of communication equipment" proposes sending the heartbeat over the network and performing the main/standby switchover when the heartbeat is abnormal. However, the network is not real-time and message arrival time is uncontrollable, so that approach cannot guarantee that airborne equipment completes the main/standby switchover within the specified millisecond-level time.
Disclosure of Invention
The invention aims to provide a redundancy backup method based on an onboard equipment board card.
The technical solution for realizing the purpose of the invention is as follows: when the system comprises two equipment board cards, one of the two equipment board cards is randomly selected as a main board, the other equipment board is a standby board, the main board performs system service processing, the standby board is switched to the main board after the main board fails, and the system service processing is continued.
Further, the method includes four stages: system power-on, board card heartbeat pin check, main/standby board election, and operation. Board card fault detection is performed at each stage and a detection result message code is sent. The message codes are:
1: restart immediately after a fault during operation
2: this board is the main board
3: this board is the standby board
4: this board is the main board and the standby board has failed
5: a switchover has occurred
6: this board is faulty
7: the standby board failed during start-up
8: the main board failed during start-up.
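The eight result codes lend themselves to an enumeration. The following is a minimal Python sketch; the class name `MessageCode`, its member names, and the helper `describe` are illustrative labels chosen here and do not appear in the patent.

```python
from enum import IntEnum


class MessageCode(IntEnum):
    """Detection result codes; member names are hypothetical labels."""
    RESTARTED_AFTER_FAULT = 1      # restart immediately after a fault during operation
    THIS_BOARD_IS_MAIN = 2
    THIS_BOARD_IS_STANDBY = 3
    MAIN_WITH_STANDBY_FAULT = 4    # this board is main, the standby board has failed
    SWITCHOVER_OCCURRED = 5
    THIS_BOARD_FAULTY = 6
    STANDBY_FAILED_AT_STARTUP = 7
    MAIN_FAILED_AT_STARTUP = 8


def describe(code: int) -> str:
    """Return a readable label for a numeric message code."""
    return MessageCode(code).name.replace("_", " ").lower()
```

Using integer-valued members keeps the numeric codes from the text intact, so they could be passed to the service software unchanged.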
Furthermore, the system has 4 IPs: IP_1 is the transition IP of equipment board card A, IP_2 is the transition IP of equipment board card B, IP_3 is the external IP of the equipment, and IP_4 is the standby board IP. After power-on, each board card sets IP_1 or IP_2 according to the pulled-up or pulled-down status bit of its I/O port: if the status bit read from the I/O port is high, the IP is set to IP_1; if it is low, the IP is set to IP_2.
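A minimal Python sketch of this selection follows. The four address values are placeholders from the documentation range 192.0.2.0/24, since the text names the roles but gives no addresses; the function names are likewise assumptions.

```python
# Placeholder addresses: the text names the four roles but gives no values.
IP_1 = "192.0.2.1"  # transition IP of equipment board card A
IP_2 = "192.0.2.2"  # transition IP of equipment board card B
IP_3 = "192.0.2.3"  # external IP of the equipment
IP_4 = "192.0.2.4"  # standby board IP


def transition_ip(io_status_bit_high: bool) -> str:
    """Pick the transition IP from the pulled-up or pulled-down I/O status bit."""
    return IP_1 if io_status_bit_high else IP_2


def board_name(io_status_bit_high: bool) -> str:
    """A high status bit corresponds to board card A, a low one to board card B."""
    return "A" if io_status_bit_high else "B"
```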
Furthermore, in the system power-on stage, to determine whether this start-up is a restart caused by a fault of this board during operation, the system checks whether a device holding IP_3 already exists. If so, message code 1 is sent (restart immediately after a fault during operation) and the redundancy backup operation exits; if not, the method proceeds to the board card heartbeat pin check, the main/standby board election, and operation.
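The power-on decision can be sketched in Python as below. The probe `ip3_in_use` is a hypothetical callable supplied by the caller; the text does not say how the presence of IP_3 is detected (a ping or ARP query would be one possibility).

```python
from typing import Callable, Optional


def power_on_stage(ip3_in_use: Callable[[], bool]) -> Optional[int]:
    """Return message code 1 if IP_3 is already held by a device, else None.

    `ip3_in_use` is a hypothetical probe; the detection mechanism is not
    specified in the text.
    """
    if ip3_in_use():
        return 1   # restarted after a fault during operation: exit redundancy backup
    return None    # continue with the heartbeat pin check and the election
```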
Furthermore, in the main/standby board election stage, equipment board card A acts as the UDP (User Datagram Protocol) client and equipment board card B acts as the UDP server;
equipment board card A elects the main board and the standby board through Rand() and sends the election result to equipment board card B: if equipment board card A is elected as the main board, it sends master; if it is elected as the standby board, it sends slave;
equipment board card B waits for the election result sent by equipment board card A. If no election result is received within 10.5 s, board B judges whether it is itself working normally: if yes, it switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); if not, it sends message code 6 (this board is faulty). If the election result is received, board B replies to equipment board card A with an ACK; if the result received is master, board B switches its IP to IP_4 and sends message code 3 (this board is the standby board); if the result received is slave, board B switches its IP to IP_3 and sends message code 2 (this board is the main board);
equipment board card A waits for the ACK replied by equipment board card B. If no ACK is received within 10.5 s, timeout processing is carried out and board A judges whether it is itself working normally: if yes, it switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); if not, it sends message code 6 (this board is faulty). If board A receives the ACK, it acts on the election result: if it is master, it switches its IP to IP_3 and sends message code 2 (this board is the main board); if it is slave, it switches its IP to IP_4 and sends message code 3 (this board is the standby board).
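The election outcomes for both boards can be written as two small decision functions. This is a sketch under stated assumptions: `random.Random.choice` stands in for the Rand() call named in the text, the UDP exchange itself is left out, and the function names and the `(new IP, message code)` return shape are chosen here for illustration.

```python
import random
from typing import Optional, Tuple

ELECTION_TIMEOUT_S = 10.5  # timeout stated in the text for both boards


def elect(rng: random.Random) -> str:
    """Board A's random choice of its own role: 'master' or 'slave'."""
    return rng.choice(["master", "slave"])


def board_a_outcome(result: str, ack_received: bool,
                    healthy: bool) -> Tuple[Optional[str], int]:
    """Return (new IP, or None if unchanged, and message code) for board card A."""
    if not ack_received:                  # no ACK within ELECTION_TIMEOUT_S
        return ("IP_3", 7) if healthy else (None, 6)
    return ("IP_3", 2) if result == "master" else ("IP_4", 3)


def board_b_outcome(received: Optional[str],
                    healthy: bool) -> Tuple[Optional[str], int]:
    """Return (new IP, or None if unchanged, and message code) for board card B."""
    if received is None:                  # nothing received within ELECTION_TIMEOUT_S
        return ("IP_3", 8) if healthy else (None, 6)
    # Board B takes the role opposite to the one board A announced for itself.
    return ("IP_4", 3) if received == "master" else ("IP_3", 2)
```

For example, if board A elects itself master and the ACK arrives, `board_a_outcome("master", True, True)` yields `("IP_3", 2)` while `board_b_outcome("master", True)` yields `("IP_4", 3)`.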
Furthermore, when equipment board card A receives the ACK replied by equipment board card B, it creates a heartbeat thread; likewise, when equipment board card B receives the election result sent by equipment board card A, it also creates a heartbeat thread. The heartbeat signal is a square wave with a period of 200 ms.
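To illustrate the heartbeat waveform, the Python sketch below models the level of a 200 ms square wave over time. The 50% duty cycle, the 10 ms sampling step, and the function names are assumptions; the text gives only the period.

```python
from typing import List

HEARTBEAT_PERIOD_MS = 200  # square-wave period stated in the text


def heartbeat_level(t_ms: int, period_ms: int = HEARTBEAT_PERIOD_MS) -> int:
    """Level (1 or 0) of a square wave at time t_ms.

    A 50% duty cycle is assumed; the text specifies only the period.
    """
    return 1 if (t_ms % period_ms) < period_ms // 2 else 0


def count_edges(samples: List[int]) -> int:
    """Count level changes in a sampled waveform."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


# Sample one second of the waveform every 10 ms.
samples = [heartbeat_level(t) for t in range(0, 1000, 10)]
```

A receiver on the hard-wired pin could treat the absence of edges for longer than some window as a heartbeat timeout; the length of that window is not given in the text.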
Further, in the board card heartbeat pin check stage, if equipment board card A detects that the heartbeat pin is abnormal while equipment board card B detects that it is normal, board A waits 13 s and board B does not wait; board B enters the state of waiting for the election result sent by board A. If board B does not receive the election result within 10.5 s, it judges whether it is itself working normally. If so, board B switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); 2.5 s later, board A finds that IP_3 exists, switches its IP to IP_1, and sends message code 6 (this board is faulty). If not, board B switches its IP to IP_2 and sends message code 6 (this board is faulty); 2.5 s later, board A finds that IP_3 does not exist and judges whether it is itself working normally: if so, it switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); if not, it switches its IP to IP_1 and sends message code 6 (this board is faulty);
if equipment board card A detects that the heartbeat pin is normal while equipment board card B detects that it is abnormal, board B waits 17 s and board A does not wait; board A elects the main board and the standby board for this start-up and sends the election result to board B. If board A does not receive the ACK from board B within 10.5 s, it judges whether it is itself working normally. If so, board A switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); 6.5 s later, board B finds that IP_3 exists, switches its IP to IP_2, and sends message code 6 (this board is faulty). If not, board A switches its IP to IP_1 and sends message code 6 (this board is faulty); 6.5 s later, board B finds that IP_3 does not exist and judges whether it is itself working normally: if so, board B switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); if not, board B switches its IP to IP_2 and sends message code 6 (this board is faulty);
if both equipment board card A and equipment board card B detect that the heartbeat pin is abnormal, board A waits 13 s and board B waits 17 s. After 13 s, board A finds that IP_3 does not exist and judges whether it is itself working normally. If board A works normally, it switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); 4 s later, board B finds that IP_3 exists, switches its IP to IP_2, and sends message code 6 (this board is faulty). If board A does not work normally, it switches its IP to IP_1 and sends message code 6 (this board is faulty); 4 s later, board B finds that IP_3 does not exist and judges whether it is itself working normally: if so, it switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); if not, it switches its IP to IP_2 and sends message code 6 (this board is faulty);
if both equipment board card A and equipment board card B detect that the heartbeat pin is normal, the method proceeds directly to the main/standby board election and operation.
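The 2.5 s, 6.5 s, and 4 s intervals in the four cases follow from the 13 s and 17 s waits and the 10.5 s election timeout. The short Python sketch below reproduces that arithmetic; the function and constant names are chosen here for illustration.

```python
from typing import Tuple

A_WAIT_S = 13.0            # board A's wait when its heartbeat pin reads abnormal
B_WAIT_S = 17.0            # board B's wait when its heartbeat pin reads abnormal
ELECTION_TIMEOUT_S = 10.5  # election-stage timeout


def startup_waits(a_pin_ok: bool, b_pin_ok: bool) -> Tuple[float, float]:
    """Return (wait of A, wait of B) in seconds for the four pin-check cases."""
    return (0.0 if a_pin_ok else A_WAIT_S, 0.0 if b_pin_ok else B_WAIT_S)


def gap_before_second_check(a_pin_ok: bool, b_pin_ok: bool) -> float:
    """Seconds between the first board's decision and the other board's IP_3 check."""
    wait_a, wait_b = startup_waits(a_pin_ok, b_pin_ok)
    if a_pin_ok and b_pin_ok:
        return 0.0                    # both proceed directly to the election
    if not a_pin_ok and not b_pin_ok:
        return wait_b - wait_a        # A decides at 13 s, B checks at 17 s
    # One board times out in the election at 10.5 s while the other still waits.
    return max(wait_a, wait_b) - ELECTION_TIMEOUT_S
```

The staggered waits mean that the two boards never check for IP_3 at the same moment, so the later board always sees the outcome of the earlier one.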
Further, in the operation stage, if the hard-wired heartbeat signal from equipment board card B received by equipment board card A times out, board A judges whether the timeout is caused by a network storm. If it is, board A judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); otherwise, it switches its IP to IP_1 and sends message code 6 (this board is faulty). If the timeout is not caused by a network storm, board A judges whether it is itself working normally: if not, it switches its IP to IP_1 and sends message code 6 (this board is faulty); if it is still working normally, it judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); if not, it switches its IP to IP_3 and sends message code 5 (a switchover has occurred);
if the hard-wired heartbeat signal from equipment board card A received by equipment board card B times out, board B judges whether the timeout is caused by a network storm. If it is, board B judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); otherwise, it switches its IP to IP_2 and sends message code 6 (this board is faulty). If the timeout is not caused by a network storm, board B judges whether it is itself working normally: if not, it switches its IP to IP_2 and sends message code 6 (this board is faulty); if it is still working normally, it judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); if not, it switches its IP to IP_3 and sends message code 5 (a switchover has occurred).
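Because the two boards follow the same decision tree and differ only in their transition IP, the operation-stage handling can be sketched as one Python function. The function name, its parameters, and the `(new IP, message code)` return shape are illustrative assumptions.

```python
from typing import Optional, Tuple


def on_heartbeat_timeout(board: str, is_main: bool, network_storm: bool,
                         healthy: bool) -> Tuple[Optional[str], int]:
    """Return (new IP, or None if unchanged, and message code) after a peer heartbeat timeout.

    `board` is "A" or "B"; a faulty board falls back to its own transition IP.
    """
    transition_ip = "IP_1" if board == "A" else "IP_2"
    if network_storm:
        return (None, 4) if is_main else (transition_ip, 6)
    if not healthy:
        return (transition_ip, 6)
    if is_main:
        return (None, 4)   # remain main and report the standby board fault
    return ("IP_3", 5)     # standby takes over the external IP: switchover
```

For instance, a healthy standby board B that loses the heartbeat without a network storm gets `("IP_3", 5)`, which corresponds to the switchover described above.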
Compared with the prior art, the invention has the following notable advantages: 1) the main board and the standby board are selected at random at each start-up, which prolongs the service life of the main board; 2) the heartbeat uses a hard-wired connection, so the switchover can be completed within the specified millisecond-level time; 3) the redundant backup switches automatically under conditions such as single-board start-up, a network storm, an abnormal heartbeat pin, or a main board failure, which ensures that at least one board works normally as the main board and improves the robustness of the system.
Drawings
Fig. 1 is a system equipment composition diagram of the onboard equipment board card of the present invention.
Fig. 2 is a flow chart of the redundant backup of the present invention.
Detailed Description
The invention is further illustrated by the following examples in conjunction with the accompanying drawings.
A redundant backup method for a board card of airborne equipment needs the following equipment:
As shown in Fig. 1, equipment board card A and equipment board card B communicate externally mainly over Ethernet. In hardware, the two board cards are distinguished mainly by the high or low level sampled at an I/O port: the pulled-up or pulled-down status bit of the I/O port determines whether a board is equipment board card A or equipment board card B. The heartbeat between equipment board card A and equipment board card B is carried over a hard-wired connection.
The redundant backup software sends the following message codes to the service software through the message queue.
a. 1: restart immediately after a fault during operation
b. 2: this board is the main board
c. 3: this board is the standby board
d. 4: this board is the main board and the standby board has failed
e. 5: a switchover has occurred
f. 6: this board is faulty
g. 7: the standby board failed during start-up
h. 8: the main board failed during start-up
With reference to fig. 2, a method for redundancy backup of a board card of an airborne device includes the following steps:
S1. The system has 4 IPs: IP_3 is the external IP of the equipment, IP_1 is the transition IP of equipment board card A, IP_2 is the transition IP of equipment board card B, and IP_4 is the standby board IP. After the system is powered on, each of the two board cards sets its transition IP according to the pulled-up or pulled-down status bit of its I/O port. If the status bit read from the I/O port is high, the IP is set to IP_1 and the board is the main board in hardware terms; if the status bit is low, the board is the standby board in hardware terms and the IP is set to IP_2.
S2. The system checks whether a device holding IP_3 exists in the system. If so, message code 1 is sent (restart immediately after a fault during operation) and the redundancy backup software exits; if not, operation continues. The reason is that once equipment board card A and equipment board card B are operating normally, if one of them restarts abnormally, the IP of the other board is switched to IP_3, so the restarted board should enter failure mode.
S3. The system checks whether the heartbeat pin is normal. There are four cases:
a. Equipment board card A detects that the heartbeat pin is abnormal and equipment board card B detects that it is normal. Board A waits 13 s and board B does not wait; board B enters the state of waiting for the election result sent by board A. If board B does not receive the election result within 10.5 s, it judges whether it is itself working normally. If so, board B switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); 2.5 s later, board A finds that IP_3 exists, switches its IP to IP_1, and sends message code 6 (this board is faulty). If not, board B switches its IP to IP_2 and sends message code 6 (this board is faulty); 2.5 s later, board A finds that IP_3 does not exist and judges whether it is itself working normally: if so, it switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); if not, it switches its IP to IP_1 and sends message code 6 (this board is faulty).
b. Equipment board card A detects that the heartbeat pin is normal and equipment board card B detects that it is abnormal. Board B waits 17 s and board A does not wait; board A elects which board is the main board and which is the standby board for this start-up and sends the election result to board B. If board A does not receive the ACK from board B within 10.5 s, it judges whether it is itself working normally. If so, board A switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); 6.5 s later, board B finds that IP_3 exists, switches its IP to IP_2, and sends message code 6 (this board is faulty). If not, board A switches its IP to IP_1 and sends message code 6 (this board is faulty); 6.5 s later, board B finds that IP_3 does not exist and judges whether it is itself working normally: if so, board B switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); if not, board B switches its IP to IP_2 and sends message code 6 (this board is faulty).
c. Both equipment board card A and equipment board card B detect that the heartbeat pin is abnormal. Board A waits 13 s and board B waits 17 s. After 13 s, board A finds that IP_3 does not exist and judges whether it is itself working normally. If board A works normally, it switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); 4 s later, board B finds that IP_3 exists, switches its IP to IP_2, and sends message code 6 (this board is faulty). If board A does not work normally, it switches its IP to IP_1 and sends message code 6 (this board is faulty); 4 s later, board B finds that IP_3 does not exist and judges whether it is itself working normally: if so, it switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); if not, it switches its IP to IP_2 and sends message code 6 (this board is faulty).
d. Both equipment board card A and equipment board card B detect that the heartbeat pin is normal. Execution continues.
S4. Equipment board card A uses Rand() to choose which board serves as the main board and which as the standby board for this start-up. During the election, equipment board card A acts as the UDP (User Datagram Protocol) client and equipment board card B acts as the UDP server. Board A sends the election result to board B: if board A is elected as the main board for this start-up, it sends master; if it is elected as the standby board, it sends slave. In the election stage, board card faults fall into the following 2 cases:
a. Equipment board card B waits for equipment board card A to send the election result. If no election result is received within 10.5 s, board B judges whether it is itself alive: if it is alive, it switches its IP to IP_3 and sends message code 8 (the main board failed during start-up); if not, it sends message code 6 (this board is faulty). If the election result is received, board B replies to board A with an ACK. If the result received is master, board B switches its IP to IP_4 and sends message code 3 (this board is the standby board); if the result received is slave, board B switches its IP to IP_3 and sends message code 2 (this board is the main board).
b. Equipment board card A waits for the ACK replied by equipment board card B. If no ACK is received within 10.5 s, timeout processing is carried out and board A judges whether it is itself alive: if it is alive, it switches its IP to IP_3 and sends message code 7 (the standby board failed during start-up); if not, it sends message code 6 (this board is faulty). If board A receives the ACK, it acts on the election result: if the result is master, it switches its IP to IP_3 and sends message code 2 (this board is the main board); if the result is slave, it switches its IP to IP_4 and sends message code 3 (this board is the standby board).
S5. When equipment board card A receives the ACK replied by equipment board card B, it creates a heartbeat thread; likewise, when equipment board card B receives the election result sent by equipment board card A, it also creates a heartbeat thread. The heartbeat signal is a square wave with a period of 200 ms.
During system operation, a network storm, an abnormal exit of the service software, a system crash, or a system restart will each stop the heartbeat signal. A network storm makes the system run sluggishly. The heartbeat driver software has a timer: in normal operation, the overflow value is reset each time a heartbeat is sent, and when the timer expires the driver checks whether the overflow value has been reset. If it has not, the network storm flag is set; when the redundancy backup software detects that the flag is set, it concludes that a network storm fault has occurred. The redundancy backup software also monitors the service software: when the service software exits abnormally, the redundancy backup software on the main board stops the heartbeat process. A system crash or a system restart likewise stops the heartbeat software process on the main board.
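The timer-based storm detection described above resembles a software watchdog. The Python sketch below shows one way it could look; the class and method names are hypothetical, and clearing the flag once heartbeats resume is an assumption the text does not address.

```python
class StormWatchdog:
    """Flags a suspected network storm when no heartbeat was sent in a timer period."""

    def __init__(self) -> None:
        self.reset_seen = False   # set whenever a heartbeat is sent
        self.storm_flag = False   # read by the redundancy backup logic

    def on_heartbeat_sent(self) -> None:
        """Called by the heartbeat sender; marks the overflow value as reset."""
        self.reset_seen = True

    def on_timer_expired(self) -> None:
        """Called when the timer fires; checks for a reset since the last expiry."""
        # Assumption: the flag clears again if heartbeats resume.
        self.storm_flag = not self.reset_seen
        self.reset_seen = False   # arm the check for the next period
```

In use, the heartbeat thread would call `on_heartbeat_sent()` after each heartbeat and a periodic timer would call `on_timer_expired()`; the timer period is not given in the text.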
a. If the hard-wired heartbeat signal from equipment board card B received by equipment board card A times out, board A judges whether the timeout is caused by a network storm. If it is, board A judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); otherwise, it switches its IP to IP_1 and sends message code 6 (this board is faulty). If the timeout is not caused by a network storm, board A judges whether it is itself alive: if not, it switches its IP to IP_1 and sends message code 6 (this board is faulty); if it is still alive, it judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); if not, it switches its IP to IP_3 and sends message code 5 (a switchover has occurred).
b. If the hard-wired heartbeat signal from equipment board card A received by equipment board card B times out, board B judges whether the timeout is caused by a network storm. If it is, board B judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); otherwise, it switches its IP to IP_2 and sends message code 6 (this board is faulty). If the timeout is not caused by a network storm, board B judges whether it is itself alive: if not, it switches its IP to IP_2 and sends message code 6 (this board is faulty); if it is still alive, it judges whether it is the main board: if so, it sends message code 4 (this board is the main board and the standby board has failed); if not, it switches its IP to IP_3 and sends message code 5 (a switchover has occurred).

Claims (2)

1. A redundancy backup method based on onboard equipment board cards is characterized in that when a system comprises two equipment board cards, one of the two equipment board cards is randomly selected as a main board, the other equipment board is selected as a spare board, the main board carries out system service processing, the spare board is switched to the main board after the main board fails, and the system service processing is continued;
the method comprises four stages, namely system power-on, board card heartbeat pin check, main/standby board election, and operation; board card fault detection is performed at each stage and a detection result message code is sent, the message codes including:
1: restart immediately after a fault during operation
2: this board is the main board
3: this board is the standby board
4: this board is the main board and the standby board has failed
5: a switchover has occurred
6: this board is faulty
7: the standby board failed during start-up
8: the main board failed during start-up;
the system has 4 IPs: IP_1 is the transition IP of equipment board card A, IP_2 is the transition IP of equipment board card B, IP_3 is the external IP of the system, and IP_4 is the standby board IP, wherein IP_1 and IP_2 are set after power-on by each of the two board cards according to the pulled-up or pulled-down status bit of its I/O port: if the status bit read from the I/O port is high, the IP is set to IP_1; if the status bit read from the I/O port is low, the IP is set to IP_2;
in the system power-on stage, checking whether an equipment board card holding IP_3 exists in the system; if so, sending message code 1 and exiting the redundancy backup operation; if not, proceeding to the board card heartbeat pin check, the main/standby board election, and operation;
in the main and standby board election stage, the device board card A is used as a client of the UDP, and the device board card B is used as a server of the UDP;
the equipment board card A elects the main board and the auxiliary board through Rand (), the electing result is sent to the equipment board card B, and if the equipment board card A is elected as the main board, a master is sent; if the equipment board card A is elected as the spare board, then the slave is sent;
the equipment board card B waits for an election result sent by the equipment board card A, if the election result sent by the equipment board card A is not received within 10.5s, whether the equipment board card B normally works is judged, if yes, the IP of the equipment board card B is switched to IP _3, and a message code 8 is sent; if not, sending a message code 6; if the election result sent by the equipment board card A is received, replying the equipment board card A to answer ACK, if the election result is received as the master, switching the IP to IP _4, and sending a message code 3; if the slave is received, switching the IP to IP _3, and sending a message code 2;
equipment board card A waits for the ACK replied by equipment board card B; if the ACK is not received within 10.5 s, timeout processing is performed and equipment board card A judges whether it is working normally: if so, it switches its IP to IP_3 and sends message code 7; if not, it sends message code 6; if equipment board card A receives the ACK replied by equipment board card B, it proceeds according to the election result: if equipment board card A is the master, it switches its IP to IP_3 and sends message code 2; if it is the slave, it switches its IP to IP_4 and sends message code 3;
in the data processing board card heartbeat pin checking stage, if equipment board card A detects that the heartbeat pin is abnormal and equipment board card B detects that the heartbeat pin is normal, equipment board card A waits 13 s, while equipment board card B does not wait and goes on to wait for the election result from equipment board card A; if equipment board card B does not receive the election result from equipment board card A within 10.5 s, equipment board card B judges whether it is working normally; if so, equipment board card B switches its IP to IP_3 and sends message code 8, and 2.5 s later equipment board card A finds that IP_3 exists, switches its IP to IP_1 and sends message code 6; if not, equipment board card B switches its IP to IP_2 and sends message code 6, and 2.5 s later equipment board card A finds that IP_3 does not exist and judges whether it is working normally: if so, it switches its IP to IP_3 and sends message code 7; if not, it switches its IP to IP_1 and sends message code 6;
if equipment board card A detects that the heartbeat pin is normal and equipment board card B detects that the heartbeat pin is abnormal, equipment board card B waits 17 s, while equipment board card A does not wait, starts the main/standby board election and sends the election result to equipment board card B; if equipment board card A does not receive the ACK from equipment board card B within 10.5 s, equipment board card A judges whether it is working normally; if so, equipment board card A switches its IP to IP_3 and sends message code 7, and 6.5 s later equipment board card B finds that IP_3 exists, switches its IP to IP_2 and sends message code 6; if not, equipment board card A switches its IP to IP_1 and sends message code 6, and 6.5 s later equipment board card B finds that IP_3 does not exist and judges whether it is working normally: if so, it switches its IP to IP_3 and sends message code 8; if not, it switches its IP to IP_2 and sends message code 6;
if both equipment board card A and equipment board card B detect that the heartbeat pin is abnormal, equipment board card A waits 13 s and equipment board card B waits 17 s; after 13 s, equipment board card A finds that IP_3 does not exist and judges whether it is working normally; if equipment board card A is working normally, it switches its IP to IP_3 and sends message code 7, and 4 s later equipment board card B finds that IP_3 exists, switches its own IP to IP_2 and sends message code 6; if equipment board card A is not working normally, it switches its IP to IP_1 and sends message code 6, and 4 s later equipment board card B finds that IP_3 does not exist and judges whether it is working normally: if so, it switches its IP to IP_3 and sends message code 8; if not, it switches its IP to IP_2 and sends message code 6;
if both equipment board card A and equipment board card B detect that the heartbeat pin is normal, main/standby board election and operation proceed directly;
in the operation stage, if equipment board card A times out waiting for the hard-wired heartbeat signal of equipment board card B, equipment board card A judges whether the timeout is caused by a network storm; if it is caused by a network storm, equipment board card A judges whether it is the main board: if so, it sends message code 4; otherwise, it switches its IP to IP_1 and sends message code 6; if the timeout is not caused by a network storm, equipment board card A judges whether it is working normally: if not, it switches its IP to IP_1 and sends message code 6; if equipment board card A is still working normally, it judges whether it is the main board: if so, it sends message code 4; if not, it switches its IP to IP_3 and sends message code 5;
if equipment board card B times out waiting for the hard-wired heartbeat signal of equipment board card A, equipment board card B judges whether the timeout is caused by a network storm; if it is caused by a network storm, equipment board card B judges whether it is the main board: if so, it sends message code 4; otherwise, it switches its IP to IP_2 and sends message code 6; if the timeout is not caused by a network storm, equipment board card B judges whether it is working normally: if not, it switches its IP to IP_2 and sends message code 6; if equipment board card B is still working normally, it judges whether it is the main board: if so, it sends message code 4; if not, it switches its IP to IP_3 and sends message code 5.
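The decision rules recited in claim 1 can be illustrated with a short sketch. This is not the patented implementation: the function names, the dictionaries, and the 50/50 split of the random election are hypothetical, and the run-time branch assumes that the second case of the operation stage means "the timeout is not caused by a network storm". Only the IP names, the message codes, and the 10.5 s, 13 s and 17 s timings come from the claim.

```python
import random

# Address names follow the claim; every identifier below is hypothetical.
TRANSITION_IP = {"A": "IP_1", "B": "IP_2"}
EXTERNAL_IP = "IP_3"   # held by the main board
STANDBY_IP = "IP_4"    # held by the standby board
HANDSHAKE_TIMEOUT_S = 10.5
PIN_WAIT_S = {"A": 13.0, "B": 17.0}  # waits after an abnormal heartbeat pin


def elect(rng=random):
    """Board A's random election; the 50/50 split is an assumption."""
    return "master" if rng.random() < 0.5 else "slave"


def after_handshake(board, a_result):
    """(new IP, message code) once the result and the ACK were exchanged."""
    a_is_main = a_result == "master"
    is_main = a_is_main if board == "A" else not a_is_main
    return (EXTERNAL_IP, 2) if is_main else (STANDBY_IP, 3)


def on_handshake_timeout(board, healthy):
    """(new IP, message code) when nothing arrives within 10.5 s."""
    if healthy:
        return (EXTERNAL_IP, 7 if board == "A" else 8)
    # An unhealthy board keeps (or returns to) its transition IP.
    return (TRANSITION_IP[board], 6)


def on_heartbeat_timeout(board, is_main, storm, healthy):
    """Run-time decision; an IP of None means the address is unchanged."""
    if storm:
        return (None, 4) if is_main else (TRANSITION_IP[board], 6)
    if not healthy:
        return (TRANSITION_IP[board], 6)
    return (None, 4) if is_main else (EXTERNAL_IP, 5)


# The staggered waits line up with the handshake timeout:
# 13 - 10.5 = 2.5 s, 17 - 10.5 = 6.5 s, 17 - 13 = 4 s.
gap_a = PIN_WAIT_S["A"] - HANDSHAKE_TIMEOUT_S
gap_b = PIN_WAIT_S["B"] - HANDSHAKE_TIMEOUT_S
gap_ab = PIN_WAIT_S["B"] - PIN_WAIT_S["A"]
```

The three gaps computed at the end match the 2.5 s, 6.5 s and 4 s delays after which the waiting board checks whether IP_3 exists, which is why the board that did not wait always has time to finish its 10.5 s timeout first.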
2. The redundancy backup method based on an onboard equipment board card according to claim 1, wherein equipment board card A creates a heartbeat thread after receiving the ACK replied by equipment board card B; likewise, equipment board card B creates a heartbeat thread after receiving the election result sent by equipment board card A; the heartbeat signal is a square wave signal with a period of 200 ms.
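A 200 ms square-wave heartbeat and its loss can be modelled as below. This is a minimal sketch under stated assumptions: the 50% duty cycle, the sampling interval, the timeout of three periods, and all function names are hypothetical; only the 200 ms period comes from claim 2.

```python
PERIOD_MS = 200  # square-wave period stated in claim 2


def level(t_ms, period_ms=PERIOD_MS):
    """Heartbeat line level at time t_ms, assuming a 50% duty cycle."""
    return 1 if (t_ms % period_ms) < period_ms // 2 else 0


def last_edge_ms(samples):
    """Time of the most recent level change in (time, level) samples."""
    last, prev = None, None
    for t_ms, lv in samples:
        if prev is not None and lv != prev:
            last = t_ms
        prev = lv
    return last


def timed_out(samples, now_ms, timeout_ms=3 * PERIOD_MS):
    """True if no edge was seen within timeout_ms (threshold is assumed)."""
    last = last_edge_ms(samples)
    return last is None or now_ms - last > timeout_ms


# A healthy peer toggles every 100 ms; a failed peer leaves the line stuck.
healthy = [(t, level(t)) for t in range(0, 1000, 50)]
stuck = healthy + [(t, 0) for t in range(1000, 2000, 50)]
```

Watching for edges rather than levels means a line stuck either high or low is detected as a lost heartbeat.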
CN201811528706.XA 2018-12-13 2018-12-13 Redundancy backup method based on onboard equipment board card Active CN109525434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811528706.XA CN109525434B (en) 2018-12-13 2018-12-13 Redundancy backup method based on onboard equipment board card

Publications (2)

Publication Number Publication Date
CN109525434A (en) 2019-03-26
CN109525434B (en) 2022-02-22

Family

ID=65796390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811528706.XA Active CN109525434B (en) 2018-12-13 2018-12-13 Redundancy backup method based on onboard equipment board card

Country Status (1)

Country Link
CN (1) CN109525434B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112445503B (en) * 2019-08-28 2025-07-22 中兴通讯股份有限公司 Upgrade method, communication device and computer readable storage medium
CN112953803B (en) * 2021-02-10 2022-07-08 西南电子技术研究所(中国电子科技集团公司第十研究所) Airborne redundant network data transmission method
CN112893192B (en) * 2021-03-30 2023-08-04 湖州霍里思特智能科技有限公司 Board card, detection mechanism and mineral product sorting machine
CN113131993B (en) * 2021-04-16 2022-06-17 中电科航空电子有限公司 Airborne satellite communication system and satellite link switching method thereof
CN113568707B (en) * 2021-07-29 2024-06-25 中国船舶重工集团公司第七一九研究所 Computer control method and system for ocean platform based on container technology

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101115033A (en) * 2007-09-04 2008-01-30 武汉市中光通信公司 Active-standby switching system and method for session initiation protocol gateway
CN101483540A (en) * 2008-01-11 2009-07-15 上海博达数据通信有限公司 Master-slave switching method in high class data communication equipment
CN106656589A (en) * 2016-12-13 2017-05-10 武汉船舶通信研究所 Server dual hot backup system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于CPCI的双CPU冗余备份系统设计;王江;《中国优秀硕士学位论文全文数据库信息科技辑》;20150415;第22-23、46-48页 *

Also Published As

Publication number Publication date
CN109525434A (en) 2019-03-26

Similar Documents

Publication Publication Date Title
CN109525434B (en) Redundancy backup method based on onboard equipment board card
US6859889B2 (en) Backup system and method for distributed systems
US20230362051A1 (en) Control Plane Device Switching Method and Apparatus, and Forwarding-Control Separation System
CN106161109B (en) Network abnormity self-recovery method
CN101324855A (en) Method, system, component and multi-CPU equipment for detecting auxiliary CPU operating status
CN104199869B (en) A kind of business batch processing method, service server and system
CN114218004B (en) Fault processing method and system of Kubernetes cluster physical node based on BMC
CN118245269B (en) PCI equipment fault processing method and device and fault processing system
CN104079454A (en) Equipment exception detecting method and equipment
US7953016B2 (en) Method and system for telecommunication apparatus fast fault notification
CN102026042A (en) Keep-alive and self-healing method and device for advanced telecom computing architecture control surface
CN105636096A (en) Self-recovery method and device after base station is interrupted
JP6421516B2 (en) Server device, redundant server system, information takeover program, and information takeover method
CN112311621B (en) Communication detection method and device
US11954509B2 (en) Service continuation system and service continuation method between active and standby virtual servers
CN116302851B (en) FPGA logic abnormality monitoring and recovering method, device, equipment and medium
CN111221683A (en) Double-flash hot backup method, system, terminal and storage medium for data center switch
KR102262942B1 (en) Gateway self recovery method by the wireless bridge of wireless network system system
CN105306256B (en) A kind of two-node cluster hot backup implementation method based on VxWorks equipment
CN111858183B (en) Restarting method and device of electronic equipment
US8111625B2 (en) Method for detecting a message interface fault in a communication device
JPH10133963A (en) Computer failure detection and recovery method
CN119676058B (en) Link interrupt error reporting method and device, computer equipment and storage medium
CN116669084B (en) Fault restoration method, device, equipment and storage medium based on cellular network
JP2001184138A (en) Hardware system and its fault elimination method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant