
CN120560954A - Partition monitoring method, device, equipment and medium for double partition deployment - Google Patents

Partition monitoring method, device, equipment and medium for double partition deployment

Info

Publication number
CN120560954A
Authority
CN
China
Prior art keywords
partition
primary
kernel
file system
preset time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202511064433.8A
Other languages
Chinese (zh)
Inventor
陈旭华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202511064433.8A
Publication of CN120560954A
Legal status: Pending

Landscapes

  • Stored Programmes (AREA)

Abstract

The application discloses a partition monitoring method, apparatus, device, and medium for dual partition deployment, relating to the field of computer technology. The method determines the main partition of the management controller, that is, the partition in the active state after the system is powered on, and loads and starts the main partition kernel and the main startup file system corresponding to the main partition. It then judges whether the main partition kernel and the main startup file system start successfully within a preset duration. If they do not, the system switches to the standby partition and loads and starts the standby partition kernel and the standby startup file system to complete the dual partition switch. If the standby partition kernel and the standby startup file system also fail to start within the preset duration, a system update compressed package is acquired and its signature is verified; if the verification passes, the package is installed to the main partition and the standby partition to complete remote recovery. This avoids loss of control of the device caused by an abnormal management controller, enables rapid fault localization, and improves the remote maintainability, security, and reliability of the management controller.

Description

Partition monitoring method, device, equipment and medium for double partition deployment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a partition monitoring method, apparatus, device, and medium for dual partition deployment.
Background
In modern servers and embedded devices, the BMC (Baseboard Management Controller) serves as the core management unit of the underlying hardware platform and performs key functions such as system initialization, remote operation and maintenance, power management, and fault monitoring. At present, BMC firmware deployment architectures use a single-partition mechanism: once an abnormality occurs during an upgrade, such as a power failure, image corruption, or file system corruption, the BMC may fail to start or run, which affects the operational stability of the whole machine. As a firmware recovery mechanism, some BMC systems adopt an A/B partition image strategy and cooperate with the bootloader to provide simple dual-image boot and rollback: multiple firmware images are stored in flash memory, and an external signal, such as a specific instruction, triggers booting of the standby firmware image to recover the BMC from a fault. However, most existing schemes judge system availability only by whether the kernel loads successfully. They cannot detect the integrity of the RootFS (root file system) or the state of system services, so the health detection mechanism is one-dimensional. They also cannot provide remote access to a recovery system when the A/B partitions are damaged, and must rely on traditional means such as a serial port or physical access.
Therefore, how to realize automatic dual partition switching of the management controller, ensure continuous operation of the system, avoid loss of control of the device caused by an abnormal management controller, realize rapid fault localization, and improve the remote maintainability, security, and reliability of the management controller is a problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiments of the invention aim to provide a partition monitoring method, apparatus, device, and medium for dual partition deployment that can realize automatic dual partition switching of a management controller, ensure continuous operation of the system, avoid loss of control of the device caused by an abnormal management controller, realize rapid fault localization, and improve the remote maintainability, security, and reliability of the management controller. The specific scheme is as follows:
In a first aspect, the present application discloses a partition monitoring method for dual partition deployment, including:
determining the main partition of the management controller that is in the active state after the system is powered on, and loading and starting the main partition kernel and the main startup file system corresponding to the main partition;
judging whether the main partition kernel and the main startup file system are started successfully within a preset duration;
if the main partition kernel and the main startup file system are not started successfully within the preset duration, switching to the standby partition, and loading and starting the standby partition kernel and the standby startup file system corresponding to the standby partition to complete the dual partition switch;
if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, acquiring a system update compressed package, performing signature verification on the system update compressed package, and, if the signature verification passes, installing the system update compressed package to the main partition and the standby partition to complete remote recovery.
In a second aspect, the present application discloses a partition monitoring device for dual partition deployment, including:
a loading and starting module, configured to determine the main partition of the management controller that is in the active state after the system is powered on, and to load and start the main partition kernel and the main startup file system corresponding to the main partition;
a judging module, configured to judge whether the main partition kernel and the main startup file system are started successfully within a preset duration;
a dual partition switching module, configured to switch to the standby partition if the main partition kernel and the main startup file system are not started successfully within the preset duration, and to load and start the standby partition kernel and the standby startup file system corresponding to the standby partition to complete the dual partition switch;
a remote recovery module, configured to acquire a system update compressed package if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, to perform signature verification on the system update compressed package, and, if the signature verification passes, to install the system update compressed package to the main partition and the standby partition to complete remote recovery.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
a processor for implementing, when executing the computer program, the steps of the partition monitoring method for dual partition deployment described above.
In a fourth aspect, the present application discloses a computer readable storage medium, in which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the partition monitoring method for dual partition deployment described above.
The application provides a partition monitoring method for dual partition deployment, which includes: determining the main partition of the management controller that is in the active state after the system is powered on, and loading and starting the main partition kernel and the main startup file system corresponding to the main partition; judging whether the main partition kernel and the main startup file system are started successfully within a preset duration; if they are not, switching to the standby partition, and loading and starting the standby partition kernel and the standby startup file system corresponding to the standby partition to complete the dual partition switch; and, if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, acquiring a system update compressed package, performing signature verification on it, and, if the signature verification passes, installing it to the main partition and the standby partition to complete remote recovery.
In other words, the method first determines the main partition of the management controller that is in the active state after the system is powered on, and loads and starts the corresponding main partition kernel and main startup file system. If these are not started successfully within the preset duration, the system switches to the standby partition. Through the dual partition structure, the system switches automatically to the standby system in cases such as firmware corruption, update failure, or abnormal power loss, so that the management controller keeps running stably and the device does not go out of control because of an abnormal management controller. If the standby partition kernel and the standby startup file system are also not started successfully within the preset duration, signature verification is performed on the acquired system update compressed package, and if the verification passes, the package is installed to the main partition and the standby partition.
Drawings
To describe the embodiments of the present application more clearly, the drawings required by the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a partition monitoring method for dual partition deployment disclosed by the application;
FIG. 2 is a flow chart of a dual partition switch according to the present application;
FIG. 3 is a flowchart of a remote recovery method according to the present application;
Fig. 4 is a schematic structural diagram of a partition monitoring device for dual partition deployment disclosed by the application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort fall within the scope of the present application.
In modern servers and embedded devices, the BMC serves as the core management unit of the underlying hardware platform and performs key functions such as system initialization, remote operation and maintenance, power management, and fault monitoring. At present, BMC firmware deployment architectures use a single-partition mechanism: once an abnormality occurs during an upgrade, such as a power failure, image corruption, or file system corruption, the BMC may fail to start or run, which affects the operational stability of the whole machine. As a firmware recovery mechanism, some BMC systems adopt an A/B partition image strategy and cooperate with the bootloader to provide simple dual-image boot and rollback: multiple firmware images are stored in Flash, and an external signal, such as a specific instruction, triggers booting of the standby firmware image to recover the BMC from a fault. However, most existing schemes judge system availability only by whether the kernel loads successfully. They cannot detect RootFS integrity or the state of system services, so the health detection mechanism is one-dimensional. They also cannot provide remote access to a recovery system when the A/B partitions are damaged, and must rely on traditional means such as a serial port or physical access. Therefore, how to realize automatic dual partition switching of the management controller, ensure continuous operation of the system, avoid loss of control of the device caused by an abnormal management controller, realize rapid fault localization, and improve the remote maintainability, security, and reliability of the management controller is a problem to be solved by those skilled in the art.
Referring to fig. 1, the embodiment of the invention discloses a partition monitoring method for dual partition deployment, which specifically includes:
Step S11: determine the main partition of the management controller that is in the active state after the system is powered on, and load and start the main partition kernel and the main startup file system corresponding to the main partition.
In this embodiment, after the system is powered on, a preset loader reads and parses each partition of the management controller. The partition in which the system upgrade controller program is in the active state is taken as the main partition, the main partition kernel and the main startup file system corresponding to the main partition are loaded and started, and the state of the timer is set to started.
The server system architecture of the application is as follows. Two 64MB NOR Flash chips (NOR Flash is a type of non-volatile memory) are configured in the server system, namely Flash A and Flash B. A complete BMC (Baseboard Management Controller) boot system is deployed independently on each Flash. The system comprises a main partition and a standby partition whose structures are completely consistent; each includes sub-modules such as a kernel, a root file system, a boot configuration file, and a health flag. The boot priority is determined by the Bootloader reading the status flag. The structure of the main partition is shown in Table 1:
TABLE 1 Main partition Structure Table
The structure of the standby partition is similar to that of the main partition, but its image is independent; the same layout is used for image redundancy or peer-to-peer update synchronization.
In this embodiment, after the system is powered on, U-Boot (the preset loader) reads and parses each partition of the management controller, the partition in which the system upgrade controller program is in the active state is taken as the main partition, the main partition kernel and the main startup file system corresponding to the main partition are loaded and started, and the state of the timer is set to started. That is, U-Boot reads the status of RAUC (Robust Auto-Update Controller, the system upgrade controller), parses the current active state, and determines which partition is active. If the active partition is A, A is taken as the main partition and bootname=A; if the active partition is B, B is taken as the main partition and bootname=B. The corresponding kernel-a/b (the kernel of partition A/B) and rofs-a/b (the startup file system of partition A/B) are then selected and started, and the watchdog is set to started.
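The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the mapping from the active slot to `bootname`, `kernel-a/b`, and `rofs-a/b`; the real logic would live in the U-Boot environment, and the `BootTarget` type and `select_boot_target` function are hypothetical names that do not appear in the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootTarget:
    bootname: str           # "A" or "B", as in the description
    kernel: str             # kernel-a or kernel-b
    rootfs: str             # rofs-a or rofs-b
    watchdog_started: bool  # timer state after selection


def select_boot_target(active_slot: str) -> BootTarget:
    """Map the active slot read from the status partition to boot artifacts."""
    slot = active_slot.strip().upper()
    if slot not in ("A", "B"):
        raise ValueError(f"unknown active slot: {active_slot!r}")
    suffix = slot.lower()
    return BootTarget(
        bootname=slot,
        kernel=f"kernel-{suffix}",
        rootfs=f"rofs-{suffix}",
        watchdog_started=True,  # the watchdog is set to started at selection time
    )
```

Rejecting any value other than A or B is a design assumption of this sketch; the application does not say how an unreadable status partition is handled.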
Step S12: judge whether the main partition kernel and the main startup file system are started successfully within a preset duration.
In this embodiment, a system initialization tool or a custom script calls an interface of the system upgrade controller to write a program tag into the main partition on the basis of the main partition kernel and the main startup file system. After the program tag has been written, it is judged whether the main partition kernel and the main startup file system were started successfully within the preset duration.
Specifically, during startup, systemd (the system initialization tool) or a custom script calls the internal interface of RAUC to set boot_ok (the program tag), that is, boot_ok is written into the main partition.
Judging whether the main partition kernel and the main startup file system were started successfully within the preset duration after the program tag has been written proceeds as follows. The current state of the timer is determined once the tag has been written. If the timer is in the started state, the main partition kernel and the main startup file system are judged not to have started successfully within the preset duration. If the timer is in the closed state, the write time of the program tag is obtained from the timer and compared with the preset duration: if the write time is not greater than the preset duration, the start is judged successful; if it is greater, the start is judged unsuccessful.
That is, if the watchdog is still in the started state after systemd has started, the boot is considered to have failed. If the watchdog is in the closed state, the write time of the program tag is determined, and if that write time is greater than the preset duration, for example greater than 60 seconds, the partition boot is considered to have failed.
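The decision rule above reduces to a small predicate. The following Python sketch uses the 60-second example from the description; the function name `boot_succeeded` and its parameters are hypothetical, and treating a missing write time as a failure is an assumption of this sketch that the application does not state.

```python
from typing import Optional

PRESET_SECONDS = 60.0  # example preset duration taken from the description


def boot_succeeded(watchdog_running: bool,
                   tag_write_seconds: Optional[float],
                   preset_seconds: float = PRESET_SECONDS) -> bool:
    """Judge one boot attempt from the timer state and the boot_ok write time."""
    if watchdog_running:
        # The timer is still in the started state: the boot is judged failed.
        return False
    if tag_write_seconds is None:
        # Assumption: no recorded write time is treated as a failure.
        return False
    # Success only if the tag was written within the preset duration.
    return tag_write_seconds <= preset_seconds
```

A write time exactly equal to the preset duration counts as success here, matching the wording "not greater than the preset duration".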
Step S13: if the main partition kernel and the main startup file system are not started successfully within the preset duration, switch to the standby partition, and load and start the standby partition kernel and the standby startup file system corresponding to the standby partition to complete the dual partition switch.
In this embodiment, if the main partition kernel and the main startup file system are not started successfully within the preset duration, the main partition is marked as having failed to start, the system switches to the standby partition and restarts, the standby partition kernel and the standby startup file system corresponding to the standby partition are loaded and started after the restart, and it is judged whether they are started successfully within the preset duration.
Specifically, if the main boot fails, the Bootloader automatically switches to the standby partition, switches bootname (the program name), and restarts the system. It then loads and starts the standby partition kernel and the standby startup file system corresponding to the standby partition, and judges whether they are started successfully within the preset duration.
In this embodiment, the specific flow of the dual partition switch is shown in fig. 2. After the system is powered on, U-Boot in the Bootloader reads the RAUC status partition, parses the current active state, and determines whether bootname=A or B (that is, whether the program name of the current main partition is A or B). It loads and starts the corresponding main partition kernel and main startup file system, and boot_ok is written into the main partition. It is then judged whether the write succeeded within the preset duration. If it did, the main partition has started successfully. If it did not, the main partition has failed to start: the main partition is marked, the system switches to the standby partition and restarts, boot_ok is written again, and it is judged whether this write succeeded within the preset duration. If it did, the standby partition has started successfully; if it did not, the standby partition has failed to start, and the system enters the recovery partition and executes the subsequent remote recovery flow. By providing a complete boot system and a dual RootFS (startup file system) structure in the Flash chips, combined with the RAUC state management mechanism, the application can switch automatically to the standby system in cases such as firmware corruption, update failure, or abnormal power loss, so that the BMC keeps running stably and the device does not go out of control because of abnormal firmware. Compared with traditional schemes, the disaster recovery capability is stronger.
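The flow of fig. 2 can be read as a three-outcome state machine: run on the main partition, run on the standby partition, or fall through to recovery. The Python sketch below models only that control flow; `run_boot_sequence`, `other_slot`, and the `boots_ok` callback (which stands in for "boot_ok was written within the preset duration") are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple


def other_slot(slot: str) -> str:
    """Return the peer of an A/B slot."""
    return "B" if slot == "A" else "A"


def run_boot_sequence(active_slot: str,
                      boots_ok: Callable[[str], bool]) -> Tuple[str, List[str]]:
    """Return (outcome, slots marked as failed) for one power-on cycle."""
    if active_slot not in ("A", "B"):
        raise ValueError(f"unknown active slot: {active_slot!r}")
    failed: List[str] = []

    main = active_slot
    if boots_ok(main):
        return f"running:{main}", failed
    failed.append(main)            # mark the main partition as failed

    standby = other_slot(main)     # switch bootname and restart
    if boots_ok(standby):
        return f"running:{standby}", failed
    failed.append(standby)

    return "recovery", failed      # both slots failed: enter the recovery partition
```

For example, `run_boot_sequence("A", lambda s: s == "B")` models a corrupted A image with a healthy B image and ends on the standby slot with A marked as failed.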
Step S14: if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, acquire a system update compressed package, perform signature verification on it, and, if the signature verification passes, install it to the main partition and the standby partition to complete remote recovery.
In this embodiment, if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, the system upgrade controller acquires the system update compressed package from a remote end according to a preset remote access mode and verifies the signature in the package. If the signature verification passes, the package is installed to the main partition and the standby partition, and the system upgrade controller sets the updated main partition to the active state and restarts the system. After the restart, the updated main partition kernel and main startup file system corresponding to the updated main partition are loaded and started, and it is judged whether they are started successfully within the preset duration. If they are, remote recovery is judged successful; if they are not, remote recovery is judged to have failed. The remote access modes include a network security protocol and the Trivial File Transfer Protocol.
That is, if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, the system jumps to the Recovery partition and opens an SSH (Secure Shell) or Web console to allow remote diagnosis and repair. The Recovery system is a BusyBox (lightweight tool set) environment with built-in network drivers, RAUC commands, diagnostic scripts, a TFTP (Trivial File Transfer Protocol) tool, and so on.
The specific flow of remote recovery in the application is shown in fig. 3. RAUC acquires the system update compressed package from the remote end and verifies the signature in the package. If the verification passes, the program name of the current standby partition is changed to the program name of the main partition, the package is installed to the main partition and the standby partition, and the system is restarted. After the restart, the system enters the updated main partition, loads and starts the updated main partition kernel and main startup file system, writes boot_ok, and judges whether the write succeeded. If the write succeeded, the update has succeeded and remote recovery is judged successful; if the write failed, remote recovery is judged to have failed.
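The verify-then-install ordering of fig. 3 can be illustrated with a short Python sketch. Note the substitution: an HMAC-SHA256 over the package bytes stands in for signature verification here only because it is available in the standard library; it is not the signature scheme used by RAUC or described in the application. The names `signature_valid`, `remote_recover`, the `slots` dictionary, and the `boots_ok` callback are all hypothetical.

```python
import hashlib
import hmac
from typing import Callable, Dict


def signature_valid(package: bytes, signature: bytes, key: bytes) -> bool:
    """Stand-in check: HMAC-SHA256 of the package, compared in constant time."""
    expected = hmac.new(key, package, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


def remote_recover(package: bytes, signature: bytes, key: bytes,
                   slots: Dict[str, bytes],
                   boots_ok: Callable[[bytes], bool]) -> str:
    """Verify first, then install to both slots, then judge the reboot."""
    if not signature_valid(package, signature, key):
        return "rejected"              # nothing is written if the check fails
    for name in ("main", "standby"):   # install to main and standby partitions
        slots[name] = package
    # After the restart the updated main partition must report boot_ok.
    return "recovered" if boots_ok(slots["main"]) else "failed"
```

The point of the sketch is the ordering: no partition is touched until verification passes, and success is declared only after the updated main partition boots.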
The Recovery system can be configured to bring the network online automatically, for example through DHCP (Dynamic Host Configuration Protocol), and includes built-in remote command execution tools such as Dropbear SSH, telnetd (a Telnet server daemon), curl (a command-line data transfer tool), and the RAUC CLI. It supports operations such as remotely uploading RAUC packages, triggering repair, and rewriting the status partition, and supports uploading failed-partition logs to a remote Syslog server or a cloud diagnosis platform.
The invention constructs a mechanism that switches automatically on boot failure together with a Recovery emergency subsystem. When neither the main nor the standby system can start normally, the system automatically enters a preset minimal operating environment, and the firmware can be accessed, restored, or redeployed remotely over SSH, TFTP, and the like. This effectively reduces manual intervention and maintenance cost, and is particularly suitable for large-scale deployment scenarios such as data centers.
In addition, the RAUC status partition can fully record firmware upgrade states, version information, and boot health. Combined with boot logs and failure rollback records, it supports remote monitoring and tracing of the running state of the BMC system, which facilitates rapid fault localization and early warning of batch problems and significantly reduces operation and maintenance complexity and time cost.
The application is implemented on mainstream open source components such as RAUC, U-Boot, and the Linux kernel, is compatible with existing embedded Linux system architectures, and can be widely adapted to mainstream BMC platforms such as AST2500/AST2600. It has good scalability and productization capability and is convenient for large-scale rollout and iterative optimization. The application not only achieves dual redundancy and high reliability in system structure design, but also provides a complete closed-loop operating mechanism for software control and remote maintainability, thereby significantly improving the security, stability, and engineering practicality of the BMC firmware system in actual deployment, with broad application prospects and value.
For health detection, a multi-dimensional mechanism can be established: health can be checked by the number of registered services, by whether network initialization has completed, by whether the BMC can ping the host, and by a count of repeated watchdog restarts. For upgrades, differential package upgrading, such as RAUC delta update, is supported. When Flash space is limited, only the changed parts, such as partial changes to rofs, are uploaded, which reduces bandwidth use and Flash write load.
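A multi-dimensional check of this kind can be sketched as one function that evaluates the four signals independently and reports which ones failed. The thresholds (`min_services`, `max_watchdog_restarts`) and all names below are hypothetical; the application names the four dimensions but gives no limits.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthSample:
    registered_services: int   # number of services that registered at boot
    network_initialized: bool  # network initialization completed
    host_ping_ok: bool         # the BMC can ping the host
    watchdog_restarts: int     # count of repeated watchdog restarts


def health_failures(sample: HealthSample,
                    min_services: int = 5,
                    max_watchdog_restarts: int = 3) -> List[str]:
    """Return the names of the failed dimensions; an empty list means healthy."""
    failures: List[str] = []
    if sample.registered_services < min_services:
        failures.append("services")
    if not sample.network_initialized:
        failures.append("network")
    if not sample.host_ping_ok:
        failures.append("host_ping")
    if sample.watchdog_restarts > max_watchdog_restarts:
        failures.append("watchdog_restarts")
    return failures
```

Returning the list of failed dimensions rather than a single boolean keeps the result usable both for the boot decision and for the fault-localization logs mentioned elsewhere in the description.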
In addition, in the dual partition switching stage, the standby partition may go unbooted for a long time, so that its firmware drifts out of step with the current hardware or peripheral drivers. To avoid this, periodic gray-release drills can be performed. For example, once a month, during a low-traffic window, a script migrates traffic to the standby partition and runs it for 5 minutes; after the drivers, peripherals, and services have been verified as normal, the system switches back to the main partition. If the drill succeeds, a tag is recorded in the cloud as a "standby partition health credential". In the remote recovery stage, to defend against version rollback attacks (in which an attacker pushes an old, vulnerable system version), a monotonically increasing version number and a timestamp can be embedded in the update package. The Bootloader must verify that the new version number is greater than the version number of the current partition and that the timestamp is later than the last update time; a validity-period check of the certificate chain can also be added.
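The anti-rollback rule can be written as a single predicate over the package metadata. This Python sketch assumes integer version numbers and Unix timestamps, and reduces the certificate-chain check to a validity window on one certificate; `UpdateMeta` and `accept_update` are hypothetical names, not part of the application or of any bootloader API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateMeta:
    version: int          # monotonically increasing version number
    timestamp: int        # package timestamp, seconds since the epoch
    cert_not_before: int  # start of the signing certificate's validity
    cert_not_after: int   # end of the signing certificate's validity


def accept_update(meta: UpdateMeta, current_version: int,
                  last_update_time: int, now: int) -> bool:
    """Accept only packages that are strictly newer and validly certified."""
    if meta.version <= current_version:
        return False      # same or older version: possible rollback attack
    if meta.timestamp <= last_update_time:
        return False      # not newer than the last applied update
    # Simplified validity check on the certificate period.
    return meta.cert_not_before <= now <= meta.cert_not_after
```

Both comparisons are strict, so replaying the currently installed package is rejected as well as installing an older one.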
In this embodiment, the main partition of the management controller that is in the active state after the system is powered on is determined, and the main partition kernel and the main startup file system corresponding to the main partition are loaded and started. It is judged whether the main partition kernel and the main startup file system are started successfully within a preset duration. If they are not, the system switches to the standby partition, and the standby partition kernel and the standby startup file system corresponding to the standby partition are loaded and started to complete the dual partition switch. If the standby partition kernel and the standby startup file system are not started successfully within the preset duration, a system update compressed package is acquired and its signature is verified; if the signature verification passes, the package is installed to the main partition and the standby partition to complete remote recovery.
In other words, the method first determines the main partition of the management controller that is in the active state after the system is powered on, and loads and starts the corresponding main partition kernel and main startup file system. If these are not started successfully within the preset duration, the system switches to the standby partition. Through the dual partition structure, the system switches automatically to the standby system in cases such as firmware corruption, update failure, or abnormal power loss, so that the management controller keeps running stably and the device does not go out of control because of an abnormal management controller. If the standby partition kernel and the standby startup file system are also not started successfully within the preset duration, signature verification is performed on the acquired system update compressed package, and if the verification passes, the package is installed to the main partition and the standby partition.
Referring to fig. 4, the embodiment of the invention discloses a partition monitoring device for dual partition deployment, which specifically may include:
The loading starting module 11 is used for determining a main partition of the management controller in an activated state after the system is powered on, and loading and starting a main partition kernel and a main starting file system corresponding to the main partition;
The judging module 12 is configured to judge whether the main partition kernel and the main startup file system are started successfully within a preset duration;
The dual-partition switching module 13 is configured to switch to a standby partition if the main partition kernel and the main startup file system are not started successfully within a preset duration, and load and start the standby partition kernel and the standby startup file system corresponding to the standby partition to complete dual-partition switching;
The remote recovery module 14 is configured to acquire a system update compressed package if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, and to perform signature verification on the system update compressed package; if the signature verification passes, the system update compressed package is installed to the main partition and the standby partition to complete remote recovery.
In some specific embodiments, the loading starting module 11 may specifically include:
the reading and analyzing module is used for reading and parsing the management controller of each partition by using a preset loader after the system is powered on;
and the main partition determining module is used for taking the partition in which the program of the system upgrade controller in the management controller is in an activated state as the main partition.
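The main partition selection performed by these two sub-modules can be sketched as below. This is a minimal sketch under stated assumptions: the `PartitionInfo` record and its `active` field are hypothetical, and the source does not say how the loader stores or parses the activation state.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PartitionInfo:
    name: str
    active: bool  # activation state as parsed by the loader (assumed field)


def select_primary(partitions: Sequence[PartitionInfo]) -> Optional[PartitionInfo]:
    """Return the first partition flagged as activated, or None if none is."""
    for part in partitions:
        if part.active:
            return part
    return None
```

Returning `None` when no partition is activated is a design choice of this sketch; the source does not describe that case.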
In some specific embodiments, the judging module 12 may specifically include:
The loading starting module is used for loading and starting the main partition kernel and the main startup file system corresponding to the main partition, and setting the state of a timer to on;
The program label writing module is used for calling an interface of the system upgrade controller by using a system initialization process tool or a custom script, and writing a program label into the main partition based on a main partition kernel and a main startup file system;
and the starting success judging module is used for judging whether the main partition kernel and the main starting file system are started successfully within a preset duration after the program label is written.
In some embodiments, the starting success judging module may specifically include:
the current state determining module is used for determining the current state of the timer after the program label is written;
the first start failure judging module is used for judging that the main partition kernel and the main startup file system are not started successfully within the preset duration if the current state of the timer is on;
the write time acquisition module is used for acquiring the write time of the program label from the timer if the current state of the timer is off;
the write time judging module is used for judging whether the write time is longer than the preset duration;
the start success judging module is used for judging that the main partition kernel and the main startup file system are started successfully within the preset duration if the write time is not longer than the preset duration;
and the second start failure judging module is used for judging that the main partition kernel and the main startup file system are not started successfully within the preset duration if the write time is longer than the preset duration.
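The timer-based decision made by these sub-modules reduces to a small predicate. The sketch below assumes the timer state is exposed as a boolean and the write time of the program label as a number of seconds; the function name, both parameter names, and the handling of a missing write time are assumptions of this example.

```python
from typing import Optional


def boot_succeeded(
    timer_running: bool,
    tag_write_time_s: Optional[float],
    preset_s: float,
) -> bool:
    """Decide whether the kernel and startup file system booted in time."""
    if timer_running:
        # Timer still on after the label write: judged as not started.
        return False
    if tag_write_time_s is None:
        # Assumption: a missing write time is treated as a failed boot.
        return False
    # Timer off: success only if the label was written within the preset duration.
    return tag_write_time_s <= preset_s
```

A write time exactly equal to the preset duration counts as success, matching the "not longer than" wording of the embodiment.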
In some specific embodiments, the dual partition switching module 13 may specifically include:
The starting failure marking module is used for marking the main partition as starting failure if the kernel of the main partition and the main starting file system are not started successfully within a preset time length;
The system restarting module is used for switching to the standby partition, performing system restarting operation, and loading and starting a standby partition kernel and a standby starting file system corresponding to the standby partition after the system is restarted;
And the standby partition judging module is used for judging whether the standby partition kernel and the standby starting file system are started successfully within a preset time length.
In some embodiments, the remote recovery module 14 may specifically include:
The compressed package acquisition and verification module is used for acquiring the system update compressed package from a remote end by using the system upgrade controller according to a preset remote access mode if the standby partition kernel and the standby startup file system are not started successfully within the preset duration, and for verifying the signature in the system update compressed package; the remote access mode comprises a network security protocol and a simple file transfer protocol.
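The verify-before-install gate can be illustrated as follows. The source does not specify the signature scheme, so this sketch substitutes a keyed HMAC-SHA256 tag from the Python standard library as a stand-in; a real deployment would more likely verify an asymmetric signature with a public key. The function names and the shared key are hypothetical.

```python
import hashlib
import hmac


def sign_package(payload: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the package (stand-in for a signature)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_package(payload: bytes, signature: bytes, key: bytes) -> bool:
    """Return True only if the tag matches the payload.

    hmac.compare_digest performs a constant-time comparison, which avoids
    leaking how many leading bytes of the tag were correct.
    """
    expected = sign_package(payload, key)
    return hmac.compare_digest(expected, signature)
```

Installation to the main and standby partitions would proceed only when `verify_package` returns `True`.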
In some embodiments, the remote recovery module 14 may further include:
The system restarting module is used for setting the main partition, after the system update compressed package has been installed, to an activated state by using the system upgrade controller, and performing a system restart operation;
The loading and starting module is used for loading and starting, after the system is restarted, the updated main partition kernel and main startup file system corresponding to the main partition on which the system update compressed package has been installed;
the updated starting success judging module is used for judging whether the updated main partition kernel and the main starting file system are successfully started within a preset time length;
The remote recovery success module is used for judging that the remote recovery is successful if the updated main partition kernel and the main startup file system are started successfully within a preset time length;
and the remote recovery failure judging module is used for judging that the remote recovery fails if the updated main partition kernel and the main startup file system are not started successfully within a preset time length.
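The post-install check performed by these sub-modules can be sketched as below. The callables `set_active`, `reboot`, and `booted_within_preset` are hypothetical stand-ins for the system upgrade controller interface, the restart operation, and the boot-success judgment; none of these names comes from the source.

```python
from typing import Callable


def finish_remote_recovery(
    set_active: Callable[[str], None],
    reboot: Callable[[], None],
    booted_within_preset: Callable[[str], bool],
) -> bool:
    """Activate the updated main partition, restart, and report the result.

    Returns True if remote recovery is judged successful, False otherwise.
    """
    set_active("primary")  # mark the freshly installed main partition as activated
    reboot()               # restart so the loader picks up the activated partition
    # Recovery succeeds only if the updated kernel and startup file system
    # come up within the preset duration.
    return booted_within_preset("primary")
```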
The description of the features in the embodiment corresponding to the partition monitoring device for dual partition deployment may refer to the related description of the embodiment corresponding to the partition monitoring method for dual partition deployment, which is not described herein in detail.
An embodiment of the application also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the above-described embodiments of the partition monitoring method for a dual partition deployment.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the steps of any of the above-described embodiments of the partition monitoring method for a dual partition deployment at run-time.
In an exemplary embodiment, the computer readable storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium in which a computer program may be stored.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The partition monitoring method, the device, the equipment and the medium for double partition deployment provided by the application are described in detail. The principles and embodiments of the present application have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present application and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the application can be made without departing from the principles of the application and these modifications and adaptations are intended to be within the scope of the application as defined in the following claims.

Claims (10)

1. A partition monitoring method for dual-partition deployment, comprising:
determining a primary partition in which a program of a management controller is in an activated state after the system is powered on, and loading and starting a primary partition kernel and a primary boot file system corresponding to the primary partition;
determining whether the primary partition kernel and the primary boot file system are successfully started within a preset time period;
if the primary partition kernel and the primary boot file system fail to start successfully within the preset time period, switching to a backup partition, and loading and starting a backup partition kernel and a backup boot file system corresponding to the backup partition, to complete dual-partition switching;
if the backup partition kernel and the backup boot file system fail to start successfully within the preset time period, obtaining a system update compressed package and performing signature verification on the system update compressed package, and if the signature verification passes, installing the system update compressed package to the primary partition and the backup partition to complete remote recovery.

2. The partition monitoring method for dual-partition deployment according to claim 1, wherein determining the primary partition in which the program of the management controller is in an activated state after the system is powered on comprises:
after the system is powered on, reading and parsing the management controller of each partition by using a preset loader;
taking the partition in which the program of the system upgrade controller in the management controller is in an activated state as the primary partition.

3. The partition monitoring method for dual-partition deployment according to claim 1, wherein loading and starting the primary partition kernel and the primary boot file system corresponding to the primary partition, and determining whether the primary partition kernel and the primary boot file system are successfully started within a preset time period, comprise:
loading and starting the primary partition kernel and the primary boot file system corresponding to the primary partition, and setting the state of a timer to on;
calling an interface of the system upgrade controller by using a system initialization process tool or a custom script, and writing a program tag into the primary partition based on the primary partition kernel and the primary boot file system;
after the program tag is written, determining whether the primary partition kernel and the primary boot file system are successfully started within the preset time period.

4. The partition monitoring method for dual-partition deployment according to claim 3, wherein, after the program tag is written, determining whether the primary partition kernel and the primary boot file system are successfully started within the preset time period comprises:
after the program tag is written, determining the current state of the timer;
if the current state of the timer is on, determining that the primary partition kernel and the primary boot file system have not been successfully started within the preset time period;
if the current state of the timer is off, obtaining the writing time of the program tag from the timer;
determining whether the writing time is greater than the preset time period;
if the writing time is not greater than the preset time period, determining that the primary partition kernel and the primary boot file system are successfully started within the preset time period;
if the writing time is greater than the preset time period, determining that the primary partition kernel and the primary boot file system have not been successfully started within the preset time period.

5. The partition monitoring method for dual-partition deployment according to claim 1, wherein, if the primary partition kernel and the primary boot file system fail to start successfully within the preset time period, switching to the backup partition and loading and starting the backup partition kernel and the backup boot file system corresponding to the backup partition comprises:
if the primary partition kernel and the primary boot file system fail to start successfully within the preset time period, marking the primary partition as boot failure;
switching to the backup partition and performing a system restart operation, and after the system restarts, loading and starting the backup partition kernel and the backup boot file system corresponding to the backup partition;
determining whether the backup partition kernel and the backup boot file system are successfully started within the preset time period.

6. The partition monitoring method for dual-partition deployment according to claim 1, wherein, if the backup partition kernel and the backup boot file system fail to start successfully within the preset time period, obtaining a system update compressed package and performing signature verification on the system update compressed package comprises:
if the backup partition kernel and the backup boot file system fail to start successfully within the preset time period, obtaining the system update compressed package from a remote end by using the system upgrade controller according to a preset remote access method, and verifying the signature in the system update compressed package; the remote access method includes a network security protocol and a simple file transfer protocol.

7. The partition monitoring method for dual-partition deployment according to any one of claims 1 to 6, wherein, after installing the system update compressed package to the primary partition and the backup partition, the method further comprises:
setting, by using the system upgrade controller, the primary partition on which the system update compressed package has been installed to an activated state, and performing a system restart operation;
after the system restarts, loading and starting the updated primary partition kernel and primary boot file system corresponding to the primary partition on which the system update compressed package has been installed;
determining whether the updated primary partition kernel and primary boot file system are successfully started within the preset time period;
if the updated primary partition kernel and primary boot file system are successfully started within the preset time period, determining that the remote recovery is successful;
if the updated primary partition kernel and primary boot file system fail to start successfully within the preset time period, determining that the remote recovery has failed.

8. A partition monitoring device for dual-partition deployment, comprising:
a loading and starting module, used to determine the primary partition in which the program of the management controller is in an activated state after the system is powered on, and to load and start the primary partition kernel and the primary boot file system corresponding to the primary partition;
a judgment module, used to determine whether the primary partition kernel and the primary boot file system are successfully started within a preset time period;
a dual-partition switching module, used to switch to a backup partition if the primary partition kernel and the primary boot file system fail to start successfully within the preset time period, and to load and start the backup partition kernel and the backup boot file system corresponding to the backup partition, to complete dual-partition switching;
a remote recovery module, used to obtain a system update compressed package and perform signature verification on the system update compressed package if the backup partition kernel and the backup boot file system fail to start successfully within the preset time period, and, if the signature verification passes, to install the system update compressed package to the primary partition and the backup partition to complete remote recovery.

9. An electronic device, comprising:
a memory for storing a computer program;
a processor configured to implement, when executing the computer program, the steps of the partition monitoring method for dual-partition deployment according to any one of claims 1 to 7.

10. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the partition monitoring method for dual-partition deployment according to any one of claims 1 to 7 are implemented.
CN202511064433.8A 2025-07-31 2025-07-31 Partition monitoring method, device, equipment and medium for double partition deployment Pending CN120560954A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202511064433.8A CN120560954A (en) 2025-07-31 2025-07-31 Partition monitoring method, device, equipment and medium for double partition deployment

Publications (1)

Publication Number Publication Date
CN120560954A true CN120560954A (en) 2025-08-29

Family

ID=96824844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202511064433.8A Pending CN120560954A (en) 2025-07-31 2025-07-31 Partition monitoring method, device, equipment and medium for double partition deployment

Country Status (1)

Country Link
CN (1) CN120560954A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109976940A (en) * 2017-12-28 2019-07-05 天津创奇业网络技术有限公司 A kind of soft start network equipment
CN116627515A (en) * 2023-05-30 2023-08-22 杭州迪普科技股份有限公司 Partition switching starting method and device of embedded system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination