CN119645733B

CN119645733B - KVM data storage processing method and system and continuous data point recovery method thereof

Info

Publication number: CN119645733B
Application number: CN202411816798.7A
Authority: CN
Inventors: 胡晓勤; 程治远
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2024-12-11
Filing date: 2024-12-11
Publication date: 2025-06-27
Anticipated expiration: 2044-12-11
Also published as: CN119645733A

Abstract

The present invention provides a KVM data storage processing method, system and continuous data point recovery method thereof, and relates to the field of disaster recovery backup technology. In the storage processing method, the server program first initializes and creates a configuration information table, an initialization volume, a log volume group and a metadata structure, and after receiving the source data from the production end, updates the corresponding volume, table and metadata according to the data type. In the continuous data point recovery method, after the recovery volume is created, the latest or arbitrary time point recovery is performed according to the relationship between the recovery time point M0 and the latest time point H0; if the data logic is inconsistent, a gradual rollback is performed. The present invention realizes efficient tracking and updating of data by combining the initialization volume and the log volume group, as well as metadata management, and ensures data consistency and integrity.

Description

KVM data storage processing method and system and continuous data point recovery method thereof

Technical Field

The invention belongs to the technical field of disaster recovery backup, and relates to a KVM data storage processing method, a system and a continuous data point recovery method thereof.

Background

In modern information technology, KVM (Kernel-based Virtual Machine), an open-source virtualization technology, has become an important choice for building virtualized environments. By emulating hardware, it allows multiple operating systems to run on the same physical server, improving resource utilization and flexibility. In this context, optimizing the data storage structure is an important way to improve virtual machine performance and reliability.

At present, a KVM virtual machine in cloud computing faces the same risks of data loss and corruption as an ordinary host. To guard against them, most users rely on backups, taking a full backup and creating a snapshot of the virtual machine's data at a given time. However, this approach has long intervals between backups and a large backup storage footprint. The amount of stored data keeps growing as the disaster recovery system runs over the long term, and in a continuous data protection environment in particular the data grows very quickly, so the data cannot be tracked efficiently and its consistency and integrity cannot be guaranteed.

Therefore, how to design a data storage structure that tracks and manages data more efficiently while ensuring data consistency and integrity is a technical problem that urgently needs to be solved.

Disclosure of Invention

The invention aims to solve the technical problems in the background art and provides a KVM data storage processing method, a system and a continuous data point recovery method thereof.

The technical scheme for solving the technical problems is as follows:

In a first aspect, a KVM data storage processing method is provided, where the method is applied to a server program, and includes the steps of:

a configuration information table creation step of initializing and creating a configuration information table;

a data volume creation step of initializing and creating an initialization volume and a log volume group;

a metadata structure creation step of initializing and creating a metadata structure, wherein the metadata structure comprises log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, and the mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block;

and a data storage step of receiving source data from the production end and judging its type: if the source data is basic configuration and synchronization data, the configuration information table is updated first, and then the initialization volume and the metadata structure are updated; if the source data is a data change record, the log volume group and the metadata structure are updated directly.

In a second aspect, there is provided a KVM data storage processing system, the system being applied to a server program, comprising:

a configuration information table creation module, used for initializing and creating a configuration information table;

a data volume creation module, used for initializing and creating an initialization volume and a log volume group;

a metadata structure creation module, used for initializing and creating a metadata structure, wherein the metadata structure comprises log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, the mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block, and the log metadata file records every point-in-time version of all data blocks of the virtual machine;

and a judging and updating module, used for receiving source data from the production end and judging its type: if the source data is basic configuration and synchronization data, the configuration information table is updated first, and then the initialization volume and the metadata structure are updated; if the source data is a data change record, the log volume group and the metadata structure are updated directly.

In a third aspect, a continuous data point recovery method is provided, which is based on the KVM data storage processing method described above and further includes the steps of:

a recovery volume creation step of receiving a recovery instruction and creating a recovery volume;

a judging step of judging whether the recovery time point M0 is greater than or equal to the latest time point H0, and if so, executing the latest time point recovery step, otherwise executing the arbitrary time point recovery step;

a latest time point recovery step of traversing the mirror table to determine whether the latest content of each data block is in the log volume group or the initialization volume, then reading the data from the log volume group or the initialization volume and writing it into the recovery volume, wherein the log volume group is read through the log metadata file;

an arbitrary time point recovery step of obtaining the bitmap for the moment M0 from the data block change log, traversing the bitmap to determine whether the latest content of each data block is in the log volume group or the initialization volume, then reading the data from the log volume group or the initialization volume and writing it into the recovery volume, wherein the log volume group is read by locating data through the mirror table together with the log metadata file, and the initialization volume is read by locating data through the mirror table;

and a step-by-step rollback step: when the data inside the virtual machine recovered at the latest time point or at an arbitrary time point is not in a logically consistent state, locating the data block positions before the recovery time point M0 through the write operation log and the mirror table, then tracing the log metadata file backwards to determine the last valid change of each data block, reading the corresponding data from the initialization volume or the log volume according to that change, and overwriting it into the recovery volume.

The beneficial effects of the invention are as follows:

By combining the initialization volume with the log volume group, the invention organizes basic configuration, synchronization data and change data in a reasonable way, and by managing metadata through the metadata structure it can track and update data efficiently. At the same time, the configuration information table, the metadata, the initialization volume and the log volume group are updated dynamically, which ensures data consistency and integrity and lays a foundation for quickly locating and recovering the required data during data recovery.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a KVM data storage processing method according to embodiment 1 of the present invention.

FIG. 2 is a schematic diagram of a data storage structure in embodiment 1 of the present invention.

FIG. 3 is a schematic diagram of a log metadata file structure in embodiment 1 of the present invention.

FIG. 4 is a flowchart of a KVM data storage processing method in embodiment 2 of the present invention.

FIG. 5 is a schematic diagram of a KVM data storage processing system according to embodiment 3 of the present invention.

FIG. 6 is a schematic diagram of a judgment update module in embodiment 3 of the present invention.

FIG. 7 is a flowchart of a method for recovering continuous data points in embodiment 4 of the present invention.

In the drawings, the list of components represented by the various numbers is as follows:

3001, configuration information table creation module; 3002, data volume creation module; 3003, metadata structure creation module; 3004, judgment update module; 3005, write-in initial volume module; 3006, update bitmap data module; 3007, delete log volume and log volume information module; 3008, modify metadata module; 30041, receiving unit; 30042, judgment unit; 30043, basic configuration and synchronous data storage unit; 30044, change data storage unit.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

Example 1

At present, a KVM virtual machine in cloud computing faces the same risks of data loss and corruption as an ordinary host. To guard against them, most users rely on backups, taking a full backup and creating a snapshot of the virtual machine's data at a given time. However, this approach has long intervals between backups and a large backup storage footprint. The amount of stored data keeps growing as the disaster recovery system runs over the long term, and in a continuous data protection environment in particular the data grows very quickly, so the data cannot be tracked efficiently and its consistency and integrity cannot be guaranteed. Therefore, how to design a data storage structure that tracks and manages data more efficiently while ensuring data consistency and integrity is a technical problem that urgently needs to be solved.

In view of the foregoing, an embodiment of the present invention provides a KVM data storage processing method. FIG. 1 is a schematic flow chart of a method for storing and processing KVM data according to an embodiment of the present invention, and the method includes, with reference to FIGS. 1 and 2:

Step S101, initializing and creating a configuration information table.

It will be appreciated that the configuration information table stores the basic information of all virtual machines; each record contains the unique identification code of the virtual machine, the total disk size, and the bitmap data path. The table is ordered by unique identification code, so records can be looked up by that code. The total disk size is used to generate the initialization volume and the log volume group, and the bitmap data path stores the absolute local path of the transmitted bitmap data, so that the bitmap data can be located.

It is further understood that the configuration information table is used to locate the information of the target virtual machine, whether that information is in the initialization volume, the log volume group, or the metadata structure.
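As a minimal sketch of such a table, assuming an in-memory dictionary keyed by the unique identification code: the class and field names (`ConfigRecord`, `vm_uuid`, `disk_size`, `bitmap_path`) are hypothetical and do not come from the patent, which does not specify an on-disk layout.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfigRecord:
    """One record of the configuration information table (names are illustrative)."""
    vm_uuid: str      # unique identification code of the virtual machine
    disk_size: int    # total disk size in bytes
    bitmap_path: str  # absolute local path of the transmitted bitmap data


class ConfigTable:
    """Configuration information table, looked up by unique identification code."""

    def __init__(self) -> None:
        self._records: Dict[str, ConfigRecord] = {}

    def add(self, record: ConfigRecord) -> None:
        self._records[record.vm_uuid] = record

    def find(self, vm_uuid: str) -> Optional[ConfigRecord]:
        # Returns None when the virtual machine is unknown.
        return self._records.get(vm_uuid)


table = ConfigTable()
table.add(ConfigRecord("vm-01", 1024, "/var/lib/cdp/vm-01.bitmap"))
```

A dictionary is used here only for brevity; a sorted file or database index keyed by the same code would serve the same lookup purpose.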

Step S102, initializing and creating an initialization volume and a log volume group.

It will be appreciated that since the management mechanism for directly operating the block device is complex, the present embodiment uses files to store data, improving the flexibility of data storage, and enabling the block device to be operated in a simpler manner. The initialization volume stores virtual machine basic configuration and synchronization data, and the volume size can be consistent with the total size of the virtual machine disk. The log volume group sequentially stores the change data generated by the production end, and the size of each log volume can be half of the total size of the virtual machine disk.
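The file-backed volumes described above could be created along the following lines. This is a sketch under stated assumptions: the file names, the number of log volumes, and pre-sizing each file with `truncate` are illustrative choices, not details given in the text.

```python
import os
import tempfile


def create_volume(path: str, size: int) -> None:
    """Create a file-backed volume of the given size in bytes."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the empty file; sparse on most filesystems


def create_volumes(base_dir: str, disk_size: int, log_volume_count: int = 2) -> dict:
    """Create one initialization volume and a group of log volumes for one VM."""
    init_path = os.path.join(base_dir, "init.vol")
    create_volume(init_path, disk_size)          # same size as the VM disk
    log_paths = []
    for i in range(log_volume_count):
        path = os.path.join(base_dir, f"log_{i}.vol")
        create_volume(path, disk_size // 2)      # half of the VM disk size
        log_paths.append(path)
    return {"init": init_path, "logs": log_paths}


with tempfile.TemporaryDirectory() as d:
    vols = create_volumes(d, disk_size=8 * 4096)
    sizes = (os.path.getsize(vols["init"]),
             [os.path.getsize(p) for p in vols["logs"]])
```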

Step S103, initializing and creating a metadata structure, wherein the metadata structure comprises log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, the mirror table stores the latest state of the virtual machine and locates log metadata file records of each data block, the log metadata file records each time point version of all the data blocks of the virtual machine, and the data block change log stores the storage state of all the data blocks of the virtual machine at the moment T by using bitmap.

It will be appreciated that the write operation log stores the information of every write operation of the virtual machine; it grows dynamically as the virtual machine is written to, and its records are stored in write order. Each record contains a timestamp, the starting logical block number written, and the number of data blocks written.
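A write operation log with these three fields might look like the sketch below. The names `WriteOp` and `WriteOpLog`, the microsecond unit, and the helper `blocks_written_up_to` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class WriteOp:
    timestamp_us: int  # time of the write (microsecond unit is an assumption)
    start_lbn: int     # starting logical block number written
    block_count: int   # number of data blocks written


class WriteOpLog:
    """Append-only log whose records are kept in write order."""

    def __init__(self) -> None:
        self.records: List[WriteOp] = []

    def append(self, op: WriteOp) -> None:
        if self.records and op.timestamp_us < self.records[-1].timestamp_us:
            raise ValueError("write operations must be appended in time order")
        self.records.append(op)

    def blocks_written_up_to(self, t_us: int) -> Set[int]:
        """Logical block numbers touched by writes with timestamp <= t_us."""
        touched: Set[int] = set()
        for op in self.records:
            if op.timestamp_us > t_us:
                break  # records are time-ordered, so nothing later can match
            touched.update(range(op.start_lbn, op.start_lbn + op.block_count))
        return touched


wlog = WriteOpLog()
wlog.append(WriteOp(100, 0, 2))
wlog.append(WriteOp(200, 5, 1))
```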

It can be further understood that the data block change log works on a principle similar to snapshots: when rolling back to an arbitrary time point, the type of data volume in which each data block is stored can be queried directly from the bitmap, which keeps traversal of the log metadata file to a minimum. Each data block change log record includes the timestamp at which the bitmap was created and the absolute path of the bitmap.

The data block change log adds records dynamically at a time interval set by the user. A record is created by writing the data block flags held in the mirror table at that moment into a newly created bitmap, and records are ordered and searched by timestamp. Each value in the bitmap should be consistent with the corresponding data block flag value.
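The snapshot-like behaviour of the data block change log can be sketched as follows. Two simplifications are assumed: bitmaps are held in memory rather than referenced by path, and a lookup returns the latest bitmap created at or before the requested time, a selection rule the text does not spell out.

```python
import bisect
from typing import List, Optional


class BlockChangeLog:
    """Time-ordered bitmaps, each a copy of the mirror table's per-block flags."""

    def __init__(self) -> None:
        self._timestamps: List[int] = []
        self._bitmaps: List[bytes] = []  # the text stores a path to each bitmap instead

    def snapshot(self, timestamp: int, mirror_flags: List[int]) -> None:
        """Record the flag of every data block as it stands at `timestamp`."""
        if self._timestamps and timestamp <= self._timestamps[-1]:
            raise ValueError("snapshots must be added in increasing time order")
        self._timestamps.append(timestamp)
        self._bitmaps.append(bytes(mirror_flags))  # one byte per data block

    def bitmap_at(self, t: int) -> Optional[bytes]:
        """Latest bitmap created at or before t (assumed selection rule)."""
        i = bisect.bisect_right(self._timestamps, t)
        return self._bitmaps[i - 1] if i else None


change_log = BlockChangeLog()
change_log.snapshot(10, [1, 1, 2])
change_log.snapshot(20, [0, 1, 2])
```

Because the records are ordered by timestamp, a binary search finds the relevant bitmap without scanning the log metadata file.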

Optionally, in step S103, as shown in FIG. 3, the log metadata file uses a doubly linked list structure, and each log metadata file record includes a head pointer head and a tail pointer last, where the head pointer head points to the first modified version of the data block and is used to find the tail pointer last, and the tail pointer last points to the last modified version of the data block;

The mirror table comprises a data block flag and a log metadata offset log_meta, wherein the log metadata offset log_meta is the offset of log metadata of a data block recorded in a log metadata file, and the log metadata offset log_meta stores the address of a head pointer head.

It will be appreciated that the doubly linked list has a head node, a last node and intermediate nodes, and supports traversal in both directions, from head to tail or from tail to head.

It will also be appreciated that the log metadata file adds records dynamically as the virtual machine is continually written to, and each record stores the point-in-time versions of one data block. Unlike conventional log metadata files, each record organizes the stored point-in-time versions of its data block in a doubly linked list. By traversing the log metadata file, any continuous data point can be recovered quickly, which also supports faster continuous recovery later on.

It is also worth noting that, as shown in FIG. 3, each node of the log metadata file contains a data field, a pre pointer, and a next pointer. The pre pointer and the next pointer point to the preceding and following nodes respectively, so a log metadata record can be traversed in forward order through the next pointers and in reverse order through the pre pointers. The data field is a structure data_info containing the microsecond write timestamp of the data block, the file name log_volume of the log volume to which the data block was written, and the offset within that log volume at which it was written, so the data field of a node locates the data block in the log volume.
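The record structure described above can be illustrated with an in-memory doubly linked list. The pointer names `head`, `last`, `pre` and `next` and the three data fields follow the text; the class names and the `version_at` helper are hypothetical additions for illustration.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class DataInfo:
    timestamp_us: int  # microsecond write timestamp of the data block
    log_volume: str    # file name of the log volume holding this version
    offset: int        # offset of the block inside that log volume


class Node:
    def __init__(self, data: DataInfo) -> None:
        self.data = data
        self.pre: Optional["Node"] = None
        self.next: Optional["Node"] = None


class BlockVersions:
    """One log metadata record: every point-in-time version of one data block."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None  # first modified version
        self.last: Optional[Node] = None  # last modified version

    def append(self, data: DataInfo) -> None:
        node = Node(data)
        if self.last is None:
            self.head = self.last = node
        else:
            node.pre = self.last
            self.last.next = node
            self.last = node

    def forward(self) -> Iterator[DataInfo]:
        node = self.head
        while node is not None:
            yield node.data
            node = node.next

    def backward(self) -> Iterator[DataInfo]:
        node = self.last
        while node is not None:
            yield node.data
            node = node.pre

    def version_at(self, t_us: int) -> Optional[DataInfo]:
        """Newest version written at or before t_us, walking back from `last`."""
        for info in self.backward():
            if info.timestamp_us <= t_us:
                return info
        return None


versions = BlockVersions()
versions.append(DataInfo(100, "log_0.vol", 0))
versions.append(DataInfo(200, "log_0.vol", 4096))
versions.append(DataInfo(300, "log_1.vol", 0))
```

Walking backwards from `last` reaches recent versions first, which suits recovery requests that target a time close to the present.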

It should be noted that the configuration information table, the initialization volume, the log volume group and the metadata structure may be initialized and created in parallel to speed up creation, provided that the dependency relationships are satisfied, namely that the initialization volume and the log volume group depend on the configuration information table, and the metadata structure depends on the initialization volume and the log volume group. They may also be created sequentially, for example by first initializing and creating the configuration information table, then the initialization volume and the log volume group, and finally the metadata structure. This embodiment places no particular limitation on this.

Step S104, receiving and judging the source data type of the production end, if the source data is basic configuration and synchronous data, updating the configuration information table first, then updating the initialization volume and the metadata structure, and if the source data is a data change record, directly updating the log volume group and the metadata structure.

Optionally, step S104 includes:

Step S1021, receiving source data from the production end;

Step S1022, judging the type of the source data: if the source data is configuration information, bitmap data or hole-free synchronization data, executing the basic configuration and synchronization data storage step; if the source data is a data change record, executing the change data storage step, wherein the data change record comprises a unique identification code, a timestamp, an offset, a data length and data content;

Step S1023, for configuration information, first generating a configuration information table record from the received configuration information and adding it to the table, and then generating the data volumes and metadata of the target virtual machine according to the generated record; for hole-free synchronization data, first converting the hole-free synchronization data together with the bitmap data to obtain the initialization synchronization data, and then writing the initialization synchronization data into the data volume;

Step S1024, looking up the metadata and data volumes of the virtual machine according to the unique identification code in the data change record, writing the data content of the record into the log volume group, and then updating the write operation log, the mirror table, the log metadata file and the data block change log.

It is understood that the configuration information stores virtual machine information, including the unique identification code of the virtual machine and the total size of its virtual disk file. The hole-free synchronization data is the version of the virtual machine's initialization synchronization data with the holes removed, and the bitmap data records the storage state of each data block of the initialization synchronization data. Combining the two allows the initialization synchronization data to be reconstructed by conversion.
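One way to picture the conversion of the synchronization data without holes back into the initialization synchronization data is shown below. The bitmap encoding (one entry per block, 1 for stored and 0 for a hole) and the block size are assumptions; the text does not fix either.

```python
from typing import Sequence

BLOCK_SIZE = 4096  # assumed block size


def expand_hole_free(hole_free: bytes, bitmap: Sequence[int],
                     block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the full initialization synchronization data.

    bitmap[i] == 1: block i holds data, taken from the next unread block of
    `hole_free`. bitmap[i] == 0: block i is a hole and stays zero-filled.
    """
    used = sum(1 for bit in bitmap if bit)
    if len(hole_free) != used * block_size:
        raise ValueError("hole-free data length does not match the bitmap")
    out = bytearray(len(bitmap) * block_size)  # starts zero-filled
    src = 0
    for i, bit in enumerate(bitmap):
        if bit:
            out[i * block_size:(i + 1) * block_size] = hole_free[src:src + block_size]
            src += block_size
    return bytes(out)


# Tiny 2-byte blocks keep the example readable.
full = expand_hole_free(b"AABB", [1, 0, 1], block_size=2)
```

Sending only the stored blocks plus a bitmap avoids transferring the zero-filled regions of a sparsely used virtual disk.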

By combining the initialization volume with the log volume group, this embodiment organizes basic configuration, synchronization data and change data in a reasonable way, and by managing metadata through the metadata structure it can track and update data efficiently. At the same time, the configuration information table, the metadata, the initialization volume and the log volume group are updated dynamically, which ensures data consistency and integrity and lays a foundation for quickly locating and recovering the required data during data recovery.
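To illustrate how a single data change record could update the log volume group and the metadata together (step S1024), the following sketch uses in-memory stand-ins. Everything here is a simplifying assumption: one log volume modelled as a `bytearray`, version lists instead of the on-disk log metadata file, block-aligned writes only, and mirror table flag values 0, 1 and 2 for "in log volume group", "in initialization volume" and "invalid". The periodic data block change log update is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BLOCK_SIZE = 4                      # tiny block size for the example
IN_LOG, IN_INIT, INVALID = 0, 1, 2  # mirror table flag values


@dataclass
class ChangeRecord:
    vm_uuid: str
    timestamp: int
    offset: int     # byte offset in the virtual disk
    length: int
    content: bytes


@dataclass
class VmStore:
    block_count: int
    log_volume: bytearray = field(default_factory=bytearray)
    write_log: List[Tuple[int, int, int]] = field(default_factory=list)
    # block number -> [(timestamp, offset in log volume)], oldest first
    versions: Dict[int, List[Tuple[int, int]]] = field(default_factory=dict)
    flags: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.flags:
            self.flags = [INVALID] * self.block_count


def store_change(store: VmStore, rec: ChangeRecord) -> None:
    if (rec.length != len(rec.content) or rec.offset % BLOCK_SIZE
            or rec.length % BLOCK_SIZE):
        raise ValueError("this sketch only handles whole, block-aligned writes")
    first = rec.offset // BLOCK_SIZE
    count = rec.length // BLOCK_SIZE
    if first + count > store.block_count:
        raise ValueError("write goes past the end of the virtual disk")
    base = len(store.log_volume)
    store.log_volume += rec.content                       # 1. append to log volume
    store.write_log.append((rec.timestamp, first, count)) # 2. write operation log
    for i in range(count):
        block = first + i
        # 3. log metadata: add a new point-in-time version of the block
        store.versions.setdefault(block, []).append(
            (rec.timestamp, base + i * BLOCK_SIZE))
        # 4. mirror table: the latest content is now in the log volume group
        store.flags[block] = IN_LOG


store = VmStore(block_count=4)
store_change(store, ChangeRecord("vm-01", 10, 4, 8, b"AAAABBBB"))
store_change(store, ChangeRecord("vm-01", 20, 8, 4, b"CCCC"))
```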

Example 2

As shown in fig. 4, a KVM data storage processing method is provided, and the method is applied to a server program, and includes the steps of:

Step S201, initializing and creating a configuration information table;

Step S202, initializing and creating an initialization volume and a log volume group, wherein the log volume group comprises a plurality of log volumes, and each log volume correspondingly comprises log volume information in a metadata structure;

Step S203, creating a metadata structure comprising log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, wherein the mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block;

The log metadata file records all time point versions of all data blocks of the virtual machine and stores storage positions of all data in the log volume group;

The data block change log stores the storage states of all data blocks of the virtual machine at the moment T by using a bitmap;

The log volume information comprises a data block offset and a data block timestamp time, wherein the data block offset is the offset in the initialization volume of each data block stored in the log volume, and the data block timestamp time is the timestamp at which the last data block stored in the log volume was written;

Step S204, receiving and judging the source data type of the production end, if the source data is basic configuration and synchronous data, updating the configuration information table firstly, and then updating the initialization volume and the metadata structure;

Step S205, finding the old log volume LV0 and its log volume information in the log volume group, traversing the data block offsets in the log volume information of LV0, and then overwriting the contents of all data blocks in LV0 into the initialization volume according to those offsets;

Step S206, searching a configuration information table according to the unique identification code of the target virtual machine to obtain bitmap data of the target virtual machine, and then synchronously traversing the initialization volume and the bitmap data to update the bitmap data;

Step S207, the log volume LV0 data are all written into the initialization volume and then the log volume LV0 and the corresponding log volume information thereof are deleted;

Step S208, the mirror table, the log metadata file, the write operation log and the data block change log are correspondingly modified.

Unlike the above embodiment, this embodiment provides a storage space reclamation policy: when the amount of stored data reaches the limit, the data contents of the virtual machine's log volume are merged into the initialization volume, and the related metadata is modified or deleted. This not only keeps the system running stably over long periods, but also improves storage utilization, accommodates continuously growing data volumes, and enhances the scalability and durability of the system.
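The merge of steps S205 and S207 can be sketched with in-memory stand-ins. The `LogVolume` class, the assumption that the group is ordered oldest first, and the small block size are illustrative; the bitmap and metadata updates of steps S206 and S208 are not shown.

```python
from dataclasses import dataclass
from typing import List

BLOCK_SIZE = 4  # tiny block size for the example


@dataclass
class LogVolume:
    data: bytes         # block contents, stored sequentially in write order
    offsets: List[int]  # log volume information: offset of each stored block
                        # in the initialization volume
    time: int           # timestamp of the last block written to this volume


def merge_oldest(init_volume: bytearray, log_group: List[LogVolume]) -> LogVolume:
    """Fold log volume LV0 into the initialization volume, then drop it."""
    lv0 = log_group[0]  # assumes the group is ordered oldest first
    if len(lv0.data) != len(lv0.offsets) * BLOCK_SIZE:
        raise ValueError("log volume information does not match its data")
    # Replay in write order so that a later write to the same block wins.
    for i, off in enumerate(lv0.offsets):
        init_volume[off:off + BLOCK_SIZE] = lv0.data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
    del log_group[0]    # delete LV0 together with its log volume information
    return lv0


init = bytearray(b"aaaabbbbcccc")
group = [LogVolume(b"XXXXYYYYZZZZ", [4, 8, 4], 30), LogVolume(b"QQQQ", [0], 40)]
merged = merge_oldest(init, group)
```

After such a merge, time points older than the merged volume's last timestamp can no longer be reconstructed from the log volume group, which is the trade-off that reclamation accepts.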

Example 3

As shown in fig. 5, there is provided a KVM data storage processing system, which is applied to a server program, including:

A configuration information table creation module 3001 for initializing and creating a configuration information table;

The data volume creation module 3002 is configured to initialize and create an initialization volume and a log volume group.

The metadata structure creation module 3003 is configured to initialize and create a metadata structure, where the metadata structure includes log volume information, a mirror table, a log metadata file, a write operation log, and a data block change log, the mirror table stores a latest state of the virtual machine and locates a log metadata file record of each data block, the log metadata file records each time point version of all data blocks of the virtual machine, and the data block change log uses bitmap to store storage states of all data blocks of the virtual machine at time T.

The judging and updating module 3004 is configured to receive and judge the source data type of the production end, update the configuration information table first if the source data is basic configuration and synchronization data, then update the initialization volume and the metadata structure, and directly update the log volume group and the metadata structure if the source data is a data change record.

Optionally, in the metadata structure creation module 3003, the log metadata file uses a doubly linked list structure, and each log metadata file record includes a head pointer head and a tail pointer last, where the head pointer head points to the first modified version of the data block and is used to find the tail pointer last, and the tail pointer last points to the last modified version of the data block;

Optionally, in the data volume creation module 3002, the log volume group includes a plurality of log volumes, each log volume correspondingly includes log volume information in a metadata structure;

In the metadata structure creation module 3003, the log metadata file stores the storage locations of all data in the log volume group, and the log volume information includes a data block offset and a data block timestamp time, wherein the data block offset is the offset in the initialization volume of each data block stored in the log volume, and the data block timestamp time is the timestamp at which the last data block stored in the log volume was written;

as also shown in fig. 5, the system further comprises:

The write-in initial volume module 3005 is configured to find an old log volume LV0 and log volume information thereof in the log volume group, traverse a data block offset of the log volume information in the log volume LV0, and then overwrite all data block contents in the log volume LV0 into the initialization volume according to the data block offset;

The bitmap data updating module 3006 is configured to search the configuration information table according to the unique identifier of the target virtual machine, obtain bitmap data of the target virtual machine, and then perform synchronous traversal on the initialization volume and the bitmap data to update the bitmap data;

The delete log volume and log volume information module 3007 is configured to delete the log volume LV0 and its corresponding log volume information after all data of the log volume LV0 has been written into the initialization volume;

The modify metadata module 3008 is configured to correspondingly modify the mirror table, the log metadata file, the write operation log and the data block change log.

As shown in FIG. 6, optionally, the judging and updating module 3004 includes:

a receiving unit 30041, configured to receive source data of a production end;

The judging unit 30042 is configured to judge the type of the source data: if the source data is configuration information, bitmap data or hole-free synchronization data, the basic configuration and synchronization data storage step is executed; if the source data is a data change record, the change data storage step is executed, where the data change record includes a unique identification code, a timestamp, an offset, a data length, and data content;

The basic configuration and synchronization data storage unit 30043 is configured, for configuration information, to generate a configuration information table record from the received configuration information, add it to the table, and generate the data volumes and metadata of the target virtual machine according to the generated record; and, for hole-free synchronization data, to first convert the hole-free synchronization data together with the bitmap data to obtain the initialization synchronization data, and then write the initialization synchronization data into the data volume;

The change data storage unit 30044 is configured to look up the metadata and data volumes of the virtual machine according to the unique identification code in the data change record, write the data content of the record into the log volume group, and then update the write operation log, the mirror table, the log metadata file and the data block change log.

According to the embodiment, the initialization volume and the log volume group are fused, data management is realized by means of the metadata structure, an efficient data tracking and updating framework is constructed, basic configuration and data synchronization flow are optimized, and meanwhile, data consistency and integrity are guaranteed through a dynamic updating mechanism.

Example 4

As shown in FIG. 7, a continuous data point recovery method is provided, which uses the KVM data storage processing method described in embodiment 1 or 2 and includes the steps of:

Step S401, initializing and creating a configuration information table;

Step S402, initializing and creating an initialization volume and a log volume group;

step S403, initializing and creating a metadata structure, wherein the metadata structure comprises log volume information, a mirror image table, a log metadata file, a write operation log and a data block change log, wherein the mirror image table stores the latest state of the virtual machine and positions the log metadata file record of each data block, and the log metadata file records each time point version of all the data blocks of the virtual machine;

Step S404, receiving and judging the source data type of the production end, if the source data is basic configuration and synchronous data, updating the configuration information table firstly, and then updating the initialization volume and the metadata structure;

Step S405, receiving a recovery instruction and creating a recovery volume;

Step S406, judging whether the recovery time point M0 is greater than or equal to the latest time point H0; if so, executing step S407, otherwise executing step S408;

step S407, traversing the mirror image table to determine whether the latest content of the data block is in a log volume group or an initialization volume, and then reading data from the log volume group or the initialization volume and writing the data into a recovery volume, wherein the read log volume group is read through a log metadata file;

Step S408, obtaining a bitmap at the moment M0 in the data block change log, traversing the bitmap to determine whether the latest content of the data block is in a log volume group or an initialization volume, and reading data from the log volume group or the initialization volume to write into a recovery volume, wherein the read log volume group is read by combining a mirror table and log metadata file positioning, and the read initialization volume is read by the mirror table positioning;

Step S409, when the internal data of the virtual machine recovered according to the latest time point or an arbitrary time point is not in a logically consistent state, locating the data block positions before the recovery time point M0 through the write operation log and the mirror table, then determining the last valid modification of each data block by tracing the log metadata file backwards, and then reading the corresponding data from the initialization volume or the log volume according to the last valid modification and overwriting it into the recovery volume.

It will be appreciated that before recovery, the server needs to create a recovery volume. The recovery volume is typically filled with zeros, and its size is the same as that of the initialization volume.

The latest time point H0 refers to the time point when the system successfully records the data change last time, and is the snapshot time point of the latest data state in the system, which represents the latest version of the current data.

The recovery time point M0 refers to a target time point to which it is desired to recover data, and may be any past moment, and it may be desired to recover data to this particular historical state due to data loss, corruption, or error.

In the present embodiment, if the recovery time point M0 is equal to or greater than the latest time point H0, this means that the user wishes to recover to a state containing the latest data, and thus the latest time point recovery step is performed. If the recovery time point M0 is smaller than the latest time point H0, the user wishes to recover to an earlier state, and thus an arbitrary time point recovery step is performed.

It will be appreciated that the data block flag of the mirror table has three values: 0 indicates that the latest content of the data block is in the log volume group, 1 indicates that the latest content of the data block is in the initialization volume, and 2 indicates that the data block is invalid data, i.e., the block is unassigned or its data is zero. The log metadata offset log_meta is the offset of the log metadata record of the data block in the log metadata file, and stores the address of the head pointer head.

In this embodiment, for recovery at the latest time point, the mirror table is traversed and the data block flag field of each record is checked to determine whether the latest content of the data block is in the log volume group or the initialization volume. If the data block flag is 2, the data block can be skipped directly. If the data block flag is 1, the data can be read from the initialization volume directly according to the offset of the record in the mirror table, because the offset of the record in the mirror table is actually the offset of the data block in the initialization volume. If the data block flag is 0, the log metadata file, which records all time point versions of all data blocks of the virtual machine, is located according to the log metadata offset log_meta, and the log volume is then read through the log metadata file. More specifically, the head node in the log metadata file is located according to the log metadata offset log_meta, the last node is found through the pre pointer of the head node, the log volume file name (log_volume) and the in-volume offset (offset) in the data_info structure are obtained, and the data is read from the log volume according to the log volume file name and the in-volume offset.
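The latest time point recovery described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the names BLOCK_SIZE, Version, MirrorRecord and recover_latest are hypothetical, volumes are modelled as in-memory byte strings, and each block's doubly linked version chain is modelled as a Python list ordered oldest first, so taking the last list element stands in for following the pre pointer of the head node.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

BLOCK_SIZE = 4  # toy block size in bytes (assumption for the example)

# Data block flag values as described in the text
FLAG_IN_LOG, FLAG_IN_INIT, FLAG_INVALID = 0, 1, 2


@dataclass
class Version:
    """Stands in for the data_info of one log metadata node."""
    timestamp: int
    log_volume: str  # log volume file name
    offset: int      # offset inside that log volume


@dataclass
class MirrorRecord:
    flag: int
    log_meta: Optional[int] = None  # locates the block's version chain


def recover_latest(mirror: List[MirrorRecord],
                   log_meta_file: Dict[int, List[Version]],
                   init_volume: bytes,
                   log_volumes: Dict[str, bytes]) -> bytes:
    # The recovery volume starts zero-filled and has the size of the
    # initialization volume.
    recovery = bytearray(len(init_volume))
    for block_no, rec in enumerate(mirror):
        # The position of a record in the mirror table gives the block's
        # offset in the initialization volume.
        start = block_no * BLOCK_SIZE
        if rec.flag == FLAG_INVALID:
            continue  # unassigned or all-zero block: leave zeros
        if rec.flag == FLAG_IN_INIT:
            data = init_volume[start:start + BLOCK_SIZE]
        else:
            # Newest version of the block: last node of its chain.
            newest = log_meta_file[rec.log_meta][-1]
            volume = log_volumes[newest.log_volume]
            data = volume[newest.offset:newest.offset + BLOCK_SIZE]
        recovery[start:start + BLOCK_SIZE] = data
    return bytes(recovery)
```

For example, with three blocks where block 0 was changed twice, block 1 was never changed and block 2 is invalid, the recovery volume receives the newest log version of block 0, the initialization volume content of block 1, and zeros for block 2.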

In this embodiment, for recovery at an arbitrary time point, the data block change log uses a bitmap to store the storage state of all data blocks of the virtual machine at a given moment, so that when rolling back to an arbitrary time point, the type of data volume holding each data block can be queried directly from the bitmap. The record closest to the user-specified time point M0 can be located quickly by binary search. After the record closest to M0 is determined, the state of each data block can be checked quickly through the bitmap: if the bitmap value is 2, the data block is invalid and is skipped; if the bitmap value is 1, the latest data of the data block is in the initialization volume; and if the bitmap value is 0, the data may be in the log volume group. If the data is in the log volume group, the mirror table record is located according to the offset in the bitmap, the head node in the log metadata file is located according to the log metadata offset log_meta, and the log metadata records are traversed in reverse order to find the node whose timestamp is less than or equal to M0; the file name and offset of the log volume are then obtained and the data is read from the log volume. If no node meeting the condition is found, the data is in the initialization volume.
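The arbitrary time point recovery described above can be sketched as follows. This is an illustration under stated assumptions: the function recover_at and the layout of its arguments are hypothetical, "the record closest to M0" is interpreted as the newest change log record whose timestamp does not exceed M0, the change log is assumed to be sorted by timestamp, and each block's version chain is looked up directly by block number rather than through the mirror table and log_meta.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # toy block size in bytes (assumption for the example)

# (timestamp, log volume file name, offset inside the log volume)
Version = Tuple[int, str, int]


def recover_at(m0: int,
               change_log: List[Tuple[int, List[int]]],
               chains: Dict[int, List[Version]],
               init_volume: bytes,
               log_volumes: Dict[str, bytes]) -> bytes:
    # Binary search for the newest bitmap recorded at or before M0.
    times = [t for t, _ in change_log]
    idx = bisect_right(times, m0) - 1
    if idx < 0:
        raise ValueError("no change log record at or before M0")
    bitmap = change_log[idx][1]

    recovery = bytearray(len(init_volume))  # zero-filled recovery volume
    for block_no, state in enumerate(bitmap):
        start = block_no * BLOCK_SIZE
        if state == 2:
            continue  # invalid block: skip
        # Default source is the initialization volume (state 1, or state 0
        # when no log version at or before M0 exists).
        data = init_volume[start:start + BLOCK_SIZE]
        if state == 0:
            # Reverse traversal: newest version with timestamp <= M0.
            for ts, vol, off in reversed(chains.get(block_no, [])):
                if ts <= m0:
                    data = log_volumes[vol][off:off + BLOCK_SIZE]
                    break
        recovery[start:start + BLOCK_SIZE] = data
    return bytes(recovery)
```

Calling recover_at with M0 lying between two versions of a block returns the older of the two, which is the state that was current at M0.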

In addition, gradual rollback recovery is a method of restoring data to a specified historical state, which ensures the consistency and integrity of the data by rolling back step by step to the most recent consistent state.

In this embodiment, the write operation log records all write operations, and the mirror table provides the current state and position information of the data blocks; combining the two makes it possible to determine the last modified content of each data block. The log metadata file records every time point version of all the data blocks of the virtual machine, so the data state at the specified time point can be found by traversing it in reverse order. Gradual rollback and overwriting ensure that the restored data matches the state at the specified time point, and even if the restored data is inconsistent, it can be handled by rolling back further.

More specifically, a binary search is used to locate the latest write operation record whose timestamp is less than or equal to the recovery time point M0. The mirror table records are then looked up according to the start block number and the number of blocks in that write operation record, and the head node in the log metadata file is determined for each data block. The log metadata records are traversed in reverse order to find the node whose timestamp is less than or equal to M0, which determines the last modified content of the data block.
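The lookup in the write operation log can be sketched as follows. The record layout (timestamp, start block number, number of blocks) follows the fields named in the text, while the function name blocks_of_last_write and the assumption that the log is sorted by timestamp are introduced only for this example.

```python
from bisect import bisect_right
from typing import List, Tuple

# (timestamp, start block number, number of blocks)
WriteRecord = Tuple[int, int, int]


def blocks_of_last_write(write_log: List[WriteRecord], m0: int) -> List[int]:
    """Return the block numbers touched by the newest write at or before M0."""
    times = [rec[0] for rec in write_log]
    # bisect_right gives the insertion point after all timestamps <= M0,
    # so the record just before it is the one we want.
    idx = bisect_right(times, m0) - 1
    if idx < 0:
        return []  # no write happened at or before M0
    _, start, count = write_log[idx]
    return list(range(start, start + count))
```

The returned block numbers are the ones whose mirror table records and version chains would then be inspected.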

If the previous node of the target node is null, the data is stored in the initialization volume, and if the previous node is not null, the data is stored in the log volume. If the data is stored in the initialization volume, it is read from the initialization volume directly according to the offset in the mirror table; if the data is stored in the log volume, it is read from the log volume according to the log volume file name and offset in the data_info structure of the previous node. The data read from the initialization volume or the log volume is then overwritten into the recovery volume according to the offset in the mirror table. If the internal data of the recovered virtual machine is still inconsistent, the rollback continues and the operation is repeated.
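One rollback step for a single data block can be sketched as follows. This is one possible reading of the paragraph above, not a definitive implementation: the function rollback_source is hypothetical, the version chain is modelled as a list ordered oldest first, and a return value of None is used to mean "read the block from the initialization volume".

```python
from typing import List, Optional, Tuple

# (timestamp, log volume file name, offset inside the log volume)
Version = Tuple[int, str, int]


def rollback_source(chain: List[Version], m0: int) -> Optional[Version]:
    """Return the version to restore when rolling back one step from M0.

    None means the block must be read from the initialization volume.
    """
    # Reverse traversal: the target node is the newest one with
    # timestamp <= M0.
    target = None
    for i in range(len(chain) - 1, -1, -1):
        if chain[i][0] <= m0:
            target = i
            break
    # No target node, or the target has no previous node: the data is in
    # the initialization volume.
    if target is None or target == 0:
        return None
    # Otherwise the previous node's data_info says where to read.
    return chain[target - 1]
```

If the recovered virtual machine is still inconsistent, the same function can be called again with the timestamp of the returned version as the new M0, which steps one version further back until the initialization volume is reached.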

This embodiment supports several recovery methods, including recovery at the latest time point, recovery at an arbitrary time point and gradual rollback recovery, and can therefore meet different service requirements. When data loss or an error occurs, the user can select the most suitable recovery scheme according to the actual situation. By traversing the mirror table and the log metadata file, the required data can be accurately recovered from the initialization volume and the log volume group, which supports quick recovery under different fault conditions.

Any combination of one or more computer readable media may be employed for storing embodiments of the present invention. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

The computer program code for carrying out operations of the present invention may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++, Ruby and Go, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims

1. A KVM data storage and processing method, characterized in that the method is applied to a server program, comprising the steps of:

Configuration information table creation step, initialization creation of configuration information table;

Data volume creation steps: Initialize and create initialization volume and log volume group;

The metadata structure creation step is to initialize and create the metadata structure. The metadata structure includes log volume information, mirror table, log metadata file, write operation log and data block change log. The mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block; the log metadata file records the versions of all data blocks of the virtual machine at each time point; the data block change log uses bitmap to store the storage state of all data blocks of the virtual machine at time T;

Data storage step: receiving and determining the source data type of the production end. If the source data is basic configuration and synchronization data, the configuration information table is updated first, and then the initialization volume and metadata structure are updated; if the source data is a data change record, the log volume group and metadata structure are directly updated;

The data storage step includes receiving and determining the source data type of the production end:

Receiving step, receiving source data from the production end;

A judgment step, judging the type of source data, if the source data is configuration information, bitmap data and synchronization data without holes, then executing the basic configuration and synchronization data storage steps; if the source data is a data change record, then executing the change data storage step, wherein the data change record includes a unique identification code, a timestamp, an offset, a data length and a data content;

Basic configuration and synchronization data storage steps: for configuration information, first generate a configuration information table record according to the received configuration information and add it to the configuration information table, and then generate the data volume and metadata of the target virtual machine according to the generated record; for bitmap data, first initialize the metadata according to the bitmap data, and then write the absolute path of the bitmap data into the target virtual machine record in the configuration information table; for synchronization data without holes, first convert the synchronization data without holes and the bitmap data to obtain the initialization synchronization data, and then write the initialization synchronization data into the data volume;

The step of storing changed data is to search for the metadata and data volume of the virtual machine according to the unique identification code in the data change record, write the data content in the record into the log volume group, and then update the write operation log, update the mirror table, update the log metadata file and update the data block change log.

2. The KVM data storage and processing method according to claim 1, characterized in that, in the metadata structure creation step, the log metadata file is a doubly linked list structure, each log metadata file record includes a head pointer head and a tail pointer last, the head pointer head points to the data block version that was changed for the first time and is used to find the tail pointer last, and the tail pointer last points to the data block version that was changed most recently;

The mirror table includes a data block flag flag and a log metadata offset log_meta, wherein the log metadata offset log_meta is the offset of the log metadata of the data block recorded in the log metadata file, and the log metadata offset log_meta stores the address of the head pointer head.

3. The KVM data storage and processing method according to claim 1, characterized in that, in the data volume creation step, the log volume group includes a plurality of log volumes, and each log volume corresponds to log volume information in a metadata structure;

In the metadata structure creation step, the log metadata file stores the storage location of all data in the log volume group; the log volume information includes a data block offset and a data block timestamp, wherein the data block offset is the offset of each data block stored in the log volume in the initialization volume, and the data block timestamp is the timestamp when the last data block stored in the log volume is written.

4. The KVM data storage processing method according to claim 3, characterized in that the method further comprises the steps of:

In the step of writing the initialization volume, the old log volume LV0 and its log volume information are found in the log volume group, the data block offsets offset in the log volume information of the log volume LV0 are traversed, and all the data block contents in the log volume LV0 are then overwritten into the initialization volume according to the data block offsets offset;

The step of updating the bitmap data is to search the configuration information table according to the unique identification code of the target virtual machine, obtain the bitmap data of the target virtual machine, and then synchronously traverse the initialization volume and the bitmap data to update the bitmap data;

The step of deleting the log volume and its log volume information is to write all the data of the log volume LV0 into the initialization volume and then delete the log volume LV0 and its corresponding log volume information;

The metadata modification step correspondingly modifies the mirror table, the log metadata file, the write operation log and the data block change log.

5. A KVM data storage and processing system, characterized in that the system is applied to a server program, comprising:

A configuration information table creation module is used to initialize and create a configuration information table;

The data volume creation module is used to initialize and create the initialization volume and log volume group;

The metadata structure creation module is used to initialize and create the metadata structure. The metadata structure includes log volume information, mirror table, log metadata file, write operation log and data block change log. The mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block. The log metadata file records the versions of all data blocks of the virtual machine at each time point. The data block change log uses bitmap to store the storage state of all data blocks of the virtual machine at time T.

The judgment update module is used to receive and judge the source data type of the production end. If the source data is basic configuration and synchronization data, the configuration information table is updated first, and then the initialization volume and metadata structure are updated; if the source data is a data change record, the log volume group and metadata structure are directly updated;

The judgment update module includes:

A receiving unit, used for receiving source data from the production end;

A judgment unit, used to judge the type of source data, if the source data is configuration information, bitmap data and synchronization data without holes, then execute the basic configuration and synchronization data storage steps; if the source data is a data change record, then execute the change data storage step, wherein the data change record includes a unique identification code, a timestamp, an offset, a data length and a data content;

The basic configuration and synchronization data storage unit is used to generate a configuration information table record according to the received configuration information and add it to the configuration information table, and then generate the data volume and metadata of the target virtual machine according to the generated record; for bitmap data, first initialize the metadata according to the bitmap data, and then write the absolute path of the bitmap data into the target virtual machine record in the configuration information table; for synchronization data without holes, first convert the synchronization data without holes and the bitmap data to obtain the initialization synchronization data, and then write the initialization synchronization data into the data volume;

The change data storage unit is used to find the metadata and data volume of the virtual machine according to the unique identification code in the data change record, write the data content in the record into the log volume group, and then update the write operation log, update the mirror table, update the log metadata file and update the data block change log.

6. The KVM data storage and processing system according to claim 5, characterized in that, in the metadata structure creation module, the log metadata file is a doubly linked list structure, each log metadata file record includes a head pointer head and a tail pointer last, the head pointer head points to the data block version that was changed for the first time and is used to find the tail pointer last, and the tail pointer last points to the data block version that was changed most recently;

7. The KVM data storage and processing system according to claim 5, characterized in that, in the data volume creation module, the log volume group includes a plurality of log volumes, and each log volume corresponds to log volume information in a metadata structure;

In the metadata structure creation module, the log metadata file stores the storage location of all data in the log volume group; the log volume information includes a data block offset and a data block timestamp, wherein the data block offset is the offset of each data block stored in the log volume in the initialization volume, and the data block timestamp is the timestamp of the last data block stored in the log volume.

The system further comprises:

The write initial volume module is used to find the old log volume LV0 and its log volume information in the log volume group, traverse the data block offset offset of the log volume information in the log volume LV0, and then overwrite all data block contents in the log volume LV0 into the initialization volume according to the data block offset offset;

The bitmap data updating module is used to search the configuration information table according to the unique identification code of the target virtual machine, obtain the bitmap data of the target virtual machine, and then synchronously traverse the initialization volume and the bitmap data to update the bitmap data;

The module for deleting the log volume and its log volume information is used to delete the log volume LV0 and its corresponding log volume information after writing all the data of the log volume LV0 into the initialization volume;

The metadata modification module is used to modify the mirror table, the log metadata file, the write operation log and the data block change log.

8. A method for recovering continuous data points, characterized in that the method for storing and processing KVM data according to any one of claims 1 to 4 is used, and further comprising the steps of:

A recovery volume creation step is to receive a recovery instruction and create a recovery volume;

A judgment and execution step: determining whether the recovery time point M0 is greater than or equal to the latest time point H0; if so, executing the latest time point recovery step; if not, executing the arbitrary time point recovery step;

The latest time point recovery step: traversing the mirror table to determine whether the latest content of each data block is in the log volume group or the initialization volume, and then reading the data from the log volume group or the initialization volume and writing it to the recovery volume, wherein the log volume group is read through the log metadata file;

The arbitrary time point recovery step: obtaining the bitmap at time M0 from the data block change log, traversing the bitmap to determine whether the latest content of each data block is in the log volume group or the initialization volume, and then reading the data from the log volume group or the initialization volume and writing it to the recovery volume, wherein the log volume group is read by locating the data through the mirror table combined with the log metadata file, and the initialization volume is read by locating the data through the mirror table;

The gradual rollback step: when the internal data of the virtual machine restored at the latest time point or an arbitrary time point is not in a logically consistent state, the data block positions before the recovery time point M0 are located through the write operation log and the mirror table, the log metadata file is then traced backwards to determine the last valid modification of each data block, and the corresponding data is then read from the initialization volume or the log volume according to the last valid modification and overwritten into the recovery volume.