Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
Like an ordinary host, a KVM virtual machine in a cloud computing environment faces the risk of data loss and corruption. To guard against this, most users rely on backups, taking a full backup or a snapshot of the virtual machine's data at a given time. However, this approach has long intervals between backups and requires a large amount of backup storage. The stored data grows continuously as the disaster recovery system runs over the long term; in a continuous data protection environment in particular, the data grows very quickly, cannot be tracked efficiently, and its consistency and integrity are not guaranteed. How to design a data storage structure that tracks and manages data more efficiently while ensuring data consistency and integrity is therefore a technical problem in urgent need of a solution.
In view of the foregoing, an embodiment of the present invention provides a KVM data storage processing method. FIG. 1 is a schematic flow chart of the KVM data storage processing method according to an embodiment of the present invention. With reference to FIGS. 1 and 2, the method includes:
Step S101, initializing and creating a configuration information table.
It will be appreciated that the configuration information table stores the basic information of all virtual machines; each record contains the unique identification code of a virtual machine, its total disk size, and its bitmap data path. The table is ordered by unique identification code, so records can be looked up by that code. The total disk size is used to generate the initialization volume and the log volume group. The bitmap data path stores the absolute local path of the transmitted bitmap data, so that the bitmap data can be located.
It is further understood that the configuration information table is used to locate the information of the target virtual machine, whether that information resides in the initialization volume, the log volume group, or the metadata structure.
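As a minimal sketch of such a table, the following Python fragment keeps one record per virtual machine and looks records up by unique identification code. The names ConfigRecord, ConfigTable, vm_id, disk_size and bitmap_path are hypothetical, and a dictionary is used here in place of an ordered on-disk table; neither is prescribed by the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfigRecord:
    vm_id: str        # unique identification code of the virtual machine
    disk_size: int    # total disk size in bytes
    bitmap_path: str  # absolute local path of the transmitted bitmap data


class ConfigTable:
    """Configuration information table keyed by unique identification code."""

    def __init__(self) -> None:
        self._records: Dict[str, ConfigRecord] = {}

    def add(self, record: ConfigRecord) -> None:
        self._records[record.vm_id] = record

    def find(self, vm_id: str) -> Optional[ConfigRecord]:
        # Returns None when no virtual machine with this code is registered.
        return self._records.get(vm_id)


# Example values below are illustrative only.
table = ConfigTable()
table.add(ConfigRecord("vm-001", 64 * 1024**3, "/var/lib/cdp/vm-001/bitmap.dat"))
record = table.find("vm-001")
```

The record returned by find supplies the disk size needed to size the volumes and the path needed to locate the bitmap data.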
Step S102, initializing and creating an initialization volume and a log volume group.
It will be appreciated that since the management mechanism for directly operating the block device is complex, the present embodiment uses files to store data, improving the flexibility of data storage, and enabling the block device to be operated in a simpler manner. The initialization volume stores virtual machine basic configuration and synchronization data, and the volume size can be consistent with the total size of the virtual machine disk. The log volume group sequentially stores the change data generated by the production end, and the size of each log volume can be half of the total size of the virtual machine disk.
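The file-based layout described above could be sketched as follows, assuming one file per volume. The function name create_volumes, the file names and the default of two log volumes are assumptions made for illustration only.

```python
import os
import tempfile


def create_volumes(directory: str, disk_size: int, log_volume_count: int = 2):
    """Create a file-backed initialization volume and a group of log volumes."""
    init_path = os.path.join(directory, "init.vol")
    with open(init_path, "wb") as f:
        f.truncate(disk_size)           # same size as the virtual machine disk
    log_paths = []
    for i in range(log_volume_count):
        path = os.path.join(directory, f"log_{i}.vol")
        with open(path, "wb") as f:
            f.truncate(disk_size // 2)  # half of the virtual machine disk size
        log_paths.append(path)
    return init_path, log_paths


with tempfile.TemporaryDirectory() as d:
    init_path, log_paths = create_volumes(d, disk_size=1 << 20)
    init_size = os.path.getsize(init_path)
    log_sizes = [os.path.getsize(p) for p in log_paths]
```

Preallocating each log volume to its full size is one possible choice; a log volume could equally start empty and grow as change data is appended.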
Step S103, initializing and creating a metadata structure, wherein the metadata structure comprises log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, the mirror table stores the latest state of the virtual machine and locates log metadata file records of each data block, the log metadata file records each time point version of all the data blocks of the virtual machine, and the data block change log stores the storage state of all the data blocks of the virtual machine at the moment T by using bitmap.
It will be appreciated that the write operation log stores the information of each write operation of the virtual machine; records are stored in write order, and the log grows dynamically as the virtual machine continues to be written. Each record contains a timestamp, the starting logical block number written, and the number of data blocks written.
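A write operation log with these three fields might look like the sketch below. The class and field names are hypothetical, and the microsecond unit of the timestamp is an assumption borrowed from the data block timestamp described for the log metadata file.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WriteOp:
    timestamp_us: int  # time of the write (unit assumed to be microseconds)
    start_block: int   # starting logical block number written
    block_count: int   # number of data blocks written


class WriteOpLog:
    def __init__(self) -> None:
        self.records: List[WriteOp] = []

    def append(self, op: WriteOp) -> None:
        # Records are kept in write order, so timestamps must not go backwards.
        if self.records and op.timestamp_us < self.records[-1].timestamp_us:
            raise ValueError("write operations must be appended in time order")
        self.records.append(op)


def blocks_touched(op: WriteOp) -> range:
    """Logical block numbers covered by one write operation."""
    return range(op.start_block, op.start_block + op.block_count)


log = WriteOpLog()
log.append(WriteOp(1_000, start_block=8, block_count=4))
log.append(WriteOp(2_000, start_block=0, block_count=1))
touched = list(blocks_touched(log.records[0]))
```

Keeping the records sorted by timestamp is what later allows a binary search for the last write at or before a recovery time point.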
It can be further understood that the data block change log works on a principle similar to snapshots: when rolling back to any time point, the type of data volume in which each data block is stored can be queried directly from the bitmap taken after the rollback time, which minimizes traversal of the log metadata file. Each data block change log record includes the creation timestamp of a bitmap and the absolute path of that bitmap.
The data block change log adds a record dynamically at the time interval set by the user, by writing the data block flags of the mirror table at that moment into a newly created bitmap. Records are arranged, and can be searched, in timestamp order. Each value in the bitmap should be consistent with the corresponding data block flag value.
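The snapshot-like behaviour can be illustrated with the sketch below, which copies the mirror table flags into a bitmap file and records the timestamp and absolute path. Storing one byte per data block, the file naming scheme and the class name BlockChangeLog are assumptions; the flag values 0, 1 and 2 follow the mirror table description given in Example 4.

```python
import os
import tempfile
from typing import List, Tuple


class BlockChangeLog:
    def __init__(self, directory: str) -> None:
        self.directory = directory
        # Each record: (bitmap creation timestamp, absolute path of the bitmap)
        self.records: List[Tuple[int, str]] = []

    def snapshot(self, timestamp: int, mirror_flags: List[int]) -> str:
        """Write the current mirror table flags into a new bitmap file."""
        name = f"bitmap_{timestamp}.bin"
        path = os.path.abspath(os.path.join(self.directory, name))
        with open(path, "wb") as f:
            f.write(bytes(mirror_flags))  # one byte per data block (assumed)
        self.records.append((timestamp, path))
        return path

    def load(self, index: int) -> List[int]:
        with open(self.records[index][1], "rb") as f:
            return list(f.read())


with tempfile.TemporaryDirectory() as d:
    change_log = BlockChangeLog(d)
    change_log.snapshot(100, [2, 1, 0, 1])
    change_log.snapshot(200, [0, 1, 0, 1])
    first_bitmap = change_log.load(0)
    timestamps = [t for t, _ in change_log.records]
```

Because snapshots are appended at increasing timestamps, the record list stays sorted and can be searched by time.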
Optionally, in step S103, as shown in FIG. 3, the log metadata file has a doubly linked list structure, and each log metadata file record includes a head pointer head and a tail pointer last, where the head pointer head points to the first modified data block version and is used to find the tail pointer last, and the tail pointer last points to the last modified data block version;
The mirror table comprises a data block flag and a log metadata offset log_meta, wherein the log metadata offset log_meta is the offset of log metadata of a data block recorded in a log metadata file, and the log metadata offset log_meta stores the address of a head pointer head.
It will be appreciated that the doubly linked list has a head node, a last node and intermediate nodes, and supports traversal in both directions, from head to tail or from tail to head.
It will also be appreciated that the log metadata file dynamically adds records as the virtual machine is continually written, each record storing the point-in-time versions of one data block. Unlike conventional log metadata files, each log metadata file record organizes the point-in-time versions of the stored data block in a doubly linked list. By traversing the log metadata file, any continuous data point can therefore be recovered quickly, which also supports faster continuous recovery later on.
It is also worth noting that, as also shown in FIG. 3, each node of the log metadata file contains a data field, a pre pointer, and a next pointer. The pre pointer and the next pointer point to the preceding and following nodes, respectively, so a log metadata record can be traversed in reverse order through the pre pointers and in forward order through the next pointers. The data field is a data_info structure containing the microsecond write timestamp of the data block, the file name log_volume of the log volume to which the data block is written, and the offset within that log volume at which the data block is written, so the data block can be located in the log volume through the data field of the node.
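An in-memory sketch of one log metadata record is given below. The field names pre, next, head, last, log_volume and offset come from the description; the class names, the timestamp_us field name and the choice of a non-circular list with explicit head and last references are assumptions.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class DataInfo:
    timestamp_us: int  # microsecond write timestamp of the data block
    log_volume: str    # file name of the log volume holding this version
    offset: int        # offset of the data block inside that log volume


class Node:
    def __init__(self, data: DataInfo) -> None:
        self.data = data
        self.pre: Optional["Node"] = None
        self.next: Optional["Node"] = None


class LogMetaRecord:
    """All point-in-time versions of one data block, oldest first."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None
        self.last: Optional[Node] = None

    def append(self, data: DataInfo) -> None:
        node = Node(data)
        if self.last is None:
            self.head = self.last = node
        else:
            node.pre = self.last
            self.last.next = node
            self.last = node

    def forward(self) -> Iterator[DataInfo]:
        node = self.head
        while node is not None:
            yield node.data
            node = node.next

    def backward(self) -> Iterator[DataInfo]:
        node = self.last
        while node is not None:
            yield node.data
            node = node.pre


rec = LogMetaRecord()
for ts, off in [(10, 0), (20, 4096), (30, 8192)]:
    rec.append(DataInfo(ts, "log_0.vol", off))
forward_ts = [d.timestamp_us for d in rec.forward()]
backward_ts = [d.timestamp_us for d in rec.backward()]
```

Reverse traversal from the last node is the direction used by the recovery steps, since the newest version at or before a given time is usually near the tail.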
It should be noted that the creation of the configuration information table, the initialization volume, the log volume group and the metadata structure may be accelerated by parallel processing, provided that the dependency relationships are satisfied: the initialization volume and the log volume group depend on the configuration information table, and the metadata structure depends on the initialization volume and the log volume group. Alternatively, they may be created sequentially, for example by first initializing and creating the configuration information table, then creating the initialization volume and the log volume group, and finally creating the metadata structure. The present embodiment is not particularly limited in this respect.
Step S104, receiving and judging the source data type of the production end, if the source data is basic configuration and synchronous data, updating the configuration information table first, then updating the initialization volume and the metadata structure, and if the source data is a data change record, directly updating the log volume group and the metadata structure.
Optionally, step S104 includes:
Step S1021, receiving source data of a production end;
Step S1022, judging the type of the source data: if the source data is configuration information, bitmap data or hole-free synchronization data, executing the basic configuration and synchronization data storage step; if the source data is a data change record, executing the change data storage step, wherein the data change record comprises a unique identification code, a timestamp, an offset, a data length and data content;
Step S1023, for the configuration information, first generating a configuration information table record according to the received configuration information and adding it to the configuration information table, and then generating the data volume and metadata of the target virtual machine according to the generated record; for the hole-free synchronization data, first converting the hole-free synchronization data together with the bitmap data to obtain the initialization synchronization data, and then writing the initialization synchronization data into the data volume;
Step S1024, looking up the metadata and the data volume of the virtual machine according to the unique identification code in the data change record, writing the data content of the record into the log volume group, and then updating the write operation log, the mirror table, the log metadata file and the data block change log.
It is understood that the configuration information stores virtual machine information, including the unique identification code of the virtual machine and the total size of its virtual disk file. The hole-free synchronization data is the initialization synchronization data of the virtual machine with its holes removed; the initialization synchronization data can be obtained by converting it in combination with the bitmap data. The bitmap data records the storage state of each data block of the initialization synchronization data; combined with the hole-free synchronization data, it allows the initialization synchronization data to be reconstructed.
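The conversion from hole-free synchronization data and bitmap data back to initialization synchronization data can be sketched as follows, assuming the bitmap holds one entry per data block (non-zero meaning the block is present) and that holes are restored as zero-filled blocks. The function name and the tiny block size are illustrative assumptions.

```python
from typing import List

BLOCK_SIZE = 4  # assumed block size, kept tiny for illustration


def expand_sync_data(compact: bytes, bitmap: List[int],
                     block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the full image from hole-free data and its bitmap."""
    out = bytearray()
    pos = 0
    for present in bitmap:
        if present:
            block = compact[pos:pos + block_size]
            if len(block) != block_size:
                raise ValueError("hole-free data is shorter than the bitmap implies")
            out += block
            pos += block_size
        else:
            out += bytes(block_size)  # a hole becomes a zero-filled block
    if pos != len(compact):
        raise ValueError("hole-free data is longer than the bitmap implies")
    return bytes(out)


image = expand_sync_data(b"AAAABBBB", [1, 0, 1])
```

Here two stored blocks and a three-entry bitmap yield a three-block image whose middle block is a hole.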
This embodiment combines an initialization volume with a log volume group to organize the basic configuration, synchronization data and change data in a reasonable way, and manages metadata through the metadata structure, so that data can be tracked and updated efficiently. At the same time, the configuration information table, the metadata, the initialization volume and the log volume group are updated dynamically, which ensures data consistency and integrity and lays a foundation for quickly locating and recovering the required data during data recovery.
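The change data storage step (S1024) might be sketched in memory as below. All names are hypothetical, a single growing log volume stands in for the log volume group, per-block version lists stand in for the doubly linked log metadata records, and the interval-driven data block change log is left out. Only block-aligned writes are handled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # assumed block size, kept tiny for illustration
IN_LOG = 0      # mirror table flag: latest content is in the log volume group


@dataclass
class ChangeRecord:
    vm_id: str      # unique identification code
    timestamp: int
    offset: int     # byte offset on the virtual disk
    length: int     # data length in bytes
    content: bytes  # data content


@dataclass
class VmStore:
    log_volume: bytearray = field(default_factory=bytearray)
    write_log: List[Tuple[int, int, int]] = field(default_factory=list)
    mirror_flags: Dict[int, int] = field(default_factory=dict)
    # block number -> [(timestamp, offset inside the log volume), ...]
    log_meta: Dict[int, List[Tuple[int, int]]] = field(default_factory=dict)


def store_change(stores: Dict[str, VmStore], rec: ChangeRecord) -> None:
    if rec.length != len(rec.content):
        raise ValueError("data length does not match data content")
    if rec.offset % BLOCK_SIZE or rec.length % BLOCK_SIZE:
        raise ValueError("this sketch only handles block-aligned writes")
    store = stores[rec.vm_id]  # look up the VM by unique identification code
    first_block = rec.offset // BLOCK_SIZE
    count = rec.length // BLOCK_SIZE
    for i in range(count):
        log_offset = len(store.log_volume)  # data is appended sequentially
        store.log_volume += rec.content[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        block = first_block + i
        store.log_meta.setdefault(block, []).append((rec.timestamp, log_offset))
        store.mirror_flags[block] = IN_LOG
    store.write_log.append((rec.timestamp, first_block, count))


stores = {"vm-001": VmStore()}
store_change(stores, ChangeRecord("vm-001", 100, 8, 8, b"AAAABBBB"))
store_change(stores, ChangeRecord("vm-001", 200, 8, 4, b"CCCC"))
vm = stores["vm-001"]
```

After the two writes, block 2 has two versions in its metadata list while block 3 has one, and both are flagged as residing in the log volume group.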
Example 2
As shown in FIG. 4, a KVM data storage processing method is provided; the method is applied to a server program and includes the steps of:
Step S201, initializing and creating a configuration information table;
Step S202, initializing and creating an initialization volume and a log volume group, wherein the log volume group comprises a plurality of log volumes, and each log volume correspondingly comprises log volume information in a metadata structure;
Step S203, creating a metadata structure comprising log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, wherein the mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block;
The log metadata file records all time point versions of all data blocks of the virtual machine and stores storage positions of all data in the log volume group;
The data block change log stores the storage states of all data blocks of the virtual machine at the moment T by using a bitmap;
The log volume information comprises a data block offset and a data block time stamp time, wherein the data block offset is the offset of each data block stored in the log volume in an initialization volume, and the data block time stamp time is the time stamp written by the last data block stored in the log volume;
Step S204, receiving and judging the source data type of the production end, if the source data is basic configuration and synchronous data, updating the configuration information table firstly, and then updating the initialization volume and the metadata structure;
Step S205, finding an old log volume LV0 and its log volume information in the log volume group, traversing the data block offsets in the log volume information of the log volume LV0, and then overwriting the contents of all data blocks in the log volume LV0 into the initialization volume according to the data block offsets;
Step S206, searching a configuration information table according to the unique identification code of the target virtual machine to obtain bitmap data of the target virtual machine, and then synchronously traversing the initialization volume and the bitmap data to update the bitmap data;
Step S207, after all data of the log volume LV0 has been written into the initialization volume, deleting the log volume LV0 and its corresponding log volume information;
Step S208, modifying the mirror table, the log metadata file, the write operation log and the data block change log accordingly.
Unlike the above embodiment, the present embodiment provides a storage space reclamation policy: when the amount of stored data reaches the limit, the data contents of the virtual machine's log volume are merged into the initialization volume, and the related metadata is modified or deleted. This policy not only ensures long-term stable operation of the system, but also improves storage utilization, accommodates continuously growing data volumes, and enhances the scalability and durability of the system.
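Steps S205 to S208 could be sketched as follows under simplifying assumptions: volumes are held in memory, the bitmap is updated by marking each merged block as present, and only the mirror table flags are adjusted (pruning of the log metadata file, write operation log and data block change log is omitted). The names LogVolume and merge_oldest are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

BLOCK_SIZE = 4          # assumed block size, kept tiny for illustration
IN_LOG, IN_INIT = 0, 1  # mirror table data block flags


@dataclass
class LogVolume:
    name: str
    data: bytearray
    block_offsets: List[int]  # offset of each stored block in the initialization volume
    last_timestamp: int       # timestamp of the last data block written


def merge_oldest(init_volume: bytearray, bitmap: List[int],
                 group: List[LogVolume], mirror_flags: Dict[int, int]) -> str:
    """Merge the first (oldest) log volume into the initialization volume."""
    lv0 = group[0]
    for i, offset in enumerate(lv0.block_offsets):
        block = lv0.data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        init_volume[offset:offset + BLOCK_SIZE] = block  # S205: overwrite
        bitmap[offset // BLOCK_SIZE] = 1                 # S206: block now present
    del group[0]                                         # S207: drop LV0 and its info
    still_logged = {off // BLOCK_SIZE for lv in group for off in lv.block_offsets}
    for offset in lv0.block_offsets:                     # S208: fix mirror flags
        index = offset // BLOCK_SIZE
        if index not in still_logged:
            mirror_flags[index] = IN_INIT
    return lv0.name


init_volume = bytearray(12)
bitmap = [0, 0, 0]
group = [
    LogVolume("lv0", bytearray(b"AAAABBBB"), [0, 8], 10),
    LogVolume("lv1", bytearray(b"CCCC"), [8], 20),
]
mirror_flags = {0: IN_LOG, 2: IN_LOG}
merged = merge_oldest(init_volume, bitmap, group, mirror_flags)
```

Block 2 keeps its log flag in this example because a newer version of it still lives in the remaining log volume.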
Example 3
As shown in FIG. 5, there is provided a KVM data storage processing system, which is applied to a server program, including:
A configuration information table creation module 3001 for initializing and creating a configuration information table;
The data volume creation module 3002 is configured to initialize and create an initialization volume and a log volume group.
The metadata structure creation module 3003 is configured to initialize and create a metadata structure, where the metadata structure includes log volume information, a mirror table, a log metadata file, a write operation log, and a data block change log, the mirror table stores a latest state of the virtual machine and locates a log metadata file record of each data block, the log metadata file records each time point version of all data blocks of the virtual machine, and the data block change log uses bitmap to store storage states of all data blocks of the virtual machine at time T.
The judging and updating module 3004 is configured to receive and judge the source data type of the production end, update the configuration information table first if the source data is basic configuration and synchronization data, then update the initialization volume and the metadata structure, and directly update the log volume group and the metadata structure if the source data is a data change record.
Optionally, in the metadata structure creation module 3003, the log metadata file has a doubly linked list structure, and each log metadata file record includes a head pointer head and a tail pointer last, where the head pointer head points to the first modified data block version and is used to find the tail pointer last, and the tail pointer last points to the last modified data block version;
The mirror table comprises a data block flag and a log metadata offset log_meta, wherein the log metadata offset log_meta is the offset of log metadata of a data block recorded in a log metadata file, and the log metadata offset log_meta stores the address of a head pointer head.
Optionally, in the data volume creation module 3002, the log volume group includes a plurality of log volumes, each log volume correspondingly includes log volume information in a metadata structure;
The metadata structure creation module 3003 stores storage locations of all data in a log volume group in a log metadata file, wherein the log volume information includes a data block offset and a data block timestamp time, the data block offset is an offset of each data block stored in the log volume in an initialization volume, and the data block timestamp time is a timestamp written by a last data block stored in the log volume;
As also shown in FIG. 5, the system further comprises:
The write-in initial volume module 3005 is configured to find an old log volume LV0 and log volume information thereof in the log volume group, traverse a data block offset of the log volume information in the log volume LV0, and then overwrite all data block contents in the log volume LV0 into the initialization volume according to the data block offset;
The bitmap data updating module 3006 is configured to search the configuration information table according to the unique identifier of the target virtual machine, obtain bitmap data of the target virtual machine, and then perform synchronous traversal on the initialization volume and the bitmap data to update the bitmap data;
The log volume and log volume information deleting module 3007 is configured to delete the log volume LV0 and its corresponding log volume information after all data of the log volume LV0 has been written into the initialization volume;
A metadata modification module 3008, configured to modify the mirror table, the log metadata file, the write operation log, and the data block change log accordingly.
As shown in FIG. 6, optionally, the judging and updating module 3004 includes:
a receiving unit 30041, configured to receive source data of a production end;
The judging unit 30042 is configured to judge the type of the source data: if the source data is configuration information, bitmap data or hole-free synchronization data, the basic configuration and synchronization data storage step is executed; if the source data is a data change record, the change data storage step is executed, where the data change record includes a unique identification code, a timestamp, an offset, a data length, and data content;
The basic configuration and synchronization data storage unit 30043 is configured to, for the configuration information, generate a configuration information table record according to the received configuration information, add it to the configuration information table, and generate the data volume and metadata of the target virtual machine according to the generated record; and, for the hole-free synchronization data, first convert the hole-free synchronization data together with the bitmap data to obtain the initialization synchronization data, and then write the initialization synchronization data into the data volume;
The change data storage unit 30044 is configured to look up the metadata and the data volume of the virtual machine according to the unique identification code in the data change record, write the data content of the record into the log volume group, and then update the write operation log, the mirror table, the log metadata file and the data block change log.
According to the embodiment, the initialization volume and the log volume group are fused, data management is realized by means of the metadata structure, an efficient data tracking and updating framework is constructed, basic configuration and data synchronization flow are optimized, and meanwhile, data consistency and integrity are guaranteed through a dynamic updating mechanism.
Example 4
As shown in FIG. 7, a continuous data point recovery method is provided, which uses the KVM data storage processing method described in embodiment 1 or 2 and includes the steps of:
Step S401, initializing and creating a configuration information table;
Step S402, initializing and creating an initialization volume and a log volume group;
Step S403, initializing and creating a metadata structure, wherein the metadata structure comprises log volume information, a mirror table, a log metadata file, a write operation log and a data block change log, wherein the mirror table stores the latest state of the virtual machine and locates the log metadata file record of each data block, and the log metadata file records each time point version of all the data blocks of the virtual machine;
Step S404, receiving and judging the source data type of the production end, if the source data is basic configuration and synchronous data, updating the configuration information table firstly, and then updating the initialization volume and the metadata structure;
Step S405, receiving a recovery instruction and creating a recovery volume;
Step S406, judging whether the recovery time point M0 is greater than or equal to the latest time point H0; if so, executing step S407; otherwise, executing step S408;
Step S407, traversing the mirror table to determine whether the latest content of each data block is in the log volume group or the initialization volume, and then reading the data from the log volume group or the initialization volume and writing it into the recovery volume, wherein data in the log volume group is read through the log metadata file;
Step S408, obtaining the bitmap at the moment M0 from the data block change log, traversing the bitmap to determine whether the content of each data block is in the log volume group or the initialization volume, and reading the data from the log volume group or the initialization volume and writing it into the recovery volume, wherein data in the log volume group is located by combining the mirror table and the log metadata file, and data in the initialization volume is located through the mirror table;
Step S409, when the internal data of the virtual machine recovered at the latest time point or at any time point is not in a logically consistent state, locating the data block positions before the recovery time point M0 through the write operation log and the mirror table, determining the last valid modified content of each data block by tracing the log metadata file in reverse, and then reading the corresponding data from the initialization volume or the log volume according to that content and overwriting it into the recovery volume.
It will be appreciated that before recovery the server needs to create a recovery volume; typically the recovery volume is filled with zeros and has the same size as the initialization volume.
The latest time point H0 refers to the time point when the system successfully records the data change last time, and is the snapshot time point of the latest data state in the system, which represents the latest version of the current data.
The recovery time point M0 refers to a target time point to which it is desired to recover data, and may be any past moment, and it may be desired to recover data to this particular historical state due to data loss, corruption, or error.
In the present embodiment, if the recovery time point M0 is equal to or greater than the latest time point H0, this means that the user wishes to recover to a state containing the latest data, and thus the latest time point recovery step is performed. If the recovery time point M0 is smaller than the latest time point H0, the user wishes to recover to an earlier state, and thus an arbitrary time point recovery step is performed.
It will be appreciated that the data block flag of the mirror table has three values: 0 indicates that the latest content of the data block is in the log volume group, 1 indicates that the latest content of the data block is in the initialization volume, and 2 indicates that the data block is invalid, i.e., the block is unallocated or its data is all zeros. The log metadata offset log_meta is the offset of the log metadata record of the data block in the log metadata file, and stores the address of the head pointer head.
In this embodiment, for recovery at the latest time point, the mirror table is traversed and the data block flag field of each record is checked to determine whether the latest content of the data block is in the log volume group or in the initialization volume. If the data block flag is 2, the block can be skipped directly. If the data block flag is 1, the data can be read from the initialization volume directly according to the offset of the record in the mirror table, because that offset is in fact the offset of the data block in the initialization volume. If the data block flag is 0, the log metadata file, which records all time point versions of all data blocks of the virtual machine, is consulted and the log volume is read through it. More specifically, the head node in the log metadata file is located according to the log metadata offset log_meta, the last node is found through the pre pointer of the head node, the log volume file name (log_volume) and the in-volume offset (offset) are obtained from its data_info structure, and the data is read from the log volume according to that file name and offset.
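The latest time point recovery can be illustrated with the in-memory sketch below. It is a simplification under stated assumptions: each block's versions are kept in a list ordered oldest to newest in place of the doubly linked log metadata record (so the last list element plays the role of the last node), volumes are byte strings, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 4                      # assumed block size, kept tiny for illustration
IN_LOG, IN_INIT, INVALID = 0, 1, 2  # mirror table data block flags

# One version of a block: (timestamp, log volume file name, offset in that volume)
Version = Tuple[int, str, int]


def recover_latest(mirror_flags: List[int],
                   versions: Dict[int, List[Version]],
                   init_volume: bytes,
                   log_volumes: Dict[str, bytes]) -> bytearray:
    # The recovery volume starts zero-filled and matches the initialization volume size.
    recovery = bytearray(len(init_volume))
    for block, flag in enumerate(mirror_flags):
        start = block * BLOCK_SIZE
        if flag == INVALID:
            continue  # unallocated or all-zero block: nothing to copy
        if flag == IN_INIT:
            # The mirror table position doubles as the offset in the initialization volume.
            recovery[start:start + BLOCK_SIZE] = init_volume[start:start + BLOCK_SIZE]
        else:
            _, name, off = versions[block][-1]  # newest version of this block
            recovery[start:start + BLOCK_SIZE] = log_volumes[name][off:off + BLOCK_SIZE]
    return recovery


restored = recover_latest(
    mirror_flags=[IN_INIT, IN_LOG, INVALID],
    versions={1: [(10, "lv0", 0), (20, "lv0", 4)]},
    init_volume=b"AAAAxxxx\x00\x00\x00\x00",
    log_volumes={"lv0": b"BBBBCCCC"},
)
```

Block 1 is taken from its newest logged version, while the stale copy of it in the initialization volume is ignored.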
In this embodiment, for recovery at any time point, the data block change log uses bitmaps to store the storage states of all data blocks of the virtual machine at given moments, so when rolling back to any time point the type of data volume in which each data block is stored can be queried directly from the bitmap taken after the rollback time, and the record closest to the user-specified time point M0 can be located quickly by binary search. Once the record closest to M0 is determined, the state of each data block can be checked quickly through the bitmap: if the value is 2, the data block is invalid and is skipped; if the value is 1, the latest data of the data block is in the initialization volume; and if the value is 0, the data may be in the log volume group. In the last case, the mirror table record is located according to the offset in the bitmap, the head node in the log metadata file is located according to the log metadata offset log_meta, and the log metadata record is traversed in reverse order to find the first node whose timestamp is less than or equal to M0; the log volume file name and offset are then obtained from that node and the data is read from the log volume. If no node satisfies the condition, the data is in the initialization volume.
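A sketch of the arbitrary time point recovery follows. It interprets "the record closest to M0" as the first bitmap taken at or after M0 and falls back to the current mirror table flags when no such bitmap exists; both choices are assumptions, as are the in-memory structures and names. Python's bisect module provides the binary search.

```python
import bisect
from typing import Dict, List, Tuple

BLOCK_SIZE = 4                      # assumed block size, kept tiny for illustration
IN_LOG, IN_INIT, INVALID = 0, 1, 2  # data block flag values stored in each bitmap

Version = Tuple[int, str, int]      # (timestamp, log volume file name, offset)


def recover_at(m0: int,
               change_log: List[Tuple[int, List[int]]],  # (timestamp, bitmap), sorted
               mirror_flags: List[int],
               versions: Dict[int, List[Version]],
               init_volume: bytes,
               log_volumes: Dict[str, bytes]) -> bytearray:
    timestamps = [t for t, _ in change_log]
    i = bisect.bisect_left(timestamps, m0)  # first bitmap taken at or after m0
    flags = change_log[i][1] if i < len(change_log) else mirror_flags
    recovery = bytearray(len(init_volume))
    for block, flag in enumerate(flags):
        start = block * BLOCK_SIZE
        if flag == INVALID:
            continue
        source = init_volume[start:start + BLOCK_SIZE]
        if flag == IN_LOG:
            # Reverse traversal: newest version whose timestamp is <= m0.
            for ts, name, off in reversed(versions.get(block, [])):
                if ts <= m0:
                    source = log_volumes[name][off:off + BLOCK_SIZE]
                    break
            # If no version qualifies, the initialization volume copy is kept.
        recovery[start:start + BLOCK_SIZE] = source
    return recovery


state = dict(
    change_log=[(20, [IN_INIT, IN_LOG]), (40, [IN_INIT, IN_LOG])],
    mirror_flags=[IN_INIT, IN_LOG],
    versions={1: [(10, "lv0", 0), (30, "lv0", 4)]},
    init_volume=b"AAAABBBB",
    log_volumes={"lv0": b"CCCCDDDD"},
)
at_25 = recover_at(25, **state)
at_5 = recover_at(5, **state)
```

At time 25 block 1 resolves to the version written at time 10; at time 5 no logged version qualifies, so the initialization volume copy is used.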
In addition, gradual rollback recovery is a method of ensuring that data is restored to a specified historical state, ensuring consistency and integrity of the data by gradually backing back to the most recent consistent state.
In this embodiment, the write operation log records all write operations and the mirror table provides the current state and position information of the data blocks; combining the two makes it possible to determine the last modified content of each data block. The log metadata file records every time point version of all data blocks of the virtual machine, so the data state at a specified time point can be found by reverse traversal. Through gradual rollback and overwriting, the recovered data can be made consistent with the state at the specified time point, and even if the recovered data is inconsistent it can be corrected by rolling back further.
More specifically, binary search is used to locate the latest write operation record whose timestamp is less than or equal to the recovery time point M0. The mirror table records are then looked up according to the start block number and the number of blocks in that write operation record, and the head node in the log metadata file is determined for each data block. The log metadata record is traversed in reverse order to find the node whose timestamp is less than or equal to M0, which determines the last modified content of the data block.
If the previous node of the target node is null, the data is stored in the initialization volume; if the previous node is not null, the data is stored in the log volume. If the data is stored in the initialization volume, it is read from the initialization volume directly according to the offset in the mirror table; if the data is stored in the log volume, it is read from the log volume according to the log volume file name and offset in the data_info structure of the previous node. The data read from the initialization volume or the log volume is then overwritten into the recovery volume according to the offset in the mirror table. If the internal data of the recovered virtual machine is still inconsistent, the rollback continues and the operation is repeated.
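One way to picture the gradual rollback is the sketch below, which undoes one write operation per step until a caller-supplied consistency check passes. The check is_consistent is a hypothetical callback, per-block version lists replace the doubly linked records, timestamps are assumed to be unique integers, and moving the time point to just before the undone write is an assumed way of "continuing to roll back".

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple

BLOCK_SIZE = 4                   # assumed block size, kept tiny for illustration
Version = Tuple[int, str, int]   # (timestamp, log volume file name, offset)
WriteOp = Tuple[int, int, int]   # (timestamp, start block, block count)


def rollback_step(m0: int, write_log: List[WriteOp],
                  versions: Dict[int, List[Version]],
                  init_volume: bytes, log_volumes: Dict[str, bytes],
                  recovery: bytearray) -> Optional[int]:
    """Undo the latest write at or before m0; return its timestamp, or None."""
    timestamps = [op[0] for op in write_log]
    i = bisect.bisect_right(timestamps, m0) - 1  # latest record with timestamp <= m0
    if i < 0:
        return None
    ts, start_block, count = write_log[i]
    for block in range(start_block, start_block + count):
        history = versions.get(block, [])
        # Index of the target node: newest version with timestamp <= m0.
        target = max((k for k, v in enumerate(history) if v[0] <= m0), default=None)
        start = block * BLOCK_SIZE
        if target is None or target == 0:
            # No previous node: the earlier content lives in the initialization volume.
            data = init_volume[start:start + BLOCK_SIZE]
        else:
            _, name, off = history[target - 1]  # data_info of the previous node
            data = log_volumes[name][off:off + BLOCK_SIZE]
        recovery[start:start + BLOCK_SIZE] = data
    return ts


def gradual_rollback(m0: int, write_log, versions, init_volume, log_volumes,
                     recovery: bytearray,
                     is_consistent: Callable[[bytearray], bool]) -> bytearray:
    while not is_consistent(recovery):
        ts = rollback_step(m0, write_log, versions, init_volume, log_volumes, recovery)
        if ts is None:
            break        # nothing left to undo
        m0 = ts - 1      # continue from just before the undone write
    return recovery


write_log = [(10, 1, 1), (20, 0, 2)]
versions = {0: [(20, "lv0", 4)], 1: [(10, "lv0", 0), (20, "lv0", 8)]}
logs = {"lv0": b"CCCCDDDDEEEE"}
result = gradual_rollback(25, write_log, versions, b"AAAABBBB", logs,
                          bytearray(b"DDDDEEEE"),
                          lambda vol: bytes(vol) == b"AAAACCCC")
```

In this example a single step undoes the write at time 20, after which the toy consistency check passes and the loop stops.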
This embodiment supports several recovery methods, including recovery at the latest time point, recovery at any time point and gradual rollback recovery, and can therefore meet different service requirements. When data loss or an error occurs, the user can select the most suitable recovery scheme according to the actual situation; by traversing the mirror table and the log metadata file, the required data can be recovered accurately from the initialization volume and the log volume group, ensuring quick recovery under different fault conditions.
Any combination of one or more computer readable media may be employed for storing embodiments of the present invention. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The computer program code for carrying out operations of the present invention may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++, Ruby and Go, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.