Disclosure of Invention
The invention aims to provide an input/output processing method, an input/output processing device, input/output processing equipment and a medium, so as to solve the technical problems of disordered input/output flow and inconsistent data.
In order to solve the above technical problems, the present invention provides an input/output processing method, including:
acquiring, in a stage of creating a data protection center, a first number of sites mapped to a host, wherein the data protection center comprises at least a local site and a remote site;
when the data protection center is started to protect data, determining that the start fails if the first number is detected to be greater than 1, and determining that the start succeeds if the first number is detected to be less than or equal to 1;
after the start succeeds, when the first number is equal to 1, taking the site mapped to the host as a target site, and receiving, by the target site, an input/output request issued by the host;
when the first number is equal to 0, adding a site mapped to the host and acquiring a second number of sites mapped to the host;
and, when the second number is detected to be equal to 1, returning to the step of taking the site mapped to the host as a target site and receiving, by the target site, an input/output request issued by the host.
In one aspect, acquiring the first number of sites mapped to the host in the stage of creating the data protection center includes:
determining, in the stage of creating the data protection center, the first number of sites mapped to the host according to a parameter characterizing the number of sites mapped to the host;
wherein, in the stage of creating the data protection center, the parameter characterizing the number of sites mapped to the host holds an invalid value while the data volumes are created and mapped to the corresponding sites of the data protection center, while synchronous replication between the local sites is created, and while asynchronous replication between the local site and the remote site is created; and, in the stage of creating the data protection center, the value of the parameter is incremented by 1 when a site is detected to be mapped to the host.
In another aspect, the step of creating the data protection center further includes:
Setting the input/output request jump directions of the local site and the remote site;
before starting the data protection center to protect data, the method further comprises:
acquiring an input/output request transmission direction;
judging whether the jump direction of the input/output request is the same as the transmission direction of the input/output request;
if yes, starting the data protection center to protect data;
if not, outputting prompt information for representing that the data protection center is prohibited to start to protect the data, and controlling the jump direction of the input/output request to be the same as the transmission direction of the input/output request.
In another aspect, the local site comprises a first site and a second site that form a dual-active network topology, and the remote site comprises a backup logical storage unit whose attributes are consistent with those of the data logical storage unit in the second site;
controlling the input/output request jump direction to be the same as the input/output request transmission direction includes:
when the input/output request transmission direction is detected to be forward transmission, reserving the input/output request jump from the data logical storage unit in the second site to the backup logical storage unit in the remote site, and keeping that jump after the data protection center is started to protect data, wherein forward transmission means that the input/output request is transmitted from the first site to the second site;
if the input/output request transmission direction is detected to change from forward transmission to reverse transmission, suspending the data protection center from protecting data, wherein reverse transmission means that the input/output request is transmitted from the remote site to the second site;
between suspending the data protection center and restarting the data protection center to protect data, deleting the input/output request jump relationship from the data logical storage unit in the second site to the backup logical storage unit in the remote site;
and establishing an input/output request jump relationship from the backup logical storage unit in the remote site to the data logical storage unit in the second site.
In another aspect, after keeping the input/output request jump from the data logical storage unit in the second site to the backup logical storage unit in the remote site once the data protection center is started to protect data, the method further includes:
after detecting that a write input/output request reaches the data logical storage unit in the second site, jumping to the backup logical storage unit in the remote site to modify a bitmap;
and, after the bitmap of the backup logical storage unit has been modified, returning to the data logical storage unit in the second site to write the data, wherein the bitmap represents the difference between the data in the data logical storage unit of the second site and the data in the backup logical storage unit of the remote site.
In another aspect, after establishing the input/output request jump relationship from the backup logical storage unit in the remote site to the data logical storage unit in the second site, the method further includes:
after the data protection center is started to protect data, replicating the received input/output request of the host by using the backup logical storage unit in the remote site;
jumping to the data logical storage unit in the second site to perform dual writing with the first site;
and, after detecting that the dual writing is complete, returning to the backup logical storage unit of the remote site to modify the bitmap, so that the difference between the data in the data logical storage unit of the second site and the data in the backup logical storage unit of the remote site is 0.
In another aspect, the method further comprises:
when reverse transmission is detected, if one of the first site and the second site fails, identifying the normal site among the first site and the second site;
and restricting the switching of the primary end of the dual-active relationship to the normal site, and keeping the remote site receiving the input/output requests issued by the host, wherein the primary end of the dual-active relationship is the site that receives the input/output requests of the host.
In order to solve the above technical problem, the present invention further provides an input/output processing device, including:
a creation and acquisition module, configured to acquire, in a stage of creating a data protection center, a first number of sites mapped to a host, wherein the data protection center comprises at least a local site and a remote site;
a start module, configured to start the data protection center to protect data, to determine that the start fails when the first number is detected to be greater than 1, and to determine that the start succeeds when the first number is detected to be less than or equal to 1;
a judging module, configured to judge, after the start succeeds, whether the first number is equal to 0, to trigger the receiving module if not, and to trigger the adding and detecting module if so;
a receiving module, configured to take the site mapped to the host as a target site and to receive, by the target site, an input/output request issued by the host;
an adding and detecting module, configured to add a site mapped to the host, to acquire a second number of sites mapped to the host, and to trigger the receiving module again when the second number is detected to be equal to 1.
In order to solve the above technical problem, the present invention further provides an input/output processing apparatus, including:
a memory for storing a computer program;
And the processor is used for realizing the steps of the input/output processing method when executing the computer program.
In order to solve the above technical problem, the present invention further provides a computer readable storage medium, where a computer program is stored, where the computer program implements the steps of the input/output processing method described above when executed by a processor.
The input/output processing method provided by the invention acquires a first number of sites mapped to a host in the stage of creating a data protection center. When the data protection center is started to protect data, the start fails if the first number is greater than 1 and succeeds if the first number is less than or equal to 1. After a successful start, if the first number is equal to 1, the site mapped to the host is taken as a target site and receives the input/output requests issued by the host; if the first number is equal to 0, a site mapped to the host is added and a second number of sites mapped to the host is acquired, and when the second number is detected to be equal to 1, the site mapped to the host is taken as the target site and receives the input/output requests issued by the host.
The method has the following advantages. First, a site mapped to the host receives the input/output requests issued by the host only when the number of sites mapped to the host is equal to 1; that is, only one site carries the host's service at any given time, which avoids disordered input/output request flow and preserves data consistency among the sites as far as possible. Second, the number of sites mapped to the host is checked in the stage of creating the data protection center, when the data protection center is started to protect data, and after it has been started successfully, so that throughout the use of the data protection center only one site mapped to the host receives the host's input/output requests, which further avoids disordered input/output request flow. Third, when the first number acquired in the creation stage is 0, a site mapped to the host is added after the data protection center is started successfully, so that the input/output requests issued by the host can be received and processed as far as possible.
In addition, the input/output request jump direction is set in the stage of creating the data protection center, the input/output request transmission direction is acquired before the data protection center is started to protect data, and the data protection center is started to protect data only when the jump direction is judged to be the same as the transmission direction. That is, the input/output request jump direction and the input/output request transmission direction are bound to each other, which avoids disordered input/output request flow.
When the input/output request jump direction is judged to differ from the input/output request transmission direction, the jump direction is controlled to be the same as the transmission direction, so that the input/output request flow is kept from becoming disordered as far as possible.
After a write input/output request is detected to reach the data logical storage unit in the second site, the flow jumps to the backup logical storage unit in the remote site to modify the bitmap and, once the bitmap has been modified, returns to the data logical storage unit in the second site to write the data; during reverse transmission, after the dual writing is detected to be complete, the flow returns to the backup logical storage unit of the remote site to modify the bitmap. In this way the data in the data logical storage unit of the second site is kept identical to the data in the backup logical storage unit of the remote site, which ensures data consistency.
For the dual-active sites, when one site fails, the switching of the primary end of the dual-active relationship to the normal site is restricted, which prevents the primary end of the dual-active relationship and the remote site from receiving input/output requests issued by the host at the same moment, so that only the remote site receives the input/output requests issued by the host. This prevents disordered input/output request flow caused by a fault or by multiple sites carrying the host's service simultaneously, and eliminates the risk of data inconsistency.
In addition, the invention also provides an input/output processing device, an input/output processing equipment and a computer readable storage medium, which have the same or corresponding technical characteristics as the input/output processing method, and the effects are the same as the above.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without making any inventive effort are within the scope of the present invention.
The invention provides an input/output processing method, an input/output processing device, input/output processing equipment and a medium, which aim to solve the technical problems of disordered input/output flow and inconsistent data.
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. Fig. 1 is a flowchart of an input/output processing method according to an embodiment of the present invention, as shown in fig. 1, where the method includes:
S10, in the stage of creating a data protection center, acquiring a first number of sites mapped to a host,
wherein the data protection center comprises at least a local site and a remote site;
S11, when the data protection center is started to protect data, determining that the start fails if the first number is detected to be greater than 1, and determining that the start succeeds if the first number is detected to be less than or equal to 1;
S12, after the start succeeds, judging whether the first number is equal to 0; if not, proceeding to step S13, and if so, proceeding to step S14;
S13, taking the site mapped to the host as a target site, and receiving, by the target site, an input/output request issued by the host;
S14, adding a site mapped to the host and acquiring a second number of sites mapped to the host; when the second number is detected to be equal to 1, taking the site mapped to the host as the target site, and receiving, by the target site, an input/output request issued by the host.
Current disaster recovery technology mainly comprises local disaster recovery and remote disaster recovery. A local disaster recovery site generally has service processing capacity equivalent to that of the production site, and the two can be switched smoothly without losing data, so that the service keeps running; however, this places high demands on the network link, which limits the distance between the local disaster recovery site and the production site. Remote disaster recovery protects the production site's data over a longer distance, but some data is inevitably lost when a disaster occurs. 3DC combines the advantages of both: while the remote disaster recovery center retains full disaster takeover capability, a local disaster recovery center is established as a synchronous data mirror site with service takeover capability. 3DC can respond quickly to disasters of different scales, restore service, reduce data loss, and achieve a better recovery point objective (Recovery Point Objective, RPO) and recovery time objective (Recovery Time Objective, RTO).
While the 3DC policy offers the advantages of both local and remote disaster recovery, it also places stricter requirements on IO issuing and flow direction within the workflow. Because both the production center and the disaster recovery centers under the 3DC policy are able to receive services, the IO flow becomes disordered and data may become inconsistent when services are issued to several sites simultaneously through user misoperation, or when a failed production site that still carries the host mapping of the original production service starts the 3DC failover process.
Therefore, in the input/output processing method provided by the embodiment of the invention, the IO issuing is limited, the disorder of IO flow caused by faults or misoperation is prevented, and the possibility of inconsistent data is eliminated.
First, in the stage of creating the data protection center, a first number of sites mapped to the host is acquired. The manner of acquiring the number of sites mapped to the host (including the first number and the second number mentioned below) is not limited. In one embodiment, acquiring the first number of sites mapped to the host in the stage of creating the data protection center includes:
determining, in the stage of creating the data protection center, the first number of sites mapped to the host according to a parameter characterizing the number of sites mapped to the host;
wherein, in the stage of creating the data protection center, the parameter characterizing the number of sites mapped to the host holds an invalid value while the data volumes are created and mapped to the corresponding sites of the data protection center, while synchronous replication between the local sites is created, and while asynchronous replication between the local site and the remote site is created; and, in the stage of creating the data protection center, the value of the parameter is incremented by 1 when a site is detected to be mapped to the host.
To avoid disordered IO flow, the embodiment of the invention ensures that only one site receives the IO issued by the host at any given time. When the data protection center is started to protect data, the start is determined to fail if the first number is detected to be greater than 1, and to succeed if the first number is detected to be less than or equal to 1. After a successful start, if the first number is 1, the site mapped to the host is taken as the target site and receives the input/output requests issued by the host. If the first number is 0, a site mapped to the host is added and a second number of sites mapped to the host is acquired; if the second number is detected to be 1, the site mapped to the host is taken as the target site and receives the input/output requests issued by the host.
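The start-up decision described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class `ProtectionCenter`, the exception `StartFailed`, and the `site_to_add` argument are hypothetical names that do not appear in the embodiment, and sites are represented simply by their names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class StartFailed(Exception):
    """Raised when more than one site is mapped to the host at start time."""


@dataclass
class ProtectionCenter:
    # Names of the sites currently mapped to the host (hypothetical model).
    mapped_sites: List[str] = field(default_factory=list)
    started: bool = False

    def start(self, site_to_add: Optional[str] = None) -> Optional[str]:
        """Start protection and return the target site that receives host IO."""
        first_number = len(self.mapped_sites)
        if first_number > 1:
            # S11: more than one mapped site, so the start fails.
            raise StartFailed(f"{first_number} sites are mapped to the host")
        self.started = True
        if first_number == 1:
            # S13: the only mapped site becomes the target site.
            return self.mapped_sites[0]
        # S14: no site is mapped yet, so add one and re-check the count.
        if site_to_add is None:
            return None  # assumption: without a mapped site, nothing receives host IO
        self.mapped_sites.append(site_to_add)
        second_number = len(self.mapped_sites)
        return self.mapped_sites[0] if second_number == 1 else None
```

In this sketch, a center created with exactly one mapped site returns that site from `start()`, a center with no mapped site returns the newly added site, and a center with two mapped sites raises `StartFailed`.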
The above procedure will be described below taking a 3DC scenario as an example.
In 3DC, the relationship between the production center and the local disaster recovery center is either dual-active or synchronous replication, and the relationship with the remote disaster recovery center is periodic asynchronous replication. According to the topological relationship of the primary ends of the two remote copies during 3DC forward transmission, the architectures are divided into serial and parallel, so 3DC currently has the following four networking architectures in total:
The serial networking architectures are synchronous + periodic asynchronous (A -> B -> C) and dual-active + periodic asynchronous (A <-> B -> C);
The parallel networking architectures are synchronous + periodic asynchronous (C <- A -> B) and dual-active + periodic asynchronous (C <- A <-> B);
(1) Synchronous + periodic asynchronous serial architecture:
Disk array A is deployed in the service production center and disk array B in the local disaster recovery center. The two are interconnected through a Fibre Channel (FC) link and a synchronous remote copy is established, so that data received by disk array A is synchronized to disk array B in real time. Disk array C is deployed in the remote disaster recovery center and a periodic asynchronous remote copy is established with disk array B, so that the data of disk array B is synchronized to disk array C periodically.
(2) Dual-active + periodic asynchronous serial architecture:
The architecture consists of two production centers and a remote disaster recovery center. Disk array A and disk array B are deployed in the two local service production centers respectively and are interconnected through an FC link to build a dual-active topology. Disk array C is deployed in the remote disaster recovery center and a periodic asynchronous remote copy is established with disk array B, so that the data of disk array B is synchronized to disk array C periodically.
(3) Synchronous + periodic asynchronous parallel architecture:
Disk array A is deployed in the service production center and disk array B in the local disaster recovery center; the two are interconnected through an FC link and a synchronous remote copy is established. Disk array C is deployed in the remote disaster recovery center and a periodic asynchronous remote copy is established with disk array A.
(4) Dual-active + periodic asynchronous parallel architecture:
The architecture consists of two production centers and a remote disaster recovery center. Disk array A and disk array B are deployed in the two local service production centers respectively and are interconnected through an FC link to build a dual-active topology. Disk array C is deployed in the remote disaster recovery center and a periodic asynchronous remote copy is established with disk array A.
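As a compact way to compare the four architectures, the sketch below records each one as a list of replication links between the disk arrays A, B and C. The enum `Mode`, the dictionary `ARCHITECTURES`, its keys, and the helper functions are hypothetical names chosen for illustration; the only facts encoded are the links described in items (1) to (4).

```python
from enum import Enum


class Mode(Enum):
    SYNC = "synchronous"
    DUAL_ACTIVE = "dual-active"
    PERIODIC_ASYNC = "periodic asynchronous"


# Each architecture is a list of (source array, target array, replication mode).
ARCHITECTURES = {
    "serial sync + periodic async": [
        ("A", "B", Mode.SYNC), ("B", "C", Mode.PERIODIC_ASYNC)],
    "serial dual-active + periodic async": [
        ("A", "B", Mode.DUAL_ACTIVE), ("B", "C", Mode.PERIODIC_ASYNC)],
    "parallel sync + periodic async": [
        ("A", "B", Mode.SYNC), ("A", "C", Mode.PERIODIC_ASYNC)],
    "parallel dual-active + periodic async": [
        ("A", "B", Mode.DUAL_ACTIVE), ("A", "C", Mode.PERIODIC_ASYNC)],
}


def async_source(name: str) -> str:
    """Return the array that feeds remote array C in the named architecture."""
    for source, _target, mode in ARCHITECTURES[name]:
        if mode is Mode.PERIODIC_ASYNC:
            return source
    raise ValueError(f"{name} has no periodic asynchronous link")


def is_serial(name: str) -> bool:
    """Serial architectures replicate to C from B; parallel ones from A."""
    return async_source(name) == "B"
```

The distinction between serial and parallel then reduces to which local array is the primary end of the periodic asynchronous remote copy.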
The 3DC policy includes a remote copy from the logical storage unit of A to the logical storage unit of B and a remote copy from the incremental data storage unit of B to the logical storage unit of C. A parameter (site_3dc_count) is added to the remote copy to characterize the number of sites mapped to hosts in the 3DC policy; for sites that do not belong to a 3DC policy, the parameter is 0xFFFF (i.e., an invalid value). The number of sites mapped to the host is determined from changes in the value of site_3dc_count.
Fig. 2 is a flowchart of a method for creating a 3DC policy after adding parameters according to an embodiment of the present invention, as shown in fig. 2, where the method includes:
S15, creating volumes;
S16, creating dual-active or synchronous replication;
S17, creating periodic asynchronous replication;
S18, setting the input/output jump directions of the synchronous replication and the periodic asynchronous replication;
S19, mapping the primary volume of the synchronous replication to the host;
S20, starting the data protection policy of the data protection center.
It should be noted that site_3dc_count is 0xFFFF while dual-active or synchronous replication is created, while periodic asynchronous replication is created, and while the input/output jump directions of the synchronous replication and the periodic asynchronous replication are set. After the primary volume of the synchronous replication is mapped to the host, site_3dc_count = 1.
1) When the 3DC policy is created, the parameter is initialized to 0; the virtual disks (vdisks) of the synchronous/dual-active remote copy and of the periodic asynchronous remote copy in the 3DC policy are then traversed, and the parameter is incremented by 1 for each host mapping found.
2) When the 3DC policy is started, the parameter site_3dc_count is queried for each remote copy in the 3DC policy; if site_3dc_count > 1, the start fails and an error is reported.
3) After the 3DC policy has been started successfully, if a host mapping is to be added to a vdisk in the 3DC policy, the parameter site_3dc_count of the remote copy corresponding to that vdisk is queried; if site_3dc_count = 1, adding the host mapping fails.
By checking host mappings at key points of the flow, such as creating a host mapping, creating the 3DC policy, and starting the 3DC policy, it is ensured that only one site holds a host mapping before and after the production service center changes.
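The three checks on site_3dc_count can be illustrated with the following minimal sketch. Only the parameter name `site_3dc_count` and the invalid value 0xFFFF come from the description; the class `RemoteCopy3DC`, its method names, and the dictionary passed to `create_policy` are hypothetical and stand in for whatever the storage system actually uses.

```python
INVALID = 0xFFFF  # value of site_3dc_count outside any 3DC policy


class RemoteCopy3DC:
    """Hypothetical holder of the site_3dc_count parameter for one 3DC policy."""

    def __init__(self):
        self.site_3dc_count = INVALID
        self.started = False

    def create_policy(self, vdisk_has_host_mapping):
        """Check 1: initialise to 0, then add 1 per vdisk that maps a host."""
        self.site_3dc_count = 0
        for mapped in vdisk_has_host_mapping.values():
            if mapped:
                self.site_3dc_count += 1

    def start_policy(self):
        """Check 2: the start fails when more than one site maps a host."""
        if self.site_3dc_count == INVALID:
            raise RuntimeError("remote copy is not part of a 3DC policy")
        if self.site_3dc_count > 1:
            raise RuntimeError("start failed: site_3dc_count > 1")
        self.started = True

    def add_host_mapping(self):
        """Check 3: after a successful start, reject a second host mapping."""
        if self.site_3dc_count == INVALID:
            raise RuntimeError("remote copy is not part of a 3DC policy")
        if self.started and self.site_3dc_count == 1:
            return False
        self.site_3dc_count += 1
        return True
```

For example, a policy created with no mapped vdisk can be started and then accepts exactly one host mapping; a second attempt returns `False`.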
The method provided by this embodiment has the following effects. First, a site mapped to the host receives the input/output requests issued by the host only when the number of sites mapped to the host is equal to 1; that is, only one site carries the host's service at any given time, which avoids disordered input/output request flow and preserves data consistency among the sites as far as possible. Second, the number of sites mapped to the host is checked in the stage of creating the data protection center, when the data protection center is started to protect data, and after it has been started successfully, so that throughout the use of the data protection center only one site mapped to the host receives the host's input/output requests, which further avoids disordered input/output request flow and ensures data consistency. Third, when the first number acquired in the creation stage is 0, a site mapped to the host is added after the data protection center is started successfully and receives the input/output requests issued by the host, so that those requests can be processed as far as possible.
In addition to ensuring, by querying the host service mappings of the sites, that only one site holds a host mapping (which avoids disordered IO flow and eliminates the risk of data inconsistency), this embodiment also avoids disordered IO flow by restricting changes to the IO flow direction. In one implementation, the stage of creating the data protection center further includes:
setting the input/output request jump directions of the local site and the remote site, as in step S18 in Fig. 2.
Before the data protection center is started to protect data, the method further includes:
acquiring the input/output request transmission direction;
judging whether the input/output request jump direction is the same as the input/output request transmission direction;
if yes, starting the data protection center to protect data;
if not, outputting prompt information indicating that the data protection center is prohibited from starting to protect data, and controlling the input/output request jump direction to be the same as the input/output request transmission direction.
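The pre-start check on the two directions can be sketched as follows. The enum `Direction` and the function `try_start`, together with its return shape and message strings, are hypothetical and serve only to make the branch structure explicit.

```python
from enum import Enum


class Direction(Enum):
    FORWARD = "forward"
    REVERSE = "reverse"


def try_start(jump_direction: Direction, transmission_direction: Direction):
    """Return (started, jump direction after the check, prompt message)."""
    if jump_direction is transmission_direction:
        # Directions agree, so the data protection center may be started.
        return True, jump_direction, "data protection started"
    # Directions differ: prohibit the start and realign the jump direction
    # with the transmission direction.
    return False, transmission_direction, (
        "start prohibited: jump direction realigned to the transmission direction")
```

A caller would retry the start after a mismatch, at which point the realigned jump direction matches the transmission direction and the first branch is taken.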
The local site comprises a first site and a second site that form a dual-active network topology, and the remote site comprises a backup logical storage unit whose attributes are consistent with those of the data logical storage unit in the second site;
controlling the input/output request jump direction to be the same as the input/output request transmission direction includes:
when the input/output request transmission direction is forward transmission, reserving the input/output request jump from the data logical storage unit in the second site to the backup logical storage unit in the remote site, and keeping that jump after the data protection center is started to protect data, wherein forward transmission means that the input/output request is transmitted from the first site to the second site;
after keeping the input/output request jump from the data logical storage unit in the second site to the backup logical storage unit in the remote site once the data protection center is started to protect data, the method further includes:
after detecting that a write input/output request reaches the data logical storage unit in the second site, jumping to the backup logical storage unit in the remote site to modify a bitmap;
and, after the bitmap of the backup logical storage unit has been modified, returning to the data logical storage unit in the second site to write the data, wherein the bitmap represents the difference between the data in the data logical storage unit of the second site and the data in the backup logical storage unit of the remote site.
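The ordering of the forward write path (modify the bitmap first, then write the data) can be illustrated with the small model below. The classes `BackupLun` and `DataLun`, the notion of a fixed number of "grains" as the bitmap granularity, and the method names are all assumptions made for the sketch; the embodiment does not specify the bitmap layout.

```python
class BackupLun:
    """Backup logical storage unit holding the difference bitmap (sketch)."""

    def __init__(self, n_grains: int):
        # bitmap[i] is True when grain i differs from the data LUN.
        self.bitmap = [False] * n_grains

    def mark_dirty(self, grain: int) -> None:
        self.bitmap[grain] = True


class DataLun:
    """Data logical storage unit in the second site (sketch)."""

    def __init__(self, n_grains: int, backup: BackupLun):
        self.data = [None] * n_grains
        self.backup = backup  # reserved jump: data LUN -> backup LUN

    def write(self, grain: int, payload) -> None:
        # Step 1: jump to the backup LUN and record the difference.
        self.backup.mark_dirty(grain)
        # Step 2: return to the data LUN and write the data.
        self.data[grain] = payload
```

Because the bit is set before the data is written, a grain that has changed on the data LUN is never missing from the bitmap, which is what lets the periodic asynchronous copy find every difference.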
If the input/output request transmission direction is detected to change from forward transmission to reverse transmission, the data protection center is suspended from protecting data, wherein reverse transmission means that the input/output request is transmitted from the remote site to the second site;
between suspending the data protection center and restarting the data protection center to protect data, the input/output request jump relationship from the data logical storage unit in the second site to the backup logical storage unit in the remote site is deleted;
and an input/output request jump relationship from the backup logical storage unit in the remote site to the data logical storage unit in the second site is established.
After establishing the input/output request jump relationship from the backup logical storage unit in the remote site to the data logical storage unit in the second site, the method further includes:
after the data protection center is started to protect data, replicating the received input/output request of the host by using the backup logical storage unit in the remote site;
jumping to the data logical storage unit in the second site to perform dual writing with the first site;
and, after detecting that the dual writing is complete, returning to the backup logical storage unit of the remote site to modify the bitmap, so that the difference between the data in the data logical storage unit of the second site and the data in the backup logical storage unit of the remote site is 0.
Taking the dual active + asynchronous tandem 3DC strategy as an example:
Fig. 3 is a schematic diagram of an IO processing flow of dual-activity asynchronous serial connection provided in an embodiment of the present invention, where, as shown in fig. 3, a production host issues an IO cross-site dual-activity cluster, and the cross-site dual-activity cluster interacts with a disaster recovery center corresponding to a disaster recovery host. The cross-site dual-activity cluster comprises a data center A array and a data center B array, wherein the data center A array and the data center B array both comprise data logic storage units, the data center B array also comprises incremental data, and the disaster recovery center C array corresponds to the disaster recovery logic storage units.
In the forward IO transmission process, production center B hosts both the dual-active secondary end and the remote copy primary end. The dual-active secondary end is designated as the data logical storage unit (data LUN), and the primary end of the periodic asynchronous copy is a backup LUN whose attributes are consistent with those of the data LUN. The IO jump from the data LUN to the backup LUN is retained and cannot be changed while the 3DC strategy is started. After a write IO reaches the data LUN, it first jumps to the backup LUN to modify the bitmap and then returns to the data LUN to write the data. By contrast, the prior art synchronizes the write IO received by the data LUN to a snapshot LUN by starting incremental snapshots at fixed intervals. The jump mode and the IO transmission direction are bound to each other, and the 3DC strategy cannot be started when the two conflict.
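The forward write path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class names `BackupLun` and `DataLun`, the set-based bitmap, and the 4096-byte tracking granularity `GRAIN_SIZE` are all hypothetical.

```python
from dataclasses import dataclass, field

GRAIN_SIZE = 4096  # hypothetical tracking granularity, in bytes


@dataclass
class BackupLun:
    """Primary end of the periodic asynchronous copy; holds the change bitmap."""
    dirty_grains: set = field(default_factory=set)

    def modify_bitmap(self, offset: int, length: int) -> None:
        # Mark every grain touched by the write as changed.
        if length <= 0:
            return
        first = offset // GRAIN_SIZE
        last = (offset + length - 1) // GRAIN_SIZE
        self.dirty_grains.update(range(first, last + 1))


@dataclass
class DataLun:
    """Dual-active secondary end that receives the write IO."""
    backup: BackupLun
    blocks: dict = field(default_factory=dict)

    def write(self, offset: int, payload: bytes) -> None:
        # Step 1: jump to the backup LUN and modify its bitmap.
        self.backup.modify_bitmap(offset, len(payload))
        # Step 2: return to the data LUN and write the data.
        self.blocks[offset] = payload


backup_lun = BackupLun()
data_lun = DataLun(backup_lun)
data_lun.write(8192, b"x" * 5000)
print(sorted(backup_lun.dirty_grains))  # [2, 3]
```

A 5000-byte write at offset 8192 spans grains 2 and 3, so the next asynchronous replication cycle would only need to copy those two grains.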
When the 3DC topology is changed to reverse transmission, the data LUN -> backup LUN jump relationship is deleted between the suspension of the 3DC strategy and its next start, and a backup LUN -> data LUN jump relationship is established. After reverse synchronization starts, the backup LUN receives the replicated IO from the remote disaster recovery site and first jumps to the data LUN of production site B, which performs double writing with production site A. When the double writing is finished, control returns to the backup LUN to modify the bitmap, so that the bitmaps of the asynchronous remote copy between B and C remain consistent.
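The switch of the jump relationship and the reverse write path can be sketched as follows. The sketch assumes a hypothetical `ThreeDcPolicy` class with dictionary-backed sites and a set of offsets standing in for the bitmap; none of these names come from the original text.

```python
from enum import Enum


class Jump(Enum):
    DATA_TO_BACKUP = "data LUN -> backup LUN"  # forward transmission
    BACKUP_TO_DATA = "backup LUN -> data LUN"  # reverse transmission


class ThreeDcPolicy:
    def __init__(self):
        self.started = True
        self.jump = Jump.DATA_TO_BACKUP
        self.site_a = {}     # data LUN contents at production site A
        self.site_b = {}     # data LUN contents at production site B
        self.bitmap = set()  # offsets that still differ between B and C

    def suspend(self):
        self.started = False

    def start(self):
        self.started = True

    def reverse_jump(self):
        # The jump relationship may only change between suspend and restart.
        if self.started:
            raise RuntimeError("jump relationship can only change while suspended")
        # Delete data LUN -> backup LUN, establish backup LUN -> data LUN.
        self.jump = Jump.BACKUP_TO_DATA

    def apply_replicated_write(self, offset, payload):
        if not (self.started and self.jump is Jump.BACKUP_TO_DATA):
            raise RuntimeError("reverse synchronization is not active")
        # Jump to the data LUN of site B, which double-writes with site A.
        self.site_b[offset] = payload
        self.site_a[offset] = payload
        # Double writing finished: return to the backup LUN and clear the bit.
        self.bitmap.discard(offset)


policy = ThreeDcPolicy()
policy.bitmap.add(0)
policy.suspend()
policy.reverse_jump()
policy.start()
policy.apply_replicated_write(0, b"new")
print(policy.site_a == policy.site_b, len(policy.bitmap))  # True 0
```

Calling `reverse_jump()` while the policy is still started raises an error, which mirrors the rule that the jump relationship cannot be changed in the 3DC started state.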
In the method provided by this embodiment, the mechanism for modifying the IO jump relationship between the two remote copy relationships ensures that the IO jump direction is consistent with the IO transmission direction, which avoids disordered input/output flow and eliminates the risk of data inconsistency.
The measures above address disordered input/output flow by constraining the IO flow direction. In practice, starting the 3DC failover process while a failed production site still carries the host mapping of the original production service disorders the IO flow and creates a risk of data inconsistency. Therefore, the input/output processing method further includes:
when reverse transmission is detected, if one of the first site and the second site fails, acquiring the normal site among the first site and the second site;
and restricting the primary end in the dual-active relationship from switching to the normal site, and keeping the remote site receiving the input/output requests issued by the host, wherein the primary end in the dual-active relationship is the site that receives the input/output requests of the host.
According to the requirements of the 3DC strategy, only one site is allowed to carry service data while the 3DC strategy is started and receiving service data. In special cases, however, for example when reverse transmission is performed after all local sites have failed, a recovered production site A or B may still hold the host service mapping it had before the failure. It could then receive host IO, which creates a risk of data inconsistency.
The invention adds a restriction on the automatic switching of the dual-active relationship that a fault would otherwise trigger under the 3DC strategy. For example, if production center B fails again during reverse transmission after production centers A and B have recovered, the invention restricts the primary end of the dual-active relationship from switching to production center A, which prevents an IO flow direction conflict after production center B recovers. At this point the dual-active and periodic asynchronous replication relationships are disconnected, and the asynchronous replication center C continues to receive service data. After production center B recovers, the IO flow direction is unchanged, reverse transmission continues, and the fault recovery process of the 3DC strategy is executed. The IO flow direction thus stays fixed throughout the fault recovery process, which eliminates the risk of data inconsistency.
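The restriction can be illustrated with a small decision function. This is a sketch under assumptions: the function name, the site labels "A", "B", "C", and the returned dictionary are hypothetical, and the forward-transmission branch simply models the ordinary dual-active automatic takeover by the surviving site.

```python
def on_local_site_failure(direction: str, failed_site: str) -> dict:
    """Decide how a failure of local site A or B is handled.

    direction is "forward" (local -> remote) or "reverse" (remote -> local).
    """
    if direction not in ("forward", "reverse"):
        raise ValueError("direction must be 'forward' or 'reverse'")
    if failed_site not in ("A", "B"):
        raise ValueError("failed_site must be 'A' or 'B'")
    normal_site = "A" if failed_site == "B" else "B"

    if direction == "reverse":
        # Restriction: the dual-active primary end must not switch to the
        # normal site; remote site C keeps receiving the host IO.
        return {"normal_site": normal_site,
                "switch_primary": False,
                "host_io_site": "C"}

    # Forward transmission: the normal site takes over the service.
    return {"normal_site": normal_site,
            "switch_primary": True,
            "host_io_site": normal_site}


print(on_local_site_failure("reverse", "B"))
print(on_local_site_failure("forward", "A"))
```

Keeping the host IO site fixed in the reverse case is what prevents two sites from receiving host IO once the failed site recovers.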
Specifically, the fault recovery process of the 3DC strategy includes the following:
When a local production center fails, the automatic dual-active switching feature makes RPO = 0 achievable and greatly reduces RTO. When both local dual-active production sites fail at the same time, the service can be switched manually so that the remote disaster recovery site carries the production service; in this case the loss of a small amount of data is unavoidable. After the local sites are restored, reverse transmission sends the incremental service data newly received by the remote disaster recovery site back to the local production site and the local disaster recovery site, and after synchronization completes, the service is switched back to the production center and forward transmission resumes.
The following describes the site failure handling method for dual-active + periodic asynchronous serial 3DC.
A. Failure of production center A:
When production center A fails, production center B automatically takes over the service and records the data difference between production center B and production center A. The periodic asynchronous replication between production center B and the remote disaster recovery center is not affected.
When the production center A is repaired, the production center B copies the delta data during the failure to the production center A until the two dual-activity data LUNs are completely synchronized.
If production center A cannot be repaired and has to be rebuilt, the dual-active pair requires one full synchronization from B to A.
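The difference between incremental resynchronization after a repair and full synchronization after a rebuild can be sketched as follows. The function `resync`, the dictionary-backed LUN contents, and the set of changed offsets are hypothetical simplifications of the recorded data difference.

```python
def resync(source: dict, target: dict, changed: set, rebuilt: bool = False) -> int:
    """Copy data from the surviving site to the recovered site.

    Returns the number of offsets copied. `changed` holds the offsets the
    surviving site recorded as different while the other site was down.
    """
    if rebuilt:
        # Rebuilt site: nothing on it can be trusted, so copy everything.
        target.clear()
        offsets = set(source)
    else:
        # Repaired site: copy only the recorded differences.
        offsets = set(changed)
    for offset in offsets:
        target[offset] = source[offset]
    changed.clear()
    return len(offsets)


site_b = {0: b"a", 1: b"b2", 2: b"c"}
site_a = {0: b"a", 1: b"b"}            # stale copy from before the failure
print(resync(site_b, site_a, {1, 2}))  # 2
print(site_a == site_b)                # True

rebuilt_a = {}
print(resync(site_b, rebuilt_a, set(), rebuilt=True))  # 3
```

The incremental path copies two offsets, while the rebuilt site needs all three.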
B. Failure of production center B:
When production center B fails, production center A takes over the service unaffected, but production data can no longer be synchronized to B and cannot be copied to the remote disaster recovery center through periodic asynchronous replication.
When the production center B is repaired, the production center A automatically resynchronizes the delta data to the production center B, and when the B and the A are completely synchronous, the periodic asynchronous replication is triggered again.
When production center B cannot be repaired, the dual-active pair needs to reinitialize data synchronization, and the replication to the remote disaster recovery center also needs to be reinitialized.
C. Simultaneous failure of production center A and production center B:
Because production center A and production center B are relatively close to each other, a large-scale regional disaster may cause both sites to fail at the same time, in which case the service can be brought up at the remote disaster recovery center. When the service is brought up there, the data is rolled back to the most recent consistency point, and at most two replication cycles of data may be lost.
D. Failure of remote disaster recovery center C:
When disaster recovery center C fails, the periodic asynchronous replication from production center B to disaster recovery center C is interrupted. After disaster recovery center C recovers, asynchronous replication resumes in incremental mode and synchronizes the incremental data.
If the data of disaster recovery center C cannot be recovered, one full initial synchronization is required.
Through the above fault recovery strategy, the various site failure scenarios of dual-active + periodic asynchronous serial 3DC are handled and data protection is achieved.
The method provided by this embodiment targets scenarios in which a fault forces the 3DC service to switch production centers, or in which manual misoperation misdirects IO, either of which may lead to data inconsistency. It protects the IO flow direction of the 3DC strategy by scanning and querying all sites of the 3DC strategy, by modifying the IO jump relationship between the data LUN and the backup LUN, and by restricting the dual-active switching mechanism. This ensures that only one site is mapped to the host before and after the production site is switched, prevents the IO flow of the 3DC strategy from being disordered by faults or by multiple sites carrying the service simultaneously, and eliminates the risk of data inconsistency.
It is noted that the input/output processing method provided by the embodiment of the invention can be applied to any networking architecture under 3DC networking to prevent the disorder of IO flow caused by faults or misoperation and eliminate the possibility of inconsistent data.
In the above embodiments, the input/output processing method is described in detail, and the present invention further provides embodiments corresponding to the input/output processing apparatus and the input/output processing device. It should be noted that the present invention describes an embodiment of the device portion from two angles, one based on the angle of the functional module and the other based on the angle of the hardware.
An input/output processing device according to an embodiment of the present invention includes:
A creation and acquisition module, configured to acquire a first number of sites mapped to a host in a stage of creating the data protection center, wherein the data protection center comprises at least a local site and a remote site;
A determining module, configured to determine, when the data protection center's protection of data is started, that the start fails if the first number is detected to be greater than 1, and that the start succeeds if the first number is detected to be less than or equal to 1;
The judging module is used for judging whether the first quantity is equal to 0 after the starting is successful, if not, triggering the receiving module, and if so, triggering the adding and detecting module;
the receiving module is used for taking the site mapped to the host as a target site and receiving an input and output request issued by the host by utilizing the target site;
And the adding and detecting module is used for adding the sites mapped to the host and acquiring a second number of sites mapped to the host, and returning to the trigger receiving module when the second number is detected to be equal to 1.
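The cooperation of the modules above can be sketched as a single function. This is a hedged illustration: `select_target_site`, `ProtectionError`, and the `add_mapping` callback are hypothetical names, and raising an error when the second number is not 1 is an assumption, since the text only specifies the case in which the second number equals 1.

```python
from typing import Callable, Set


class ProtectionError(Exception):
    """Raised when the site-to-host mapping does not allow protection to run."""


def select_target_site(mapped_sites: Set[str],
                       add_mapping: Callable[[Set[str]], None]) -> str:
    """Return the single site that should receive the host's IO requests."""
    first_number = len(mapped_sites)
    if first_number > 1:
        # Determining module: more than one mapped site means start failure.
        raise ProtectionError("start failed: more than one site is mapped to the host")
    if first_number == 0:
        # Adding and detecting module: add a mapping, then count again.
        add_mapping(mapped_sites)
        second_number = len(mapped_sites)
        if second_number != 1:
            # Assumption: anything other than exactly one mapped site is rejected.
            raise ProtectionError("exactly one site must be mapped to the host")
    # Receiving module: the only mapped site is the target site.
    return next(iter(mapped_sites))


print(select_target_site({"A"}, lambda sites: None))         # A
print(select_target_site(set(), lambda sites: sites.add("B")))  # B
```

Passing the mapping step as a callback keeps the sketch independent of how a site is actually mapped to the host.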
In some embodiments, the creating and obtaining module comprises:
A creation and acquisition sub-module for determining, at a stage of creating the data protection center, a first number of sites mapped to the host according to parameters characterizing the number of sites mapped to the host;
wherein, in the stage of creating the data protection center, while the data volumes are created and mapped to the sites corresponding to the data protection center, and while synchronous replication of data between the local sites and asynchronous replication between the local sites and the remote site are created, the parameter characterizing the number of sites mapped to the host holds an invalid value; and in the stage of creating the data protection center, each time a site is detected to be mapped to the host, the value of the parameter is increased by 1.
In some embodiments, the input-output processing apparatus further comprises:
The setting module is used for setting the input/output request jump directions of the local site and the remote site;
the input/output processing device further includes:
the first acquisition module is used for acquiring the transmission direction of the input/output request;
the first judging module is used for judging whether the jump direction of the input/output request is the same as the transmission direction of the input/output request, if so, triggering the starting module, otherwise, triggering the output and control module;
The starting module is used for starting the data protection center to protect data;
The output and control module is used for outputting prompt information for representing the protection data of the data protection center which is prohibited from being started, and controlling the jump direction of the input and output request to be the same as the transmission direction of the input and output request.
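The check performed before starting can be sketched as a pure function. The name `check_before_start`, the string direction labels, and the wording of the prompt are hypothetical; the sketch only shows the comparison and the two outcomes described above.

```python
from typing import Tuple


def check_before_start(jump_direction: str,
                       transfer_direction: str) -> Tuple[bool, str, str]:
    """Compare the IO jump direction with the IO transmission direction.

    Returns (started, jump_direction_afterwards, prompt).
    """
    if jump_direction == transfer_direction:
        # Starting module: directions agree, protection may start.
        return True, jump_direction, ""
    # Output and control module: refuse to start, report why, and align the
    # jump direction with the transmission direction for the next attempt.
    prompt = ("start prohibited: jump direction '%s' differs from "
              "transmission direction '%s'" % (jump_direction, transfer_direction))
    return False, transfer_direction, prompt


started, jump, prompt = check_before_start("forward", "reverse")
print(started, jump)  # False reverse
started, jump, prompt = check_before_start(jump, "reverse")
print(started, jump)  # True reverse
```

The first call is refused and returns the aligned direction; retrying with that direction succeeds.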
In some embodiments, the local site comprises a first site and a second site, wherein the first site and the second site are in a dual-activity network topology structure, and the remote site comprises a backup logic storage unit, wherein the backup logic storage unit in the remote site is consistent with the data logic storage unit in the second site in attribute;
The control module is used for controlling the jump direction of the input/output request to be the same as the transmission direction of the input/output request;
the control module specifically comprises:
A retaining module, configured to retain, while the data protection center is started to protect data, the input/output request jump relationship from the data logical storage unit in the second site to the backup logical storage unit in the remote site;
A suspending module, configured to suspend the data protection center's protection of data if the transmission direction of the input/output request is detected to change from forward transmission to reverse transmission, wherein reverse transmission means that the input/output request is transmitted from the remote site to the second site;
A deleting module, configured to delete the input/output request jump relationship from the data logical storage unit in the second site to the backup logical storage unit in the remote site during the period between suspending the data protection center's protection of data and restarting it;
And the establishing module is used for establishing the input-output request jump relation from the backup logic storage unit in the remote site to the data logic storage unit in the second site.
In some embodiments, the input-output processing apparatus further comprises:
a jump module, configured to jump to the backup logical storage unit in the remote site to modify the bitmap after detecting that a write input/output request reaches the data logical storage unit in the second site;
And a first return module, configured to return to the data logical storage unit in the second site to write the data after the modification of the bitmap of the backup logical storage unit is completed, wherein the bitmap characterizes the difference between the data in the data logical storage unit of the second site and the data in the backup logical storage unit of the remote site.
In some embodiments, the input-output processing apparatus further comprises:
a copy module, configured to replicate the input/output requests received from the host by using the backup logical storage unit in the remote site after the data protection center is started to protect data;
The double-writing module is used for jumping to a data logic storage unit in the second site to perform double writing with the first site;
And the second return module is used for returning the backup logic storage unit of the remote site to modify the bitmap after the double-write completion is detected so that the difference between the data in the data logic storage unit of the second site and the data in the backup logic storage unit of the remote site is 0.
In some embodiments, the input-output processing apparatus further comprises:
A second acquisition module, configured to acquire, when reverse transmission is detected and one of the first site and the second site fails, the normal site among the first site and the second site;
and a limiting module, configured to restrict the primary end in the dual-active relationship from switching to the normal site and to keep the remote site receiving the input/output requests issued by the host, wherein the primary end in the dual-active relationship is the site that receives the input/output requests of the host.
Since the embodiments of the apparatus portion correspond to the embodiments of the method portion, refer to the description of the method embodiments for the apparatus embodiments; the details are not repeated here.
Fig. 4 is a block diagram of an input/output processing device according to another embodiment of the present invention. The present embodiment is based on a hardware angle, and as shown in fig. 4, the input-output processing apparatus includes:
A memory 20 for storing a computer program;
A processor 21 for implementing the steps of the method of input output processing as mentioned in the above embodiments when executing a computer program.
Processor 21 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 21 may be implemented in at least one hardware form among a digital signal processor (Digital Signal Processor, DSP), a field-programmable gate array (Field-Programmable Gate Array, FPGA), and a programmable logic array (Programmable Logic Array, PLA). The processor 21 may also include a main processor and a coprocessor: the main processor, also referred to as a central processing unit (Central Processing Unit, CPU), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may be integrated with a graphics processing unit (Graphics Processing Unit, GPU), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 21 may also include an artificial intelligence (Artificial Intelligence, AI) processor for handling computing operations related to machine learning.
Memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201, which, when loaded and executed by the processor 21, can implement the relevant steps of the input/output processing method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may further include an operating system 202, data 203, and the like, and the storage may be transient or permanent. The operating system 202 may include Windows, Unix, Linux, and the like. The data 203 may include, but is not limited to, the data related to the above-mentioned input/output processing method.
In some embodiments, the input/output processing device may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
Those skilled in the art will appreciate that the configuration shown in fig. 4 is not limiting of the input-output processing device and may include more or fewer components than shown.
The input/output processing device provided by the embodiment of the invention comprises a memory and a processor, wherein the processor can realize the method of input/output processing when executing the program stored in the memory.
The embodiment of the invention also provides a computer program product, which comprises a computer program/instruction, wherein the computer program/instruction realizes the steps of the input/output processing method when being executed by a processor.
Finally, the invention also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps as described in the method embodiments above.
It will be appreciated that the methods of the above embodiments, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and performs all or part of the steps of the method according to the embodiments of the present invention. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The computer-readable storage medium provided by the invention stores a computer program implementing the above input/output processing method, and its effects are the same as those of the method.
The input/output processing method, apparatus, device, and medium provided by the invention have been described in detail above. The embodiments in this description are presented progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Because the apparatus disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant details can be found in the description of the method. It should be noted that those skilled in the art may make improvements and modifications to the present invention without departing from its principles.
It should also be noted that in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.