CN107239238A - An I/O operation method and device for storage based on a distributed lock - Google Patents

Info

Publication number
CN107239238A
Authority
CN
China
Prior art keywords
host
read
timestamp
time
target data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710623874.6A
Other languages
Chinese (zh)
Other versions
CN107239238B (en)
Inventor
马怀旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710623874.6A
Publication of CN107239238A
Application granted
Publication of CN107239238B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an IO operation method for storage based on a distributed lock, applied to a first host, where the first host and multiple other hosts are all connected to the same shared storage. According to a preset read cycle, the first host reads the timestamp that each other host writes to the shared storage according to a preset write cycle. When issuing a first IO operation instruction for target data in the shared storage, the first host detects whether the target data is occupied by a second host. If so, it reads the timestamp of the second host and determines, from the timestamp read this time and the timestamp read last time, whether the second host is offline. If so, it releases the second host's occupation of the target data and performs the IO operation. The technical solution provided by the invention avoids the situation in which the second host is not offline but is unexpectedly isolated because of its IO timeout. The invention also discloses an IO operation device for storage based on a distributed lock, which has the corresponding technical effects.

Description

An IO operation method and device for storage based on a distributed lock

Technical field

The present invention relates to the field of computer application technology, and in particular to an IO operation method and device for storage based on a distributed lock.

Background

With the development of computer and network technology, large virtualized clusters are common in production storage environments. In a virtualized system, multiple hosts of different systems or of the same system may be connected to the same resource or group of resources, and a distributed lock is then needed to keep the hosts that acquire the resource from interfering with one another.

In the prior art, distributed locks are usually implemented with shared storage. Among the hosts connected to the same shared storage, when one host occupies the shared storage, for example by performing a write operation on it, call that host host No. 1; the other hosts must wait until host No. 1's write operation finishes before they can perform IO operations on the shared storage. Prior-art distributed lock schemes usually use a timeout mechanism: when host No. 1 has been disconnected from the shared storage for longer than a certain time, the distributed lock releases host No. 1's occupation of the shared storage so that other hosts can occupy it. However, if the IO pressure on the shared storage is high, the distributed lock may mistakenly release host No. 1's occupation even though host No. 1 has not disconnected. When the shared storage later receives an IO operation instruction from host No. 1, another host already occupies the shared storage, so host No. 1 is isolated and cannot complete its unfinished IO operations.

Therefore, how to find a stable distributed lock scheme is a technical problem that those skilled in the art urgently need to solve.

Summary of the invention

The purpose of the present invention is to provide an IO operation method and device for storage based on a distributed lock, avoiding the situation in which the second host is not offline but is unexpectedly isolated because of its IO timeout and therefore cannot complete its IO operations on the shared storage.

To solve the above technical problem, the present invention provides the following technical solutions:

An IO operation method for storage based on a distributed lock, applied to a first host, where the first host and multiple other hosts are all connected to the same shared storage, the method comprising:

reading, according to a preset read cycle, the timestamp written to the shared storage by each other host according to a preset write cycle;

when issuing a first IO operation instruction for target data in the shared storage, detecting whether the target data is occupied by a second host, the second host being any one of the multiple other hosts;

if so, reading the timestamp of the second host and determining, according to the timestamp of the second host read this time and the timestamp of the second host read last time, whether the second host is offline;

if so, releasing the second host's occupation of the target data and performing the IO operation.

Preferably, determining whether the second host is offline according to the timestamp of the second host read this time and the timestamp of the second host read last time comprises:

obtaining the IO waiting time of the second host;

determining whether the timestamp of the second host read last time plus the IO waiting time is greater than the timestamp of the second host read this time;

if not, determining that the second host is offline.

Preferably, obtaining the IO waiting time of the second host comprises:

determining the IO waiting time of the second host according to the waiting time of a second IO instruction queue of the second host;

wherein the second IO instruction queue has a non-timeout attribute and contains the second IO operation instruction issued by the second host for the target data.

Preferably, the first IO operation instruction is an instruction in a first IO instruction queue that has a non-timeout attribute.

Preferably, the read cycle is greater than or equal to the write cycle.

An IO operation device for storage based on a distributed lock, applied to a first host, where the first host and multiple other hosts are all connected to the same shared storage, the device comprising:

a timestamp reading module, configured to read, according to a preset read cycle, the timestamp written to the shared storage by each other host according to a preset write cycle;

a target data detection module, configured to detect, when a first IO operation instruction is issued for target data in the shared storage, whether the target data is occupied by a second host, and if so, to trigger a second host offline determination module, the second host being any one of the multiple other hosts;

the second host offline determination module, configured to read the timestamp of the second host, to determine, according to the timestamp of the second host read this time and the timestamp of the second host read last time, whether the second host is offline, and if so, to trigger an IO operation module;

the IO operation module, configured to release the second host's occupation of the target data and perform the IO operation.

Preferably, the second host offline determination module comprises:

an IO waiting time obtaining submodule, configured to obtain the IO waiting time of the second host;

a timestamp determination submodule, configured to determine whether the timestamp of the second host read last time plus the IO waiting time is greater than the timestamp of the second host read this time, and if not, to trigger a second host offline determination submodule;

the second host offline determination submodule, configured to determine that the second host is offline.

Preferably, the IO waiting time obtaining submodule is specifically configured to:

determine the IO waiting time of the second host according to the waiting time of a second IO instruction queue of the second host;

wherein the second IO instruction queue has a non-timeout attribute and contains the second IO operation instruction issued by the second host for the target data.

Preferably, the first IO operation instruction is an instruction in a first IO instruction queue that has a non-timeout attribute.

Preferably, the read cycle is greater than or equal to the write cycle.

With the technical solution provided by the embodiments of the present invention, the first host reads, according to a preset read cycle, the timestamp that each other host writes to the shared storage according to a preset write cycle. When issuing a first IO operation instruction for target data in the shared storage, it detects whether the target data is occupied by a second host. If so, it reads the timestamp of the second host and determines, from the timestamp read this time and the timestamp read last time, whether the second host is offline. If so, it releases the second host's occupation of the target data and performs the IO operation.

When the read/write pressure on the shared storage is high, the shared storage may be unable to respond to the second host's IO operations in time, so the IO operation instructions issued by the second host time out. Such a timeout may occur because the second host has disconnected from the shared storage, or because the IO pressure on the shared storage is high. In the solution of the present invention, when the first host detects that the target data is occupied by the second host, it reads the timestamp of the second host and compares it with the timestamp read last time to determine whether the second host is offline. When the IO pressure on the shared storage is high and the second host's IO times out, the second host can still write its timestamp, which lets the first host determine whether the second host is offline. This avoids the situation in which the second host is not offline but is unexpectedly isolated because of its IO timeout and therefore cannot complete its IO operations on the shared storage.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of an implementation scenario of an IO operation method for storage based on a distributed lock according to the present invention;

Fig. 2 is an implementation flowchart of an IO operation method for storage based on a distributed lock according to the present invention;

Fig. 3 is a schematic structural diagram of an IO operation device for storage based on a distributed lock according to the present invention.

Detailed description

The core of the present invention is to provide an IO operation method for storage based on a distributed lock in which the first host determines whether the second host is offline by reading the second host's timestamp. This avoids the situation in which the second host is not offline but is unexpectedly isolated because of its IO timeout and therefore cannot complete its IO operations on the shared storage.

To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Refer to Fig. 1, a schematic diagram of an implementation scenario of an IO operation method for storage based on a distributed lock according to the present invention. Host 101, host 102, host 103 and host 104 are all connected to the same shared storage 105. Each host can write its own timestamp at its corresponding location in the shared storage and, each time it writes the timestamp, also send the waiting time of its IO operations to the shared storage.

Refer to Fig. 2, an implementation flowchart of an IO operation method for storage based on a distributed lock according to the present invention. The method can be applied to a first host, where the first host and multiple other hosts are all connected to the same shared storage, and can include the following steps:

S201: Read, according to a preset read cycle, the timestamp written to the shared storage by each other host according to a preset write cycle.

The shared storage can be one resource or a group of resources, and multiple hosts can be connected to the same shared storage. Every host connected to the same shared storage can write its own timestamp into that shared storage. In this application, the timestamp of a host refers to how long the host has currently been powered on, for example the number of seconds since the host booted. Each host can start a process that writes its timestamp to the shared storage according to the preset write cycle. Other hosts can only read the timestamp that a host writes to the shared storage; they cannot modify it. For example, each host that joins can be allocated its own unique space on the shared storage: a host can read and write its own space, but can only read the spaces that belong to other hosts.
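The per-host space described above can be illustrated with a minimal in-memory sketch. The class `SharedStorage`, its method names, and the host identifiers are hypothetical and are not taken from the patent; a real deployment would place these slots on the shared storage device itself rather than in a Python dictionary.

```python
class SharedStorage:
    """In-memory stand-in for the shared storage (hypothetical, for illustration)."""

    def __init__(self):
        # One slot per host: host id -> last written uptime in seconds.
        self._slots = {}

    def write_timestamp(self, writer_id, slot_owner_id, uptime_seconds):
        # A host may write only to its own slot.
        if writer_id != slot_owner_id:
            raise PermissionError("a host can only read another host's slot")
        self._slots[slot_owner_id] = uptime_seconds

    def read_timestamp(self, reader_id, slot_owner_id):
        # Any host may read any slot; None means nothing has been written yet.
        return self._slots.get(slot_owner_id)


storage = SharedStorage()
storage.write_timestamp("host2", "host2", 3600)    # host2 records one hour of uptime
print(storage.read_timestamp("host1", "host2"))    # host1 reads host2's slot
```

In this sketch the access rule is enforced by the storage object; the patent only states the rule and does not say which component enforces it.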

After the first host has read, according to the preset read cycle, the timestamp that each other host writes to the shared storage according to the preset write cycle, it can proceed to step S202.

Note that the read cycle and the write cycle can be set and adjusted according to the actual situation without affecting the implementation of the present invention. Preferably, the read cycle is greater than or equal to the write cycle.

S202: When issuing a first IO operation instruction for target data in the shared storage, detect whether the target data is occupied by a second host.

If so, proceed to step S203. The second host is any one of the multiple other hosts.

The target data is data stored in the shared storage. When the first host issues a first IO operation instruction for the target data, it detects whether the target data is occupied by a second host. The second host and the first host are connected to the same shared storage, and while the target data is occupied by the second host, the first host cannot issue IO operation instructions for it. For example, when the second host performs a write operation on the target data, it sets an identifier on the target data through the distributed lock, and the identifier can carry the name of the second host. When the target data carries the second host's identifier, the first host determines that the target data is occupied by the second host.

Of course, if the target data is determined not to be occupied by any other host, the first host can issue IO operation instructions for the target data to complete the corresponding read and write operations.

S203: Read the timestamp of the second host, and determine whether the second host is offline according to the timestamp of the second host read this time and the timestamp of the second host read last time.

After detecting that the target data is occupied by the second host, the first host reads the timestamp of the second host and determines whether the second host is offline from the timestamp read this time and the timestamp read last time.

The first host can compare the timestamp of the second host read this time with the timestamp read last time to see whether it has changed. If the timestamp has not changed between the two reads, the first host can determine that the second host is offline.

Of course, in a specific embodiment of the present invention, IO operations on the target data may be frequent, or the preset read cycle may be short. When the first host finds that the timestamp of the second host read this time has not changed from the one read last time, it can therefore further check whether the interval between the two reads exceeds the second host's write cycle. If it does not, the first host can read the second host's timestamp again after a waiting period. If the timestamp read after the waiting period still has not changed, the first host can determine that the second host is offline.

For example, suppose the write cycle of the second host is preset to 10 seconds and the read cycle of the first host is preset to 30 seconds. Usually the preset read cycle is greater than or equal to the preset write cycle. When the first host detects that the target data is occupied by the second host, the timestamp it reads for the second host is one hour, the timestamp it read last time was also one hour, and the interval between the two reads is 7 seconds. Because 7 seconds is less than the second host's 10-second write cycle, the first host can read the second host's timestamp again after a waiting period. The waiting period can be set and adjusted according to the actual situation, for example to the same length as the second host's write cycle or to three times that length; the present invention does not limit this. If the waiting period is set to 20 seconds, the first host reads the second host's timestamp again after 20 seconds, and if the timestamp is still one hour, it determines that the second host is offline.
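The re-read logic in the example above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, its parameters, and the injectable `sleep` callable are hypothetical, and the timestamp is modeled as uptime in seconds.

```python
import time


def is_offline_by_unchanged_timestamp(read_timestamp, last_timestamp,
                                      seconds_since_last_read, write_period,
                                      wait_seconds, sleep=time.sleep):
    """Return True if the second host looks offline because its timestamp stopped changing.

    read_timestamp: hypothetical callable returning the second host's current
    timestamp from shared storage (uptime in seconds).
    """
    current = read_timestamp()
    if current != last_timestamp:
        return False  # the timestamp advanced, so the host is still writing
    if seconds_since_last_read > write_period:
        return True   # a full write cycle passed with no update
    # Less than one write cycle elapsed: wait, then read once more.
    sleep(wait_seconds)
    return read_timestamp() == last_timestamp


# Values from the example: write cycle 10 s, two reads 7 s apart, waiting period 20 s.
offline = is_offline_by_unchanged_timestamp(
    read_timestamp=lambda: 3600,   # the timestamp stays at one hour
    last_timestamp=3600,
    seconds_since_last_read=7,
    write_period=10,
    wait_seconds=20,
    sleep=lambda s: None,          # skip the real wait in this demonstration
)
print(offline)
```

Passing `sleep` as a parameter is a design choice made here only so the sketch can be exercised without actually waiting 20 seconds.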

After the first host determines that the second host is offline, it can proceed to S204.

S204: Release the second host's occupation of the target data and perform the IO operation.

After determining that the second host is offline, the first host releases the second host's occupation of the target data. For example, it can delete the identifier that the second host set on the target data through the distributed lock. After releasing the second host's occupation, the first host can perform IO operations on the target data.
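Steps S202 and S204 can be sketched together as a marker table keyed by target data. Everything named here (`TargetLocks`, `take_over_if_offline`, the host and target identifiers) is hypothetical. The sketch also assumes that the first host sets its own identifier before performing IO, which the text implies for writes but does not state for the takeover case, and it ignores the atomicity that a real shared-storage lock would need.

```python
class TargetLocks:
    """Occupation identifiers per target: target id -> name of the occupying host."""

    def __init__(self):
        self._holder = {}

    def holder(self, target_id):
        return self._holder.get(target_id)

    def occupy(self, target_id, host_id):
        current = self._holder.get(target_id)
        if current is not None and current != host_id:
            return False  # occupied by another host
        self._holder[target_id] = host_id
        return True

    def release(self, target_id, holder_id):
        # Delete the identifier only if it still names the expected holder.
        if self._holder.get(target_id) == holder_id:
            del self._holder[target_id]
            return True
        return False


def take_over_if_offline(locks, target_id, first_host, second_host_offline, do_io):
    """S202/S204 sketch: perform IO unless a live host occupies the target."""
    holder = locks.holder(target_id)
    if holder is not None and holder != first_host:
        if not second_host_offline:
            return False                   # holder is alive: leave its identifier alone
        locks.release(target_id, holder)   # S204: delete the offline holder's identifier
    locks.occupy(target_id, first_host)
    do_io(target_id)
    return True


locks = TargetLocks()
locks.occupy("block-7", "host2")
print(take_over_if_offline(locks, "block-7", "host1", True, lambda t: None))
print(locks.holder("block-7"))
```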

With the technical solution provided by the embodiments of the present invention, the first host reads, according to a preset read cycle, the timestamp that each other host writes to the shared storage according to a preset write cycle. When issuing a first IO operation instruction for target data in the shared storage, it detects whether the target data is occupied by a second host. If so, it reads the timestamp of the second host and determines, from the timestamp read this time and the timestamp read last time, whether the second host is offline. If so, it releases the second host's occupation of the target data and performs the IO operation.

When the read/write pressure on the shared storage is high, the shared storage may be unable to respond to the second host's IO operations in time, so the IO operation instructions issued by the second host time out. Such a timeout may occur because the second host has disconnected from the shared storage, or because the IO pressure on the shared storage is high. In the solution of the present invention, when the first host detects that the target data is occupied by the second host, it reads the timestamp of the second host and compares it with the timestamp read last time to determine whether the second host is offline. When the IO pressure on the shared storage is high and the second host's IO times out, the second host can still write its timestamp, which lets the first host determine whether the second host is offline. This avoids the situation in which the second host is not offline but is unexpectedly isolated because of its IO timeout and therefore cannot complete its IO operations on the shared storage.

In a specific embodiment of the present invention, step S203 includes:

Step 1: Obtain the IO waiting time of the second host;

Step 2: Determine whether the timestamp of the second host read last time plus the IO waiting time is greater than the timestamp of the second host read this time; if not, proceed to step 3;

Step 3: Determine that the second host is offline.

The first host reads the IO waiting time of the second host from the shared storage; the second host writes its IO waiting time to the corresponding area of the shared storage.

Note that the IO waiting time of the second host can be determined from the waiting time of the second host's second IO instruction queue. The second IO instruction queue has a non-timeout attribute and contains the second IO operation instruction issued by the second host for the target data. Specifically, the second host can add a second IO instruction queue with a non-timeout attribute, made up of second IO operation instructions. The non-timeout attribute means that when the second host issues a second IO operation instruction from this queue and the attempt fails, it keeps issuing the instruction and does not delete the queue until the instruction has been issued successfully. Of course, every host connected to the same shared storage can use an IO instruction queue with a non-timeout attribute to issue its IO operation instructions.
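The non-timeout attribute can be sketched as a queue whose head instruction is retried until it succeeds and is never dropped on failure. The class `NonTimeoutQueue` and the `submit` callable are hypothetical names. The optional `max_attempts` argument is an addition made only so that a demonstration call can return; it bounds a single `drain` call and never removes an unsent instruction from the queue.

```python
from collections import deque


class NonTimeoutQueue:
    """IO instruction queue that keeps retrying instead of timing out (illustrative)."""

    def __init__(self, submit):
        # submit: hypothetical callable that issues one instruction to the
        # shared storage and returns True on success, False on failure.
        self._submit = submit
        self._pending = deque()

    def enqueue(self, instruction):
        self._pending.append(instruction)

    def drain(self, max_attempts=None):
        """Issue pending instructions in order; return how many remain queued."""
        attempts = 0
        while self._pending:
            if max_attempts is not None and attempts >= max_attempts:
                break  # stop this call only; the instruction stays queued
            attempts += 1
            if self._submit(self._pending[0]):
                self._pending.popleft()  # remove only after a successful issue
        return len(self._pending)


# Simulate storage that rejects the first two attempts, then accepts.
outcomes = iter([False, False, True])
queue = NonTimeoutQueue(lambda instruction: next(outcomes))
queue.enqueue("write target data")
print(queue.drain(max_attempts=2))  # instruction still pending after two failures
print(queue.drain())                # third attempt succeeds and the queue empties
```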

The second host may write its IO waiting time at the same time as it writes its timestamp. Of course, the IO waiting time may also be delivered by other schemes without affecting the implementation of the present invention. The first host adds the timestamp of the second host read last time to the IO waiting time of the second host and determines whether the sum is greater than the timestamp of the second host read this time; if not, it determines that the second host is offline.

If only the timestamp of the second host read this time and the timestamp read last time were used to determine whether the second host is offline, several factors would affect the accuracy of the first host's judgment: a short read cycle on the first host, frequent IO operations on the target data in the shared storage, and possible delays in the second host's timestamp writes. When the second host has not disconnected from the shared storage and its IO operation instructions time out only because the shared storage is under heavy IO load, the timestamp of the second host read last time plus the IO waiting time of the second host will be greater than or equal to the timestamp of the second host read this time. When the second host has disconnected from the shared storage, for example because the second host has crashed or its network connection has been interrupted, it cannot write an IO waiting time to update the value saved in the shared storage; the first host then adds the IO waiting time read from the shared storage to the timestamp of the second host read last time, and the result will be less than the timestamp of the second host read this time.

For example, suppose that when the first host detects that the second host occupies the target data in the shared storage, the timestamp of the second host read this time is 60 seconds, the IO waiting time of the second host is 32 seconds, and the timestamp of the second host read last time is 30 seconds. The timestamp read last time plus the IO waiting time is 62 seconds, which is greater than the timestamp read this time, so it can be determined that the second host has not disconnected from the shared storage.

Corresponding to the above method embodiment, an embodiment of the present invention further provides an IO operation device for storage based on a distributed lock. The device described below and the method described above may be referred to in correspondence with each other.

Figure 3 is a schematic structural diagram of an IO operation device for storage based on a distributed lock in an embodiment of the present invention. The device is applied to a first host; the first host and a plurality of other hosts are all connected to the same shared storage. The device includes the following modules:

a timestamp reading module 301, configured to read, according to a preset read cycle, the timestamp that each other host writes to the shared storage according to a preset write cycle;

a target data detection module 302, configured to detect, when a first IO operation instruction is issued for target data in the shared storage, whether the target data is occupied by a second host, and if so, to trigger a second host offline determination module 303, the second host being any one of the plurality of other hosts;

the second host offline determination module 303, configured to read the timestamp of the second host and to determine whether the second host is offline according to the timestamp of the second host read this time and the timestamp of the second host read last time, and if so, to trigger an IO operation module 304;

the IO operation module 304, configured to release the second host's occupation of the target data and perform the IO operation.

With the device provided by the embodiment of the present invention, the timestamp that each other host writes to the shared storage according to a preset write cycle is read according to a preset read cycle. When a first IO operation instruction is issued for target data in the shared storage, the device detects whether the target data is occupied by a second host. If so, it reads the timestamp of the second host and determines whether the second host is offline according to the timestamp read this time and the timestamp read last time. If the second host is offline, the device releases the second host's occupation of the target data and performs the IO operation.
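The flow through modules 301 to 304 can be sketched with an in-memory stand-in for the shared storage. Everything named here (`SharedStorage`, `FirstHost`, `poll_timestamps`, `issue_io`, the dictionary layout) is a hypothetical illustration, not the patent's implementation. The sketch assumes the offline test is the IO-waiting-time comparison from the specific embodiment, treats a host with no earlier reading as online, and models "performing the IO operation" simply as taking over the lock.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SharedStorage:
    timestamps: Dict[str, float] = field(default_factory=dict)  # host -> last written timestamp
    io_wait: Dict[str, float] = field(default_factory=dict)     # host -> IO waiting time
    lock_owner: Dict[str, str] = field(default_factory=dict)    # target data -> occupying host


class FirstHost:
    def __init__(self, name: str, storage: SharedStorage) -> None:
        self.name = name
        self.storage = storage
        self.last_seen: Dict[str, float] = {}  # timestamps from the previous read

    def poll_timestamps(self) -> None:
        """Module 301: called once per read cycle by the caller."""
        for host, ts in self.storage.timestamps.items():
            if host != self.name:
                self.last_seen[host] = ts

    def issue_io(self, target: str) -> bool:
        """Modules 302-304: return True if the IO operation was performed."""
        owner = self.storage.lock_owner.get(target)
        if owner is not None and owner != self.name:         # module 302: occupied?
            current = self.storage.timestamps[owner]          # module 303: read timestamp
            last = self.last_seen.get(owner)
            self.last_seen[owner] = current
            if last is None:
                return False                                  # no earlier reading: cannot judge
            wait = self.storage.io_wait.get(owner, 0.0)
            if last + wait > current:
                return False                                  # owner judged online
            del self.storage.lock_owner[target]               # module 304: release occupation
        self.storage.lock_owner[target] = self.name           # stand-in for the IO operation
        return True
```

In this sketch the caller is responsible for invoking `poll_timestamps` at the read cycle; a real device would also need atomic lock updates on the shared storage, which the dictionary does not provide.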

When the shared storage is under heavy read/write load, it may be unable to respond promptly to the second host's IO operations, so IO operation instructions issued by the second host may time out. Such a timeout has two possible causes: the second host has disconnected from the shared storage, or the shared storage is under heavy IO load. In the solution of the present invention, when the first host detects that the target data is occupied by the second host, it reads the second host's timestamp and compares the timestamp read this time with the timestamp read last time to determine whether the second host is offline. When the shared storage is under heavy IO load and the second host's IO times out, the second host can still write its timestamp, which lets the first host determine whether the second host is offline. This avoids the situation in which the second host is not offline but cannot complete its IO operations on the shared storage because of its IO timeout.

In a specific embodiment of the present invention, the second host offline determination module 303 includes:

an IO waiting time obtaining submodule, configured to obtain the IO waiting time of the second host;

a timestamp determination submodule, configured to determine whether the timestamp of the second host read last time plus the IO waiting time is greater than the timestamp of the second host read this time, and if not, to trigger a second host offline determination submodule;

the second host offline determination submodule, configured to determine that the second host is offline.

In a specific embodiment of the present invention, the IO waiting time obtaining submodule is specifically configured to:

determine the IO waiting time of the second host according to the waiting time of the second host's second IO instruction queue;

wherein the second IO instruction queue has a non-timeout attribute and contains the second IO operation instructions issued by the second host for the target data.

In a specific embodiment of the present invention, the first IO operation instruction is an instruction in a first IO instruction queue with a non-timeout attribute.

In a specific embodiment of the present invention, the read cycle is greater than or equal to the write cycle.

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be regarded as going beyond the scope of the present invention.

The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Specific examples are used herein to explain the principles and implementations of the present invention. The description of the above embodiments is intended only to help in understanding the technical solution of the present invention and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present invention without departing from its principles, and such improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (10)

1. An IO operation method for storage based on a distributed lock, characterized in that it is applied to a first host, the first host and a plurality of other hosts all being connected to the same shared storage, the method comprising:
reading, according to a preset read cycle, the timestamp that each other host writes to the shared storage according to a preset write cycle;
when issuing a first IO operation instruction for target data in the shared storage, detecting whether the target data is occupied by a second host, the second host being any one of the plurality of other hosts;
if so, reading the timestamp of the second host, and determining whether the second host is offline according to the timestamp of the second host read this time and the timestamp of the second host read last time;
if so, releasing the second host's occupation of the target data and performing the IO operation.

2. The method according to claim 1, characterized in that determining whether the second host is offline according to the timestamp of the second host read this time and the timestamp of the second host read last time comprises:
obtaining the IO waiting time of the second host;
determining whether the timestamp of the second host read last time plus the IO waiting time is greater than the timestamp of the second host read this time;
if not, determining that the second host is offline.

3. The method according to claim 2, characterized in that obtaining the IO waiting time of the second host comprises:
determining the IO waiting time of the second host according to the waiting time of a second IO instruction queue of the second host;
wherein the second IO instruction queue has a non-timeout attribute and contains the second IO operation instructions issued by the second host for the target data.

4. The method according to claim 1, characterized in that the first IO operation instruction is an instruction in a first IO instruction queue with a non-timeout attribute.

5. The method according to any one of claims 1 to 4, characterized in that the read cycle is greater than or equal to the write cycle.

6. An IO operation device for storage based on a distributed lock, characterized in that it is applied to a first host, the first host and a plurality of other hosts all being connected to the same shared storage, the device comprising:
a timestamp reading module, configured to read, according to a preset read cycle, the timestamp that each other host writes to the shared storage according to a preset write cycle;
a target data detection module, configured to detect, when a first IO operation instruction is issued for target data in the shared storage, whether the target data is occupied by a second host, and if so, to trigger a second host offline determination module, the second host being any one of the plurality of other hosts;
the second host offline determination module, configured to read the timestamp of the second host and to determine whether the second host is offline according to the timestamp of the second host read this time and the timestamp of the second host read last time, and if so, to trigger an IO operation module;
the IO operation module, configured to release the second host's occupation of the target data and perform the IO operation.

7. The device according to claim 6, characterized in that the second host offline determination module comprises:
an IO waiting time obtaining submodule, configured to obtain the IO waiting time of the second host;
a timestamp determination submodule, configured to determine whether the timestamp of the second host read last time plus the IO waiting time is greater than the timestamp of the second host read this time, and if not, to trigger a second host offline determination submodule;
the second host offline determination submodule, configured to determine that the second host is offline.

8. The device according to claim 7, characterized in that the IO waiting time obtaining submodule is specifically configured to:
determine the IO waiting time of the second host according to the waiting time of a second IO instruction queue of the second host;
wherein the second IO instruction queue has a non-timeout attribute and contains the second IO operation instructions issued by the second host for the target data.

9. The device according to claim 6, characterized in that the first IO operation instruction is an instruction in a first IO instruction queue with a non-timeout attribute.

10. The device according to any one of claims 6 to 9, characterized in that the read cycle is greater than or equal to the write cycle.
CN201710623874.6A 2017-07-27 2017-07-27 A kind of IO operation method and device based on distributed lock storage Active CN107239238B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710623874.6A CN107239238B (en) 2017-07-27 2017-07-27 A kind of IO operation method and device based on distributed lock storage

Publications (2)

Publication Number Publication Date
CN107239238A true CN107239238A (en) 2017-10-10
CN107239238B CN107239238B (en) 2020-09-04

Family

ID=59989774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710623874.6A Active CN107239238B (en) 2017-07-27 2017-07-27 A kind of IO operation method and device based on distributed lock storage

Country Status (1)

Country Link
CN (1) CN107239238B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7103530B1 (en) * 2002-03-29 2006-09-05 Cypress Semiconductor Corporation System for integrating event-related information and trace information
CN104317658A (en) * 2014-10-17 2015-01-28 华中科技大学 MapReduce based load self-adaptive task scheduling method
CN104954411A (en) * 2014-03-31 2015-09-30 腾讯科技(深圳)有限公司 Method for sharing network resource by distributed system, terminal thereof and system thereof
US9411534B2 (en) * 2014-07-02 2016-08-09 Hedvig, Inc. Time stamp generation for virtual disks
CN106293934A (en) * 2016-07-19 2017-01-04 浪潮(北京)电子信息产业有限公司 A kind of cluster system management optimization method and platform
CN106873918A (en) * 2017-02-27 2017-06-20 郑州云海信息技术有限公司 Storage method to set up and device in a kind of virtualization system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110568991A (en) * 2018-06-06 2019-12-13 北京忆恒创源科技有限公司 method for reducing IO command conflict caused by lock and storage device
CN110568991B (en) * 2018-06-06 2023-07-25 北京忆恒创源科技股份有限公司 Method and storage device for reducing IO command conflict caused by lock
CN109002360A (en) * 2018-08-29 2018-12-14 郑州云海信息技术有限公司 A kind of distributed lock failure separation method, device, computer storage and equipment
CN109002360B (en) * 2018-08-29 2021-10-26 郑州云海信息技术有限公司 Distributed lock fault isolation method and device, computer memory and equipment
CN109375988A (en) * 2018-11-01 2019-02-22 郑州云海信息技术有限公司 Method and device for realizing distributed lock
CN111459963A (en) * 2020-04-07 2020-07-28 中国建设银行股份有限公司 Core accounting transaction concurrent processing method and device
CN111459963B (en) * 2020-04-07 2024-03-15 中国建设银行股份有限公司 Concurrent processing method and device for core accounting transaction
CN111857579A (en) * 2020-06-30 2020-10-30 广东浪潮大数据研究有限公司 SSD (solid State disk) controller resetting method, system and device and readable storage medium
CN111857579B (en) * 2020-06-30 2024-02-09 广东浪潮大数据研究有限公司 SSD disk controller resetting method, SSD disk controller resetting system, SSD disk controller resetting device and readable storage medium

Also Published As

Publication number Publication date
CN107239238B (en) 2020-09-04

Similar Documents

Publication Publication Date Title
CN106843749B (en) Write request processing method, device and equipment
CN107239238A (en) A kind of I/O operation method and device of the storage based on distributed lock
US20230259528A1 (en) Synchronization cache seeding
CN111078368B (en) Memory snapshot method and device of cloud computing platform virtual machine and readable storage medium
EP3132449B1 (en) Method, apparatus and system for handling data error events with memory controller
CN114637475A (en) A distributed storage system control method, device and readable storage medium
CN112199240B (en) A method and related equipment for node switching when a node fails
CN105306605B (en) A kind of double host server systems
CN107861691B (en) A load balancing method and device for a multi-controller storage system
US11748215B2 (en) Log management method, server, and database system
JP2021168123A (en) Systems and method for distributed read/write locking with network key values for storage devices
CN104170307B (en) Failover methods, devices and systems
CN106406750A (en) Data operation method and system
CN106331081B (en) A kind of information synchronization method and device
CN108595287B (en) Data truncation method and device based on erasure codes
US9729305B2 (en) Airplane system and control method thereof
CN112905668A (en) Database derivative method, apparatus, and medium based on distributed data stream processing engine
CN119148937B (en) Data writing method, system, electronic equipment and storage medium
CN106776055A (en) A kind of distributed lock method and system
US9686206B2 (en) Temporal based collaborative mutual exclusion control of a shared resource
CN108055159A (en) A kind of clustered node operation synchronous method and device
CN110309224A (en) A data replication method and device
US7568121B2 (en) Recovery from failure in data storage systems
US20220138177A1 (en) Fault tolerance for transaction mirroring
CN117950921B (en) Memory failure processing method, memory expansion control device, electronic device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant