CN103635887B - Method and storage system for caching data - Google Patents
Method and storage system for caching data
- Publication number
- CN103635887B, CN201380001620.0A
- Authority
- CN
- China
- Prior art keywords
- controller
- address information
- read
- data
- target data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0862—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
- G06F12/0866—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
- G06F12/0868—Data transfer between cache memory and other subsystems, e.g. storage devices or host systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Description
Technical Field
The present invention relates to storage technologies, and in particular to a method for caching data and a storage system.
Background
A cache memory (cache for short) is a buffer memory located between the CPU and the main storage (for example, a hard disk) in a storage system. It has a smaller capacity than the hard disk but is faster. Typically, when the CPU processes a read data request and finds usable data in the cache (a cache hit), it can return the result of the read data request immediately; only on a miss is the read data request sent to the hard disk. Because cache access is much faster than hard disk reads and writes, a higher cache hit rate means higher storage system performance. An existing practice is therefore to read data that is likely to be accessed soon into the cache in advance, so that subsequent read data requests hit immediately. This practice is called prefetching.
For a storage system that contains multiple controllers, if the controllers operate in active/passive (A/P) mode, that is, only one controller is working, all data that the storage system keeps in cache is concentrated in the cache of that controller, so cache data can be prefetched by performing sequential-stream detection on the read data requests. However, if the controllers operate in active/active (A/A) mode, every controller is working and read data requests may be distributed across the controllers. Each controller then performs sequential-stream detection, and prefetches, on the basis of incomplete information, so the prefetched data is not accurate enough.
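The problem described above can be illustrated with a minimal sketch. The round-robin dispatch and the function name `is_sequential` are assumptions made for illustration only; the source does not specify how requests are distributed in A/A mode or how sequential streams are detected.

```python
def is_sequential(requests):
    """Return True if every request starts where the previous one ended."""
    return all(
        prev_lba + prev_len == lba
        for (prev_lba, prev_len), (lba, _) in zip(requests, requests[1:])
    )

# A host reads 8 blocks at a time, strictly sequentially: (start LBA, length).
stream = [(lba, 8) for lba in range(0, 64, 8)]

# Hypothetical round-robin dispatch across two active controllers.
seen_by = {0: [], 1: []}
for i, request in enumerate(stream):
    seen_by[i % 2].append(request)

print(is_sequential(stream))      # True: the full stream is sequential
print(is_sequential(seen_by[0]))  # False: controller 0 sees gaps
```

Each controller sees only every other request, so neither can recognise the stream as sequential from its own view.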
Summary of the Invention
Embodiments of the present invention provide a method for caching data and a storage system, so that target data to be read can be predicted accurately when the storage system includes multiple controllers.
A first aspect of the embodiments of the present invention provides a method for caching data. The method is applied in a storage system that includes multiple controllers, each of which includes a cache. The method includes:
a first controller receives a read data request sent by a host, where the read data request carries address information; determines a second controller according to the address information carried in the read data request; and sends the address information to the second controller;
the second controller obtains, according to the address information, address information of target data to be read, so that the target data is read into a cache according to the address information of the target data.
In a first implementation of the first aspect of the embodiments of the present invention, the address information carried in the read data request includes a start address carried in the read data request;
determining the second controller according to the address information carried in the read data request includes:
determining the second controller according to the start address carried in the read data request and a preset hash algorithm.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect of the embodiments of the present invention, the preset hash algorithm includes a consistent hashing algorithm.
In a third implementation of the first aspect of the embodiments of the present invention, the address information carried in the read data request includes a start address carried in the read data request;
determining the second controller according to the address information carried in the read data request includes:
querying a preset configuration table according to the start address to obtain the second controller corresponding to the start address.
In a fourth implementation of the first aspect of the embodiments of the present invention, reading the target data into a cache according to the address information of the target data includes:
the second controller reads the target data into the cache of the second controller according to the address information of the target data.
In a fifth implementation of the first aspect of the embodiments of the present invention, reading the target data into a cache according to the address information of the target data includes:
the second controller determines, according to the address information of the target data, a third controller corresponding to the target data, and sends a prefetch command to the third controller, where the prefetch command includes the address information of the target data;
the third controller reads the target data into the cache of the third controller according to the address information of the target data.
A second aspect of the embodiments of the present invention provides a storage system, including:
a first controller, configured to receive a read data request sent by a host, where the read data request carries address information; determine a second controller according to the address information carried in the read data request; and send the address information to the second controller;
the second controller, configured to obtain, according to the address information, address information of target data to be read, so that the target data is read into a cache according to the address information of the target data.
In a first implementation of the second aspect of the embodiments of the present invention, the address information carried in the read data request includes a start address carried in the read data request;
the first controller is specifically configured to determine the second controller according to the start address carried in the read data request and a preset hash algorithm.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect of the embodiments of the present invention, the preset hash algorithm includes a consistent hashing algorithm.
In a third implementation of the second aspect of the embodiments of the present invention, the address information carried in the read data request includes a start address carried in the read data request;
the first controller is specifically configured to query a preset configuration table according to the start address to obtain the second controller corresponding to the start address.
In a fourth implementation of the second aspect of the embodiments of the present invention, the second controller is further configured to read the target data into the cache of the second controller according to the address information of the target data.
In a fifth implementation of the second aspect of the embodiments of the present invention, the system further includes a third controller;
the second controller is further configured to determine, according to the address information of the target data, the third controller corresponding to the target data, and to send a prefetch command to the third controller, where the prefetch command includes the address information of the target data;
the third controller is configured to read the target data into the cache of the third controller according to the address information of the target data.
In the embodiments of the present invention, after the first controller receives a read data request sent by the host, it determines a second controller according to the address information carried in the read data request and sends the address information to the second controller. The second controller obtains the target data to be read according to the address information, so that the target data can be read into a cache. Because the controller that determines the target data to be read is selected by the address information carried in the read data request, all read data requests that fall on a given range of logical addresses can be analysed centrally by one controller. For that range of logical addresses, the controller has complete information about the read data requests, so the target data to be read can be predicted accurately and read into the cache.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the application network architecture of a method for caching data according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for caching data according to an embodiment of the present invention;
FIG. 3 is a flowchart of another method for caching data according to an embodiment of the present invention;
FIG. 4 is a data structure diagram of a data prefetch unit according to an embodiment of the present invention;
FIG. 5 is a data structure diagram of a data block table in the data prefetch unit according to an embodiment of the present invention;
FIG. 6 is a detailed schematic flowchart of another method for caching data according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the correspondence between read data requests and data blocks according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a storage system according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
System architecture of the embodiments of the present invention
The method for caching data provided by the embodiments of the present invention can be implemented on a storage system. FIG. 1 is a schematic diagram of the system architecture for the method. As shown in FIG. 1, the storage system includes multiple controllers (four in the figure, as an example) and a storage device. In this embodiment, a hard disk is used as an example of the storage device.
FIG. 1 is only an example and does not limit the networking mode; for instance, cascaded tree networking and ring networking are both possible, as long as the controllers and the storage device can communicate with each other.
A controller may be any computing device known in the art, such as a server or a desktop computer. In one application scenario of the embodiments of the present invention, every controller can process read data requests from the host and can also access the data stored in the storage device, for example by reading data out of the storage device and storing it in its cache. In another application scenario, every controller can process read data requests from the host, but each controller corresponds to one section of storage space in the storage device (for example, some of the disks, or part of the storage space of one disk). In other words, each section of storage space in the storage device has its own specific controller and cannot be managed or accessed by other controllers. Note that the storage device here means a magnetic disk, hard disk, or other storage medium, and does not include the controllers.
Each controller contains a cache. The cache is a buffer memory between the CPU and the hard disk; it has a smaller capacity than the hard disk but is faster. The cache stores part of the data. When the CPU processes a read data request and finds usable data in the cache, this is a cache hit.
The controllers can communicate with each other, and each can access data stored in the caches of the others. For example, controller 0 receives a read data request from a host (not shown in the figure) for data A. Data A is not stored in the cache of controller 0 but is stored in the cache of controller 1. Controller 0 can therefore send a data read command to controller 1, so that controller 1 reads data A from its cache and sends it to controller 0, and controller 0 can then return data A to the host directly. Because data communication between caches uses a high-speed data transmission channel, shared access to cache data between controllers is very fast: compared with reading data from the storage device after a cache miss, obtaining the data from another controller's cache takes very little time.
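The controller 0 and controller 1 example above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Controller` class, its `read` method, and the use of Python dictionaries for caches and for the storage device are all hypothetical, and the inter-controller channel is reduced to a direct lookup.

```python
class Controller:
    """Hypothetical controller with a local cache and links to its peers."""

    def __init__(self, name):
        self.name = name
        self.cache = {}   # key -> data held in this controller's cache
        self.peers = []   # other controllers reachable over the fast channel

    def read(self, key, storage_device):
        # 1. Local cache hit.
        if key in self.cache:
            return self.cache[key], "local cache"
        # 2. Hit in another controller's cache (data read command to the peer).
        for peer in self.peers:
            if key in peer.cache:
                return peer.cache[key], f"cache of {peer.name}"
        # 3. Miss everywhere: read the storage device and keep a cached copy.
        data = storage_device[key]
        self.cache[key] = data
        return data, "storage device"


controller0 = Controller("controller 0")
controller1 = Controller("controller 1")
controller0.peers = [controller1]
controller1.cache["A"] = b"data A"

print(controller0.read("A", storage_device={}))
# (b'data A', 'cache of controller 1')
```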
An operating system and other software programs are installed on each controller. For example, each controller contains at least one prefetch management unit, and in the embodiments of the present invention the controllers contain roughly equal numbers of prefetch management units. Each prefetch management unit handles data read operations for the logical storage space of one address range. The logical storage space managed by a prefetch management unit may be a logical storage unit (Logical Unit Number, LUN), a region of a LUN, a folder, and so on; this is not limited here. The distribution of prefetch management units among the controllers can be determined from the address ranges of the logical storage space. For example, for a logical address (such as a Logical Block Address, LBA), the controller that hosts the corresponding prefetch management unit can be obtained by a consistent hashing algorithm or another hash algorithm.
The storage device may be any storage device known in the art, such as an SSD or a Direct Access Storage Device (DASD). The storage space of the storage device can be divided into a number of logical blocks (chunks), each with a unique ID. In this embodiment, data in the storage device is managed in units of chunks; for example, data can be read into the cache one chunk at a time.
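As a small sketch of chunk-based management, the following computes which chunk IDs a read of a given start address and length touches. The chunk size `CHUNK_BLOCKS` and the rule that chunk IDs are consecutive integers are assumptions for illustration; the source only states that each chunk has a unique ID.

```python
CHUNK_BLOCKS = 256  # hypothetical chunk size, in logical blocks


def chunks_for_request(lba, length):
    """Return the IDs of all chunks touched by [lba, lba + length)."""
    first = lba // CHUNK_BLOCKS
    last = (lba + length - 1) // CHUNK_BLOCKS
    return list(range(first, last + 1))


print(chunks_for_request(250, 10))  # [0, 1]: the request straddles two chunks
```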
As described above, in one application scenario of the embodiments of the present invention, every controller can read from and write to the storage device; for example, every controller can read data from the storage device into its own cache so that subsequent read data requests hit the cache. In the other application scenario, each controller corresponds to part of the storage space of the storage device and can read and write only the data stored in that part, for example by reading that data into its own cache, while data stored in the rest of the storage device is managed by other controllers.
Method for caching data
The method for caching data provided by the embodiments of the present invention is described below. FIG. 2 is a flowchart of the method. The method is applied in a storage system that includes multiple controllers, each of which includes a cache. The method includes:
Step S201: A first controller receives a read data request sent by a host, where the read data request carries address information; determines a second controller according to the address information carried in the read data request; and sends the address information to the second controller.
Optionally, the address information includes the start address and length of the data to be read, and the second controller corresponding to the read data request can be obtained from the start address of the data to be read by using a preset hash algorithm.
Step S202: The second controller obtains, according to the address information, address information of target data to be read, so that the target data is read into a cache according to the address information of the target data.
Optionally, after obtaining the address information of the target data to be read, the second controller may read the target data into its own cache according to that address information. Alternatively, it may send a prefetch command to a third controller according to the address information of the target data, and the third controller then reads the target data into the cache of the third controller according to the address information of the target data.
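Steps S201 and S202 can be sketched as below. Everything named here is hypothetical: the `Controller` class, the fixed-size region mapping in `owner_of`, and above all the prediction rule (prefetch the range that immediately follows the request), which is a naive stand-in rather than the prediction method of the embodiments.

```python
class Controller:
    def __init__(self, cid, cluster, region_blocks=1024):
        self.cid = cid
        self.cluster = cluster          # shared list of all controllers
        self.region_blocks = region_blocks
        self.cache = set()              # LBAs currently held in this cache
        self.history = []               # address info analysed by this controller

    def owner_of(self, lba):
        # Assumed mapping: fixed-size address regions spread over controllers.
        region = lba // self.region_blocks
        return self.cluster[region % len(self.cluster)]

    def on_host_read(self, lba, length):
        # Step S201: determine the second controller, send it the address info.
        self.owner_of(lba).on_address_info(lba, length)

    def on_address_info(self, lba, length, third=None):
        # Step S202: derive the target data's address info, then prefetch
        # locally or send a prefetch command to a third controller.
        self.history.append((lba, length))
        target_lba, target_len = lba + length, length  # naive prediction
        holder = third if third is not None else self
        holder.prefetch(target_lba, target_len)

    def prefetch(self, lba, length):
        self.cache.update(range(lba, lba + length))


cluster = []
for cid in range(3):
    cluster.append(Controller(cid, cluster))

cluster[0].on_host_read(2048, 8)   # region 2 is owned by controller 2
print(cluster[2].history)          # [(2048, 8)]
```

Whichever controller the host contacts, requests for the same address region are recorded by the same controller, which is what gives that controller a complete view of the region.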
Optionally, the embodiments of the present invention can also be applied in a distributed system that includes multiple nodes, where each node is a server and each server performs functions similar to those of a controller in the storage system. Details are not repeated here.
Note that the first controller may receive one read data request or several read data requests from the host. When there are several, whether they are contiguous can be determined from the address information they carry. If they are contiguous, the read data requests can be merged to obtain one contiguous piece of address information, and determining the second controller according to the address information carried in the read data request then means determining the second controller according to the merged contiguous address information. If the read data requests are not contiguous, a second controller is determined separately for each request according to the address information that request carries.
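The merging of contiguous read data requests can be sketched as follows. The function name `merge_contiguous` is hypothetical, and sorting the requests by start address before merging is an assumption; the source does not say whether requests are reordered.

```python
def merge_contiguous(requests):
    """Merge (start_lba, length) requests whose address ranges are adjacent."""
    merged = []
    for lba, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == lba:
            # This request starts exactly where the previous range ends.
            prev_lba, prev_len = merged[-1]
            merged[-1] = (prev_lba, prev_len + length)
        else:
            merged.append((lba, length))
    return merged


print(merge_contiguous([(0, 8), (8, 8), (32, 4)]))  # [(0, 16), (32, 4)]
```

Each tuple in the result would then be used on its own to determine a second controller.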
In the embodiments of the present invention, after the first controller receives a read data request sent by the host, it determines a second controller according to the address information carried in the read data request and sends the address information to the second controller. The second controller obtains the target data to be read according to the address information, so that the target data can be read into a cache. Because the controller that determines the target data to be read is selected by the address information carried in the read data request, all read data requests that fall on a given range of logical addresses can be analysed centrally by one controller. For that range of logical addresses, the controller has complete information about the read data requests, so the target data to be read can be predicted accurately and read into the cache.
FIG. 3 is a flowchart of another method for caching data according to an embodiment of the present invention. For ease of description, three controllers are used as an example, but the method is not limited to three controllers. Referring to FIG. 3, the following steps may be performed by the processor in a controller. The method includes:
Step S301: A first controller receives a first read data request sent by a host. The first read data request carries address information of first data to be read, where the address information includes the start address (LBA) and length of the first data to be read.
Note that in the embodiments of the present invention, the logical address is also referred to as the start address or the LBA.
Step S302: The first controller determines the controller corresponding to the first read data request.
In the embodiments of the present invention, each controller contains at least one prefetch management unit, and the controllers contain roughly equal numbers of prefetch management units. Each prefetch management unit manages the storage space of one address range, for example a region of a LUN. Specifically, the controller corresponding to the first read data request can be obtained from the LBA of the first data to be read by using a consistent hashing algorithm or another hash algorithm, and the prefetch management unit in that controller is then obtained. When a controller contains one prefetch management unit, determining the controller corresponding to the first read data request also determines the prefetch management unit. When a controller contains several prefetch management units, each manages the storage space of one address range, so a single prefetch management unit within a controller can likewise be uniquely determined from the LBA. As an example, the controller corresponding to the first data to be read is the second controller.
A hash algorithm determines a unique access address from a key value; its purpose is to speed up lookup. Here, the key is the start address. Optionally, a hash algorithm may be implemented with a hash table: looking up the key value in the hash table yields the corresponding access address. It should be noted that an ordinary hash algorithm uses a linear calculation. In the embodiments of the present invention, a hash algorithm may be used to uniquely determine one controller from the start address given as input.
Optionally, a consistent hash algorithm may use a ring data structure to map a key value to an access location. The ring data structure may contain multiple virtual nodes, for example virtual node 0, virtual node 1, virtual node 2, virtual node 3, ..., virtual node 10, and each virtual node corresponds to a segment of access addresses. Every two adjacent virtual nodes are connected in sequence, and virtual node 10 is connected back to virtual node 0, so that together they form a ring. Take a storage system with three controllers as an example. The three controllers are named controller A, controller B, and controller C, and each controller corresponds to several virtual nodes in the ring data structure. For example, controller A corresponds to virtual nodes 0, 1, 2, and 3; controller B corresponds to virtual nodes 4, 5, 6, and 7; and controller C corresponds to virtual nodes 8, 9, and 10. In this way, each start address likewise corresponds to exactly one controller.
Moreover, with a consistent hash algorithm, when a new controller is added to the storage system, for example controller D, there is no need to rearrange the data structure or modify the algorithm. Only the virtual nodes assigned to each controller need to be adjusted so that the newly added controller D corresponds to a range of start addresses.
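The ring described above can be sketched as follows. This is only an illustration under stated assumptions: the constants `NUM_VNODES` and `SEGMENT_SIZE`, the rule that maps a start address to a virtual node, and the function names are hypothetical and are not specified by the embodiment.

```python
# Sketch only: the constants and the address-to-node rule below are assumptions.
NUM_VNODES = 11        # virtual nodes 0..10, as in the example above
SEGMENT_SIZE = 1024    # hypothetical number of addresses per segment


def vnode_for(lba: int) -> int:
    """Map a start address to a virtual node; segments wrap around the ring."""
    return (lba // SEGMENT_SIZE) % NUM_VNODES


# Ownership from the example: A -> 0..3, B -> 4..7, C -> 8..10.
owner = {v: "A" for v in range(0, 4)}
owner.update({v: "B" for v in range(4, 8)})
owner.update({v: "C" for v in range(8, 11)})


def controller_for(lba: int, owner_map: dict) -> str:
    """Return the single controller that owns the address's virtual node."""
    return owner_map[vnode_for(lba)]


def add_controller(owner_map: dict, name: str, vnodes: list) -> dict:
    """Hand the listed virtual nodes to a new controller; the ring is unchanged."""
    updated = dict(owner_map)
    for v in vnodes:
        updated[v] = name
    return updated
```

For instance, `add_controller(owner, "D", [3, 7, 10])` reassigns three virtual nodes to controller D while every other start address keeps its original controller.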
Optionally, each controller of the storage system may store a preset configuration table that contains the correspondence between start addresses and controllers. The controller that receives a read data request may look up the start address carried in the request in the configuration table to obtain the controller corresponding to that start address.
Optionally, the configuration table may be stored in only one controller. When another controller receives a read data request, it may send a query request, which includes the start address carried in the read data request, to the controller that stores the configuration table. That controller looks up the start address in the configuration table to obtain the corresponding controller, and returns the query result to the controller that received the read data request.
Optionally, to prevent the configuration table from being lost if the controller that stores it fails, a copy of the configuration table may be stored in another controller of the storage system.
Optionally, the controller corresponding to the start address may also be obtained by a modulo operation. Specifically, the start address is divided by the number of controllers, and the remainder identifies the corresponding controller.
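The two alternatives above, a configuration table and a modulo operation, can be sketched as follows. The table contents, the range boundaries, and the function names are hypothetical; the embodiment does not specify how the configuration table is organized, so a sorted list of range starts is assumed here.

```python
import bisect

CONTROLLERS = ["A", "B", "C"]

# Hypothetical configuration table: sorted range starts and the owner of each range.
RANGE_STARTS = [0, 4096, 8192]
RANGE_OWNERS = ["A", "B", "C"]


def lookup_in_table(lba: int) -> str:
    """Find the last range whose start is <= lba and return its controller."""
    index = bisect.bisect_right(RANGE_STARTS, lba) - 1
    return RANGE_OWNERS[index]


def lookup_by_modulo(lba: int) -> str:
    """Use the remainder of lba divided by the controller count as the index."""
    return CONTROLLERS[lba % len(CONTROLLERS)]
```

Both functions assume a non-negative start address. The table lookup keeps neighbouring addresses on the same controller, whereas the modulo variant spreads consecutive addresses across controllers.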
Step S303: The second controller receives a second read data request sent by the host. The second read data request carries address information of the second data to be read, where the address information includes the start address (LBA) and the length of the second data to be read.
It should be noted that step S301 and step S303 need not be performed in any particular order.
Step S304: The second controller determines the controller corresponding to the second read data request.
Specifically, the corresponding controller may be obtained from the LBA of the second data to be read. This is similar to step S302 and is not described again here.
When the controller corresponding to the first read data request differs from the controller corresponding to the second read data request, the two controllers may each perform cache prefetch operations independently, without affecting each other. The discussion here focuses on the case in which the two requests correspond to the same controller.
Step S305: The first controller sends the landing point information of the first read data request to the second controller. The landing point information includes the address information of the first data to be read, and may further include the ID of the host that sent the first read data request, the ID of the first controller, and so on. The landing point information serves as the basis for prefetch analysis.
Because the controller corresponding to the second read data request is the second controller itself, the processor of the second controller may push the cache address at which the landing point information of the second read data request is stored to the prefetch management unit of the second controller, or may make that information available to the prefetch management unit in some other way. No limitation is imposed here.
Step S306: The second controller predicts the target data of the next read data request according to the landing point information of the first read data request and the landing point information of the second read data request.
The next read data request is a read data request that the storage system is about to receive but has not yet received. For ease of description, it is referred to as the third read data request. It should be noted that the next read data request is not limited to the request that immediately follows the first and second read data requests; any read data request received after the first and second read data requests may be called the next read data request.
The second controller may use a prefetch management unit to predict the target data of the third read data request. The prefetch management unit is a functional unit within the controller that performs cache prefetch operations. As shown in FIG. 4, the prefetch management unit includes a data block table, an interface, and a prefetch policy module. The data block table contains multiple data blocks; each data block corresponds to one logical block (chunk) on the disk and is of the same size. A data block records the landing point information of read data requests together with other information. The data block table may be sorted by the LBA contained in the landing point information, in ascending or descending order. The data block table in the embodiments of the present invention is not limited to the form of a table; it may also be a red-black tree, a binary tree, or another data management structure that supports ordered lookup. It should be noted that each data block records the landing point information of a read data request and other information, but not the data to be read itself. In addition, the interface is used to receive landing point information of read data requests sent by other controllers, or to send prefetch commands to other controllers. The prefetch policy module performs the read operation according to the configured prefetch policy.
As shown in FIG. 5, the information recorded in each data block may include the chunk ID, the landing point information, and the ID of the most recently read chunk. The landing point information may specifically include the host ID, the controller ID, the LBA, and the length.
The chunk ID is the ID of the chunk on the disk to which the data block corresponds. Multiplying the chunk ID by the chunk size gives the start address of the chunk. Because data is prefetched in units of chunks, prefetching the target data also requires the start address of the chunk in which the target data is located. It should be noted that the LBA described above is the start address of the read data request, which is distinct from the start address of the chunk. In some cases the two are the same, but in most cases the start address of the read data request differs from the start address of the chunk.
The ID of the most recently read chunk identifies the prefetch range of the previous read operation on the data block. Because read operations are performed in units of chunks, this ID makes it possible to determine whether the current prefetch overlaps the range of the previous prefetch. If it does, the overlapping chunks may be excluded when the read operation is performed.
In addition, the sequence number, timestamp, bitmap, and other information of the read data request may also be recorded.
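The record kept in each data block can be pictured with the following sketch. The field names, the `CHUNK_SIZE` value, and the helper `chunk_id_for` are assumptions made for illustration; the embodiment lists only the kinds of information recorded, not a concrete layout.

```python
from dataclasses import dataclass

CHUNK_SIZE = 256  # hypothetical chunk size, in the same units as the LBA


@dataclass
class DataBlock:
    """One entry of the data block table; it holds metadata, not the data itself."""
    chunk_id: int
    host_id: str
    controller_id: str
    lba: int                 # start address of the read request
    length: int              # length of the read request
    last_read_chunk_id: int  # last chunk covered by the previous prefetch

    def chunk_start(self) -> int:
        """Start address of the chunk: chunk ID times chunk size."""
        return self.chunk_id * CHUNK_SIZE


def chunk_id_for(lba: int) -> int:
    """Chunk that contains a given request start address."""
    return lba // CHUNK_SIZE
```

With these assumed values, a request starting at address 700 falls in chunk 2, whose start address is 512, which shows how the start address of a request can differ from the start address of its chunk.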
As shown in FIG. 6, step S306 may specifically include the following steps:
S3061: According to the landing point information of the first read data request and the landing point information of the second read data request, determine that the first read data request and the second read data request are sequential.
Specifically, the end address of the first read data request is obtained from its start address and length. If the end address of the first read data request is contiguous with the start address of the second read data request, the two requests are sequential. Alternatively, the end address of the second read data request is obtained from its start address and length. If the end address of the second read data request is contiguous with the start address of the first read data request, the two requests are likewise sequential.
It should be noted that, in the embodiments of the present invention, the first and second read data requests need not be strictly contiguous; a certain address gap between them is allowed.
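The sequential check of S3061, including the tolerated address gap, can be sketched as follows. The tolerance `MAX_GAP` and the convention that the end address is the start address plus the length are assumptions; the embodiment does not state how large a gap is permitted.

```python
MAX_GAP = 8  # hypothetical tolerance for the address gap between two requests


def end_address(lba: int, length: int) -> int:
    """First address after the request (start address plus length)."""
    return lba + length


def is_sequential(first: tuple, second: tuple, max_gap: int = MAX_GAP) -> bool:
    """Check both orders: first-then-second and second-then-first.

    Each request is a (start address, length) pair.
    """
    first_lba, first_len = first
    second_lba, second_len = second
    gap_forward = second_lba - end_address(first_lba, first_len)
    gap_backward = first_lba - end_address(second_lba, second_len)
    return 0 <= gap_forward <= max_gap or 0 <= gap_backward <= max_gap
```

In this sketch, overlapping requests produce a negative gap and are not treated as sequential; a real policy might choose to accept them.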
S3062: Determine the data blocks corresponding to the first read data request and the second read data request.
Specifically, the data blocks corresponding to the first and second read data requests may be determined according to the landing point information of the two requests. Because the first and second read data requests are sequential, their corresponding data blocks are also contiguous. The data blocks corresponding to the first and second read data requests may be as shown in FIG. 7.
S3063: Determine the data blocks that are contiguous with the data blocks corresponding to the first and second read data requests, and obtain the maximal contiguous run of data blocks.
Specifically, starting from the data blocks corresponding to the first and second read data requests, the data block table may be traversed forward. If a data block is contiguous with those data blocks, the landing point information recorded in that contiguous data block is used to determine whether the last read data request that occurred on it is sequential with the current first and second read data requests. If so, the traversal continues to obtain further contiguous data blocks from the data block table, until the maximal contiguous run of data blocks corresponding to the first and second read data requests is obtained.
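The traversal of S3063 might be sketched as follows. Several points are assumptions: the data block table is modelled as a dictionary from chunk ID to the (start address, length) of the last request recorded on that chunk, the traversal direction is taken to be toward lower chunk IDs, and `MAX_GAP` is a hypothetical tolerance.

```python
MAX_GAP = 8  # hypothetical tolerance for address gaps


def longest_run(table: dict, first_chunk: int, last_chunk: int) -> list:
    """Extend [first_chunk, last_chunk] toward lower chunk IDs while the
    neighbouring block exists and its recorded request joins the run."""
    run = list(range(first_chunk, last_chunk + 1))
    run_start = table[first_chunk][0]      # lowest start address in the run
    candidate = first_chunk - 1
    while candidate in table:
        lba, length = table[candidate]
        gap = run_start - (lba + length)
        if not 0 <= gap <= MAX_GAP:
            break                          # block exists but is not sequential
        run.insert(0, candidate)
        run_start = lba
        candidate -= 1
    return run
```

The length of the returned list is the size of the maximal contiguous run, which the next step uses as input to the prefetch policy.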
S3064: Obtain the length and the start address of the target data.
Specifically, the length of the target data may be calculated by the prefetch policy from the size of the maximal contiguous run of data blocks obtained in step S3063.
Optionally, if the ID of the most recently read chunk shows that part of the target data was already prefetched in the previous read operation, that part may be removed.
In addition, predicting the target data of the next read data request requires determining not only the length of the target data but also its start address. In this embodiment, the start address of the target data is the end address of the first read data request or the end address of the second read data request.
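A possible reading of S3064 is sketched below. The prefetch policy itself is not specified by the embodiment, so the policy used here (prefetch as many chunks as the run is long, up to a cap) is purely an assumption, as are `CHUNK_SIZE`, `MAX_PREFETCH_CHUNKS`, and the function name.

```python
CHUNK_SIZE = 256          # hypothetical chunk size
MAX_PREFETCH_CHUNKS = 8   # hypothetical cap applied by the assumed policy


def plan_prefetch(run_chunks: int, request_end: int, last_read_chunk_id: int = -1):
    """Return (start, length) of the predicted target data, or None if nothing is left.

    Assumed policy: prefetch as many chunks as the run is long, up to a cap.
    """
    length = min(run_chunks, MAX_PREFETCH_CHUNKS) * CHUNK_SIZE
    start = request_end                    # target begins where the request ended
    end = start + length
    # Skip the part that the previous prefetch already covered.
    already_covered_end = (last_read_chunk_id + 1) * CHUNK_SIZE
    start = max(start, already_covered_end)
    if start >= end:
        return None
    return start, end - start
```

Passing the ID of the most recently read chunk trims the front of the range, which corresponds to removing the part of the target data that was already prefetched.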
After the second controller predicts the target data of the next read data request, it may read the target data from the disk according to the start address and length of the target data and store it in the cache of the second controller, so that the third read data request results in a cache hit when it is executed. This approach mainly applies to scenarios in which every controller can manage or access every disk in the storage device.
In some application scenarios, however, each controller can manage or access only part of the storage space in the storage device (some of the disks, or part of the storage space of one disk). In other words, each segment of storage space in the storage device has its own specific controller and cannot be managed or accessed by other controllers. In this case, assuming that the storage space in which the target data is located is managed by a third controller, this embodiment may further include:
Step S307: The second controller sends a prefetch command to the third controller. The prefetch command includes the start address and length of the target data, so that the third controller reads the target data into its cache.
Optionally, the second controller may determine from the start address of the target data that the storage space in which the target data is located is managed by the third controller, and may therefore send the prefetch command to the third controller. Specifically, the second controller may obtain the controller corresponding to that storage space from the start address of the target data, through system configuration or an existing calculation method.
Step S308: The third controller prefetches the target data into its cache.
Specifically, the third controller may read the target data from the disk according to the start address and length of the target data and store it in the cache of the third controller.
Step S309: The first controller receives a third read data request sent by the host. The data to be read by the third read data request is the target data, or a part of the target data.
After receiving the third read data request, the first controller finds that the target data is not stored in its own cache but is stored in the cache of the third controller, and therefore performs step S310.
Optionally, the third read data request may be received by any controller in the storage system. If it is received by the third controller, the third controller may send the target data in its cache directly to the host, and steps S310 and S311 need not be performed. If it is received by another controller in the storage system, the operations are similar to those performed when the first controller receives the third read data request.
Step S310: The first controller sends a data read command to the third controller. The data read command includes the start address and length of the target data and requests the third controller to send the target data.
Step S311: The third controller sends the target data to the first controller.
It should be noted that the data channel between controllers is a high-speed data transmission channel. Statistically, accessing cached data across controllers generally takes less than 1 ms, whereas reading the target data from the disk on a cache miss takes 6 to 10 ms. Therefore, even a cross-controller hit is much faster than reading the data from the disk.
Step S312: After receiving the target data sent by the third controller, the first controller sends the target data to the host, thereby achieving a cache hit.
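Steps S308 to S312 can be illustrated with a toy model. Everything in it is an assumption made for illustration: the `Controller` class, the cache keyed by (start address, length), and in particular the way the receiving controller discovers which peer holds the data (here it simply checks each peer's cache), which the embodiment does not describe.

```python
class Controller:
    """Toy controller with a local cache keyed by (start address, length)."""

    def __init__(self, name: str):
        self.name = name
        self.cache = {}

    def prefetch(self, key: tuple, disk: dict) -> None:
        """S308: read the target data from disk into the local cache."""
        self.cache[key] = disk[key]

    def read(self, key: tuple, peers: list, disk: dict):
        """S309-S312: serve from the local cache, a peer's cache, or the disk."""
        if key in self.cache:
            return self.cache[key], "local hit"
        for peer in peers:
            if key in peer.cache:          # S310/S311: fetch from the peer
                return peer.cache[key], "hit on " + peer.name
        return disk[key], "miss"
```

A cross-controller hit takes the second branch, which avoids the disk read in the final line and therefore avoids the slower path described above.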
In addition, in this embodiment, the first controller may also determine the controller corresponding to the third read data request according to the information carried in that request, and send the landing point information of the third read data request to that controller, so that the prefetch management unit of that controller continues to predict the data to be read by the next read data request and performs the caching operation. This process repeats steps S301 to S308 and is not described again here.
In the embodiments of the present invention, after the first controller receives a read data request sent by the host, it determines the second controller according to the address information carried in the read data request and sends the address information to the second controller. The second controller obtains the target data to be read according to the address information, so that the target data is read into the cache. Because the controller that obtains the target data to be read is determined by the address information carried in the read data request, the read data requests that occur on a given range of logical addresses can be analyzed centrally by one controller. For that range of logical addresses, the information obtained about read data requests is complete, so the target data to be read can be predicted accurately and read into the cache.
Storage System
The storage system provided by the embodiments of the present invention is described below. FIG. 8 is a structural diagram of the storage system 80 provided by an embodiment of the present invention. The storage system includes multiple controllers, each of which includes a processor and a cache, where:
The first controller 801 is configured to receive a read data request sent by the host, the read data request carrying address information; determine the second controller 802 according to the address information carried in the read data request; and send the address information to the second controller 802. Specifically, these operations are performed by the processor in the first controller 801.
Optionally, the address information includes the start address and length of the data to be read. According to the start address of the data to be read and a configured hash algorithm, the controller corresponding to the read data request may be determined to be the second controller 802.
In the embodiments of the present invention, each controller contains at least one prefetch management unit, and the number of prefetch management units is roughly equal across controllers. Each prefetch management unit manages the storage space of one address range, for example a region of a LUN. Specifically, the controller corresponding to the first read data request may be obtained from the LBA of the first data to be read, using a consistent hash algorithm or another hash algorithm, and the prefetch management unit in that controller may then be obtained. When a controller contains one prefetch management unit, determining the controller corresponding to the first read data request also determines the corresponding prefetch management unit. When a controller contains multiple prefetch management units, each prefetch management unit manages the storage space of one address range, so one prefetch management unit in one controller can likewise be uniquely determined from the LBA. For example, the controller corresponding to the first data to be read is the second controller.
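The two-level lookup described above, from LBA to controller and then to a prefetch management unit within that controller, might be sketched as follows. The sketch substitutes a simple modulo rule for the consistent hash algorithm, and `UNITS_PER_CONTROLLER`, `UNIT_RANGE`, and the function name are hypothetical.

```python
UNITS_PER_CONTROLLER = 2   # hypothetical; the text only requires at least one
UNIT_RANGE = 4096          # hypothetical size of the address range per unit
CONTROLLERS = ["A", "B", "C"]


def locate_unit(lba: int) -> tuple:
    """Return (controller, unit index) responsible for a start address."""
    range_index = lba // UNIT_RANGE
    # Stand-in for the hash step: spread address ranges over the controllers.
    controller = CONTROLLERS[range_index % len(CONTROLLERS)]
    # Within a controller, spread its ranges over its prefetch management units.
    unit = (range_index // len(CONTROLLERS)) % UNITS_PER_CONTROLLER
    return controller, unit
```

Because every address in one range maps to the same pair, all read requests on that range are analyzed by a single prefetch management unit.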
Optionally, a hash algorithm may be used to uniquely determine one controller from the start address given as input. The hash algorithm may be a consistent hash algorithm.
Optionally, each controller of the storage system may store a preset configuration table that contains the correspondence between start addresses and controllers. The controller that receives a read data request may look up the start address carried in the request in the configuration table to obtain the controller corresponding to that start address.
Optionally, the configuration table may be stored in only one controller. When another controller receives a read data request, it may send a query request, which includes the start address carried in the read data request, to the controller that stores the configuration table. That controller looks up the start address in the configuration table to obtain the corresponding controller, and returns the query result to the controller that received the read data request.
Optionally, to prevent the configuration table from being lost if the controller that stores it fails, a copy of the configuration table may be stored in another controller of the storage system.
Optionally, the controller corresponding to the start address may also be obtained by a modulo operation. Specifically, the start address is divided by the number of controllers, and the remainder identifies the corresponding controller.
After determining the second controller 802 according to the address information carried in the read data request, the first controller 801 may send the address information to the second controller 802.
The second controller 802 is configured to obtain the address information of the target data to be read according to the received address information, so that the target data is read into a cache according to the address information of the target data. Specifically, these operations are performed by the processor in the second controller 802.
Optionally, after obtaining the address information of the target data to be read, the second controller 802 may read the target data into the cache of the second controller 802 according to the address information of the target data. Alternatively, the second controller 802 may send a prefetch command to the third controller 803 according to the address information of the target data, and the third controller 803 reads the target data into the cache of the third controller 803 according to that address information.
The second controller 802 may use a prefetch management unit to obtain the target data to be read. The specific method is similar to that of the method embodiment described above and is not described again here.
Optionally, the first controller 801 receives the next read data request sent by the host. The data to be read by the next read data request is the target data, or a part of the target data. After receiving the next read data request, the first controller 801 finds that the target data is not stored in its own cache but is stored in the cache of the third controller 803. It may then send a data read command to the third controller 803; the data read command includes the start address and length of the target data and requests the third controller 803 to send the target data. After receiving the data read command, the third controller 803 sends the target data to the first controller 801.
It should be noted that the next read data request is not limited to the request that immediately follows the read data request received by the first controller; any read data request received after that read data request may be called the next read data request.
The data channel between controllers is a high-speed data transmission channel. Statistically, accessing cached data across controllers generally takes less than 1 ms, whereas reading the target data from the disk on a cache miss takes 6 to 10 ms. Therefore, even a cross-controller hit is much faster than reading the data from the disk.
In the embodiments of the present invention, after the first controller receives a read data request sent by the host, it determines the second controller according to the address information carried in the read data request and sends the address information to the second controller. The second controller obtains the target data to be read according to the address information, so that the target data is read into the cache. Because the controller that obtains the target data to be read is determined by the address information carried in the read data request, the read data requests that occur on a given range of logical addresses can be analyzed centrally by one controller. For that range of logical addresses, the information obtained about read data requests is complete, so the target data to be read can be predicted accurately and read into the cache.
本领域普通技术人员将会理解,本发明的各个方面、或各个方面的可能实现方式可以被具体实施为系统、方法或者计算机程序产品。因此,本发明的各方面、或各个方面的可能实现方式可以采用完全硬件实施例、完全软件实施例(包括固件、驻留软件等等),或者组合软件和硬件方面的实施例的形式,在这里都统称为“电路”、“模块”或者“系统”。此外,本发明的各方面、或各个方面的可能实现方式可以采用计算机程序产品的形式,计算机程序产品是指存储在计算机可读介质中的计算机可读程序代码。Those of ordinary skill in the art will understand that various aspects of the present invention, or possible implementations of various aspects, may be embodied as systems, methods or computer program products. Accordingly, aspects of the present invention, or possible implementations of various aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or an embodiment combining software and hardware aspects, described in These are collectively referred to herein as "circuits," "modules," or "systems." In addition, aspects of the present invention, or possible implementations of various aspects, may take the form of computer program products, and computer program products refer to computer-readable program codes stored in computer-readable media.
计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质。计算机可读存储介质包含但不限于电子、磁性、光学、电磁、红外或半导体系统、设备或者装置,或者前述的任意适当组合,如随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或者快闪存储器)、光纤、便携式只读存储器(CD-ROM)。The computer readable medium may be a computer readable signal medium or a computer readable storage medium. Computer-readable storage media include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices, or devices, or any suitable combination of the foregoing, such as random access memory (RAM), read-only memory (ROM), Erase Programmable Read-Only Memory (EPROM or Flash), Fiber Optic, Portable Read-Only Memory (CD-ROM).
The processor in a computer reads the computer-readable program code stored in the computer-readable medium, so that the processor can perform the functional actions specified in each step, or combination of steps, of the flowchart, and produce an apparatus that implements the functional actions specified in each block, or combination of blocks, of the block diagram.
The computer-readable program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. It should also be noted that, in some alternative implementations, the functions noted in the steps of the flowchart or the blocks of the block diagram may occur out of the order noted in the figures. For example, depending on the functionality involved, two steps or two blocks shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in the reverse order.
Claims (12)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2013/084024 WO2015039352A1 (en) | 2013-09-23 | 2013-09-23 | Data caching method and storage system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN103635887A (en) | 2014-03-12 |
| CN103635887B (en) | 2015-07-08 |
Family
ID=50215548
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201380001620.0A (Active; CN103635887B (en)) | 2013-09-23 | 2013-09-23 | Method and storage system for caching data |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN103635887B (en) |
| WO (1) | WO2015039352A1 (en) |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105574008B (en) * | 2014-10-11 | 2020-01-31 | 华为技术有限公司 | Task scheduling method and device applied to distributed file system |
| CN104461943B (en) * | 2014-12-29 | 2017-10-27 | 成都致云科技有限公司 | Method for reading data, device and system |
| CN107844270A (en) * | 2014-12-31 | 2018-03-27 | 华为技术有限公司 | A kind of memory array system and data write request processing method |
| CN108345551B (en) * | 2017-01-23 | 2020-05-12 | 杭州海康威视数字技术股份有限公司 | Data storage method and device |
| WO2019127487A1 (en) * | 2017-12-29 | 2019-07-04 | 华为技术有限公司 | Data prefetching method and apparatus, and storage device |
| WO2020037625A1 (en) * | 2018-08-23 | 2020-02-27 | 袁振南 | Distributed storage system and data read-write method therefor, and storage terminal and storage medium |
| CN111406251B (en) * | 2018-08-24 | 2023-12-08 | 华为技术有限公司 | Data prefetching method and device |
| KR102518095B1 (en) * | 2018-09-12 | 2023-04-04 | 삼성전자주식회사 | Storage device and system |
| CN112199304B (en) * | 2019-07-08 | 2024-04-09 | 华为技术有限公司 | Data pre-fetching method and device |
| CN110928495B (en) * | 2019-11-12 | 2023-09-22 | 杭州宏杉科技股份有限公司 | Data processing method and device on multi-control storage system |
| CN112579479B (en) * | 2020-12-07 | 2022-07-08 | 成都海光微电子技术有限公司 | Processor and method for maintaining transaction order while maintaining cache coherency |
| CN112799589B (en) * | 2021-01-14 | 2023-07-14 | 新华三大数据技术有限公司 | Data reading method and device |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6728258B1 (en) * | 1995-11-15 | 2004-04-27 | Hitachi, Ltd. | Multi-processor system and its network |
| CN1818855A (en) * | 2005-02-09 | 2006-08-16 | 国际商业机器公司 | Method and apparatus for performing data prefetch in a multiprocessor system |
| CN101311894A (en) * | 2007-03-30 | 2008-11-26 | 英特尔公司 | Method and apparatus for speculative prefetching in a multi-processor/multi-core message-passing machine |
| CN101630303A (en) * | 2009-08-24 | 2010-01-20 | 成都市华为赛门铁克科技有限公司 | Request information processing method and device and multiprocessor storage system |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20030195939A1 (en) * | 2002-04-16 | 2003-10-16 | Edirisooriya Samatha J. | Conditional read and invalidate for use in coherent multiprocessor systems |
| CN101201723A (en) * | 2006-12-14 | 2008-06-18 | 英业达股份有限公司 | Virtual disk router system, virtual disk access system and method |
| CN101055511B (en) * | 2007-05-16 | 2010-05-26 | 华为技术有限公司 | A storage array system and its data operation method |
| CN101840309B (en) * | 2009-10-28 | 2011-10-26 | 创新科存储技术有限公司 | Access control method and system of double control disk array in multipath environment |
- 2013-09-23: WO application PCT/CN2013/084024, published as WO2015039352A1 (en), active, application filing
- 2013-09-23: CN application CN201380001620.0A, published as CN103635887B (en), active
Also Published As
| Publication number | Publication date |
|---|---|
| CN103635887A (en) | 2014-03-12 |
| WO2015039352A1 (en) | 2015-03-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN103635887B (en) | | Method and storage system for caching data |
| US9507720B2 (en) | | Block storage-based data processing methods, apparatus, and systems |
| US9304928B2 (en) | | Systems and methods for adaptive prefetching |
| US9164676B2 (en) | | Storing multi-stream non-linear access patterns in a flash based file-system |
| KR102094163B1 (en) | | Apparatus and method for managing cache in memory system based on hybrid chache, and the memory system |
| US9665485B2 (en) | | Logical and physical block addressing for efficiently storing data to improve access speed in a data deduplication system |
| US10860494B2 (en) | | Flushing pages from solid-state storage device |
| KR20170008152A (en) | | Data property-based data placement in nonvolatile memory device |
| US9612975B2 (en) | | Page cache device and method for efficient mapping |
| US9817865B2 (en) | | Direct lookup for identifying duplicate data in a data deduplication system |
| US20130179635A1 (en) | | Method and device for triggering data migration |
| WO2017006674A1 (en) | | Information processing system, storage control device, storage control method, and storage control program |
| WO2017006675A1 (en) | | Information processing system, storage control device, storage control method, and storage control program |
| CN112214420A (en) | | Data caching method, storage control device and storage equipment |
| US20210117316A1 (en) | | Data processing method and apparatus, and storage system |
| CN112256599A (en) | | Data prefetching method and device and storage device |
| KR102180975B1 (en) | | Memory subsystem with wrapped-to-continuous read |
| CN115470157A (en) | | Prefetching method, electronic device, storage medium and program product |
| US9652155B2 (en) | | Computer system, cash data management method, and computer |
| KR102692838B1 (en) | | Enhanced read-ahead capability for storage devices |
| US20130179637A1 (en) | | Data storage backup with lessened cache pollution |
| US10621096B2 (en) | | Read ahead management in a multi-stream workload |
| US20190102288A1 (en) | | Control modules, multi-level data storage devices, multi-level data storage methods, and computer readable media |
| US9063669B2 (en) | | Self-detecting storage bottleneck while handling sequential I/O operations |
| JP6200100B2 (en) | | Computer system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |