CN115543219B - Method, device, equipment and medium for optimizing host IO processing - Google Patents

Info

Publication number
CN115543219B
Authority
CN
China
Prior art keywords
control
control block
page table
host
ring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211508344.4A
Other languages
Chinese (zh)
Other versions
CN115543219A (en)
Inventor
崔健
王江
李树青
李幸远
孙华锦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202211508344.4A
Publication of CN115543219A
Application granted
Publication of CN115543219B
Priority to PCT/CN2023/115975 (WO2024113996A1)
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of storage, and discloses a method, a device, equipment and a medium for optimizing host IO processing. The method comprises the following steps: acquiring the space size and the execution sequence of a control block required by host IO, and establishing a plurality of control block rings according to the space size and the execution sequence; filling control blocks required by the host IO to the plurality of control block rings in batches according to the execution sequence; and starting execution of the control blocks required by the host IO in response to the control blocks required by the host IO completing the first batch filling. The method disclosed by the invention can improve the use efficiency of the storage space, allow IO to balance between performance and efficiency and reduce the processing delay of IO.

Description

Method, device, equipment and medium for optimizing host IO processing

Technical field

The present invention relates to the field of storage, and in particular to a method, device, equipment and medium for optimizing host IO processing.

Background

With the rise of computational storage technology, the computational storage architecture moves data computation from the host CPU to data-processing acceleration units close to the storage units. This reduces the corresponding data movement and thereby frees up a large share of system performance.

In the prior art, in a microcode-driven general-purpose computing acceleration chip architecture, the UAA (Unified Acceleration Architecture) is connected to the host through a PCIe (Peripheral Component Interconnect Express) interface. The UAA is divided into a control plane and a data plane. The control plane is based on a microcode-driven architecture, so that an acceleration task flows step by step between the acceleration engine modules. The main CP (Control Page, control page table) / CB (Control Block) scheme under the UAA architecture has efficiency and performance problems. First, as storage functions become more complex, the number of CBs required by one IO grows markedly, and with it the demand for CP or CP-chain storage space. The number of CPs and CP chains that a given amount of on-chip storage can hold at the same time therefore decreases, which reduces the ability to process IOs in parallel. Second, CB processing times differ greatly: a considerable share of the CBs in the CP or CP chain of one IO need to access DDR (Double Data Rate) memory or disks, or to perform complex computations, all of which take a long time to complete. During execution, the memory occupied by preceding CBs that have already completed serves no purpose, while CBs still waiting to be executed wait longer the closer they are to the tail. Both effects lower memory utilization. Third, from the perspective of IO processing latency, when the first CB is handed to the UAA only after all CBs have been created, latency grows with the number of CBs that must be created.

Summary of the invention

In view of this, the present invention proposes a method, device, equipment and medium for optimizing host IO processing. To improve the utilization of CP storage space and reduce IO processing latency, the method defines a ring control page table (ring CP) mode on top of the existing single-CP and chained-CP definitions. When a CP or CP chain uses the ring CP mode, all CB storage space of one host IO is organized as a ring, called a control block ring (CB Ring). In CB Ring mode, all CBs required by one IO are generated dynamically in batches according to the execution order and filled into the CB Ring, and are reclaimed dynamically after execution, until the complete IO flow has been processed. The storage space of a CB Ring can be smaller than the total space required by the complete CB sequence of one IO, which reduces space occupation. The space of a CB can be reclaimed once the CB has executed, which improves storage utilization. Because CBs are generated and consumed dynamically, execution can begin once only part of the CBs have been generated, which reduces host IO latency.
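The CB Ring behaviour described above (batch filling, in-order execution, reclaiming slots after execution) can be illustrated with a minimal Python sketch. This is a software model under stated assumptions, not the patented hardware design: the names `CBRing`, `run_io`, `capacity` and `batch` are hypothetical, and each CB is assumed to occupy exactly one slot.

```python
class CBRing:
    """Fixed-capacity ring of control-block slots (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.head = 0    # next slot to execute
        self.tail = 0    # next slot to fill
        self.count = 0   # number of occupied slots

    def free(self):
        return self.capacity - self.count

    def fill(self, cb):
        if self.count == self.capacity:
            raise BufferError("ring full")
        self.slots[self.tail] = cb
        self.tail = (self.tail + 1) % self.capacity  # wrap around
        self.count += 1

    def execute_one(self):
        if self.count == 0:
            raise BufferError("ring empty")
        cb = self.slots[self.head]
        self.slots[self.head] = None                 # reclaim the slot
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
        return cb


def run_io(cbs, capacity, batch):
    """Process all CBs of one IO through a ring smaller than the CB sequence.

    Returns the execution order and the peak number of occupied slots.
    Assumes capacity >= 1 and batch >= 1.
    """
    ring = CBRing(capacity)
    pending = list(cbs)
    executed, peak = [], 0
    while pending or ring.count:
        # Fill the next batch, limited by free slots and remaining CBs.
        for _ in range(min(batch, ring.free(), len(pending))):
            ring.fill(pending.pop(0))
        peak = max(peak, ring.count)
        # Execution can start as soon as the first batch is in place.
        executed.append(ring.execute_one())
    return executed, peak
```

In this model a sequence of ten CBs passes through a four-slot ring in its original order, so the ring never needs space for the whole sequence at once.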

To this end, one aspect of the embodiments of the present invention provides a method for optimizing host IO processing, comprising the following steps: acquiring the space size and execution order of the control blocks required by a host IO, and establishing a plurality of control block rings according to the space size and execution order; filling the control blocks required by the host IO into the plurality of control block rings in batches according to the execution order; and, in response to completion of the first batch of filling of the control blocks required by the host IO, starting execution of the control blocks required by the host IO.

In some embodiments, the method further comprises: adding, to the definition of the control page table, a ring-mode indicator and the parameter information required to generate the corresponding control block ring from a ring-mode control page table, thereby obtaining an updated definition of the control page table.

In some embodiments, adding the ring-mode indicator and the parameter information to the definition of the control page table to obtain the updated definition comprises: updating the definition of the header area of the control page table on the basis of determining whether the current control page table is in ring mode and of obtaining the position of the current control page table in the control page linked list; and, in response to the current control page table being a ring-mode control page table, obtaining the parameter information required for that control page table to generate the corresponding control block ring, setting the corresponding parameter information in an additional parameter table based on that information, and setting an additional-parameter-table offset address in the ring-mode control page table, from which the start position of its additional parameter table is obtained.

In some embodiments, updating the definition of the header area of the control page table comprises: updating the attribute determination of the control page table on the basis of determining whether the current control page table is in ring mode; and, in response to the current control page table being a ring-mode control page table, obtaining its position in the control page linked list and setting, in the ring-mode control page table, a pointer to the address of the first control block of the next control page table in the control page linked list.
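As a rough illustration of the header fields named above (ring-mode indicator, position in the linked list, pointer to the first control block of the next control page table, and additional-parameter-table offset), the following Python sketch models a CP header. All field names, the `header_size` parameter and the choice to wrap the last CP back to the first are assumptions made for the example; the text does not specify a layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CPHeader:
    base_addr: int                             # start address of this CP
    header_size: int                           # bytes before the first CB slot (assumed)
    ring_mode: bool = False                    # ring-mode indicator
    position: int = 0                          # position in the CP linked list
    next_first_cb: Optional[int] = None        # address of the next CP's first CB
    param_table_offset: Optional[int] = None   # offset of the additional parameter table

    def first_cb_addr(self) -> int:
        return self.base_addr + self.header_size


def link_ring(cps: List[CPHeader]) -> List[CPHeader]:
    """Mark every CP as ring mode and point it at the next CP's first CB.

    Assumption: the last CP wraps around to the first one.
    """
    n = len(cps)
    for i, cp in enumerate(cps):
        cp.ring_mode = True
        cp.position = i
        cp.next_first_cb = cps[(i + 1) % n].first_cb_addr()
    return cps
```

Storing the address of the next CP's first control block, rather than the address of the CP itself, lets an engine jump straight to the next executable CB without parsing another header.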

In some embodiments, acquiring the space size and execution order of the control blocks required by the host IO and establishing a plurality of control block rings according to the space size and execution order comprises: establishing a plurality of ring-mode control page tables based on the updated definition of the control page table and on the acquired space size and execution order of the control blocks required by the host IO, and generating the corresponding plurality of control block rings from the plurality of ring-mode control page tables.

In some embodiments, establishing the plurality of control block rings further comprises: setting, following the concatenation order, an additional parameter table in the control block storage area of the last control page table in the control page linked list, the additional parameter table being used to store the parameter information required by the control block rings generated from ring-mode control page tables.

In some embodiments, setting the additional parameter table in the control block storage area of the last control page table comprises: in response to the size of the additional parameter table exceeding the control block storage area of the last control page table, occupying the control block storage areas of the preceding control page tables one by one, in reverse concatenation order.
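The placement rule above (start in the last control page table and spill backwards into earlier ones) can be sketched as a small allocation routine. The function name `place_param_table` and the representation of each CB storage area as a byte count are assumptions made for illustration.

```python
def place_param_table(cb_area_sizes, table_size):
    """Place the additional parameter table starting from the last CP.

    cb_area_sizes: CB storage area size (bytes) of each CP, in chain order.
    table_size:    size (bytes) of the additional parameter table.
    Returns {cp_index: bytes_used}, filled from the last CP backwards.
    """
    placement = {}
    remaining = table_size
    for idx in range(len(cb_area_sizes) - 1, -1, -1):
        if remaining <= 0:
            break
        used = min(remaining, cb_area_sizes[idx])
        placement[idx] = used
        remaining -= used
    if remaining > 0:
        raise ValueError("parameter table does not fit in the CP chain")
    return placement
```

With three CPs of 400 bytes of CB storage each, a 300-byte table stays in the last CP, while a 500-byte table fills the last CP and takes 100 bytes from the one before it.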

In some embodiments, establishing the plurality of control block rings further comprises: setting control block types corresponding to the mode of the control page table; and establishing status flags in the control block header, so that different processing logic is executed according to the status flags of different control blocks.

In some embodiments, establishing the status flags in the control block header comprises: setting a valid flag, a completion flag and an end flag for the control block, wherein the completion flag and the end flag exist when the valid flag is set.

In some embodiments, setting the control block types corresponding to the mode of the control page table comprises: setting a loopback control block flag, which indicates that the current control block is located at the tail of the linear address range of the control block ring's storage space; and setting an active-generation control block flag, which indicates that the control blocks following the current control block have not yet been generated, and, in response to a control block carrying the active-generation flag being invoked, setting a generation engine for that control block, the generation engine being used to generate the subsequent control blocks and fill them into the corresponding control block ring.
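A compact way to picture the status flags and control block types described above is a bit-flag enumeration. The sketch below is illustrative only: the bit values, the names `CBFlag`, `is_set` and `next_slot`, and the rule that the other flags are read only when the valid flag is set are assumptions, not a layout taken from the patent.

```python
from enum import IntFlag


class CBFlag(IntFlag):
    VALID = 0x01     # CB contents are valid
    DONE = 0x02      # CB has finished executing
    END = 0x04       # last CB of the IO
    LOOPBACK = 0x08  # CB sits at the tail of the ring's linear address range
    GENERATE = 0x10  # CBs after this one have not been generated yet


def is_set(flags, flag):
    """A flag is only meaningful when VALID is also set (assumed rule)."""
    return bool(flags & CBFlag.VALID) and bool(flags & flag)


def next_slot(index, flags):
    """Slot index of the CB that follows the CB at `index`.

    Returns None after the end CB and wraps to slot 0 after a loopback CB.
    """
    if is_set(flags, CBFlag.END):
        return None
    if is_set(flags, CBFlag.LOOPBACK):
        return 0
    return index + 1
```

An engine that meets a CB for which `is_set(flags, CBFlag.GENERATE)` holds would hand control to a generation engine before continuing, which is the point at which the next batch of CBs is produced.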

In some embodiments, filling the control blocks required by the host IO into the plurality of control block rings in batches according to the execution order comprises: creating a first batch of control page tables for the host IO according to the firmware's parsing of the host IO command, and filling control blocks, in execution order, into the control block rings corresponding to the first batch of control page tables; and having the firmware pass the address of the first filled control block to the work queue of the corresponding engine, where it waits in line.

In some embodiments, starting execution of the control blocks required by the host IO in response to completion of the first batch of filling comprises: in response to detecting the active-generation control block flag, sending a notification to the firmware through the work queue management engine so as to complete the first batch of filling of control blocks, reclaiming the space corresponding to the control page tables of the first batch, and starting execution of the control blocks required by the host IO.

Another aspect of the embodiments of the present invention provides a device for optimizing host IO processing, comprising the following modules: a first module configured to acquire the space size and execution order of the control blocks required by a host IO and to establish a plurality of control block rings according to the space size and execution order; a second module configured to fill the control blocks required by the host IO into the plurality of control block rings in batches according to the execution order; and a third module configured to start execution of the control blocks required by the host IO in response to completion of the first batch of filling of those control blocks.

Another aspect of the embodiments of the present invention provides a computer device comprising at least one processor and a memory, the memory storing computer instructions executable on the processor, wherein the instructions, when executed by the processor, implement the steps of any of the methods above.

Yet another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of any of the methods above.

The present invention has at least the following beneficial effects. The invention proposes a method, device, equipment and medium for optimizing host IO processing. By establishing control block rings, the method allows the storage space of a control block ring to be smaller than the total space required by the complete control block sequence of one host IO, which reduces the storage space needed for control page tables. The control block ring generated under a ring-mode control page table is sized at the granularity of control page tables and is therefore elastic, allowing the host IO to balance performance against efficiency. Control blocks running on the control block ring of a ring-mode control page table release their space after execution, which improves storage utilization. Because control blocks are generated and reclaimed dynamically, only part of the control blocks of a host IO need to be generated before the IO can be served, which reduces IO processing latency. Furthermore, the loopback control block flag and the active-generation control block flag allow CB execution and CB generation to proceed concurrently, so that the time spent generating CBs is hidden behind the execution of other CBs.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other embodiments from these drawings without creative effort.

FIG. 1 is a schematic diagram of a general computing architecture;

FIG. 2 is a schematic diagram of an embodiment of a prior-art method for processing host IO;

FIG. 3 is a schematic diagram of the structure of a single control page table in the prior art;

FIG. 4 is a schematic diagram of the structure of a chained control page table in the prior art;

FIG. 5 is a schematic diagram of an embodiment of a method for optimizing host IO processing provided by the present invention;

FIG. 6 is a schematic diagram of another embodiment of a method for optimizing host IO processing provided by the present invention;

FIG. 7 is a schematic diagram of active triggering and triggering scenarios in a method for optimizing host IO processing provided by the present invention;

FIG. 8 is a schematic diagram of an embodiment of a device for optimizing host IO processing provided by the present invention;

FIG. 9 is a schematic diagram of an embodiment of a computer device provided by the present invention;

FIG. 10 is a schematic diagram of an embodiment of a computer-readable storage medium provided by the present invention.

Detailed description

Embodiments of the present invention are described below. It should be understood, however, that the disclosed embodiments are merely examples and that other embodiments may take various alternative forms.

In addition, it should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two different entities or two different parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; subsequent embodiments will not repeat this point. The terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such a process, method, article or device.

One or more embodiments of the present application are described below with reference to the drawings.

Refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of a general computing architecture, and FIG. 2 is a schematic diagram of an embodiment of a prior-art method for processing host IO. In the general computing architecture (UAA), the size of a control page table (CP) can be 512 bytes, 1024 bytes or 2048 bytes. Apart from the CP header, the data cache and a backup of the original host IO command, the remaining space is used to store control blocks (CBs). All CPs are placed in one contiguous memory region, forming a CP resource pool. To simplify management of the pool, the CP granularity within a single resource pool must be uniform, that is, only one CP size can be selected. Depending on the type of application engine, a CB can be 16, 32, 64 or 128 bytes, so the number of CBs that a single CP can hold is determined jointly by the CP size and the CB size. In some complex application scenarios, the CB chain lengths of different classes of IO differ too much, which makes choosing the CP size difficult: if the CP is too small, it cannot hold a long CB chain; if the CP granularity is too large, CP resources are wasted.
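The sizing trade-off above comes down to simple arithmetic, sketched below. The CP and CB sizes are the ones listed in the text; the `overhead` parameter (CP header, data cache and IO command backup combined) is a hypothetical value because the text does not give those sizes, and the sketch assumes every CP in a chain has the same overhead.

```python
import math


def cbs_per_cp(cp_size, cb_size, overhead):
    """Number of CBs of one size that fit in a CP after fixed overhead."""
    usable = cp_size - overhead
    if usable < 0:
        raise ValueError("overhead exceeds CP size")
    return usable // cb_size


def cps_needed(num_cbs, cp_size, cb_size, overhead):
    """CPs required to hold a CB chain of num_cbs blocks (simplified)."""
    per_cp = cbs_per_cp(cp_size, cb_size, overhead)
    if per_cp == 0:
        raise ValueError("CP too small to hold any CB")
    return math.ceil(num_cbs / per_cp)
```

With an assumed overhead of 128 bytes, a 512-byte CP holds six 64-byte CBs, so a 20-CB chain needs four such CPs, whereas a single 2048-byte CP would hold it with space left over. This is the small-versus-large granularity tension the paragraph describes.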

A typical host IO passes through the UAA as follows. First, the AEM (Acceleration Engine Manager, the host interface management engine) fetches the original IO request according to a given interface protocol and notifies the firmware through a hardware event queue managed by the WQS (Work Queue Scheduler, the work queue scheduling engine). After being notified, the firmware parses the IO command, creates a CP for this IO operation, and fills the CBs for the individual steps into the CP as required. The firmware then passes the address of the first CB, CB1 (4 bytes wide), to the WQS, which places the CB1 address in the work queue of the corresponding engine to wait in line. When that engine starts executing CB1, it performs the acceleration operation specified by CB1's configuration information. On completion, it returns the execution status to the WQS (optionally notifying the firmware), prepares the entry address of the next CB, CB2, and returns it to the WQS as well, where it waits in the work queue of its corresponding engine, and so on. The whole flow runs under the control of the WQS hardware, and the firmware takes part only when necessary. When the last CB, CBX, completes, a response must be sent to the host and the firmware must be notified to reclaim the CP space.
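The hand-off between the WQS and the engines described above can be modelled as per-engine work queues that pass CB addresses along the chain. This is a toy software model, not the hardware behaviour: the engine names, the dictionary representation of CBs and the function `run_cb_chain` are assumptions made for the example.

```python
from collections import deque


def run_cb_chain(cbs, first_addr):
    """Walk a CB chain through per-engine work queues.

    cbs: dict mapping CB address -> (engine_name, next_cb_addr or None).
    Returns the list of (engine_name, cb_addr) pairs in execution order.
    """
    queues = {}   # one work queue per engine, managed by the scheduler
    trace = []
    addr = first_addr
    while addr is not None:
        engine, next_addr = cbs[addr]
        queue = queues.setdefault(engine, deque())
        queue.append(addr)           # scheduler enqueues the CB address
        running = queue.popleft()    # engine takes the CB and executes it
        trace.append((engine, running))
        addr = next_addr             # engine reports the next CB's entry address
    return trace
```

Only addresses move between queues in this model, which mirrors the text's point that the firmware supplies the first CB address and the rest of the flow needs no firmware involvement.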

Building on FIG. 1 and FIG. 2, refer to FIG. 3 and FIG. 4. FIG. 3 is a schematic diagram of the structure of a single control page table in the prior art, and FIG. 4 is a schematic diagram of the structure of a chained control page table in the prior art. As shown in FIG. 3 and FIG. 4, CPs are generally divided into ordinary CPs and chained CPs, the chained CP being introduced on top of the single CP. An ordinary CP adds to the single-CP definition an 8-byte address pointer to the next chained CP. A chained CP has the same size as an ordinary CP; apart from the CP header, its remaining space is used to store CBs. The chained CP header contains three additional 8-byte chain pointers: one to the next chained CP (if any), one to the previous CP, and one directly to the first ordinary CP (to make it easy to find the data cache and similar information). Each CB also records its position in the whole CP chain, so that an engine can quickly locate the information it needs during processing. As the UAA architecture has developed, however, the single CP and the chained CP have run into efficiency and performance problems. First, as storage functions become more complex, the number of CBs required by one IO grows markedly, and with it the demand for CP or CP-chain storage space. The number of CPs and CP chains that a given amount of on-chip storage can hold at the same time therefore decreases, which reduces the ability to process IOs in parallel. Second, CB processing times differ greatly: a considerable share of the CBs in the CP or CP chain of one IO need to access DDR (Double Data Rate) memory or disks, or to perform complex computations, all of which take a long time to complete. During execution, the memory occupied by preceding CBs that have already completed serves no purpose, while CBs still waiting to be executed wait longer the closer they are to the tail. Both effects lower memory utilization. Third, from the perspective of IO processing latency, when the first CB is handed to the UAA only after all CBs have been created, latency grows with the number of CBs that must be created.

To that end, a first aspect of the embodiments of the present invention provides an embodiment of a method for optimizing host IO processing. Fig. 5 is a schematic diagram of an embodiment of the method for optimizing host IO processing provided by the present invention. As shown in Fig. 5, the method includes the following steps:

S1. Obtain the space size and execution order of the control blocks required by a host IO, and establish multiple control block rings according to that space size and execution order;

S2. Fill the control blocks required by the host IO into the multiple control block rings in batches, following the execution order;

S3. In response to completion of the first batch of filling of the control blocks required by the host IO, start executing the control blocks required by the host IO.
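Steps S1 to S3 can be sketched roughly as follows. This is a minimal illustration only: the function name `plan_batches`, the idea of counting ring capacity in whole CB slots, and the CB labels are assumptions made for the example and are not defined by the patent.

```python
def plan_batches(cb_sequence, ring_slots):
    """Split CBs (already in execution order) into batches that fit the ring.

    cb_sequence: list of CB labels in execution order (S1 supplies the order).
    ring_slots:  hypothetical number of CB slots available in the CB ring.
    """
    if ring_slots <= 0:
        raise ValueError("ring_slots must be positive")
    return [cb_sequence[i:i + ring_slots]
            for i in range(0, len(cb_sequence), ring_slots)]


# A host IO that needs 10 CBs, served by a ring with only 4 slots (S2).
cbs = [f"CB{i}" for i in range(1, 11)]
batches = plan_batches(cbs, ring_slots=4)

# S3: execution may begin as soon as the first batch is in place,
# i.e. after 4 CBs have been created rather than all 10.
cbs_created_before_start = len(batches[0])
```

The point of the sketch is the last line: the number of CBs that must exist before execution starts is bounded by the batch size, not by the length of the whole CB sequence.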

Fig. 6 is a schematic diagram of another embodiment of the method for optimizing host IO processing provided by the present invention. As shown in Fig. 6, a ring CP mode is proposed on top of ordinary CPs and chained CPs, and the existing CP and CB definitions are extended accordingly. Specifically, a ring CP mode flag is added to the CP attributes to indicate whether a CP is in ring CP mode. In addition, because dynamically generating CBs in ring CP mode may require further parameters, the definition of an additional parameter table is added to the CP page definition. The modified single CP page table may be 512 bytes, 1 KB, or 2 KB in size and mainly consists of four parts, plus a CB Ring control header when the CP is in ring CP mode.

1. A 128-byte CP header area, including:

(1) CP attributes: the type of the CP, along with attribute information such as whether the CP belongs to a CP linked list, whether it is in ring CP page mode, and its position in the linked list;

(2) The serial number of the current CP, assigned when the CP is created;

(3) NVMe (Non-Volatile Memory Express) queue information, comprising a 2-byte completion queue ID, a 2-byte submission queue ID, and 4 bytes of submission queue head information;

(4) If the CP is part of a CP linked list, 8 bytes holding the address pointer to the first CB of the next chained CP;

(5) Four 64-bit timestamps recording key points in time during execution of the CP;

(6) 64 bytes of space reserved for firmware use.

2. When the CP attribute indicates ring CP page mode, a 32-byte CB Ring control header, comprising:

(1) A 4-byte additional parameter table offset, giving the start address of the additional parameter table within a single CP as a relative offset from the address of the first CP;

(2) 28 bytes of reserved space.

3. A CB storage area of variable length. For CP sizes of 512 bytes, 1 KB, and 2 KB, the CB storage area is 192, 704, and 1728 bytes respectively when the CP is a ring CP page table, and 224, 736, and 1760 bytes respectively in the other modes.

4. A 64-byte data cache pointer area, used to point to the shared data cache.

5. A 64-byte backup area for the original NVMe admin and IO command, so that the firmware can step in and recover from errors when an exception occurs.

Specifically, as shown in Table 1 below:

Table 1
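The size relationships in the layout above can be checked with a short sketch. It assumes that the ring-mode CB storage area is simply the non-ring area minus the 32-byte CB Ring control header, which is what the 512-byte and 2 KB figures imply; on that assumption the 1 KB non-ring figure is taken as 736 bytes (704 + 32). The field order and little-endian encoding used in `pack_nvme_queue_info` are also assumptions, since the text gives only the field widths.

```python
import struct

CB_RING_HEADER = 32  # bytes; present only in ring CP mode (from the text)

# CB storage area in non-ring modes, keyed by CP size in bytes.
# 736 for the 1 KB page is an assumption: the value implied by 704 + 32.
CB_AREA_NON_RING = {512: 224, 1024: 736, 2048: 1760}


def cb_area_size(cp_size, ring_mode):
    """Return the CB storage area size in bytes for a given CP size."""
    base = CB_AREA_NON_RING[cp_size]
    return base - CB_RING_HEADER if ring_mode else base


def pack_nvme_queue_info(cq_id, sq_id, sq_head):
    """Pack the 8 bytes of NVMe queue information held in the CP header.

    Assumed order: 2-byte completion queue ID, 2-byte submission queue ID,
    4-byte submission queue head, little-endian with no padding.
    """
    return struct.pack("<HHI", cq_id, sq_id, sq_head)


ring_sizes = [cb_area_size(size, ring_mode=True) for size in (512, 1024, 2048)]
queue_info = pack_nvme_queue_info(cq_id=1, sq_id=2, sq_head=3)
```

Under these assumptions `ring_sizes` reproduces the ring-mode figures given in item 3, and `queue_info` is exactly 8 bytes long.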

In a chained CP, the additional parameter table is placed in the CB storage area, starting from the last CP. If the parameter table is large, it occupies the CB storage areas of the preceding CPs in reverse order, working backward from the last CP. The CB storage areas left over in the CP chain form the CB ring.
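The reverse-order placement can be illustrated with a small sketch. The function name `place_param_table` and the byte-granular accounting are assumptions for illustration; the patent states only that the table starts in the last CP and spills backward into preceding CPs.

```python
def place_param_table(cb_area_sizes, table_size):
    """Place the additional parameter table starting from the last CP.

    cb_area_sizes: CB storage area size (bytes) of each CP, in chain order.
    table_size:    size of the additional parameter table in bytes.
    Returns (used, ring_capacity), where used[i] is the number of bytes of
    CP i taken by the table and ring_capacity is what remains for the CB ring.
    """
    total = sum(cb_area_sizes)
    if table_size > total:
        raise ValueError("parameter table does not fit in the CP chain")
    used = [0] * len(cb_area_sizes)
    remaining = table_size
    # Walk the chain backward: last CP first, then its predecessors.
    for i in range(len(cb_area_sizes) - 1, -1, -1):
        if remaining == 0:
            break
        take = min(remaining, cb_area_sizes[i])
        used[i] = take
        remaining -= take
    return used, total - table_size


# Three 512-byte ring-mode CPs (192-byte CB areas) and a 250-byte table.
used, ring_capacity = place_param_table([192, 192, 192], 250)
```

Here the table fills the last CP's CB area completely and takes 58 bytes of the middle CP, leaving 326 bytes across the chain for the CB ring.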

The changes to the CB consist of new flag bits in the CB header and two new CB types. The flag bits are:

CB valid flag: 1 marks a valid CB, 0 an invalid CB;

CB completion flag: 1 marks a CB that has finished executing, 0 a CB that has not yet executed;

Last CB flag: 1 marks the last CB in the CP or CP chain that handles one IO flow, 0 any other CB.

The CB completion flag and the Last CB flag are meaningful only when the CB valid flag is 1.
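A minimal sketch of how these three flags might be decoded is shown below. The bit positions and the names `CBFlag` and `decode_cb_flags` are assumptions; the patent defines the meaning of the flags but not their encoding.

```python
from enum import IntFlag


class CBFlag(IntFlag):
    # Bit assignments are hypothetical; only the flag meanings come from the text.
    VALID = 0x1  # CB valid flag
    DONE = 0x2   # CB completion flag
    LAST = 0x4   # Last CB flag


def decode_cb_flags(header_flags):
    """Decode the flag bits, ignoring DONE and LAST when VALID is clear."""
    flags = CBFlag(header_flags)
    valid = CBFlag.VALID in flags
    return {
        "valid": valid,
        "done": valid and CBFlag.DONE in flags,
        "last": valid and CBFlag.LAST in flags,
    }


all_set = decode_cb_flags(0x7)   # valid, completed, last
stale = decode_cb_flags(0x6)     # DONE and LAST set, but VALID clear
```

The second call shows the rule stated above: with the valid flag at 0, the other two flags carry no meaning and are reported as false.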

Beyond these changes to the CB definition, and in order to support ring CP mode, the ring may contain two new CB types in addition to the existing ones:

Loopback CB, or CB Return (CB_R): indicates that the current CB is at the tail of the linear address range of the CB Ring storage space, so the next CB must be fetched from the start of the CB storage space of the first CP. CB_R is mandatory in ring CP mode.

CB-generating CB, or CB Builder (CB_B): when it appears in the CB Ring, the CBs that follow it have not yet been generated. When the WQS (Work Queue Scheduler) schedules this CB, it hands it to the CB generation engine (implemented in software or hardware), which generates the subsequent CBs and fills them into the CB Ring. CB_B is optional in ring CP mode.

Fig. 7 is a schematic diagram of the active and passive trigger scenarios in the method for optimizing host IO processing provided by the present invention. Under the UAA framework, the WQS distributes CBs, a dedicated software or hardware engine processes each CB, and the engine returns the processing result and the address of the next CB to the WQS. In CB Ring mode, when the hardware engine processing a CB submits the next CB, the CB valid flag, CB completion flag, and Last CB flag in the header of that next CB are checked to select among the different paths, such as CB filling and CB processing. The processing logic is as follows:

1. On receiving the processing result of the current CB, the WQS first checks whether the current CB is the last CB, that is, whether its Last CB flag is 1. If the flag is 1, the IO flow ends and the CP is reclaimed; otherwise the WQS checks whether the CB following the current CB is valid.

2. To check whether the next CB is valid, the WQS tests whether the valid flag of the next CB is 1. If the valid flag is 0, the IO flow has not ended but the subsequent CBs are missing, so the firmware is notified to fill them: the firmware fills the subsequent CBs, updates the CB Ring, and hands the next CB to the WQS for scheduling. This is passively triggered CB construction and reclamation. If the valid flag is 1, the WQS checks whether the next CB has already completed.

3. To check whether the next CB has already completed, the WQS tests whether the completion flag of the next CB is 1. If the completion flag is 1, the slot still holds a CB that has already executed, so again the IO flow has not ended but the subsequent CBs are missing. The firmware is notified, fills the subsequent CBs, updates the CB Ring, and hands the next CB to the WQS for scheduling. This is also passively triggered CB construction and reclamation. If the completion flag is not 1, the WQS checks whether the next CB is a CB_B.

4. If the next CB is a CB_B, the WQS dispatches the CB_B to the CB generation engine (software or hardware), which fills in the subsequent CBs, reports completion to the WQS, and hands the subsequent CBs to the WQS for scheduling. This is actively triggered CB construction and reclamation. If the next CB is not a CB_B, it is dispatched to the corresponding engine according to its contents.
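The four checks above amount to a small decision function. The sketch below mirrors that order of checks; the class `CB`, the enum `Action`, and the string label `"CB_B"` are hypothetical names introduced for illustration, not part of the UAA definition.

```python
from enum import Enum, auto


class Action(Enum):
    FINISH_IO = auto()        # last CB done: end the IO flow, reclaim the CP
    NOTIFY_FIRMWARE = auto()  # passive trigger: firmware fills subsequent CBs
    RUN_CB_BUILDER = auto()   # active trigger: CB generation engine handles CB_B
    DISPATCH = auto()         # ordinary CB: send to its processing engine


class CB:
    """Hypothetical in-memory view of the CB header fields used by the checks."""

    def __init__(self, valid=False, done=False, last=False, kind="normal"):
        self.valid = valid
        self.done = done
        self.last = last
        self.kind = kind  # "normal" or "CB_B" (labels are assumptions)


def schedule_next(current, nxt):
    """Decide what the scheduler does after `current` completes."""
    if current.last:              # step 1
        return Action.FINISH_IO
    if not nxt.valid:             # step 2: slot was never filled
        return Action.NOTIFY_FIRMWARE
    if nxt.done:                  # step 3: slot holds an already executed CB
        return Action.NOTIFY_FIRMWARE
    if nxt.kind == "CB_B":        # step 4: active trigger
        return Action.RUN_CB_BUILDER
    return Action.DISPATCH


running = CB(valid=True)
decisions = [
    schedule_next(CB(valid=True, last=True), CB()),
    schedule_next(running, CB(valid=False)),
    schedule_next(running, CB(valid=True, done=True)),
    schedule_next(running, CB(valid=True, kind="CB_B")),
    schedule_next(running, CB(valid=True)),
]
```

Steps 2 and 3 lead to the same action; they differ only in why the slot is unusable (never filled versus already consumed).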

When the CB engine finds that the next CB is a CB_R, additional handling is needed: it relocates to the first CB of the first CP and submits that CB to the WQS as the next CB. Dynamic CB generation can be implemented by the firmware of the existing UAA architecture, simply by generating CBs in batches instead of all at once. To improve real-time behavior and reduce the effect of timing uncertainty in software execution, a dedicated hardware engine can also be designed and implemented for dynamic CB creation and reclamation.
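The CB_R handling can be pictured as index arithmetic on a linear array of slots. In this sketch the ring is modeled as a Python list of CB kinds in address order with CB_R in the last slot; the function name `resolve_next` and this list representation are assumptions for illustration.

```python
def resolve_next(ring, index):
    """Return the index of the CB to submit after ring[index].

    ring:  list of CB kinds in linear address order; the final entry is
           expected to be "CB_R" in ring CP mode.
    index: position of the CB that has just been processed.
    """
    nxt = index + 1
    if nxt >= len(ring):
        raise IndexError("ran past the ring without meeting CB_R")
    if ring[nxt] == "CB_R":
        # Relocate to the first CB of the first CP.
        return 0
    return nxt


ring = ["read", "xor", "write", "CB_R"]
after_first = resolve_next(ring, 0)   # ordinary step forward
after_tail = resolve_next(ring, 2)    # next slot is CB_R, so wrap to slot 0
```

The CB_R entry itself is never executed as work; it only tells the engine where the linear storage ends.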

A typical IO command flow in ring CP mode is as follows. First, the AEM fetches the original IO request according to the applicable interface protocol and notifies the firmware through the hardware event queue managed by the WQS. Once notified, the firmware parses the IO command, creates the first batch of CPs for this IO operation, and fills the CBs of the individual steps into the CPs as required. The firmware then passes the address of the first CB, CB1 (4 bytes wide), to the WQS, which places that address in the work queue of the corresponding engine to wait its turn. When an engine processing a CB detects a CB_B, the CB_B is submitted through the WQS to the CB generation engine. When an engine processing a CB detects a passive CB fill trigger, a message is sent to the firmware through the WQS and the firmware performs the fill; once filled, the CBs are again handed to the WQS for distribution. When the last CB, CBX, completes, a response is sent to the host and the firmware is notified to reclaim the CP space. Active and passive triggers may both occur in one IO flow, or only one of them may occur.
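The flow above can be simulated in a few lines to show a ring with fewer slots than the IO has CBs. This sketch models only the passive trigger (firmware refills when the next slot is empty or already executed); slot counts, the dictionary representation of a CB, and the name `simulate_ring_io` are assumptions for illustration.

```python
def simulate_ring_io(total_cbs, ring_slots):
    """Run total_cbs CBs through a ring of ring_slots usable slots.

    Returns (order, refills): the CB ids in execution order and the number
    of passive firmware refills after the initial fill.
    """
    if total_cbs < 1 or ring_slots < 1:
        raise ValueError("need at least one CB and one slot")
    slots = [None] * ring_slots   # None models a slot whose valid flag is 0
    next_id = 1

    def fill(start):
        """Firmware fill: reuse empty or completed slots, in ring order."""
        nonlocal next_id
        pos = start
        for _ in range(ring_slots):
            slot = slots[pos]
            if slot is not None and not slot["done"]:
                break             # slot still holds a pending CB
            if next_id > total_cbs:
                break             # nothing left to generate
            slots[pos] = {"id": next_id, "done": False,
                          "last": next_id == total_cbs}
            next_id += 1
            pos = (pos + 1) % ring_slots

    fill(0)                       # first batch; execution starts after this
    order, refills, pos = [], 0, 0
    while True:
        cb = slots[pos]
        cb["done"] = True         # executing the CB frees its slot
        order.append(cb["id"])
        if cb["last"]:
            return order, refills
        pos = (pos + 1) % ring_slots   # wrap-around plays the role of CB_R
        nxt = slots[pos]
        if nxt is None or nxt["done"]:
            refills += 1          # passive trigger: notify firmware
            fill(pos)


order, refills = simulate_ring_io(total_cbs=10, ring_slots=4)
```

In this run a 4-slot ring serves a 10-CB IO with two refills, and every CB still executes in its original order.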

With this method, the storage space of a control block ring may be smaller than the total space needed by the complete control block sequence of one host IO, which reduces the storage demand of control page tables. The control block ring generated under a ring-mode control page table is sized at control page table granularity and is therefore elastic, letting a host IO balance performance against efficiency. Control blocks running on the ring release their space once they have executed, which improves storage utilization. Because control blocks are generated and reclaimed dynamically, only part of the control blocks of a host IO need to be generated before the IO can start being served, which lowers IO processing latency. Furthermore, the loopback control block flag and the active generation control block flag allow CB execution and CB generation to proceed concurrently, so that the time spent generating CBs is hidden behind the execution of other CBs.

A second aspect of the embodiments of the present invention provides a device for optimizing host IO processing. Fig. 8 is a schematic diagram of an embodiment of the device for optimizing host IO processing provided by the present invention. As shown in Fig. 8, the device includes: a first module 011, configured to obtain the space size and execution order of the control blocks required by a host IO and to establish multiple control block rings according to that space size and execution order; a second module 012, configured to fill the control blocks required by the host IO into the multiple control block rings in batches, following the execution order; and a third module 013, configured to start executing the control blocks required by the host IO in response to completion of the first batch of filling.

In some embodiments, the device for optimizing host IO processing is further configured to: add, to the definition of the control page table, a ring mode judgment and the parameter information required to generate a corresponding control block ring from a ring-mode control page table, to obtain an updated definition of the control page table.

In some embodiments, the device for optimizing host IO processing is further configured to: update the definition of the header area of the control page table based on judging whether the current control page table is in ring mode and obtaining its position in the control page linked list; and, in response to the current control page table being a ring-mode control page table, obtain the parameter information required to generate the corresponding control block ring from the ring-mode control page table, set the corresponding parameter information in the additional parameter table based on that information, and set the additional parameter table offset address in the ring-mode control page table, so as to obtain the start position of the additional parameter table of the ring-mode control page table.

In some embodiments, the device for optimizing host IO processing is further configured to: update the attribute judgment of the control page table based on judging whether the current control page table is in ring mode; and, in response to the current control page table being a ring-mode control page table, obtain the position of the ring-mode control page table in the control page linked list and set, in the ring-mode control page table, a pointer to the address of the first control block of the next control page table in the control page linked list.

In some embodiments, the first module 011 is further configured to: establish multiple ring-mode control page tables based on the updated definition of the control page table and the obtained space size and execution order of the control blocks required by the host IO, and generate the corresponding multiple control block rings from the multiple ring-mode control page tables.

In some embodiments, the first module 011 is further configured to: set an additional parameter table in the control block storage area of the last control page table of the control page linked list, following the chaining order, to hold the parameter information required to generate the corresponding control block ring from the ring-mode control page table.

In some embodiments, the first module 011 is further configured to: in response to the additional parameter table being larger than the control block storage area of the last control page table, occupy the control block storage areas of the preceding control page tables one by one, in reverse chaining order.

In some embodiments, the first module 011 is further configured to: set the control block type corresponding to the mode of the control page table; and establish status flags of the control block in the control block header, so that different processing logic is executed according to the status flags of different control blocks.

In some embodiments, the first module 011 is further configured to: set a control block valid flag, a completion flag, and an end flag, wherein the completion flag and the end flag are present in response to the valid flag being valid.

In some embodiments, the first module 011 is further configured to: set a loopback control block flag to indicate that the current control block is at the tail of the linear address range of the storage space of the control block ring; set an active generation control block flag to indicate that the control blocks after the current control block have not been generated; and, in response to a control block carrying the active generation control block flag being invoked, set a generation engine for that control block to generate the subsequent control blocks and fill them into the corresponding control block ring.

In some embodiments, the second module 012 is further configured to: create the first batch of control page tables for the host IO according to the firmware's parsing of the host IO command, and fill control blocks in execution order into the control block rings corresponding to the first batch of control page tables; and have the firmware pass the address of the first filled control block to the work queue of the corresponding engine to wait its turn.

In some embodiments, the third module 013 is further configured to: in response to detecting the active generation control block flag, send a notification to the firmware through the work queue management engine to complete the first batch of control block filling, reclaim the space corresponding to the control page tables of the first batch, and start executing the control blocks required by the host IO.

To that end, a third aspect of the embodiments of the present invention provides a computer device. Fig. 9 is a schematic diagram of an embodiment of the computer device provided by the present invention. As shown in Fig. 9, the embodiment includes: at least one processor 021; and a memory 022 storing computer instructions 023 executable on the processor 021, the computer instructions 023 implementing the steps of the method described above when executed by the processor 021.

The present invention also provides a computer-readable storage medium. Fig. 10 is a schematic diagram of an embodiment of the computer-readable storage medium provided by the present invention. As shown in Fig. 10, a computer-readable storage medium 031 stores a computer program 032 that performs the method described above when executed by a processor.

Finally, it should be noted that those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program of the method for optimizing host IO processing may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. The computer program embodiments can achieve the same or similar effects as any of the corresponding method embodiments.

In addition, the method disclosed in the embodiments of the present invention may also be implemented as a computer program executed by a processor, and the computer program may be stored in a computer-readable storage medium. When the computer program is executed by the processor, it performs the functions defined in the method disclosed in the embodiments of the present invention.

In addition, the above method steps and system units may also be implemented using a controller together with a computer-readable storage medium storing a computer program that causes the controller to implement the functions of those steps or units.

Those skilled in the art will also appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with this disclosure may be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the various illustrative components, blocks, modules, circuits, and steps have been described generally in terms of their functionality. Whether such functionality is implemented as software or as hardware depends on the particular application and the design constraints imposed on the overall system. Those skilled in the art may implement the functionality in various ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope disclosed in the embodiments of the present invention.

In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example and not limitation, computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store the required program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or processor. Also, any connection is properly termed a computer-readable medium. For example, if software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

以上是本发明公开的示例性实施例,但是应当注意,在不背离权利要求限定的本发明实施例公开的范围的前提下,可以进行多种改变和修改。根据这里描述的公开实施例的方法权利要求的功能、步骤和/或动作不需以任何特定顺序执行。此外,尽管本发明实施例公开的元素可以以个体形式描述或要求,但除非明确限制为单数,也可以理解为多个。The above are the exemplary embodiments disclosed in the present invention, but it should be noted that various changes and modifications can be made without departing from the scope of the disclosed embodiments of the present invention defined in the claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. In addition, although the elements disclosed in the embodiments of the present invention may be described or required in an individual form, they may also be understood as a plurality unless explicitly limited to a singular number.

应当理解的是,在本文中使用的,除非上下文清楚地支持例外情况,单数形式“一个”旨在也包括复数形式。还应当理解的是,在本文中使用的“和/或”是指包括一个或者一个以上相关联地列出的项目的任意和所有可能组合。It should be understood that as used herein, the singular form "a" and "an" are intended to include the plural forms as well, unless the context clearly supports an exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.

上述本发明实施例公开实施例序号仅仅为了描述,不代表实施例的优劣。The serial numbers of the embodiments disclosed in the above-mentioned embodiments of the present invention are only for description, and do not represent the advantages and disadvantages of the embodiments.

本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。Those of ordinary skill in the art can understand that all or part of the steps for implementing the above-mentioned embodiments can be completed by hardware, or can be completed by instructing related hardware through a program, and the program can be stored in a computer-readable storage medium. The above-mentioned The storage medium may be a read-only memory, a magnetic disk or an optical disk, and the like.

Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is exemplary only and is not intended to imply that the scope of the disclosure (including the claims) is limited to these examples. Within the spirit of the embodiments of the present invention, technical features of the above embodiments or of different embodiments may also be combined, and many other variations of the different aspects of the embodiments exist that are not described in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, improvement, or the like made within the spirit and principles of the embodiments of the present invention shall fall within their scope of protection.

Claims (13)

1. A method for optimizing host IO processing, characterized by comprising:
adding, to the definition of a control page table, a ring-mode determination and the parameter information required for a ring-mode control page table to generate a corresponding control block ring, to obtain an updated definition of the control page table;
obtaining the space size and execution order of the control blocks required by a host IO, and establishing a plurality of control block rings according to the space size and the execution order;
filling the control blocks required by the host IO into the plurality of control block rings in batches according to the execution order; and
in response to completion of the first batch of filling of the control blocks required by the host IO, starting execution of the control blocks required by the host IO,
wherein obtaining the space size and execution order of the control blocks required by the host IO, and establishing the plurality of control block rings according to the space size and the execution order, comprises:
setting a control block type corresponding to the mode of the control page table; and
establishing status flags of the control block in the control block header of the control block, so that different processing logic is executed according to the status flags of different control blocks.

2. The method according to claim 1, characterized in that adding, to the definition of the control page table, the ring-mode determination and the parameter information required for the ring-mode control page table to generate the corresponding control block ring, to obtain the updated definition of the control page table, comprises:
updating the definition of the header area of the control page table based on determining whether the current control page table is in ring mode and obtaining the position of the current control page table in a control page linked list; and
in response to the current control page table being a ring-mode control page table, obtaining the parameter information required for the ring-mode control page table to generate the corresponding control block ring, setting the corresponding parameter information in an additional parameter table based on that parameter information, and setting an additional parameter table offset address in the ring-mode control page table, so as to obtain the starting position of the additional parameter table of the ring-mode control page table.

3. The method according to claim 2, characterized in that updating the definition of the header area of the control page table based on determining whether the current control page table is in ring mode and obtaining the position of the current control page table in the control page linked list comprises:
updating the attribute determination of the control page table based on determining whether the current control page table is in ring mode; and
in response to the current control page table being a ring-mode control page table, obtaining the position of the ring-mode control page table in the control page linked list, and setting, in the ring-mode control page table, a pointer to the address of the first control block of the next control page table in the control page linked list.

4. The method according to claim 1, characterized in that obtaining the space size and execution order of the control blocks required by the host IO, and establishing the plurality of control block rings according to the space size and the execution order, further comprises:
establishing a plurality of ring-mode control page tables based on the updated definition of the control page table and the obtained space size and execution order of the control blocks required by the host IO, and generating the corresponding plurality of control block rings from the plurality of ring-mode control page tables.

5. The method according to claim 4, characterized in that obtaining the space size and execution order of the control blocks required by the host IO, and establishing the plurality of control block rings according to the space size and the execution order, further comprises:
setting, following the chaining order, an additional parameter table in the control block storage area of the last control page table of the control page linked list, for storing the parameter information required for the ring-mode control page tables to generate the corresponding control block rings.

6. The method according to claim 5, characterized in that setting, following the chaining order, the additional parameter table in the control block storage area of the last control page table of the control page linked list, for storing the parameter information required for the ring-mode control page tables to generate the corresponding control block rings, comprises:
in response to the parameter volume of the additional parameter table being larger than the control block storage area of the last control page table, successively occupying the control block storage area of each preceding control page table, in reverse of the chaining order.

7. The method according to claim 1, characterized in that establishing the status flags of the control block in the control block header of the control block, so that different processing logic is executed according to the status flags of different control blocks, comprises:
setting a control block valid flag, a completion flag, and an end flag, wherein, in response to the valid flag being valid, the completion flag and the end flag are present.

8. The method according to claim 7, characterized in that setting the control block type corresponding to the mode of the control page table comprises:
setting a loopback control block flag, to indicate that the current control block is located at the tail of the linear address of the storage space of the control block ring; and
setting an active-generation control block flag, to indicate that the control blocks following the current control block have not yet been generated, and, in response to a control block bearing the active-generation flag being invoked, setting a generation engine for that control block, to generate the subsequent control blocks and fill them into the corresponding control block ring.

9. The method according to claim 8, characterized in that filling the control blocks required by the host IO into the plurality of control block rings in batches according to the execution order comprises:
creating a first batch of control page tables for the host IO according to the firmware's parsing of the host IO command, and filling the control blocks, in execution order, into the control block rings corresponding to the first batch of control page tables; and
configuring the firmware to pass the address of the first filled control block to the work queue of the corresponding engine, to queue for execution.

10. The method according to claim 9, characterized in that, in response to completion of the first batch of filling of the control blocks required by the host IO, starting execution of the control blocks required by the host IO comprises:
in response to detecting the active-generation control block flag, sending a notification to the firmware through a work queue management engine to complete the first batch of control block filling, reclaiming the space corresponding to the control page tables of the first batch, and starting execution of the control blocks required by the host IO.

11. An apparatus for optimizing host IO processing, characterized by comprising:
a first module, configured to obtain the space size and execution order of the control blocks required by a host IO, and to establish a plurality of control block rings according to the space size and the execution order;
a second module, configured to fill the control blocks required by the host IO into the plurality of control block rings in batches according to the execution order;
a third module, configured to start execution of the control blocks required by the host IO in response to completion of the first batch of filling of the control blocks required by the host IO; and
a fourth module, configured to add, to the definition of a control page table, a ring-mode determination and the parameter information required for a ring-mode control page table to generate a corresponding control block ring, to obtain an updated definition of the control page table;
wherein the first module is further configured to:
set a control block type corresponding to the mode of the control page table; and
establish status flags of the control block in the control block header of the control block, so that different processing logic is executed according to the status flags of different control blocks.

12. A computer device, characterized by comprising:
at least one processor; and
a memory storing computer instructions executable on the processor, wherein the instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 10.

13. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.
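
As an informal illustration only, and not part of the claims, the batch filling and flag handling described in claims 1 and 7 to 10 can be sketched in Python. This is a minimal sketch under several assumptions: a single ring stands in for the plurality of control block rings, the firmware and the generation engine are collapsed into one `fill_batch` method, and the names `Flag`, `ControlBlock`, `ControlBlockRing`, `fill_batch`, `execute`, and `run_host_io` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from enum import IntFlag
from typing import List


class Flag(IntFlag):
    """Hypothetical encoding of the control block header flags."""
    NONE = 0
    VALID = 1      # slot holds a filled control block
    DONE = 2       # control block has been executed
    END = 4        # last control block of the host IO
    LOOPBACK = 8   # block sits at the tail of the ring's linear address range
    GENERATE = 16  # the blocks after this one have not been generated yet


@dataclass
class ControlBlock:
    op: str = ""
    flags: Flag = Flag.NONE


class ControlBlockRing:
    """One fixed-size ring that is refilled in batches while it executes."""

    def __init__(self, size: int, batch: int, ops: List[str]) -> None:
        if not 0 < batch <= size:
            raise ValueError("batch must be positive and fit in the ring")
        self.slots = [ControlBlock() for _ in range(size)]
        self.batch = batch
        self.pending = list(ops)  # operations still waiting for a control block
        self.head = 0             # next slot to fill

    def fill_batch(self) -> None:
        """Fill up to `batch` control blocks in execution order."""
        for i in range(min(self.batch, len(self.pending))):
            flags = Flag.VALID
            if self.head == len(self.slots) - 1:
                flags |= Flag.LOOPBACK
            op = self.pending.pop(0)
            if not self.pending:
                flags |= Flag.END       # nothing left for this host IO
            elif i == self.batch - 1:
                flags |= Flag.GENERATE  # more blocks must be generated later
            self.slots[self.head] = ControlBlock(op, flags)
            self.head = (self.head + 1) % len(self.slots)

    def execute(self) -> List[str]:
        """Walk the ring from slot 0, refilling when GENERATE is seen."""
        executed: List[str] = []
        pos = 0
        while True:
            cb = self.slots[pos]
            if not cb.flags & Flag.VALID or cb.flags & Flag.DONE:
                break  # nothing runnable in this slot
            executed.append(cb.op)
            cb.flags |= Flag.DONE
            if cb.flags & Flag.END:
                break
            if cb.flags & Flag.GENERATE:
                self.fill_batch()  # stands in for the generation engine
            pos = 0 if cb.flags & Flag.LOOPBACK else pos + 1
        return executed


def run_host_io(ops: List[str], ring_size: int = 4, batch: int = 3) -> List[str]:
    ring = ControlBlockRing(ring_size, batch, ops)
    ring.fill_batch()      # first batch only
    return ring.execute()  # execution starts once the first batch is in place
```

Calling `run_host_io([f"op{i}" for i in range(8)])` runs eight operations through a four-slot ring. Because a new batch is generated only when the last block of the previous batch is reached, every slot being overwritten has already executed, which is the reason the sketch requires `batch <= size`.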
CN202211508344.4A 2022-11-29 2022-11-29 Method, device, equipment and medium for optimizing host IO processing Active CN115543219B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211508344.4A CN115543219B (en) 2022-11-29 2022-11-29 Method, device, equipment and medium for optimizing host IO processing
PCT/CN2023/115975 WO2024113996A1 (en) 2022-11-29 2023-08-30 Optimization method and apparatus for host io processing, device, and nonvolatile readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211508344.4A CN115543219B (en) 2022-11-29 2022-11-29 Method, device, equipment and medium for optimizing host IO processing

Publications (2)

Publication Number Publication Date
CN115543219A (en) 2022-12-30
CN115543219B (en) 2023-04-18

Family

ID=84722749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211508344.4A Active CN115543219B (en) 2022-11-29 2022-11-29 Method, device, equipment and medium for optimizing host IO processing

Country Status (2)

Country Link
CN (1) CN115543219B (en)
WO (1) WO2024113996A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115543219B (en) * 2022-11-29 2023-04-18 苏州浪潮智能科技有限公司 Method, device, equipment and medium for optimizing host IO processing
CN117008843B (en) * 2023-09-26 2024-01-19 苏州元脑智能科技有限公司 Control page linked list construction device and electronic equipment
CN119621637B (en) * 2025-02-11 2025-05-27 山东云海国创云计算装备产业创新中心有限公司 Control block resynchronization method, system, storage medium and electronic device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102681952B (en) * 2012-05-12 2015-02-18 北京忆恒创源科技有限公司 Method for writing data into memory equipment and memory equipment
TWI592865B (en) * 2016-07-22 2017-07-21 大心電子(英屬維京群島)股份有限公司 Data reading method, data writing method and storage controller using the same
CN107145311B (en) * 2017-06-12 2020-06-19 苏州浪潮智能科技有限公司 IO data processing method and system
CN111737002B (en) * 2020-06-24 2022-05-31 苏州浪潮智能科技有限公司 Method, device and equipment for processing chained storage request and readable medium
CN112241310B (en) * 2020-10-21 2023-01-31 海光信息技术股份有限公司 Page table management method, information acquisition method, processor, chip, device and medium
CN113885945B (en) * 2021-08-30 2023-05-16 山东云海国创云计算装备产业创新中心有限公司 Calculation acceleration method, equipment and medium
CN114048149B (en) * 2021-10-30 2025-03-28 山东云海国创云计算装备产业创新中心有限公司 A method, system, storage medium and device for controlling the construction of a page linked list
CN113918101B (en) * 2021-12-09 2022-03-15 苏州浪潮智能科技有限公司 A method, system, device and storage medium for writing data cache
CN115543219B (en) * 2022-11-29 2023-04-18 苏州浪潮智能科技有限公司 Method, device, equipment and medium for optimizing host IO processing

Also Published As

Publication number Publication date
CN115543219A (en) 2022-12-30
WO2024113996A1 (en) 2024-06-06

Similar Documents

Publication Publication Date Title
CN115543219B (en) Method, device, equipment and medium for optimizing host IO processing
CN109783229B (en) Thread resource allocation method and device
CN103019651B (en) The method for parallel processing of complex task and device
JP5088234B2 (en) Message association processing apparatus, method, and program
CN113885945B (en) Calculation acceleration method, equipment and medium
CN111831408A (en) Asynchronous task processing method and device, electronic equipment and medium
CN107077390A (en) A task processing method and network card
CN108008950B (en) Method and device for realizing user interface updating
CN105187327A (en) Distributed message queue middleware
WO2021143590A1 (en) Distributed container image construction scheduling system and method
WO2021022714A1 (en) Message processing method for cross-block chain node, device, apparatus and medium
CN116302448B (en) Task scheduling method and system
CN111371848A (en) A request processing method, apparatus, device and storage medium
CN118295784A (en) A method, device, equipment and storage medium for mimicking scheduling decision
CN115114311A (en) A transaction execution method and related device
CN119883579B (en) Proxy method, electronic device and storage medium for high-frequency polling of operating system
WO2024077881A1 (en) Scheduling method and system for neural network training, and computer-readable storage medium
CN114461410B (en) Method and device for realizing distributed lock, electronic equipment and storage medium
CN104572275B (en) A kind of process loading method, apparatus and system
WO2023273157A1 (en) Workflow generation method and apparatus, and device and storage medium
CN117332881B (en) Distributed training method and electronic equipment
CN116820527B (en) Program upgrading method, device, computer equipment and storage medium
CN116468124B (en) Quantum task scheduling method and related device
CN115866092B (en) Data forwarding method, device, equipment and storage medium
CN102053917A (en) Smart card capable of reducing memory footprint and instruction processing method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 215000 Building 9, No.1 guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Suzhou City, Jiangsu Province

Patentee after: Suzhou Yuannao Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: 215000 Building 9, No.1 guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Suzhou City, Jiangsu Province

Patentee before: SUZHOU LANGCHAO INTELLIGENT TECHNOLOGY Co.,Ltd.

Country or region before: China