
CN118264256A - Data coding method, device, hardware acceleration card, program product and medium - Google Patents

Data coding method, device, hardware acceleration card, program product and medium

Info

Publication number
CN118264256A
Authority
CN
China
Prior art keywords
data
channels
byte
matching
matching information
Prior art date
Legal status
Granted
Application number
CN202410679626.3A
Other languages
Chinese (zh)
Other versions
CN118264256B (en)
Inventor
李逍
赵雅倩
史宏志
张亚强
高飞
陈筱琳
许光远
Current Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202410679626.3A
Publication of CN118264256A
Application granted
Publication of CN118264256B
Legal status: Active (current)

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a data encoding method, a device, a hardware acceleration card, a program product and a medium, relates to the field of data processing, and is used to solve the problem that application scenarios are limited when a hardware acceleration card implements a compression algorithm. The scheme obtains channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction, and configures the number of channels to be the target number of channels; after the computing module determines the matching information according to the data to be encoded and the encoded data, the matching information and the data to be encoded are output to the encoding module through the channels, so that the data to be encoded is encoded according to the matching information. In the invention, when the hardware acceleration card implements the compression algorithm, the number of channels can be dynamically adjusted according to different scenarios and requirements, which improves flexibility and applicability and avoids limiting the application scenarios; by configuring and matching the number of channels, the channels can be used more effectively, the efficiency and performance of the compression algorithm are improved, and the load on the hardware acceleration card is reduced.

Description

A data encoding method, device, hardware acceleration card, program product and medium

Technical Field

The present invention relates to the field of data processing, and in particular to a data encoding method, device, hardware acceleration card, program product and medium.

Background

Data compression is currently implemented mainly in software, but software compression suffers from low compression efficiency, a heavy processor burden and limited security. Hardware acceleration cards acting as coprocessors have therefore become increasingly popular for data compression. Hardware acceleration cards are mainly used to execute the Gzip compression algorithm, which processes repeated strings in the data to be encoded with a fixed-size sliding window and obtains the best matching information through dictionary matching. The matching information is then output to the encoding module through the channels provided by the encoding module, so that the encoding module encodes the data to be encoded according to the matching information.

However, when a hardware acceleration card implements the Gzip algorithm, the matching information output after the computing module calculates it depends on the number of channels provided by the back-end encoding module, and this number of channels is currently usually fixed. As a result, the scenarios in which a hardware acceleration card can implement the compression algorithm are very limited.

Summary of the Invention

The purpose of the present invention is to provide a data encoding method, device, hardware acceleration card, program product and medium. When the hardware acceleration card implements the compression algorithm, the number of channels can be dynamically adjusted according to different scenarios and requirements, which improves flexibility and applicability and avoids limiting the application scenarios. By configuring and matching the number of channels, the channels can be used more effectively, the efficiency and performance of the compression algorithm are improved, and the burden on the hardware acceleration card is reduced.

In a first aspect, the present invention provides a data encoding method applied to a processor in a hardware acceleration card, wherein the hardware acceleration card further includes a computing module and an encoding module, and the data encoding method includes:

obtaining channel quantity configuration information determined based on status parameters of the hardware acceleration card and/or a user's demand instruction;

configuring the number of channels of the encoding module to be a target number of channels according to the channel quantity configuration information;

after the computing module determines matching information according to the data to be encoded and the encoded data, outputting the matching information and the data to be encoded to the encoding module through the target number of channels, triggering the encoding module to encode the data to be encoded according to the matching information;

wherein the matching information is the matching distance and matching length, in the encoded data, of a preset character string that appears in both the data to be encoded and the encoded data.

Wherein, obtaining the channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction includes:

determining a number of test channels according to the user's demand instruction, and configuring the number of channels of the encoding module to be the number of test channels;

estimating a first computing resource occupancy of the hardware acceleration card under the number of test channels;

adjusting the number of test channels according to the first computing resource occupancy to obtain the target number of channels.

Wherein, adjusting the number of test channels according to the first computing resource occupancy to obtain the target number of channels includes:

comparing the first computing resource occupancy with a first preset threshold;

if the first computing resource occupancy is greater than the first preset threshold, reducing the number of test channels to obtain the target number of channels;

if the first computing resource occupancy is less than the first preset threshold, increasing the number of test channels to obtain the target number of channels;

if the first computing resource occupancy is equal to the first preset threshold, determining the number of test channels to be the target number of channels.

Wherein, after comparing the first computing resource occupancy with the first preset threshold, the method further includes:

calculating the difference between the first computing resource occupancy and the first preset threshold, and calculating the ratio of the difference to the first preset threshold;

determining a channel quantity change according to the ratio and a preset ratio-to-change correspondence;

if the first computing resource occupancy is greater than the first preset threshold, reducing the number of test channels to obtain the target number of channels includes:

if the first computing resource occupancy is greater than the first preset threshold, subtracting the channel quantity change from the number of test channels to obtain the target number of channels;

if the first computing resource occupancy is less than the first preset threshold, increasing the number of test channels to obtain the target number of channels includes:

if the first computing resource occupancy is less than the first preset threshold, adding the channel quantity change to the number of test channels to obtain the target number of channels.

Wherein, when it is determined that the first computing resource occupancy is greater than the first preset threshold, the method further includes:

comparing the first computing resource occupancy with a second preset threshold, the second preset threshold being greater than the first preset threshold;

if the first computing resource occupancy is greater than the second preset threshold, feeding back a reconfiguration signal to the user, so that the user redetermines the number of test channels based on the reconfiguration signal and re-enters the step of configuring the number of channels of the encoding module to be the number of test channels.

Wherein, before calculating the difference between the first computing resource occupancy and the first preset threshold, the method further includes:

determining an application platform of the hardware acceleration card, and determining a second computing resource occupancy required by the hardware acceleration card to execute a preset task on the application platform;

determining the remaining computing resources of the hardware acceleration card according to the second computing resource occupancy, wherein the sum of the second computing resource occupancy and the remaining computing resources is not greater than the total computing resources of the hardware acceleration card;

determining the first preset threshold according to the remaining computing resources.

Wherein, obtaining the channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction includes:

determining an application platform of the hardware acceleration card, and determining a second computing resource occupancy required by the hardware acceleration card to execute a preset task on the application platform;

determining the remaining computing resources of the hardware acceleration card according to the second computing resource occupancy, wherein the sum of the second computing resource occupancy and the remaining computing resources is not greater than the total computing resources of the hardware acceleration card;

determining the target number of channels according to the remaining computing resources.

Wherein, obtaining the channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction includes:

obtaining, through a first interface, the channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction;

using an interface conversion device to convert the channel quantity configuration information obtained through the first interface into channel quantity configuration information corresponding to a second interface;

refreshing the channel quantity configuration information corresponding to the second interface into a channel quantity configuration register;

and configuring the number of channels of the encoding module to be the target number of channels according to the channel quantity configuration information includes:

configuring the number of channels of the encoding module to be the target number of channels according to the channel quantity configuration information in the channel quantity configuration register.

Wherein, all bytes in the data to be encoded are divided into N groups, N being an integer greater than 1, and the number of bytes in each group is equal to the target number of channels; determining the matching information according to the data to be encoded and the encoded data includes:

taking each byte in each group as a starting byte, the N groups of bytes simultaneously searching the encoded data for character strings that match strings starting at each byte in the group, the length of a matched character string being at least 2;

recording, for each byte as the starting byte, the matching information of the matched character string, the matching information including the matching length of the matched character string and the matching distance of the first byte of the character string in the encoded data;

determining optimal matching information according to the matching information corresponding to each byte in the N groups of the data to be encoded;

and outputting the matching information and the data to be encoded to the encoding module through the target number of channels includes:

outputting the data result set corresponding to the optimal matching information to the encoding module through the target number of channels, the data result set including the optimal matching information and the data to be encoded.

Wherein, determining the optimal matching information according to the matching information corresponding to each byte in the N groups of the data to be encoded includes:

polling from the first group, and determining the address of the current byte in the next comparison process according to the first matching length of the current byte and the second matching length of the next byte in the current comparison process;

determining the matching information of the current byte in the comparison process that meets a preset end condition as the optimal matching information.
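Purely as an illustrative software model of this group-wise polling (and not the hardware implementation), the following Python sketch walks one group of per-byte matching results and applies the rule above; the data layout, the literal/match output format and the handling of the end condition are assumptions made for illustration.

```python
from typing import List, Tuple

# Per-byte matching information for one group: (match_length, match_distance),
# where match_length < 2 means "no usable match, emit the byte as a literal".
MatchInfo = Tuple[int, int]

def poll_group(group: List[MatchInfo]):
    """Walk one group (group size == target channel count) and emit either the
    current byte as a literal or its match, advancing the current-byte address
    according to the first/second matching-length comparison."""
    outputs = []
    addr = 0
    while addr < len(group):
        cur_len, cur_dist = group[addr]
        nxt_len = group[addr + 1][0] if addr + 1 < len(group) else 0
        if cur_len < nxt_len:
            outputs.append(("literal", addr))          # defer to the next byte's longer match
            next_addr = addr + 1
        else:
            outputs.append(("match", cur_len, cur_dist))
            next_addr = addr + cur_len if cur_len >= 2 else addr + 1
        if next_addr >= len(group):                    # address offset beyond the group:
            break                                      # preset end condition met
        addr = next_addr
    return outputs

# Example with a group of 4 bytes (target channel count 4).
print(poll_group([(3, 10), (5, 4), (0, 0), (2, 7)]))
```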

Wherein, before determining the matching information of the current byte in the comparison process that meets the preset end condition as the optimal matching information, the method further includes:

determining whether the address offset of the current byte in the next comparison process is greater than the target number of channels;

if it is greater than the target number of channels, determining that the current byte in the current comparison process meets the preset end condition;

if it is less than the target number of channels, determining that the current byte in the current comparison process does not meet the preset end condition.

Wherein, determining the address of the current byte in the next comparison process according to the first matching length of the current byte and the second matching length of the next byte in the current comparison process includes:

when the first matching length is less than the second matching length, outputting the current byte of the current comparison process, and taking the address of the current byte of the current comparison process plus one as the address of the current byte in the next comparison process.

Wherein, determining the address of the current byte in the next comparison process according to the first matching length of the current byte and the second matching length of the next byte in the current comparison process includes:

when the first matching length is not less than the second matching length, outputting the first matching length, and taking the address of the current byte of the current comparison process plus the first matching length as the address of the current byte in the next comparison process.

Wherein, the method further includes:

if the address of the current byte is equal to the maximum channel address, caching the matching information of the current byte;

when entering the comparison process of the next group of bytes, constructing a virtual address 0, the virtual address 0 storing the matching information of the byte at the maximum channel address;

taking the matching information of the byte at the maximum channel address stored at the virtual address 0 as the matching information of the current byte of the current comparison process, and entering the step of determining the address of the current byte in the next comparison process according to the first matching length of the current byte and the second matching length of the next byte in the current comparison process.

Wherein, outputting the data result set corresponding to the optimal matching information to the encoding module through the target number of channels includes:

writing the data result set corresponding to the optimal matching information into a ping-pong register group with a depth of m, m being a positive integer multiple of N;

when the N channel flags of the ping-pong register group are valid, triggering the encoding module to read the data result set through the target number of channels.

Wherein, there are multiple ping-pong register groups, and writing the data result set corresponding to the optimal matching information into a ping-pong register group with a depth of m includes:

when the encoding module reads the data result set stored in one of the ping-pong register groups through the target number of channels, writing the data result set corresponding to the optimal matching information into another ping-pong register group that is not being read.
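The ping-pong buffering between the computing module and the encoding module could be modeled in software roughly as follows; the two-bank structure, the flag handling and the read/write API are illustrative assumptions, not the hardware register design.

```python
class PingPongBuffer:
    """Software model of two register banks of depth m: one bank is filled
    while the other, already-full bank is read by the encoding module."""
    def __init__(self, depth: int):
        self.depth = depth
        self.banks = [[], []]
        self.write_bank = 0          # bank currently being filled

    def write(self, result):
        """Write one data result; switch banks once the current bank is full
        (i.e. all of its channel flags would be valid)."""
        self.banks[self.write_bank].append(result)
        if len(self.banks[self.write_bank]) == self.depth:
            self.write_bank ^= 1     # continue writing into the other, unread bank

    def read(self):
        """Drain the bank that is not currently being written (the full one)."""
        read_bank = self.write_bank ^ 1
        items, self.banks[read_bank] = self.banks[read_bank], []
        return items

# Example: depth 4 (a positive multiple of N for N = 4 channels).
buf = PingPongBuffer(depth=4)
for r in range(6):
    buf.write(r)
print(buf.read())   # -> [0, 1, 2, 3]; results 4 and 5 sit in the other bank
```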

In a second aspect, the present invention provides a data encoding device applied to a hardware acceleration card, wherein the hardware acceleration card further includes a computing module and an encoding module, and the data encoding device includes:

a memory for storing a computer program;

a processor for implementing the steps of the data encoding method described above when executing the computer program.

In a third aspect, the present invention provides a hardware acceleration card, including the data encoding device described above, and further including a computing module and an encoding module;

the computing module is used to determine matching information according to the data to be encoded and the encoded data, and to output the matching information and the data to be encoded to the encoding module through the target number of channels;

the encoding module is used to encode the data to be encoded according to the matching information;

the matching information is the matching distance and matching length, in the encoded data, of a preset character string that appears in both the data to be encoded and the encoded data.

In a fourth aspect, the present invention provides a computer program product, including a computer program/instructions, which implement the steps of the data encoding method described above when executed by a processor.

In a fifth aspect, the present invention provides a non-volatile storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the data encoding method described above.

The present invention provides a data encoding method, device, hardware acceleration card, program product and medium, which relate to the field of data processing and are used to solve the problem of limited application scenarios when a hardware acceleration card implements a compression algorithm. The solution obtains channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction, and configures the number of channels to be the target number of channels; after the computing module determines the matching information according to the data to be encoded and the encoded data, the matching information and the data to be encoded are output to the encoding module through the channels, so that the data to be encoded is encoded according to the matching information. In the present invention, when the hardware acceleration card implements the compression algorithm, the number of channels can be dynamically adjusted according to different scenarios and requirements, which improves flexibility and applicability and avoids limiting the application scenarios; by configuring and matching the number of channels, the channels can be used more effectively, the efficiency and performance of the compression algorithm are improved, and the burden on the hardware acceleration card is reduced.

Brief Description of the Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required for the prior art and the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a flow chart of a data encoding method provided by the present invention;

FIG. 2 is a hardware architecture diagram for searching for optimal matching information provided by the present invention;

FIG. 3 is a flow chart of searching for optimal matching information provided by the present invention;

FIG. 4 is a schematic diagram of a specific embodiment of searching for optimal matching information provided by the present invention;

FIG. 5 is a schematic diagram of a data encoding device provided by the present invention;

FIG. 6 is a schematic diagram of a hardware acceleration card provided by the present invention;

FIG. 7 is a schematic diagram of a non-volatile storage medium provided by the present invention.

Detailed Description

The core of the present invention is to provide a data encoding method, device, hardware acceleration card, program product and medium. When the hardware acceleration card implements the compression algorithm, the number of channels can be dynamically adjusted according to different scenarios and requirements, which improves flexibility and applicability and avoids limiting the application scenarios. By configuring and matching the number of channels, the channels can be used more effectively, the efficiency and performance of the compression algorithm are improved, and the burden on the hardware acceleration card is reduced.

In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.

First, lazy matching is explained. The computing module obtains the matching information as follows: a fixed-size "sliding window" is used to process strings in the data to be encoded that repeat strings in the encoded data, and after dictionary matching, one or several pieces of matching information (each including a matching length and a matching distance) can represent the matched string. Finally, after chain-point screening, the matching information with the longest matching length is selected; if several candidates share the same longest matching length, the one with the shortest matching distance is selected. Lazy matching is an improvement on this algorithm: after the matching information with the longest matching length is found, it is not used immediately but is first compared with the matching information of the next byte to determine whether the next byte yields a longer match than the current string. If so, the matching information of the next byte is adopted; if not, the comparison continues with the byte after that, and so on, until matching information longer than the current byte's match is found.
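For illustration only, the following Python sketch shows one possible software model of the lazy-matching selection described above; the function names, the brute-force window search and the literal/match output format are assumptions for the sketch and do not reflect the hardware implementation.

```python
def find_longest_match(data: bytes, pos: int, window: int = 32 * 1024, min_len: int = 2):
    """Brute-force model of the dictionary search: return (length, distance) of the
    longest string in the sliding window matching data starting at pos, preferring
    the shortest distance on ties; (0, 0) if no match of at least min_len exists."""
    best_len, best_dist = 0, 0
    for cand in range(max(0, pos - window), pos):
        k = 0
        while pos + k < len(data) and data[cand + k] == data[pos + k]:
            k += 1
        dist = pos - cand
        if k >= min_len and (k > best_len or (k == best_len and dist < best_dist)):
            best_len, best_dist = k, dist
    return best_len, best_dist


def lazy_encode(data: bytes):
    """Emit (length, distance) pairs or literal bytes using the lazy-matching rule:
    a match at the current byte is deferred whenever the next byte offers a longer one."""
    out, pos = [], 0
    while pos < len(data):
        cur_len, cur_dist = find_longest_match(data, pos)
        if cur_len >= 2 and pos + 1 < len(data):
            nxt_len, _ = find_longest_match(data, pos + 1)
            if nxt_len > cur_len:
                # Lazy step: output the current byte as a literal and let the
                # longer match starting at the next byte win.
                out.append(data[pos:pos + 1])
                pos += 1
                continue
        if cur_len >= 2:
            out.append((cur_len, cur_dist))
            pos += cur_len
        else:
            out.append(data[pos:pos + 1])
            pos += 1
    return out
```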

In a first aspect, as shown in FIG. 1, the present invention provides a data encoding method applied to a processor in a hardware acceleration card. The hardware acceleration card further includes a computing module and an encoding module. The data encoding method includes:

S11: Obtain channel quantity configuration information determined based on status parameters of the hardware acceleration card and/or the user's demand instruction.

Specifically, when a hardware acceleration card implements a compression algorithm such as Gzip, the number of channels (also called the "degree of parallelism") is a key parameter that determines how much of the data processing inside the hardware acceleration card is parallelized. The choice of the number of channels directly affects compression efficiency, throughput and resource utilization. Therefore, dynamically determining the channel quantity configuration information based on the status parameters of the hardware acceleration card or the user's demand instruction is crucial for optimizing the performance of the hardware acceleration card.

When the channel quantity configuration information is determined based on the status parameters of the hardware acceleration card, these status parameters may include several kinds of information. Hardware resource utilization: if the utilization of certain resources of the hardware acceleration card (such as memory or computing units) is low, the number of channels can be increased to raise the degree of parallelism and thus improve compression efficiency. Power consumption and heat dissipation: in some cases, too many channels may increase the power consumption of the hardware acceleration card and even cause overheating, so it is reasonable to dynamically adjust the number of channels according to the power and thermal status. Task queue length: if the task queue of the hardware acceleration card is long, more data is waiting to be processed, and increasing the number of channels speeds up processing and reduces waiting time. By monitoring and evaluating these status parameters in real time, suitable channel quantity configuration information can be determined dynamically to match the current hardware status and workload.
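As an illustration only, the following Python sketch shows one way such a status-driven policy could be expressed in host-side control software; the thresholds, the field names (utilization, temperature_c, queue_length) and the step sizes are all assumptions, not values given in this disclosure.

```python
from dataclasses import dataclass

@dataclass
class CardStatus:
    utilization: float    # fraction of compute resources in use, 0.0 - 1.0
    temperature_c: float  # die temperature in degrees Celsius
    queue_length: int     # number of pending compression tasks

def propose_channel_count(status: CardStatus, current: int,
                          min_ch: int = 1, max_ch: int = 16) -> int:
    """Return a proposed channel count based on the card's status parameters.
    Hypothetical policy: back off when hot or busy, scale up when idle with backlog."""
    proposed = current
    if status.temperature_c > 85.0 or status.utilization > 0.9:
        proposed = current - 1           # relieve thermal/power or compute pressure
    elif status.utilization < 0.5 and status.queue_length > 8:
        proposed = current + 2           # spare resources and a backlog: parallelize more
    return max(min_ch, min(max_ch, proposed))

# Example: a cool, lightly loaded card with a long task queue gets more channels.
print(propose_channel_count(CardStatus(0.35, 60.0, 20), current=4))  # -> 6
```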

When the channel quantity configuration information is determined based on the user's demand instruction, the user may need to specify the number of channels for a specific application in some scenarios. For example, some applications may have strict requirements on compression speed, while others may care more about compression ratio. Through a user interface or API (Application Programming Interface), the user can send a channel quantity configuration instruction to the hardware acceleration card according to their own needs. After receiving such an instruction, the hardware acceleration card adjusts the number of channels of the encoding module accordingly.

It can be seen that, by monitoring the status parameters of the hardware acceleration card in real time or receiving the user's demand instruction, this step dynamically determines suitable channel quantity configuration information to optimize the performance of the hardware acceleration card. This dynamic configuration allows the hardware acceleration card to maintain high efficiency and performance under different workloads and hardware conditions.

S12: Configure the number of channels of the encoding module to be the target number of channels according to the channel quantity configuration information.

In a hardware acceleration card, the channel configuration of the encoding module is usually fixed, while the status parameters of the hardware acceleration card and the user's demand instructions may require the number of channels to be adjusted dynamically. The purpose of this step is therefore to configure the number of channels of the encoding module, based on this information, to the target number of channels that best matches the current demand.

In general, the core principle of this step is to dynamically adjust the channel configuration of the encoding module according to the status parameters of the hardware acceleration card and the user's demand instructions, so as to adapt to different compression scenarios and requirements. This improves the flexibility and applicability of the hardware acceleration card, allowing it to deliver the best compression performance in a variety of application scenarios.

S13: After the computing module determines the matching information according to the data to be encoded and the encoded data, output the matching information and the data to be encoded to the encoding module through the target number of channels, triggering the encoding module to encode the data to be encoded according to the matching information.

The matching information is the matching distance and matching length, in the encoded data, of a preset character string that appears in both the data to be encoded and the encoded data.

Specifically, this step includes the following key operations. Determining the matching information: in the computing module, a matching operation is performed on the data to be encoded against the encoded data to determine the matching information, which usually includes the matching distance and matching length of the preset string in the encoded data. For example, in the Gzip algorithm, matching strings in the encoded data are searched to determine whether similar substrings exist in the data to be encoded, and what their matching distances and lengths are.

Channel output: once the matching information is determined, the computing module outputs the matching information and the data to be encoded to the encoding module through the target number of channels. The target number of channels here is the one determined in step S12, so that it matches the current state of the hardware acceleration card and the user's requirements.

Encoding trigger: after receiving the matching information and the data to be encoded from the computing module, the encoding module triggers the encoding operation based on this information. Specifically, the encoding module encodes the data to be encoded according to the matching information, for example compressing the matched substrings with the compression algorithm, thereby achieving data compression and encoding.

In general, by outputting the matching information and the data to be encoded to the encoding module, this step carries out data encoding inside the hardware acceleration card, which makes full use of the advantages of hardware acceleration and improves the efficiency and performance of data encoding.
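Purely as a software illustration of the channel output described above, the following sketch bundles per-byte results and spreads them over a configurable number of channels; the MatchInfo structure and the round-robin lane assignment are assumptions made for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class MatchInfo:
    length: int    # matching length of the repeated string
    distance: int  # matching distance back into the already-encoded data

Token = Union[MatchInfo, int]  # either a match or a literal byte value

def distribute_over_channels(tokens: List[Token], num_channels: int) -> List[List[Token]]:
    """Software model of the channel output: spread the result set over
    `num_channels` lanes so the encoding module can consume them in parallel."""
    lanes: List[List[Token]] = [[] for _ in range(num_channels)]
    for i, tok in enumerate(tokens):
        lanes[i % num_channels].append(tok)
    return lanes

# Example: three tokens spread over a target channel count of 2.
tokens = [MatchInfo(length=4, distance=12), 0x41, MatchInfo(length=3, distance=7)]
print(distribute_over_channels(tokens, num_channels=2))
```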

In summary, the data encoding method provided by the present invention obtains channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction, and configures the number of channels to be the target number of channels; after the computing module determines the matching information according to the data to be encoded and the encoded data, the matching information and the data to be encoded are output to the encoding module through the channels, so that the data to be encoded is encoded according to the matching information. In the present invention, when the hardware acceleration card implements the compression algorithm, the number of channels can be dynamically adjusted according to different scenarios and requirements, which improves flexibility and applicability and avoids limiting the application scenarios; by configuring and matching the number of channels, the channels can be used more effectively, the efficiency and performance of the compression algorithm are improved, and the burden on the hardware acceleration card is reduced.

In one embodiment, obtaining the channel quantity configuration information determined based on the status parameters of the hardware acceleration card and/or the user's demand instruction includes: determining a number of test channels according to the user's demand instruction, and configuring the number of channels of the encoding module to be the number of test channels; estimating a first computing resource occupancy of the hardware acceleration card under the number of test channels; and adjusting the number of test channels according to the first computing resource occupancy to obtain the target number of channels.

This embodiment describes how the target channel quantity configuration information is determined according to the status parameters of the hardware acceleration card and/or the user's demand instruction, with the aim of ensuring efficient and stable encoding performance on the hardware acceleration card. First, the user's demand instruction regarding the number of channels is received. This demand may be based on the user's specific requirements for encoding speed, compression ratio or hardware resource utilization, or it may directly set the number of test channels. According to this demand instruction, an initial "number of test channels" is set. The number of channels of the encoding module in the hardware acceleration card is then configured to the number of test channels parsed from the user's demand instruction, so that a simulation or test run can be performed before actual operation. While the encoding module runs with the number of test channels, the computing resource occupancy of the hardware acceleration card under that number of channels is monitored and estimated; this estimate may cover the usage of resources such as the CPU (Central Processing Unit), memory and bandwidth. Through this step, the resource utilization efficiency and possible bottlenecks of the hardware acceleration card under the current configuration can be understood. Based on the estimated first computing resource occupancy, it is evaluated whether the current configuration meets the user's needs and whether it would lead to over-use or waste of hardware resources. If the estimate shows that the resource occupancy is too high, the number of test channels may be reduced to lower resource consumption; if the resource occupancy is low, the number of test channels may be increased to improve encoding efficiency. After a series of adjustments and evaluations, a target number of channels is finally determined that both meets the user's needs and ensures the stable and efficient operation of the hardware acceleration card.

In summary, this embodiment allows the system to dynamically adjust the number of channels of the encoding module according to the real-time status of the hardware acceleration card and the specific needs of the user. This not only improves encoding efficiency but also avoids waste and overload of hardware resources, thereby extending the service life of the hardware acceleration card. In addition, with the direct involvement of user requirements, the system can provide more personalized and customized encoding services.

In one embodiment, adjusting the number of test channels according to the first computing resource occupancy to obtain the target number of channels includes: comparing the first computing resource occupancy with a first preset threshold; if the first computing resource occupancy is greater than the first preset threshold, reducing the number of test channels to obtain the target number of channels; if the first computing resource occupancy is less than the first preset threshold, increasing the number of test channels to obtain the target number of channels; and if the first computing resource occupancy is equal to the first preset threshold, determining the number of test channels to be the target number of channels.

This embodiment describes how the computing resource occupancy of the hardware acceleration card under a specific number of test channels is estimated, and how the number of test channels is dynamically adjusted according to the comparison of this estimate with a target threshold, so as to obtain the optimal target number of channels. The core idea is to simulate or estimate the computing resource occupancy of the hardware acceleration card under different numbers of channels and, by comparing the estimate with the target threshold, to converge step by step on the optimal channel configuration.

First, an initial number of test channels is determined according to the user's demand instruction, and the number of channels of the encoding module is configured to this number. Then, the computing resource occupancy of the hardware acceleration card under this number of test channels (that is, the first computing resource occupancy) is estimated by simulation or actual operation. Next, the estimated first computing resource occupancy is compared with the first preset threshold. The first preset threshold is usually set according to parameters such as the performance, power consumption and temperature of the hardware acceleration card, and represents a safe or ideal upper limit of computing resource occupancy. If the estimated first computing resource occupancy is greater than the first preset threshold, the currently configured number of test channels is too large, and the excessive resource occupancy of the hardware acceleration card may cause performance degradation, increased power consumption, overheating and other problems; the number of test channels therefore needs to be reduced and the computing resource occupancy re-estimated, until a suitable configuration whose occupancy is less than or equal to the first preset threshold is found. If the estimated first computing resource occupancy is less than the first preset threshold, there is still room to raise the currently configured number of test channels, and the number of channels can be further increased to improve encoding efficiency; the number of test channels can thus be increased and the computing resource occupancy estimated again to look for a better configuration. After several iterations and adjustments, a channel configuration is found that both meets the user's needs and satisfies the performance requirements of the hardware acceleration card; this configuration is the target number of channels.

In a preferred embodiment, the first preset threshold may be either a fixed value or a range. If the first preset threshold is a range, i.e. a preset range, the first computing resource occupancy is compared with the maximum and minimum values of the preset range: if it is greater than the maximum value, the step of reducing the number of test channels to obtain the target number of channels is entered; if it is less than the minimum value, the step of increasing the number of test channels to obtain the target number of channels is entered; and if it is neither greater than the maximum value nor less than the minimum value, the number of test channels is determined to be the target number of channels.
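For illustration, a minimal Python sketch of this iterative adjustment loop is given below, assuming a hypothetical estimate_occupancy() callback that models or measures the first computing resource occupancy for a given channel count; the preset range, the one-channel step per iteration and the iteration cap are assumptions, not values from the disclosure.

```python
def tune_channel_count(initial_channels: int,
                       estimate_occupancy,           # callable: channels -> occupancy (0.0-1.0)
                       threshold_low: float = 0.70,  # lower edge of the preset range
                       threshold_high: float = 0.80, # upper edge of the preset range
                       min_ch: int = 1, max_ch: int = 32,
                       max_iter: int = 16) -> int:
    """Iteratively adjust the number of test channels until the estimated
    occupancy falls inside [threshold_low, threshold_high] or a limit is hit."""
    channels = initial_channels
    for _ in range(max_iter):
        occupancy = estimate_occupancy(channels)
        if occupancy > threshold_high and channels > min_ch:
            channels -= 1                 # too busy: back off
        elif occupancy < threshold_low and channels < max_ch:
            channels += 1                 # headroom left: parallelize more
        else:
            break                         # inside the preset range: accept
    return channels

# Example with a toy occupancy model where each channel costs ~6% of the card.
print(tune_channel_count(4, estimate_occupancy=lambda ch: 0.06 * ch))  # -> 12
```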

In summary, this embodiment can dynamically adjust the number of channels according to the real-time status of the hardware acceleration card and the user's needs, achieving more efficient resource utilization and better encoding performance. By setting a reasonable preset threshold, problems such as overload and overheating of the hardware acceleration card caused by too many channels can be avoided, ensuring the stability and safety of the system. It can also adapt to changes in application scenarios and user needs, providing flexible options for configuring the number of channels.

In one embodiment, after comparing the first computing resource occupancy with the first preset threshold, the method further includes: calculating the difference between the first computing resource occupancy and the first preset threshold, and calculating the ratio of the difference to the first preset threshold; and determining a channel quantity change according to the ratio and a preset ratio-to-change correspondence. If the first computing resource occupancy is greater than the first preset threshold, reducing the number of test channels to obtain the target number of channels includes: subtracting the channel quantity change from the number of test channels to obtain the target number of channels. If the first computing resource occupancy is less than the first preset threshold, increasing the number of test channels to obtain the target number of channels includes: adding the channel quantity change to the number of test channels to obtain the target number of channels.

This embodiment describes how the optimal number of encoding module channels is determined dynamically according to the status parameters of the hardware acceleration card and the user's demand instruction. Its core principle is to estimate and adjust the computing resource occupancy of the encoding module under different numbers of channels, so as to optimize performance, avoid resource overload and maximize hardware utilization.

Specifically, as explained in the previous embodiment, if the first computing resource occupancy is greater than the first preset threshold, the currently configured number of channels is too large and needs to be reduced; conversely, if the first computing resource occupancy is less than the first preset threshold, the currently configured number of channels may be too small and an increase can be considered. To adjust the number of channels more precisely, this embodiment also calculates the difference between the first computing resource occupancy and the first preset threshold, and the ratio of this difference to the first preset threshold. This ratio reflects how far the current computing resource occupancy exceeds or falls below the preset threshold. Based on this ratio and a preset ratio-to-change correspondence (which may be a lookup table, a function or another form of mapping), the change that should be applied to the number of channels is determined. This change is usually set according to the performance and stability requirements of the hardware acceleration card, so that the adjusted number of channels is neither overloaded nor under-provisioned. Finally, the initial number of test channels is adjusted by the determined change to obtain the target number of channels: if the first computing resource occupancy is greater than the first preset threshold, the change is subtracted from the number of test channels; if it is less than the first preset threshold, the change is added to the number of test channels. In this way, a target number of channels is obtained that both meets the performance requirements and avoids resource overload.

Similarly, if the first preset threshold is a range (a preset range), then when the first computing resource occupancy is greater than the maximum value, the difference calculated in this embodiment is the difference between the first computing resource occupancy and the maximum value; when the first computing resource occupancy is less than the minimum value, the difference is the difference between the first computing resource occupancy and the minimum value.
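A possible software model of this ratio-to-change mapping is sketched below; the bucket boundaries in the lookup table and the single fixed threshold are illustrative assumptions only.

```python
# Hypothetical ratio-to-change lookup: (ratio upper bound, channel change).
RATIO_TO_DELTA = [
    (0.10, 1),            # within 10% of the threshold: adjust by one channel
    (0.25, 2),            # 10%-25% away: adjust by two channels
    (0.50, 4),            # 25%-50% away: adjust by four channels
    (float("inf"), 8),    # further away: adjust by eight channels
]

def channel_delta(occupancy: float, threshold: float) -> int:
    """Map the relative distance from the threshold to a channel-count change."""
    ratio = abs(occupancy - threshold) / threshold
    for upper, delta in RATIO_TO_DELTA:
        if ratio <= upper:
            return delta
    return RATIO_TO_DELTA[-1][1]

def adjust_channels(test_channels: int, occupancy: float, threshold: float) -> int:
    """Apply the signed change: subtract when over the threshold, add when under."""
    delta = channel_delta(occupancy, threshold)
    if occupancy > threshold:
        return max(1, test_channels - delta)
    if occupancy < threshold:
        return test_channels + delta
    return test_channels

# Example: occupancy 0.90 against a threshold of 0.75 (ratio 0.20) removes 2 channels.
print(adjust_channels(test_channels=12, occupancy=0.90, threshold=0.75))  # -> 10
```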

综上,本实施例可以根据硬件加速卡的实际状态和用户需求来灵活地配置通道数量,从而实现更好的性能和资源利用率。In summary, this embodiment can flexibly configure the number of channels according to the actual status of the hardware acceleration card and user requirements, thereby achieving better performance and resource utilization.

在一种实施例中,在判定第一计算资源占用量大于第一预设阈值时,还包括:将第一计算资源占用量与第二预设阈值比较,第二预设阈值大于第一预设阈值;若第一计算资源占用量大于第二预设阈值,则反馈重新配置信号给用户,以使用户基于重新配置信号重新确定测试通道数量,并重新进入将编码模块的通道数量配置为测试通道数量的步骤。In one embodiment, when it is determined that the first computing resource occupancy is greater than a first preset threshold, it also includes: comparing the first computing resource occupancy with a second preset threshold, the second preset threshold is greater than the first preset threshold; if the first computing resource occupancy is greater than the second preset threshold, then feeding back a reconfiguration signal to the user, so that the user re-determines the number of test channels based on the reconfiguration signal, and re-enters the step of configuring the number of channels of the encoding module to the number of test channels.

本实施例中,在第一预设阈值是一个用于判断硬件加速卡是否可能因资源占用过高而面临性能下降或运行不稳定的临界值。当预估的第一计算资源占用量超过这个阈值时,说明当前的测试通道数量配置可能过于激进,需要调整以避免潜在的性能问题。但是,直接减小测试通道数量并不一定总是最佳解决方案,特别是当资源占用量只是略高于第一预设阈值,而仍有潜力通过进一步优化算法或调整其他参数来提高性能。因此,本实施例引入了一个额外的判断步骤,即与第二预设阈值进行比较。In this embodiment, the first preset threshold is a critical value for determining whether the hardware acceleration card may face performance degradation or unstable operation due to excessive resource usage. When the estimated first computing resource usage exceeds this threshold, it means that the current configuration of the number of test channels may be too aggressive and needs to be adjusted to avoid potential performance problems. However, directly reducing the number of test channels is not always the best solution, especially when the resource usage is only slightly higher than the first preset threshold, and there is still potential to improve performance by further optimizing the algorithm or adjusting other parameters. Therefore, this embodiment introduces an additional judgment step, namely, comparison with the second preset threshold.

第二预设阈值是一个更高的阈值，它通常代表了硬件加速卡可能面临的严重性能下降或运行不稳定的极限。当第一计算资源占用量超过这个阈值时，说明当前的测试通道数量配置已经过于激进，甚至可能严重限制了硬件加速卡的性能。在这种情况下，简单地减小测试通道数量可能不是最佳选择，而是需要用户重新考虑和配置更适合的通道数量。The second preset threshold is a higher threshold, which usually represents the limit of severe performance degradation or unstable operation that the hardware accelerator card may face. When the first computing resource usage exceeds this threshold, it means that the current configuration of the number of test channels is far too aggressive and may even severely limit the performance of the hardware accelerator card. In this case, simply reducing the number of test channels may not be the best choice; instead, the user needs to reconsider and configure a more suitable number of channels.

因此,当第一计算资源占用量大于第一预设阈值但小于第二预设阈值时,尝试通过减小测试通道数量来降低资源占用量。但是,当第一计算资源占用量超过第二预设阈值时,反馈一个重新配置信号给用户,提示用户当前的通道数量配置可能存在问题,并建议用户基于这个信号重新确定测试通道数量。这样,用户可以根据系统的反馈和自己的需求,重新配置一个更合适的通道数量,以确保硬件加速卡能够高效且稳定地运行。Therefore, when the first computing resource usage is greater than the first preset threshold but less than the second preset threshold, an attempt is made to reduce the resource usage by reducing the number of test channels. However, when the first computing resource usage exceeds the second preset threshold, a reconfiguration signal is fed back to the user, prompting the user that there may be a problem with the current channel number configuration, and suggesting that the user re-determine the number of test channels based on this signal. In this way, the user can reconfigure a more appropriate number of channels based on the system feedback and their own needs to ensure that the hardware accelerator card can run efficiently and stably.
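
A minimal sketch of this two-threshold decision, with illustrative names for the action values, might look as follows; the below-threshold case (a possible increase) is left to the adjustment rule of the earlier embodiment.

```c
/* Illustrative decision logic for the two-threshold check described above.
 * ChannelAction and the caller that emits the reconfiguration signal are
 * assumed names, not part of the patent. */

typedef enum {
    ACTION_KEEP,          /* at or below the first threshold: keep (or grow) the count */
    ACTION_REDUCE,        /* above the first threshold: shrink the test channel count  */
    ACTION_RECONFIGURE    /* above the second threshold: signal the user to start over */
} ChannelAction;

ChannelAction classify_occupancy(double occupancy,
                                 double first_threshold,
                                 double second_threshold /* > first_threshold */)
{
    if (occupancy > second_threshold)
        return ACTION_RECONFIGURE;   /* feed a reconfiguration signal back to the user */
    if (occupancy > first_threshold)
        return ACTION_REDUCE;        /* adjust the test channel count downward         */
    return ACTION_KEEP;
}
```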

在一种实施例中,计算第一计算资源占用量与第一预设阈值的差值之前,还包括:确定硬件加速卡的应用平台,确定硬件加速卡执行应用平台上的预设任务所需的第二计算资源占用量;根据第二计算资源占用量确定硬件加速卡的计算资源剩余量,第二计算资源占用量与计算资源剩余量的和值不大于硬件加速卡的全部计算资源;根据计算资源剩余量确定第一预设阈值。In one embodiment, before calculating the difference between the first computing resource occupancy and the first preset threshold, the method further includes: determining an application platform of the hardware acceleration card, and determining a second computing resource occupancy required for the hardware acceleration card to execute a preset task on the application platform; determining a remaining computing resource of the hardware acceleration card according to the second computing resource occupancy, wherein the sum of the second computing resource occupancy and the remaining computing resource is not greater than all computing resources of the hardware acceleration card; and determining the first preset threshold according to the remaining computing resource.

本实施例描述了如何根据硬件加速卡的应用平台和执行预设任务所需的计算资源来确定第一预设阈值。这一步骤是为了更准确地评估和调整编码模块的通道数量,以确保在硬件加速卡有限的计算资源内,既能满足编码需求,又能保证其他预设任务能够正常运行。This embodiment describes how to determine the first preset threshold value based on the application platform of the hardware acceleration card and the computing resources required to perform the preset task. This step is to more accurately evaluate and adjust the number of channels of the encoding module to ensure that the encoding requirements can be met within the limited computing resources of the hardware acceleration card, while ensuring that other preset tasks can run normally.

首先,需要明确硬件加速卡所应用的具体平台或环境。不同的平台可能对硬件加速卡有不同的要求和使用场景。例如,服务器、数据中心或边缘设备等平台对硬件加速卡的性能和功能需求可能有所不同。First, it is necessary to clarify the specific platform or environment in which the hardware accelerator card is used. Different platforms may have different requirements and usage scenarios for hardware accelerator cards. For example, platforms such as servers, data centers, or edge devices may have different performance and functional requirements for hardware accelerator cards.

在确定了应用平台后,需要明确硬件加速卡在该平台上需要执行的预设任务,并估算这些任务所需的计算资源占用量。这些预设任务可能包括编码、解码、数据处理等。通过对这些任务的计算资源需求进行量化,可以为后续的计算资源分配提供参考。After determining the application platform, it is necessary to identify the preset tasks that the hardware accelerator card needs to perform on the platform and estimate the computing resource usage required for these tasks. These preset tasks may include encoding, decoding, data processing, etc. By quantifying the computing resource requirements for these tasks, a reference can be provided for subsequent computing resource allocation.

在知道了预设任务所需的计算资源后,进一步确定硬件加速卡的计算资源剩余量。计算资源剩余量是指硬件加速卡除去执行预设任务所需资源外,还剩余的计算资源。这个剩余量将用于动态配置编码模块的通道数量,以满足编码需求。After knowing the computing resources required for the preset task, the remaining computing resources of the hardware acceleration card are further determined. The remaining computing resources refer to the remaining computing resources of the hardware acceleration card after removing the resources required to perform the preset task. This remaining amount will be used to dynamically configure the number of channels of the encoding module to meet the encoding requirements.

基于计算资源剩余量,可以确定第一预设阈值。这个阈值用于在配置编码模块通道数量时,判断当前配置是否会导致硬件加速卡过载或资源浪费。如果预估的第一计算资源占用量超过了这个阈值,说明当前配置的通道数量过多,可能会导致硬件加速卡性能下降或无法正常运行其他预设任务;如果预估的占用量低于阈值,则可以考虑增加通道数量以提高编码效率。Based on the remaining amount of computing resources, a first preset threshold can be determined. This threshold is used to determine whether the current configuration will cause hardware acceleration card overload or resource waste when configuring the number of encoding module channels. If the estimated first computing resource occupancy exceeds this threshold, it means that the number of channels currently configured is too large, which may cause the performance of the hardware acceleration card to degrade or other preset tasks to fail to run normally; if the estimated occupancy is lower than the threshold, you can consider increasing the number of channels to improve encoding efficiency.
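
As a rough sketch of how the first preset threshold could be derived from the remaining computing resources, assuming a safety margin and field names that this embodiment does not specify:

```c
/* Illustrative derivation of the first preset threshold from the card's
 * remaining computing resources. The 0.9 safety margin is an assumption. */

typedef struct {
    double total_resources;        /* all computing resources of the card */
    double preset_task_occupancy;  /* second computing resource occupancy */
} CardBudget;

double derive_first_threshold(const CardBudget *b)
{
    /* Resources left after the preset tasks of the application platform
     * are served; by definition the sum cannot exceed the total. */
    double remaining = b->total_resources - b->preset_task_occupancy;
    if (remaining < 0.0)
        remaining = 0.0;

    /* Reserve a small margin so the encoding channels never consume the
     * entire remainder (the margin value is illustrative only). */
    return remaining * 0.9;
}
```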

本实施例提供的基于硬件加速卡应用平台和预设任务需求来确定和调整编码模块通道数量的方法,确保了硬件加速卡在有限资源下能够高效、稳定地运行。The method provided in this embodiment for determining and adjusting the number of encoding module channels based on the hardware acceleration card application platform and preset task requirements ensures that the hardware acceleration card can operate efficiently and stably under limited resources.

在一种实施例中,获取基于硬件加速卡的状态参数和/或用户的需求指令确定的通道数量配置信息,包括:确定硬件加速卡的应用平台,确定硬件加速卡执行应用平台上的预设任务所需的第二计算资源占用量;根据第二计算资源占用量确定硬件加速卡的计算资源剩余量,第二计算资源占用量与计算资源剩余量的和值不大于硬件加速卡的全部计算资源;根据计算资源剩余量确定目标通道数量。In one embodiment, obtaining channel quantity configuration information determined based on a state parameter of the hardware acceleration card and/or a user's demand instruction includes: determining an application platform of the hardware acceleration card, and determining a second computing resource occupancy amount required for the hardware acceleration card to execute a preset task on the application platform; determining a remaining computing resource amount of the hardware acceleration card according to the second computing resource occupancy amount, wherein a sum of the second computing resource occupancy amount and the remaining computing resource amount is not greater than all computing resources of the hardware acceleration card; and determining a target channel quantity according to the remaining computing resource amount.

在另一种实施例中,也可以直接基于硬件加速卡的状态参数(在这里具体是应用平台和预设任务所需的计算资源)以及可能的用户需求确定目标通道参数。具体而言,应用平台是指硬件加速卡将被部署和运行的环境,如服务器、工作站、数据中心等。不同的应用平台可能有不同的工作负载和性能要求。确定应用平台是理解硬件加速卡运行环境的重要一步。In another embodiment, the target channel parameters can also be determined directly based on the state parameters of the hardware accelerator card (here specifically the application platform and the computing resources required for the preset task) and possible user needs. Specifically, the application platform refers to the environment where the hardware accelerator card will be deployed and run, such as a server, workstation, data center, etc. Different application platforms may have different workloads and performance requirements. Determining the application platform is an important step in understanding the operating environment of the hardware accelerator card.

预设任务是指硬件加速卡在应用平台上需要执行的一系列操作或计算任务(本身固定要执行的任务或者在该应用平台上与硬件加速卡连接的其他模块需要利用硬件加速卡执行的任务)。为了确定目标通道数量,需要知道这些预设任务需要多少计算资源。这通常通过测试、模拟或历史数据来估算。Preset tasks refer to a series of operations or computing tasks that the hardware accelerator card needs to perform on the application platform (tasks that are fixed to be performed by the hardware accelerator card itself or tasks that other modules connected to the hardware accelerator card on the application platform need to perform using the hardware accelerator card). In order to determine the target number of channels, it is necessary to know how much computing resources these preset tasks require. This is usually estimated through testing, simulation or historical data.

一旦知道了预设任务所需的计算资源占用量(称之为第二计算资源占用量),就可以从硬件加速卡的全部计算资源中减去这个值,从而得到计算资源剩余量。这个剩余量表示了硬件加速卡在不执行预设任务时,还剩下多少计算资源可供其他任务(如编码任务)使用。Once the computing resource usage required for the preset task is known (called the second computing resource usage), this value can be subtracted from the total computing resources of the hardware accelerator card to obtain the remaining computing resources. This remaining amount indicates how much computing resources are left for other tasks (such as encoding tasks) when the hardware accelerator card does not perform the preset task.

在得到了计算资源剩余量之后,就可以根据这个值来确定编码模块的目标通道数量。目标通道数量的确定需要考虑到编码模块本身的工作效率和性能需求。过多的通道可能会超出剩余计算资源的承载能力,导致性能下降;而过少的通道则可能无法充分利用剩余的计算资源,导致资源浪费。因此,需要找到一个平衡点(也即是根据剩余计算资源确定的目标通道数量),使得目标通道数量既能满足编码性能需求,又不会超出硬件加速卡的计算资源限制。After obtaining the remaining amount of computing resources, the target number of channels for the encoding module can be determined based on this value. The determination of the target number of channels needs to take into account the working efficiency and performance requirements of the encoding module itself. Too many channels may exceed the carrying capacity of the remaining computing resources, resulting in performance degradation; while too few channels may not be able to fully utilize the remaining computing resources, resulting in resource waste. Therefore, it is necessary to find a balance point (that is, the target number of channels determined based on the remaining computing resources) so that the target number of channels can meet the encoding performance requirements without exceeding the computing resource limitations of the hardware accelerator card.
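
A hedged sketch of this balance point, assuming a positive per-channel resource cost and channel bounds that the embodiment does not specify, could look like this.

```c
/* Illustrative mapping from remaining computing resources to a target
 * channel count. per_channel_cost (assumed > 0) and the min/max channel
 * bounds are assumptions made for illustration. */

int derive_target_channels(double remaining_resources,
                           double per_channel_cost,   /* resources one channel consumes */
                           int min_channels, int max_channels)
{
    int target = (int)(remaining_resources / per_channel_cost);

    /* Too many channels would exceed the remaining budget, too few would
     * waste it, so clamp to the range supported by the encoding module. */
    if (target < min_channels) target = min_channels;
    if (target > max_channels) target = max_channels;
    return target;
}
```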

在一种实施例中,获取基于硬件加速卡的状态参数和/或用户的需求指令确定的通道数量配置信息,包括:通过第一接口获取基于硬件加速卡的状态参数和/或用户的需求指令确定的通道数量配置信息;利用接口转换装置将第一接口获取到的通道数量配置信息转换为第二接口对应的通道数量配置信息;将第二接口对应的通道数量配置信息刷新至通道数量配置寄存器中;根据通道数量配置信息将编码模块的通道数量配置为目标通道数量,包括:根据通道数量配置寄存器中的通道数量配置信息,将编码模块的通道数量配置为目标通道数量。In one embodiment, obtaining channel quantity configuration information determined based on the state parameters of the hardware acceleration card and/or the user's demand instructions includes: obtaining channel quantity configuration information determined based on the state parameters of the hardware acceleration card and/or the user's demand instructions through a first interface; using an interface conversion device to convert the channel quantity configuration information obtained by the first interface into channel quantity configuration information corresponding to a second interface; refreshing the channel quantity configuration information corresponding to the second interface into a channel quantity configuration register; and configuring the channel quantity of the encoding module to a target channel quantity according to the channel quantity configuration information, including: configuring the channel quantity of the encoding module to the target channel quantity according to the channel quantity configuration information in the channel quantity configuration register.

本实施例中,第一接口:硬件加速卡可能具备多种接口来接收外部信号或指令,其中第一接口用于接收基于硬件加速卡的状态参数或用户的需求指令;这些参数或指令可能来自于系统监控器、用户界面、控制软件等。接口转换装置:由于不同接口可能采用不同的数据格式或通信协议,因此需要一个接口转换装置来将第一接口接收到的通道数量配置信息转换为第二接口(通常是与编码模块直接通信的接口)所对应的数据格式;这一步骤确保了信息的准确性和兼容性。通道数量配置寄存器:硬件加速卡内部通常包含多个寄存器来存储和管理各种配置信息;其中,通道数量配置寄存器专门用于存储编码模块的通道数量配置信息;通过刷新这个寄存器,可以确保编码模块能够按照最新的配置信息来工作。刷新操作:将第二接口对应的通道数量配置信息写入通道数量配置寄存器中,完成寄存器的刷新;这一操作通常在系统初始化时执行,或者在接收到新的配置指令时动态执行。配置过程:在刷新了通道数量配置寄存器之后,硬件加速卡会根据该寄存器中的配置信息来配置编码模块的通道数量;具体来说,它可能会发送一系列控制信号到编码模块,指示其开启或关闭一定数量的通道。In this embodiment, the first interface: the hardware acceleration card may have multiple interfaces to receive external signals or instructions, wherein the first interface is used to receive state parameters based on the hardware acceleration card or user demand instructions; these parameters or instructions may come from a system monitor, a user interface, a control software, etc. Interface conversion device: Since different interfaces may use different data formats or communication protocols, an interface conversion device is required to convert the channel quantity configuration information received by the first interface into a data format corresponding to the second interface (usually an interface that directly communicates with the encoding module); this step ensures the accuracy and compatibility of the information. Channel quantity configuration register: The hardware acceleration card usually contains multiple registers to store and manage various configuration information; wherein the channel quantity configuration register is specifically used to store the channel quantity configuration information of the encoding module; by refreshing this register, it can be ensured that the encoding module can work according to the latest configuration information. Refresh operation: write the channel quantity configuration information corresponding to the second interface into the channel quantity configuration register to complete the register refresh; this operation is usually performed when the system is initialized, or dynamically performed when a new configuration instruction is received. Configuration process: After refreshing the channel number configuration register, the hardware accelerator card will configure the number of channels of the encoding module according to the configuration information in the register; specifically, it may send a series of control signals to the encoding module to instruct it to turn on or off a certain number of channels.
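
The register path described above might be modeled roughly as follows; the register offset, the field layout and the conversion rule are purely hypothetical stand-ins for whatever the first and second interfaces actually use.

```c
#include <stdint.h>

/* Minimal sketch of the configuration path: information arrives on a first
 * interface, is converted to the format of a second interface, and is then
 * refreshed into a channel-number configuration register. The offset and
 * field layout below are assumptions. */

#define CHAN_NUM_CFG_REG_OFFSET  0x0040u   /* hypothetical register offset */

typedef struct {
    volatile uint32_t *regs;   /* base of the card's memory-mapped registers */
} CardRegs;

/* Convert the first-interface payload (here: a plain target channel count)
 * into the second-interface register encoding (here: the low 8 bits). */
static uint32_t convert_to_second_interface(uint32_t target_channels)
{
    return target_channels & 0xFFu;
}

/* Refresh the channel-number configuration register; the encoding module
 * is then configured from the value held in this register. */
void apply_channel_config(CardRegs *card, uint32_t target_channels)
{
    uint32_t encoded = convert_to_second_interface(target_channels);
    card->regs[CHAN_NUM_CFG_REG_OFFSET / sizeof(uint32_t)] = encoded;
}
```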

本实施例通过接口通信、信息转换和寄存器刷新等一系列步骤,实现了根据硬件加速卡的状态参数和/或用户需求指令来动态配置编码模块的通道数量。这种动态配置方式可以提高硬件加速卡的灵活性和适应性,使其能够更好地满足各种应用场景的需求。This embodiment realizes the dynamic configuration of the number of channels of the encoding module according to the state parameters of the hardware acceleration card and/or user demand instructions through a series of steps such as interface communication, information conversion and register refresh. This dynamic configuration method can improve the flexibility and adaptability of the hardware acceleration card, so that it can better meet the needs of various application scenarios.

在一种实施例中,待编码数据中将所有字节划分为N组,N为大于1的整数,每组中的字节数等于目标通道数量;根据待编码数据和已编码数据确定匹配信息,包括:以每组字节中的每个字节为起始字节,N组字节同时查找已编码数据中是否存在与组内各个字节为起始字节相匹配的字符串;记录每个字节为起始字节相匹配的字符串的匹配信息,匹配信息包括相匹配的字符串的匹配长度以及字符串的首字节在已编码数据中的匹配距离;根据待编码数据中的N组中每个字节对应的匹配信息确定最优匹配信息;将匹配信息和待编码数据通过目标通道数量的通道输出至编码模块,包括:将最优匹配信息对应的数据结果集通过目标通道数量的通道输出至编码模块,数据结果集中包括最优匹配信息和待编码数据。In one embodiment, all bytes in the data to be encoded are divided into N groups, N is an integer greater than 1, and the number of bytes in each group is equal to the target number of channels; matching information is determined based on the data to be encoded and the encoded data, including: taking each byte in each group of bytes as a starting byte, and simultaneously searching the N groups of bytes in the encoded data to see whether there is a character string that matches each byte in the group as a starting byte; recording matching information of the character string that matches each byte as the starting byte, the matching information includes the matching length of the matched character string and the matching distance of the first byte of the character string in the encoded data; determining optimal matching information based on the matching information corresponding to each byte in the N groups in the data to be encoded; outputting the matching information and the data to be encoded to the encoding module through the channels of the target number of channels, including: outputting a data result set corresponding to the optimal matching information to the encoding module through the channels of the target number of channels, the data result set including the optimal matching information and the data to be encoded.

本实施例描述了一种在硬件加速卡中实施数据编码方法的特定方式,它专注于优化匹配信息的查找和输出过程,通过并行处理来提高数据处理速度。具体而言,在硬件加速卡中,待编码数据被划分为N组,每组中的字节数等于目标通道数量(由S12步骤确定)。这种分组策略是为了实现并行处理,即N组数据可以同时在硬件加速卡的计算模块中进行处理,查找与已编码数据的匹配字符串。对于每一组数据,都以组内的每个字节为起始字节,在已编码数据中查找是否存在与这些起始字节相匹配的字符串,这种查找是并行的,意味着N组数据同时进行查找操作,从而显著提高了查找效率。对于每个起始字节,如果找到了匹配的字符串,就会记录该字符串的匹配长度以及首字节在已编码数据中的匹配距离。这些匹配信息将用于后续的编码过程。匹配的字符串的匹配长度通常为正整数,若没有匹配到字符串,则将匹配长度计为0。在所有N组数据中,每个字节都对应一个匹配信息(如果存在多个匹配的字符串)。为了选择最佳的匹配信息,通常会根据某种标准(如匹配长度最长或匹配距离最短)来确定每个字节的最优匹配。然后,将所有这些最优匹配信息对应的数据结果集,用于后续的编码过程。最后,根据目标通道数量,将最优匹配信息对应的数据结果集输出至编码模块。这个数据结果集不仅包括最优匹配信息,还包括原始的待编码数据。由于之前的数据分组和并行处理,这个输出过程也是高效的,并且可以利用硬件加速卡的多个通道进行并行传输。This embodiment describes a specific way of implementing a data encoding method in a hardware accelerator card, which focuses on optimizing the search and output process of matching information and improving the data processing speed through parallel processing. Specifically, in the hardware accelerator card, the data to be encoded is divided into N groups, and the number of bytes in each group is equal to the number of target channels (determined by step S12). This grouping strategy is to achieve parallel processing, that is, N groups of data can be processed in the computing module of the hardware accelerator card at the same time to find matching strings with the encoded data. For each group of data, each byte in the group is used as the starting byte, and the encoded data is searched for whether there are strings matching these starting bytes. This search is parallel, which means that N groups of data are searched at the same time, thereby significantly improving the search efficiency. For each starting byte, if a matching string is found, the matching length of the string and the matching distance of the first byte in the encoded data will be recorded. These matching information will be used in the subsequent encoding process. The matching length of the matched string is usually a positive integer. If no string is matched, the matching length is counted as 0. In all N groups of data, each byte corresponds to a matching information (if there are multiple matching strings). In order to select the best matching information, the best match for each byte is usually determined based on a certain standard (such as the longest matching length or the shortest matching distance). Then, the data result set corresponding to all these best matching information is used in the subsequent encoding process. Finally, according to the number of target channels, the data result set corresponding to the best matching information is output to the encoding module. This data result set includes not only the best matching information, but also the original data to be encoded. Due to the previous data grouping and parallel processing, this output process is also efficient, and multiple channels of the hardware accelerator card can be used for parallel transmission.
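
For orientation, the matching information (match length plus match distance) corresponds to an LZ77-style longest-match search. The sketch below shows a single starting byte; the window size and the brute-force scan are simplifications of what the accelerator performs in parallel for a whole group of starting bytes, one byte per channel.

```c
#include <stddef.h>

/* Matching information recorded for one starting byte: the length of the
 * matched string and the distance of its first byte back into the already
 * encoded data. A length of 0 means "no match found". */
typedef struct {
    unsigned length;
    unsigned distance;
} MatchInfo;

/* Brute-force search for the longest string in the encoded history that
 * matches the data beginning at `pos`. The linear scan, the window bound
 * and the non-overlapping restriction are simplifications. */
MatchInfo find_longest_match(const unsigned char *data, size_t data_len,
                             size_t pos, size_t window)
{
    MatchInfo best = { 0u, 0u };
    size_t start = (pos > window) ? pos - window : 0;

    for (size_t cand = start; cand < pos; ++cand) {
        size_t len = 0;
        while (pos + len < data_len && cand + len < pos
               && data[cand + len] == data[pos + len])
            ++len;                              /* extend the candidate match */
        if (len > best.length) {
            best.length   = (unsigned)len;
            best.distance = (unsigned)(pos - cand);
        }
    }
    return best;
}
```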

可见,本实施例通过数据分组和并行处理的方式,显著提高了匹配信息的查找和输出效率,从而加快了整体的数据编码过程。这种方法充分利用了硬件加速卡的并行计算能力,使得数据编码更加高效和快速。It can be seen that this embodiment significantly improves the efficiency of searching and outputting matching information through data grouping and parallel processing, thereby speeding up the overall data encoding process. This method fully utilizes the parallel computing capability of the hardware acceleration card, making data encoding more efficient and faster.

在一种实施例中,根据待编码数据中的N组中每个字节对应的匹配信息确定最优匹配信息,包括:从第一组开始轮询,根据当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度确定下一比较过程中的当前字节的地址;将满足预设结束条件的比较过程中的当前字节的匹配信息确定为最优匹配信息。In one embodiment, the optimal matching information is determined based on the matching information corresponding to each byte in N groups of the data to be encoded, including: starting polling from the first group, determining the address of the current byte in the next comparison process based on the first matching length of the current byte in the current comparison process and the second matching length of the next byte; and determining the matching information of the current byte in the comparison process that meets the preset end condition as the optimal matching information.

本实施例描述了在处理待编码数据时,如何根据多组字节的匹配信息来确定最优匹配信息的方法。这种方法主要用于在硬件加速卡中实施数据编码时,提高编码效率和准确性。This embodiment describes a method for determining optimal matching information based on matching information of multiple groups of bytes when processing data to be encoded. This method is mainly used to improve encoding efficiency and accuracy when implementing data encoding in a hardware acceleration card.

首先,从待编码数据的第一组开始进行轮询,意味着从待编码数据的开头开始逐组进行比较;这确保了对待编码数据的每一组都会被处理,并且每组都有机会找到最优匹配信息。在比较过程中,需要确定下一比较过程中的当前字节的地址,以便继续比较;通过计算当前字节的第一匹配长度和下一个字节的第二匹配长度,可以确定下一比较过程中的当前字节的地址;这个步骤确保了比较过程的连续性,使得能够顺利地在待编码数据中进行匹配查找。在比较过程中,需要判断何时结束比较,并确定最优匹配信息;一旦满足预设结束条件,即可以确定当前比较过程中的当前字节的匹配信息为最优匹配信息;预设结束条件可能涉及到匹配长度、匹配距离或其他指标,具体条件应根据具体实现而定。First, polling starts from the first group of data to be encoded, which means that the data to be encoded is compared group by group from the beginning; this ensures that each group of data to be encoded will be processed and each group has the opportunity to find the best matching information. During the comparison process, it is necessary to determine the address of the current byte in the next comparison process in order to continue the comparison; by calculating the first matching length of the current byte and the second matching length of the next byte, the address of the current byte in the next comparison process can be determined; this step ensures the continuity of the comparison process, so that matching searches can be smoothly performed in the data to be encoded. During the comparison process, it is necessary to determine when to end the comparison and determine the best matching information; once the preset end condition is met, the matching information of the current byte in the current comparison process can be determined as the best matching information; the preset end condition may involve matching length, matching distance or other indicators, and the specific conditions should be determined according to the specific implementation.

通过这些步骤,可以在待编码数据中逐组进行比较,找到满足预设条件的最优匹配信息,并在每组中确定最优匹配信息。这有助于提高编码效率和准确性,以便在编码过程中更有效地利用匹配信息。Through these steps, the data to be encoded can be compared group by group, the best matching information that meets the preset conditions can be found, and the best matching information can be determined in each group. This helps to improve encoding efficiency and accuracy so that matching information can be used more effectively in the encoding process.

在一种实施例中,将满足预设结束条件的比较过程中的当前字节的匹配信息确定为最优匹配信息之前,还包括:判断下一比较过程中的当前字节的地址偏移量是否大于目标通道数量;若大于目标通道数量,则判定当前比较过程中的当前字节满足预设结束条件;若小于目标通道数量,则判定当前比较过程的当前字节不满足预设结束条件。In one embodiment, before determining the matching information of the current byte in the comparison process that meets the preset end condition as the optimal matching information, it also includes: judging whether the address offset of the current byte in the next comparison process is greater than the target channel number; if it is greater than the target channel number, judging that the current byte in the current comparison process meets the preset end condition; if it is less than the target channel number, judging that the current byte in the current comparison process does not meet the preset end condition.

具体而言,判断下一比较过程中的当前字节的地址偏移量是否大于目标通道数量:这一步骤的作用是检查下一比较过程中需要进行比较的字节与待编码数据中的N组字节之间的距离是否已经超过了目标通道数量,地址偏移量是指该字节相对于该组字节的第一个字节开始位置的偏移量,目标通道数量是根据硬件加速卡的状态参数和/或用户的需求指令确定的通道数量配置信息。如果下一比较过程中的当前字节的地址偏移量大于目标通道数量,说明此字节已经超出了目标通道范围,需要停止比较。也即,如果此字节已经超出了目标通道数量的范围,就可以判定比较过程已经完成,当前比较过程中的当前字节满足了预设结束条件;这意味着已经找到了最优匹配信息,不需要再进行进一步的比较。如果下一比较过程中需要进行比较的当前字节的地址偏移量仍然小于目标通道数量,说明比较过程还没有完成,需要继续比较后续的字节,以确定最优匹配信息;这时候就不能判定当前字节满足预设结束条件,需要继续进行比较直到满足结束条件。Specifically, determine whether the address offset of the current byte in the next comparison process is greater than the target channel number: the purpose of this step is to check whether the distance between the byte to be compared in the next comparison process and the N groups of bytes in the data to be encoded has exceeded the target channel number. The address offset refers to the offset of the byte relative to the starting position of the first byte of the group of bytes. The target channel number is the channel number configuration information determined according to the state parameters of the hardware accelerator card and/or the user's demand instructions. If the address offset of the current byte in the next comparison process is greater than the target channel number, it means that this byte has exceeded the target channel range and the comparison needs to be stopped. That is, if this byte has exceeded the range of the target channel number, it can be determined that the comparison process has been completed and the current byte in the current comparison process meets the preset end condition; this means that the optimal matching information has been found and no further comparison is required. If the address offset of the current byte to be compared in the next comparison process is still less than the target channel number, it means that the comparison process has not been completed and it is necessary to continue to compare subsequent bytes to determine the optimal matching information; at this time, it cannot be determined that the current byte meets the preset end condition and it is necessary to continue to compare until the end condition is met.

本实施例目的是在比较过程中及时判断是否已经找到了最优匹配信息,以提高数据编码的效率和准确性。The purpose of this embodiment is to timely determine whether the optimal matching information has been found during the comparison process, so as to improve the efficiency and accuracy of data encoding.

在一种实施例中,根据当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度确定下一比较过程中的当前字节的地址,包括:在第一匹配长度小于第二匹配长度时,输出当前比较过程中的当前字节,并将当前比较过程中的当前字节的地址加一作为下一比较过程中的当前字节的地址。In one embodiment, the address of the current byte in the next comparison process is determined based on the first matching length of the current byte in the current comparison process and the second matching length of the next byte, including: when the first matching length is less than the second matching length, outputting the current byte in the current comparison process, and adding one to the address of the current byte in the current comparison process as the address of the current byte in the next comparison process.

在一种实施例中，根据当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度确定下一比较过程中的当前字节的地址，包括：在第一匹配长度不小于第二匹配长度时，输出第一匹配长度，并将当前比较过程中的当前字节的地址加第一匹配长度之后的地址作为下一比较过程中的当前字节的地址。In one embodiment, the address of the current byte in the next comparison process is determined based on the first match length of the current byte in the current comparison process and the second match length of the next byte, including: when the first match length is not less than the second match length, the first match length is output, and the address obtained by adding the first match length to the address of the current byte in the current comparison process is used as the address of the current byte in the next comparison process.

本实施例描述了在处理待编码数据时,如何根据当前字节和下一字节的匹配长度来确定下一比较过程中当前字节的地址。本实施例有助于在数据编码过程中寻找最优的匹配信息,从而提高编码效率和压缩比。This embodiment describes how to determine the address of the current byte in the next comparison process according to the matching length between the current byte and the next byte when processing the data to be encoded. This embodiment helps to find the optimal matching information in the data encoding process, thereby improving the encoding efficiency and compression ratio.

步骤一:比较当前字节和下一字节的匹配长度。在这一步骤中,会比较当前字节(记作字节A)的第一匹配长度(记作长度A)和下一字节(记作字节B)的第二匹配长度(记作长度B)。这两个匹配长度分别代表了从字节A和字节B开始在已编码数据中能够找到的最长匹配字符串的长度。Step 1: Compare the matching lengths of the current byte and the next byte. In this step, the first matching length (recorded as length A) of the current byte (recorded as byte A) and the second matching length (recorded as length B) of the next byte (recorded as byte B) are compared. These two matching lengths represent the lengths of the longest matching strings that can be found in the encoded data starting from byte A and byte B, respectively.

步骤二:判断匹配长度的大小。比较长度A和长度B的大小,以确定哪一个更长。这一步是为了决定搜索策略,即接下来应该继续搜索以字节A为起点的匹配,还是跳到以字节B为起点进行搜索。Step 2: Determine the length of the match. Compare length A and length B to determine which one is longer. This step is to determine the search strategy, that is, whether to continue searching for a match starting with byte A or jump to searching starting with byte B.

步骤三:根据匹配长度的大小决定输出和地址更新。Step 3: Determine the output and address update based on the size of the matching length.

第一种情况：第一匹配长度小于第二匹配长度。输出当前字节：这意味着以当前字节A为起始点的匹配不是更优的，因此舍弃字节A的匹配信息，仅将字节A本身作为字面量输出，并继续搜索更优的匹配。地址加一：将当前字节A的地址加一，以字节B作为新的当前字节，继续下一轮的比较。这是因为字节B的匹配长度更长，可能带来更优的压缩效果。The first case: the first match length is less than the second match length. Output the current byte: this means that the match starting from the current byte A is not the better one, so the match information of byte A is discarded and byte A itself is output as a literal, while the search continues for a better match starting from byte B. Address plus one: add one to the address of the current byte A, use byte B as the new current byte, and continue the next round of comparison. This is because the match length of byte B is longer, which may bring a better compression effect.

第二种情况:第一匹配长度不小于第二匹配长度。输出第一匹配长度:由于长度A不小于长度B,认为以字节A为起始点的匹配更有可能是最优的(或者至少在当前比较轮次中是这样)。因此,选择输出与字节A相关的匹配信息,包括第一匹配长度。更新地址:接下来,将当前字节A的地址加上第一匹配长度后的地址作为下一轮比较的起始地址。这是因为,既然从字节A开始能够找到一个较长的匹配字符串,那么在字节A之后的某个位置(即地址加第一匹配长度后的位置)继续搜索可能会找到另一个潜在的匹配点。The second case: the first match length is not less than the second match length. Output the first match length: Since length A is not less than length B, it is considered that the match starting from byte A is more likely to be optimal (or at least in the current comparison round). Therefore, choose to output the matching information related to byte A, including the first match length. Update the address: Next, add the address of the current byte A plus the address after the first match length as the starting address for the next round of comparison. This is because, since a longer matching string can be found starting from byte A, continuing the search at a certain position after byte A (that is, the position after the address plus the first match length) may find another potential matching point.

本实施例,通过比较当前字节和下一字节的匹配长度,可以更加智能地选择搜索的起点和方向,避免在不太可能找到更长匹配字符串的位置进行不必要的搜索。通过选择更有可能产生长匹配字符串的字节作为搜索起点,系统可以更快地找到最优匹配信息,从而提高编码效率和压缩率。不同的数据可能具有不同的特性,例如某些数据段可能更容易产生长匹配字符串。通过动态调整搜索过程,可以适应不同数据的特性,实现更加高效的编码。In this embodiment, by comparing the matching lengths of the current byte and the next byte, the starting point and direction of the search can be selected more intelligently, avoiding unnecessary searches at locations where it is unlikely to find a longer matching string. By selecting bytes that are more likely to produce long matching strings as the search starting point, the system can find the optimal matching information more quickly, thereby improving encoding efficiency and compression rate. Different data may have different characteristics, for example, some data segments may be more likely to produce long matching strings. By dynamically adjusting the search process, the characteristics of different data can be adapted to achieve more efficient encoding.
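
The comparison flow just described can be modeled sequentially as below (the hardware runs it as a pipeline across the channels). The MatchInfo layout from the earlier sketch, the emit_* placeholders, the zero-indexed in-group addressing and the treatment of zero-length matches as literals are assumptions made for illustration.

```c
#include <stdio.h>

/* Sequential model of the lazy-match comparison described above.
 * match[i] holds the matching information of the byte at in-group address i;
 * emit_literal/emit_match are placeholder outputs. */

typedef struct { unsigned length; unsigned distance; } MatchInfo;   /* as in the earlier sketch */

static void emit_literal(unsigned char byte)  { printf("literal %u\n", byte); }
static void emit_match(const MatchInfo *m)    { printf("match len=%u dist=%u\n", m->length, m->distance); }

/* Process one group of `channels` bytes, starting at in-group address
 * `offset`, and return the offset carried into the next group. */
unsigned compare_group(const unsigned char *bytes, const MatchInfo *match,
                       unsigned channels, unsigned offset)
{
    if (offset >= channels)                       /* this group produces no output */
        return offset - channels;

    unsigned addr = offset;                       /* address of the current byte   */
    for (;;) {
        unsigned cur_len  = match[addr].length;                                /* first match length  */
        unsigned next_len = (addr + 1 < channels) ? match[addr + 1].length : 0; /* second match length */
        unsigned next_addr;

        if (cur_len < next_len || cur_len == 0) {
            emit_literal(bytes[addr]);            /* keep the byte, discard its (shorter) match */
            next_addr = addr + 1;                 /* next comparison starts one byte later      */
        } else {
            emit_match(&match[addr]);             /* output the first match length              */
            next_addr = addr + cur_len;           /* skip over the matched bytes                */
        }

        if (next_addr >= channels)                /* offset leaves the group: end condition met; */
            return next_addr - channels;          /* equality is the virtual-0-address case      */
        addr = next_addr;
    }
}
```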

在一种实施例中,还包括:若当前字节的地址等于最大通道地址时,缓存当前字节的匹配信息;在进入下一组字节的比较过程时,构建虚拟0地址,虚拟0地址中存储最大通道地址的字节的匹配信息;将虚拟0地址存储的最大通道地址的字节的匹配信息作为当前比较过程的当前字节的匹配信息,并进入根据当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度确定下一比较过程中的当前字节的地址的步骤。In one embodiment, it also includes: if the address of the current byte is equal to the maximum channel address, caching the matching information of the current byte; when entering the comparison process of the next group of bytes, constructing a virtual 0 address, storing the matching information of the byte of the maximum channel address in the virtual 0 address; using the matching information of the byte of the maximum channel address stored in the virtual 0 address as the matching information of the current byte in the current comparison process, and entering the step of determining the address of the current byte in the next comparison process according to the first matching length of the current byte in the current comparison process and the second matching length of the next byte.

本实施例描述了一种在硬件加速卡的数据编码过程中,处理多通道匹配信息并确定最优匹配信息的特定情况的方式。This embodiment describes a method for processing multi-channel matching information and determining optimal matching information in a specific case during data encoding of a hardware acceleration card.

步骤一:若当前字节的地址等于最大通道地址时,缓存当前字节的匹配信息。当在多组字节中并行查找匹配字符串时,每一组都有一个起始字节(当前字节)和对应的地址。如果当前字节的地址(即当前比较的字节在组中的位置)达到了最大通道地址(即每组中的最后一个字节),则表明该组字节的匹配查找已经到达尾部。此时,缓存当前字节的匹配信息,以便后续处理。Step 1: If the address of the current byte is equal to the maximum channel address, cache the matching information of the current byte. When searching for matching strings in multiple groups of bytes in parallel, each group has a starting byte (current byte) and a corresponding address. If the address of the current byte (i.e. the position of the currently compared byte in the group) reaches the maximum channel address (i.e. the last byte in each group), it indicates that the matching search for the group of bytes has reached the end. At this time, cache the matching information of the current byte for subsequent processing.

步骤二:在进入下一组字节的比较过程时,构建虚拟0地址。当完成一组字节的匹配查找并缓存了最后一个字节的匹配信息后,需要进入下一组字节的查找过程。为了保持查找过程的连续性和一致性,构建了一个虚拟的0地址。这个虚拟0地址并不是实际的数据地址,而是用于存储前一组中最大通道地址(最后一个字节)的匹配信息,便于与下一组中第一个地址的数据比较。Step 2: When entering the comparison process of the next group of bytes, construct a virtual 0 address. After completing the matching search of a group of bytes and caching the matching information of the last byte, it is necessary to enter the search process of the next group of bytes. In order to maintain the continuity and consistency of the search process, a virtual 0 address is constructed. This virtual 0 address is not the actual data address, but is used to store the matching information of the maximum channel address (the last byte) in the previous group, so as to facilitate comparison with the data of the first address in the next group.

步骤三:将虚拟0地址存储的最大通道地址的字节的匹配信息作为当前比较过程的当前字节的匹配信息。在进入新的比较过程(即下一组字节的查找)时,将虚拟0地址中存储的匹配信息作为当前字节的匹配信息。这是因为,虽然物理上进入了一个新的字节组,但从逻辑上看,希望继续前一组字节的匹配查找过程,特别是当发现某个匹配可能跨越了两个字节组时。Step 3: The matching information of the byte of the maximum channel address stored in the virtual 0 address is used as the matching information of the current byte of the current comparison process. When entering a new comparison process (i.e., searching for the next group of bytes), the matching information stored in the virtual 0 address is used as the matching information of the current byte. This is because, although a new byte group is physically entered, logically, it is hoped to continue the matching search process of the previous group of bytes, especially when it is found that a match may span two byte groups.

步骤四:进入根据当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度确定下一比较过程中的当前字节的地址的步骤。在确定了当前字节的匹配信息后(无论是真实的还是从虚拟0地址中获取的),继续根据已定义的策略(如上述实施例所述)来确定下一比较过程中的当前字节的地址。这个过程将指导系统如何继续在多组字节中查找匹配字符串,并最终确定最优匹配信息。Step 4: Enter the step of determining the address of the current byte in the next comparison process according to the first matching length of the current byte in the current comparison process and the second matching length of the next byte. After determining the matching information of the current byte (whether real or obtained from the virtual 0 address), continue to determine the address of the current byte in the next comparison process according to the defined strategy (as described in the above embodiment). This process will guide the system on how to continue to search for matching strings in multiple groups of bytes and finally determine the optimal matching information.

可见,本实施例提供了一种处理多通道匹配信息时跨越字节组边界的策略。通过缓存最后一组字节的匹配信息、构建虚拟0地址以及将虚拟0地址中的信息作为新组字节的起始匹配信息,能够无缝地在不同组之间继续查找过程,从而找到最优的匹配信息。It can be seen that this embodiment provides a strategy for crossing byte group boundaries when processing multi-channel matching information. By caching the matching information of the last group of bytes, constructing a virtual 0 address, and using the information in the virtual 0 address as the starting matching information of a new group of bytes, the search process can be seamlessly continued between different groups, thereby finding the optimal matching information.
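
A compact sketch of this boundary handling, with assumed structure and function names, is given below; the cached entry plays the role of the virtual 0 address when the next group starts.

```c
/* Sketch of the group-boundary handling described above: when the current
 * byte sits at the maximum channel address, its matching information is
 * cached and replayed as a "virtual address 0" entry in the next group. */

typedef struct { unsigned length; unsigned distance; } MatchInfo;

typedef struct {
    MatchInfo cached;   /* match info of the byte at the maximum channel address */
    int       valid;    /* non-zero if the cache must seed the next group        */
} GroupCarry;

/* Called at the end of a group when the current byte is the last channel. */
void cache_boundary_match(GroupCarry *carry, const MatchInfo *last)
{
    carry->cached = *last;
    carry->valid  = 1;
}

/* Called when the next group starts with offset 0: the cached entry acts as
 * the current byte of the first comparison, so a match that spans the two
 * groups is not lost. */
MatchInfo take_virtual_zero(GroupCarry *carry, const MatchInfo *first_of_group)
{
    if (carry->valid) {
        carry->valid = 0;
        return carry->cached;
    }
    return *first_of_group;
}
```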

请参照图2，描述了一种硬件实现的架构，具体而言，提供两类存储队列，分别为数据存储队列，用于存储待编码数据中的各个字节，匹配信息存储队列，用于存储与每个字节对应的匹配信息。需要说明的是，由于硬件加速卡的并行处理能力，各个数据存储队列中的字节数据是同时来的，所以当检测到任一个fifo(First In First Out,先入先出队列)非空时，即可将8个fifo的数据同时读出。各个匹配信息是随机来的(与前端执行各个字节的匹配算法的速度相关)，所以当检测到8个match fifo(匹配信息存储队列)同时非空时，一起将8个匹配信息读出。所有输出的字节匹配信息通过打拍缓存到对应的寄存器数组中，同时寄存器数组会显示数据有效标志，控制模块在检测到数据有效标志时，读取寄存器数组中缓存的字节匹配信息。控制模块作为主要的管理调度模块，主要包括两个部分：用于流水并行比较的模块和用于得到最优匹配信息对应的数据结果集的匹配模块。用于流水并行比较的模块主要用于执行图3中的流程，具体为：当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度进行比较；当前字节的第一匹配长度是否小于下一字节的第二匹配长度；若是，输出当前比较过程中的当前字节，并将当前比较过程中的当前字节的地址加一作为下一比较过程中的当前字节的地址；若否，输出第一匹配长度，并将当前比较过程中的当前字节的地址加第一匹配长度之后的地址作为下一比较过程中的当前字节的地址；然后，下一比较过程中的当前字节的地址偏移量是否大于目标通道数量；若大于目标通道数量，则结束；否则重新进入当前比较过程中的当前字节的第一匹配长度和下一字节的第二匹配长度进行比较的过程。此种方式在一定限度上保持了匹配信息比较的连贯性。Please refer to Figure 2, which describes a hardware-implemented architecture. Specifically, two types of storage queues are provided, namely, a data storage queue for storing each byte in the data to be encoded, and a matching information storage queue for storing matching information corresponding to each byte. It should be noted that due to the parallel processing capability of the hardware accelerator card, the byte data in each data storage queue arrives at the same time, so when any fifo (First In First Out queue) is detected to be non-empty, the data of the 8 fifos can be read out at the same time. The matching information arrives at irregular times (depending on the speed at which the front end executes the matching algorithm for each byte), so only when the 8 match fifos (matching information storage queues) are detected to be non-empty at the same time are the 8 pieces of matching information read out together. All output byte matching information is buffered into the corresponding register array through pipeline register stages (clocked delay buffering), and the register array raises a data valid flag; when the data valid flag is detected, the control module reads the byte matching information cached in the register array. As the main management and scheduling module, the control module mainly includes two parts: a module for pipeline parallel comparison and a matching module for obtaining the data result set corresponding to the optimal matching information. The module for pipeline parallel comparison is mainly used to execute the process in Figure 3, specifically: compare the first matching length of the current byte in the current comparison process with the second matching length of the next byte; check whether the first matching length of the current byte is less than the second matching length of the next byte; if so, output the current byte in the current comparison process, and add one to the address of the current byte in the current comparison process as the address of the current byte in the next comparison process; if not, output the first matching length, and use the address obtained by adding the first matching length to the address of the current byte in the current comparison process as the address of the current byte in the next comparison process; then, check whether the address offset of the current byte in the next comparison process is greater than the number of target channels; if it is greater than the number of target channels, end; otherwise, re-enter the process of comparing the first matching length of the current byte in the current comparison process with the second matching length of the next byte. This approach preserves, to a certain extent, the continuity of the matching-information comparison.
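
The all-channels-ready condition for the match FIFOs described above can be modeled in software roughly as follows; the FIFO structure and depth are assumptions, since the real queues are hardware FIFOs.

```c
#include <stdbool.h>

#define NUM_CHANNELS 8
#define FIFO_DEPTH   16

/* Minimal software FIFO standing in for the hardware queues (assumption). */
typedef struct {
    unsigned buf[FIFO_DEPTH];
    int head, tail, count;
} Fifo;

static bool fifo_empty(const Fifo *f) { return f->count == 0; }

static unsigned fifo_pop(Fifo *f)
{
    unsigned v = f->buf[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return v;
}

/* Matching information arrives at the pace of the front-end match search,
 * so the 8 match FIFOs are only popped once all of them are non-empty,
 * mirroring the "read the 8 pieces of matching information together" rule. */
static bool pop_match_fifos(Fifo *match_fifo[NUM_CHANNELS], unsigned out[NUM_CHANNELS])
{
    for (int i = 0; i < NUM_CHANNELS; ++i)
        if (fifo_empty(match_fifo[i]))
            return false;                  /* not all channels ready yet   */

    for (int i = 0; i < NUM_CHANNELS; ++i)
        out[i] = fifo_pop(match_fifo[i]);  /* read all 8 entries together  */
    return true;
}
```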

如图4所示,在一个具体实施例中,<1>、<2>、<3>、<4>为4个字节组,每个组中包括8个字节,如字节组<1>中包括a1-h1,字节组<2>中包括a2-h2等。每个字节下的数字为匹配长度。As shown in FIG4 , in a specific embodiment, <1>, <2>, <3>, and <4> are four byte groups, each of which includes 8 bytes, such as byte group <1> includes a1-h1, byte group <2> includes a2-h2, etc. The number under each byte is the matching length.

其中计算各个字节组中的各个字节的匹配长度,是以组为单位并行的,提高计算效率。在得到各个字节的匹配长度之后,通过比较确定最优匹配信息是从第一组开始轮询,直到匹配到最优匹配信息为止。The matching length of each byte in each byte group is calculated in parallel in groups to improve the calculation efficiency. After the matching length of each byte is obtained, the best matching information is determined by comparison and polling starts from the first group until the best matching information is matched.

具体而言,确定最优匹配信息的过程如下:Specifically, the process of determining the best matching information is as follows:

从第一组开始比较(offset=1):Start comparing from the first group (offset=1):

若以a1为起始字节,开始最优比较,输出为a1,4,7;下一比较地址在下一组中的地址偏移量为offset1=7-2=5。If a1 is used as the starting byte, the optimal comparison begins and the output is a1, 4, 7; the address offset of the next comparison address in the next group is offset1=7-2=5.

若以b1为起始字节,开始最优比较,输出为4,7;下一比较地址在下一组中的地址偏移量为offset2=7-2=5。If b1 is used as the starting byte, the optimal comparison begins and the output is 4, 7; the address offset of the next comparison address in the next group is offset2=7-2=5.

若以c1为起始字节,开始最优比较,输出为4,g1,10;下一比较地址在下一组中的地址偏移量为offset3=10-0=10。If c1 is used as the starting byte, the optimal comparison begins and the output is 4, g1, 10; the address offset of the next comparison address in the next group is offset3=10-0=10.

若以d1为起始字节,开始最优比较,输出为3,g1,10;下一比较地址在下一组中的地址偏移量为offset4=10-0=10。If d1 is used as the starting byte, the optimal comparison begins and the output is 3, g1, 10; the address offset of the next comparison address in the next group is offset4=10-0=10.

若以e1为起始字节,开始最优比较,输出为e1,7;下一比较地址在下一组中的地址偏移量为offset5=7-2=5。If e1 is used as the starting byte, the optimal comparison begins and the output is e1, 7; the address offset of the next comparison address in the next group is offset5=7-2=5.

若以f1为起始字节,开始最优比较,输出为7;下一比较地址在下一组中的地址偏移量为offset6=7-2=5。If f1 is used as the starting byte, the optimal comparison begins and the output is 7; the address offset of the next comparison address in the next group is offset6=7-2=5.

若以g1为起始字节,开始最优比较,输出为g1,10;下一比较地址在下一组中的地址偏移量为offset7=10-0=10。If g1 is used as the starting byte, the optimal comparison begins and the output is g1, 10; the address offset of the next comparison address in the next group is offset7=10-0=10.

若以h1为起始字节,开始最优比较,输出为10;下一比较地址在下一组中的地址偏移量为offset8=10-0=10。If h1 is used as the starting byte, the optimal comparison begins and the output is 10; the address offset of the next comparison address in the next group is offset8=10-0=10.

第一组比较offset默认值为1,所以选择第1个最优结果匹配数据集,输出结果为a1,4,7,offset更新为5。The default value of the first comparison offset is 1, so the first optimal result matching data set is selected, the output result is a1, 4, 7, and the offset is updated to 5.

第二组比较(offset=5):The second set of comparisons (offset=5):

从a2开始最优比较,输出为3,d2,3,下一比较地址在下一组中的地址偏移量为offset1=0。The optimal comparison starts from a2, the output is 3, d2, 3, and the address offset of the next comparison address in the next group is offset1=0.

从b2开始最优比较,输出为b2,4,7,下一比较地址在下一组中的地址偏移量为offset2=7-2=5。The optimal comparison starts from b2, and the output is b2, 4, 7. The address offset of the next comparison address in the next group is offset2=7-2=5.

从c2开始最优比较,输出为4,7,下一比较地址在下一组中的地址偏移量为offset3=7-2=5。The optimal comparison starts from c2, the output is 4, 7, and the address offset of the next comparison address in the next group is offset3=7-2=5.

从d2开始最优比较,输出为3,7,下一比较地址在下一组中的地址偏移量为offset4=7-2=5。The optimal comparison starts from d2, the output is 3, 7, and the address offset of the next comparison address in the next group is offset4=7-2=5.

从e2开始最优比较,输出为3,下一比较地址在下一组中的地址偏移量为offset5=0。The optimal comparison starts from e2, the output is 3, and the address offset of the next comparison address in the next group is offset5=0.

从f2开始最优比较,输出为f2,7,下一比较地址在下一组中的地址偏移量为offset6=7-2=5。The optimal comparison starts from f2, the output is f2,7, and the address offset of the next comparison address in the next group is offset6=7-2=5.

从g2开始最优比较,输出为7,下一比较地址在下一组中的地址偏移量为offset7=7-2=5。The optimal comparison starts from g2, the output is 7, and the address offset of the next comparison address in the next group is offset7=7-2=5.

从h2开始最优比较,因为h2的地址等于通道数量8,所以不输出,缓存h2的匹配信息,下一比较地址在下一组中的地址偏移量为offset8=0。The optimal comparison starts from h2. Since the address of h2 is equal to the number of channels 8, it is not output and the matching information of h2 is cached. The address offset of the next comparison address in the next group is offset8=0.

第二组比较offset起始值为5,所以直接选择第5个最优结果匹配数据集,输出结果为3,offset更新为0。The second set of comparison offsets starts at 5, so the fifth best result matching data set is directly selected, the output result is 3, and the offset is updated to 0.

第三组比较(offset=0):The third group of comparisons (offset=0):

从h2开始最优比较,输出6,11;下一比较地址在下一组中的地址偏移量为offset0=11-2=9。The optimal comparison starts from h2, and the output is 6, 11; the address offset of the next comparison address in the next group is offset0=11-2=9.

从a3开始最优比较,输出为h2,a3,20;下一比较地址在下一组中的地址偏移量为offset1=20-6=14。The optimal comparison starts from a3, and the output is h2, a3, 20; the address offset of the next comparison address in the next group is offset1 = 20-6 = 14.

第三组比较offset起始值为0,所以上一次最后一个匹配信息参与到本组比较,本组只需比较h2和a3的大小即可决定本组的最优匹配数据集。若h2>a3,则选取h2开始的结果集,若h2<a3,则选取a3开始结果集。因此本次比较后offset更新为9。The offset starting value of the third comparison is 0, so the last matching information of the previous comparison is involved in this comparison. This group only needs to compare the size of h2 and a3 to determine the best matching data set of this group. If h2>a3, the result set starting with h2 is selected. If h2<a3, the result set starting with a3 is selected. Therefore, after this comparison, the offset is updated to 9.

第四组比较(offset=9):The fourth group of comparisons (offset=9):

最后一组数据，offset大于最大通道数，因此本次不输出。For the last group of data, the offset is greater than the maximum number of channels, so no result is output this time.

在一种实施例中,将最优匹配信息对应的数据结果集通过目标通道数量的通道输出至编码模块,包括:将最优匹配信息对应的数据结果集写入深度为m的乒乓寄存器组,m为N的正整数倍;在乒乓寄存器组的N个通道标志有效时,触发编码模块通过目标通道数量的通道读取数据结果集。In one embodiment, the data result set corresponding to the optimal matching information is output to the encoding module through the channels of the target number of channels, including: writing the data result set corresponding to the optimal matching information into a ping-pong register group with a depth of m, where m is a positive integer multiple of N; when the N channel flags of the ping-pong register group are valid, triggering the encoding module to read the data result set through the channels of the target number of channels.

本实施例描述了在数据编码过程中,如何将最优匹配信息对应的数据结果集有效地通过目标通道数量的通道输出至编码模块的具体实施方式。This embodiment describes a specific implementation method of how to effectively output the data result set corresponding to the best matching information to the encoding module through the channels of the target number of channels during the data encoding process.

步骤一:将最优匹配信息对应的数据结果集写入深度为m的乒乓寄存器组。使用深度为m的乒乓寄存器组(通常包括两个或更多个交替使用的寄存器)作为缓冲区,可以暂存最优匹配信息对应的数据结果集。这样,即使编码模块的处理速度较慢,计算模块也可以继续处理下一组数据,提高了整个系统的吞吐量和处理效率。乒乓寄存器组的深度m为N的正整数倍,意味着它可以同时存储多个通道的数据结果集。这允许系统并行处理多组数据,进一步提高了系统的并行处理能力。Step 1: Write the data result set corresponding to the optimal matching information into a ping-pong register group with a depth of m. Using a ping-pong register group with a depth of m (usually including two or more registers used alternately) as a buffer, the data result set corresponding to the optimal matching information can be temporarily stored. In this way, even if the processing speed of the encoding module is slow, the computing module can continue to process the next set of data, improving the throughput and processing efficiency of the entire system. The depth m of the ping-pong register group is a positive integer multiple of N, which means that it can store data result sets of multiple channels at the same time. This allows the system to process multiple sets of data in parallel, further improving the parallel processing capability of the system.

步骤二:在乒乓寄存器组的N个通道标志有效时,触发编码模块通过目标通道数量的通道读取数据结果集。乒乓寄存器组中的通道标志用于指示某个通道的数据结果集是否已准备好供编码模块读取。当N个通道标志都有效时(即N个通道的数据结果集都已准备好),触发编码模块开始读取数据。这种机制确保了数据的一致性和同步性。Step 2: When the N channel flags of the ping-pong register group are valid, the encoding module is triggered to read the data result set through the channels of the target number of channels. The channel flags in the ping-pong register group are used to indicate whether the data result set of a certain channel is ready for the encoding module to read. When all N channel flags are valid (that is, the data result sets of N channels are ready), the encoding module is triggered to start reading data. This mechanism ensures data consistency and synchronization.

本实施例,由于乒乓寄存器组的深度和通道数量都是可配置的(基于N和m的设定),因此系统可以根据不同的应用需求进行灵活调整。例如,如果应用需要更高的吞吐量,可以增加乒乓寄存器组的深度或通道数量;反之,如果资源有限,可以减少这些参数以节省硬件资源。通过乒乓寄存器组进行缓冲和同步,可以减少因编码模块和计算模块处理速度不匹配而导致的数据丢失风险。即使编码模块暂时无法处理新的数据结果集,计算模块也可以继续将结果写入乒乓寄存器组,等待编码模块准备好后再进行读取和处理。综上所述,本实施例通过引入乒乓寄存器组作为缓冲区,并结合通道标志的同步机制,实现了最优匹配信息对应的数据结果集的有效输出和编码模块的同步处理,提高了整个数据编码系统的效率和可靠性。In this embodiment, since the depth and number of channels of the ping-pong register group are configurable (based on the settings of N and m), the system can be flexibly adjusted according to different application requirements. For example, if the application requires higher throughput, the depth or number of channels of the ping-pong register group can be increased; conversely, if resources are limited, these parameters can be reduced to save hardware resources. Buffering and synchronization through the ping-pong register group can reduce the risk of data loss caused by the mismatch of the processing speed of the encoding module and the computing module. Even if the encoding module is temporarily unable to process a new data result set, the computing module can continue to write the result to the ping-pong register group and wait for the encoding module to be ready before reading and processing. In summary, this embodiment introduces the ping-pong register group as a buffer, and combines the synchronization mechanism of the channel flag to achieve the effective output of the data result set corresponding to the optimal matching information and the synchronous processing of the encoding module, thereby improving the efficiency and reliability of the entire data encoding system.
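
The buffering scheme above might be sketched as follows, assuming N = 8 channels, a depth of 2N and illustrative type names; the multi-group variant described in the next embodiment simply rotates writes to whichever bank is not currently being read.

```c
#include <stdbool.h>

/* Sketch of the ping-pong output buffering described above. The bank count,
 * flag handling and the Result layout are assumptions made for illustration. */

#define N_CHANNELS 8
#define DEPTH      (2 * N_CHANNELS)     /* m, a positive integer multiple of N */

typedef struct { unsigned length, distance, literal; } Result;   /* one data-result entry */

typedef struct {
    Result entries[DEPTH];
    bool   channel_valid[N_CHANNELS];   /* per-channel "ready to read" flags */
} Bank;

typedef struct {
    Bank banks[2];      /* two alternating banks: one being written, one being read */
    int  write_bank;    /* bank the computing module currently fills                */
} PingPong;

/* The computing module writes a channel's result and raises its flag. */
void pingpong_write(PingPong *pp, int channel, int slot, Result r)
{
    Bank *b = &pp->banks[pp->write_bank];
    b->entries[slot * N_CHANNELS + channel] = r;
    b->channel_valid[channel] = true;
}

/* When all N channel flags are valid, the encoding module is triggered to
 * read the bank through the target number of channels. */
bool pingpong_ready(const PingPong *pp)
{
    const Bank *b = &pp->banks[pp->write_bank];
    for (int i = 0; i < N_CHANNELS; ++i)
        if (!b->channel_valid[i])
            return false;
    return true;
}

/* Hand the full bank to the reader and direct new writes to the other bank. */
void pingpong_swap(PingPong *pp)
{
    Bank *b = &pp->banks[pp->write_bank];
    for (int i = 0; i < N_CHANNELS; ++i)
        b->channel_valid[i] = false;    /* the read side will consume this bank */
    pp->write_bank ^= 1;                /* new writes go to the other bank      */
}
```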

在一种实施例中,乒乓寄存器组的个数为多个,将最优匹配信息对应的数据结果集写入深度为m的乒乓寄存器组,包括:在编码模块通过目标通道数量的通道读取其中一个乒乓寄存器组中存储的数据结果集时,将最优匹配信息对应的数据结果集写入其它未被读取的乒乓寄存器组中。In one embodiment, there are multiple ping-pong register groups, and the data result set corresponding to the optimal matching information is written into the ping-pong register group with a depth of m, including: when the encoding module reads the data result set stored in one of the ping-pong register groups through the channels of the target number of channels, the data result set corresponding to the optimal matching information is written into other ping-pong register groups that have not been read.

本实施例描述了在使用多个乒乓寄存器组进行数据缓冲时的操作步骤。乒乓寄存器组通常用于高速数据处理中,以实现连续数据流的无缝缓冲和传输。设置多个乒乓寄存器组是为了实现数据的连续处理和传输。当一个乒乓寄存器组正在被编码模块读取数据时,另一个乒乓寄存器组可以同时被写入新的数据结果集,从而实现了数据处理的并行性和连续性。编码模块在需要处理数据时,会通过目标通道数量的通道从一个乒乓寄存器组中读取最优匹配信息对应的数据结果集。这个过程是数据处理的关键步骤,因为它确保了编码模块能够持续不断地接收到需要处理的数据。在编码模块从一个乒乓寄存器组读取数据的同时,计算模块就将新的最优匹配信息对应的数据结果集写入到另一个未被读取的乒乓寄存器组中。这个步骤确保了数据处理的连续性和效率,因为不需要等待编码模块完成当前数据结果集的处理就可以开始写入新的数据。This embodiment describes the operation steps when using multiple ping-pong register groups for data buffering. Ping-pong register groups are usually used in high-speed data processing to achieve seamless buffering and transmission of continuous data streams. Multiple ping-pong register groups are set to achieve continuous processing and transmission of data. When a ping-pong register group is being read by the encoding module, another ping-pong register group can be written to a new data result set at the same time, thereby achieving parallelism and continuity of data processing. When the encoding module needs to process data, it will read the data result set corresponding to the optimal matching information from a ping-pong register group through the channels of the target number of channels. This process is a key step in data processing because it ensures that the encoding module can continuously receive the data to be processed. While the encoding module reads data from a ping-pong register group, the computing module writes the data result set corresponding to the new optimal matching information into another ping-pong register group that has not been read. This step ensures the continuity and efficiency of data processing because it does not need to wait for the encoding module to complete the processing of the current data result set before starting to write new data.

综上,本实施例通过引入多个乒乓寄存器组,实现了数据的并行处理和连续传输,从而提高了硬件加速卡中数据编码方法的整体效率和性能。这种设计使得处理器和编码模块能够更高效地协同工作,减少了数据处理过程中的等待时间,提高了吞吐量。In summary, this embodiment realizes parallel processing and continuous transmission of data by introducing multiple ping-pong register groups, thereby improving the overall efficiency and performance of the data encoding method in the hardware acceleration card. This design enables the processor and the encoding module to work together more efficiently, reduces the waiting time in the data processing process, and improves the throughput.

以上实施例中，计算模块可以具体用于执行LZ77算法(莱姆佩尔-齐夫77算法)，编码模块可以为huffman(哈夫曼)编码模块。In the above embodiments, the calculation module may be specifically used to execute the LZ77 algorithm (Lempel-Ziv 77 algorithm), and the encoding module may be a Huffman encoding module.

第二方面，如图5所示，本发明提供了一种数据编码装置，应用于硬件加速卡，硬件加速卡还包括计算模块和编码模块，数据编码装置包括：存储器51，用于存储计算机程序；处理器52，用于在执行计算机程序时，实现上述的数据编码方法的步骤。In a second aspect, as shown in FIG. 5, the present invention provides a data encoding device, which is applied to a hardware acceleration card. The hardware acceleration card also includes a computing module and an encoding module. The data encoding device includes: a memory 51 for storing a computer program; a processor 52 for implementing the steps of the above-mentioned data encoding method when executing the computer program.

对于数据编码装置的介绍请参照上述实施例,本发明在此不再赘述。For the introduction of the data encoding device, please refer to the above embodiment, and the present invention will not be described in detail here.

第三方面,如图6所示,本发明提供了一种硬件加速卡,包括上述的数据编码装置,还包括计算模块和编码模块;In a third aspect, as shown in FIG6 , the present invention provides a hardware acceleration card, comprising the above-mentioned data encoding device, and further comprising a computing module and an encoding module;

计算模块用于根据待编码数据和已编码数据确定匹配信息,将匹配信息和待编码数据通过目标通道数量的通道输出至编码模块;The calculation module is used to determine matching information according to the data to be encoded and the encoded data, and output the matching information and the data to be encoded to the encoding module through channels of the target number of channels;

编码模块用于根据匹配信息对待编码数据进行编码;The encoding module is used for encoding the data to be encoded according to the matching information;

匹配信息为待编码数据中和已编码数据中的预设字符串相同时,预设字符串在已编码数据中的匹配距离和匹配长度。The matching information is the matching distance and matching length of the preset character string in the encoded data when the preset character string in the data to be encoded is the same as the preset character string in the encoded data.

硬件加速卡可以但不限于为FPGA(Field-Programmable Gate Array,现场可编程门阵列),对于硬件加速卡的其它介绍请参照上述实施例,本发明在此不再赘述。The hardware acceleration card may be, but is not limited to, an FPGA (Field-Programmable Gate Array). For other introductions to the hardware acceleration card, please refer to the above embodiments, which will not be described in detail in the present invention.

第四方面,本发明提供了一种计算机程序产品,包括计算机程序/指令,计算机程序/指令被处理器执行时实现上述数据编码方法的步骤。In a fourth aspect, the present invention provides a computer program product, comprising a computer program/instructions, which implement the steps of the above-mentioned data encoding method when the computer program/instructions are executed by a processor.

对于计算机程序产品的介绍请参照上述实施例,本发明在此不再赘述。For the introduction of the computer program product, please refer to the above embodiments, and the present invention will not be described in detail here.

第五方面,如图7所示,本发明提供了一种非易失性存储介质61,非易失性存储介质61上存储有计算机程序62,计算机程序62被处理器执行时实现上述的数据编码方法的步骤。In a fifth aspect, as shown in FIG. 7 , the present invention provides a non-volatile storage medium 61 on which a computer program 62 is stored. When the computer program 62 is executed by a processor, the steps of the above-mentioned data encoding method are implemented.

For an introduction to the non-volatile storage medium 61, please refer to the above embodiments; it will not be repeated here.

It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprises", "comprising", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.

The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (20)

1. A data encoding method, applied to a processor in a hardware accelerator card, the hardware accelerator card further comprising a computing module and an encoding module, the data encoding method comprising:
acquiring channel number configuration information determined based on state parameters of the hardware accelerator card and/or demand instructions of a user;
configuring the channel number of the encoding module as a target channel number according to the channel number configuration information;
after the computing module determines matching information according to data to be encoded and encoded data, outputting the matching information and the data to be encoded to the encoding module through the channels of the target channel number, and triggering the encoding module to encode the data to be encoded according to the matching information;
wherein the matching information is the matching distance and the matching length of a preset character string in the encoded data when the preset character string in the data to be encoded is the same as the preset character string in the encoded data.
2. The data encoding method according to claim 1, wherein acquiring channel number configuration information determined based on a state parameter of a hardware accelerator card and/or a demand instruction of a user, comprises:
determining the number of test channels according to a demand instruction of a user, and configuring the number of channels of the encoding module as the number of test channels;
estimating the first computing resource occupation amount of the hardware accelerator card under the number of the test channels;
and adjusting the number of the test channels according to the first computing resource occupation amount to obtain the number of target channels.
3. The data encoding method of claim 2, wherein adjusting the number of test channels according to the first computing resource occupancy to obtain the target number of channels comprises:
comparing the first computing resource occupation amount with a first preset threshold value;
if the first computing resource occupation amount is larger than the first preset threshold value, reducing the number of the test channels to obtain the number of the target channels;
if the first computing resource occupation amount is smaller than the first preset threshold value, increasing the number of the test channels to obtain the number of the target channels;
and if the first computing resource occupation amount is equal to the first preset threshold value, determining that the number of the test channels is the target channel number.
4. The data encoding method of claim 3, wherein after comparing the first computing resource occupancy with a first preset threshold, further comprising:
calculating a difference value between the first calculation resource occupation amount and the first preset threshold value, and calculating a ratio of the difference value to the first preset threshold value;
determining the channel quantity variation according to the ratio and the preset ratio-variation correspondence;
if the first computing resource occupation amount is greater than the first preset threshold, reducing the number of the test channels to obtain the number of the target channels, including:
if the first computing resource occupation amount is larger than the first preset threshold value, subtracting the channel number variation amount from the test channel number to obtain the target channel number;
if the first computing resource occupation amount is smaller than the first preset threshold value, increasing the number of the test channels to obtain the number of the target channels, including:
and if the first computing resource occupation amount is smaller than the first preset threshold value, adding the channel number variation amount to the test channel number to obtain the target channel number.
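A rough C sketch of the adjustment logic of claims 3 and 4 follows, under the assumption that the occupancy and the threshold are expressed in the same units; the ratio_to_delta() table standing in for the preset ratio-to-variation correspondence is invented for illustration.

    /* Map the ratio of (occupancy - threshold) to the threshold onto a channel
     * count change; the breakpoints below are purely illustrative. */
    static int ratio_to_delta(double ratio)
    {
        if (ratio > 0.50) return 4;
        if (ratio > 0.20) return 2;
        return 1;
    }

    /* Return the target channel number derived from the test channel number. */
    int adjust_channels(int test_channels, double occupancy, double threshold)
    {
        if (occupancy == threshold)
            return test_channels;              /* already at the target */

        double ratio = (occupancy - threshold) / threshold;
        if (ratio < 0)
            ratio = -ratio;
        int delta = ratio_to_delta(ratio);

        int n = occupancy > threshold ? test_channels - delta
                                      : test_channels + delta;
        return n < 1 ? 1 : n;                  /* keep at least one channel */
    }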
5. The data encoding method of claim 3, wherein upon determining that the first computing resource occupancy is greater than the first preset threshold, further comprising:
comparing the first computing resource occupation amount with a second preset threshold value, wherein the second preset threshold value is larger than the first preset threshold value;
and if the first computing resource occupation amount is larger than the second preset threshold value, feeding back a reconfiguration signal to a user so that the user can redetermine the number of the test channels based on the reconfiguration signal, and re-entering the step of configuring the channel number of the encoding module as the number of test channels.
6. The data encoding method of claim 4, wherein prior to calculating the difference between the first computing resource occupancy and the first preset threshold, further comprising:
determining an application platform of the hardware accelerator card, and determining a second computing resource occupation amount required by the hardware accelerator card to execute a preset task on the application platform;
determining the computing resource residual quantity of the hardware acceleration card according to the second computing resource occupation quantity, wherein the sum value of the second computing resource occupation quantity and the computing resource residual quantity is not more than all computing resources of the hardware acceleration card;
and determining the first preset threshold according to the computing resource residual quantity.
7. The data encoding method according to claim 1, wherein acquiring channel number configuration information determined based on a state parameter of a hardware accelerator card and/or a demand instruction of a user, comprises:
determining an application platform of the hardware accelerator card, and determining a second computing resource occupation amount required by the hardware accelerator card to execute a preset task on the application platform;
determining the computing resource residual quantity of the hardware acceleration card according to the second computing resource occupation quantity, wherein the sum value of the second computing resource occupation quantity and the computing resource residual quantity is not more than all computing resources of the hardware acceleration card;
and determining the number of the target channels according to the computing resource residual quantity.
8. The data encoding method according to claim 1, wherein acquiring channel number configuration information determined based on a state parameter of a hardware accelerator card and/or a demand instruction of a user, comprises:
acquiring, through a first interface, channel number configuration information determined based on state parameters of the hardware accelerator card and/or demand instructions of a user;
converting the channel number configuration information acquired by the first interface into channel number configuration information corresponding to a second interface by using an interface conversion device;
refreshing the channel number configuration information corresponding to the second interface into a channel number configuration register;
configuring the channel number of the encoding module as the target channel number according to the channel number configuration information, including:
and configuring the channel number of the encoding module as the target channel number according to the channel number configuration information in the channel number configuration register.
9. The data encoding method according to any one of claims 1 to 8, wherein the data to be encoded is divided into N groups, N being an integer greater than 1, the number of bytes in each group being equal to the target channel number; determining matching information according to the data to be encoded and the encoded data, including:
taking each byte in each group as a starting byte, and searching, for the N groups of bytes simultaneously, whether the encoded data contains a character string that matches the string starting at each byte of the group, wherein the length of the character string is at least 2;
recording the matching information of the character string taking each byte as the starting byte, wherein the matching information comprises the matching length of the matched character string and the matching distance of the starting byte of the character string in the encoded data;
determining optimal matching information according to the matching information corresponding to each byte in the N groups of data to be encoded;
outputting the matching information and the data to be encoded to the encoding module through the channels of the target channel number includes:
outputting a data result set corresponding to the optimal matching information to the encoding module through the channels of the target channel number, wherein the data result set comprises the optimal matching information and the data to be encoded.
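As an illustration of the group-wise search in claim 9, the loop below performs one match search per byte of a group, each byte taken as a starting byte; hardware would run these searches in parallel, and the loop only models the resulting data. lz_match and find_match() are the illustrative type and routine from the earlier sketch; the prototype is repeated here so the fragment stands alone.

    #include <stddef.h>

    typedef struct { size_t distance, length; } lz_match;   /* as sketched earlier */
    lz_match find_match(const unsigned char *buf, size_t pos, size_t len);

    /* Fill out[0..target_channels-1] with the match information of the bytes
     * of one group, whose first byte sits at group_start. */
    void match_group(const unsigned char *buf, size_t len,
                     size_t group_start, int target_channels, lz_match *out)
    {
        for (int i = 0; i < target_channels; i++) {
            size_t pos = group_start + i;
            out[i] = (pos < len) ? find_match(buf, pos, len)
                                 : (lz_match){0, 0};   /* past the end: no match */
        }
    }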
10. The data encoding method of claim 9, wherein determining optimal matching information according to matching information corresponding to each of the bytes in the N groups of data to be encoded, comprises:
starting polling from the first group, and determining the address of the current byte in the next comparison process according to the first matching length of the current byte in the current comparison process and the second matching length of the next byte;
and determining the matching information of the current byte in the comparison process meeting the preset ending condition as the optimal matching information.
11. The data encoding method of claim 10, wherein before determining the matching information of the current byte in the comparison process satisfying the preset end condition as the optimal matching information, further comprising:
judging whether the address offset of the current byte in the next comparison process is larger than the number of the target channels;
if the address offset is larger than the number of the target channels, judging that the current byte in the current comparison process meets the preset ending condition;
if the address offset is smaller than the number of the target channels, judging that the current byte in the current comparison process does not meet the preset ending condition.
12. The data encoding method of claim 10, wherein determining the address of the current byte in the next comparison process based on the first matching length of the current byte and the second matching length of the next byte in the current comparison process comprises:
and when the first matching length is smaller than the second matching length, outputting the current byte in the current comparison process, and taking the address of the current byte in the current comparison process plus one as the address of the current byte in the next comparison process.
13. The data encoding method of claim 10, wherein determining the address of the current byte in the next comparison process based on the first matching length of the current byte and the second matching length of the next byte in the current comparison process comprises:
and when the first matching length is not smaller than the second matching length, outputting the first matching length, and taking the address of the current byte in the current comparison process plus the first matching length as the address of the current byte in the next comparison process.
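A hedged C sketch of the selection step of claims 12 and 13: the match length of the current byte is compared with that of the next byte, either a literal or the current match is emitted, and the address is advanced accordingly. The emit_literal() and emit_match() callbacks are placeholders for writing into the data result set, and the fallback advance of one for a zero-length match is an added guard rather than part of the claims.

    #include <stddef.h>

    typedef struct { size_t distance, length; } lz_match;   /* as sketched earlier */

    /* Returns the address of the current byte for the next comparison process. */
    size_t select_and_advance(const lz_match *m, size_t addr, size_t count,
                              void (*emit_literal)(size_t addr),
                              void (*emit_match)(const lz_match *match))
    {
        size_t first  = m[addr].length;
        size_t second = (addr + 1 < count) ? m[addr + 1].length : 0;

        if (first < second) {          /* the next byte starts a better match */
            emit_literal(addr);        /* so output the current byte itself   */
            return addr + 1;           /* and step forward by one address     */
        }
        emit_match(&m[addr]);          /* otherwise keep the current match    */
        return addr + (first ? first : 1);   /* skip the matched bytes        */
    }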
14. The data encoding method of claim 10, further comprising:
if the address of the current byte is equal to the maximum channel address, caching matching information of the current byte;
when the next byte comparison process is entered, constructing a virtual 0 address, and storing the matching information of the byte of the maximum channel address in the virtual 0 address;
and taking the matching information of the byte of the maximum channel address stored at the virtual 0 address as the matching information of the current byte of the current comparison process, and entering the step of determining the address of the current byte in the next comparison process according to the first matching length of the current byte and the second matching length of the next byte in the current comparison process.
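The sketch below shows one way the carry-over of claim 14 could be modelled: the match information of the byte at the maximum channel address is cached when a group finishes, and re-exposed at a virtual address 0 when the next group starts, so the comparison can continue across the group boundary. The group_bridge type and the function names are assumptions made for this sketch.

    #include <stddef.h>

    typedef struct { size_t distance, length; } lz_match;   /* as sketched earlier */

    typedef struct {
        lz_match carry;        /* match info cached from the previous group */
        int      has_carry;
    } group_bridge;

    /* Called when the current group has been processed: cache the match
     * information of the byte at the maximum channel address. */
    void finish_group(group_bridge *b, const lz_match *m, int target_channels)
    {
        b->carry     = m[target_channels - 1];
        b->has_carry = 1;
    }

    /* Called when the next group starts: slot 0 acts as the virtual 0 address
     * holding the carried-over match information; fresh matches start at slot 1. */
    void start_group(group_bridge *b, lz_match *slots)
    {
        if (b->has_carry) {
            slots[0]     = b->carry;
            b->has_carry = 0;
        }
    }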
15. The data encoding method of claim 10, wherein outputting the data result set corresponding to the optimal matching information to the encoding module through the channels of the target channel number comprises:
writing the data result set corresponding to the optimal matching information into a ping-pong register set with a depth of m, wherein m is a positive integer multiple of N;
and triggering the encoding module to read the data result set through the channels of the target channel number when N channel marks of the ping-pong register set are valid.
16. The data encoding method of claim 15, wherein the number of ping-pong register sets is plural, and writing the data result set corresponding to the optimal matching information into the ping-pong register set with the depth of m comprises:
and when the encoding module reads the data result set stored in one of the ping-pong register sets through the channels of the target channel number, writing the data result set corresponding to the optimal matching information into another ping-pong register set that is not being read.
17. A data encoding device, applied to a hardware accelerator card, the hardware accelerator card further comprising a computing module and an encoding module, the data encoding device comprising:
A memory for storing a computer program;
a processor for implementing the steps of the data encoding method according to any one of claims 1-16 when executing the computer program.
18. A hardware accelerator card, comprising the data encoding device of claim 17, further comprising a computing module and an encoding module;
The computing module is used for determining matching information according to the data to be encoded and the encoded data, and outputting the matching information and the data to be encoded to the encoding module through the channels of the target channel number;
the encoding module is used for encoding the data to be encoded according to the matching information;
and the matching information is the matching distance and the matching length of a preset character string in the encoded data when the preset character string in the data to be encoded is the same as the preset character string in the encoded data.
19. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the data encoding method of any of claims 1-16.
20. A non-volatile storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data encoding method according to any of claims 1-16.
CN202410679626.3A 2024-05-29 2024-05-29 Data coding method, device, hardware acceleration card, program product and medium Active CN118264256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410679626.3A CN118264256B (en) 2024-05-29 2024-05-29 Data coding method, device, hardware acceleration card, program product and medium

Publications (2)

Publication Number Publication Date
CN118264256A true CN118264256A (en) 2024-06-28
CN118264256B CN118264256B (en) 2024-09-10

Family

ID=91613238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410679626.3A Active CN118264256B (en) 2024-05-29 2024-05-29 Data coding method, device, hardware acceleration card, program product and medium

Country Status (1)

Country Link
CN (1) CN118264256B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020069275A1 (en) * 2000-12-06 2002-06-06 Tindal Glen D. Global GUI interface for network OS
CN104202054A (en) * 2014-09-16 2014-12-10 东南大学 Hardware LZMA (Lempel-Ziv-Markov chain-Algorithm) compression system and method
CN108832935A (en) * 2018-05-31 2018-11-16 郑州云海信息技术有限公司 A kind of RLE algorithm implementation method, system, equipment and computer storage medium
CN109558165A (en) * 2018-11-29 2019-04-02 广州市百果园信息技术有限公司 A kind of method for optimizing configuration, device, equipment and storage medium
CN110085241A (en) * 2019-04-28 2019-08-02 北京地平线机器人技术研发有限公司 Data-encoding scheme, device, computer storage medium and data encoding apparatus
US20200167088A1 (en) * 2018-11-26 2020-05-28 Micron Technology, Inc. Configuring command/address channel for memory
CN112346845A (en) * 2021-01-08 2021-02-09 腾讯科技(深圳)有限公司 Method, device and equipment for scheduling coding tasks and storage medium
CN116828184A (en) * 2023-08-28 2023-09-29 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium
CN117472471A (en) * 2023-11-09 2024-01-30 北京蔚领时代科技有限公司 Application program configuration method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN118264256B (en) 2024-09-10

Similar Documents

Publication Publication Date Title
US11705924B2 (en) Low-latency encoding using a bypass sub-stream and an entropy encoded sub-stream
US10680643B2 (en) Compression scheme with control of search agent activity
CN102694554B (en) Data compression device, its operating method and the data processing equipment including the equipment
US11070230B2 (en) Run-length base-delta encoding for high-speed compression
CN105183557B (en) A kind of hardware based configurable data compression system
US8903781B2 (en) Real-time selection of compression operations
CN113468090B (en) PCIe communication method and device, electronic equipment and readable storage medium
JP2008225558A (en) Data relay integrated circuit, data relay device, and data relay method
CN118264256B (en) Data coding method, device, hardware acceleration card, program product and medium
WO2024066668A1 (en) Fast memory clear of system memory
JPH07210324A (en) Storage device
WO2021237513A1 (en) Data compression storage system and method, processor, and computer storage medium
WO2024001863A9 (en) Data processing method and related device
CN116132546A (en) Method, device, equipment and medium for data transmission
CN115811317A (en) Stream processing method and system based on self-adaptive non-decompression direct calculation
JP6377197B1 (en) Thread number variation communication apparatus and thread number variation communication program
CN119376952B (en) A heterogeneous acceleration system, method, computing device and storage medium
CN117472840B (en) Multi-core system and data processing method for multi-core system
US20250306762A1 (en) System and Method for Hardware-Accelerated Generation of Full Binary Tree Codebooks Using Field-Programmable Gate Array
US20250291485A1 (en) System and Method for Generating Full Binary Tree Codebooks with Minimal Computational Resources
US20250063107A1 (en) Method, device and computer program product for transmitting data block
US20250271992A1 (en) System and Method for Determining Compression Performance of Codebooks Without Generation
KR101818440B1 (en) Data compression device, operation method using the same, and data processing apparatus having the same
CN120561050A (en) Data transmission system, method, electronic device and storage medium
CN119576408A (en) Command information processing method and device, storage medium, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant