
CN120407017A - Instruction processing method, device, equipment and computer-readable storage medium - Google Patents

Instruction processing method, device, equipment and computer-readable storage medium

Info

Publication number
CN120407017A
Authority
CN
China
Prior art keywords
instruction
compression
memory
module
instructions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410133420.0A
Other languages
Chinese (zh)
Inventor
李易
张旭
张康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202410133420.0A priority Critical patent/CN120407017A/en
Priority to PCT/CN2025/073216 priority patent/WO2025162018A1/en
Publication of CN120407017A publication Critical patent/CN120407017A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Executing Machine-Instructions (AREA)

Abstract


The present application discloses an instruction processing method, apparatus, device and computer-readable storage medium, belonging to the field of computer technology. The method includes: obtaining a first instruction to be processed; determining a compression level of a memory in a memory structure according to the memory structure and a demand rate for reading the first instruction, the memory structure being used to store the compressed first instruction; and compressing the first instruction using the compression level. Compressing the first instruction reduces the storage space it occupies, so a memory structure of the same capacity can store more compressed first instructions than uncompressed ones, which increases the number of instructions the memory structure can hold and reduces its storage overhead. Because the first instruction is compressed based on the compression level of the memory, and the compression level is determined based on the demand rate, restoring the compressed first instruction satisfies the demand rate and does not affect the normal operation of the memory.

Description

Instruction processing method, device, equipment and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a computer readable storage medium for processing instructions.
Background
In the field of computer technology, a processor includes an instruction storage module (instruction memory, IMEM), a decoding module (DEC), and a parallel computing module (arithmetic and logic units, ALUs). The instruction storage module stores the instructions obtained by compiling an application program, the decoding module parses a plurality of instructions in parallel, and the parallel computing module obtains a plurality of result data through parallel computing according to the information parsed from the plurality of instructions, so that the application program runs according to the plurality of result data. As the complexity of an application increases, compiling it produces more instructions, and the instruction storage module must store more instructions.
Disclosure of Invention
The present application provides an instruction processing method, apparatus, device, and computer-readable storage medium, which are used to store more instructions. The technical solution is as follows:
according to a first aspect, an instruction processing method is provided, and the method comprises the steps of obtaining a first instruction to be processed, determining a compression level of a memory in a memory structure according to the memory structure and a demand rate for reading the first instruction, wherein the memory structure is used for storing the compressed first instruction, and compressing the first instruction by adopting the compression level.
By compressing the first instructions, the size of the storage space occupied by the first instructions is reduced, and the number of compressed first instructions which can be stored by the memory structure with the same capacity is more than the number of first instructions before compression, so that the instruction storage number of the memory structure is increased, and the storage cost of the memory structure is reduced. Since the compression of the first instruction is performed based on the compression hierarchy of the memory, and the compression hierarchy is determined based on the demand rate, the process of restoring the compressed first instruction satisfies the demand rate of the memory structure, and the process of restoring the compressed first instruction does not affect the normal operation of the memory in the memory structure.
In one possible implementation, before determining the compression level of the memory in the memory structure according to the memory structure and the demand rate for reading the first instruction, the method further includes: acquiring at least one of the path attributes of the read paths of the memories included in the memory structure or the read efficiency of the memory structure, where the read path of a memory is the path connecting the memory and a reading module, and the reading module is used to read the instructions stored in the memory and process the read instructions; and determining the demand rate for reading the first instruction according to at least one of the path attributes or the read efficiency. A demand rate determined in this way is more consistent with the actual running condition of the memory structure, so its accuracy is high.
In one possible implementation, determining the compression level of the memory in the memory structure according to the memory structure and the demand rate for reading the first instruction comprises determining the reading rate of each memory included in the memory structure, determining the restore rate of each memory according to the demand rate for reading the first instruction and the reading rate of each memory, wherein the restore rate of each memory is the rate of decompressing instructions stored in the memory on the reading path of the memory, and determining the compression level of each memory according to the restore rate of each memory, wherein the compression rate corresponding to the compression level of the memory is matched with the restore rate corresponding to the memory.
Reading the first instruction from the memory structure involves both reading the first instruction and restoring it. Therefore, according to the demand rate for reading the first instruction from the memory structure and the read rate of each memory, a restore rate that can be used to restore instructions while ensuring the normal operation of each memory can be determined, and a compression level matching that restore rate is selected. Because the memory structure as a whole and each individual memory are considered together, the determined compression level is accurate; and because the compression level matches the restore rate of each memory, compression based on the compression level does not affect the normal operation of each memory.
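As a concrete illustration of this selection logic, the following minimal sketch (the level table, the `demand_rate` and `read_rate` parameters, and the beat counts are illustrative assumptions, not values from the application) picks, for each memory, the strongest compression level whose restore rate still fits the cycle budget left after reading:

```python
# Hypothetical sketch: choose, per memory, the most aggressive compression level
# whose restore cost still fits within the demand rate (all values in clock beats).
LEVELS = [                 # assumed table: (compression level, restore cycles needed)
    ("two-layer", 3),      # stronger compression, slower restore
    ("one-layer", 1),      # lighter compression, faster restore
    ("none", 0),
]

def pick_compression_level(demand_rate: int, read_rate: int) -> str:
    restore_budget = demand_rate - read_rate        # cycles left for decompression
    for level, restore_cycles in LEVELS:            # ordered strongest first
        if restore_cycles <= restore_budget:
            return level
    return "none"

# A memory read in 2 beats under a 5-beat demand rate can afford two-layer
# compression; one read in 4 beats can only afford one-layer compression.
print(pick_compression_level(5, 2))   # -> two-layer
print(pick_compression_level(5, 4))   # -> one-layer
```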
In one possible implementation, the memory structure includes a memory module and a buffer module, the compression level of the memory module is two-layer compression, the compression level of the buffer module is one-layer compression, and the restore rate of one-layer compression is higher than that of two-layer compression. Compressing the first instruction using the compression level includes: compressing the first instruction in a one-layer compression manner to obtain a second instruction, and compressing the second instruction in a two-layer compression manner to obtain a third instruction, where the compression rate of the two-layer compression manner is higher than that of the one-layer compression manner. When the memory structure includes multiple memories, such as a memory module and a buffer module, the first instruction can be compressed in multiple stages, and the resulting third instruction has a smaller bit width than the first instruction and thus a smaller storage overhead. Different memories, such as the memory module and the buffer module, can adopt different compression manners, so the compression process is highly flexible.
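A minimal sketch of how the two stages compose (the function parameters are placeholders for the one-layer and two-layer compression steps detailed in the following implementations, and the dictionary keys are illustrative assumptions):

```python
# Hypothetical sketch: the buffer (cache) module keeps the one-layer result,
# while the memory module keeps the further-compressed two-layer result.
def compress_for_memory_structure(first_instruction, layer_one_compress, layer_two_compress):
    second = layer_one_compress(first_instruction)   # compression level of the buffer module
    third = layer_two_compress(second)               # compression level of the memory module
    return {"buffer_module": second, "memory_module": third}
```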
In one possible implementation manner, the ratio between the bit width of the first instruction and the bit width of the second instruction is equal to the compression ratio, and the method further comprises the steps of acquiring service characteristics of the first instruction before the first instruction is compressed by adopting a one-layer compression mode to obtain the second instruction, wherein the service characteristics indicate the operation quality requirement of the service operated based on the first instruction, and determining the compression ratio according to the service characteristics of the first instruction. And determining the compression ratio according to the service characteristics of the first instruction, so that the operation quality of the service operated based on the first instruction is ensured.
In one possible implementation, compressing the first instruction in a one-layer compression manner to obtain a second instruction includes: obtaining an instruction characteristic of the first instruction; and, when it is determined from the instruction characteristic that fixed-length compression is supported, compressing the first instruction by fixed-length compression to obtain the second instruction, or, when it is determined from the instruction characteristic that fixed-length compression is not supported, splitting the first instruction into a plurality of second instructions, where the bit width of each of the plurality of second instructions is the same. Whether the first instruction supports fixed-length compression is determined from its instruction characteristic: a first instruction that supports fixed-length compression is compressed to a fixed length, and a first instruction that does not support it is split. The first instruction is thus processed flexibly according to its characteristic, and in either case processing yields second instructions of uniform bit width, so the method has wide applicability.
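A minimal sketch of this branch (the 32-bit target width and the `supports_fixed_length` test are assumptions; `fixed_length_compress` is a placeholder for the effective-field compression sketched further below):

```python
# Hypothetical sketch: fixed-length compress when the instruction characteristics
# allow it, otherwise split into second instructions of one uniform bit width.
TARGET_BITS = 32                       # assumed uniform width of a second instruction

def fixed_length_compress(instr_bits: str) -> str:
    return instr_bits[:TARGET_BITS]    # placeholder; see the effective-field sketch below

def layer_one_compress(instr_bits: str, supports_fixed_length: bool) -> list[str]:
    if supports_fixed_length:
        return [fixed_length_compress(instr_bits)]
    # pad to a multiple of TARGET_BITS, then cut into equal-width pieces
    padded_len = -(-len(instr_bits) // TARGET_BITS) * TARGET_BITS
    padded = instr_bits.ljust(padded_len, "0")
    return [padded[i:i + TARGET_BITS] for i in range(0, padded_len, TARGET_BITS)]
```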
In one possible implementation, the instruction characteristic includes at least one of an instruction type, a frequency of use, or a field utilization of the first instruction. Multiple kinds of instruction characteristics are available, so the method is highly flexible.
In one possible implementation, compressing the first instruction by fixed-length compression to obtain a second instruction includes determining an effective field of the first instruction to obtain a second instruction that includes the effective field, where the effective field is a field used for executing the first instruction. Determining the second instruction based on the effective field ensures that the fields carried by the second instruction are those necessary for the instruction to run, which improves the field utilization of the second instruction and the reliability of its execution.
In one possible implementation, the second instruction includes a compression flag bit, and the compression flag bit indicates whether the second instruction was obtained by fixed-length compression or by splitting. The way the second instruction was obtained can be determined explicitly from the compression flag bit; the determination is simple and efficient, which facilitates the subsequent restoration of the instruction and improves restoration efficiency.
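The effective-field extraction and the compression flag bit can be sketched together as follows (the field layout, the one-bit flag convention, and the `mov` example are illustrative assumptions, not an encoding defined by the application):

```python
# Hypothetical sketch: keep only the effective fields of the first instruction and
# prepend a compression flag bit ("1" = fixed-length compressed, "0" = split piece).
EFFECTIVE_FIELDS = {                       # assumed (start, end) bit ranges per type
    "mov": [("opcode", 0, 8), ("dst", 8, 16), ("src", 16, 24)],
}

def fixed_length_compress(instr_bits: str, instr_type: str) -> str:
    kept = "".join(instr_bits[lo:hi] for _name, lo, hi in EFFECTIVE_FIELDS[instr_type])
    return "1" + kept                      # flag bit marks fixed-length compression

def split(instr_bits: str, width: int) -> list[str]:
    return ["0" + instr_bits[i:i + width]  # flag bit marks a split piece
            for i in range(0, len(instr_bits), width)]
```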
In one possible implementation, compressing the second instruction in a two-layer compression manner to obtain a third instruction includes: obtaining the occurrence frequency of each field included in the second instruction, and obtaining the third instruction according to the occurrence frequency of each field, where the third instruction includes the code of each field, the code of each field is determined based on its occurrence frequency, and the length of the code of each field is inversely proportional to its occurrence frequency. Because code length is inversely proportional to occurrence frequency, fields that occur more often receive shorter codes, which effectively reduces the length of the third instruction.
In one possible implementation, before the third instruction is acquired according to the occurrence frequency of each field, the method further comprises the steps of constructing a coding dictionary corresponding to the second instruction according to the occurrence frequency of each field, wherein the coding dictionary comprises each field and the occurrence frequency of each field, and determining the codes of each field according to the coding dictionary. The occurrence frequency of each field is counted through the coding dictionary, the second instruction can be compressed through searching the coding dictionary in the follow-up process, and the compression process is simple and high in efficiency.
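A minimal sketch of this frequency-based layer (the unary-style prefix code is a stand-in for whatever code the application actually uses, e.g. a Huffman-style code; `Counter` here plays the role of the coding dictionary):

```python
from collections import Counter

# Hypothetical sketch of two-layer compression: count field occurrences, give the
# most frequent fields the shortest codes, and keep the reverse map as the coded index.
def layer_two_compress(second_instruction_fields: list[str]):
    coding_dictionary = Counter(second_instruction_fields)       # field -> occurrence count
    ranked = [field for field, _count in coding_dictionary.most_common()]
    codebook = {field: "1" * rank + "0"                          # rank 0 -> "0", rank 1 -> "10", ...
                for rank, field in enumerate(ranked)}            # shorter code for higher frequency
    third_instruction = "".join(codebook[f] for f in second_instruction_fields)
    coded_index = {code: field for field, code in codebook.items()}
    return third_instruction, coded_index

third, index = layer_two_compress(["r1", "add", "r1", "r2", "r1"])
# "r1" occurs most often, so it gets the shortest (1-bit) code
```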
In one possible implementation, the third instruction includes a coded index, where the coded index is used to find a field corresponding to a code included in the third instruction in decompressing the third instruction. Because the third instruction carries the coding index, in the subsequent process of decompressing the third instruction, the field corresponding to the code included in the third instruction can be determined by analyzing the coding index, so that the decompression of the third instruction is realized, the decompression process is simple, and the decompression efficiency is high.
In one possible implementation, after the first instruction is compressed by adopting the compression hierarchy, the method further comprises the steps of acquiring the instruction length of the compressed first instruction, determining a storage unit for storing the compressed first instruction in a memory structure according to the instruction length, and storing the compressed first instruction in the determined storage unit. According to the instruction length of the compressed first instruction, a storage unit for storing is determined, and then the compressed first instruction is stored in the storage unit, so that the compressed first instruction is accurately stored.
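A minimal sketch of this placement step (the row layout and the first-fit policy are assumptions made only to illustrate choosing a storage unit by instruction length):

```python
# Hypothetical sketch: put the compressed first instruction into the first row
# (storage unit) of the memory structure with enough free bits for its length.
def store_compressed(rows: list[dict], compressed_bits: str) -> int:
    instruction_length = len(compressed_bits)
    for row_id, row in enumerate(rows):
        if row["free_bits"] >= instruction_length:
            row["instructions"].append(compressed_bits)
            row["free_bits"] -= instruction_length
            return row_id                       # index of the chosen storage unit
    raise MemoryError("no storage unit has room for the compressed instruction")

rows = [{"free_bits": 128, "instructions": []} for _ in range(4)]
store_compressed(rows, "1" * 25)                # lands in row 0
```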
In one possible implementation, the first instructions include instructions to forward the application. By compressing the first instruction, the reading efficiency of the first instruction is improved, and the forwarding efficiency based on the execution of the first instruction is higher.
In one possible implementation, acquiring the first instruction to be processed includes acquiring at least one instruction included in an instruction bundle, where any one of the at least one instruction is the first instruction to be processed, and the at least one instruction is executed in parallel. The first instruction to be processed may thus be an instruction included in an instruction bundle; the number of instructions to be processed is not limited and may be one or several first instructions, so the method is widely applicable. At least one instruction to be executed in parallel can be compressed by the instruction processing method, reducing the storage space occupied by the instruction bundle while preserving parallel processing efficiency. Moreover, an instruction bundle comprising at least one instruction is, for example, a very long instruction word, so the method also applies to very-long-instruction-word scenarios.
In a second aspect, another instruction processing method is provided, the method comprising obtaining a compressed first instruction, compression of the first instruction being implemented based on a compression hierarchy, the compression hierarchy being determined based on a memory structure for storing the compressed first instruction and a rate of demand for reading the first instruction, and restoring the compressed first instruction.
The storage space occupied by the compressed first instruction is smaller than that occupied by the first instruction, and the memory structure with the same memory size can store more compressed first instructions, so that the instruction storage quantity of the memory structure is improved. And for the compressed first instruction, the restoration can be performed, so that the normal realization of the subsequent related operation based on the first instruction is ensured.
In one possible implementation, the memory structure includes a memory module and a buffer module, the compression level of the memory module is two-layer compression, the compression level of the buffer module is one-layer compression, and the restore rate of one-layer compression is higher than that of two-layer compression. The compressed first instruction includes a third instruction, and restoring the compressed first instruction includes: restoring the third instruction in a restore manner corresponding to the two-layer compression manner to obtain a second instruction, and restoring the second instruction in a restore manner corresponding to the one-layer compression manner to obtain the first instruction, where the compression rate of the two-layer compression manner is higher than that of the one-layer compression manner. Even if the memory structure includes a plurality of memories, for example a memory module and a buffer module, the third instruction can be restored in multiple stages, and each stage uses the restore manner corresponding to the compression level of the corresponding memory, so no stage of restoration affects the normal operation of that memory.
In one possible implementation, the third instruction includes a coded index, and restoring the third instruction in the restore manner corresponding to the two-layer compression manner to obtain the second instruction includes: parsing the coded index included in the third instruction, determining the field corresponding to each code included in the third instruction according to the coded index, and determining the second instruction according to the fields corresponding to the codes, where the length of the code of each field is inversely proportional to the occurrence frequency of that field. The correspondence between codes and fields can be determined through the coded index to restore the third instruction; the restore process is simple and efficient.
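A minimal sketch of this restore step, matching the unary-style code assumed in the compression sketch above (because the assumed code is prefix-free, scanning bit by bit is unambiguous):

```python
# Hypothetical sketch: translate each prefix-free code in the third instruction
# back into its field using the coded index carried with the instruction.
def restore_layer_two(third_instruction: str, coded_index: dict[str, str]) -> list[str]:
    fields, code = [], ""
    for bit in third_instruction:
        code += bit
        if code in coded_index:       # prefix-free codes: the first match is the right one
            fields.append(coded_index[code])
            code = ""
    return fields                     # the fields of the second instruction
```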
In one possible implementation, the restore manner corresponding to the one-layer compression manner includes decompression corresponding to fixed-length compression, or splicing corresponding to splitting. Restoring the second instruction in the restore manner corresponding to the one-layer compression manner to obtain the first instruction includes: when the second instruction was obtained by fixed-length compression, decompressing the second instruction in the manner corresponding to fixed-length compression to obtain the first instruction, or, when the second instruction was obtained by splitting, splicing the plurality of second instructions obtained by splitting the first instruction to obtain the first instruction, where the bit width of each of the plurality of second instructions is the same. Whether fixed-length compression or splitting was used, a corresponding restore manner exists, so more types of second instructions can be restored and the method is widely applicable.
In one possible implementation, the second instruction includes a compression flag bit indicating whether the second instruction was obtained by fixed-length compression or by splitting. Before restoring the second instruction in the restore manner corresponding to the one-layer compression manner to obtain the first instruction, the method further includes: parsing the compression flag bit included in the second instruction, and determining from the compression flag bit whether the second instruction was obtained by fixed-length compression or by splitting. The compression manner used for the second instruction can be determined by parsing the compression flag bit; the determination is simple and efficient, which facilitates the subsequent restoration of the instruction and improves restoration efficiency.
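The flag-bit check and the two restore paths can be sketched together as follows (the one-bit flag convention and the `expand_effective_fields` placeholder follow the illustrative encoding assumed earlier, not a format defined by the application):

```python
# Hypothetical sketch: the leading compression flag bit selects the restore path.
def restore_layer_one(second_instructions: list[str]) -> str:
    if second_instructions[0][0] == "1":              # obtained by fixed-length compression
        return expand_effective_fields(second_instructions[0][1:])
    # obtained by splitting: drop each flag bit and splice the pieces back together
    return "".join(piece[1:] for piece in second_instructions)

def expand_effective_fields(kept_bits: str) -> str:
    # placeholder: re-insert whatever non-effective fields were dropped at compression
    return kept_bits
```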
In one possible implementation, acquiring the compressed first instruction includes extracting at least one instruction stored in a row of memory cells of the memory structure, and separating each instruction according to an instruction length of each instruction in the at least one instruction to obtain at least one instruction, where the at least one instruction includes the compressed first instruction. The method can extract the instructions stored in one row of storage units at a time, the number of the extracted instructions is large, and the instruction extraction efficiency is high.
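A minimal sketch of this extraction (assuming the per-instruction lengths of a row are known, for example recorded when the row was written):

```python
# Hypothetical sketch: fetch one row of storage units and cut it into the
# individual compressed instructions using their instruction lengths.
def extract_row(row_bits: str, instruction_lengths: list[int]) -> list[str]:
    instructions, offset = [], 0
    for length in instruction_lengths:
        instructions.append(row_bits[offset:offset + length])
        offset += length
    return instructions        # includes the compressed first instruction
```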
In one possible implementation, after the compressed first instruction is restored, the method further includes: determining the instruction bundle corresponding to the first instruction, where the instruction bundle includes at least one instruction, the compression results of the at least one instruction are stored in the same row of storage units of the memory structure, any one of those compression results is the compressed first instruction, and the at least one instruction is executed in parallel; and sending the at least one instruction included in the instruction bundle to a decoding module, where the decoding module is used to parse the at least one instruction. After the first instruction is restored, the at least one instruction is sent to the decoding module in units of instruction bundles, so that the decoding module parses the at least one instruction of the bundle in parallel. Controlling the at least one instruction to be parsed in parallel ensures smooth parallel execution, and parallel execution improves the processing efficiency of the at least one instruction. The method can therefore be applied to instruction-bundle scenarios and has a wide application range.
In one possible implementation, the first instructions include instructions to forward the application. When the first instruction is used for forwarding the application program, by compressing the first instruction, the reading efficiency of the first instruction is improved, and the forwarding efficiency based on the execution of the first instruction is higher.
In a third aspect, an instruction processing apparatus is provided, the apparatus including an acquisition module configured to acquire a first instruction to be processed, a determination module configured to determine a compression hierarchy of a memory in the memory structure according to the memory structure and a demand rate for reading the first instruction, the memory structure configured to store the compressed first instruction, and a compression module configured to compress the first instruction using the compression hierarchy.
In one possible implementation manner, the acquiring module is further configured to acquire at least one of a path attribute of a read path of each memory included in the memory structure or a read efficiency of the memory structure, where the read path of the memory is a path connected between the memory and the reading module, and the reading module is configured to read an instruction stored in the memory and process the read instruction, and the determining module is further configured to determine a demand rate of reading the first instruction according to at least one of the path attribute or the read efficiency.
In one possible implementation, the determining module is configured to determine a read rate of each memory included in the memory structure, determine a restore rate of each memory according to a demand rate for reading the first instruction and the read rate of each memory, where the restore rate of each memory is a rate for decompressing instructions stored in the memory on a read path of the memory, and determine a compression level of each memory according to the restore rate of each memory, where a compression rate corresponding to the compression level of the memory matches a restore rate corresponding to the memory.
In one possible implementation, the memory structure includes a memory module and a buffer module, the compression level of the memory module is two-layer compression, the compression level of the buffer module is one-layer compression, and the restore rate of one-layer compression is higher than that of two-layer compression; the compression module is configured to compress the first instruction in a one-layer compression manner to obtain a second instruction, and compress the second instruction in a two-layer compression manner to obtain a third instruction, where the compression rate of the two-layer compression manner is higher than that of the one-layer compression manner.
In one possible implementation, the ratio between the bit width of the first instruction and the bit width of the second instruction is equal to the compression ratio, the obtaining module is further configured to obtain a service characteristic of the first instruction, where the service characteristic indicates an operation quality requirement of a service operated based on the first instruction, and the determining module is further configured to determine the compression ratio according to the service characteristic of the first instruction.
In one possible implementation manner, the compression module is used for acquiring the instruction characteristics of the first instruction, compressing the first instruction in a fixed-length compression mode to obtain the second instruction under the condition that fixed-length compression is determined to be supported according to the instruction characteristics, or splitting the first instruction into a plurality of second instructions under the condition that fixed-length compression is determined to be not supported according to the instruction characteristics, wherein the bit width of each second instruction in the plurality of second instructions is the same.
In one possible implementation, the instruction characteristic includes at least one of an instruction type, a frequency of use, or a field utilization of the first instruction.
In one possible implementation, the compression module is configured to determine a valid field of the first instruction, and obtain a second instruction that includes the valid field, where the valid field is a field used to execute the first instruction.
In one possible implementation, the second instruction includes a compression flag bit, the compression flag bit indicating that the second instruction is obtained by way of fixed-length compression or splitting.
In one possible implementation, the compression module is configured to obtain the frequency of occurrence of each field included in the second instruction, and obtain a third instruction according to the frequency of occurrence of each field, where the third instruction includes codes of each field, where the codes of each field are determined based on the frequency of occurrence of each field, and the length of the codes of each field is inversely proportional to the frequency of occurrence of each field.
In one possible implementation manner, the determining module is further configured to construct a coding dictionary corresponding to the second instruction according to the occurrence frequency of each field, where the coding dictionary includes each field and the occurrence frequency of each field, and determine the coding of each field according to the coding dictionary.
In one possible implementation, the third instruction includes a coded index, where the coded index is used to find a field corresponding to a code included in the third instruction in decompressing the third instruction.
In one possible implementation, the apparatus further includes a storage module configured to acquire the instruction length of the compressed first instruction, determine, according to the instruction length, a storage unit in the memory structure for storing the compressed first instruction, and store the compressed first instruction in the determined storage unit.
In one possible implementation, the first instructions include instructions to forward the application.
In one possible implementation manner, the acquiring module is configured to acquire at least one instruction included in the instruction bundle, any instruction in the at least one instruction is a first instruction to be processed, and the at least one instruction is an instruction executed in parallel.
In a fourth aspect, another instruction processing apparatus is provided, which includes an acquisition module configured to acquire a compressed first instruction, where compression of the first instruction is implemented based on a compression hierarchy, where the compression hierarchy is determined based on a memory structure and a demand rate for reading the first instruction, where the memory structure is configured to store the compressed first instruction, and a restoration module configured to restore the compressed first instruction.
In one possible implementation manner, the memory structure comprises a memory module and a buffer module, wherein the compression level of the memory module is two-layer compression, the compression level of the buffer module is one-layer compression, the reduction rate of the one-layer compression is higher than that of the two-layer compression, the compressed first instruction comprises a third instruction, the reduction module is used for reducing the third instruction by adopting a reduction mode corresponding to the compression mode of the two-layer compression to obtain a second instruction, the second instruction is reduced by adopting a reduction mode corresponding to the compression mode of the one-layer compression to obtain a first instruction, and the compression rate of the compression mode of the two-layer compression is higher than that of the compression mode of the one-layer compression.
In one possible implementation, the third instruction includes a coded index, and the restoration module is configured to parse the coded index included in the third instruction, determine the field corresponding to each code included in the third instruction according to the coded index, and determine the second instruction according to the fields corresponding to the codes, where the length of the code of each field is inversely proportional to the occurrence frequency of that field.
In one possible implementation, the restore manner corresponding to the one-layer compression manner includes decompression corresponding to fixed-length compression, or splicing corresponding to splitting; the restoration module is configured to, when the second instruction was obtained by fixed-length compression, decompress the second instruction in the manner corresponding to fixed-length compression to obtain the first instruction, or, when the second instruction was obtained by splitting, splice the plurality of second instructions obtained by splitting the first instruction to obtain the first instruction, where the bit width of each of the plurality of second instructions is the same.
In one possible implementation manner, the second instruction includes a compression flag bit, where the compression flag bit indicates that the second instruction is obtained by fixed-length compression or splitting, and the restoration module is further configured to parse the compression flag bit included in the second instruction, and determine a manner of obtaining the second instruction from fixed-length compression and splitting according to the compression flag bit.
In one possible implementation, the obtaining module is configured to extract at least one instruction stored in a row of storage units of the memory structure, and separate each instruction according to an instruction length of each instruction in the at least one instruction to obtain at least one instruction, where the at least one instruction includes a compressed first instruction.
In one possible implementation, the apparatus further includes a sending module configured to determine the instruction bundle corresponding to the first instruction, where the instruction bundle includes at least one instruction, the compression results of the at least one instruction are stored in the same row of storage units of the memory structure, any one of those compression results is the compressed first instruction, and the at least one instruction is executed in parallel; the sending module sends the at least one instruction included in the instruction bundle to a decoding module, and the decoding module is used to parse the at least one instruction.
In one possible implementation, the first instructions include instructions to forward the application.
In a fifth aspect, there is provided an instruction processing apparatus comprising a processor for loading and executing at least one instruction to cause the instruction processing apparatus to perform the method of the first aspect or any of the possible implementations of the first aspect or to perform the method of the second aspect or any of the possible implementations of the second aspect.
In one possible implementation, the apparatus includes a memory coupled to the processor, the memory storing at least one instruction.
In a sixth aspect, there is provided a computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the instruction processing method of the first aspect or any of the possible implementations of the first aspect or to implement the instruction processing method of the second aspect or any of the possible implementations of the second aspect.
In a seventh aspect, a computer program (product) is provided, the computer program (product) comprising computer programs/instructions for execution by a processor to cause a computer to implement the method of instruction processing in the first aspect or any one of the possible implementations of the first aspect or to implement the method of instruction processing in the second aspect or any one of the possible implementations of the second aspect.
In an eighth aspect, a communications apparatus is provided that includes a transceiver, a memory, and a processor. Wherein the transceiver, the memory and the processor are in communication with each other via an internal connection path, the memory is configured to store instructions, the processor is configured to execute the instructions stored by the memory to control the transceiver to receive signals and control the transceiver to transmit signals, and when the processor executes the instructions stored by the memory, the processor is configured to perform the method of the first aspect or any one of the possible implementation manners of the first aspect, or perform the method of the second aspect or any one of the possible implementation manners of the second aspect.
Optionally, the processor is one or more and the memory is one or more.
Alternatively, the memory may be integrated with the processor or the memory may be separate from the processor.
In a specific implementation process, the memory may be a non-transient (non-transitory) memory, for example, a Read Only Memory (ROM), which may be integrated on the same chip as the processor, or may be separately disposed on different chips.
In a ninth aspect, there is provided a chip comprising a processor for calling from a memory and executing program instructions or code stored in the memory, so that a communication device on which the chip is mounted performs the method of the above aspects.
In a tenth aspect, there is provided another chip comprising an input interface, an output interface, a processor and a memory, the input interface, the output interface, the processor and the memory being connected by an internal connection path, the processor being configured to execute code in the memory, the processor being configured to perform the method of the above aspects when the code is executed.
It should be appreciated that, the technical solutions and the corresponding possible implementations of the third aspect to the tenth aspect of the present application may refer to the technical effects of the first aspect and the corresponding possible implementations thereof or the second aspect and the corresponding possible implementations thereof, which are not described herein.
Drawings
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of another implementation environment provided by an embodiment of the present application;
FIG. 3 is a flowchart of an instruction processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a processor according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating a process of instruction processing according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a first instruction and a second instruction according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a two-stage compression process according to an embodiment of the present application;
FIG. 8 is a flow chart of instruction compression according to an embodiment of the present application;
FIG. 9 is a flowchart of an instruction processing method according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a process for reading instructions according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a decompression process according to an embodiment of the present application;
FIG. 12 is a schematic diagram showing the effect of fixed-length reduction according to an embodiment of the present application;
FIG. 13 is a flow chart of fixed-length reduction according to an embodiment of the present application;
FIG. 14 is a flow chart of another fixed-length reduction provided by an embodiment of the present application;
FIG. 15 is a graph showing the comparison of the effects of instruction store according to an embodiment of the present application;
FIG. 16 is a schematic diagram of an instruction processing apparatus according to an embodiment of the present application;
FIG. 17 is a schematic diagram of a further instruction processing apparatus according to an embodiment of the present application;
FIG. 18 is a schematic structural diagram of a network device according to an embodiment of the present application;
FIG. 19 is a schematic structural diagram of still another network device according to an embodiment of the present application.
Detailed Description
The terminology used in the description of the embodiments of the application herein is for the purpose of describing particular embodiments of the application only and is not intended to be limiting of the application. For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
In the field of computer technology, a processor is an operation and control core of a network device, and is used to run an application program. Illustratively, the processor includes an instruction storage module, a decode module, and a parallel computing module. The instruction storage module is used for storing the instruction compiled by the application program so as to achieve the programmable effect. The decoding module is used for analyzing the instructions stored in the instruction storage module to obtain analysis results, sending the analysis results to the parallel computing module, and obtaining a plurality of result data through parallel computing according to the analysis results of the plurality of instructions by the parallel computing module so as to run the application program according to the plurality of result data.
In the related art, a run-to-completion (RTC) architecture forwarding processor includes a plurality of processor cores and a plurality of instruction cache modules, where each processor core is connected to one instruction cache module and the plurality of instruction cache modules are connected to one instruction storage module. The hardware cost of an instruction cache module is smaller than that of an instruction storage module, so using several small-capacity instruction cache modules in place of several large-capacity instruction storage modules can reduce the hardware cost of the processor. However, in high-performance service scenarios where a larger number of instructions must be stored, the instruction cache module still needs to provide a larger capacity, its hardware overhead increases, and the storage overhead remains high.
An embodiment of the present application provides an instruction processing method. Referring to fig. 1, which shows a schematic diagram of an implementation environment of the instruction processing method provided by the embodiment of the present application, the implementation environment includes a compression module 01, a memory structure 02, and a restoration module 03. The compression module 01, the memory structure 02, and the restoration module 03 may establish the communication connection relationships shown in fig. 1 in a wired or wireless network manner. In one possible implementation, the compression module 01 is configured to compress a first instruction to be processed using the method provided by the embodiment of the present application, and store the compressed first instruction in the memory structure 02. The restoration module 03 may perform a restore operation on the compressed first instruction stored in the memory structure 02 using the method provided by the embodiment of the present application, to obtain the first instruction.
Illustratively, the memory structure 02 in fig. 1 may include one or more memories. Fig. 2 is a schematic diagram of another implementation environment provided in an embodiment of the present application. Referring to fig. 2, the processor includes a storage module, a cache module, and a decoding module. The storage module and the cache module may be used to store instructions (inst); in this case, the memories included in the memory structure 02 of fig. 1 are the storage module and the cache module in the left and right diagrams of fig. 2. In one possible scenario, the cache module may be referred to as an instruction cache module (instruction cache, ICACHE) and is used to cache instructions to be processed by the decoding module, so as to increase the rate at which the decoding module reads instructions and thus the instruction parsing rate. In the left diagram of fig. 2, instruction interaction among the storage module, the cache module, and the decoding module is performed with an instruction word as a whole, where an instruction word refers to a plurality of instructions located in the same row of storage units, and the same row of storage units may be storage units in the storage module or cache units in the cache module.
For the case where the memory structure 02 shown in fig. 1 includes a plurality of memories, the restoration module 03 may establish communication connections with the respective memories included in the memory structure 02, respectively. Referring to the right diagram of fig. 2, the storage module and the cache module in fig. 2 are respectively connected with the restoration module 03. The plurality of reduction modules 03 in the right diagram of fig. 2 refer to different operation units of the reduction module 03, and reduction operations performed by the different operation units are different. Alternatively, the number of the storage modules and the cache modules may be the same, that is, the storage modules and the cache modules are in one-to-one correspondence, or the number of the storage modules and the cache modules may be different, for example, a plurality of cache modules share one storage module, which is not limited in the embodiment of the present application.
Alternatively, the compression module 01, the memory structure 02 and the restoration module 03 in fig. 1 may be integrated on the same network device, where the network device may be any network device configured with a processor, and the configured processor may be any programmable processor with any structure, for example, an RTC architecture forwarding processor, or another type of processor. The network device may be a server, such as a central server, an edge server, or a local server in a local data center, for example. The server can be a physical server, and can also be a cloud server for providing cloud computing service in a cloud scene. In some embodiments, the network device may also be a terminal device such as a desktop, notebook, or smart phone, or a switch (switch), router, gateway (GW), or the like. The instruction processing operations performed by the compression module 01 may be implemented based on a compiler, for example, the compression module 01 is a compiler, and the instruction processing operations performed by the restoration module 03 may be implemented based on a hardware operation, for example, the restoration module 03 is a decoder.
An embodiment of the present application provides an instruction processing method, which may be applied to the implementation environments shown in fig. 1 or fig. 2, where the method is implemented by a compression module, and a flowchart of the method is shown in fig. 3, and includes S301-S303.
S301, acquiring a first instruction to be processed.
Illustratively, instructions are commands obtained by compiling an application program and used to instruct a network device, such as a personal terminal or a server, to operate. The first instruction to be processed may be an instruction compiled from any application program, for example instructions for forwarding applications including, but not limited to, game applications, video playing applications, or social application software. Optionally, the compression module may acquire code to be run and convert the acquired code into instructions to obtain the first instruction to be processed. The code to be run may be code entered manually, or may be obtained by accessing a code library based on a manually entered operation requirement. The compression module may also receive the first instruction to be processed sent from another device or module; the embodiment of the present application does not limit the manner of obtaining the first instruction.
The embodiment of the application also does not limit the number of first instructions to be processed; the number of first instructions to be processed may be one. For example, the compression module may determine, based on the code, that the network device is instructed to operate by executing a move (mov) instruction, and the mov instruction may be determined as the first instruction to be processed. Alternatively, the number of first instructions to be processed may be plural, and the plural first instructions may belong to the same instruction bundle or to different instruction bundles, for example a very long instruction word (VLIW). The instructions included in an instruction bundle may be grouped according to the functions corresponding to the instructions, according to the application programs corresponding to the instructions, or in other ways. Instructions are linked together through the instruction bundle, and the instructions included in one instruction bundle are executed in parallel, which improves the speed at which the instructions are executed. Executing the plurality of instructions in parallel may mean executing all of them in parallel at once; for example, one instruction bundle includes three instructions, and the three instructions are executed in parallel in one beat, so that the three instructions are executed synchronously. Executing the plurality of instructions in parallel may also mean executing them in batches, that is, in a manner that combines parallel and serial execution. For example, one instruction bundle includes eight instructions; since at most four instructions can be executed in one beat, the first four instructions are executed in parallel, and after they complete, the last four instructions are executed in parallel.
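A minimal sketch of this beat-by-beat issue of an instruction bundle (the four-instruction beat width and the print stand-in for parallel dispatch are illustrative assumptions):

```python
# Hypothetical sketch: issue a bundle's instructions in beats of at most four,
# matching the "eight instructions -> two beats of four" example above.
def issue_bundle(bundle: list[str], beat_width: int = 4) -> None:
    for start in range(0, len(bundle), beat_width):
        beat = bundle[start:start + beat_width]
        print("execute in parallel:", beat)   # stand-in for dispatch to the parallel ALUs

issue_bundle([f"inst{i}" for i in range(8)])  # two beats of four instructions each
```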
Illustratively, the compression module obtains at least one instruction included in the instruction bundle, the at least one instruction being executed in parallel, and any one of the at least one instruction being a first instruction to be processed. That is, each instruction included in the instruction bundle is processed as a first instruction by the instruction processing method provided by the embodiment of the present application. Since processing one first instruction is similar to processing a plurality of first instructions, the following takes any instruction included in the instruction bundle as the first instruction to be processed as an example to illustrate the processing of one first instruction; the processing of the other instructions included in the instruction bundle can refer to the similar description and is not repeated here.
S302, determining a compression level of a memory in a memory structure according to the memory structure and a demand rate for reading the first instruction, wherein the memory structure is used for storing the compressed first instruction.
In one possible scenario, the network device upon which the first instruction directs operation includes a memory structure for storing the first instruction to be processed to achieve a programmable effect. Taking an example that the network device is configured with an RTC architecture forwarding processor as shown in fig. 4, the RTC architecture forwarding processor includes a scheduler (input scheduler), a processor core (RTC core), a storage module, and the like, where the storage module, that is, a memory included in the memory structure. Optionally, a communication connection is established between the processor core and the storage module, and the established communication connection may be a direct connection or an indirect connection implemented based on the restoration module, that is, the processor core is connected to the restoration module, and the restoration module is connected to the storage module. Alternatively, the number of the restoration modules may be one, that is, one restoration module is respectively connected to the plurality of memory modules and the plurality of processor cores, and the number of the restoration modules may be the same as the number of the memory modules, that is, the restoration modules are in one-to-one correspondence with the processor cores and the memory modules as shown in fig. 4, one restoration module is used for connecting one processor core and one memory module, and the number of the restoration modules may be less than the number of the memory modules, that is, there is one restoration module connected to the plurality of memory modules, and there is also a case that one restoration module is connected to one memory module.
The compression module may compress the first instruction, during the process of storing the first instruction into the memory structure, according to the memory structure and the demand rate at which the first instruction is read. Optionally, the rates referred to in the present application may be understood as time delays or clock cycles; the demand rate for reading the first instruction refers to the clock cycles required to read the first instruction from the memory structure, and the unit of measure of a clock cycle is, for example, a beat or another unit of time. In one possible scenario, reading the first instruction from the memory structure includes a plurality of processes, such as reading the first instruction from a memory included in the memory structure and processing the read first instruction, for example by a restore operation.
The embodiment of the application does not limit the process of acquiring the demand rate of reading the first instruction by the compression module, and comprises the steps of, but is not limited to, acquiring at least one of path attributes of reading paths of various memories included in a memory structure or reading efficiency of the memory structure, wherein the reading paths of the memories are paths connected between the memories and the reading module, and the reading module is used for reading the instructions stored in the memories and processing the read instructions, and determining the demand rate of reading the first instruction according to at least one of the path attributes or the reading efficiency.
The read module of a memory refers to the module connected to the memory for reading the instructions stored in that memory. Taking the memory structure shown in the left diagram of fig. 2 as an example, when the memory is the storage module, the read module is the cache module, and when the memory is the cache module, the read module is the decoding module. Taking the memory structure shown in the right diagram of fig. 2 as an example, when the memory is the storage module, the read module is the restoration module 03 connected to the storage module, and when the memory is the cache module, the read module is the restoration module 03 connected to the cache module.
Alternatively, the compression module may determine a read path for connection between each memory and the read module, determine a path attribute of the read path, where the path attribute is a characteristic that affects a read latency of the read path, the path attribute includes, but is not limited to, a length or width of the read path, and the like. In one possible implementation, the read efficiency is associated with a network processor (network processor, NP) corresponding to a memory structure, and the compression module may obtain a processing performance of the NP corresponding to the memory structure, determine the read efficiency of the memory structure according to the processing performance, and determine a demand rate for reading the first instruction stored in the memory structure according to the read efficiency and a path attribute of a read path of each memory.
Illustratively, the demand rate of reading the first instruction stored in the memory structure refers to the overall rate required to completely read the first instruction from the memory structure. When the memory structure includes one memory, the demand rate of reading the first instruction covers the rate of reading the first instruction from that memory and the rate of restoring the read first instruction. When the memory structure includes a plurality of memories, such as the storage module and the cache module shown in fig. 2, the demand rate of reading the first instruction from the memory structure covers: (1) the rate of reading the first instruction from the storage module; (2) the rate of restoring the read first instruction; (3) the rate of sending the first instruction stored in the storage module to the cache module; (4) the rate of reading the first instruction stored in the cache module; and (5) the rate of restoring the read first instruction. Alternatively, the demand rate of reading the first instruction may be set based on experience. For example, the compression module provides an information input control, the operation and maintenance personnel set the demand rate of reading the first instruction based on experience and the implementation environment and input it through the information input control, and the compression module thereby obtains the demand rate of reading the first instruction. The implementation environment may refer to the hardware configuration of the memory structure, for example, the type of NP adopted or the model of the memory structure.
Regardless of the manner in which the compression module obtains the demand rate for reading the first instruction, the compression level of memory in the memory structure may be determined based on the memory structure and the demand rate for reading the first instruction. The compression module determines a read rate of each memory included in the memory structure, determines a restore rate of each memory according to a demand rate for reading the first instruction and the read rate of each memory, wherein the restore rate of each memory is a rate for decompressing instructions stored in the memory on a read path of the memory, and determines a compression level of each memory according to the restore rate of each memory, wherein the restore rate corresponding to the compression level of the memory is matched with the restore rate corresponding to the memory.
Optionally, the compression module determines a memory of the first instruction to be stored in the memory structure, taking the memory structure including a storage module and a plurality of cache modules as an example, where the plurality of cache modules share the storage module, in this case, the first instruction is stored by the storage module first and then stored by any one of the plurality of cache modules, so that the memory of the first instruction to be stored in the memory structure is a storage module and a cache module.
The compression module may then determine, according to the hardware structure of each memory, the read rate required for reading instructions from that memory. The demand rate of the first instruction is the overall rate at which reading the first instruction from the memories included in the memory structure is completed; that is, the demand rate covers both the rate of reading the first instruction from each memory and the rate of processing the read first instruction, where processing the first instruction includes restoring the first instruction and the like. Therefore, the sum of the restore rates and the read rates of the memories must not be greater than the demand rate of reading the first instruction. The sum of the read rates of the memories can be subtracted from the demand rate, and the obtained difference represents the maximum value of the sum of the restore rates of the memories. The restore rate of each memory is then determined according to the hardware structure of that memory, such as its read path and the read rate required by its read module for reading instructions; the restore rate of any memory indicates the rate at which the instructions stored in that memory are restored, and that restore rate does not affect the original forwarding performance of the memory.
Since the restore rate and the compression complexity are inversely related, the higher the compression complexity is, the lower the restore rate corresponding to the compression mode is. Therefore, the corresponding compression complexity can be determined according to the restore rate, and the compression level can be determined according to the compression complexity. Alternatively, the higher the compression level, the higher the compression complexity, the lower the restore rate, and the higher the compression rate, where the compression rate refers to the ratio of the instructions after compression to the instructions before compression.
Optionally, the compression module may further obtain the rate range corresponding to each compression level, and determine the compression level of each memory according to the restore rate of that memory and the rate ranges corresponding to the compression levels. Taking an example in which the compression levels include one-layer compression and two-layer compression, the rate range corresponding to one-layer compression is range A and the rate range corresponding to two-layer compression is range B. Because the restore rate of the storage module falls within range B, the compression module determines that the compression level of the storage module is two-layer compression; because the restore rate of the cache module falls within range A, the compression module determines that the compression level of the cache module is one-layer compression.
In one possible case, the compression module may also directly determine the compression level of each memory according to the read rate of that memory. Taking a memory structure including a cache module and a storage module as an example, since the read rate of the storage module is lower than the read rate of the cache module, the compression level of the storage module is higher than the compression level of the cache module. For example, the compression level of the storage module is two-layer compression, the compression level of the cache module is one-layer compression, and the restore rate of one-layer compression is higher than that of two-layer compression.
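For readability, the following Python sketch illustrates one possible way to map a read-latency budget to per-memory compression levels as described above. The rate units, the proportional split of the restore budget, the range boundaries, and all names are illustrative assumptions rather than details of the embodiment.

def choose_compression_levels(demand_cycles, read_cycles, level_ranges):
    # demand_cycles: total cycles allowed to read and restore the first instruction.
    # read_cycles: dict memory_name -> cycles needed just to read from that memory.
    # level_ranges: list of (level_name, min_restore_cycles, max_restore_cycles),
    #               ordered from the highest to the lowest compression level.
    restore_budget = demand_cycles - sum(read_cycles.values())
    if restore_budget <= 0:
        raise ValueError("demand rate leaves no room for restoration")
    # Assumed policy: split the restore budget in proportion to each memory's read
    # cost, so the slower memory (the storage module) tolerates a slower restore
    # and therefore a higher compression level.
    total_read = sum(read_cycles.values())
    levels = {}
    for name, cycles in read_cycles.items():
        per_memory_restore = restore_budget * cycles / total_read
        for level_name, lo, hi in level_ranges:
            if lo <= per_memory_restore <= hi:
                levels[name] = level_name
                break
    return levels

print(choose_compression_levels(
    demand_cycles=12,
    read_cycles={"storage_module": 4, "cache_module": 2},
    level_ranges=[("two-layer", 3.0, 10.0), ("one-layer", 0.0, 3.0)]))
# -> {'storage_module': 'two-layer', 'cache_module': 'one-layer'}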
S303, compressing the first instruction by adopting a compression level.
Illustratively, the compression module determines the compression mode of the compression level of each memory, and compresses the first instruction according to the determined compression modes. The compression modes of different compression levels may be different; for example, the compression mode of one-layer compression is fixed-length processing and the compression mode of two-layer compression is variable-length compression. The compression modes of different compression levels may also be the same, in which case different compression levels indicate different numbers of compression passes; for example, the compression mode of one-layer compression is a single pass of compression algorithm 1, while the compression mode of two-layer compression is a first pass of compression algorithm 1 to obtain compression result 1 followed by a second pass of compression on compression result 1 to obtain compression result 2, and compression result 2 is the result of the two-layer compression mode.
In the case where the memory structure includes a plurality of memories, the plurality of memories have a compression order. Taking a memory structure including the storage module and the cache module as an example, referring to fig. 5, the first instruction is compressed by compression algorithm 1 and compression algorithm 2 to obtain a third instruction, and the third instruction is stored in the storage module. After being read, the third instruction stored in the storage module is sent to the cache module for caching, and the third instruction is restored during this transmission, that is, decompression logic 2 corresponding to compression algorithm 2 is executed so that the third instruction is restored to the second instruction, which is then cached in the cache module. Subsequently, after being read, the second instruction stored in the cache module is sent to the decoding module for parsing, and the second instruction is restored during this transmission, that is, decompression logic 1 corresponding to compression algorithm 1 is executed so that the second instruction is restored to the first instruction, and the decoding module decodes the first instruction. In this case, the decompression corresponding to the compression mode of the storage module is performed first and the decompression corresponding to the compression mode of the cache module is performed later; the decompression order is therefore the reverse of the compression order, that is, the compression performed later is restored earlier. Accordingly, the compression module first executes the compression mode of the cache module and then executes the compression mode of the storage module.
Optionally, for a memory structure including a plurality of memories, the compression module may first determine the compression mode of each memory and then determine the compression order of the memories, or first determine the compression order and then determine the compression modes, or determine the compression order and the compression modes in parallel based on multiple threads. Regardless of the manner in which the compression order and the compression modes are determined, the compression module compresses the first instruction with the compression mode of each memory according to the compression order to obtain the compressed first instruction, as sketched below.
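The ordering just described can be illustrated with the following Python sketch: the compression associated with the memory that is read last (the cache module) is applied first, and decompression on the read path runs in the reverse order. The placeholder algorithms are assumptions purely for illustration and are not the algorithms of the embodiment.

def compress_for_memories(instruction, stages):
    # stages: list of (compress_fn, decompress_fn), ordered from the memory read
    # last (cache module) to the memory read first (storage module).
    data = instruction
    for compress_fn, _ in stages:
        data = compress_fn(data)
    return data

def restore_on_read_path(data, stages):
    # What was compressed last is restored first.
    for _, decompress_fn in reversed(stages):
        data = decompress_fn(data)
    return data

algo1 = (lambda s: s[::-1], lambda s: s[::-1])                             # stand-in for compression algorithm 1
algo2 = (lambda s: s.encode().hex(), lambda s: bytes.fromhex(s).decode())  # stand-in for compression algorithm 2

stages = [algo1, algo2]            # algo1 belongs to the cache module, algo2 to the storage module
third_instruction = compress_for_memories("mov r1, r2", stages)
print(restore_on_read_path(third_instruction, stages))   # prints: mov r1, r2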
Because the process of compressing the first instruction once is similar to the process of compressing it a plurality of times, the following takes as an example a memory structure including a storage module and a cache module, where the compression level of the storage module is two-layer compression, the compression level of the cache module is one-layer compression, and the restore rate of one-layer compression is higher than that of two-layer compression, to illustrate the process of compressing the first instruction.
In one possible implementation, the compression module compresses the first instruction in the one-layer compression mode to obtain a second instruction, and compresses the second instruction in the two-layer compression mode to obtain a third instruction, where the compression rate of the two-layer compression mode is greater than that of the one-layer compression mode. Illustratively, the compression mode of one-layer compression is fixed-length processing, and the ratio between the bit width of the first instruction and the bit width of the second instruction is equal to the compression ratio. The compression ratio is, for example, any integer set based on experience and the implementation environment. The compression ratio may also be determined based on a service characteristic: optionally, the compression module obtains a service characteristic of the first instruction, the service characteristic indicating the operational quality requirement of the service operated based on the first instruction, and determines the compression ratio based on that service characteristic. The service characteristic may be determined based on a service-level agreement (SLA), and includes, but is not limited to, service running delay, packet loss rate, and the like. The compression module acquires the service characteristic and determines, based on it, the compression ratio that can be applied without affecting the service operation quality. For example, when the compression ratio determined based on the service characteristic is 2, a 64-bit first instruction yields a 32-bit second instruction after fixed-length processing, and an 8-bit first instruction yields a 4-bit second instruction; in this case the second instruction obtained by fixed-length processing of the first instruction may also be referred to as a half-width instruction.
In one possible case, the fixed-length processing includes fixed-length compression or splitting. In this case, the compression module acquires the instruction characteristics of the first instruction; if it determines according to the instruction characteristics that fixed-length compression is supported, it compresses the first instruction in the fixed-length compression mode to obtain the second instruction, or if it determines according to the instruction characteristics that fixed-length compression is not supported, it splits the first instruction into a plurality of second instructions, where the bit width of each of the plurality of second instructions is the same.
Illustratively, the instruction characteristics of the first instruction include at least one of the instruction type, the frequency of use, or the field utilization of the first instruction. The instruction type may be the functional class of the instruction, for example a transfer-class instruction or a memory-access-class instruction; the instruction type may also be classified according to the instruction format, for example a double-operand instruction, a single-operand instruction, or a program transfer instruction. Optionally, the frequency of use of the first instruction may reflect the number of times instructions with the same function are used within a reference time, where the reference time may be any time unit set according to experience and the implementation environment, for example 1 second. Taking the first instruction being a move (mov) instruction as an example, the compression module counts the number of mov instructions executed in 1 second and thereby obtains the frequency of use of the mov instruction. The frequency of use may also refer to the proportion, among the plurality of instructions of the running application program, of instructions with the same function as the first instruction. In one possible scenario, the field utilization is used to reflect how many blank fields are included in the first instruction.
The compression module may choose to determine whether fixed-length compression may be performed on the first instruction based on at least one of instruction type, frequency of use, or field utilization. An implementation of determining whether fixed-length compression can be performed on the first instruction based on the three instruction characteristics is described next.
In the first implementation mode, the instruction type supporting fixed-length compression is acquired, and the fixed-length compression is determined to be executed on the first instruction under the condition that the acquired instruction type comprises the instruction type of the first instruction.
For example, the instruction types supporting fixed-length compression may be manually input: the network device where the compression module is located provides an information input control, the operator inputs the instruction types supporting fixed-length compression through the information input control, and the compression module thereby obtains the set of instruction types supporting fixed-length compression. Taking an example in which the instruction types include a mov instruction and an add (add) instruction, since the size of the unused blank fields in the mov instruction is larger than a first threshold set based on experience, the mov instruction has more unused space and fixed-length compression can be performed on it; the size of the unused blank fields in the add instruction is smaller than the first threshold, the add instruction has little unused space, and fixed-length compression cannot be adopted for it. Based on this, the operator inputs through the information input control that the instruction types supporting fixed-length compression include the mov instruction. In one possible case, the compression module may also acquire the set of instruction types supporting fixed-length compression by learning from historical data, for example from the compression results of fixed-length compression for each instruction type; the success rates of fixed-length compression for different instruction types are counted through machine learning, so that the instruction types supporting fixed-length compression are determined.
In the second implementation mode, the fixed-length compression is determined to be adopted for the first instruction under the condition that the using frequency is larger than the second threshold value, or the fixed-length compression is determined not to be adopted for the first instruction under the condition that the using frequency is not larger than the second threshold value.
Alternatively, the second threshold may be any value set empirically, for example 70% or 50%. When the frequency of use of the first instruction is greater than the second threshold, the first instruction is a commonly used instruction, it is cached in the cache module many times, and the memory overhead it occupies is large; compressing the first instruction in a fixed-length mode therefore reduces the cache space occupied by the first instruction in the cache module and further reduces its storage overhead. When the frequency of use of the first instruction is not greater than the second threshold, the first instruction is not a commonly used instruction and is cached in the cache module only a few times; if fixed-length compression were applied, the compressed first instruction would still have to be restored, and even though the storage overhead is reduced, the restoration overhead added by restoring the compressed first instruction would exceed the storage overhead saved, so the overall instruction overhead would remain large.
In the third implementation mode, the fixed-length compression is determined to be adopted for the first instruction under the condition that the field utilization rate is smaller than a third threshold value, or the fixed-length compression is determined not to be adopted for the first instruction under the condition that the field utilization rate is not smaller than the third threshold value.
Alternatively, the third threshold may be set according to the compression ratio. In the case where fixed-length compression is used to compress the first instruction into a half-width instruction, the third threshold may be 50%. When the field utilization is less than 50%, there are many unused free fields in the first instruction, and the first instruction can be compressed into a half-width instruction by deleting the free fields. When the field utilization is not less than 50%, there are few unused free fields in the first instruction; even if all of the free fields were deleted, a half-width instruction could not be obtained, and the further processing required would be redundant, complex, and inefficient, so fixed-length compression is not adopted for the first instruction.
The compression module may adopt any one or more of the first, second, and third implementations to determine, according to the instruction characteristics, whether the first instruction can be fixed-length compressed. When the judgments give different results, for example the instruction type indicates that the first instruction cannot be fixed-length compressed while the field utilization indicates that it can, the final result can be determined according to the weights of the different instruction characteristics. With the weight of the instruction type set empirically lower than that of the field utilization, the compression module determines that fixed-length compression is to be applied to the first instruction, because the field utilization, which carries the higher weight, indicates that fixed-length compression can be performed. A sketch of such a weighted decision is given below.
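The following Python sketch shows one possible weighted combination of the three checks. The supported-type set, the thresholds, and the weights are illustrative assumptions, not values taken from the embodiment.

SUPPORTED_TYPES = {"mov", "jmp", "cmp", "mrg"}   # assumed set of types supporting fixed-length compression
FREQ_THRESHOLD = 0.5                             # assumed "second threshold"
FIELD_UTIL_THRESHOLD = 0.5                       # assumed "third threshold" for a 2:1 compression ratio

def supports_fixed_length(instr_type, use_frequency, field_utilization,
                          weights=(0.2, 0.3, 0.5)):
    # Each check votes for fixed-length compression (True) or splitting (False);
    # the weighted majority wins. weights correspond to (type, frequency, utilization).
    votes = [
        instr_type in SUPPORTED_TYPES,
        use_frequency > FREQ_THRESHOLD,
        field_utilization < FIELD_UTIL_THRESHOLD,
    ]
    score = sum(w for w, v in zip(weights, votes) if v)
    return score > sum(weights) / 2

# The instruction type alone would reject fixed-length compression, but the
# higher-weighted field-utilization check (together with the frequency check)
# decides in favour of it, matching the weighted decision described above.
print(supports_fixed_length("add", use_frequency=0.8, field_utilization=0.3))  # True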
Regardless of the manner in which the compression module determines whether the first instruction can be fixed-length compressed, the first instruction can be subjected to fixed-length processing according to the judgment result. Illustratively, the process of fixed-length compressing the first instruction includes determining the valid fields of the first instruction and obtaining a second instruction that includes the valid fields. A valid field refers to a field used to execute the first instruction; it is, for example, a non-blank field, or a repeated field, or a field necessary for executing the instruction.
FIG. 6 is a schematic diagram of an effective field provided by an embodiment of the present application, and four types of instructions, respectively, a mov instruction, a merge (mrg) instruction, a compare (cmp) instruction, and a jump (jmp) instruction, are shown in FIG. 6. Referring to fig. 6, the mov instruction includes an operation code (opcode) field, an unactuated condition (cond) field, a destination (Dst) operand field, and a source (Src) 1 operand/Immediate (IMM) field, with slashed hatching in fig. 6 indicating blank fields. In fig. 6, the non-blank fields in the mov instruction include an opcode field, an unactuated cond field, a Dst field, and a Src1/IMM field, and since the mov instruction only needs the opcode field, the Dst field, and the Src1 field, the unactuated cond field is not needed, and thus the valid fields in the mov instruction are the opcode field, the Dst field, and the Src1 field. The valid fields for mrg instructions, cmp instructions, and jmp instructions may be determined in conjunction with mrg instructions, cmp instructions, and jmp instructions of FIG. 6 and fixed length compressed mrg instructions, fixed length compressed cmp instructions, and fixed length compressed jmp instructions.
After the compression module determines the valid fields of the first instruction, the second instruction may be determined based on the valid fields. Optionally, the compression module may extract the valid fields included in the first instruction and concatenate them to obtain the second instruction. The valid fields may also be shrunk, and the shrunk valid fields concatenated to obtain the second instruction. Shrinking a valid field means reducing the bits occupied by the valid field: the first number of bits actually occupied by the information carried in the valid field is determined, the second number of bits occupied by the valid field is determined, and the bits occupied by the valid field are adjusted from the second number to the first number. For example, if the field size of a valid field is 6 bits and the information carried in it occupies 4 bits, the valid field can be shrunk to 4 bits. In fig. 6, the compression module extracts the opcode field, the Dst field, and the Src1 field, shrinks them, and concatenates the shrunk opcode field, Dst field, and Src1 field into the fixed-length compressed mov instruction shown in fig. 6, that is, the second instruction. In one possible case, the compression module may also narrow the data selection range of a valid field, for example, determine the commonly used selection range whose frequency of use in the valid field is greater than a reference threshold and use that range as the data selection range indicated by the valid field. For example, in the mrg instruction of fig. 6, the data selection uses a low address space and a small immediate, where a small immediate refers to an immediate whose value is less than a fourth threshold and a low address space refers to an address space whose value is less than a fifth threshold, the fourth and fifth thresholds being values set based on experience and the implementation environment. In this case, the small immediate can be merged into one field, which narrows the data selection range. The above process of fixed-length compression may, in some cases, also be referred to as instruction set architecture compression (ISA compression) or instruction set compression. A sketch of the valid-field extraction is given below.
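The valid-field extraction and shrinking described above can be sketched as follows in Python. The field layout, bit widths, and flag-bit convention are assumptions for illustration and do not reproduce the exact formats of fig. 6.

VALID_FIELDS = {"mov": ["opcode", "dst", "src1"]}   # the cond field and blank fields are dropped

def fixed_length_compress(instr_type, fields, target_bits=32):
    # fields: dict field_name -> (value, original_bit_width).
    # Keeps only the valid fields, shrinks each one to the bits its value actually
    # needs, and concatenates them into a half-width second instruction.
    out_bits = ""
    for name in VALID_FIELDS[instr_type]:
        value, _ = fields[name]
        needed = max(1, value.bit_length())          # bits actually occupied by the carried information
        out_bits += format(value, "0{}b".format(needed))
    if len(out_bits) > target_bits:
        raise ValueError("valid fields do not fit the fixed length")
    return "1" + out_bits.ljust(target_bits, "0")    # leading '1' = compression flag bit (assumed convention)

second = fixed_length_compress(
    "mov",
    {"opcode": (0b1101, 6), "cond": (0, 4), "dst": (0b0011, 6), "src1": (0b101, 6)})
print(second, len(second))   # a 32-bit half-width body plus the flag bit, versus the full-width first instruction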
For a first instruction that cannot be fixed-length compressed, the compression module can split the first instruction into second instructions of the same bit width according to the compression ratio. The splitting process may determine the splitting position within the instruction structure of the first instruction according to the instruction structure and the compression ratio, and split the first instruction at the splitting position to obtain a plurality of second instructions. Taking a compression ratio of 2 as an example, for an 8-bit first instruction, 8 divided by 2 equals 4, so the splitting position is the midpoint of the instruction structure, namely the position of the 4th bit; the first 4-bit field of the first instruction becomes one second instruction and the last 4 bits become another second instruction. Whether the first instruction is compressed by splitting or by fixed-length compression, the instruction size of the second instruction is unified and the bit-width proportion of the instructions is normalized, so the subsequent process of compressing second instructions of a unified bit width is simpler and the compression efficiency is higher. A sketch of the splitting path follows.
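The splitting path can be sketched as follows in Python; the flag-bit convention is again an assumption.

def split_instruction(bits, compression_ratio=2):
    # bits: the first instruction as a bit string. Returns compression_ratio second
    # instructions of equal width, each prefixed with flag bit '0' (assumed to mean
    # "obtained by splitting").
    if len(bits) % compression_ratio != 0:
        raise ValueError("bit width must be divisible by the compression ratio")
    part = len(bits) // compression_ratio
    return ["0" + bits[i * part:(i + 1) * part] for i in range(compression_ratio)]

print(split_instruction("10110011"))   # ['01011', '00011'] - two half-width second instructions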
Because the compression module may obtain the second instruction either by fixed-length compression of the first instruction or by splitting the first instruction, the second instruction includes a compression flag bit, where the compression flag bit indicates whether the second instruction was obtained by fixed-length compression or by splitting. Referring to fig. 7, fig. 7 includes 8 first instructions to be processed. The numbers after inst are used to distinguish different first instructions, and the English in brackets is the instruction type of the first instruction, that is, mov indicates a mov instruction, add indicates an add instruction, and jmp indicates a jmp instruction. In fig. 7, the 0 or 1 after each first instruction is a boundary bit: when the boundary bit is equal to 0, the instruction is not the last instruction in the instruction bundle, and when the boundary bit is equal to 1, the instruction is the last instruction in the instruction bundle. That is, in fig. 7, inst0, inst1, inst2, and inst3 belong to the same instruction bundle, inst4 and inst5 belong to the same instruction bundle, and inst6 and inst7 belong to the same instruction bundle.
The compression module performs fixed-length processing on 8 first instructions in fig. 7 to obtain a plurality of second instructions. The second instruction in fig. 7 includes a compression flag bit, and by assigning a value to the compression flag bit, different fixed-length processing can be indicated. For example, when the compression flag bit is a first value, the manner of determining to acquire the second instruction is fixed-length compression, and when the compression flag bit is a second value, the manner of determining to acquire the second instruction is split. Wherein the first value and the second value may be any of various empirically set values, the first value being for example 1 and the second value being for example 0.
In fig. 7, the second instruction of the mov instruction is obtained by fixed-length compression, so the compression flag bit before inst0_C is assigned the first value 1; the second instructions of the add instruction are obtained by splitting, so the compression flag bit before inst2 (add) is assigned the second value 0. In fig. 7, the compression module splits the add instruction, so the number of second instructions corresponding to the add instruction is two. In fig. 7, the C in the second instructions of the mov instruction and the jmp instruction indicates one pass of compression.
After the first instruction is subjected to fixed-length processing to obtain the second instruction, the second instruction can be compressed in the two-layer compression mode to further reduce its size. Because the restoration corresponding to two-layer compression is executed on the read path from the storage module to the cache module, where the instruction reading speed requirement is low, the restore rate of the two-layer compression mode adopted by the compression module can be lower than that of the one-layer compression mode. The two-layer compression mode includes, but is not limited to, variable-length compression, and the length of the instruction obtained by variable-length compression is not fixed.
The compression module may perform variable-length compression on the second instructions sequentially according to the instruction position of each second instruction, may perform variable-length compression on the second instructions in a random order, or may perform variable-length compression on the second instructions in descending or ascending order of their occurrence frequency. Referring to fig. 8, during the fixed-length processing of the plurality of first instructions according to the instruction characteristics, some of the first instructions are compressed to half of their original width by fixed-length compression and an instruction compression flag is added to obtain second instructions; an uncompressed full-width instruction, that is, a first instruction that does not support fixed-length compression, is split into two half-width segments to obtain two second instructions. For the second instructions thus obtained, the compression module counts the high-frequency second instructions, namely those whose occurrence frequency is higher than a frequency threshold, and preferentially performs variable-length compression on the high-frequency second instructions.
Optionally, the variable length compression process includes, but is not limited to, obtaining the frequency of occurrence of each field included in the second instruction, and obtaining a third instruction according to the frequency of occurrence of each field, wherein the third instruction includes codes of each field, the codes of each field are determined based on the frequency of occurrence of each field, and the length of the codes of each field is inversely proportional to the frequency of occurrence of each field.
For example, the occurrence frequency of a field may refer to the frequency with which the field occurs in the full set of instructions, where the full set of instructions refers to all instructions to be stored into the memory structure, including the second instruction. In one possible scenario, different instructions include identical fields; identical fields may be fields with identical values, or fields that are functionally identical or functionally similar. For any field included in the second instruction, the compression module counts a first number, which is the number of occurrences of that field, and a second number, which is the total number of fields included in the full set of instructions; the quotient of the first number divided by the second number is the occurrence frequency of that field.
After determining the occurrence frequency of each field, the compression module may encode according to the occurrence frequencies. In one possible case, the compression module constructs a coding dictionary corresponding to the second instruction according to the occurrence frequency of each field, where the coding dictionary includes each field and its occurrence frequency, and determines the code of each field according to the coding dictionary. The compression module may store the occurrence frequency of each field as an index of the field and store the field together with the index in the coding dictionary, or may determine the storage position of each field in the coding dictionary according to its occurrence frequency. Taking the coding dictionary organized as a multi-way tree as an example, the field with the highest occurrence frequency is taken as the root node, and leaf nodes are then constructed downward, level by level, in descending order of occurrence frequency.
Regardless of the manner in which the compression module constructs the coding dictionary, the second instruction may be variable-length compressed based on the constructed dictionary to obtain the third instruction, as shown in fig. 8. Optionally, the compression module determines the occurrence frequency of each field according to the coding dictionary and determines the code of each field according to its occurrence frequency: the higher the occurrence frequency, the shorter the code. Taking fields A, B, and C as an example, where the occurrence frequency of field A is higher than that of field B and the occurrence frequency of field B is higher than that of field C, the code of field A may be 0, the code of field B may be 10, and the code of field C may be 11. In one possible implementation, the variable-length compression may be referred to as Huffman compression.
The codes of the fields are determined according to the occurrence frequencies: fields with higher occurrence frequencies, which are more numerous in the full set of instructions, are assigned shorter codes, and the codes corresponding to the fields are concatenated into the third instruction, which effectively reduces the overall length of the full set of instructions. In addition, the second instruction may include one or more fields, which is not limited by the embodiments of the present application. When the second instruction includes a single field, the above process of coding according to the occurrence frequency of the field may be understood as encoding the second instruction as a whole according to its occurrence frequency in the full set of instructions, that is, one second instruction corresponds to one code, and that code is the third instruction. A sketch of this variable-length compression step is given below.
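The following Python sketch illustrates the variable-length (Huffman-style) step: field frequencies are counted over the full set of instructions, shorter codes are assigned to more frequent fields, and the codes of a second instruction's fields are concatenated into the third instruction. The field values and frequencies are made up for illustration.

import heapq, itertools
from collections import Counter

def build_codes(field_stream):
    # field_stream: iterable of field values taken from the full set of instructions.
    # Returns dict field_value -> prefix-free code; code length is inversely related
    # to the field's occurrence frequency.
    freq = Counter(field_stream)
    tie = itertools.count()
    heap = [(count, next(tie), {field: ""}) for field, count in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {f: "0" + code for f, code in c1.items()}
        merged.update({f: "1" + code for f, code in c2.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2] if heap else {}

fields_in_full_instruction_set = ["opA"] * 5 + ["opB"] * 3 + ["opC"]
codes = build_codes(fields_in_full_instruction_set)
second_instruction_fields = ["opA", "opC"]
third_instruction = "".join(codes[f] for f in second_instruction_fields)
print(codes, third_instruction)   # codes: opA -> '1', opB -> '01', opC -> '00'; third_instruction == '100'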
In one possible implementation, the third instruction includes a coding index used, during decompression of the third instruction, to look up the fields corresponding to the codes included in the third instruction; in the case where the coding is based on a coding dictionary, the coding index may be understood as information for looking up the dictionary. With continued reference to fig. 7, the coding index in fig. 7 is the information carried in the tag, which in some cases may also be referred to as a fixed-length coded compression header (code compress tag).
Under a possible condition, after the compression module compresses the first instruction, the compressed first instruction can be stored in the memory structure, so that the compressed first instruction achieves a programmable effect, analysis of a subsequent decoding module and calculation of the parallel calculation module are facilitated, and smooth execution of an application program corresponding to the first instruction is ensured. The compression module obtains the instruction length of the compressed first instruction, determines a storage unit for storing the compressed first instruction in the memory structure according to the instruction length, and stores the compressed first instruction in the determined storage unit.
Optionally, taking the compressed first instruction being the third instruction in the above embodiment as an example, the third instruction further includes a length field in addition to the coding index. The compression module may determine the instruction length of the third instruction by parsing the length field, and then store the third instruction, according to the free space of the storage units in the storage module, in a storage unit whose free space is not smaller than the instruction length. In one possible case, for third instructions obtained by compressing the first instructions included in one instruction bundle, the compression module may store the plurality of third instructions belonging to the same instruction bundle in the same row of storage units. Optionally, the compression module may determine the third instructions belonging to the same instruction bundle according to the boundary bits of the third instructions, or may determine third instructions carrying the same instruction bundle identifier as belonging to the same instruction bundle. Since instructions in the storage module are subsequently read and parsed in units of a row of storage units, storing the third instructions belonging to the same instruction bundle in the same row ensures that they can be processed in parallel at the same time. With continued reference to fig. 7, the compression module stores the plurality of third instructions obtained by variable-length compression in the same row of storage units. In fig. 7, the uncompressed first instructions occupy two rows of storage units; by compressing the first instructions, the storage space they occupy is effectively reduced, thereby reducing the storage overhead. A sketch of this row-packing step follows.
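The row-packing step can be sketched as follows in Python; the row capacity and the record layout are assumptions made for illustration.

ROW_BITS = 128   # assumed capacity of one row of storage units

def pack_bundles(third_instructions):
    # third_instructions: list of (bits, is_last_in_bundle). Returns a list of rows,
    # each row being the concatenation of the third instructions of one bundle.
    rows, bundle = [], ""
    for bits, is_last in third_instructions:
        bundle += bits
        if is_last:                      # boundary bit 1 closes the bundle
            if len(bundle) > ROW_BITS:
                raise ValueError("bundle does not fit into one row of storage units")
            rows.append(bundle)
            bundle = ""
    return rows

rows = pack_bundles([("1100", False), ("101", False), ("0110", True),
                     ("111", False), ("0001", True)])
print(rows)   # ['11001010110', '1110001'] - one row per instruction bundle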
In summary, in the instruction processing method provided by the embodiment of the present application, the first instruction is compressed before being stored, which reduces its size; a memory structure of the same capacity can therefore store more compressed first instructions than uncompressed first instructions, which increases the number of instructions the memory structure can store and reduces its storage overhead. More instructions are stored in the same row of storage units, so in the subsequent process of reading instructions row by row, fewer read operations are needed and the read overhead is low. Since the first instruction is compressed based on the compression level of the memory, and the compression level is determined according to the demand rate of reading the first instruction from the memory structure, the process of restoring the compressed first instruction meets the demand rate of the memory structure; even though the compressed first instruction is restored while being read, the normal operation of the memories in the memory structure is not affected, and the smooth running of the application program operated based on the first instruction is ensured.
The embodiment of the application provides an instruction processing method, which can be applied to the implementation environment shown in fig. 1 or fig. 2, and is exemplified by the method being executed by a restoration module, and the flowchart of the method is shown in fig. 9, and includes S901-S902.
S901, acquiring a compressed first instruction, wherein the compression of the first instruction is realized based on a compression hierarchy, the compression hierarchy is determined based on a memory structure and a demand rate for reading the first instruction, and the memory structure is used for storing the compressed first instruction.
Illustratively, the restoration module obtains an instruction identifier to be processed and determines the compressed first instruction to be processed according to the instruction identifier. The restoration module can determine the instruction identifier currently scheduled for computation according to the program loading condition in the device, or can receive a manually input instruction identifier. The instruction identifier may be the identifier of the instruction bundle to which the compressed first instruction belongs, or the location identifier of the storage unit where the instruction is located, for example a row of storage units, or another identifier capable of distinguishing different instructions.
Regardless of the manner in which the restoration module obtains the instruction identifier to be processed, it determines the compressed first instruction to be processed according to the instruction identifier and can extract the compressed first instruction from the memory structure. In one possible implementation, the restoration module extracts the content stored in one row of storage units of the storage module and separates the instructions according to the instruction length of each instruction, thereby obtaining at least one instruction that includes the compressed first instruction.
For example, suppose the content stored in a row of storage units is 0100001110101: the leading 0 is the compression flag bit of the first instruction, the body of the first instruction starts with 1, and its instruction length is 3, so the first instruction is 100; the 0 following 100 is a boundary bit; the compression flag bit of the second instruction is then determined to be 0, and the second instruction is determined according to its instruction length, and so on, thereby achieving the separation of the plurality of instructions. A sketch of this separation is given below.
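A minimal Python sketch of this separation is given below, under a record layout reconstructed from the example above (compression flag bit, then the instruction body whose length comes from the tag, then a boundary bit); only the first length matches the example, and the remaining lengths are assumptions.

def separate_row(row_bits, body_lengths):
    # row_bits: contents of one row of storage units as a bit string.
    # body_lengths: per-instruction lengths obtained from tag parsing.
    instructions, pos = [], 0
    for length in body_lengths:
        flag = row_bits[pos]; pos += 1                   # compression flag bit
        body = row_bits[pos:pos + length]; pos += length # instruction body
        boundary = row_bits[pos]; pos += 1               # 1 = last instruction of the bundle
        instructions.append({"flag": flag, "body": body, "last_in_bundle": boundary == "1"})
    return instructions

# First record: flag '0', body '100' (length 3), boundary '0', as in the example above.
print(separate_row("0100001110101", [3, 3, 1]))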
Fig. 10 is a schematic diagram of an instruction processing procedure according to an embodiment of the present application. Referring to fig. 10, a plurality of compressed instructions are stored in the storage module in units of instruction words; one instruction word in fig. 10 includes eight half-width instructions (half inst), because a row of cache units in the cache module can store only eight half-width instructions. The lengths of the instruction words corresponding to the eight half-width instructions differ in fig. 10, because the lengths of the instructions obtained by variable-length compression of the half-width instructions included in different instruction words are different.
In fig. 10, taking the reading of instruction word iw_c4 as an example, the restoration module locates instruction word iw_c4 through the program counter (PC), extracts the located instruction word iw_c4, and separates the instructions according to the instruction length of each instruction included in instruction word iw_c4. Since the instruction length, as shown in fig. 7, is information carried in the tag of each instruction, the restoration module can determine the instruction length of each instruction through tag parsing, thereby separating the plurality of instructions; a tag parser is used for this in fig. 10. Since the plurality of instructions are separated sequentially one after another, the tag parsing process may, in some cases, be referred to as serial parsing. After separating the plurality of instructions, the restoration module can determine the compressed first instruction to be processed from among them. Continuing with fig. 10 as an example, the first five of the eight separated instructions are the instructions included in instruction bundle 1; since the instruction identifier indicates that the instruction bundle currently to be processed is instruction bundle 1, the restoration module determines the first five instructions as the compressed first instructions to be processed. Optionally, the first instruction includes an instruction of a forwarding application program.
In one possible implementation, the restoration module may further receive the compressed first instruction sent by the memory structure, so as to obtain the compressed first instruction to be restored. Taking the memory structure shown in fig. 4 as an example, the processor core sends an instruction reading request to the memory module, and the memory module determines the instruction to be read as a compressed first instruction according to the instruction reading request, and reads the compressed first instruction. Since the instruction format that the processor core can process is in an uncompressed instruction format, the compressed first instruction is also restored to an uncompressed first instruction before returning the instruction to the processor core. Based on the above, the storage module sends the extracted compressed first instruction to the restoration module, and the restoration module obtains the compressed first instruction.
S902, restoring the compressed first instruction.
The restoring module obtains a compression mode corresponding to the compressed first instruction, and restores the compressed first instruction by adopting a restoring mode corresponding to the compression mode. The compression mode corresponding to the first instruction is matched with the compression level of the memory for storing the first instruction. The restoration module may determine a compression manner employed by the compression module on the first instruction based on the communication connection with the compression module. For the case that the compressed first instruction is obtained by compressing the first instruction in a plurality of compression modes, the restoration module can determine the restoration sequence corresponding to each memory according to the reading sequence of each memory, and execute the restoration mode corresponding to the compression mode on the compressed first instruction according to the restoration sequence.
In the embodiment shown in fig. 3, the memory structure includes a storage module and a cache module, the compression level of the storage module is two-layer compression, the compression level of the cache module is one-layer compression, and the restore rate of one-layer compression is higher than that of two-layer compression. In this case, the process of restoring the compressed first instruction includes, but is not limited to: restoring the third instruction in the restoration mode corresponding to the two-layer compression mode to obtain the second instruction, and restoring the second instruction in the restoration mode corresponding to the one-layer compression mode to obtain the first instruction, where the compression rate of the two-layer compression mode is greater than that of the one-layer compression mode.
The first restoring mode corresponding to the variable-length compression comprises a restoring module analyzing a coding index included in a third instruction, determining each field corresponding to each code included in the third instruction according to the coding index, and determining a second instruction according to each field corresponding to each code, wherein the length of the code of each field is inversely proportional to the occurrence frequency of each field.
In one possible implementation, the third instruction includes a coded index, for example, the third instruction includes a tag field, and the recovery module parses the tag field to determine the coded index carried in the tag field. Optionally, the coding index may directly indicate a correspondence between the code and the field, and in the case where the compression module performs coding based on the coding dictionary, the coding index may indicate the coding dictionary used in the process of obtaining the third instruction by the coding, and the restoration module determines the correspondence between the code and the field according to the coding dictionary.
In one possible case, the manner in which the second instruction was obtained from the first instruction falls into three cases: the second instruction was obtained by fixed-length compression of the first instruction; the second instruction is the high-order part obtained by splitting the first instruction; or the second instruction is the low-order part obtained by splitting the first instruction. The coding dictionaries used for encoding the second instruction in these different cases are stored in different locations. In this situation, the restoration module determines, according to the compression flag bit carried by the third instruction, which case the second instruction corresponding to the third instruction belongs to, and sends the coding index to the storage location of the coding dictionary corresponding to the determined case, thereby obtaining the coding dictionary.
Illustratively, the restoration module parses the compression flag bit, determines the storage location of the coding dictionary of the third instruction according to the compression flag bit and the information carried by the third instruction, accesses the determined storage location, and obtains the coding dictionary of the third instruction according to the coding index. When the value of the compression flag bit is the first value, the second instruction corresponding to the third instruction belongs to the first case, and the coding dictionary for decompressing the third instruction is determined, according to the coding index of the third instruction, from the coding dictionaries for fixed-length compression. When the value of the compression flag bit is the second value, whether the second instruction corresponding to the third instruction belongs to the high-order part or the low-order part is determined according to the information carried by the second instruction. When it belongs to the high-order part, the second instruction corresponds to the second case, and the coding dictionary for decompressing the third instruction is determined, according to the coding index, from the coding dictionaries of the high-order part; when it belongs to the low-order part, the second instruction corresponds to the third case, and the coding dictionary for decompressing the third instruction is determined, according to the coding index, from the coding dictionaries of the low-order part.
Fig. 11 shows a process, provided by an embodiment of the present application, of restoring the third instructions: the third instructions included in an instruction word are decompressed to obtain second instructions. The instruction word includes 9 third instructions, and the restoration module determines the fixed-length processing corresponding to each third instruction according to its compression flag bit. When the fixed-length processing corresponding to a third instruction is fixed-length compression, the restoration module parses the coding index carried in the tag, sends the coding index to the storage space storing the coding dictionaries for fixed-length compression, and receives the coding dictionary for decompressing the third instruction that is looked up and returned according to the coding index. When the fixed-length processing corresponding to a third instruction is splitting, then for a third instruction obtained by compressing the second instruction of the high-order part, the coding index is sent to the storage space storing the coding dictionaries of the high-order part, and the coding dictionary for decompressing the third instruction, looked up and returned according to the coding index, is received; for a third instruction obtained by compressing the second instruction of the low-order part, the coding index is sent to the storage space storing the coding dictionaries of the low-order part, and the coding dictionary for decompressing the third instruction, looked up and returned according to the coding index, is received.
Regardless of the manner in which the restoration module obtains the coding dictionary for decompressing the third instruction, the correspondence between codes and fields can be obtained from that coding dictionary. Taking as an example a coding dictionary in which the storage position of each field is determined by its occurrence frequency, the restoration module determines the occurrence frequency of each field according to its storage position in the coding dictionary, and determines the correspondence between fields and codes according to the correspondence between occurrence frequencies and codes. The restoration module then looks up, according to the correspondence between fields and codes, the field corresponding to each code included in the third instruction, and concatenates the corresponding fields to obtain the second instruction. A sketch of this decoding step is given below.
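The decoding step can be sketched as follows in Python: the coding dictionary obtained via the coding index is inverted, the bits of the third instruction are matched prefix by prefix, and the recovered fields are concatenated into the second instruction. The dictionary contents are made-up values consistent with the earlier compression sketch.

def variable_length_decompress(third_bits, coding_dictionary):
    # coding_dictionary: dict field_value -> prefix-free code, as built at compression time.
    code_to_field = {code: field for field, code in coding_dictionary.items()}
    fields, current = [], ""
    for bit in third_bits:
        current += bit
        if current in code_to_field:          # a complete code has been matched
            fields.append(code_to_field[current])
            current = ""
    if current:
        raise ValueError("third instruction ends in the middle of a code")
    return fields                              # concatenating these fields yields the second instruction

dictionary = {"opA": "1", "opB": "01", "opC": "00"}   # e.g. the dictionary returned for the coding index
print(variable_length_decompress("100", dictionary))  # ['opA', 'opC']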
Illustratively, the restoring module may send the second instruction to the caching module for caching after restoring the third instruction to obtain the second instruction. Optionally, in the process of restoring the third instruction, the restoring module takes the instruction word as a dimension, where the instruction word refers to at least one instruction stored in a line cache unit of the cache module, so that the multiple second instructions obtained by restoring may be stored in the same line of the cache module, for example, as shown in fig. 10, and the instruction word iw_4 including the multiple second instructions obtained by decompressing is stored in a line in the cache module.
After restoring the third instruction to obtain the second instruction, the restoration module may further restore the second instruction. For example, the second instruction is extracted from the cache module and restored in the restoration mode corresponding to the compression mode of one-layer compression. In the case where the compression mode of one-layer compression is fixed-length processing, the second restoration mode corresponding to the fixed-length processing includes determining whether the second instruction was obtained by fixed-length compression or by splitting, and executing the restoration operation corresponding to the determined manner on the second instruction.
In one possible implementation, the second instruction carries a compression flag bit indicating whether the second instruction was obtained by fixed-length compression or by splitting. The restoration module may parse the compression flag bit included in the second instruction, determine from the compression flag bit which of fixed-length compression and splitting produced the second instruction, and then select the restoration operation used to restore the second instruction.
The restoration module reads the compression flag bit, determines that the second instruction was obtained by fixed-length compression when the compression flag bit takes a first value, and determines that the second instruction was obtained by splitting when the compression flag bit takes a second value. Fig. 12 is a schematic diagram of restoring second instructions, in which one instruction word includes eight second instructions. The restoration module parses the compression flag bit preceding inst0_c; since the value of the compression flag bit is the first value 1, it determines that the fixed-length processing corresponding to this second instruction is fixed-length compression. It parses the compression flag bit preceding inst2; since the value of the compression flag bit is the second value 0, it determines that the fixed-length processing corresponding to this second instruction is splitting.
After determining the fixed-length processing corresponding to the second instruction, the restoration module can perform the fixed-length restoration corresponding to that fixed-length processing. In one possible case, the restoration manner corresponding to the compression manner of the one-layer compression includes decompression corresponding to fixed-length compression, or splicing corresponding to splitting, and performing the restoration on the second instruction includes, but is not limited to, the following two restoration operations.
Restoration operation one: in the case that the second instruction was obtained by fixed-length compression, the decompression corresponding to fixed-length compression is applied to the second instruction to obtain the first instruction. Continuing with the second instruction shown in fig. 6 as an example, the restoration module determines the instruction structure corresponding to the second instruction, restores the invalid fields corresponding to the second instruction according to the instruction structure, and splices the invalid fields with the valid fields included in the second instruction to obtain the first instruction. The restoration module may determine the instruction structure according to the instruction type of the second instruction; for example, based on the second instruction belonging to the mov instruction, it determines that the non-blank fields of the second instruction further include a non-validated cond field, determines the information carried by the non-validated cond field according to the valid fields carried by the second instruction, and splices the non-validated cond field with the other blank fields and the valid fields to obtain the first instruction shown in fig. 6.
Restoration operation two: in the case that the second instruction was obtained by splitting, the multiple second instructions obtained by splitting the same first instruction are spliced to obtain the first instruction, where the bit widths of the second instructions are the same. For second instructions obtained by splitting, the restoration module determines the multiple second instructions split from the same first instruction and splices them to obtain the first instruction. Optionally, the restoration module may identify them by adjacent storage locations; for example, the multiple second instructions split from the same first instruction are stored adjacently, and adjacent second instructions whose compression flag bits take the second value are the second instructions split from the same first instruction. Alternatively, the restoration module may determine the second instructions belonging to the same first instruction according to information carried by the second instructions.
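The two restoration operations can be pictured with the following sketch. The flag values (1 for fixed-length compression, 0 for splitting), the field layout of the example mov-like instruction, the default values used for restored invalid fields and the helper names are assumptions made only for this illustration.

```python
# Illustrative sketch of restoring a second instruction back to a first instruction.

DEFAULT_FIELDS = {"cond": "1110", "pad": "0000"}   # assumed values for invalid fields

def restore_fixed_length(second_instr, instr_structure):
    """Rebuild the first instruction by re-inserting the dropped fields
    at the positions given by the instruction structure."""
    valid = dict(second_instr["valid_fields"])      # e.g. {"opcode": ..., "rd": ...}
    out = []
    for name in instr_structure:                    # full field order of the word
        out.append(valid.get(name, DEFAULT_FIELDS.get(name, "0" * 4)))
    return "".join(out)

def restore_split(split_parts):
    """Splice the equal-width second instructions split from one first instruction."""
    widths = {len(p) for p in split_parts}
    assert len(widths) == 1, "split parts are expected to share one bit width"
    return "".join(split_parts)

def restore_second_instruction(flag, payload, instr_structure=None):
    if flag == 1:                                   # first value: fixed-length compression
        return restore_fixed_length(payload, instr_structure)
    return restore_split(payload)                   # second value: splitting

# Fixed-length case: only the valid fields were kept in the second instruction.
mov_structure = ["cond", "opcode", "rd", "imm"]
second = {"valid_fields": {"opcode": "1101", "rd": "0010", "imm": "1010"}}
print(restore_second_instruction(1, second, mov_structure))   # -> "1110110100101010"

# Split case: two adjacent second instructions with the same bit width.
print(restore_second_instruction(0, ["11011010", "00101111"]))
```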
In one possible implementation, after restoring the first instruction, the restoration module further sends the first instruction to the decoding module, and the decoding module parses the first instruction. Since the decoding module takes the instruction bundle as its processing unit when parsing instructions, the process of sending the first instruction by the restoration module includes, but is not limited to: determining the instruction bundle corresponding to the first instruction, where the instruction bundle includes at least one instruction, the compression results of the at least one instruction are stored in the same row of storage units of the memory structure, any one of those compression results is a compressed first instruction, and the at least one instruction is executed in parallel; and sending the at least one instruction included in the instruction bundle to the decoding module, which parses the at least one instruction.
Fig. 13 is a schematic diagram of an instruction processing procedure according to an embodiment of the present application. In fig. 13, after multiple first instructions are obtained by performing fixed-length restoration on multiple second instructions, the first instructions included in an instruction bundle are extracted from the multiple first instructions. For example, according to the boundary bit included in a first instruction: when the boundary bit is 0, it is determined that the first instruction is not the last instruction of the instruction bundle and the search continues backwards until a first instruction whose boundary bit is 1 is read; that first instruction and the preceding first instructions are determined to be the first instructions included in the same instruction bundle. In fig. 13, the instruction bundle includes four first instructions, inst0, inst1, inst2 and inst3, and the decoding module receives the multiple first instructions included in the instruction bundle.
Alternatively, as shown in fig. 13, the restoration module may first perform fixed-length restoration on the second instructions and then determine the instruction bundle to which each restored first instruction belongs; or it may first determine the instruction bundle to which each second instruction belongs and then perform fixed-length restoration on the second instructions included in each instruction bundle to obtain the first instructions of that bundle. Taking the multiple second instructions shown in fig. 14 as an example, the restoration module extracts, according to the boundary bits of the second instructions, instruction bundle 1 including inst0, inst1, inst2 and inst3, and instruction bundle 2 including inst4 and inst5. The instruction bundle is then decompressed: fixed-length restoration is performed on each second instruction included in the instruction bundle to obtain multiple first instructions, and the multiple first instructions included in one instruction bundle are sent synchronously to the decoding module. In addition, the embodiment of the present application does not limit the manner of determining the instructions included in an instruction bundle; it may be based on the boundary bits in the above embodiment, or on other manners, for example an instruction bundle identifier.
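A minimal sketch of grouping by boundary bits is given below, assuming the convention described above (boundary bit 1 marks the last instruction of a bundle, 0 means the bundle continues); the instruction names are placeholders.

```python
# Illustrative sketch: group restored instructions into instruction bundles
# using a per-instruction boundary bit, as in the fig. 13/14 description.

def group_into_bundles(instructions):
    """instructions: list of (name, boundary_bit) in program order."""
    bundles, current = [], []
    for name, boundary in instructions:
        current.append(name)
        if boundary == 1:          # last instruction of the bundle reached
            bundles.append(current)
            current = []
    if current:                    # trailing instructions without a closing boundary
        bundles.append(current)
    return bundles

stream = [("inst0", 0), ("inst1", 0), ("inst2", 0), ("inst3", 1),
          ("inst4", 0), ("inst5", 1)]
print(group_into_bundles(stream))
# [['inst0', 'inst1', 'inst2', 'inst3'], ['inst4', 'inst5']]
```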
In summary, according to the instruction processing method provided by the embodiments of the present application, the storage space occupied by a compressed first instruction is smaller than the storage space occupied by the first instruction, so a memory structure of the same capacity can store more compressed first instructions, increasing the number of instructions the memory structure can hold. Since more instructions are stored in the same line of the memory structure, when instructions are read from the memory structure in units of lines, the efficiency of reading instructions is high and instruction-reading power consumption is reduced.
Consider the case, shown in fig. 15, in which multiple RTC cores in an RTC processor architecture share one cache module and each RTC core needs to read the instructions stored in the cache module quickly. The upper diagram in fig. 15 shows uncompressed first instructions stored in the cache module: the first instructions occupy four lines of cache units, and an RTC core needs four reads to fetch them. The lower diagram in fig. 15 shows compressed first instructions stored in the cache module: the compressed first instructions occupy two lines of cache units, and an RTC core can fetch them in two reads. The read rate of the RTC cores is therefore improved, the probability of collisions when multiple RTC cores access the cache module simultaneously is reduced, more RTC cores can share one cache module, and the number of cache modules is saved. In addition, the restoration manner of the compressed first instructions satisfies the processing requirements of the reading modules of the memories in the memory structure; even though the compressed first instructions are decompressed, the normal operation of the reading modules is not affected, so the processing performance of the reading modules and the normal operation of the service corresponding to the first instructions are guaranteed.
The instruction processing method of the embodiments of the present application has been introduced above, and the embodiments of the present application further provide an instruction processing apparatus corresponding to the method. Fig. 16 is a schematic structural diagram of an instruction processing apparatus according to an embodiment of the present application. The instruction processing apparatus shown in fig. 16 is capable of performing all or part of the operations shown in fig. 3 above based on the modules shown in fig. 16. It should be understood that the apparatus may include more modules than those shown, or omit some of the modules shown; the embodiments of the present application are not limited in this respect. As shown in fig. 16, the apparatus includes:
an acquiring module 1601, configured to acquire a first instruction to be processed;
a determining module 1602, configured to determine a compression level of a memory in a memory structure according to the memory structure and a demand rate for reading the first instruction, the memory structure being configured to store the compressed first instruction;
a compression module 1603, configured to compress the first instruction using the compression hierarchy.
In one possible implementation, the acquiring module 1601 is further configured to acquire at least one of a path attribute of a read path of each memory included in the memory structure or a read efficiency of the memory structure, where the read path of the memory is a path connected between the memory and the reading module, and the reading module is configured to read an instruction stored in the memory and process the read instruction, and the determining module 1602 is further configured to determine a demand rate for reading the first instruction according to at least one of the path attribute or the read efficiency.
In one possible implementation, the determining module 1602 is configured to determine a read rate of each memory included in the memory structure, determine a restore rate of each memory according to a demand rate for reading the first instruction and the read rate of each memory, where the restore rate of each memory is a rate at which instructions stored in the memory are decompressed on a read path of the memory, and determine a compression level of each memory according to the restore rate of each memory, where a compression rate corresponding to the compression level of the memory matches a corresponding restore rate of the memory.
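As a rough illustration of this matching, the sketch below assumes that the restoration budget of a memory is the slack between its read rate and the demand rate, and picks the deepest compression level whose restoration cost still fits within that slack. The level table, the rate values and the slack formula are assumptions for the example only; the application only requires that the compression rate of the chosen level match the restoration rate the read path can afford.

```python
# Hypothetical sketch of how a determining module might map a memory's restoration
# budget to a compression level.

# Levels ordered from shallow to deep: (name, compression rate, restore cost per read).
LEVELS = [("none", 1.0, 0.0), ("one-layer", 1.5, 0.2), ("two-layer", 2.5, 0.6)]

def pick_compression_level(read_rate, demand_rate):
    """Choose the deepest level whose decompression still lets the read path
    deliver instructions at the demand rate."""
    slack = read_rate - demand_rate                 # spare throughput on the read path
    best = LEVELS[0][0]
    for name, _ratio, restore_cost in LEVELS:
        if restore_cost <= slack:                   # restoration cost fits in the slack
            best = name
    return best

# A read path with little slack tolerates only one-layer compression (cache-like);
# a read path with more slack can absorb two-layer compression (memory-module-like).
print(pick_compression_level(read_rate=2.0, demand_rate=1.75))   # -> 'one-layer'
print(pick_compression_level(read_rate=2.5, demand_rate=1.75))   # -> 'two-layer'
```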
In one possible implementation, the memory structure includes a memory module and a cache module, the compression level of the memory module is two-layer compression, the compression level of the cache module is one-layer compression, and the restoration rate of the one-layer compression is higher than the restoration rate of the two-layer compression. The compression module 1603 is configured to compress the first instruction using the compression manner of the one-layer compression to obtain the second instruction, and compress the second instruction using the compression manner of the two-layer compression to obtain the third instruction, where the compression rate of the compression manner of the two-layer compression is higher than the compression rate of the compression manner of the one-layer compression.
In one possible implementation, the ratio between the bit width of the first instruction and the bit width of the second instruction is equal to the compression ratio, the obtaining module 1601 is further configured to obtain a service characteristic of the first instruction, where the service characteristic indicates an operation quality requirement of a service operated based on the first instruction, and the determining module 1602 is further configured to determine the compression ratio according to the service characteristic of the first instruction.
In one possible implementation, the compression module 1603 is configured to obtain an instruction characteristic of the first instruction; compress the first instruction by fixed-length compression to obtain the second instruction if it is determined, according to the instruction characteristic, that fixed-length compression is supported; or split the first instruction into a plurality of second instructions with the same bit width if it is determined, according to the instruction characteristic, that fixed-length compression is not supported.
In one possible implementation, the instruction characteristic includes at least one of an instruction type, a frequency of use, or a field utilization of the first instruction.
In one possible implementation, the compression module 1603 is configured to determine a valid field of the first instruction, and obtain a second instruction including the valid field, where the valid field is a field used to execute the first instruction.
In one possible implementation, the second instruction includes a compression flag bit, the compression flag bit indicating that the second instruction is obtained by way of fixed-length compression or splitting.
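A possible shape of this one-layer compression step is sketched below. The field-utilization threshold, the all-zero test used to mark invalid fields, the concrete field values and the flag values (1 for fixed-length compression, 0 for splitting) are assumptions for the example only.

```python
# Illustrative sketch of one-layer compression: decide between fixed-length
# compression and splitting from the instruction characteristic, keep only the
# valid fields in the first case, and prepend the compression flag bit.

def one_layer_compress(first_instr, target_width):
    """first_instr: dict of field name -> bit string; returns a list of second
    instructions, each prefixed with its compression flag bit."""
    valid = {k: v for k, v in first_instr.items() if set(v) != {"0"}}  # assumed validity test
    total_bits = sum(len(v) for v in first_instr.values())
    utilization = sum(len(v) for v in valid.values()) / total_bits
    payload = "".join(valid.values())

    if utilization <= 0.5 and len(payload) <= target_width:
        return ["1" + payload.ljust(target_width, "0")]     # flag 1: fixed-length compression

    # Otherwise split the full instruction into equal-width second instructions.
    word = "".join(first_instr.values())
    half = len(word) // 2
    return ["0" + word[:half], "0" + word[half:]]           # flag 0: split

sparse = {"cond": "0000", "opcode": "1101", "rd": "0010", "imm": "00000000"}
dense  = {"cond": "1110", "opcode": "1101", "rd": "0010", "imm": "10110111"}
print(one_layer_compress(sparse, target_width=8))   # one fixed-length second instruction
print(one_layer_compress(dense, target_width=8))    # two split second instructions, same width
```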
In one possible implementation, the compression module 1603 is configured to obtain the frequency of occurrence of each field included in the second instruction, and obtain a third instruction according to the frequency of occurrence of each field, where the third instruction includes codes of each field, where the codes of each field are determined based on the frequency of occurrence of each field, and where the length of the codes of each field is inversely proportional to the frequency of occurrence of each field.
In a possible implementation manner, the determining module 1602 is further configured to construct a coding dictionary corresponding to the second instruction according to the occurrence frequency of each field, where the coding dictionary includes each field and the occurrence frequency of each field, and determine the coding of each field according to the coding dictionary.
In one possible implementation, the third instruction includes a coded index, where the coded index is used to find a field corresponding to a code included in the third instruction in decompressing the third instruction.
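The second compression layer can be pictured with the sketch below, which builds the coding dictionary from field frequencies, replaces each field of a second instruction with its code, and attaches a coded index naming the dictionary. The unary code mirrors the decoding sketch earlier and, like the field width and index value, is an assumption for the example only.

```python
# Illustrative sketch of two-layer compression: frequency-ordered dictionary plus
# variable-length codes whose length is inversely related to field frequency.

from collections import Counter

def build_coding_dictionary(second_instrs, field_width=4):
    fields = [i[k:k + field_width] for i in second_instrs for k in range(0, len(i), field_width)]
    # Storage position encodes the frequency order: most frequent field first.
    return [f for f, _ in Counter(fields).most_common()]

def two_layer_compress(second_instr, dictionary, dict_index, field_width=4):
    codes = []
    for k in range(0, len(second_instr), field_width):
        rank = dictionary.index(second_instr[k:k + field_width])
        codes.append("1" * rank + "0")            # shorter code for a more frequent field
    return {"coded_index": dict_index, "codes": "".join(codes)}

batch = ["00001111", "00000101", "00001111"]
dictionary = build_coding_dictionary(batch)        # ['0000', '1111', '0101']
print(two_layer_compress(batch[0], dictionary, dict_index=7))
# {'coded_index': 7, 'codes': '010'}  -- 8 bits reduced to 3 plus the index
```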
In one possible implementation, the apparatus further includes a storage module, configured to acquire the instruction length of the compressed first instruction, determine, according to the instruction length, a storage unit in the memory structure for storing the compressed first instruction, and store the compressed first instruction in the determined storage unit.
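A simple packing policy consistent with this description is sketched below; the row width and the first-fit policy are assumptions, the point being only that variable-length compressed instructions are placed into rows of the memory structure according to their instruction lengths.

```python
# Illustrative sketch of the storage step: pack each compressed first instruction
# into the current row of the memory structure, opening a new row when the
# remaining width is too small.

def pack_into_rows(compressed_instrs, row_width=32):
    """compressed_instrs: list of bit strings of varying length."""
    rows, current = [], ""
    for instr in compressed_instrs:
        if len(current) + len(instr) > row_width:   # does not fit: start a new row
            rows.append(current)
            current = ""
        current += instr
    if current:
        rows.append(current)
    return rows

# After compression, several instructions share one row, so fewer rows (and reads) are needed.
compressed = ["110100101", "0101101100", "111000011101", "100110"]
for i, row in enumerate(pack_into_rows(compressed)):
    print(f"row {i}: {row} ({len(row)} bits used)")
```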
In one possible implementation, the first instruction includes an instruction of a forwarding application.
In one possible implementation, the acquiring module 1601 is configured to acquire at least one instruction included in the instruction bundle, where any instruction in the at least one instruction is a first instruction to be processed, and at least one instruction is an instruction that is executed in parallel.
With the apparatus, the first instruction is compressed, which reduces the size of the storage space it occupies; a memory structure of the same capacity can therefore store more compressed first instructions than uncompressed first instructions, increasing the number of instructions the memory structure can store and reducing its storage overhead. Since the compression of the first instruction is performed based on the compression level of the memory, and the compression level is determined based on the demand rate, the process of restoring the compressed first instruction satisfies the demand rate, and restoring the compressed first instruction does not affect the normal operation of the memories in the memory structure.
The embodiments of the present application further provide another instruction processing apparatus. Fig. 17 is a schematic structural diagram of an instruction processing apparatus according to an embodiment of the present application. The instruction processing apparatus shown in fig. 17 is capable of performing all or part of the operations shown in fig. 9 above based on the modules shown in fig. 17. It should be understood that the apparatus may include more modules than those shown, or omit some of the modules shown; the embodiments of the present application are not limited in this respect. As shown in fig. 17, the apparatus includes:
The acquiring module 1701 is configured to acquire a compressed first instruction, where compression of the first instruction is implemented based on a compression hierarchy, where the compression hierarchy is determined based on a memory structure and a demand rate for reading the first instruction, and the memory structure is configured to store the compressed first instruction;
The restoring module 1702 is configured to restore the compressed first instruction.
In one possible implementation, the memory structure includes a memory module and a cache module, the compression level of the memory module is two-layer compression, the compression level of the cache module is one-layer compression, the restoration rate of the one-layer compression is higher than the restoration rate of the two-layer compression, and the compressed first instruction includes a third instruction. The restoring module 1702 is configured to restore the third instruction using the restoration manner corresponding to the compression manner of the two-layer compression to obtain a second instruction, and restore the second instruction using the restoration manner corresponding to the compression manner of the one-layer compression to obtain the first instruction, where the compression rate of the compression manner of the two-layer compression is greater than the compression rate of the compression manner of the one-layer compression.
In one possible implementation manner, the third instruction includes a code index, the restoring module 1702 is configured to parse the code index included in the third instruction, determine each field corresponding to each code included in the third instruction according to the code index, determine the second instruction according to each field corresponding to each code, and the length of the code of each field is in an inverse proportion relationship with the occurrence frequency of each field.
In one possible implementation, the restoration manner corresponding to the compression manner of the one-layer compression includes decompression corresponding to fixed-length compression, or splicing corresponding to splitting. The restoring module 1702 is configured to, when the second instruction is obtained by fixed-length compression, perform on the second instruction the decompression corresponding to fixed-length compression to obtain the first instruction; or, when the second instruction is obtained by splitting, splice the multiple second instructions obtained by splitting the first instruction to obtain the first instruction, where the bit widths of the second instructions are the same.
In one possible implementation, the second instruction includes a compression flag bit, where the compression flag bit indicates that the second instruction is obtained by fixed-length compression or by splitting, and the restoring module 1702 is further configured to parse the compression flag bit included in the second instruction and determine, according to the compression flag bit, which of fixed-length compression and splitting produced the second instruction.
In one possible implementation, the obtaining module 1701 is configured to extract at least one instruction stored in one row of storage units of the memory structure, and separate the instructions according to the instruction length of each of the at least one instruction, where the at least one instruction includes the compressed first instruction.
In one possible implementation, the apparatus further includes a sending module, configured to determine an instruction bundle corresponding to the first instruction, where the instruction bundle includes at least one instruction, the compression results of the at least one instruction are stored in the same row of storage units of the memory structure, any one of those compression results is a compressed first instruction, and the at least one instruction is executed in parallel; and send the at least one instruction included in the instruction bundle to a decoding module, where the decoding module is configured to parse the at least one instruction.
In one possible implementation, the first instruction includes an instruction of a forwarding application.
With the apparatus, the storage space occupied by the compressed first instruction is smaller than the storage space occupied by the first instruction, and a memory structure of the same memory size can store more compressed first instructions, increasing the number of instructions the memory structure can store. Moreover, the compressed first instruction can be restored, which ensures that subsequent operations based on the first instruction are carried out normally.
It should be understood that, when the apparatus provided in fig. 16 or fig. 17 implements its functions, the division into the above functional modules is merely used as an example for description; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus provided in the foregoing embodiments belongs to the same concept as the corresponding method embodiments; for its specific implementation process, refer to the method embodiments, and details are not described here again.
Referring to fig. 18, fig. 18 shows a schematic structural diagram of a network device 1800 according to an exemplary embodiment of the present application. The network device 1800 shown in fig. 18 is configured to perform the operations related to the instruction processing method shown in fig. 3 or fig. 9 described above. The network device 1800 is, for example, a switch, router, etc., and the network device 1800 may be implemented by a general bus architecture.
As shown in fig. 18, the network device 1800 includes at least one processor 1801, memory 1803, and at least one communication interface 1804.
The processor 1801 is, for example, a general-purpose central processing unit (CPU), a digital signal processor (DSP), a network processor (NP), a graphics processing unit (GPU), a neural-network processing unit (NPU), a data processing unit (DPU), a microprocessor, or one or more integrated circuits for implementing the solutions of the present application. For example, the processor 1801 includes an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The PLD is, for example, a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The processor 1801 may implement or perform the various logical blocks, modules, and circuits described with reference to the disclosure of the embodiments of the present application. The processor may also be a combination implementing a computing function, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Optionally, the network device 1800 further includes a bus. The bus is used to transfer information between the components of the network device 1800. The bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 18, but this does not mean that there is only one bus or only one type of bus.
The memory 1803 is, for example, but not limited to, a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including a compact disc, a laser disc, an optical disc, a digital versatile disc, a Blu-ray disc, and the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 1803 is, for example, independent and connected to the processor 1801 via the bus. The memory 1803 may also be integrated with the processor 1801.
The communication interface 1804 uses any transceiver-type apparatus for communicating with another device or a communication network, such as an Ethernet, a radio access network (RAN), or a wireless local area network (WLAN). The communication interface 1804 may include a wired communication interface, and may further include a wireless communication interface. Specifically, the communication interface 1804 may be an Ethernet interface, a fast Ethernet (FE) interface, a gigabit Ethernet (GE) interface, an asynchronous transfer mode (ATM) interface, a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof. The Ethernet interface may be an optical interface, an electrical interface, or a combination thereof. In this embodiment of the present application, the communication interface 1804 may be used by the network device 1800 to communicate with another device.
In a particular implementation, the processor 1801 may include one or more CPUs, such as CPU0 and CPU1 shown in fig. 18, as one embodiment. Each of these processors may be a single-core (single-CPU) processor or may be a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In a particular implementation, as one embodiment, the network device 1800 may include multiple processors, such as processor 1801 and processor 1805 shown in fig. 18. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In a specific implementation, in an embodiment, the network device 1800 may further include an output device and an input device. The output device communicates with the processor 1801 and can display information in multiple manners. For example, the output device may be a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device communicates with the processor 1801 and can receive user input in multiple manners. For example, the input device may be a mouse, a keyboard, a touchscreen device, or a sensing device.
In some embodiments, the memory 1803 is used to store program code 1810 for performing aspects of the present application, and the processor 1801 may execute the program code 1810 stored in the memory 1803. That is, the network device 1800 may implement the instruction processing method provided by the method embodiment through the processor 1801 and the program code 1810 in the memory 1803. One or more software modules may be included in program code 1810. Optionally, the processor 1801 itself may store program code or instructions for performing the inventive arrangements.
In particular embodiments, network device 1800 of embodiments of the present application may correspond to a computing device in each of the method embodiments described above.
The steps of the instruction processing method shown in fig. 3 or fig. 9 may be completed by an integrated logic circuit of hardware in the processor of the network device 1800 or by instructions in the form of software. The steps of the method disclosed with reference to the embodiments of the present application may be directly performed by a hardware processor, or performed by a combination of hardware and software modules in the processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing method in combination with its hardware. Details are not described here again to avoid repetition.
Referring to fig. 19, fig. 19 is a schematic structural diagram of a network device 1900 according to another exemplary embodiment of the present application, where the network device 1900 shown in fig. 19 is configured to perform all or part of the operations related to the instruction processing method shown in fig. 3 or fig. 9. The network device 1900 is, for example, a switch, a router, or the like, and the network device 1900 may be implemented by a general bus architecture.
As shown in fig. 19, network device 1900 includes a main control board 1910 and an interface board 1930.
The main control board is also called a main processing unit (MPU) or a route processor card. The main control board 1910 is used to control and manage various components in the network device 1900, including route computation, device management, device maintenance, and protocol processing functions. The main control board 1910 includes a central processor 1911 and a memory 1912.
Interface board 1930 is also referred to as a line processing unit (LPU), a line card, or a service board. The interface board 1930 is used to provide various service interfaces and to implement forwarding of data packets. The service interfaces include, but are not limited to, an Ethernet interface, such as a flexible Ethernet clients (FlexE Clients) interface, and a packet over SONET/SDH (POS) interface. The interface board 1930 includes a central processor 1931, a network processor 1932, a forwarding table entry memory 1934, and a physical interface card (PIC) 1933.
The central processor 1931 on the interface board 1930 is used to control and manage the interface board 1930 and communicate with the central processor 1911 on the main control board 1910.
The network processor 1932 is configured to implement forwarding processing of packets. The network processor 1932 may be in the form of a forwarding chip. The forwarding chip may be a network processor (NP). In some embodiments, the forwarding chip may be implemented by an application-specific integrated circuit (ASIC) or a field programmable gate array (FPGA). Specifically, the network processor 1932 is configured to forward a received packet based on the forwarding table stored in the forwarding table entry memory 1934: if the destination address of the packet is the address of the network device 1900, the packet is sent to a CPU (for example, the central processor 1931) for processing; if the destination address of the packet is not the address of the network device 1900, the next hop and the egress interface corresponding to the destination address are found in the forwarding table according to the destination address, and the packet is forwarded to the egress interface corresponding to the destination address. The processing of an uplink packet may include processing at the packet ingress interface and forwarding table lookup, and the processing of a downlink packet may include forwarding table lookup and the like. In some embodiments, the central processor may also perform the function of the forwarding chip, for example, implement software forwarding based on a general-purpose CPU, so that no forwarding chip is needed on the interface board.
The physical interface card 1933 is used to implement the docking function of the physical layer, from which the original traffic enters the interface board 1930, and from which processed messages are sent out. A physical interface card 1933, also referred to as a daughter card, may be mounted on interface board 1930 and is responsible for converting the optical-electrical signals into messages, performing validity checks on the messages, and forwarding the messages to network processor 1932 for processing. In some embodiments, central processor 1931 may also perform the functions of network processor 1932, such as implementing software forwarding based on a general purpose CPU, such that network processor 1932 is not required in physical interface card 1933.
Optionally, network device 1900 includes multiple interface boards, e.g., network device 1900 also includes an interface board 1940, interface board 1940 including a central processor 1941, a network processor 1942, a forwarding table entry memory 1942, and a physical interface card 1943. The function and implementation of the components in interface board 1940 are the same as or similar to interface board 1930 and are not described in detail herein.
Optionally, network device 1900 also includes a switch fabric 1920. The switch fabric 1920 may also be referred to as a switch fabric unit (switch fabric unit, SFU). In the case of network device 1900 having multiple interface boards, switch fabric 1920 is configured to perform data exchanges between the interface boards. For example, communication between interface board 1930 and interface board 1940 may be through switch mesh panel 1920.
The main control board 1910 is coupled to the interface boards. For example, the main control board 1910, the interface board 1930, the interface board 1940 and the switch fabric board 1920 are connected to the system backplane through a system bus to implement interworking. In one possible implementation, an inter-process communication (IPC) channel is established between the main control board 1910 and the interface boards 1930 and 1940, and the main control board 1910 communicates with the interface boards 1930 and 1940 through the IPC channel.
Logically, network device 1900 includes a control plane that includes a main control board 1910 and a central processor 1911, and a forwarding plane that includes various components that perform forwarding, such as a forwarding table entry memory 1934, a physical interface card 1933, and a network processor 1932. The control plane performs the functions of a router, generating a forwarding table, processing signaling and protocol messages, configuring and maintaining the state of the network device, and the like, and issues the generated forwarding table to the forwarding plane, and at the forwarding plane, the network processor 1932 performs table lookup forwarding on the messages received by the physical interface card 1933 based on the forwarding table issued by the control plane. The forwarding table issued by the control plane may be stored in forwarding table entry memory 1934. In some embodiments, the control plane and the forwarding plane may be completely separate and not on the same network device.
It should be noted that there may be one or more main control boards; when there are multiple, they may include an active main control board and a standby main control board. There may be one or more interface boards; the stronger the data processing capability of the network device, the more interface boards it provides. There may also be one or more physical interface cards on an interface board. There may be no switch fabric board, or there may be one or more switch fabric boards; when there are multiple, they can jointly implement load sharing and redundancy backup. Under a centralized forwarding architecture, the network device may not need a switch fabric board, and the interface board bears the processing of the service data of the entire system. Under a distributed forwarding architecture, the network device may have at least one switch fabric board, through which data exchange between multiple interface boards is implemented, providing high-capacity data exchange and processing capability. Therefore, the data access and processing capability of a network device with the distributed architecture is greater than that of a network device with the centralized architecture. Alternatively, the network device may have only one board, that is, there is no switch fabric board and the functions of the interface board and the main control board are integrated on that board; in this case, the central processor on the interface board and the central processor on the main control board may be combined into one central processor on that board to perform the functions of the two after they are stacked. This form of network device has low data exchange and processing capability (for example, a low-end switch or router). Which architecture is used depends on the specific networking deployment scenario and is not limited in any way herein.
In a specific embodiment, the network device 1900 corresponds to the instruction processing apparatus shown in fig. 16 or fig. 17 above. In some embodiments, the compression module 1603 in the instruction processing apparatus shown in fig. 16 corresponds to the central processor 1911 or the network processor 1932 in the network device 1900.
The embodiment of the application also provides a communication device which comprises a transceiver, a memory and a processor. The transceiver, the memory and the processor are in communication with each other through an internal connection path, the memory is used for storing instructions, the processor is used for executing the instructions stored in the memory to control the transceiver to receive signals and control the transceiver to transmit signals, and when the processor executes the instructions stored in the memory, the processor is caused to execute an instruction processing method.
It is to be appreciated that the processor described above may be a CPU, but may also be another general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor. It is noted that the processor may be a processor supporting the advanced RISC machines (ARM) architecture.
Further, in an alternative embodiment, the memory may include read only memory and random access memory, and provide instructions and data to the processor. The memory may also include non-volatile random access memory. For example, the memory may also store information of the device type.
The memory may be a volatile memory or a nonvolatile memory, or may include both a volatile memory and a nonvolatile memory. The nonvolatile memory may be a ROM, a programmable ROM (PROM), an erasable programmable ROM (EPROM), an EEPROM, or a flash memory. The volatile memory may be a RAM, which is used as an external cache. By way of example, and not limitation, many forms of RAM are available, for example, a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchlink dynamic random access memory (SLDRAM), and a direct rambus random access memory (DR RAM).
The embodiment of the application also provides an instruction processing device, which comprises a processor, wherein the processor is used for loading and running at least one instruction, so that the instruction processing device can realize the instruction processing method shown in any one of fig. 3 or fig. 9. Optionally, the device further comprises a memory coupled to the processor, the memory for storing at least one instruction.
The embodiment of the application also provides a computer readable storage medium, wherein at least one instruction is stored in the storage medium, and the instruction is loaded and executed by a processor, so that the computer realizes the instruction processing method shown in any one of fig. 3 or fig. 9.
The present application also provides a computer program (product) which, when executed by a computer, can cause a processor or a computer to perform the respective steps and/or flows corresponding to the above-described method embodiments.
The embodiment of the application also provides a chip, which comprises a processor and is used for calling and running the instructions stored in the memory from the memory, so that the communication equipment provided with the chip executes the instruction processing method shown in any one of fig. 3 or 9.
The embodiment of the application also provides another chip, which comprises an input interface, an output interface, a processor and a memory, wherein the input interface, the output interface, the processor and the memory are connected through an internal connection path, the processor is used for executing codes in the memory, and when the codes are executed, the processor is used for executing an instruction processing method shown in any one of fig. 3 or 9.
In the above embodiments, the implementation may be entirely or partly realized by software, hardware, firmware, or any combination thereof. When software is used, the implementation may be entirely or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the present application are generated entirely or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, digital subscriber line) or wireless (for example, infrared, radio, microwave) manner. The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
It should be noted that, the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals related to the present application are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of the related data is required to comply with the relevant laws and regulations and standards of the relevant countries and regions. For example, the first instruction and the like involved in the present application are all acquired with sufficient authorization.
Those of ordinary skill in the art will appreciate that the various method steps and modules described in connection with the embodiments disclosed herein may be implemented as software, hardware, firmware, or any combination thereof, and that the steps and components of the various embodiments have been generally described in terms of functionality in the foregoing description to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Those of ordinary skill in the art may implement the described functionality using different approaches for each particular application, but such implementation is not considered to be beyond the scope of the present application.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the above storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer program instructions. By way of example, the methods of embodiments of the present application may be described in the context of machine-executable instructions, such as program modules, being included in devices on a real or virtual processor of a target. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. In various embodiments, the functionality of the program modules may be combined or split between described program modules. Machine-executable instructions for program modules may be executed within local or distributed devices. In a distributed device, program modules may be located in both local and remote memory storage media.
Computer program code for carrying out methods of embodiments of the present application may be written in one or more programming languages. These computer program code may be provided to a processor of a general purpose computer, special purpose computer, or other programmable instruction processing apparatus such that the program code, when executed by the computer or other programmable instruction processing apparatus, causes the functions/operations specified in the flowchart and/or block diagram block or blocks to be implemented. The program code may execute entirely on the computer, partly on the computer, as a stand-alone software package, partly on the computer and partly on a remote computer or entirely on the remote computer or server.
In the context of embodiments of the present application, computer program code or related data may be carried by any suitable carrier to enable an apparatus, device or processor to perform the various processes and operations described above. Examples of carriers include signals, computer readable media, and the like.
Examples of signals may include electrical, optical, radio, acoustical or other form of propagated signals, such as carrier waves, infrared signals, etc.
A machine-readable medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination thereof. More detailed examples of a machine-readable storage medium include an electrical connection with one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical storage device, a magnetic storage device, or any suitable combination thereof.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system, apparatus and module may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described device embodiments are merely illustrative, e.g., the division of the modules is merely a logical function division, and there may be additional divisions of actual implementation, e.g., multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not performed. In addition, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices, or modules, or may be an electrical, mechanical, or other form of connection.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the embodiment of the application.
In addition, each functional module in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application is essentially or a part contributing to the prior art, or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method in the various embodiments of the present application. The storage medium includes a U disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The terms "first," "second," and the like in this disclosure are used for distinguishing between similar elements or items having substantially the same function and function, and it should be understood that there is no logical or chronological dependency between the terms "first," "second," and "n," and that there is no limitation on the amount and order of execution. It will be further understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another element. For example, a first image may be referred to as a second image, and similarly, a second image may be referred to as a first image, without departing from the scope of the various described examples. The first image and the second image may both be images, and in some cases may be separate and distinct images.
It should also be understood that, in the embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
The term "at least one" in the present application means one or more, and the term "plurality" in the present application means two or more, for example, a plurality of second messages means two or more second messages. The terms "system" and "network" are often used interchangeably herein.
It is to be understood that the terminology used in the description of the various examples described herein is for the purpose of describing particular examples only and is not intended to be limiting. As used in the description of the various described examples and in the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It will also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. The term "and/or" is an association relation describing an associated object, and indicates that three kinds of relations may exist, for example, a and/or B, and may indicate that a exists alone, a and B exist together, and B exists alone. In the present application, the character "/" generally indicates that the front and rear related objects are an or relationship.
It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the terms "if" and "if" may be interpreted to mean "when" ("white" or "upon") or "in response to a determination" or "in response to detection. Similarly, the phrase "if determined" or "if [ a stated condition or event ] is detected" may be interpreted to mean "upon determination" or "in response to determination" or "upon detection of [ a stated condition or event ] or" in response to detection of [ a stated condition or event ] "depending on the context.
It should be appreciated that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
It should be further appreciated that reference throughout this specification to "one embodiment," "an embodiment," "one possible implementation" means that a particular feature, structure, or characteristic described in connection with the embodiment or implementation is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment," "one possible implementation" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Claims (29)

1. A method of instruction processing, the method comprising:
Acquiring a first instruction to be processed;
determining a compression level of a memory in a memory structure according to the memory structure and a demand rate for reading the first instruction, wherein the memory structure is used for storing the compressed first instruction;
and compressing the first instruction using the compression level.
2. The method of claim 1, wherein before the determining a compression level of a memory in the memory structure according to the memory structure and the demand rate for reading the first instruction, the method further comprises:
acquiring at least one of a path attribute of a reading path of each memory included in the memory structure or a reading efficiency of the memory structure, wherein the reading path of a memory is a path connecting the memory and a reading module, and the reading module is configured to read instructions stored in the memory and process the read instructions;
and determining the demand rate for reading the first instruction according to at least one of the path attribute or the reading efficiency.
3. The method of claim 1 or 2, wherein determining the compression level of memory in the memory structure based on the memory structure and a rate of demand for reading the first instruction comprises:
Determining a read rate of each memory included in the memory structure;
Determining a restore rate of each memory according to a demand rate of reading the first instruction and a read rate of each memory, wherein the restore rate of each memory is a rate of decompressing instructions stored in the memory on a read path of the memory;
and determining the compression level of each memory according to the restore rate of each memory, wherein the compression rate corresponding to the compression level of a memory matches the restore rate corresponding to that memory.
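As an illustrative sketch of the selection logic in claims 1 to 3 — the level names, the numeric rates, and the min() bottleneck rule below are assumptions for illustration, not taken from the application — a compression level could be chosen per memory so that its restore rate keeps up with the rate at which instructions must be read:

    # Hypothetical candidate levels: (name, compression rate, restore rate in instructions/cycle).
    COMPRESSION_LEVELS = [
        ("two-layer", 4.0, 1.0),   # deeper compression, slower to restore
        ("one-layer", 2.0, 4.0),   # lighter compression, faster to restore
        ("none",      1.0, 8.0),   # no compression
    ]

    def pick_level(demand_rate: float, read_rate: float) -> str:
        """Return the deepest level whose restore rate still meets the need.

        Assumed policy: the restore step must serve whichever of the demand rate
        and the memory's own read rate is the tighter constraint."""
        need = min(demand_rate, read_rate)
        for name, _compression_rate, restore_rate in COMPRESSION_LEVELS:
            if restore_rate >= need:
                return name
        return "none"

    # A slow memory module tolerates deeper compression than a fast cache module.
    print(pick_level(demand_rate=4.0, read_rate=1.0))   # -> two-layer
    print(pick_level(demand_rate=4.0, read_rate=8.0))   # -> one-layer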
4. The method according to any one of claims 1-3, wherein the memory structure comprises a memory module and a cache module, the compression level of the memory module being two-layer compression, the compression level of the cache module being one-layer compression, and the one-layer compression having a higher restore rate than the two-layer compression;
the compressing the first instruction using the compression level comprises:
Compressing the first instruction in the one-layer compression manner to obtain a second instruction;
And compressing the second instruction in the two-layer compression manner to obtain a third instruction, wherein the compression rate of the two-layer compression manner is greater than the compression rate of the one-layer compression manner.
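A minimal sketch of the two-stage flow in claim 4, using stand-ins for the two manners: fixed-length truncation for the one-layer stage and generic entropy coding (zlib) for the two-layer stage. Both stand-ins, and the 8-byte valid length, are assumptions for illustration only:

    import zlib

    def one_layer_compress(first_instr: bytes, valid_len: int = 8) -> bytes:
        # Stand-in for the one-layer manner: keep only the leading valid bytes.
        return first_instr[:valid_len]

    def two_layer_compress(second_instr: bytes) -> bytes:
        # Stand-in for the two-layer manner: generic entropy coding.
        return zlib.compress(second_instr)

    def compress_for_memory_structure(first_instr: bytes) -> dict:
        second = one_layer_compress(first_instr)   # fast to restore -> cache module
        third = two_layer_compress(second)         # higher compression -> memory module
        return {"cache_module": second, "memory_module": third}

    print(compress_for_memory_structure(bytes(16)))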
5. The method of claim 4, wherein a ratio between a bit width of the first instruction and a bit width of the second instruction is equal to a compression ratio, and before the compressing the first instruction in the one-layer compression manner to obtain the second instruction, the method further comprises:
Acquiring service characteristics of the first instruction, wherein the service characteristics indicate the operation quality requirement of a service operated based on the first instruction;
And determining the compression ratio according to the service characteristics of the first instruction.
6. The method according to claim 4 or 5, wherein the compressing the first instruction in the one-layer compression manner to obtain a second instruction comprises:
Acquiring instruction characteristics of the first instruction;
And in the case that the instruction characteristics indicate that fixed-length compression is supported, compressing the first instruction by fixed-length compression to obtain the second instruction, or, in the case that the instruction characteristics indicate that fixed-length compression is not supported, splitting the first instruction into a plurality of second instructions, wherein the second instructions in the plurality of second instructions have the same bit width.
7. The method of claim 6, wherein the instruction characteristic comprises at least one of an instruction type, a frequency of use, or a field utilization of the first instruction.
8. The method according to claim 6 or 7, wherein compressing the first instruction by fixed-length compression to obtain the second instruction includes:
And determining a valid field of the first instruction to obtain a second instruction comprising the valid field, wherein the valid field is a field used for running the first instruction.
9. The method of any of claims 4-8, wherein the second instruction includes a compression flag bit indicating that the second instruction is obtained by fixed-length compression or splitting.
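An illustrative sketch covering claims 6 to 9: keep only the valid fields when fixed-length compression is supported, otherwise split the instruction into pieces of equal width, and carry a compression flag bit either way. The field layout, the piece width, and the 0/1 flag values are assumptions, not taken from the application:

    FLAG_FIXED = 0   # second instruction produced by fixed-length compression
    FLAG_SPLIT = 1   # second instruction produced by splitting

    def one_layer(first_instr: bytes, valid_field_slices, supports_fixed_length: bool,
                  piece_width: int = 8):
        """Return a list of (compression flag bit, payload) second instructions."""
        if supports_fixed_length:
            # Fixed-length compression: keep only the fields used to run the instruction.
            payload = b"".join(first_instr[s] for s in valid_field_slices)
            return [(FLAG_FIXED, payload)]
        # Splitting: cut the instruction into pieces of identical width.
        return [(FLAG_SPLIT, first_instr[i:i + piece_width])
                for i in range(0, len(first_instr), piece_width)]

    # A 16-byte instruction whose opcode and operand fields fit into 8 bytes.
    instr = bytes(range(16))
    print(one_layer(instr, [slice(0, 4), slice(8, 12)], supports_fixed_length=True))
    print(one_layer(instr, [], supports_fixed_length=False))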
10. The method according to any one of claims 4-9, wherein compressing the second instruction by using the two-layer compression method to obtain a third instruction includes:
Acquiring the occurrence frequency of each field included in the second instruction;
And acquiring a third instruction according to the occurrence frequency of each field, wherein the third instruction comprises a code of each field, the code of each field is determined based on the occurrence frequency of the field, and the length of the code of each field is inversely proportional to the occurrence frequency of the field.
11. The method of claim 10, wherein before the acquiring a third instruction according to the occurrence frequency of each field, the method further comprises:
Constructing a coding dictionary corresponding to the second instruction according to the occurrence frequency of each field, wherein the coding dictionary comprises each field and the occurrence frequency of each field;
and determining the codes of the fields according to the coding dictionary.
12. The method of claim 10 or 11, wherein the third instruction comprises a coding index, and the coding index is used for looking up, during decompression of the third instruction, the field corresponding to each code included in the third instruction.
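A minimal sketch of the frequency-driven coding of claims 10 to 12. The application does not name a particular code; Huffman coding is used below purely as one way to make code length inversely proportional to field frequency, and the coding index is modeled as a plain field-to-code dictionary:

    import heapq
    from collections import Counter
    from itertools import count

    def build_coding_dictionary(fields):
        """Map each field to a prefix-free bit string; rarer fields get longer codes."""
        freq = Counter(fields)
        if len(freq) == 1:                       # degenerate case: one distinct field
            return {next(iter(freq)): "0"}
        order = count()                          # tie-breaker so the heap never compares fields
        heap = [(f, next(order), field, None, None) for field, f in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            left = heapq.heappop(heap)
            right = heapq.heappop(heap)
            heapq.heappush(heap, (left[0] + right[0], next(order), None, left, right))
        codes = {}
        def walk(node, prefix):
            _, _, field, left, right = node
            if field is not None:
                codes[field] = prefix or "0"
            else:
                walk(left, prefix + "0")
                walk(right, prefix + "1")
        walk(heap[0], "")
        return codes

    def two_layer_encode(second_instr_fields):
        coding_index = build_coding_dictionary(second_instr_fields)   # the coding dictionary
        bits = "".join(coding_index[f] for f in second_instr_fields)
        # The coding index travels with the third instruction so the fields
        # can be looked back up during decompression (claim 12).
        return {"bits": bits, "coding_index": coding_index}

    print(two_layer_encode(["LD", "R1", "R1", "R1", "IMM"]))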
13. The method of any of claims 1-12, wherein after the compressing the first instruction using the compression level, the method further comprises:
Acquiring the instruction length of a compressed first instruction;
and determining a storage unit for storing the compressed first instruction in the memory structure according to the instruction length, and storing the compressed first instruction in the determined storage unit.
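A sketch of the length-driven placement in claim 13, modelling the memory structure as fixed-width rows of storage units; the 128-bit row width and the first-fit policy are assumptions for illustration:

    ROW_WIDTH_BITS = 128   # hypothetical width of one row of storage units

    def store_compressed(rows, instr_length_bits: int):
        """First-fit placement: record (offset, length) so the instruction
        can be located and separated again when the row is read back."""
        for row in rows:
            if row["used"] + instr_length_bits <= ROW_WIDTH_BITS:
                offset = row["used"]
                row["entries"].append((offset, instr_length_bits))
                row["used"] += instr_length_bits
                return offset
        rows.append({"used": instr_length_bits, "entries": [(0, instr_length_bits)]})
        return 0

    rows = []
    for length in (48, 64, 32):    # 48 + 64 fit in one row; 32 starts the next
        store_compressed(rows, length)
    print([row["entries"] for row in rows])   # [[(0, 48), (48, 64)], [(0, 32)]]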
14. The method of any of claims 1-13, wherein the first instruction comprises an instruction of a forwarding application.
15. The method according to any one of claims 1-14, wherein the obtaining the first instruction to be processed comprises:
acquiring at least one instruction included in an instruction bundle, wherein any one of the at least one instruction is the first instruction to be processed, and the at least one instruction is an instruction executed in parallel.
16. A method of instruction processing, the method comprising:
acquiring a compressed first instruction, wherein the compression of the first instruction is implemented based on a compression level, the compression level is determined based on a memory structure and a demand rate for reading the first instruction, and the memory structure is used for storing the compressed first instruction;
And restoring the compressed first instruction.
17. The method of claim 16, wherein the memory structure comprises a memory module and a cache module, the compression level of the memory module being two-layer compression, the compression level of the cache module being one-layer compression, the one-layer compression having a higher restore rate than the two-layer compression, and the compressed first instruction comprising a third instruction;
the restoring the compressed first instruction comprises:
Restoring the third instruction in a restore manner corresponding to the two-layer compression to obtain a second instruction;
And restoring the second instruction in a restore manner corresponding to the one-layer compression to obtain the first instruction, wherein the compression rate of the two-layer compression is greater than the compression rate of the one-layer compression.
18. The method of claim 17, wherein the third instruction comprises a coding index, and the restoring the third instruction in the restore manner corresponding to the two-layer compression to obtain a second instruction comprises:
parsing the coding index included in the third instruction;
And determining, according to the coding index, the field corresponding to each code included in the third instruction, and determining the second instruction according to the fields corresponding to the codes, wherein the length of the code of each field is inversely proportional to the occurrence frequency of the field.
19. The method according to claim 17 or 18, wherein the restore manner corresponding to the one-layer compression comprises decompression corresponding to fixed-length compression or splicing corresponding to splitting, and the restoring the second instruction in the restore manner corresponding to the one-layer compression to obtain the first instruction comprises:
In the case that the second instruction is obtained by fixed-length compression, applying the decompression corresponding to the fixed-length compression to the second instruction to obtain the first instruction;
Or, in the case that the second instructions are obtained by splitting, splicing a plurality of second instructions obtained by splitting the first instruction to obtain the first instruction, wherein the second instructions in the plurality of second instructions have the same bit width.
20. The method of claim 19, wherein the second instruction comprises a compression flag bit, the compression flag bit indicating that the second instruction is obtained by fixed-length compression or by splitting, and before the restoring the second instruction in the restore manner corresponding to the one-layer compression, the method further comprises:
parsing the compression flag bit included in the second instruction, and determining, according to the compression flag bit, whether the second instruction was obtained by fixed-length compression or by splitting.
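An illustrative sketch of the restore side described in claims 17 to 20: undo the frequency coding by walking the bit string against the coding index, then either decompress a fixed-length-compressed second instruction or splice split pieces back together. The flag values and the zero-padding used as the fixed-length inverse are assumptions; a real decompressor would re-insert each valid field at its format-defined position:

    def restore_third_instruction(bits: str, coding_index: dict) -> list:
        """Two-layer restore (claim 18): look each prefix-free code up in the coding index."""
        reverse = {code: field for field, code in coding_index.items()}
        fields, buffer = [], ""
        for bit in bits:
            buffer += bit
            if buffer in reverse:
                fields.append(reverse[buffer])
                buffer = ""
        return fields                      # the fields of the second instruction

    def restore_second_instructions(flagged_pieces, full_width: int = 16) -> bytes:
        """One-layer restore (claims 19-20): branch on the compression flag bit."""
        flag, payload = flagged_pieces[0]
        if flag == 0:                      # obtained by fixed-length compression
            # Hypothetical inverse of keeping only the valid fields: re-pad to the original width.
            return payload.ljust(full_width, b"\x00")
        # Obtained by splitting: splice the equal-width pieces back in order.
        return b"".join(piece for _flag, piece in flagged_pieces)

    print(restore_third_instruction("0101110", {"R1": "0", "LD": "10", "IMM": "111"}))
    print(restore_second_instructions([(1, b"\x01\x02"), (1, b"\x03\x04")]))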
21. The method of any of claims 16-20, wherein the fetching the compressed first instruction comprises:
Fetching at least one instruction stored in a row of storage units of the memory structure;
And separating each instruction according to the instruction length of each instruction in the at least one instruction to obtain the at least one instruction, wherein the at least one instruction comprises the compressed first instruction.
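A small sketch of the separation step in claim 21, assuming the per-instruction lengths are known (for example, from placement records like those in the claim 13 sketch above); the bit-string representation is an assumption:

    def separate_row(row_bits: str, instruction_lengths) -> list:
        """Cut one fetched row of storage units back into individual compressed instructions."""
        instructions, offset = [], 0
        for length in instruction_lengths:
            instructions.append(row_bits[offset:offset + length])
            offset += length
        return instructions

    print(separate_row("101001110011", [4, 3, 5]))   # -> ['1010', '011', '10011']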
22. The method of any of claims 16-21, wherein after the restoring the compressed first instruction, further comprising:
determining an instruction bundle corresponding to the first instruction, wherein the instruction bundle comprises at least one instruction, compression results of the at least one instruction are stored in the same row of storage units of the memory structure, a compression result of any one of the at least one instruction is the compressed first instruction, and the at least one instruction is an instruction executed in parallel;
and sending at least one instruction included in the instruction bundle to a decoding module, wherein the decoding module is used for analyzing the at least one instruction.
23. The method of any of claims 16-22, wherein the first instruction comprises an instruction of a forwarding application.
24. An instruction processing apparatus, the apparatus comprising:
The acquisition module is used for acquiring a first instruction to be processed;
The determining module is used for determining the compression level of the memory in the memory structure according to the memory structure and the demand rate for reading the first instruction, and the memory structure is used for storing the compressed first instruction;
and the compression module is used for compressing the first instruction using the compression level.
25. An instruction processing apparatus, the apparatus comprising:
The acquisition module is used for acquiring a compressed first instruction, wherein the compression of the first instruction is implemented based on a compression level, the compression level is determined based on a memory structure and a demand rate for reading the first instruction, and the memory structure is used for storing the compressed first instruction;
And the restoring module is used for restoring the compressed first instruction.
26. An instruction processing apparatus, characterized in that the apparatus comprises a processor for loading and executing at least one instruction to cause the instruction processing apparatus to implement the instruction processing method of any of claims 1-15 or to implement the instruction processing method of any of claims 16-23.
27. A computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the instruction processing method of any of claims 1-15 or to implement the instruction processing method of any of claims 16-23.
28. A chip comprising a processor for executing program instructions or code to cause a device comprising the chip to perform the instruction processing method of any one of claims 1-15 or to perform the instruction processing method of any one of claims 16-23.
29. A computer program product, characterized in that the computer program product comprises a computer program/instructions which, when executed by a processor, cause a computer to perform the instruction processing method of any of claims 1-15 or the instruction processing method of any of claims 16-23.
CN202410133420.0A 2024-01-30 2024-01-30 Instruction processing method, device, equipment and computer-readable storage medium Pending CN120407017A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202410133420.0A CN120407017A (en) 2024-01-30 2024-01-30 Instruction processing method, device, equipment and computer-readable storage medium
PCT/CN2025/073216 WO2025162018A1 (en) 2024-01-30 2025-01-20 Instruction processing method and apparatus, and device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410133420.0A CN120407017A (en) 2024-01-30 2024-01-30 Instruction processing method, device, equipment and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN120407017A true CN120407017A (en) 2025-08-01

Family

ID=96510951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410133420.0A Pending CN120407017A (en) 2024-01-30 2024-01-30 Instruction processing method, device, equipment and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN120407017A (en)
WO (1) WO2025162018A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6691305B1 (en) * 1999-11-10 2004-02-10 Nec Corporation Object code compression using different schemes for different instruction types
CN109709837A (en) * 2018-11-23 2019-05-03 上海琪埔维半导体有限公司 A kind of data processing method suitable for singlechip chip
US10795825B2 (en) * 2018-12-26 2020-10-06 Advanced Micro Devices, Inc. Compressing data for storage in cache memories in a hierarchy of cache memories
CN112748863B (en) * 2019-10-31 2024-04-19 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for processing data
US12106104B2 (en) * 2020-12-23 2024-10-01 Intel Corporation Processor instructions for data compression and decompression

Also Published As

Publication number Publication date
WO2025162018A1 (en) 2025-08-07

Similar Documents

Publication Publication Date Title
KR102499335B1 (en) Neural network data processing apparatus, method and electronic device
US20230281385A1 (en) Fpga-based fast protocol decoding method, apparatus, and device
US10419022B2 (en) Run-length base-delta encoding for high-speed compression
CN111723059B (en) Data compression method and device, terminal equipment and storage medium
CN108287877B (en) A kind of RIB rendering compressed file FPGA compression/decompression system and hardware decompression method
US20190042354A1 (en) Technologies for error recovery in compressed data streams
US20190052553A1 (en) Architectures and methods for deep packet inspection using alphabet and bitmap-based compression
US20240152290A1 (en) Data writing method, data reading method, apparatus, device, system, and medium
CN109937537A (en) Coding Variable length symbol is to realize parallel decoding
US8457419B2 (en) Method of decoding entropy-encoded data
US20250023942A1 (en) Information processing method, computer device and storage medium
CN118337629A (en) Data transmission system, method, device, equipment and medium
CN120407017A (en) Instruction processing method, device, equipment and computer-readable storage medium
CN118944672A (en) Data compression unit, data decompression unit, processor and related method
WO2021237513A1 (en) Data compression storage system and method, processor, and computer storage medium
CN105727556A (en) Image drawing method, related equipment and system
WO2020108158A1 (en) Instruction data processing method and apparatus, and device and storage medium
US20150055868A1 (en) Character data processing method, information processing method, and information processing apparatus
CN113992610B (en) Data packet header processing circuit system supporting network processor and control method thereof
CN117850661A (en) Method, apparatus and computer program product for processing compressed data
CN116016943A (en) Data decoding method and device, electronic equipment and storage medium
CN104518850B (en) Reference template is synchronized to the method and information processing system of data flow
US20250240274A1 (en) Rule Processing Method and Apparatus, Rule Search Method and Apparatus, Device, and Readable Storage Medium
CN114244912A (en) Data transmission method and device, computer equipment and storage medium
CN117194355B (en) Database-based data processing method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication