
CN101349969B - Apparatus and system for executing pop comparison microinstruction in microprocessor - Google Patents

Apparatus and system for executing pop comparison microinstruction in microprocessor

Info

Publication number
CN101349969B
Authority
CN
China
Prior art keywords
order
micro
operand
layer
logic device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CN2008101338414A
Other languages
Chinese (zh)
Other versions
CN101349969A (en
Inventor
葛拉.M.柯尔
G.葛兰.亨利
泰瑞.派克斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INTELLIGENCE FIRST CO
Original Assignee
INTELLIGENCE FIRST CO
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INTELLIGENCE FIRST CO
Priority to CN2008101338414A
Publication of CN101349969A
Application granted
Publication of CN101349969B
Anticipated expiration
Expired - Lifetime

Landscapes

  • Executing Machine-Instructions (AREA)

Abstract

The invention provides an apparatus and a system for executing a pop-compare microinstruction in a microprocessor, in order to perform a pop-compare operation. The apparatus, used in a microprocessor, comprises paired-operation translation logic, load logic, and execution logic. The paired-operation translation logic receives a macro instruction specifying a pop-compare operation and generates a pop-compare microinstruction. The pop-compare microinstruction directs pipeline stages of the microprocessor to perform the pop-compare operation. The load logic is coupled to the paired-operation translation logic; it receives the pop-compare microinstruction and reads a first operand from an address in memory, where the address is specified by a register and the register is specified by the pop-compare microinstruction. The execution logic is coupled to the load logic; it receives the first operand and compares the first operand with a second operand.

Description

Apparatus and system for executing a pop-compare microinstruction in a microprocessor
This application is a divisional of the invention (or utility model) patent application filed on April 1, 2004, with application number 200410032141.8, entitled "Apparatus and system for executing a pop-compare microinstruction in a microprocessor".
Technical field
The present invention relates to the field of microelectronics, and in particular to an apparatus and system for efficiently performing string scan and string compare operations in a pipeline microprocessor.
Background technology
Early microprocessors could execute only one instruction at a time. Each instruction was fetched from memory, and all functions specified by that instruction were carried out by the functional units of the microprocessor until they were complete. The instruction was then retired, and the next instruction was fetched from memory and executed.
Although program execution in early microprocessors was easy to understand, it was also quite slow. Since then, microprocessor designers have continually revised the microprocessor architecture to improve instruction execution speed, or throughput. More recently, pipeline architectures have become a prevalent means of increasing instruction throughput. A pipeline architecture breaks the functional units of a microprocessor down into a series of successive operations, much like the stages of an assembly line. Within an application program, it is therefore quite possible (and, from a throughput standpoint, highly desirable) for one stage of the microprocessor to be performing an operation specified by a first instruction while the stage immediately preceding it performs another operation specified by a second instruction that follows the first. The microprocessor reaches its most efficient throughput when all pipeline stages are performing operations. Inefficiency arises when a particular pipeline stage takes too long to perform its specified operation. In that case, a stall signal is sent to the preceding pipeline stages, forcing them to suspend operation until the particular stage has completed its function.
Pipeline architectures have developed to the point where the operations specified by many program instructions (also called macro instructions) can be completed in a single pass through the pipeline. For example, a register add operation fetches two register operands simultaneously from the register file in a register stage, adds them in a subsequent execute stage to produce a result, and finally, in a write-back stage following the execute stage, writes the result back to a result register. A single instruction that performs a register add can therefore be configured to move through successive pipeline stages in synchronization with a pipeline clock, and the net effect from the user's point of view is that the register add executes in a single pipeline cycle.
Although the operations specified by many macro instructions can be completed in a single pass through the pipeline, many other instructions specify operations so complex that they cannot be completed in a single pass. String compare instructions, such as a scan string instruction or a compare string instruction, are macro instructions of this type. Such an instruction indirectly specifies the locations of one or two operands that must be read from data memory and compared, either with each other or with a third operand held in an internal register, to produce a comparison result. An operation of this kind is called a load-compare operation. Most modern microprocessors, however, have one particular pipeline stage that can either 1) access an operand in memory or 2) perform an arithmetic or logical computation using operands provided to it. The two kinds of operation therefore cannot be performed in that stage during the same pipeline cycle, and a load-compare operation must be carried out as two sub-operations. First, the operand must be read from memory. Next, the operand that was read must be compared to produce a result. Consequently, fetching of subsequent instructions must stall while the operand is read from memory (the first sub-operation), and fetching resumes only when the compare operation (the second sub-operation) is performed.
From a throughput standpoint, stalling the pipeline for one or more cycles is disadvantageous. A single load-compare operation causes at least one pipeline stall. The penalty is especially severe when a string compare macro instruction is repeated many times, as is common in many application programs, because the stalls caused by each repetition of the string compare operation accumulate with the specified repeat count.
In a pipeline microprocessor, any operation that requires multiple pipeline cycles to complete leaves the pipeline stages running inefficiently. When this inefficiency is compounded by the repetition described above, the execution speed of the microprocessor degrades. What is needed, therefore, is an apparatus and system in a microprocessor that allow a load-compare operation to be completed in a single pipeline cycle.
Summary of the invention
Among other applications, the present invention is directed to solving these and other problems and disadvantages of the prior art.
Specifically, according to one aspect of the invention, an apparatus in a microprocessor for performing a repeated pop-compare operation is provided, comprising: paired-operation translation logic, which receives a macro instruction specifying the repeated pop-compare operation and generates a pop microinstruction, a pop-compare microinstruction, and a decrement microinstruction, wherein the pop microinstruction directs pipeline stages in the microprocessor to perform a pop operation, the pop-compare microinstruction directs pipeline stages in the microprocessor to perform the pop-compare operation, and the decrement microinstruction decrements a count register; load logic, coupled to the paired-operation translation logic, which receives the pop-compare microinstruction and reads a second operand from an address in memory, wherein the address is specified by the contents of a second register, and the second register is specified by the pop-compare microinstruction; and execution logic, coupled to the load logic, which receives the second operand and compares it with a first operand, the first operand having been read by the pop microinstruction among the translated microinstructions, wherein the load logic is included in a first stage of the pipeline, the execution logic is included in a second stage of the pipeline, the second stage follows the first stage, and the pop-compare microinstruction proceeds through one of the first stage and the second stage in a single pipeline cycle.
According to another aspect of the invention, an apparatus in a microprocessor for executing a scan string instruction or a compare string instruction is provided, comprising: a paired-operation translator, configured to translate the scan string instruction or the compare string instruction into a plurality of corresponding microinstructions, wherein the corresponding microinstructions include a pop microinstruction and a pop-compare microinstruction, the pop microinstruction directs pipeline stages in the microprocessor to perform a pop operation, and the pop-compare microinstruction directs the microprocessor to perform two operations, which are performed by two successive stages of the microprocessor, the two successive stages comprising: a load stage, configured to perform a first of the two operations, the first operation comprising reading a second operand from a location in memory; and an execute stage, coupled to the load stage and configured to perform a second of the two operations, the second operation comprising receiving the second operand and comparing it with a first operand to produce a result, the first operand having been read by the pop microinstruction among the translated microinstructions, wherein, in the microprocessor, the execute stage follows the load stage, the pop-compare microinstruction proceeds through one of the load stage and the execute stage in a single pipeline cycle, and the execute stage increments or decrements the contents of a register so that the contents point to a next address in the memory, the next address storing a next operand to be compared by the execute stage during a next pop-compare operation.
According to another aspect of the invention, a system in a pipeline microprocessor for performing a repeated pop-compare operation is provided, comprising: paired-operation translation logic, which receives and translates a macro instruction specifying the repeated pop-compare operation and generates a pop microinstruction, a pop-compare microinstruction, and a decrement microinstruction; and load logic, configured to receive the pop-compare microinstruction, read a second operand from a memory location, and pass the second operand to execution logic in a subsequent stage of the pipeline microprocessor, wherein the execution logic compares the second operand with a first operand in a single pipeline cycle and produces a comparison result, the first operand having been read by the pop microinstruction among the translated microinstructions, and wherein the pop-compare microinstruction proceeds through the load logic in a single pipeline cycle.
The present invention proposes a technique for performing load-compare and pop-compare operations in a pipeline microprocessor. The invention proposes a microprocessor apparatus for performing a pop-compare operation. The apparatus comprises paired-operation translation logic, load logic, and execution logic. The paired-operation translation logic receives a macro instruction specifying a pop-compare operation and generates a pop-compare microinstruction. The pop-compare microinstruction directs pipeline stages in the microprocessor to perform the pop-compare operation. The load logic is coupled to the paired-operation translation logic; it receives the pop-compare microinstruction and reads a first operand from an address in memory, where the address is specified by the contents of a register and the register is specified by the pop-compare microinstruction. The execution logic is coupled to the load logic; it receives the first operand and compares the first operand with a second operand.
The pop-compare microinstruction directs the load logic to read the first operand and directs the execution logic to compare the first operand with the second operand. The load logic comprises a first stage of the pipeline and the execution logic comprises a second stage of the pipeline, the second stage following the first stage. The pop-compare microinstruction proceeds through one of the first stage and the second stage in a single pipeline cycle. The first stage passes the first operand and the pop-compare microinstruction to the second stage, which compares the first operand with the second operand. The execution logic updates a flag register according to the result of comparing the first operand with the second operand. The flag register comprises a zero flag, a carry flag, and an overflow flag. At the same time, the execution logic increments or decrements the contents of the register so that the contents point to the next address in memory, whose operand the execution logic compares during the next pop-compare operation.
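The division of work between the load logic and the execution logic can be sketched as a toy functional model. This is only an illustration under stated assumptions, not the patented circuit: the dictionary-based memory and register file, the function names `load_stage`, `execute_stage`, and `pop_compare`, the microinstruction fields `ptr` and `src`, the 4-byte operand size, and the choice to compute first operand minus second operand are all hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000

def load_stage(uop, regs, mem):
    """First stage: read the first operand from the address held in the
    register named by the microinstruction (hypothetical field 'ptr')."""
    return mem[regs[uop["ptr"]]]

def execute_stage(uop, first, regs, size=4):
    """Second stage: compare, update flags, and step the pointer register."""
    second = regs[uop["src"]]            # hypothetical field naming the second operand
    result = (first - second) & MASK32   # assumed direction: first minus second
    regs["ZF"] = result == 0             # zero flag
    regs["CF"] = first < second          # carry (borrow) flag for unsigned subtraction
    regs["OF"] = bool((first ^ second) & (first ^ result) & SIGN32)  # signed overflow
    # Step the pointer so it addresses the next operand; DF selects the direction.
    regs[uop["ptr"]] += -size if regs["DF"] else size
    return result

def pop_compare(uop, regs, mem):
    """One pop-compare microinstruction: load in one stage, compare in the next."""
    first = load_stage(uop, regs, mem)
    return execute_stage(uop, first, regs)
```

For example, with `regs = {"EDI": 0x100, "EAX": 5, "DF": False}` and `mem = {0x100: 5, 0x104: 7}`, calling `pop_compare({"ptr": "EDI", "src": "EAX"}, regs, mem)` sets the zero flag and advances `EDI` to `0x104`, ready for the next pop-compare.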
In another aspect, the invention proposes an apparatus in a microprocessor for executing a scan string instruction or a compare string instruction. The apparatus has a paired-operation translator configured to translate the scan string instruction or the compare string instruction into corresponding microinstructions. The corresponding microinstructions include a pop-compare microinstruction, which directs the microprocessor to perform two operations that are carried out by two successive stages of the microprocessor. The two successive stages are a load stage and an execute stage. The load stage performs the first of the two operations, which comprises reading a first operand from a location in memory. The execute stage is coupled to the load stage and performs the second of the two operations, which comprises receiving the first operand and comparing it with a second operand to produce a result.
In the microprocessor, the execute stage follows the load stage. The pop-compare microinstruction proceeds through one of the load stage and the execute stage in a single pipeline cycle. The load stage passes the first operand and the pop-compare microinstruction to the execute stage, which compares the first operand with the second operand. The execute stage updates a flag register according to the result. At the same time, the execute stage increments or decrements the contents of a register so that the contents point to the next address in memory, which stores the next operand to be compared by the execute stage during the next pop-compare operation.
In another aspect, the invention proposes a system in a pipeline microprocessor for performing a pop-compare operation. The system has translation logic and load logic. The translation logic receives a macro instruction and translates it into a corresponding pop-compare microinstruction. The load logic receives the pop-compare microinstruction, reads a first operand from a memory location, and passes the first operand to execution logic in a subsequent stage of the pipeline microprocessor. The execution logic compares the first operand with a second operand in a single pipeline cycle and produces a comparison result.
The pop-compare microinstruction directs the load logic to read the first operand and directs the execution logic to compare the first operand with the second operand. The pop-compare microinstruction proceeds through the load stage in a single pipeline cycle.
Description of drawings
Fig. 1 is a block diagram of the pipeline stages of a related-art pipeline microprocessor;
Fig. 2 is a table illustrating execution of a repeated string compare operation by the microprocessor of Fig. 1;
Fig. 3 is a block diagram of a microprocessor 300 according to the present invention for performing paired load-compare or pop-compare operations; and
Fig. 4 is a table illustrating execution of a repeated pop-compare operation by the microprocessor of Fig. 3.
Wherein, description of reference numerals is as follows:
100, 300: microprocessor
101: fetch stage
102: translate stage
103: register stage
104: address stage
105: data/ALU stage
106: write-back stage
107, 108: registers
109: external instruction memory
110: external data memory
200, 400: table
301: fetch logic
302: external memory or instruction cache
303: instruction buffer
304: paired-operation translation logic
305: microinstruction queue
306: queue buffer
307: register file
308, 321: internal registers
309, 319: microinstruction buffer
310, 311: operand registers
312: load logic
313: data memory
314: microinstruction buffer
315: operand 1 buffer
316: read-operand buffer
317: operand 2 buffer
318: execution logic
320: result register
322: flag register
Embodiment
The following description is presented in the context of a particular embodiment and its requirements, to enable one of ordinary skill in the art to make use of the present invention. Various modifications to the preferred embodiment will nevertheless be apparent to those skilled in the art, and the general principles discussed here may also be applied to other embodiments.
In view of the background discussion above on performing load-compare operations in existing pipeline microprocessors, a prior-art example will now be discussed with reference to Figs. 1 and 2. It clearly illustrates the limitations of a conventional pipeline architecture that prevent efficient execution of load-compare and pop-compare operations. The present invention is then described with reference to Figs. 3 and 4. In the microprocessor proposed by the invention, the number of microinstructions (and corresponding pipeline cycles) required to perform a pop-compare operation is reduced by half compared with the number required to implement a scan string macro instruction, and by one third compared with the number required to implement a compare string instruction.
Fig. 1 is a block diagram of the stages of a related-art pipeline microprocessor 100. Microprocessor 100 comprises a fetch stage 101, a translate stage 102, a register stage 103, an address stage 104, a data/ALU stage 105, and a write-back stage 106.
In operation, fetch stage 101 reads macro instructions from external instruction memory 109 for execution by microprocessor 100. Translate stage 102 translates the fetched macro instructions into corresponding microinstructions. Register stage 103 reads operands specified by the microinstructions from registers 107 in a register file, for use by subsequent stages 104-106 of the pipeline. Address stage 104 generates memory addresses specified by the microinstructions for data store and load operations. Data/ALU stage 105 either performs arithmetic, logical, or other specified operations, using operands read from registers 107 to produce results, or accesses external data memory 110, using the addresses generated by address stage 104 to store or read memory operands. Data/ALU stage 105 uses registers 108 (namely, registers T1 and T2) to hold operands, provides results in a result register RESLT, and additionally updates a flag register FLAGS to indicate certain properties of the contents of RESLT (for example, a zero flag indicates a zero result). Write-back stage 106 updates registers 107 in the register file with results produced in data/ALU stage 105 or obtained from data memory 110. Microinstructions proceed in sequence through each successive stage of the pipeline in synchronization with a pipeline clock signal (not shown). For optimum pipeline performance, while a given microinstruction executes in a given pipeline stage, the preceding microinstruction should be executing in the following pipeline stage and the following microinstruction should be executing in the preceding pipeline stage. In other words, during any given pipeline cycle, every stage 101-106 of microprocessor 100 should be performing its designed function; no stage should be idle.
Optimum pipeline throughput is hard to achieve, however, because many operations specified by macro instructions are so complex that they must be broken down into two or more sub-operations, each specified by a corresponding microinstruction. When a macro instruction of this type reaches translate stage 102, translate stage 102 generates the microinstructions that accomplish the complex operation, but the pipeline stages ahead of translate stage 102 must stall while it does so. If the operation specified by a macro instruction can be carried out by a single microinstruction that moves through the pipeline unimpeded, the operation is called a single-cycle operation. If a particular operation requires three microinstructions, it is called a 3-cycle operation. Clearly, it is beneficial to reduce the number of microinstructions required to carry out the operation specified by a macro instruction.
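The cost of extra microinstructions can be illustrated with the standard count for a stall-free pipeline, in which the first microinstruction needs one cycle per stage and each additional microinstruction completes one cycle later. The sketch below is a simplification that treats every microinstruction as passing through all six stages of Fig. 1; the function name `cycles_to_complete` is hypothetical.

```python
def cycles_to_complete(num_stages: int, num_microinstructions: int) -> int:
    """Cycles for a stall-free pipeline to finish a run of microinstructions.

    The first microinstruction takes num_stages cycles to leave the last
    stage; each later one completes one cycle after its predecessor.
    """
    if num_microinstructions == 0:
        return 0
    return num_stages + num_microinstructions - 1

STAGES = 6  # fetch, translate, register, address, data/ALU, write-back

single_cycle_op = cycles_to_complete(STAGES, 1)  # operation needing one microinstruction
three_cycle_op = cycles_to_complete(STAGES, 3)   # operation needing three microinstructions
extra_cycles = three_cycle_op - single_cycle_op  # cycles added by the two extra microinstructions
```

Under these assumptions, every microinstruction removed from a sequence saves one pipeline cycle, which is why the repeat count of a string instruction multiplies the benefit.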
There are many reasons why an operation or function specified by a macro instruction may be considered complex. One particular reason, which the present invention addresses, stems from the architectural layout, or allocation, of logic functions in existing pipeline microprocessors. For example, if a macro instruction specifies an operation that requires more than one mutually exclusive sub-operation to be performed in a particular pipeline stage 101-106, the macro instruction must be broken down into corresponding microinstructions, each specifying one of the mutually exclusive sub-operations. A first microinstruction directs the particular pipeline stage 101-106 to perform the first mutually exclusive sub-operation. A second microinstruction, following the first, directs the particular pipeline stage 101-106 to perform the second mutually exclusive sub-operation. Further microinstructions continue to be generated until all of the mutually exclusive sub-operations have been performed.
A specific example of this multiple-microinstruction problem will now be described with reference to Fig. 2.
Fig. 2 is a table 200 showing execution of a repeated compare string instruction by the microprocessor 100 of Fig. 1. Table 200 has seven columns: one for each of the pipeline stages 101-106 discussed with reference to Fig. 1, plus a cycle column, which shows the pipeline cycle in which an instruction passes through each stage 101-106 of microprocessor 100. The numbers in the cycle column denote successive pipeline cycles, each corresponding to a specific number of cycles of the pipeline clock signal (also called the core clock signal, not shown) in microprocessor 100. Those skilled in the art will appreciate that, in an existing pipeline microprocessor 100, one pipeline cycle generally corresponds to one core clock cycle. Instructions of interest to this discussion are shown explicitly in the fetch through write-back columns; preceding and following instructions unrelated to the discussion are shown as "---". Stalls in the microprocessor pipeline are marked "STALL". A stall occurs in a pipeline cycle when the corresponding pipeline stage 101-106 cannot perform its specified function.
The example of Fig. 2 is described the restriction of microprocessor 100 commonly used, and the demand of its a plurality of micro-orders of can deriving is to finish the sub-computing of relevant mutual exclusion of the specified computing of a specific macro instruction (being CMPSD).The compare string string macro instruction of this specific macro instruction (CMPSD) for being produced according to the framework routine that meets the x86 compatible microprocessors, and, before the CMPSD macro instruction, add the preamble REPE (repeat if equal) of x86 " then repeating " if equate in order to explain orally.So this example is selected routine of x86 for use, be because x86-compatible microprocessors and relevant macro instruction thereof are that industry is known, yet those skilled in the art can recognize, below the problem discussed also spread all over microprocessor architecture design in other non-x86.
During cycle 1, logic in fetch stage 101 of microprocessor 100 reads, or fetches, a compare string macro instruction with a repeat prefix (REPE.CMPSD) from instruction memory 109. Those skilled in the art will appreciate that the term "memory" 109, as used here in the context of a pipeline microprocessor system, applies to any form of programmable or intermediate program storage medium, including disk, read-only memory (ROM), random-access memory (RAM), and off-chip and on-chip caches. The macro opcode (CMPSD) directs microprocessor 100 to compare two operands, both located in data memory 110. The first operand is at a first operand address specified by the contents of register ESI 107, and the second operand is at a second operand address specified by the contents of register EDI 107. Register ECX 107 specifies the number of times the comparison is to be repeated. At the end of the first iteration, after the first and second operands have been compared, REPE.CMPSD directs microprocessor 100 to update bits in flag register 108 to indicate attributes of the comparison result, such as whether it equals 0 (indicated by a zero flag, not shown, in flag register FLAGS 108) or whether it produced a carry (indicated by a carry flag, not shown, in flag register FLAGS 108). In this particular example, the REPE prefix checks the state of the zero flag, and the repeated operation terminates if the zero flag indicates that the two compared operands are not equal. The repeated compare string instruction is therefore very useful, because it lets a programmer direct the microprocessor to compare two regions of memory 110 and quickly determine whether those regions contain identical data.
The repeated compare string instruction REPE.CMPSD compares blocks of data memory 110 to determine whether they are equal. The repeated scan string instruction REPE.SCASD instead directs microprocessor 100 to scan a block of memory 110, whose start address is specified by register ESI 107 and whose scan count is specified by register ECX 107, comparing the contents of the block with the contents of register EAX 107. Thus a compare string operation compares two operands located in memory 110, whereas a scan string operation compares an operand in memory 110 with the contents of register EAX 107. After the first comparison, the contents of the string pointer registers (ESI 107 for a scan string operation; ESI 107 and EDI 107 for a compare string operation) are incremented or decremented by a number of bytes, so that they point to the next data item in each string for the next iteration of the string compare or string scan. In an x86-compatible microprocessor, a direction flag (not shown) in FLAGS 108 determines whether the string pointer registers are incremented or decremented. In addition, the contents of register ECX are decremented after each iteration. A repeated string scan or string compare terminates when the count register (ECX 107) has been decremented to zero or when the repeat condition is no longer satisfied. In the example of Fig. 2, the repeated string compare terminates if the zero flag indicates that the two compared operands are not equal.
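The architectural behaviour of the repeated compare string instruction described above can be sketched as a short functional model. It is a minimal illustration under stated assumptions, not a cycle-accurate or complete x86 model: memory is a hypothetical dictionary mapping addresses to doubleword values, only the zero and direction flags are modelled, and the names `Regs` and `repe_cmpsd` are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Regs:
    esi: int            # address of the next first operand
    edi: int            # address of the next second operand
    ecx: int            # remaining repeat count
    zf: bool = False    # zero flag: True when the last pair compared equal
    df: bool = False    # direction flag: False increments, True decrements

def repe_cmpsd(mem, regs, size=4):
    """Compare doublewords at [ESI] and [EDI] until ECX is zero or a pair differs."""
    step = -size if regs.df else size
    while regs.ecx != 0:
        first = mem[regs.esi]
        second = mem[regs.edi]
        regs.zf = first == second   # comparison result recorded in the zero flag
        regs.esi += step            # point both string pointers at the next items
        regs.edi += step
        regs.ecx -= 1               # one repetition consumed
        if not regs.zf:             # REPE: stop as soon as the operands differ
            break
    return regs
```

After the call, `regs.zf` being true with `regs.ecx == 0` means the two regions matched over the whole count, while a false `regs.zf` means the pair just before the current pointer positions differed.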
During cycle 2, the repeat compare string macro instruction (REPE.CMPSD) proceeds through translate stage 102, where it is translated into a repeating sequence of four micro instructions. The first micro instruction (POP T1, [ESI]) directs microprocessor 100 to: 1) read from register ESI 107 the first address of the first operand in data memory 110; 2) increment or decrement the contents of ESI 107 so that it points to the next first operand, to be used in the next iteration of the comparison; 3) read the first operand from data memory 110 using the address provided by ESI; and 4) store the first operand so read in register T1 108. The second micro instruction (POP T2, [EDI]) directs microprocessor 100 to: 1) read from register EDI 107 the second address of the second operand in data memory 110; 2) increment or decrement the contents of EDI 107 so that it points to the next second operand, to be used in the next iteration of the comparison; 3) read the second operand from data memory 110 using the address provided by EDI; and 4) store the second operand so read in register T2 108. The third micro instruction of the repeating sequence (CMP T2, T1) directs microprocessor 100 to compare the first operand stored in T1 with the second operand stored in T2, to generate a comparison result in result register RESLT 108, and to update flags register FLAGS 108 based on attributes of the result. In an x86-compatible processor, the comparison result is generated by subtracting the contents of T2 from the contents of T1. Those skilled in the art will appreciate, however, that other techniques can be used to compare two operands; for example, the contents of T1 can be subtracted from the contents of T2. The fourth micro instruction (DEC ECX) directs microprocessor 100 to decrement count register ECX 107, which completes execution of one pass through the repeating sequence.

Thus, to perform one iteration of a string compare operation, the first operand is fetched from data memory 110 and stored in register T1 108, the second operand is fetched from data memory 110 and stored in register T2 108, and the contents of T1 108 and T2 108 are then compared. Finally, the count register is decremented. The repeating sequence continues until translate stage 102 receives a signal (not shown) from data/ALU stage 105 indicating either that the repeat condition is no longer true (in this example, that the zero flag shows the contents of RESLT 108 to be nonzero) or that the specified number of iterations has been completed.
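One pass through the four-micro-instruction sequence described above can be sketched as follows. This is an illustrative Python model only: the register dictionary, the helper `load32`, the function name `conventional_iteration`, and the borrow-style carry flag are assumptions introduced for the example; the subtraction direction (T1 minus T2) follows the x86-compatible case mentioned in the text.

```python
MASK32 = 0xFFFFFFFF

def load32(memory, address):
    """Read one little-endian double word from a flat memory image (assumed layout)."""
    return int.from_bytes(memory[address:address + 4], "little")

def conventional_iteration(regs, memory):
    """Apply POP T1,[ESI]; POP T2,[EDI]; CMP T2,T1; DEC ECX to a register dict."""
    step = -4 if regs["DF"] else 4

    # POP T1, [ESI]: load from the address held in ESI, then advance ESI
    # (the load uses the pre-update address, as in steps 1-4 of the text).
    regs["T1"] = load32(memory, regs["ESI"])
    regs["ESI"] += step

    # POP T2, [EDI]: a second, separate load into a second temporary register.
    regs["T2"] = load32(memory, regs["EDI"])
    regs["EDI"] += step

    # CMP T2, T1: result is T1 - T2; flags reflect attributes of the result.
    regs["RESLT"] = (regs["T1"] - regs["T2"]) & MASK32
    regs["ZF"] = regs["RESLT"] == 0
    regs["CF"] = regs["T1"] < regs["T2"]   # assumed borrow semantics for the carry flag

    # DEC ECX: keep the repeat count.
    regs["ECX"] -= 1
    return regs
```

The point of the sketch is that the load of the second operand and the comparison are two distinct steps, each occupying its own micro instruction.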
Because two mutually exclusive sub-operations must be performed in data/ALU stage 105 to accomplish one complete iteration, the pop-compare operation specified by the repeat string compare macro instruction (REPE.CMPSD) must be decomposed into four micro instructions: POP T1, [ESI]; POP T2, [EDI]; CMP T2, T1; and DEC ECX. As noted in the foregoing discussion of Fig. 1, data/ALU stage 105 either uses operands read from registers 107 to perform an arithmetic, logical, or other specified operation and generate a result, or uses an address generated by address stage 104 to access data memory 110 in order to store or read a memory operand. To accomplish one iteration of the pop-compare operation, data/ALU stage 105 must 1) access data memory 110 to read the second operand, and 2) compare the second operand with the first operand stored in T1 108 to generate a comparison result. Because data/ALU stage 105 can perform only one of these two sub-operations during any given pipeline cycle, two micro instructions are required: POP T2, [EDI] and CMP T2, T1.
Accordingly, during cycle 2, translate stage 102 generates the first micro instruction, POP T1, [ESI]. Also during cycle 2, because translate stage 102 needs additional pipeline cycles to generate the remaining micro instructions of the repeating sequence, a STALL signal is sent to fetch stage 101 to prevent it from fetching subsequent instructions.
During cycle 3, the first micro instruction, POP T1, [ESI], proceeds through register stage 103, where register ESI 107 is accessed to read the first address of the first operand. Also during cycle 3, translate stage 102 generates the second micro instruction, POP T2, [EDI]. In addition, the stall continues during cycle 3 to prevent fetch stage 101 from sending subsequent instructions to translate stage 102.
During cycle 4, POP T1, [ESI] proceeds through address stage 104, where the first address read from ESI 107 during cycle 3 is translated and sent to data memory 110. Those skilled in the art will appreciate that a modern microprocessor 100 often employs a virtual address architecture in which a virtual address must be translated into a physical address in order to access memory 110. Also during cycle 4, POP T2, [EDI] proceeds through register stage 103, where register EDI 107 is accessed to read the second address of the second operand. In addition, during cycle 4, the third micro instruction, CMP T2, T1, is generated by translate stage 102. The stall continues during cycle 4 to prevent fetch stage 101 from sending subsequent instructions to translate stage 102.
During cycle 5, POP T1, [ESI] proceeds through data/ALU stage 105, where the translated first address provided by address stage 104 is used to access a first location in memory 110 and read the first operand for the pop-compare operation. The first operand is stored in register T1 108 so that it can be accessed by subsequent micro instructions. Also during cycle 5, POP T2, [EDI] proceeds through address stage 104, where the second address read from EDI 107 during cycle 4 is translated and sent to data memory 110. In addition, during cycle 5, the third micro instruction, CMP T2, T1, proceeds through register stage 103, where no operation is required. Further, during cycle 5, the fourth micro instruction, DEC ECX, is generated by translate stage 102, completing one pass of the micro instructions for the repeat string compare operation. The stall continues to prevent fetch stage 101 from sending subsequent instructions to translate stage 102.
During cycle 6, POP T1, [ESI] proceeds through write back stage 106, where the incremented or decremented contents of source register ESI 107 are written back to register stage 103, and the first micro instruction, POP T1, [ESI], completes execution. Also during cycle 6, POP T2, [EDI] proceeds through data/ALU stage 105, where the translated second address provided by address stage 104 is used to access a second location in memory 110 and read the second operand for the pop-compare operation. The second operand is stored in register T2 108 so that it can be accessed by subsequent micro instructions. In addition, during cycle 6, the third micro instruction, CMP T2, T1, proceeds through address stage 104, where no operation is required. Further, during cycle 6, the fourth micro instruction, DEC ECX, proceeds through register stage 103, where the contents of count register ECX 107 are read from the register file. Also during cycle 6, translate stage 102 generates the first micro instruction, POP T1, [ESI], for the second iteration of the repeat string compare operation. Those skilled in the art will appreciate that the write to ESI 107 is performed during this cycle before the contents of ESI 107 are read to direct the second iteration of the repeat string compare operation.
During cycle 7, POP T2, [EDI] proceeds through write back stage 106, where the incremented or decremented contents of destination register EDI 107 are written back to register stage 103, and the second micro instruction, POP T2, [EDI], completes execution. Also during cycle 7, CMP T2, T1 proceeds through data/ALU stage 105, where the contents of T2 108 are compared with the contents of T1 108, the result of the comparison is sent to result register RESLT 108, and flags register FLAGS 108 is updated to reflect attributes of RESLT 108. In addition, during cycle 7, the fourth micro instruction, DEC ECX, proceeds through address stage 104, where no operation is required. Further, during cycle 7, translate stage 102 generates the second micro instruction, POP T2, [EDI], for the second iteration of the repeat string compare operation. Those skilled in the art will appreciate that the write to EDI 107 is performed during this cycle before the contents of EDI 107 are read to direct the second iteration of the repeat string compare operation.
During cycle 8, the third micro instruction, CMP T2, T1, proceeds through the write back stage, where no operation is required, and the first iteration of the repeat string compare operation completes.
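The cycle-by-cycle walk-through above can be reproduced mechanically. The sketch below is a simplified Python model under the assumption that the translate stage issues exactly one micro instruction per cycle starting in cycle 2 and that each micro instruction advances one stage per cycle without further stalls; the names `STAGES`, `SEQUENCE`, and `build_schedule` are hypothetical and are not taken from the patent.

```python
# Pipeline stages after fetch, in the order a micro instruction visits them.
STAGES = ["translate", "register", "address", "data/ALU", "write back"]
SEQUENCE = ["POP T1, [ESI]", "POP T2, [EDI]", "CMP T2, T1", "DEC ECX"]

def build_schedule(sequence, iterations, first_cycle=2):
    """Map (cycle, stage) to the micro instruction occupying that stage."""
    schedule = {}
    for iteration in range(iterations):
        for index, op in enumerate(sequence):
            # Cycle in which the translate stage generates this micro instruction.
            issue = first_cycle + iteration * len(sequence) + index
            for depth, stage in enumerate(STAGES):
                schedule[(issue + depth, stage)] = op
    return schedule

schedule = build_schedule(SEQUENCE, iterations=2)
for cycle in range(2, 9):
    row = [schedule.get((cycle, stage), "---") for stage in STAGES]
    print(cycle, row)
```

Under these assumptions the model places CMP T2, T1 in the data/ALU stage in cycle 7 and shows the second iteration's POP T1, [ESI] being generated in cycle 6, which agrees with the description of cycles 6 and 7.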
It is noted that, as described above, four micro instructions are required to accomplish one iteration of the repeat string compare operation. One of them, DEC ECX, is needed to maintain a proper repeat count, while the other three (POP T1, [ESI]; POP T2, [EDI]; and CMP T2, T1) are needed to load two operands from memory 110 and compare them. Specifically, the architecture of microprocessor 100 precludes data/ALU stage 105 from performing both an operand load and a comparison. Consequently, the second operand, stored at the second location pointed to by EDI 107, must first be delivered to register T2 108, and a subsequent instruction, CMP T2, T1, must then be generated to compare the first and second operands.
The present inventors have noted that repeat string compare and repeat string scan operations are widely used in present-day desktop and laptop computer applications. The pipeline inefficiencies discussed above with reference to Figs. 1 and 2 therefore result in slower execution, which is disadvantageous from the user's point of view. In the case of a repeat string scan operation, only one operand is fetched from data memory 110; the other operand is stored in register EAX 107. Hence, for an iteration of a repeat scan string operation, the second micro instruction, POP T2, [EDI], is removed from the sequence, and the CMP T2, T1 micro instruction is replaced by CMP EAX, T1. Even in the case of a repeat scan string operation, however, the architectural problem still prevents the first operand specified by ESI 107 from being loaded from memory 110 and compared with the contents of EAX 107 during the same pipeline cycle. The present invention overcomes these problems. By providing a new allocation of functions and corresponding logic within the microprocessor pipeline, together with a corresponding pop-compare micro instruction that exploits these new pipeline features, the invention (discussed below with reference to Figs. 3 and 4) removes one micro instruction from the micro instruction sequence of a repeated pop-compare operation.
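The relationship between the conventional compare string and scan string sequences can be written out as a simple transformation. This is a hedged Python sketch: micro instructions are represented as plain strings, and the function name `scan_sequence_from_compare` is an assumption made for illustration.

```python
CMPS_SEQUENCE = ["POP T1, [ESI]", "POP T2, [EDI]", "CMP T2, T1", "DEC ECX"]

def scan_sequence_from_compare(compare_sequence):
    """Derive the conventional scan string sequence from the compare string one."""
    derived = []
    for op in compare_sequence:
        if op == "POP T2, [EDI]":
            continue                 # the second operand is already in EAX
        if op == "CMP T2, T1":
            op = "CMP EAX, T1"       # compare against EAX instead of T2
        derived.append(op)
    return derived

print(scan_sequence_from_compare(CMPS_SEQUENCE))
```

Even in the shortened sequence the load (POP T1, [ESI]) and the comparison (CMP EAX, T1) remain separate micro instructions, which is the inefficiency the paragraph identifies.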
Fig. 3 is a block diagram of a microprocessor 300 according to the present invention for performing paired load-compare or pop-compare operations. Microprocessor 300 has a fetch stage comprising fetch logic 301; a translate stage comprising paired operation translation logic 304 and a micro instruction queue 305; a register stage having a register file 307; a load stage comprising load logic 312; and an execute stage comprising execution logic 318. For clarity, stages preceding the load stage (such as the address stage) and stages following the execute stage (such as the write back stage) are not shown. In contrast to the conventional microprocessor 100 discussed with reference to Figs. 1-2, microprocessor 300 according to the present invention allocates the memory read function to a load stage, which loads operands from data memory 313 and which is separate from the execute stage that performs arithmetic and logical functions. Microprocessor 300 also includes paired operation translation logic 304, which takes advantage of the separate load and execute stages to reduce the number of pipeline cycles required to perform paired load-compare and pop-compare functions, such as loading and comparing operands when performing the operation specified by a compare string or scan string macro instruction.
In operation, fetch logic 301 fetches macro instructions from external memory 302 or instruction cache 302 and delivers them to instruction buffer 303. Paired operation translation logic 304 receives the macro instructions from instruction buffer 303 and translates each of them into a corresponding sequence of micro instructions that direct microprocessor 300 to accomplish the operation specified by the macro instruction. Each generated micro instruction is sent in order to a queue buffer 306 in the micro instruction queue.
If a scan string macro instruction or a compare string macro instruction is read from instruction buffer 303 (for purposes of illustration, a repeat scan string macro instruction, REP.SCASD, is shown in buffer 303 as an example), paired operation translation logic 304 translates the macro instruction into a corresponding sequence of micro instructions that accomplish the specified operation. For a single scan string macro instruction or a single compare string macro instruction, one micro instruction in the corresponding sequence is a load-compare micro instruction, LDCMP XX, [ESI], which directs the microprocessor to load an operand from data memory 313 (the address of the operand being pointed to by the contents of register ESI 308, or of register EDI 308 if so specified) and to compare that operand with a second operand stored in an internal register 308, 321. In the case of a single scan string macro instruction, SCASD, the internal register 308 is located in register file 307. In the case of a compare string macro instruction, CMPSD, the internal register 321 is accessed by execution logic 318 and contains the second operand previously read from data memory 313. If a repeat prefix is applied to the scan string macro instruction (REP.SCASD) or to the compare string macro instruction (REP.CMPSD), one micro instruction in the corresponding sequence is a pop-compare micro instruction, POPCMP XX, [ESI], which directs microprocessor 300 to perform the same operation as the load-compare micro instruction described above and, in addition, for each iteration, to increment or decrement the contents of a pointer register (register ESI 308 or EDI 308) and to decrement the contents of a count register ECX 308. In one embodiment, register file logic 307 increments or decrements the above registers 308. In another embodiment, an explicit micro instruction is issued for each specified iteration to increment or decrement the registers 308.
Because the read of memory data (that is, the first operand) and the comparison of the first and second operands in the execute stage are no longer mutually exclusive operations, it is feasible according to the present invention to specify a paired pop-compare operation with a single micro instruction. Indeed, although the pop-compare operation is used herein as the vehicle for describing the invention, those skilled in the art will appreciate that the scope of the invention encompasses any form of paired load-execute function, such as load-arithmetic, load-logical, and load-jump functions, in which an operand required for the computation must be read from memory 313.
The register stage reads micro instructions in order from the queue locations 306 of micro instruction queue 305. If a micro instruction so read specifies access to a register 308 in register file 307, register file 307 accesses the specified register 308 and provides its contents in operand registers 310, 311. In addition, the micro instruction is forwarded to the next pipeline stage in micro instruction buffer 309.
Micro instructions and their associated operands are passed down through subsequent stages until they reach the load stage, where, if so specified by the micro instruction, load logic 312 is used to access data memory 313 to read data. Load logic 312 reads the data from data memory 313 and places it in read operand buffer 316 for access by execution logic 318. Register operands are likewise forwarded to the execute stage in operand 1 buffer 315 and operand 2 buffer 317. In addition, the micro instruction and other associated information are forwarded to the execute stage in micro instruction buffer 314. In the particular case of the pop-compare micro instruction POPCMP according to the present invention, load logic 312 accesses data memory 313 to read the operand for the pop-compare operation and provides the operand in operand 3 buffer 316 for access by the execution logic.
Execution logic 318 receives the micro instruction from micro instruction buffer 314 and the associated operand data from operand buffers 315-317, performs the specified operation, and delivers the result to a result register 320. In addition, the micro instruction and associated information are forwarded to subsequent stages via micro instruction buffer 319. In the particular case of the pop-compare micro instruction POPCMP, execution logic 318 reads the operand from operand 3 buffer 316 and compares it with a second operand, which is provided either 1) by a register 308 of the register file (in the case of a scan string operation), or 2) by register T1 321 (in the case of a compare string operation, where a previous load micro instruction has directed microprocessor 300 to load the second operand from data memory 313 and store it in T1 321). The result of the comparison is delivered to result register 320, and flags register FLAGS 322 is updated to reflect attributes of the result. In one embodiment, the scan string instruction is an x86 scan string macro instruction, and the comparison comprises taking the difference between a first operand stored in architectural register EAX 308 and a second operand read from data memory 313 at the location pointed to by the contents of architectural register EDI 308. In another embodiment, the compare string instruction is an x86 compare string macro instruction, and the comparison comprises taking the difference between a first operand read from data memory 313 at the location pointed to by the contents of architectural register ESI 308 and a second operand read from data memory 313 at the location pointed to by the contents of architectural register EDI 308.
Fig. 4 is a table 400 illustrating execution of a repeat compare string macro instruction by the microprocessor 300 of Fig. 3. Table 400 has seven columns: one for each of the relevant pipeline stages discussed above with reference to Fig. 3, and a cycle column showing the pipeline cycle in which instructions proceed through each pipeline stage of microprocessor 300. Although the address stage is omitted from the block diagram of Fig. 3 for clarity, an address stage column is shown in table 400 to illustrate execution of the relevant micro instructions. Also for clarity, table 400 shows only the pipeline stages of microprocessor 300 up to and including the execute stage. As in the earlier discussion of table 200 of Fig. 2, the instructions of interest are shown explicitly in the fetch through execute columns, while preceding and following instructions that are not relevant to the present invention are shown as "---".
The example of Fig. 4 illustrates how the pipelined microprocessor 300 according to the present invention overcomes the limitations of the conventional microprocessor 100, limitations that give rise to the need for multiple micro instructions to accomplish mutually exclusive sub-operations. In the example of Fig. 4, these sub-operations are associated with the operation specified by a particular x86 repeat compare string macro instruction, REPE.CMPSD; the particular macro instruction is, however, used only as an example to describe the invention. It should be noted that the present invention is applicable to other microprocessor instruction set architectures and to other complex macro instructions in which a computation must be performed after an operand is loaded.
During cycle 1, fetch logic 301 reads, or fetches, a compare string macro instruction with a repeat prefix, REPE.CMPSD, from instruction memory 302. As noted above in connection with Fig. 2, the term "memory" 302, as used in the context of pipelined microprocessing systems, applies to any form of programmable or intermediate program storage medium, including disk, read-only memory (ROM), random-access memory (RAM), and off-chip and on-chip caches. The macro opcode (CMPSD) directs microprocessor 300 to compare two operands located in data memory 313. The first operand is located at a first operand address specified by the contents of register ESI 308, and the second operand is located at a second operand address specified by the contents of register EDI 308. Register ECX 308 specifies the number of times the comparison is to be repeated. At the end of the first iteration, after the first and second operands have been compared, REPE.CMPSD directs microprocessor 300 to update bits in flags register FLAGS 322 to indicate attributes of the comparison result, such as whether the result equals zero (indicated by a zero flag (not shown) in flags register FLAGS 322) or whether the result generates a carry (indicated by a carry flag (not shown) in flags register FLAGS 322). In this particular example, the REPE prefix checks the state of the zero flag and terminates the repetition if the zero flag indicates that the two compared operands are not equal.
The repeat compare string instruction REPE.CMPSD compares blocks of data memory 313 to determine whether they are equal. The repeat scan string instruction REPE.SCASD, by contrast, directs microprocessor 300 to scan a block of data memory 313 whose starting address is specified by register ESI 308 and whose scan count is specified by register ECX 308, comparing the contents of data memory 313 with the contents of register EAX 308. After the first comparison, the contents of the string pointer registers (ESI 308 for a scan string operation; ESI 308 and EDI 308 for a compare string operation) are incremented or decremented by a number of bytes, so that in the next iteration of the string compare or string scan operation they point to the location of the next data item in each string. In an x86-compatible microprocessor, a direction flag (not shown) in FLAGS 322 determines whether the string pointer registers are incremented or decremented. In addition, in an x86-compatible microprocessor, the particular encoding of the scan string or compare string instruction directs the microprocessor to compare bytes, words, or double words, and thereby determines the amount by which the string pointer registers 308 are incremented or decremented. Further, the contents of register ECX are decremented after each iteration. Iteration of the repeat string scan and repeat string compare operations terminates when register ECX 308 is decremented to zero or when the repeat condition is no longer satisfied. In this example, the repeat string compare operation terminates if the zero flag indicates that the two compared operands are not equal.
During cycle 2, the repeat compare string macro instruction (REPE.CMPSD) proceeds through the translate stage, where it is translated into a repeating sequence of three micro instructions. The first micro instruction (POP T1, [ESI]) directs microprocessor 300 to: 1) read from register ESI 308 the first address of the first operand in data memory 313; 2) increment or decrement the contents of ESI 308 so that it points to the next first operand, to be used in the next iteration of the comparison; 3) read the first operand from data memory 313 using the address provided by ESI 308; and 4) store the first operand so read in register T1 321. The second micro instruction (POPCMP [EDI], T1) directs microprocessor 300 to: 1) read from register EDI 308 the second address of the second operand in data memory 313; 2) increment or decrement the contents of EDI 308 so that it points to the next second operand, to be used in the next iteration of the comparison; 3) read the second operand from data memory 313 using the address provided by EDI 308; 4) compare the second operand so read with the first operand stored in register T1 321; and 5) generate a comparison result in result register RESLT 320 and update flags register FLAGS 322 based on attributes of the result. In one embodiment, the difference of the two operands is computed to generate the comparison result. The third micro instruction (DEC ECX) directs microprocessor 300 to decrement count register ECX 308, which completes execution of one pass through the repeating sequence. In one embodiment, the explicit micro instruction DEC ECX is used to direct microprocessor 300 to decrement count register 308. In another embodiment, count register ECX 308 is decremented automatically during each iteration, as directed by the micro opcode POPCMP of the pop-compare micro instruction POPCMP [EDI], T1. The repeating sequence continues until translation logic 304 receives a signal (not shown) from execution logic 318 indicating either that the repeat condition is no longer true (in this example, that the zero flag shows the contents of RESLT 320 to be nonzero) or that the specified number of iterations has been completed.
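The five numbered steps of the pop-compare micro instruction can be sketched as a single function. This Python model is illustrative only: the register dictionary, the helper `load32`, the function names, the `auto_decrement` switch (standing in for the embodiment in which POPCMP itself decrements ECX), and the subtraction direction (T1 minus the loaded operand) are all assumptions rather than details confirmed by the text.

```python
MASK32 = 0xFFFFFFFF

def load32(memory, address):
    """Read one little-endian double word from a flat memory image (assumed layout)."""
    return int.from_bytes(memory[address:address + 4], "little")

def pop_t1(regs, memory):
    """POP T1, [ESI]: load the first operand into T1 and advance ESI."""
    regs["T1"] = load32(memory, regs["ESI"])
    regs["ESI"] += -4 if regs["DF"] else 4

def popcmp_edi_t1(regs, memory, auto_decrement=False):
    """POPCMP [EDI], T1: load the second operand and compare it in one micro instruction."""
    address = regs["EDI"]                            # 1) read the second address
    regs["EDI"] += -4 if regs["DF"] else 4           # 2) point at the next second operand
    second = load32(memory, address)                 # 3) load stage reads the operand
    regs["RESLT"] = (regs["T1"] - second) & MASK32   # 4) execute stage compares
    regs["ZF"] = regs["RESLT"] == 0                  # 5) flags reflect the result
    if auto_decrement:                               # alternative embodiment
        regs["ECX"] -= 1

def pop_compare_iteration(regs, memory):
    """One pass of the three-micro-instruction sequence with an explicit DEC ECX."""
    pop_t1(regs, memory)
    popcmp_edi_t1(regs, memory)
    regs["ECX"] -= 1                                 # DEC ECX
    return regs
```

Note that no T2 temporary appears in this model: the second operand goes straight from the load to the comparison, which is what allows the sequence to shrink from four micro instructions to three.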
In contrast to the conventional microprocessor 100, microprocessor 300 according to the present invention need only decompose the pop-compare operation specified by the repeat string compare macro instruction REPE.CMPSD into three micro instructions: POP T1, [ESI]; POPCMP [EDI], T1; and DEC ECX. Because the load and compare operations are allocated to different stages (the load stage and the execute stage), the two are no longer mutually exclusive. Consequently, a pop-compare operation executed on microprocessor 300 becomes a single-cycle operation. It is called a single-cycle operation because it is accomplished by a single pop-compare micro instruction, POPCMP, which proceeds through each stage of microprocessor 300 in a single pipeline cycle.
Accordingly, during cycle 2, paired operation translator 304 generates the first micro instruction, POP T1, [ESI]. Also during cycle 2, because translation logic 304 needs additional pipeline cycles to generate the remaining micro instructions of the repeating sequence, a STALL signal is sent to fetch logic 301 to prevent it from fetching subsequent instructions.
During cycle 3, the first micro instruction, POP T1, [ESI], proceeds through the register stage, where register ESI 308 is accessed to read the first address of the first operand. Also during cycle 3, the translate stage generates the pop-compare micro instruction POPCMP [EDI], T1. In addition, the stall continues during cycle 3 to prevent fetch logic 301 from sending subsequent instructions to translator 304.
During cycle 4, POP T1, [ESI] proceeds through address stage 104 (not shown in Fig. 3), where the first address read from ESI 308 during cycle 3 is translated and sent to data memory 313, in a manner similar to the operation described in the example of Fig. 2. Also during cycle 4, the pop-compare micro instruction POPCMP [EDI], T1 proceeds through the register stage, where register EDI 308 is accessed to read the second address of the second operand. In addition, during cycle 4, the third micro instruction, DEC ECX, is generated by paired operation translation logic 304. The stall continues during cycle 4 to prevent fetch logic 301 from sending subsequent instructions to translation logic 304.
During cycle 5, POP T1, [ESI] proceeds through the load stage, where the translated first address provided by address stage 104 is used to access a first location in data memory 313 and read the first operand for the pop-compare operation; the first operand is forwarded to the execute stage in operand 3 buffer 316. Also during cycle 5, POPCMP [EDI], T1 proceeds through address stage 104, where the second address read from EDI 308 during cycle 4 is translated and sent to data memory 313. In addition, during cycle 5, the third micro instruction, DEC ECX, proceeds through the register stage, where the contents of register ECX 308 are read from register file 307 and delivered to one of operand registers 310, 311. In one embodiment, the contents of ECX 308 are sent to execution logic 318 to be decremented and written back to register ECX 308. In another embodiment, register logic 307 is configured to decrement ECX 308 without the contents having to be sent to execution logic 318. In a third embodiment, as noted above, ECX 308 is decremented automatically when the opcode POPCMP of the pop-compare micro instruction is executed. Further, during cycle 5, translation logic 304 generates the first micro instruction, POP T1, [ESI], required for the second iteration of the repeat string compare operation. Those skilled in the art will appreciate that the incremented (or decremented) contents of register ESI 308 can be made available by the embodiments discussed herein or by conventional inter-stage bus techniques that forward the result of one stage to a preceding stage. The stall continues during cycle 5 to prevent fetch logic 301 from sending subsequent instructions to translation logic 304.
During cycle 6, POP T1,[ESI] proceeds through the execute stage of the microprocessor 300, where the execution logic 318 reads the first operand from the buffer 316 and stores it in register T1 321 for comparison during a subsequent pipeline cycle. Also during cycle 6, POPCMP [EDI],T1 proceeds through the load stage, where the translated second address provided by the address stage is used to access a second location in the data cache 313 and read the second operand of the pop-compare operation; the second operand is then passed to the execution logic 318 through the buffer 316. In addition, during cycle 6 the third microinstruction, DEC ECX, proceeds through the address stage 104. Also during cycle 6, the first microinstruction of the second iteration proceeds through the register stage, and the translation logic 304 generates the second microinstruction of the second iteration of the repeat compare string operation, the pop-compare microinstruction POPCMP [EDI],T1.
During cycle 7, POPCMP [EDI],T1 proceeds through the execute stage, where the execution logic 318 receives the first operand from register T1 321 and the second operand from the buffer 316 and compares the two. The result of the comparison is sent to the result register 320 and FLAGS 322 is updated, completing the first iteration of the compare string operation. Also during cycle 7, DEC ECX proceeds through the load stage, where no operation needs to be performed. In addition, during cycle 7 the microinstructions of the second iteration of the compare string operation proceed through the translate, register, and address stages.
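The comparison performed in the execute stage can be illustrated with a small sketch of how flag values might be derived from the two operands. The text only states that FLAGS 322 is updated, and the claims name a zero flag, a carry flag, and an overflow flag. The subtraction order (first operand minus second operand), the 8-bit default width, and the function name `compare_flags` are assumptions borrowed from the usual x86 compare convention, not details given in the description.

```python
def compare_flags(first, second, width=8):
    """Return zero/carry/overflow flags for the comparison first - second.

    Assumption: the compare is modelled as an unsigned subtraction of
    `width` bits whose result is discarded, as in an x86-style CMP.
    """
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    first &= mask
    second &= mask
    diff = (first - second) & mask

    zero = diff == 0                      # operands are equal
    carry = first < second                # unsigned borrow out of the top bit
    # Signed overflow: operands have different signs and the result's sign
    # differs from the sign of the first operand.
    overflow = bool((first ^ second) & (first ^ diff) & sign_bit)
    return {"ZF": zero, "CF": carry, "OF": overflow}


if __name__ == "__main__":
    print(compare_flags(0x41, 0x41))  # equal bytes
    print(compare_flags(0x00, 0x01))  # borrow
    print(compare_flags(0x80, 0x01))  # signed overflow
```

In a repeat compare string operation, only the zero flag matters for the repeat condition; the other two flags are included because the claims list them.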
During subsequent pipeline clock cycles, the microinstructions of the following iterations of the repeat compare string operation proceed through the successive stages of the microprocessor 300 of the present invention, until the repeat condition is no longer true or the count register ECX 308 reaches zero.
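The cycle-by-cycle walkthrough can be reproduced with a short scheduling sketch. It assumes five stages (translate, register, address, load, execute), one microinstruction issued per cycle, and the first POP T1,[ESI] translated in cycle 2; the cycle-2 start is inferred from the statement that POP T1,[ESI] is in the address stage during cycle 4. The names `schedule` and `render` are illustrative only.

```python
# Stage order assumed from the walkthrough of cycles 4 through 7.
STAGES = ["translate", "register", "address", "load", "execute"]

# One iteration of the repeat compare string operation.
MICRO_OPS = ["POP T1,[ESI]", "POPCMP [EDI],T1", "DEC ECX"]


def schedule(iterations, first_cycle=2):
    """Map cycle -> {stage: microinstruction}, issuing one per cycle."""
    table = {}
    issue = first_cycle
    for iteration in range(1, iterations + 1):
        for op in MICRO_OPS:
            # Each microinstruction advances one stage per cycle.
            for depth, stage in enumerate(STAGES):
                table.setdefault(issue + depth, {})[stage] = f"{op} (iter {iteration})"
            issue += 1
    return table


def render(table):
    """Format the schedule as one line of text per cycle."""
    lines = []
    for cycle in sorted(table):
        cells = [f"{s}: {table[cycle][s]}" for s in STAGES if s in table[cycle]]
        lines.append(f"cycle {cycle}: " + "; ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(schedule(iterations=2)))
```

With two iterations, the line for cycle 7 shows POPCMP [EDI],T1 in the execute stage, DEC ECX in the load stage, and the three microinstructions of the second iteration in the translate, register, and address stages, matching the description.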
Unlike the conventional microprocessor 100, a repeat compare string or scan string operation executed according to the present invention requires no additional register storage (such as register T2 in the example of FIG. 2). Moreover, the paired pop-compare operation can be performed with a single microinstruction, POPCMP, in a single pipeline cycle, which improves the overall efficiency of compare string and scan string operations.
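A functional sketch of the repeat compare string loop shows that a single temporary, T1, is enough when the second load and the comparison are paired in POPCMP. It is a behavioral model only, not a description of the hardware: memory is modelled as a Python sequence, the repeat condition is assumed to be "repeat while equal", and the name `repeat_compare_string` and the `step` parameter are hypothetical.

```python
def repeat_compare_string(memory, esi, edi, ecx, step=1):
    """Model the POP / POPCMP / DEC ECX iterations of a repeat compare.

    Returns (esi, edi, ecx, zero_flag) when the loop ends, either because
    the operands differ or because the count in ecx reaches zero.
    """
    zero_flag = True  # assumption: flag value reported if ecx starts at 0
    while ecx != 0:
        t1 = memory[esi]            # POP T1,[ESI]: first operand into T1
        esi += step                 # pointer moves to the next element
        second = memory[edi]        # POPCMP [EDI],T1: load second operand...
        edi += step
        zero_flag = second == t1    # ...and compare it with T1 (no T2 needed)
        ecx -= 1                    # DEC ECX
        if not zero_flag:           # repeat condition no longer true
            break
    return esi, edi, ecx, zero_flag


if __name__ == "__main__":
    data = list(b"axcdabcd")
    print(repeat_compare_string(data, esi=0, edi=4, ecx=4))
```

A negative `step` models the case where the pointers are decremented rather than incremented.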
Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are also within the scope of the invention. For example, as noted above, the number of microinstructions, and of corresponding pipeline cycles, required to perform a single or repeated scan string or compare string operation can be reduced significantly by employing the advantageous features of the invention. The same benefit, however, is available to any form of load-execute operation, such as load-add, load-subtract, and load-logical operations.
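The generalization to other load-execute operations can be sketched by splitting one paired microinstruction into a load step and an execute step, mirroring the two successive stages. The opcode names in `ALU_OPS` (LOADADD, LOADSUB, and so on) are hypothetical labels chosen for this example; the description names only the categories load-add, load-subtract, and load-logical.

```python
import operator

# Hypothetical paired opcodes: each loads one operand from memory and then
# applies an ALU operation to it in the following stage.
ALU_OPS = {
    "LOADADD": operator.add,
    "LOADSUB": operator.sub,
    "LOADAND": operator.and_,
    "LOADOR": operator.or_,
    "LOADXOR": operator.xor,
}


def load_stage(memory, address):
    """First sub-operation: read the memory operand."""
    return memory[address]


def execute_stage(opcode, register_value, loaded, width=32):
    """Second sub-operation: combine the loaded operand with a register."""
    mask = (1 << width) - 1
    return ALU_OPS[opcode](register_value, loaded) & mask


def run_paired(opcode, memory, address, register_value):
    """Run one paired microinstruction through both sub-operations."""
    loaded = load_stage(memory, address)
    return execute_stage(opcode, register_value, loaded)


if __name__ == "__main__":
    memory = {0x10: 5}
    print(run_paired("LOADADD", memory, 0x10, 7))
    print(run_paired("LOADSUB", memory, 0x10, 7))
```

Keeping the two steps as separate functions reflects the point made in the text: the pairing works wherever the load and execute functions occupy two successive pipeline stages.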
In addition, although the foregoing description is given in terms of macro instructions of an x86-compatible microprocessor architecture, those skilled in the art will appreciate that the present invention applies to any pipeline architecture whose pipeline stages divide the load and execute functions into two successive sub-operations.
In summary, the foregoing describes only preferred embodiments of the present invention and should not be taken to limit the scope within which the invention may be practiced. All equivalent changes and modifications made in accordance with the claims of the present invention remain within the scope covered by this patent.

Claims (10)

1. An apparatus for performing a repeat pop-compare operation in a microprocessor, comprising:
paired operation translation logic, configured to receive a macro instruction for the repeat pop-compare operation and to generate a pop microinstruction, a pop-compare microinstruction, and a decrement microinstruction, wherein the pop microinstruction directs pipeline stages in the microprocessor to perform a pop operation, the pop-compare microinstruction directs pipeline stages in the microprocessor to perform the pop-compare operation, and the decrement microinstruction performs a decrement operation on a count register;
load logic, coupled to the paired operation translation logic, configured to receive the pop-compare microinstruction and to read a second operand from an address in memory, wherein the address is specified by the contents of a second register, and the second register is specified by the pop-compare microinstruction; and
execution logic, coupled to the load logic, configured to receive the second operand and to compare the second operand with a first operand, the first operand having been read by the pop microinstruction among the plurality of microinstructions obtained by the translation,
wherein the load logic is included in a first stage of the pipeline stages and the execution logic is included in a second stage of the pipeline stages, the second stage following the first stage, and
wherein the pop-compare microinstruction is processed by one of the first stage and the second stage in a single pipeline cycle.
2. The apparatus of claim 1, wherein the pop-compare microinstruction directs the load logic to read the second operand and directs the execution logic to compare the second operand with the first operand.
3. The apparatus of claim 1, wherein the first stage sends the second operand and the pop-compare microinstruction to the second stage to perform the comparison of the second operand with the first operand.
4. The apparatus of claim 1, wherein the execution logic updates a flags register according to a result of comparing the second operand with the first operand.
5. The apparatus of claim 4, wherein the flags register comprises a zero flag, a carry flag, and an overflow flag.
6. The apparatus of claim 1, wherein the execution logic increments or decrements the contents of the second register so that the contents point to a next address in the memory, for comparison by the execution logic during a next pop-compare operation.
7. An apparatus in a microprocessor for executing a scan string instruction or a compare string instruction, comprising:
a paired operation translator, configured to translate the scan string instruction or the compare string instruction into a corresponding plurality of microinstructions, wherein the corresponding plurality of microinstructions comprises a pop microinstruction and a pop-compare microinstruction, wherein the pop microinstruction directs pipeline stages in the microprocessor to perform a pop operation, and the pop-compare microinstruction directs the microprocessor to perform two operations, the two operations being performed by two successive stages of the microprocessor, the two successive stages comprising:
a load stage, configured to perform a first operation of the two operations, the first operation comprising reading a second operand from a location in memory; and
an execute stage, coupled to the load stage, configured to perform a second operation of the two operations, the second operation comprising receiving the second operand and comparing the second operand with a first operand to produce a result, wherein the first operand is read by the pop microinstruction among the plurality of microinstructions obtained by the translation,
wherein, in the microprocessor, the execute stage follows the load stage,
wherein the pop-compare microinstruction is processed by one of the load stage and the execute stage in a single pipeline cycle, and
wherein the execute stage increments or decrements the contents of a register so that the contents point to a next address in the memory, the next address storing a next operand for comparison by the execute stage during a next pop-compare operation.
8. The apparatus of claim 7, wherein the load stage sends the second operand and the pop-compare microinstruction to the execute stage to perform the comparison of the second operand with the first operand.
9. The apparatus of claim 8, wherein the execute stage updates a flags register according to the result.
10. A system for performing a repeat pop-compare operation in a pipeline microprocessor, comprising:
paired operation translation logic, which receives and translates a macro instruction for the repeat pop-compare operation and generates a pop microinstruction, a pop-compare microinstruction, and a decrement microinstruction; and
load logic, configured to receive the pop-compare microinstruction, to read a second operand from a memory location, and to send the second operand to execution logic in a subsequent stage of the pipeline microprocessor, wherein the execution logic compares the second operand with a first operand in a single pipeline cycle and produces a comparison result, the first operand having been read by the pop microinstruction among the plurality of microinstructions obtained by the translation,
wherein the pop-compare microinstruction is processed by the load logic in a single pipeline cycle.
CN2008101338414A 2004-04-01 2004-04-01 Apparatus and system for executing pop comparison microinstruction in microprocessor Expired - Lifetime CN101349969B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101338414A CN101349969B (en) 2004-04-01 2004-04-01 Apparatus and system for executing pop comparison microinstruction in microprocessor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101338414A CN101349969B (en) 2004-04-01 2004-04-01 Apparatus and system for executing pop comparison microinstruction in microprocessor

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN 200410032141 Division CN1564126A (en) 2004-04-01 2004-04-01 Device and system for executing pop-up comparison microinstructions in microprocessor

Publications (2)

Publication Number Publication Date
CN101349969A CN101349969A (en) 2009-01-21
CN101349969B true CN101349969B (en) 2011-06-15

Family

ID=40268776

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101338414A Expired - Lifetime CN101349969B (en) 2004-04-01 2004-04-01 Apparatus and system for executing pop comparison microinstruction in microprocessor

Country Status (1)

Country Link
CN (1) CN101349969B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5983344A (en) * 1997-03-19 1999-11-09 Integrated Device Technology, Inc. Combining ALU and memory storage micro instructions by using an address latch to maintain an address calculated by a first micro instruction
US6338136B1 (en) * 1999-05-18 2002-01-08 Ip-First, Llc Pairing of load-ALU-store with conditional branch

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
US 6338136 B1, specification column 15, lines 55-65; column 16, lines 9-28; claim 1; and FIG. 5.

Also Published As

Publication number Publication date
CN101349969A (en) 2009-01-21

Similar Documents

Publication Publication Date Title
EP0042442B1 (en) Information processing system
CN100401258C (en) Method and apparatus for maintaining context while executing translated instructions
US5150468A (en) State controlled instruction logic management apparatus included in a pipelined processing unit
CN101819518A (en) Method and device for quickly saving context in transactional memory
CN105074657B (en) The hardware and software solution of diverging branch in parallel pipeline
EP3729286B1 (en) System and method for executing instructions
CN103793263A (en) DMA transaction-level modeling method based on Power PC processor
US8027828B2 (en) Method and apparatus for synchronizing processors in a hardware emulation system
CN109656868B (en) Memory data transfer method between CPU and GPU
US6516410B1 (en) Method and apparatus for manipulation of MMX registers for use during computer boot-up procedures
CN101349969B (en) Apparatus and system for executing pop comparison microinstruction in microprocessor
CN102193860A (en) Microcontroller online debugging circuit and method as well as microcontroller
CA2304609A1 (en) Autonomously cycling data processing architecture
JPS58149541A (en) Data processing device
US8886512B2 (en) Simulation apparatus, computer-readable recording medium, and method
CN116841614B (en) Sequential vector scheduling method under disordered access mechanism
CN109683962A (en) A kind of method and device of instruction set simulator pipeline modeling
US20080244240A1 (en) Semiconductor device
CN103748557B (en) Simulation equipment and simulation method thereof
CN101819608B (en) Device and method for accelerating instruction fetch in microprocessor instruction-level random verification
CN116136762A (en) A Design Method of FPGA Semi-custom Heterogeneous Computing System Based on OpenCL
CN107329807A (en) Data delay treating method and apparatus, computer-readable recording medium
CN101739370B (en) Bus system and its method of operation
JP7718014B2 (en) Compiler device, processing unit, instruction generation method, program, compilation method, and compiler program
CN102799415A (en) File reading and writing parallel processing method combining with semaphore

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CX01 Expiry of patent term

Granted publication date: 20110615
