
CN102955712A - Method and device for providing association relation and executing code optimization - Google Patents

Info

Publication number
CN102955712A
Authority
CN
China
Prior art keywords
code
incidence relation
instruction sequence
performance
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011102523537A
Other languages
Chinese (zh)
Other versions
CN102955712B (en)
Inventor
邹嘉
范伟
侯锐
M·沃斯特
王艳琦
孙正雅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to CN201110252353.7A (patent CN102955712B)
Priority to DE102012214672A (patent DE102012214672A1)
Priority to GB1215035.5A (patent GB2494268A)
Publication of CN102955712A
Application granted
Publication of CN102955712B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F11/3604 Analysis of software for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F11/3604 Analysis of software for verifying properties of programs
    • G06F11/3612 Analysis of software for verifying properties of programs by runtime analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Debugging And Monitoring (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a method and system for performance optimization. The method comprises: acquiring, on a first physical platform, performance sampling data associated with the execution of a first code; determining, from the performance sampling data, an association relation between an instruction sequence and a performance deficiency event; and providing the association relation to other physical platforms. On a second physical platform, the association relation is acquired and used to optimize a second code, thereby improving the execution performance of the second code on the first physical platform. The device provided by the invention corresponds to the method. Because the association relation is generated on the target physical platform while the code under development is optimized on a development platform, the method and device enable cross-platform performance optimization and make the optimization process more convenient and efficient.

Description

Method and device for providing an association relation and executing code optimization
Technical field
The present invention relates to the optimization of computer execution performance and, more specifically, to performance optimization carried out across physical platforms.
Background
The development of information technology places ever higher demands on the execution performance of computers. In practice, execution performance depends not only on the physical platform of the computer but also on how well a software application utilizes the physical platform when it executes. If there is good synergy between software and hardware, that is, if the software application can make full use of the execution capability of the physical platform, higher execution performance can be obtained.
To improve execution performance, a software application often needs to be optimized during development and operation so that it is better adapted to the hardware platform. Specifically, while software code executes, the processor can sample and record how the instructions execute, forming performance sampling data. This data reflects the behaviors and hardware-performance-related events that occur when instructions execute. By analyzing it, one can learn how specific instructions execute on a particular hardware platform. Based on that knowledge, the instruction code can be optimized to eliminate the events that cause performance deficiencies, thereby improving execution performance.
Specifically, various existing processors can provide hardware signals that indicate hardware performance events. To capture these events, a performance monitoring unit can be provided on the hardware platform. This unit comprises a number of dedicated hardware performance counters, each connected to a hardware signal through a converter. A sampler running in the OS kernel can sample these counters; the sampling can be carried out periodically or be triggered by an overflow exception of the performance monitor. The sampling log can be stored in a memory buffer and finally written to a file, thereby forming the performance sampling data.
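The two sampling triggers described above (periodic sampling and counter overflow) can be illustrated with a toy software model. This is only a sketch under stated assumptions: the class name `Sampler`, the parameters `period` and `overflow_at`, and the log layout are hypothetical and do not correspond to any real performance monitoring unit or kernel interface.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, str, Dict[str, int]]  # (cycle, trigger, counter snapshot)

class Sampler:
    """Toy model of a kernel sampler reading hardware performance counters."""

    def __init__(self, period: int, overflow_at: int) -> None:
        self.period = period            # hypothetical: sample every `period` cycles
        self.overflow_at = overflow_at  # hypothetical: counter value that overflows
        self.counters: Dict[str, int] = {}
        self.log: List[LogEntry] = []   # stands in for the memory buffer

    def tick(self, cycle: int, events: List[str]) -> None:
        """Count this cycle's events, then sample if a trigger fires."""
        for name in events:
            self.counters[name] = self.counters.get(name, 0) + 1
        overflowed = [n for n, v in self.counters.items() if v >= self.overflow_at]
        if overflowed:
            self.log.append((cycle, "overflow", dict(self.counters)))
            for name in overflowed:     # restart the counters that overflowed
                self.counters[name] = 0
        elif cycle % self.period == 0:
            self.log.append((cycle, "periodic", dict(self.counters)))

sampler = Sampler(period=4, overflow_at=2)
for cycle, events in enumerate([["TLBMiss"], ["TLBMiss"], [], []], start=1):
    sampler.tick(cycle, events)
print(sampler.log)
```

In this run the second TLBMiss triggers an overflow sample at cycle 2, and the periodic trigger fires at cycle 4; writing `sampler.log` to a file would correspond to the final step of forming the performance sampling data.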
Typically, depending on the hardware configuration, tens or even thousands of different kinds of hardware performance events may occur in a processor. Such events include, for example, instruction cache misses (ICacheMiss), data cache misses (DCacheMiss), TLB misses (TLBMiss), instruction pipeline stalls (Stall), pipeline waits (Recycle), and so on. The types of events depend mainly on the physical structure of the processor. These events change the state of the processor and are key factors affecting execution performance.
Fig. 1 shows an example of typical performance sampling data. In the example of Fig. 1, the performance sampling data is arranged as a table. Each entry of the table (for example, each row) shows the execution of one instruction, including information about the instruction and the performance event statistics related to it. The instruction information includes, for example, the module name (Mod), the instruction address (Addr), the target address of a jump instruction (TargetAddr), the instruction opcode (Opcode), the operand (Operand), the number of sample hits (ticks), and so on. The performance event statistics include sampling information for events such as the aforementioned ICacheMiss, DCacheMiss, TLBMiss, Stall and Recycle. It will be appreciated that, depending on the structure of the physical platform and the configuration of the performance monitoring unit, the performance sampling data may include more or different performance events. Usually, at least some of the recorded performance events are related to performance deficiencies during instruction execution. For example, the various cache miss (CacheMiss) events recorded in Fig. 1 prevent the processor from reading data directly from the cache, and the pipeline Stall and Recycle events cause the execution of the instruction stream to be temporarily suspended or to wait. The occurrence of such events reduces the execution speed and efficiency of the processor and causes performance deficiencies, so events of this kind can be called performance deficiency events.
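One row of a table like that of Fig. 1 could be represented as follows. This is a minimal sketch: the field names mirror the column labels quoted in the text, while the class name `PerfSample`, the types, and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Event names taken from the text; a real platform may expose different ones.
DEFICIENCY_EVENTS = ("ICacheMiss", "DCacheMiss", "TLBMiss", "Stall", "Recycle")

@dataclass
class PerfSample:
    """One row of a Fig. 1-style table (layout is illustrative, not normative)."""
    mod: str                            # module name (Mod)
    addr: int                           # instruction address (Addr)
    opcode: str                         # instruction opcode (Opcode)
    operand: str = ""                   # operand (Operand)
    target_addr: Optional[int] = None   # TargetAddr, only set for jump instructions
    ticks: int = 0                      # number of sample hits (ticks)
    events: Dict[str, int] = field(default_factory=dict)  # event name -> count

    def deficiency_events(self) -> Dict[str, int]:
        """Return only the recorded events that count as performance deficiencies."""
        return {name: n for name, n in self.events.items()
                if name in DEFICIENCY_EVENTS and n > 0}

# Hypothetical row: a store instruction sampled 12 times with 3 TLB misses.
row = PerfSample(mod="app", addr=0x1000, opcode="ST", operand="R1,0(R2)",
                 ticks=12, events={"TLBMiss": 3, "Stall": 0})
print(row.deficiency_events())  # {'TLBMiss': 3}
```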
Therefore, by analyzing the performance sampling data and obtaining the instruction information related to performance deficiencies, it becomes possible to overcome these deficiencies, optimize the code and improve the execution performance of the processor. Fig. 2 is a schematic diagram of code optimization in the prior art. As shown, an optimizer and an execution unit are provided on the same physical platform. Code of various kinds, including the source code or intermediate code of a software application or the binary code produced by translation, is first input to the optimizer. If the code is being executed for the first time, the optimizer passes it directly to the execution unit. While the execution unit executes the code, the performance monitoring unit generates performance sampling data related to the execution, as described above. Once such data is available, the optimizer can identify the events related to performance deficiencies by analyzing it. Based on the performance deficiency events, the optimizer adjusts and optimizes the code in an attempt to eliminate or reduce those events. The optimizer then sends the optimized code to the execution unit again, so that performance sampling data is generated again. By analyzing the new data, the optimizer optimizes the code once more and further removes performance deficiencies. Through such repeated execution and adjustment, the code of the software application can be made better suited to the execution platform.
In the optimization process of Fig. 2, code optimization is based on the analysis of performance sampling data, and that data is generated by executing the code on a particular platform. The code must therefore first be executed on the target physical platform before it can be optimized, and execution and optimization must take place on the same physical platform. In practice, however, a software application is usually developed by engineers on a development platform, while it is usually executed on the customer's hardware platform. Because the hardware platforms on which software applications execute have widely varying physical characteristics, optimizing code for each hardware platform requires executing the code on each of those platforms, which inevitably makes optimization very costly. Moreover, performance sampling data can reveal information about the instructions executed on the physical platform. Having developers optimize code on the customer's hardware platform may therefore introduce security risks, such as leaking source code information or leaking the customer's confidential information. Existing performance optimization schemes are thus deficient in several respects.
Summary of the invention
In view of the problems set out above, the present invention proposes a scheme for cross-platform performance optimization, which aims to overcome at least one of the problems in the prior art.
Specifically, according to a first aspect of the present invention, a method of providing an association relation is proposed, comprising: obtaining performance sampling data related to the execution of a first code, the performance sampling data comprising information about instructions corresponding to the first code and information about performance deficiency events corresponding to those instructions; constructing at least one instruction sequence from the performance sampling data, and determining an association relation between the at least one instruction sequence and a performance deficiency event; and providing the association relation to other physical platforms, so that a second code can be optimized on those other physical platforms based on the association relation.
According to a second aspect of the present invention, a method of executing code optimization is proposed, comprising: obtaining an association relation provided as in the first aspect; determining, according to the association relation, the performance deficiency event corresponding to a second code; and optimizing the second code based on the determined performance deficiency event.
According to a third aspect of the present invention, a device for providing an association relation is proposed, comprising: a sampling data acquiring unit configured to obtain performance sampling data related to the execution of a first code, the performance sampling data comprising information about instructions corresponding to the first code and information about performance deficiency events corresponding to those instructions; an association relation determining unit configured to construct at least one instruction sequence from the performance sampling data and to determine an association relation between the at least one instruction sequence and a performance deficiency event; and an association relation providing unit configured to provide the association relation to other physical platforms, so that a second code can be optimized on those other physical platforms based on the association relation.
According to a fourth aspect of the present invention, a device for executing code optimization is proposed, comprising: an association relation acquiring unit configured to obtain the association relation provided by the device of the third aspect; a deficiency determining unit configured to determine, according to the association relation, the performance deficiency event corresponding to a second code; and a code optimization unit configured to optimize the second code based on the determined performance deficiency event.
With the method and device of the present invention, code under development can be optimized on the development platform, based on the association relation between instruction sequences and performance deficiency events generated on the target physical platform, so that the code is better suited to the target platform. Cross-platform performance optimization is thereby achieved, and the optimization process becomes more convenient and efficient.
Description of drawings
Fig. 1 shows an example of typical performance sampling data;
Fig. 2 is a schematic diagram of code optimization in the prior art;
Fig. 3 is a flowchart of a method of providing an association relation according to an embodiment of the present invention;
Fig. 4 is a flowchart of a method of executing code optimization according to an embodiment of the present invention;
Fig. 5A shows the sub-steps of determining an association relation according to an embodiment of the present invention;
Fig. 5B shows the sub-steps of determining an association relation according to another embodiment of the present invention;
Fig. 6 shows an example of processing and encoding instruction sequences according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of an association relation according to an embodiment of the present invention;
Fig. 8 shows the sub-steps of determining a performance deficiency event according to an embodiment of the present invention;
Fig. 9A is a schematic block diagram of a device for providing an association relation according to an embodiment of the present invention;
Fig. 9B is a schematic block diagram of a device for executing code optimization according to an embodiment of the present invention;
Fig. 10 is a block diagram of an exemplary computing system 100 suitable for implementing embodiments of the present invention.
Detailed description of embodiments
Specific embodiments of the present invention are described in detail below. As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.) or an embodiment combining software and hardware parts that may all generally be referred to herein as a "circuit", "module" or "system". Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable media having computer-usable program code embodied thereon.
Any combination of one or more computer-readable media may be used. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may be any computer-readable medium that is not a computer-readable storage medium and that can communicate, propagate or transport a program for use by or in connection with an instruction execution system, apparatus or device.
Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Aspects of the present invention are described below with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the computer or other programmable data processing apparatus, create means for implementing the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means that implement the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable apparatus provide processes for implementing the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
The present invention is described below with reference to the accompanying drawings and in conjunction with specific embodiments. Such description is for purposes of illustration only and is not intended to limit the scope of the invention.
In several embodiments of the present invention, the association relation between instruction sequences and performance deficiencies is obtained by analyzing the performance sampling data of the target physical platform; the development platform then uses this association relation to optimize code, thereby achieving cross-platform performance optimization. As noted above, performance sampling data reflects the behaviors and hardware-performance-related events that occur when instructions execute, and it depends closely on both the hardware characteristics of the physical platform and the code executed on it. In other words, performance sampling data embodies both the hardware characteristics of the physical platform and the characteristics of the executed code. This is why, in the prior art, code must first be executed to produce the corresponding performance sampling data before it can be optimized according to that data. In embodiments of the present invention, however, the performance sampling data is analyzed to mine statistical regularities relating instruction sequences to performance deficiency events, yielding an association relation between instruction sequences and performance deficiencies. Such an association relation reflects only the hardware characteristics of the physical platform and is independent of the code executed. Based on it, code can be optimized on the development platform so that the optimized code is better suited to the target physical platform, thereby achieving cross-platform performance optimization. Embodiments of the invention are described in detail below with reference to the drawings and examples.
According to embodiments of the present invention, the cross-platform performance optimization process can be divided into two stages: a method of providing an association relation, executed on a first physical platform (for example, the target physical platform), and a method of executing code optimization based on the association relation, executed on a second physical platform (for example, the development physical platform).
Fig. 3 is a flowchart of a method of providing an association relation according to an embodiment of the present invention. The steps of this flowchart can be executed on the first physical platform. Specifically, as shown, in step 30, performance sampling data related to the execution of a first code is obtained, the performance sampling data comprising information about instructions corresponding to the first code and information about performance deficiency events corresponding to those instructions; in step 32, instruction sequences are constructed from the performance sampling data, and the association relation between the instruction sequences and performance deficiency events is determined; and, in step 34, the association relation is provided to other physical platforms, so that a second code can be optimized on those other physical platforms based on the association relation.
Based on the association relation so provided, code optimization can be carried out to improve execution performance. Fig. 4 is a flowchart of a method of executing code optimization according to an embodiment of the present invention. The steps of this flowchart can be executed on the second physical platform. Specifically, as shown in Fig. 4, in step 40, the association relation provided by the method of Fig. 3 is obtained; in step 42, the performance deficiency event corresponding to a second code is determined according to the association relation; and in step 44, the second code is optimized based on the determined performance deficiency event, thereby optimizing the execution performance of the second code on the first physical platform.
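To make steps 40 and 42 concrete, the following sketch scans the opcodes of a second code for instruction sequences that appear in an association table and reports the performance deficiency events they are expected to cause on the first platform. The table layout (a mapping from opcode tuples to event names and degrees), the function name `predict_deficiencies` and the exact-match scan are assumptions for illustration; the text so far does not specify how the lookup is performed.

```python
from typing import Dict, List, Tuple

# Hypothetical table layout: opcode sequence -> {event name: degree of association}
Association = Dict[Tuple[str, ...], Dict[str, float]]
Hit = Tuple[int, Tuple[str, ...], str, float]  # (position, sequence, event, degree)

def predict_deficiencies(opcodes: List[str], assoc: Association) -> List[Hit]:
    """Find every position in `opcodes` where an associated sequence starts."""
    hits: List[Hit] = []
    for seq, events in assoc.items():
        n = len(seq)
        for i in range(len(opcodes) - n + 1):
            if tuple(opcodes[i:i + n]) == seq:
                for event, degree in events.items():
                    hits.append((i, seq, event, degree))
    return sorted(hits)

# Step 40: association relation received from the first (target) platform.
assoc = {("L", "MVC", "XC", "ST"): {"TLBMiss": 0.85}}
# Step 42: opcodes of the second code, inspected on the development platform.
second_code = ["LA", "L", "MVC", "XC", "ST", "BR"]
print(predict_deficiencies(second_code, assoc))
# [(1, ('L', 'MVC', 'XC', 'ST'), 'TLBMiss', 0.85)]
```

Step 44 would then rewrite or reorder the code at the reported positions; which transformation removes a given event depends on the platform and is not sketched here.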
Each step of the above embodiments is described below with reference to concrete examples.
First, in step 30, performance sampling data related to the execution of a first code is obtained on the first physical platform. The first physical platform may be the target physical platform at which the performance optimization is aimed. The first code may be any code executed on the first physical platform, including the source code, intermediate code or binary code of various software, applications, programs and the like. As mentioned above, many existing physical platforms provide a performance monitoring unit that collects and counts the signals produced by the processor to indicate hardware performance events, and a sampler running in the OS kernel can sample these counts. In one embodiment, to obtain the performance sampling data, step 30 reads the sampling results directly from the sampler in real time while the processor executes the first code, and collects and organizes them into performance sampling data. In other embodiments, the sampler may periodically generate a sampling log and store it in a memory buffer, or the sampling log may further be read from the memory buffer, written to a file and stored at a specific location on the hard disk. In that case, step 30 can read the sampling log directly from the memory buffer or from the specific location on the hard disk, and form the performance sampling data from the sampling log.
As mentioned above, performance sampling data usually includes instruction information and the related performance events. Accordingly, the performance sampling data obtained in step 30 can include the instructions corresponding to the execution of the first code and the performance events caused by those instructions, including performance deficiency events. Specifically, the performance sampling data obtained may be as shown in the example of Fig. 1. However, those skilled in the art will understand that the content of the performance sampling data depends on many factors, such as the structure of the physical platform, the instruction code executed on the physical platform, and the configuration of the performance monitoring unit and the sampler. As these factors vary, the performance sampling data obtained in step 30 may include more and/or different instruction information and performance events than the example of Fig. 1.
Then, in step 32, the performance sampling data obtained is analyzed: instruction sequences are constructed from it, and the rules by which instruction sequences cause performance deficiencies are mined. That is, the association relation between instruction sequences and performance deficiency events is determined, such that the association relation reflects the hardware characteristics of the first physical platform and is independent of the code executed on it (for example, the first code). The analysis and determination of the association relation can be implemented in several ways.
Specifically, Fig. 5A illustrates the steps of obtaining the above incidence relation according to an embodiment of the invention, that is, the sub-steps of step 32 in Fig. 3. In this embodiment, the incidence relation between an instruction sequence and a performance event is obtained by counting the occurrences of each. As shown in the figure, in step 3211, instruction sequences are constructed from the performance sampled data. As mentioned above, the performance sampled data generally record information about each executed instruction in execution order, including the instruction address, the opcode and so on, as shown in Fig. 1. For a jump instruction, which is not executed in sequential order, the performance sampled data can record the source address and destination address of the instruction. Therefore, by tracing the records of the performance sampled data in order, and following jumps from source address to destination address where necessary, the execution path of the instructions can be obtained. From such an execution path, multiple consecutively executed instructions can be obtained, and an instruction sequence consisting of multiple consecutive instructions can thus be constructed. In one example, an instruction sequence is identified by the operations of its instructions, more specifically, by their opcodes. For example, L (load), MVC (move), XC (exclusive OR) and ST (store), which appear consecutively in the performance sampled data of Fig. 1, can be regarded as one instruction sequence. Then, in step 3212, the number of occurrences of each instruction sequence in the performance sampled data is counted. Then, in step 3213, the number of occurrences of the performance deficiency events corresponding to each instruction sequence is counted. Finally, in step 3214, the incidence relation between instruction sequences and performance deficiency events is determined based on the ratio of the counted occurrences. Specifically, in this step, the ratio of the number of occurrences of each performance deficiency event to the number of occurrences of the instruction sequence can be determined, the ratio can be compared with a predetermined threshold, and the incidence relation between the instruction sequence and the performance deficiency can be determined according to the result of the comparison. For example, for the above L-MVC-XC-ST instruction sequence, suppose step 3212 finds that this instruction sequence occurs 100 times in total in the performance sampled data. Then, in step 3213, examining the performance sampled data shows that, when the last instruction ST of this sequence was executed, the TLBMiss event occurred 85 times, the pipeline Stall state occurred 5 times, and the ICacheMiss event occurred 2 times. Thus, in step 3214, it can be determined that the TLBMiss event occurs in 85% of the executions of this instruction sequence. Supposing that the predefined threshold ratio is 80%, it can be determined that an incidence relation exists between the above instruction sequence and the TLBMiss performance deficiency event. In one embodiment, a degree of association can further be introduced to characterize the strength of the incidence relation. Specifically, the above ratio of occurrences can be used as the value of the degree of association.
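The counting in steps 3212 to 3214 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `find_associations`, the representation of an observation as a `(sequence, event)` pair, and the use of "greater than or equal to" for the threshold comparison are assumptions made for this example. The figures reproduce the L-MVC-XC-ST example from the text.

```python
from collections import Counter

def find_associations(observations, threshold=0.8):
    """Return {(sequence, event): ratio} for ratios that reach the threshold.

    observations: iterable of (sequence, event) pairs, one per occurrence of
    a sequence; event is None when no deficiency event was recorded.
    (This data layout is an assumption of the sketch.)
    """
    seq_counts = Counter()    # step 3212: occurrences of each sequence
    event_counts = Counter()  # step 3213: occurrences of each event per sequence
    for seq, event in observations:
        seq_counts[seq] += 1
        if event is not None:
            event_counts[(seq, event)] += 1

    # Step 3214: compare the ratio of occurrences with the threshold.
    associations = {}
    for (seq, event), n in event_counts.items():
        ratio = n / seq_counts[seq]
        if ratio >= threshold:
            associations[(seq, event)] = ratio  # ratio doubles as degree of association
    return associations

seq = ("L", "MVC", "XC", "ST")
observations = (
    [(seq, "TLBMiss")] * 85
    + [(seq, "Stall")] * 5
    + [(seq, "ICacheMiss")] * 2
    + [(seq, None)] * 8
)
associations = find_associations(observations, threshold=0.8)
```

With these counts only the TLBMiss pair survives, with a degree of association of 0.85; the Stall and ICacheMiss ratios (0.05 and 0.02) fall below the 80% threshold.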
As can be seen, the embodiment of Fig. 5A determines the incidence relation by directly counting the occurrences of instruction sequences and performance deficiency events. In many cases, the number of instruction sequences is very large (especially when the sequence length is not limited), so that the computational efficiency of the above counting steps is less than ideal. Therefore, depending on the computing resources actually available, step 32 can be implemented in different ways to obtain the above incidence relation.
In one embodiment, after step 3212 of Fig. 5A, in which the number of occurrences of each instruction sequence is counted, frequently occurring instruction sequences are selected according to those counts. A frequently occurring instruction sequence can be an instruction sequence whose number of occurrences exceeds a certain threshold. Then, in step 3213, performance deficiency events are counted only for the frequently occurring instruction sequences. In this way, instruction sequences that occur too rarely to be representative are ignored, which saves part of the statistical computation.
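A sketch of this pruning, under the assumption that the sequence counts from step 3212 are already available as a `Counter`; the names `frequent_sequences` and `min_count`, and the sample numbers, are hypothetical.

```python
from collections import Counter

def frequent_sequences(seq_counts, min_count):
    """Keep only sequences whose occurrence count exceeds min_count."""
    return {seq for seq, n in seq_counts.items() if n > min_count}

# Hypothetical counts produced by step 3212.
seq_counts = Counter({("L", "MVC", "XC", "ST"): 100, ("LR", "ST"): 3})
frequent = frequent_sequences(seq_counts, min_count=10)

# Step 3213 then counts events only for the frequent sequences.
observations = [
    (("L", "MVC", "XC", "ST"), "TLBMiss"),
    (("L", "MVC", "XC", "ST"), "TLBMiss"),
    (("LR", "ST"), "Stall"),
]
event_counts = Counter(
    (seq, event) for seq, event in observations if seq in frequent
)
```

The rare sequence `("LR", "ST")` never enters `event_counts`, so no event statistics are spent on it.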
In another embodiment, instruction sequences are preliminarily screened based on the performance events of single instructions. Specifically, for each individual instruction in the performance sampled data, the ratio of the number of samples in which a performance deficiency event occurred while executing the instruction to the total number of samples of the instruction is obtained. If this ratio is higher than a predefined threshold, the instruction is taken as the ending instruction of an instruction sequence that may cause the performance deficiency event. Correspondingly, starting from this instruction, a certain number of steps are traced backward along the instruction execution path to obtain candidate instruction sequences. For example, for the instruction MVC in row 6 of Fig. 1, the number of samples of the instruction is 14, of which the Recycle event occurred 5 times. Supposing that this ratio is higher than the predefined threshold, it can be concluded that an instruction sequence ending with MVC is a sequence that may cause the Recycle event. Correspondingly, by tracing backward from this MVC along the instruction execution path, candidate instruction sequences that may cause Recycle can be obtained. On the other hand, if the ratio at which a certain performance deficiency event occurs for an instruction is lower than the above threshold, the instruction is considered unlikely to be the ending instruction of an instruction sequence that causes this kind of performance deficiency event, so there is no need to trace backward from it. In this way, only those candidate instruction sequences that may cause a performance deficiency event are obtained, which saves part of the counting and computation. The candidate instruction sequences obtained can then be counted in the same way as in the embodiment of Fig. 5A, so as to finally determine the incidence relation between instruction sequences and performance deficiency events. It will be appreciated that those skilled in the art can combine the above embodiments according to actual needs to produce further variants for obtaining the incidence relation between instruction sequences and performance deficiency events.
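The screening described above might look like the following sketch. The layout of a path entry as `(opcode, sample_count, event_counts)`, the function name `candidate_sequences`, the threshold of 0.3 and the backtracking limit `max_len` are assumptions of this example; only the MVC figures (14 samples, 5 Recycle events) come from the text, the other rows are invented.

```python
def candidate_sequences(path, event, threshold, max_len):
    """Collect sequences ending at instructions that often show `event`.

    path: list of (opcode, sample_count, event_counts) in execution order.
    Returns a set of opcode tuples of length 2..max_len.
    """
    candidates = set()
    for i, (_, samples, events) in enumerate(path):
        if samples == 0:
            continue
        if events.get(event, 0) / samples <= threshold:
            continue  # unlikely ending instruction: no backtracking from here
        # Trace backward along the execution path from the ending instruction.
        for length in range(2, max_len + 1):
            start = i - length + 1
            if start >= 0:
                candidates.add(tuple(op for op, _, _ in path[start:i + 1]))
    return candidates

path = [
    ("L", 20, {}),
    ("MVC", 14, {"Recycle": 5}),   # 5/14 is above the assumed threshold
    ("XC", 10, {}),
    ("ST", 12, {"Recycle": 1}),    # 1/12 is below it
]
result = candidate_sequences(path, "Recycle", threshold=0.3, max_len=3)
```

Only the sequence ending at MVC is produced; the ST instruction is skipped because its Recycle ratio stays under the threshold.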
Alternatively, unlike the approach of directly counting the occurrences of instruction sequences and performance deficiency events, in another embodiment of the invention the instruction sequences are clustered and analyzed based on information entropy, so as to obtain the association between instruction sequences and performance deficiency events. Fig. 5B illustrates the steps of determining the incidence relation according to this embodiment. As shown in Fig. 5B, in this embodiment, the steps of obtaining the incidence relation comprise: step 3221, constructing instruction sequences; step 3222, selecting instruction sequences with a higher frequency of occurrence; step 3223, clustering and coding the instruction sequences based on information entropy; step 3224, selecting discriminative clusters in combination with the statistics of performance events; step 3225, further dividing the selected clusters; and step 3226, judging whether the clusters at this point reach a predetermined condition and, if the predetermined condition is not reached, executing steps 3222-3225 again until it is reached. Each of these steps is described in detail in the following paragraphs.
In step 3221, the instruction execution path is obtained from the performance sampled data, and a plurality of instruction sequences are obtained from the instruction execution path. This step is similar to step 3211 of Fig. 5A and is not described again. Then, in step 3222, instruction sequences with a higher frequency of occurrence are selected. Specifically, the number of occurrences of each instruction sequence can be compared with a predetermined threshold, and those instruction sequences whose number of occurrences exceeds the threshold are selected; alternatively, the instruction sequences can be sorted by number of occurrences, and a given number of instruction sequences are selected in descending order of occurrences. Other ways of selecting frequently occurring instruction sequences can also be adopted.
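Steps 3221 and 3222 can be sketched as follows, assuming the execution path has already been reconstructed as a flat list of opcodes and that sequences have a fixed length. The helper names `windows`, `select_by_threshold` and `select_top` are hypothetical, as is the sample path.

```python
from collections import Counter

def windows(path, length):
    """Step 3221: every run of `length` consecutive opcodes on the path."""
    return [tuple(path[i:i + length]) for i in range(len(path) - length + 1)]

def select_by_threshold(counts, min_count):
    """Step 3222, first option: sequences occurring more than min_count times."""
    return [seq for seq, n in counts.items() if n > min_count]

def select_top(counts, n):
    """Step 3222, second option: the n most frequent sequences."""
    return [seq for seq, _ in counts.most_common(n)]

path = ["L", "MVC", "XC", "ST", "L", "MVC", "XC", "ST", "LR", "BR"]
counts = Counter(windows(path, 4))
by_threshold = select_by_threshold(counts, min_count=1)
top_one = select_top(counts, 1)
```

Both selection options pick `("L", "MVC", "XC", "ST")` here, since it is the only window that occurs twice.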
Then, in step 3223, the selected instruction sequences are clustered and coded based on information entropy. Information entropy borrows the concept of entropy from thermodynamics and is a measure of the certainty or randomness of information. In general, the more chaotic a system is and the greater the uncertainty of its variables, the larger its information entropy; conversely, the more ordered a system is, the lower its information entropy. Several methods have been proposed in the prior art for approximating and estimating information entropy mathematically. Based on the mathematically estimated information entropy, the instruction sequences can be clustered so that the information entropy of the system formed by the clustered instruction sequences is minimal, that is, the amount of information is maximal. Then, the clusters of instruction sequences are "coded", that is, different classes are represented by different codes, which facilitates subsequent analysis.
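The text does not say which entropy estimate is used. As one common choice, the sketch below uses Shannon entropy over the event labels within each cluster and scores a clustering by the size-weighted average; this choice, and the names `entropy` and `clustering_entropy`, are assumptions of the example rather than the method of the embodiment.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )

def clustering_entropy(clusters):
    """Size-weighted average entropy over all clusters (lower is more ordered)."""
    total = sum(len(c) for c in clusters)
    return sum(len(c) / total * entropy(c) for c in clusters)

# Labels say whether a Stall event was observed (Y) or not (N).
pure = [["Y", "Y"], ["N", "N"]]    # each cluster is homogeneous
mixed = [["Y", "N"], ["Y", "N"]]   # each cluster is maximally uncertain
```

A clustering procedure guided by this score would prefer `pure` (score 0) over `mixed` (score 1), which matches the stated aim of minimizing the entropy of the clustered system.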
In a simple example, the instruction sequences are clustered according to whether the first instruction of the sequence is L or MVC and whether the ending instruction is ST. Specifically, if the first instruction of an instruction sequence is L or MVC, the first-instruction feature of the sequence is marked 1, and otherwise 0; if the ending instruction is ST, the ending-instruction feature of the sequence is marked 1, and otherwise 0. In addition, if the ratio of Stall events at the ending instruction of the instruction sequence exceeds 20%, the Stall event is marked Y, and otherwise N. In this way, the table shown in Fig. 6 can be obtained. It can be seen that clustering and coding the instruction sequences makes it easier to reveal and mine the regularities of the instruction sequences.
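The coding rule of this example translates directly into code. The function names `encode` and `stall_label` are hypothetical; the rule itself (first instruction L or MVC, ending instruction ST, 20% Stall ratio) is taken from the text.

```python
def encode(sequence):
    """Cluster code 'first-last' for an opcode sequence."""
    first = 1 if sequence[0] in ("L", "MVC") else 0
    last = 1 if sequence[-1] == "ST" else 0
    return f"{first}-{last}"

def stall_label(stall_count, occurrences, threshold=0.20):
    """'Y' if the Stall ratio at the ending instruction exceeds the threshold."""
    return "Y" if stall_count / occurrences > threshold else "N"

row = (encode(("L", "MVC", "XC", "ST")), stall_label(30, 100))
```

Each instruction sequence thus becomes one table row consisting of a cluster code and a Y/N label, which is the shape of the table the text attributes to Fig. 6.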
Then, in step 3224, discriminative clusters are selected in combination with the statistics of performance events. Specifically, the degree of association between each coded cluster and the performance event is computed, and the clusters with a larger degree of association are selected. Still taking the table of Fig. 6 as an example, it can be seen that the cluster marked 1-1 occurs 8 times, of which 6 have the Stall event marked Y. In this case, this cluster can be selected for further analysis, and the other clusters need not be considered.
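A sketch of this selection, assuming each table row is a `(cluster_code, label)` pair. The 1-1 cluster reproduces the figures from the text (8 occurrences, 6 marked Y); the rows for the other clusters are invented for illustration, and `cluster_association` is a hypothetical name.

```python
from collections import defaultdict

def cluster_association(rows):
    """Fraction of rows marked 'Y' for each cluster code."""
    counts = defaultdict(lambda: [0, 0])  # code -> [total rows, rows marked Y]
    for code, label in rows:
        counts[code][0] += 1
        if label == "Y":
            counts[code][1] += 1
    return {code: marked / total for code, (total, marked) in counts.items()}

rows = (
    [("1-1", "Y")] * 6 + [("1-1", "N")] * 2   # from the text
    + [("0-1", "Y")] + [("0-1", "N")] * 3     # hypothetical
    + [("1-0", "N")] * 4                      # hypothetical
)
assoc = cluster_association(rows)
best = max(assoc, key=assoc.get)
```

Cluster 1-1 has a degree of association of 0.75 and is the one selected for further analysis.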
It will be appreciated that, because the instruction sequences have been clustered and coded, part of the information about the instruction sequences has been lost. For example, since all instruction sequences whose first instruction is L or MVC and whose ending instruction is ST are marked 1-1, the other features of this cluster are lost. For this reason, in step 3225, the selected sequence clusters are further divided to recover part of the lost information. For example, the cluster 1-1 selected in step 3224 is further divided: the instruction sequences whose first instruction is L are placed in one group, and the instruction sequences whose first instruction is MVC are placed in another group. For each group of instruction sequences, step 3226 judges whether the group reaches a predetermined condition; if the predetermined condition is not reached, steps 3222-3224 are repeated, performing clustering and selection again, until the predetermined condition is reached. The predetermined condition can be set according to factors such as computing resources and required precision, and can include, for example, that the number of instruction sequences in each group is less than a certain threshold, or that the degree of association between the instruction sequences and the performance event reaches a certain threshold. In this way, by first clustering and then dividing, discriminative information is extracted step by step from a very large number of instruction sequences, and the incidence relation between instruction sequences and performance deficiency events is obtained.
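The divide-until-done loop of steps 3225 and 3226 can be illustrated with a deliberately simplified sketch: instead of re-running the full clustering, it splits a group by the opcode at successive positions and stops when either stopping condition named in the text holds. The function names, the `size_limit` and `min_assoc` values and the sample data are all assumptions; the data are chosen to match the 8-occurrence, 6-Stall cluster 1-1.

```python
from collections import defaultdict

def association(group):
    """Fraction of sequences in the group that showed the event."""
    return sum(1 for _, hit in group if hit) / len(group)

def refine(group, position=0, size_limit=4, min_assoc=0.9):
    """Split a group of (sequence, hit) pairs until a stop condition holds."""
    done = (
        len(group) < size_limit              # group is small enough
        or association(group) >= min_assoc   # association is strong enough
        or position >= min(len(seq) for seq, _ in group)  # nothing left to split on
    )
    if done:
        return [group]
    subgroups = defaultdict(list)
    for seq, hit in group:
        subgroups[seq[position]].append((seq, hit))
    result = []
    for sub in subgroups.values():
        result.extend(refine(sub, position + 1, size_limit, min_assoc))
    return result

cluster_1_1 = (
    [(("L", "XC", "ST"), True)] * 5
    + [(("MVC", "XC", "ST"), True)]
    + [(("MVC", "L", "ST"), False)] * 2
)
groups = refine(cluster_1_1)
```

The first split separates sequences starting with L from those starting with MVC. The L group stops because its association reaches 1.0, and the MVC group stops because it holds fewer than four sequences.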
Although several embodiments of constructing instruction sequences and determining the incidence relation between instruction sequences and performance deficiency events have been described above with reference to specific examples, those skilled in the art will understand that the above embodiments are merely exemplary and not limiting. On the basis of the above examples, those skilled in the art can make further modifications and variants, extracting the incidence relation between instruction sequences and performance events in similar or other ways. Such modifications and variants should also be included within the scope of the invention.
Fig. 7 illustrates a schematic diagram of the incidence relation obtained according to one embodiment of the invention. In the example of Fig. 7, the incidence relation is shown in the form of a table. In this table, each entry shows the association between one instruction sequence and a performance deficiency event. For example, according to the first row of the table, the instruction sequence {BNORC, (LTGR or LH), BERC} is associated with the performance deficiency event Stall, which means that executing this instruction sequence consecutively can cause the pipeline to enter the Stall state. In a further embodiment, the degree of association between the instruction sequence and the performance deficiency event can also be shown, that is, the probability that executing this instruction sequence leads to the performance deficiency. It will be appreciated that the incidence relation can also be stored and presented in other formats and is not limited to the example of Fig. 7.
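One possible in-memory form of such a table entry is sketched here. The `Association` class, its field names and the `matches` helper are hypothetical; the entry itself follows the first row of Fig. 7 as read in the text, where the middle position accepts either LTGR or LH, so each position is modeled as a set of acceptable opcodes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

@dataclass(frozen=True)
class Association:
    pattern: Tuple[FrozenSet[str], ...]  # acceptable opcodes per position
    event: str                           # associated performance deficiency event
    degree: Optional[float] = None       # optional degree of association

def matches(entry, opcodes):
    """True if the opcodes fit the entry's pattern position by position."""
    return len(entry.pattern) == len(opcodes) and all(
        op in allowed for allowed, op in zip(entry.pattern, opcodes)
    )

first_row = Association(
    pattern=(
        frozenset({"BNORC"}),
        frozenset({"LTGR", "LH"}),
        frozenset({"BERC"}),
    ),
    event="Stall",
)
```

Note that the entry contains only opcodes and an event name, with no addresses or operands, which is consistent with the claim that the table carries no information about the code that produced it.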
Comparing the performance sampled data shown in Fig. 1 with the incidence relation of the Fig. 7 example, it can be seen that the performance sampled data directly show information about each instruction (for example, instruction address, execution order, opcode, operands, number of samples, etc.). Therefore, the instruction execution path can be obtained from the performance sampled data, and, further, information about the code executed on the physical platform, such as the aforementioned first code, might be obtained by interpreting the instruction execution path (for example, using a disassembler). By contrast, what the incidence relation shows is the association between instruction sequences and performance deficiency events; this association is obtained from the statistics of instruction sequences and various performance events, and does not involve the concrete execution information of each instruction. Therefore, information about the code executed on the physical platform cannot be learned from the incidence relation between instruction sequences and performance deficiency events. That is to say, the above incidence relation is related only to the characteristics of the physical platform and is independent of the code executed on it, and thus does not reveal code information. On the other hand, in many cases a performance deficiency event is not caused by the execution of a single instruction alone, but is the result of executing multiple instructions consecutively. Therefore, mining the incidence relation between performance deficiency events and instruction sequences composed of multiple consecutively executed instructions reflects more fundamentally the rules by which performance deficiencies arise, and also better reflects the characteristics of the physical platform.
In view of these characteristics, the above incidence relation is very suitable as a reference for code optimization targeting the physical platform. Therefore, in step 34, the incidence relation is provided to other physical platforms, so that second code can be optimized on those other physical platforms based on the incidence relation; the second code can be different from the first code of step 30. Specifically, in one embodiment, the determined incidence relation can be stored locally, and specific other physical platforms can be given access rights to the incidence relation file. Alternatively, in another embodiment, in step 34, the determined incidence relation can be sent directly to the other physical platforms. In yet another embodiment, the determined incidence relation can be transferred to an incidence relation sharing platform, so that other physical platforms can obtain the incidence relation from this sharing platform.
Once the first physical platform has provided the incidence relation according to the method of Fig. 3, the incidence relation can be obtained on another physical platform, for example a second physical platform, and used there for code optimization, thereby realizing cross-platform performance optimization, as shown in the steps of Fig. 4.
Specifically, in step 40, the above incidence relation is obtained at the second physical platform. In one embodiment, the provided incidence relation is stored on the first physical platform. In this case, in step 40, the second physical platform sends a request directly to the first physical platform, asking to read the generated incidence relation. After obtaining an authorization response from the first physical platform, the second physical platform can read, from the first physical platform, the incidence relation generated for the first physical platform. In another embodiment, the incidence relation from the first physical platform is stored on an incidence relation sharing platform and is tagged with a label of the first physical platform. In this case, in step 40, the second physical platform reads the incidence relation for the first physical platform from the sharing platform.
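The sharing-platform variant can be illustrated with a small sketch in which a dictionary stands in for the sharing platform and JSON for the storage format. Both choices, along with the names `publish`, `fetch` and the label `"platform-1"`, are assumptions of this example; the text does not prescribe a format or an interface.

```python
import json

def publish(store, platform_label, associations):
    """Store the relation under the label of the platform it describes."""
    records = [
        {"sequence": list(seq), "event": event, "degree": degree}
        for (seq, event), degree in associations.items()
    ]
    store[platform_label] = json.dumps(records)

def fetch(store, platform_label):
    """Read back the relation published for the given platform."""
    return {
        (tuple(r["sequence"]), r["event"]): r["degree"]
        for r in json.loads(store[platform_label])
    }

shared_platform = {}  # stand-in for the incidence relation sharing platform
relation = {(("L", "MVC", "XC", "ST"), "TLBMiss"): 0.85}
publish(shared_platform, "platform-1", relation)   # done by the first platform
received = fetch(shared_platform, "platform-1")    # done by the second platform
```

The label lets the second platform ask for the relation of a particular target platform when several platforms publish to the same store.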
It will be appreciated that the incidence relation can be transferred between physical platforms by any communication means, whether known now or adopted in the future, including transmission over media such as wireless, wireline, optical cable and RF, using protocols such as FTP, HTTP, POP3 and SMTP.
Once the second physical platform has obtained the incidence relation generated for the first physical platform, code optimization targeting the first physical platform can be carried out on the second physical platform. Specifically, in step 42, the performance deficiency events corresponding to the second code are determined according to the incidence relation; and in step 44, the second code is optimized based on the determined performance deficiency events, so as to eliminate or reduce the performance deficiency events that the second code may cause. The second code is code that is to be executed on the first physical platform, and is also the code to be optimized.
Fig. 8 illustrates the sub-steps of step 42 according to an embodiment of the invention, that is, the concrete steps of determining the performance deficiency events. Specifically, as shown in Fig. 8, in order to determine the performance deficiency events corresponding to the second code, the second code is first scanned in step 421 to generate the instruction sequences corresponding to the second code. Then, in step 422, the generated instruction sequences are matched against the obtained incidence relation, that is, the obtained incidence relation is searched for a corresponding instruction sequence. If a matching instruction sequence is found, then in step 423 the performance deficiency event corresponding to the matching instruction sequence is determined according to the incidence relation between instruction sequences and performance deficiency events. It will be appreciated that the performance deficiency events determined in step 423 are those that could occur if the current second code were executed on the first physical platform.
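Steps 421 to 423 amount to a pattern search over the opcodes of the second code. The sketch below assumes the second code has already been reduced to a list of opcodes and that the incidence relation is a dictionary from opcode tuples to event names; `predict_deficiencies` and the sample opcodes are hypothetical.

```python
def predict_deficiencies(opcodes, associations):
    """Return sorted (position, sequence, event) for every match.

    opcodes: opcodes of the second code, in program order (step 421).
    associations: {opcode tuple: event name} obtained from the first platform.
    """
    findings = []
    for seq, event in associations.items():
        n = len(seq)
        # Step 422: look for the sequence at every position of the code.
        for i in range(len(opcodes) - n + 1):
            if tuple(opcodes[i:i + n]) == seq:
                findings.append((i, seq, event))  # step 423: predicted event
    return sorted(findings)

associations = {("L", "MVC", "XC", "ST"): "TLBMiss"}
second_code = ["LR", "L", "MVC", "XC", "ST", "BR"]
findings = predict_deficiencies(second_code, associations)
```

The single finding says that a TLBMiss is predicted for the sequence starting at index 1 of the second code, without the code ever having run on the first platform.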
Based on the performance deficiencies determined or predicted as above, in step 44 of Fig. 4 the second code is optimized to eliminate and/or reduce the performance deficiency events, thereby optimizing the execution performance of the second code on the first physical platform. The code optimization of step 44 can use optimization approaches commonly adopted in the prior art. For example, in one embodiment, the execution order of the second code can be suitably adjusted so as to reduce the instruction sequences that may cause performance deficiency events. For example, suppose the second code includes an instruction sequence A that performs consecutive operations on multiple operands, and that this instruction sequence easily causes the Stall event. In this case, if instruction sequence A is followed by an independent instruction B whose position in the execution order is not critical, instruction B can be moved forward and inserted into the middle of instruction sequence A, interrupting the sequence A that may cause the deficiency. In another embodiment, code optimization is performed by adding extra code. For cache miss events such as ICacheMiss, DCacheMiss and TLBMiss, when an instruction sequence that causes these events is found during scanning, some extra instruction code can be added before these instructions. This extra instruction code notifies the processor to prefetch into the cache the operands, instruction data and the like that may be used next, thereby avoiding cache miss events when the subsequent instruction sequence is executed. In addition, those skilled in the art can make other forms of modification and optimization to the second code as required, so that the second code can execute efficiently on the first physical platform.
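Both optimization styles can be illustrated on a list of opcodes. This is a toy sketch only: `break_sequence` assumes the caller has already verified that the following instruction is independent of the sequence, and `PREFETCH` is a placeholder mnemonic invented for the example, not an instruction named in the text.

```python
def break_sequence(code, start, length):
    """Move the instruction after code[start:start+length] into its middle.

    Assumes a prior dependency check has shown the move to be safe.
    """
    follower = start + length
    if follower >= len(code):
        return list(code)  # nothing follows the sequence; leave code unchanged
    result = list(code)
    moved = result.pop(follower)
    result.insert(start + length // 2, moved)
    return result

def add_prefetch(code, start, hint="PREFETCH"):
    """Insert a prefetch hint (placeholder mnemonic) before position start."""
    return list(code[:start]) + [hint] + list(code[start:])

reordered = break_sequence(["A1", "A2", "A3", "A4", "B"], start=0, length=4)
prefetched = add_prefetch(["LR", "L", "MVC", "XC", "ST"], start=1)
```

After reordering, the four instructions of sequence A are no longer consecutive, so the pattern that was associated with the Stall event no longer appears in the code.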
Thus, the method according to the embodiment of Fig. 4 optimizes the second code based on the incidence relation and improves the execution performance of the second code on the first physical platform.
In the embodiments described above, the incidence relation between instruction sequences and performance deficiency events is first obtained at the first physical platform, and the second code is then optimized on the second platform based on this incidence relation, thereby realizing cross-platform performance optimization. As mentioned above, the incidence relation between instruction sequences and performance deficiency events does not involve concrete instructions or code, and therefore does not reveal information about the code executed on the first platform; transferring such an incidence relation to the second physical platform therefore poses no security risk. At the same time, the incidence relation reflects the hardware characteristics of the first physical platform more fundamentally, so optimizing the code to be executed on the basis of the incidence relation can achieve a good optimization effect. Thus, a technician who wants to optimize a software application developed for a target platform need not execute the software application on the target platform as in the prior art, but can optimize the code of the software application directly on his or her own development platform, which makes the performance optimization process more convenient and efficient.
Based on the same inventive concept, the invention also provides a system for performance optimization, which comprises a device for providing an incidence relation, located on the first physical platform, and a device for executing code optimization, located on the second physical platform. Fig. 9A illustrates a schematic block diagram of the device for providing an incidence relation according to an embodiment of the invention. As shown in Fig. 9A, the device for providing an incidence relation of this embodiment comprises: a sampled data acquiring unit 911, configured to obtain performance sampled data related to the execution of first code, the performance sampled data comprising information about the instructions corresponding to the first code and information about the performance deficiency events corresponding to those instructions; an incidence relation determining unit 912, configured to construct at least one instruction sequence from the performance sampled data and to determine the incidence relation between the at least one instruction sequence and performance deficiency events; and an incidence relation providing unit 913, configured to provide the incidence relation to other physical platforms, for optimizing second code on those other physical platforms based on the incidence relation. Fig. 9B illustrates a schematic block diagram of the device for executing code optimization according to an embodiment of the invention. As shown in Fig. 9B, the device for executing code optimization of this embodiment comprises: an incidence relation acquiring unit 921, configured to obtain the incidence relation provided by the device of Fig. 9A; a deficiency determining unit 922, configured to determine, according to the incidence relation, the performance deficiency events corresponding to the second code; and a code optimization unit 923, configured to optimize the second code based on the determined performance deficiency events.
Specifically, the sampled data acquiring unit 911 obtains, from the performance monitor of the first physical platform, the performance sampled data recorded when the first code is executed. The performance sampled data obtained can be as shown in the example of Fig. 1, and can also include more and/or different instruction information and performance events. Based on the performance sampled data obtained, the incidence relation determining unit 912 analyzes the data, constructs instruction sequences from them, and mines the rules by which instruction sequences cause performance deficiencies, that is, determines the incidence relation between instruction sequences and performance deficiency events. In one embodiment, the incidence relation determining unit 912 obtains the incidence relation by counting the occurrences of instruction sequences and of performance events. In another embodiment, the incidence relation determining unit 912 further screens the instruction sequences, thereby saving part of the statistical computation. In yet another embodiment, the incidence relation determining unit 912 clusters and analyzes the instruction sequences based on information entropy, so as to obtain the association between instruction sequences and performance deficiency events. The incidence relation obtained can be as in the example shown in Fig. 7, and can also be stored and presented in other formats. Further, the incidence relation providing unit 913 provides the incidence relation in one of several ways to other physical platforms, for example the second physical platform, for executing code optimization on those other physical platforms based on the incidence relation.
On this basis, the incidence relation acquiring unit 921 in the second physical platform reads the incidence relation via any of various communication means. Then, the deficiency determining unit 922 compares and matches the instructions corresponding to the second code against the incidence relation, and determines the performance deficiency events corresponding to the second code. The code optimization unit 923 can then optimize the second code based on the determined performance deficiency events, eliminating or reducing the performance deficiency events that the second code may cause, thereby optimizing the execution performance of the second code on the first physical platform.
For the concrete manner in which the above units operate, reference can be made to the detailed description given earlier in connection with the method flows and specific examples, which is not repeated here.
The performance optimization method and system described above can be realized using a computing system. Figure 10 shows a block diagram of an exemplary computer system 100 suitable for realizing embodiments of the invention. As shown, the computer system 100 can comprise: a CPU (central processing unit) 101, RAM (random access memory) 102, ROM (read-only memory) 103, a system bus 104, a hard disk controller 105, a keyboard controller 106, a serial interface controller 107, a parallel interface controller 108, a display controller 109, a hard disk 110, a keyboard 111, a serial external device 112, a parallel external device 113 and a display 114. Among these devices, the CPU 101, RAM 102, ROM 103, hard disk controller 105, keyboard controller 106, serial interface controller 107, parallel interface controller 108 and display controller 109 are coupled to the system bus 104. The hard disk 110 is coupled to the hard disk controller 105, the keyboard 111 to the keyboard controller 106, the serial external device 112 to the serial interface controller 107, the parallel external device 113 to the parallel interface controller 108, and the display 114 to the display controller 109. It should be understood that the structural block diagram of Figure 10 is shown for the purpose of example only and does not limit the scope of the invention. In some cases, devices can be added or removed as circumstances require.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the invention. In this regard, each block in a flowchart or block diagram can represent a module, a program segment or a portion of code, which comprises one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks can occur in an order different from that noted in the drawings. For example, two blocks shown in succession can in fact be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Although the method, system and units of the invention have been described in detail above in connection with specific embodiments, the invention is not limited thereto. Those of ordinary skill in the art can, under the teaching of the specification, make various changes, substitutions and modifications to the invention without departing from its spirit and scope. It should be understood that all such changes, substitutions and modifications still fall within the protection scope of the invention. The protection scope of the invention is defined by the appended claims.

Claims (20)

1. A method of providing an incidence relation for code optimization, comprising:
obtaining performance sampled data related to the execution of first code, the performance sampled data comprising information about the instructions corresponding to said first code and information about the performance deficiency events corresponding to said instructions;
constructing at least one instruction sequence from said performance sampled data, and determining an incidence relation between said at least one instruction sequence and performance deficiency events; and
providing said incidence relation to other physical platforms, for optimizing second code on those other physical platforms based on said incidence relation.
2. The method of claim 1, wherein said performance sampled data are formed from a sampling log generated when the first code is executed.
3. The method of claim 1, wherein determining the incidence relation between the at least one instruction sequence and the performance deficiency event comprises:
counting the number of occurrences of the at least one instruction sequence;
counting the number of occurrences of the performance deficiency event corresponding to the at least one instruction sequence; and
determining the incidence relation between the at least one instruction sequence and the performance deficiency event based on the ratio between the counted number of occurrences of the instruction sequence and the counted number of occurrences of the performance deficiency event.
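The counting steps of this claim can be sketched in Python as follows. The function name, the input layout (pairs of an instruction sequence and an optional event), and the direction of the ratio (event occurrences divided by sequence occurrences) are assumptions of this example; the claim only states that a ratio of the two counts is used.

```python
from collections import Counter


def association_by_ratio(observations):
    """Map (sequence, event) to event_count / sequence_count.

    observations: iterable of (sequence, event_or_None) pairs, where a
    sequence is a tuple of opcodes (hypothetical layout).
    """
    seq_counts = Counter()
    event_counts = Counter()
    for seq, event in observations:
        seq_counts[seq] += 1                  # occurrences of the sequence
        if event is not None:
            event_counts[(seq, event)] += 1   # occurrences of the event
    return {key: n / seq_counts[key[0]] for key, n in event_counts.items()}


hot = ("load", "add", "store")
observations = [
    (hot, "cache_miss"),
    (hot, None),
    (hot, "cache_miss"),
    (hot, "cache_miss"),
    (("add", "mul", "ret"), None),
]
relation = association_by_ratio(observations)
```

Here the sequence occurs four times and is accompanied by the event three times, so the computed strength is 0.75.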
4. The method of claim 3, further comprising: selecting frequently occurring instruction sequences according to the number of occurrences of the at least one instruction sequence, wherein counting the number of occurrences of the performance deficiency event corresponding to the at least one instruction sequence specifically comprises counting only the number of occurrences of the performance deficiency events corresponding to the frequently occurring instruction sequences.
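A minimal Python sketch of this filtering step is given below. The cut-off `min_count` and the function name are hypothetical; the claim does not say how "frequently occurring" is decided, so a simple occurrence threshold is assumed here.

```python
from collections import Counter


def association_for_frequent(observations, min_count=2):
    """Compute event/sequence ratios only for frequently occurring sequences.

    min_count is an assumed threshold for "frequent"; rare sequences are
    skipped before any event counting takes place.
    """
    observations = list(observations)
    seq_counts = Counter(seq for seq, _ in observations)
    frequent = {seq for seq, n in seq_counts.items() if n >= min_count}
    event_counts = Counter(
        (seq, event)
        for seq, event in observations
        if seq in frequent and event is not None
    )
    return {key: n / seq_counts[key[0]] for key, n in event_counts.items()}


seq_a = ("load", "add", "store")
seq_b = ("div", "mul", "ret")
observed = [(seq_a, "cache_miss"), (seq_a, None), (seq_b, "branch_mispredict")]
frequent_relation = association_for_frequent(observed, min_count=2)
```

The second sequence appears only once, so its event is ignored even though it was sampled.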
5. The method of claim 1, wherein determining the incidence relation between the at least one instruction sequence and the performance deficiency event comprises:
selecting instruction sequences with a higher frequency of occurrence;
clustering the instruction sequences based on information entropy;
selecting, in combination with statistics of the performance deficiency events, clusters having discriminative power; and
further dividing the selected sequence clusters until a predetermined condition is reached.
6. The method of claim 5, wherein the predetermined condition comprises at least one of the following: the number of instruction sequences in a divided group is less than a first threshold; the degree of association between the instruction sequences and the performance deficiency event reaches a second threshold.
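One possible reading of the entropy-based division in claims 5 and 6 is a decision-tree-style split, sketched below in Python. This is an assumption-laden illustration, not the patented procedure: the choice to split on the opcode position with the lowest conditional entropy, the parameter names `min_size` and `min_assoc`, and the equal-length sequences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def split_clusters(items, min_size=2, min_assoc=0.9, used=frozenset()):
    """Recursively divide (sequence, has_event) items into clusters.

    Division stops when a group is smaller than min_size (first threshold),
    when its degree of association reaches min_assoc (second threshold),
    or when no split lowers the entropy. Returns (cluster, association) pairs.
    """
    labels = [hit for _, hit in items]
    assoc = sum(labels) / len(labels)
    positions = [i for i in range(len(items[0][0])) if i not in used]
    if len(items) < min_size or assoc >= min_assoc or not positions:
        return [(items, assoc)]

    def conditional_entropy(pos):
        groups = defaultdict(list)
        for seq, hit in items:
            groups[seq[pos]].append(hit)
        return sum(len(g) / len(items) * entropy(g) for g in groups.values())

    best = min(positions, key=conditional_entropy)
    if conditional_entropy(best) >= entropy(labels):
        return [(items, assoc)]  # no information gain, keep the group whole

    groups = defaultdict(list)
    for seq, hit in items:
        groups[seq[best]].append((seq, hit))
    result = []
    for group in groups.values():
        result.extend(split_clusters(group, min_size, min_assoc, used | {best}))
    return result


items = [
    (("load", "add"), True), (("load", "mul"), True), (("load", "add"), True),
    (("store", "add"), False), (("store", "mul"), False),
]
clusters = split_clusters(items)
```

In this toy input the first opcode separates the samples perfectly, so the division yields one cluster fully associated with the event and one cluster with no association.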
7. A method for performing code optimization, comprising:
obtaining an incidence relation provided by the method of any one of claims 1-6;
determining, according to the incidence relation, a performance deficiency event corresponding to second code; and
optimizing the second code based on the determined performance deficiency event.
8. The method of claim 7, wherein obtaining the incidence relation comprises: obtaining the incidence relation via an incidence relation sharing platform.
9. The method of claim 7, wherein determining the performance deficiency event corresponding to the second code comprises:
scanning the second code to generate an instruction sequence corresponding to the second code;
matching the generated instruction sequence against the obtained incidence relation; and
determining, according to the incidence relation, the performance deficiency event corresponding to the matched instruction sequence.
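The scan-and-match steps can be illustrated with the Python sketch below. It assumes the incidence relation is a dictionary from (sequence, event) to a strength value and that the second code is available as a list of opcodes; the window length and the `threshold` parameter are hypothetical additions for this example.

```python
def predict_deficiencies(code, relation, length=3, threshold=0.5):
    """Scan code with a sliding window and match it against the relation.

    code: list of opcodes of the second code (hypothetical representation).
    relation: {(sequence, event): strength}.
    Returns (start_index, sequence, event, strength) for every match whose
    strength is at least the assumed threshold.
    """
    by_sequence = {}
    for (seq, event), strength in relation.items():
        by_sequence.setdefault(seq, []).append((event, strength))

    findings = []
    for start in range(len(code) - length + 1):
        window = tuple(code[start:start + length])
        for event, strength in by_sequence.get(window, []):
            if strength >= threshold:
                findings.append((start, window, event, strength))
    return findings


shared_relation = {(("load", "add", "store"), "cache_miss"): 0.75}
second_code = ["mul", "load", "add", "store", "ret"]
findings = predict_deficiencies(second_code, shared_relation)
```

Because the relation is keyed by instruction sequence alone, the second code does not have to be executed or sampled to obtain the prediction.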
10. The method of claim 7, wherein optimizing the second code comprises eliminating or reducing the performance deficiency event in at least one of the following ways:
adjusting the execution order of the second code;
adding additional instruction code to the second code.
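The second option, adding instruction code, might look like the following Python sketch. The mapping from event to mitigating instruction (for example, a prefetch before a sequence associated with cache misses) is purely hypothetical and chosen for illustration; the claim does not name specific instructions.

```python
def insert_mitigations(code, findings, mitigations):
    """Insert an extra instruction before each flagged position.

    code: list of opcodes of the second code.
    findings: list of (start_index, event) pairs for matched sequences.
    mitigations: {event: opcode to insert}, a hypothetical lookup table.
    """
    optimized = list(code)
    # Work from the highest index down so earlier indices stay valid.
    for start, event in sorted(findings, reverse=True):
        extra = mitigations.get(event)
        if extra is not None:
            optimized.insert(start, extra)
    return optimized


original = ["mul", "load", "add", "store", "ret"]
patched = insert_mitigations(
    original,
    findings=[(1, "cache_miss")],
    mitigations={"cache_miss": "prefetch"},
)
```

Adjusting the execution order, the first option, would additionally require dependency analysis between instructions and is not shown here.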
11. A device for providing an incidence relation for code optimization, comprising:
a sampling data acquiring unit, configured to obtain performance sampling data related to the execution of first code, the performance sampling data comprising information on instructions corresponding to the first code and information on performance deficiency events corresponding to the instructions;
an incidence relation determining unit, configured to construct at least one instruction sequence according to the performance sampling data, and to determine an incidence relation between the at least one instruction sequence and a performance deficiency event; and
an incidence relation providing unit, configured to provide the incidence relation to other physical platforms, so that optimization of second code is performed on the other physical platforms based on the incidence relation.
12. The device of claim 11, wherein the performance sampling data is formed from a sampling log generated when the first code is executed.
13. The device of claim 11, wherein the incidence relation determining unit is configured to:
count the number of occurrences of the at least one instruction sequence;
count the number of occurrences of the performance deficiency event corresponding to the at least one instruction sequence; and
determine the incidence relation between the at least one instruction sequence and the performance deficiency event based on the ratio between the counted number of occurrences of the instruction sequence and the counted number of occurrences of the performance deficiency event.
14. The device of claim 13, wherein the incidence relation determining unit is further configured to: select frequently occurring instruction sequences according to the number of occurrences of the at least one instruction sequence, wherein counting the number of occurrences of the performance deficiency event corresponding to the at least one instruction sequence specifically comprises counting only the number of occurrences of the performance deficiency events corresponding to the frequently occurring instruction sequences.
15. The device of claim 11, wherein the incidence relation determining unit is configured to:
select instruction sequences with a higher frequency of occurrence;
cluster the instruction sequences based on information entropy;
select, in combination with statistics of the performance deficiency events, clusters having discriminative power; and
further divide the selected sequence clusters until a predetermined condition is reached.
16. The device of claim 15, wherein the predetermined condition comprises at least one of the following: the number of instruction sequences in a divided group is less than a first threshold; the degree of association between the instruction sequences and the performance deficiency event reaches a second threshold.
17. A device for performing code optimization, comprising:
an incidence relation acquiring unit, configured to obtain an incidence relation provided by the device of any one of claims 11-16;
a deficiency determining unit, configured to determine, according to the incidence relation, a performance deficiency event corresponding to second code; and
a code optimization unit, configured to optimize the second code based on the determined performance deficiency event.
18. The device of claim 17, wherein the incidence relation acquiring unit is configured to: obtain the incidence relation via an incidence relation sharing platform.
19. The device of claim 17, wherein the deficiency determining unit is configured to:
scan the second code to generate an instruction sequence corresponding to the second code;
match the generated instruction sequence against the obtained incidence relation; and
determine, according to the incidence relation, the performance deficiency event corresponding to the matched instruction sequence.
20. The device of claim 17, wherein the code optimization unit is configured to eliminate or reduce the performance deficiency event in at least one of the following ways:
adjusting the execution order of the second code;
adding additional instruction code to the second code.
CN201110252353.7A 2011-08-30 2011-08-30 Method and apparatus for providing an incidence relation and performing code optimization Expired - Fee Related CN102955712B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201110252353.7A CN102955712B (en) 2011-08-30 2011-08-30 There is provided incidence relation and the method and apparatus of run time version optimization
DE102012214672A DE102012214672A1 (en) 2011-08-30 2012-08-17 Providing a mapping relation and performing code optimization
GB1215035.5A GB2494268A (en) 2011-08-30 2012-08-23 Performing code optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110252353.7A CN102955712B (en) 2011-08-30 2011-08-30 Method and apparatus for providing an incidence relation and performing code optimization

Publications (2)

Publication Number Publication Date
CN102955712A true CN102955712A (en) 2013-03-06
CN102955712B CN102955712B (en) 2016-02-03

Family

ID=47045287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110252353.7A Expired - Fee Related CN102955712B (en) 2011-08-30 2011-08-30 Method and apparatus for providing an incidence relation and performing code optimization

Country Status (3)

Country Link
CN (1) CN102955712B (en)
DE (1) DE102012214672A1 (en)
GB (1) GB2494268A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019047795A1 (en) * 2017-09-07 2019-03-14 阿里巴巴集团控股有限公司 Method and apparatus for detecting model security and electronic device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120353719A (en) * 2025-06-26 2025-07-22 中国科学院软件研究所 Cross-architecture comparison-based software performance defect mining method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7039910B2 (en) * 2001-11-28 2006-05-02 Sun Microsystems, Inc. Technique for associating execution characteristics with instructions or operations of program code
US20090055636A1 (en) * 2007-08-22 2009-02-26 Heisig Stephen J Method for generating and applying a model to predict hardware performance hazards in a machine instruction sequence
CN101727335A (en) * 2008-10-31 2010-06-09 国际商业机器公司 Installation method for binary code program and system
CN102099786A (en) * 2008-07-22 2011-06-15 松下电器产业株式会社 Program optimization method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002024052A (en) * 2000-07-03 2002-01-25 Matsushita Electric Ind Co Ltd Error reproduction test method for computer peripherals
US8640114B2 (en) * 2006-09-07 2014-01-28 Oracle America, Inc. Method and apparatus for specification and application of a user-specified filter in a data space profiler
US20070006037A1 (en) * 2005-06-29 2007-01-04 Microsoft Corporation Automated test case result analyzer
US8621468B2 (en) * 2007-04-26 2013-12-31 Microsoft Corporation Multi core optimizations on a binary using static and run time analysis
US20090113403A1 (en) * 2007-09-27 2009-04-30 Microsoft Corporation Replacing no operations with auxiliary code

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019047795A1 (en) * 2017-09-07 2019-03-14 阿里巴巴集团控股有限公司 Method and apparatus for detecting model security and electronic device
US10691794B2 (en) 2017-09-07 2020-06-23 Alibaba Group Holding Limited Method, apparatus, and electronic device for detecting model security

Also Published As

Publication number Publication date
GB201215035D0 (en) 2012-10-10
DE102012214672A1 (en) 2013-02-28
CN102955712B (en) 2016-02-03
GB2494268A (en) 2013-03-06

Similar Documents

Publication Publication Date Title
Tang et al. Quest: Query-aware sparsity for efficient long-context llm inference
US11106685B2 (en) Method to rank documents by a computer, using additive ensembles of regression trees and cache optimisation, and search engine using such a method
US9886384B2 (en) Cache control device for prefetching using pattern analysis processor and prefetch instruction and prefetching method using cache control device
US20090177642A1 (en) Method and system for automated detection of application performance bottlenecks
CN110334036A (en) A kind of method and apparatus for realizing data cached scheduling
EP4369207A1 (en) Compilation optimization method for program source code, and related product
JP5373870B2 (en) Prediction device, prediction method, and program
EP1475705A2 (en) Optimizing cache efficiency within application software
JPWO2012020456A1 (en) Time-series data processing apparatus and method
JP2012113706A (en) Computer-implemented method, computer program, and data processing system for optimizing database query
JP2020500368A (en) Data prefetching method, apparatus, and system
CN107145446B (en) Application program APP test method, device and medium
CN117370058A (en) Service processing method, device, electronic equipment and computer readable medium
CN102789377B (en) The method and apparatus of processing instruction grouping information
CN102955712B (en) Method and apparatus for providing an incidence relation and performing code optimization
CN113312619A (en) Malicious process detection method and device based on small sample learning, electronic equipment and storage medium
US10120666B2 (en) Conditional branch instruction compaction for regional code size reduction
US9575868B2 (en) Processor stressmarks generation
US10853217B2 (en) Performance engineering platform using probes and searchable tags
CN114461590B (en) A database file page pre-fetching method and device based on association rules
US7350025B2 (en) System and method for improved collection of software application profile data for performance optimization
Sheluhin et al. Forecasting of Computer Network Anomalous States Based on Sequential Pattern Analysis of “Historical Data”
JP5064825B2 (en) Buffer cache device, buffer cache method and program
JP2009026029A (en) Transaction control apparatus, transaction control method, transaction control program, and storage medium storing the program
US8943177B1 (en) Modifying a computer program configuration based on variable-bin histograms

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160203

Termination date: 20200830