CN101582043A

CN101582043A - Dynamic task allocation method of heterogeneous computing system

Info

Publication number: CN101582043A
Application number: CNA2008100375632A
Authority: CN
Inventors: 郑骏; 胡文心; 蔡建华
Original assignee: East China Normal University
Current assignee: East China Normal University
Priority date: 2008-05-16
Filing date: 2008-05-16
Publication date: 2009-11-18

Abstract

The invention relates to a dynamic task allocation method for a heterogeneous computing system. The heterogeneous computing system consists of a group of heterogeneous processors that cooperate to complete an application task. The task is decomposed into a group of parallel subtasks, which are dispatched to the processors in execution order. The method achieves dynamic, optimized allocation of tasks in the heterogeneous computing system by having each processor dynamically select the tasks it will process. Compared with the prior art, the invention treats the optimal scheduling problem of heterogeneous computing systems dynamically and uses swarm intelligence to propose an optimized scheduling method that allocates tasks dynamically according to the processing capabilities of the different processors in the system and accounts for the competition that may arise among them. Because the invention dynamically considers both the computing capability and the load of each processor, it can further shorten the execution time of the heterogeneous computing system.

Description

Dynamic task allocation method for a heterogeneous computing system

Technical field

The present invention relates to optimized scheduling for heterogeneous computing systems, and in particular to a dynamic task allocation method for a heterogeneous computing system.

Background technology

A heterogeneous computing system (HCS) uses a group of heterogeneous computers working together to complete an application task. The task is decomposed into multiple parallel subtasks, which are dispatched to the processors of the HCS in execution order. This not only serves different types of applications but also exploits the computing power of the various machines in the system, so that the system achieves higher performance. It used to be assumed that adding more machines would improve system performance. In many cases, however, using more machines does not produce the expected effect: even with a large number of machines, differences in the computing power of the various machine types cause subtasks to wait for one another during computation, which prolongs the execution time of the whole task and lowers system performance. For this reason, attention has turned to how tasks can be allocated reasonably so that a heterogeneous computing system performs well.

Summary of the invention

The technical problem to be solved by the invention is to provide a dynamic task allocation method for a heterogeneous computing system that overcomes the above defects of the prior art.

The purpose of the invention can be achieved by the following technical solution: a dynamic task allocation method for a heterogeneous computing system, in which the heterogeneous computing system completes an application task through the cooperation of a group of heterogeneous processors, decomposing the task into a group of parallel subtasks and dispatching them to the processors in execution order, characterized in that the method comprises the following steps:

a. set a queue entry threshold and a stop queue entry threshold for each processor;

b. allocate tasks to each processor, with the following steps:

(b1) calculate the length value of the waiting queue on the processor;

(b2) judge whether this length value is less than the queue entry threshold of the processor; if so, perform step (b3);

(b3) calculate the probability that each subtask in the subtask group is assigned to this processor;

(b4) add the subtask with the highest probability to the waiting queue of this processor, and delete this subtask from the subtask group;

(b5) calculate the length value of the waiting queue on this processor;

(b6) judge whether this length value is less than the stop queue entry threshold of the processor; if so, return to step (b3); if not, return to step (b1);

c. when multiple processors compete for the same subtask, calculate the load intensity of these processors and assign the subtask to the processor with the smallest load intensity.

Compared with the prior art, the invention addresses the optimal scheduling problem of heterogeneous computing systems from a dynamic perspective. Using swarm intelligence, it proposes an optimized scheduling method for heterogeneous computing systems that allocates tasks dynamically according to the processing capabilities of the different processors in the system and accounts for the competition that may arise. Because the invention dynamically considers the computing capability and the load of each processor, it can further shorten the execution time of the heterogeneous computing system.

Description of drawings

Fig. 1 is a flow chart of the present invention;

Fig. 2 is a schematic comparison of execution times when one desktop computer is used in an embodiment of the invention;

Fig. 3 is a schematic comparison of execution times when six desktop computers are used in an embodiment of the invention;

Fig. 4 is a schematic comparison of execution times under the different test environments used in an embodiment of the invention.

Embodiment

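The invention is further described below with reference to the accompanying drawings.

As shown in Figs. 1 to 4, in a dynamic task allocation method for a heterogeneous computing system, the heterogeneous computing system completes an application task through the cooperation of a group of heterogeneous processors, decomposing the task into a group of parallel subtasks and dispatching them to the processors in execution order. The method comprises the following steps:

a. set a queue entry threshold and a stop queue entry threshold for each processor;

b. allocate tasks to each processor, with the following steps:

(b1) calculate the length value of the waiting queue on the processor;

(b2) judge whether this length value is less than the queue entry threshold of the processor; if so, perform step (b3);

(b3) calculate the probability that each subtask in the subtask group is assigned to this processor;

(b4) add the subtask with the highest probability to the waiting queue of this processor, and delete this subtask from the subtask group;

(b5) calculate the length value of the waiting queue on this processor;

(b6) judge whether this length value is less than the stop queue entry threshold of the processor; if so, return to step (b3); if not, return to step (b1);

c. when multiple processors compete for the same subtask, calculate the load intensity of these processors and assign the subtask to the processor with the smallest load intensity.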

The heterogeneous computing system consists of a set of different processors $P = \{p_1, p_2, p_3, \ldots, p_n\}$, and the task is decomposed into a set of subtasks $T = \{t_1, t_2, t_3, \ldots, t_m\}$. These tasks are assumed to be mutually independent, and in general $n < m$. The problem to be solved is how to distribute the subtasks reasonably according to the different processing capabilities of the processors, so that the time to process task $T$ is as short as possible and the performance advantage of the heterogeneous computing system is realized.

Parameter definitions

The priority of task $t_i$ is denoted $\beta_i$. Because the system is heterogeneous, $V(t_i, p_j)$ is defined as the running time of task $t_i$ on processor $p_j$, and $\mathrm{Tp}(t_i, p_j)$ as the transmission time needed when task $t_i$ is sent over the network to processor $p_j$ for execution. $\mathrm{Comp}(t_i, p_j)$ denotes the completion time of task $t_i$ on $p_j$, so

$$\mathrm{Comp}(t_i, p_j) = V(t_i, p_j) + \mathrm{Tp}(t_i, p_j) \qquad (1)$$

$W(t_i)$ denotes the time from when task $t_i$ is allocated to a processor until it completes, i.e. the waiting time of $t_i$. $L(P, p_j)$, written $L(p_j)$ for short, denotes the length of the queue of tasks waiting on processor $p_j$, where the set $P$ here denotes the tasks queued on $p_j$. Then

$$L(p_j) = \sum_{t_i \in P} \mathrm{Comp}(t_i, p_j) \qquad (2)$$

$\Gamma_j$ denotes the time processor $p_j$ takes to finish all of its tasks, and the set $S$ denotes all tasks handled by processor $p_j$. Then

$$\Gamma_j = \sum_{t_i \in S} \mathrm{Comp}(t_i, p_j) \qquad (3)$$

$\Gamma$ is defined as the completion time of the whole task. The goal is to minimize $\Gamma$, that is

$$\Gamma = \min\{\max_{1 \le j \le n} \Gamma_j\} \qquad (4)$$
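Formulas (1), (3) and (4) can be illustrated with a short Python sketch. The processor names, task names and timing values below are hypothetical and chosen only for illustration; the sketch compares two fixed candidate assignments rather than searching all of them.

```python
from typing import Dict, List

# Hypothetical run times V[p][t] and transfer times TP[p][t], in seconds.
V: Dict[str, Dict[str, float]] = {
    "p1": {"t1": 4.0, "t2": 6.0, "t3": 2.0},
    "p2": {"t1": 8.0, "t2": 3.0, "t3": 5.0},
}
TP: Dict[str, Dict[str, float]] = {
    "p1": {"t1": 1.0, "t2": 1.0, "t3": 1.0},
    "p2": {"t1": 0.5, "t2": 0.5, "t3": 0.5},
}


def comp(task: str, proc: str) -> float:
    """Formula (1): Comp = V + Tp."""
    return V[proc][task] + TP[proc][task]


def finish_time(proc: str, tasks: List[str]) -> float:
    """Formula (3): Gamma_j, the sum of Comp over the tasks handled by proc."""
    return sum(comp(t, proc) for t in tasks)


def makespan(assignment: Dict[str, List[str]]) -> float:
    """max over j of Gamma_j for one assignment; formula (4) minimizes this."""
    return max(finish_time(p, ts) for p, ts in assignment.items())


# Two hypothetical candidate assignments of the three tasks.
plan_a = {"p1": ["t1", "t3"], "p2": ["t2"]}
plan_b = {"p1": ["t1"], "p2": ["t2", "t3"]}

# The better plan is the one whose slowest processor finishes first.
best = min((plan_a, plan_b), key=makespan)
```

With these values, `plan_a` finishes in 8.0 s and `plan_b` in 9.0 s, so `plan_a` is the better of the two.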

Dynamic task allocation model

Task allocation in a heterogeneous computing system can follow the way a bee colony allocates tasks: a processor corresponds to a bee, a task corresponds to a job performed by bees, such as foraging or caring for offspring, and the allocation of tasks to processors corresponds to the allocation of jobs among bees. The model of information exchange between bees and their environment can therefore be applied to task scheduling in a heterogeneous computing system to achieve dynamic, adaptive task allocation.

In the bee colony algorithm for task allocation in a heterogeneous computing system, the waiting queue of each processor is given a queue entry threshold $L_{in}$ and a stop queue entry threshold $L_{stop}$. When the length of the processor's waiting queue is less than or equal to $L_{in}$, tasks are selected from the unallocated tasks and added to the processor's waiting queue; when the length of the waiting queue is greater than or equal to $L_{stop}$, no more tasks are added to it.

Each processor $p_j$ has a corresponding response threshold for every task in the system that has not yet been allocated a resource. These thresholds are represented by an $m \times n$ matrix:

$$A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{m1} & \alpha_{m2} & \cdots & \alpha_{mn} \end{pmatrix}$$

Here $\alpha_{ij}$ denotes the response threshold of processor $p_j$ for task $t_i$. It depends on the execution time $V(t_i, p_j)$ of the task on processor $p_j$ and on the transmission time $\mathrm{Tp}(t_i, p_j)$ needed to transfer the task to processor $p_j$:

$$\alpha_{ij} = \zeta + u \times \mathrm{Tp}(t_i, p_j) + l \times V(t_i, p_j) \qquad (5)$$

where $\zeta$, $u$ and $l$ are constants.

Formula (5) shows that the longer the task execution time and data transmission time, the larger the response threshold and the less likely the processor is to accept the task; the shorter the task execution time and data transmission time, the smaller the response threshold and the more likely the processor is to accept the task.
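The behaviour of formula (5) can be checked with a minimal Python sketch. The source states only that ζ, u and l are constants; the default values used here are assumptions picked for illustration, not values from the patent.

```python
def response_threshold(tp: float, v: float,
                       zeta: float = 1.0, u: float = 0.5, l: float = 1.0) -> float:
    """Formula (5): alpha_ij = zeta + u * Tp(t_i, p_j) + l * V(t_i, p_j).

    The defaults for zeta, u and l are illustrative assumptions.
    """
    return zeta + u * tp + l * v


# Same transfer time, different execution times (hypothetical, in seconds).
short_task = response_threshold(tp=2.0, v=4.0)
long_task = response_threshold(tp=2.0, v=10.0)
```

With positive u and l, the longer-running task gets the higher threshold (12.0 against 6.0), which makes the processor less inclined to accept it.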

Likewise, each task that is still unallocated sends a stimulus signal to the available processors, represented by the vector $B = (\theta_1, \theta_2, \ldots, \theta_n)$, where $\theta_i$ is the stimulus signal sent by task $t_i$. It depends on the task's waiting time $W(t_i)$ and priority $\beta_i$:

$$\theta_i = \beta_i + h \times W(t_i) \qquad (6)$$

where $h$ is a constant.

Formula (6) shows that the longer the waiting time $W(t_i)$ of a task and the higher its priority $\beta_i$, the stronger the stimulus signal the task sends, and the sooner it is selected into a processor's queue.
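Formula (6) can be sketched in Python in the same way. The value of the constant h is not given in the source, so the default below is an assumption for illustration.

```python
def stimulus(priority: float, waiting_time: float, h: float = 0.5) -> float:
    """Formula (6): theta_i = beta_i + h * W(t_i).

    The default for h is an illustrative assumption.
    """
    return priority + h * waiting_time


# The same task (hypothetical priority 2.0) before and after waiting 10 s.
fresh = stimulus(priority=2.0, waiting_time=0.0)
stale = stimulus(priority=2.0, waiting_time=10.0)
```

Here the signal grows from 2.0 to 7.0 as the task waits, so a task that has been passed over becomes steadily harder to ignore.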

During task allocation, tasks can therefore be selected into a processor's waiting queue according to each processor's response thresholds and the stimulus signals sent by the unfinished tasks. The selection probability is

$$P(\alpha_{ij}, \theta_i) = \frac{\theta_i^2}{\alpha_{ij}^2 + \theta_i^2} \qquad (7)$$

From formula (7) it follows that the stronger the stimulus signal sent by task $t_i$, or the lower the response threshold of processor $p_j$, the more likely the task is to be allocated to that processor.
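A few numeric cases make formula (7) concrete. The Python sketch below uses hypothetical threshold and stimulus values; the guard against both inputs being zero is an addition of this sketch, not something the source discusses.

```python
def selection_probability(alpha: float, theta: float) -> float:
    """Formula (7): P = theta^2 / (alpha^2 + theta^2)."""
    if alpha == 0 and theta == 0:
        # Guard added for this sketch; the formula is undefined here.
        raise ValueError("alpha and theta cannot both be zero")
    return theta ** 2 / (alpha ** 2 + theta ** 2)


weak = selection_probability(alpha=4.0, theta=3.0)    # stimulus below threshold
equal = selection_probability(alpha=4.0, theta=4.0)   # stimulus equals threshold
strong = selection_probability(alpha=1.0, theta=4.0)  # stimulus above threshold
```

The probability is exactly 0.5 when the stimulus equals the threshold, falls below 0.5 when the threshold dominates (9/25 in the first case) and approaches 1 when the stimulus dominates (16/17 in the last case).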

The steps by which each processor selects tasks are as follows:

Step1. Calculate the length $L$ of the waiting queue on this processor; if $L < L_{in}$, continue;

Step2. For all tasks to be scheduled, calculate the stimulus signal $\theta_i$ according to formula (6), the response threshold $\alpha_{ij}$ of this processor according to formula (5), and the probability $P(\alpha_{ij}, \theta_i)$ that the task is selected into this processor according to formula (7);

Step3. Select the pair $(t_i, p_j)$ with the largest probability $P(\alpha_{ij}, \theta_i)$, add $t_i$ to the waiting queue of this processor, and delete $t_i$ from $T$;

Step4. Calculate the length $L$ of the waiting queue of this processor; if $L < L_{stop}$, go to Step2, otherwise go to Step1.
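Step1 to Step4 can be sketched as a single pass in Python. This is a minimal illustration under several assumptions: the function and variable names are hypothetical, the thresholds and stimuli are passed in as fixed values rather than recomputed on every iteration, queue length is measured as in formula (2), ties are broken by task name, and the pass simply returns where the source says to go back to Step1.

```python
from typing import Dict, List, Set, Tuple


def probability(alpha: float, theta: float) -> float:
    """Formula (7)."""
    return theta ** 2 / (alpha ** 2 + theta ** 2)


def fill_queue(proc: str, queue: List[str], unassigned: Set[str],
               alpha: Dict[Tuple[str, str], float], theta: Dict[str, float],
               comp: Dict[Tuple[str, str], float],
               l_in: float, l_stop: float) -> List[str]:
    """One pass of Step1 to Step4 for a single processor (illustrative)."""
    def length() -> float:
        # Queue length as in formula (2): total Comp of the queued tasks.
        return sum(comp[(t, proc)] for t in queue)

    if length() >= l_in:                      # Step1: queue not short enough
        return queue
    while unassigned and length() < l_stop:   # Step4 test before each pick
        # Step2 and Step3: choose the task with the highest probability.
        best = max(sorted(unassigned),
                   key=lambda t: probability(alpha[(t, proc)], theta[t]))
        queue.append(best)
        unassigned.remove(best)
    return queue


# Hypothetical data for one processor "p1" and three unallocated tasks.
theta = {"t1": 3.0, "t2": 1.0, "t3": 2.0}
alpha = {("t1", "p1"): 4.0, ("t2", "p1"): 4.0, ("t3", "p1"): 1.0}
comp = {("t1", "p1"): 5.0, ("t2", "p1"): 5.0, ("t3", "p1"): 2.0}
pending = {"t1", "t2", "t3"}
result = fill_queue("p1", [], pending, alpha, theta, comp, l_in=3.0, l_stop=6.0)
```

In this run the processor first takes `t3` (probability 0.8), then `t1` (probability 0.36), and stops because the queue length has reached 7.0, which is not below the stop threshold of 6.0; `t2` remains unallocated for other processors.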

In practice, several processors may compete for the same task. Document [1] uses the social-hierarchy principle of bee colony self-organization to resolve competition among multiple bees: each competing bee is assigned a strength value, and the smaller the strength value, the greater its probability of winning. Here, competing processors are compared by their load intensity. The intensity $W_j$ of processor $p_j$ is defined as the sum of the length of the pending queue on the processor and the completion time of the contested task $t_i$ on the processor, that is

$$W_j = L(p_j) + \mathrm{Comp}(t_i, p_j) \qquad (8)$$

When competition occurs, the processor with the smallest intensity value is chosen.
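The competition rule of formula (8) can be sketched in a few lines of Python. The names and numbers are hypothetical, and breaking ties by processor name is an assumption of this sketch; the source does not say how ties are resolved.

```python
from typing import Dict, Iterable


def load_intensity(queue_length: float, comp_time: float) -> float:
    """Formula (8): W_j = L(p_j) + Comp(t_i, p_j)."""
    return queue_length + comp_time


def resolve_competition(contenders: Iterable[str],
                        queue_length: Dict[str, float],
                        comp_time: Dict[str, float]) -> str:
    """Return the contender with the smallest load intensity.

    Ties go to the alphabetically first processor (an assumption).
    """
    return min(sorted(contenders),
               key=lambda p: load_intensity(queue_length[p], comp_time[p]))


# Hypothetical case: p1 is faster on the task but already heavily loaded.
queue_length = {"p1": 10.0, "p2": 4.0}
comp_time = {"p1": 3.0, "p2": 8.0}
winner = resolve_competition(["p1", "p2"], queue_length, comp_time)
```

Here `p1` has intensity 13.0 and `p2` has 12.0, so the task goes to `p2` even though `p1` would execute it faster in isolation.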

This embodiment tests the method of the invention and compares it with the existing equal-allocation method (which ignores processor capability and load and divides the workload equally among the processors). The tests were run in a PVM (Parallel Virtual Machine) 3.4.3 heterogeneous computing environment. PVM supports automatically loading and running tasks on the virtual machine, and tasks can communicate and synchronize with one another. PVM also lets the user specify the node on which a task is loaded. In the experiment, one DELL PowerEdge 1800 server (CPU: Xeon 3.2 GHz; memory: 1 GB) served as the host. Six desktop computers served as node machines: one with a Pentium 4 1.5 GHz CPU and 1024 MB of memory; three with Pentium 4 866 MHz CPUs and 512 MB of memory; and two with Pentium 3 500 MHz CPUs and 256 MB of memory.

In the simulation experiment, the six node machines were added one at a time, and one group of data was measured each time a node machine was added. The experimental data are shown in Table 1.

Table 1: Comparison of execution times of the two models with different numbers of processors in the heterogeneous system

Fig. 2 compares the execution times when there is only one node machine. In this case the results of the two methods are close: the model described here saves only 23.34 s (3.2%) compared with equal allocation. As node machines are added, the advantage of the model becomes increasingly pronounced. Fig. 3 compares the execution times of the two models when six desktop computers are used: the model saves 101.42 s (17.55%) compared with the equal-allocation method.

Fig. 4 compares the execution times of the two models under the different test environments. Fig. 4 clearly shows that when the number of node machines grows from four to five, the execution time of equal allocation barely decreases, saving only 6.90 s, whereas the execution time of the model described here decreases by 23.6 s. The reason is that the computing power of the fifth node machine differs greatly from that of the first four. The equal-allocation algorithm ignores machine performance and simply divides the tasks equally among the node machines, so the fifth node machine takes longer than the first four to process the same amount of data; as a result, adding a machine does not noticeably reduce the execution time. The algorithm described here, by contrast, considers the processing capability of each node machine and allocates tasks dynamically and reasonably according to processor load, which shortens the completion time of the whole task.

The invention is an application of swarm intelligence. Swarm intelligence refers to the ability of a group made up of many simple individuals to solve problems through simple coordination with one another. Over the past decade, research on applying swarm intelligence to a variety of problems has attracted growing attention. The collective behaviour that emerges in a group of social insects is called swarm intelligence: a colony of social insects jointly solves complex problems by intelligent, distributed means, problems that no individual could solve alone. Based on studies of insect group behaviour, researchers have proposed algorithms and theories for solving combinatorial optimization problems, such as the ant colony algorithm and the bee colony algorithm, which have been applied in many fields. The task scheduling problem in a heterogeneous computing system is itself a combinatorial optimization problem; the parallel subtasks and the processors can each be regarded as simple individuals that reach an optimum by coordinating with one another.

Claims

1. A dynamic task allocation method for a heterogeneous computing system, in which the heterogeneous computing system completes an application task through the cooperation of a group of heterogeneous processors, decomposes the task into a group of parallel subtasks and dispatches them to the processors in execution order, characterized in that the method comprises the following steps:

a. Set a queue entry threshold and a stop queue entry threshold for each processor;

b. Assign tasks to each processor, the execution steps are as follows:

(b1) Calculate the length value of the waiting queue on the processor;

(b2) Judge whether the length value is less than the queue entry threshold of the processor; if so, perform step (b3);

(b3) Calculate the probability that each subtask in the subtask group is assigned to the processor;

(b4) Add the subtask with the highest probability to the waiting queue of the processor, and delete the subtask from the subtask group;

(b5) Calculate the length value of the waiting queue on the processor;

(b6) Judge whether the length value is less than the stop queue entry threshold of the processor; if so, return to step (b3); if not, return to step (b1);

c. When multiple processors compete for the same subtask, calculate the load intensity of the multiple processors, and assign the subtask to the processor with the smallest load intensity.