CN115543554A - Method and device for scheduling calculation jobs and computer readable storage medium - Google Patents
- Publication number: CN115543554A (application CN202211035050.4A)
- Authority
- CN
- China
- Prior art keywords
- computing
- job
- queued
- jobs
- resource
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multi Processors (AREA)
Abstract
The present application discloses a method and device for scheduling computing jobs and a computer-readable storage medium. The queuing reason of a queued job is obtained. When the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the required resource amount of the queued job, the remaining resource amounts of the other computing resource pools in the cluster are obtained. When the remaining resource amount of another computing resource pool is greater than or equal to the required resource amount, and that pool matches the queued job, the queued job is assigned to that pool for computing. This improves the effective utilization of computing resources in the cluster and reduces the queuing time of computing jobs, thereby improving the computing efficiency and utilization of the cluster.
Description
Technical Field
The present application relates to the technical field of computing scheduling, and in particular to a method and device for scheduling computing jobs and a computer-readable storage medium.
Background
High Performance Computing (HPC) refers to computer technology aimed at improving scientific computing capability. HPC simulation computing is a form of parallel computing, in which an application is divided into multiple parts that can execute in parallel and those parts are assigned to multiple processors for execution.
HPC simulation computing relies on scheduling software to manage the computing scheduling of multiple applications (for example, simulation software). However, conventional scheduling software can schedule applications only within a cluster in a single environment (for example, a single computing pool), and therefore struggles to meet the computing resource scheduling needs of complex environments.
Summary of the Invention
The present application is proposed to solve the above technical problem. Embodiments of the present application provide a method and device for scheduling computing jobs and a computer-readable storage medium that solve the above technical problem.
According to one aspect of the present application, a method for scheduling computing jobs is provided, including: obtaining the queuing reason of a queued job, where the queued job denotes a computing job that is queued and waiting to be processed; when the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the required resource amount of the queued job, obtaining the remaining resource amounts of other computing resource pools in the cluster, where the cluster includes the computing resource pool corresponding to the queued job and the other computing resource pools; and when the remaining resource amount of one of the other computing resource pools is greater than or equal to the required resource amount, and that pool matches the queued job, assigning the queued job to that pool for computing.
In one embodiment, obtaining the remaining resource amounts of other computing resource pools in the cluster when the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the required resource amount of the queued job includes:
when the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job, minus a reserved resource amount, is less than the required resource amount of the queued job, obtaining the remaining resource amounts of other computing resource pools in the cluster, where the reserved resource amount denotes the amount of resources held in reserve in a single computing resource pool and is positively correlated with the remaining resource amount of that pool.
In one embodiment, assigning the queued job to another computing resource pool for computing when the remaining resource amount of that pool is greater than or equal to the required resource amount and that pool matches the queued job includes:
when the remaining resource amount of the other computing resource pool, minus the reserved resource amount, is greater than or equal to the required resource amount, and that pool matches the queued job, assigning the queued job to that pool for computing, where the reserved resource amount denotes the amount of resources held in reserve in a single computing resource pool and is positively correlated with the remaining resource amount of that pool.
In one embodiment, the method for scheduling computing jobs further includes:
when the remaining resource amounts of all the other computing resource pools are less than the required resource amount, stopping scheduling of the queued job.
In one embodiment, after scheduling of the queued job is stopped, the method for scheduling computing jobs further includes:
scheduling the computing jobs that follow the queued job.
In one embodiment, the method for scheduling computing jobs further includes:
when the sum of the required resource amounts of the computing jobs of a single user exceeds a per-user resource cap, stopping scheduling of that user's computing jobs, where the per-user resource cap is positively correlated with the remaining resource amount of the corresponding computing resource pool.
In one embodiment, before the queuing reason of the queued job is obtained, the method for scheduling computing jobs further includes:
assigning each computing job to its matching computing resource pool according to the requirements of all computing jobs and the computing characteristics of each computing resource pool in the cluster.
In one embodiment, obtaining the queuing reason of the queued job includes:
when a queued job is waiting in the cluster, obtaining the queuing reason of that queued job.
According to another aspect of the present application, a device for scheduling computing jobs is provided, including: a first obtaining module, configured to obtain the queuing reason of a queued job, where the queued job denotes a computing job that is queued and waiting to be processed; a second obtaining module, configured to obtain the remaining resource amounts of other computing resource pools in the cluster when the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the required resource amount of the queued job, where the cluster includes the computing resource pool corresponding to the queued job and the other computing resource pools; and a scheduling execution module, configured to assign the queued job to one of the other computing resource pools for computing when the remaining resource amount of that pool is greater than or equal to the required resource amount and that pool matches the queued job.
According to another aspect of the present application, a computer-readable storage medium is provided. The storage medium stores a computer program, and the computer program is used to execute any of the methods for scheduling computing jobs described above.
With the method and device for scheduling computing jobs and the computer-readable storage medium provided by the present application, when a queued job exists, its queuing reason is obtained. If the reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is insufficient, the remaining resource amounts of the other computing resource pools in the cluster are obtained. If the remaining resource amount of another computing resource pool satisfies the resource requirement of the queued job and that pool matches the queued job, the queued job is assigned to that pool for computing. This improves the effective utilization of computing resources in the cluster and reduces the queuing time of computing jobs, thereby improving the computing efficiency and utilization of the cluster.
Brief Description of the Drawings
The above and other objects, features, and advantages of the present application will become more apparent from the more detailed description of its embodiments given with reference to the accompanying drawings. The drawings provide a further understanding of the embodiments of the present application and form part of the specification. Together with the embodiments, they serve to explain the present application and do not limit it. In the drawings, the same reference numerals generally denote the same components or steps.
Fig. 1 is a schematic flowchart of a method for scheduling computing jobs provided by an exemplary embodiment of the present application.
Fig. 2 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application.
Fig. 3 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application.
Fig. 4 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application.
Fig. 5 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application.
Fig. 6 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application.
Fig. 7 is a schematic structural diagram of a device for scheduling computing jobs provided by an exemplary embodiment of the present application.
Fig. 8 is a schematic structural diagram of a device for scheduling computing jobs provided by another exemplary embodiment of the present application.
Fig. 9 is a structural diagram of an electronic device provided by an exemplary embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described in detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application, and the present application is not limited by the exemplary embodiments described here.
An HPC simulation application relies on HPC scheduling software to schedule its computing jobs. However, the job scheduling policy of conventional HPC scheduling software can schedule only within a single computing cluster and cannot serve an environment that combines multiple computing clusters, such as local computing clusters together with cloud computing clusters.
To enable scheduling across multiple computing clusters (including local computing clusters and/or cloud computing clusters), the present application proposes a method and device for scheduling computing jobs and a computer-readable storage medium that place all clusters under unified scheduling management. The queued jobs in each cluster (or computing resource pool) are monitored. When a queued job appears in one cluster, the remaining computing resources of the other clusters are checked. If the remaining computing resources of another cluster satisfy the requirement of the queued job, the queued job is dispatched to that cluster for computing, which improves the computing efficiency of HPC simulation programs and the resource utilization of all computing clusters.
The specific solutions and implementations of the method and device for scheduling computing jobs and the computer-readable storage medium provided by the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a method for scheduling computing jobs provided by an exemplary embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step 110: Obtain the queuing reason of a queued job.
Here, a queued job denotes a computing job that is queued and waiting to be processed. In one embodiment, Step 110 may be implemented as follows: when a queued job is waiting in the cluster, the queuing reason of that queued job is obtained.
Specifically, the present application applies to an HPC cluster scenario with multiple computing resource pools, for example three pools (a first local computing resource pool, a second local computing resource pool, and a cloud computing resource pool), where the first local pool includes 36 servers (CPUs), the second local pool includes 48 servers (CPUs), and the cloud pool includes 64 servers (CPUs). The software to be run includes STAR-CCM+, Fluent, Abaqus, LS_Dyna, MechanicalAPDL, Optistruct, and the like. The number of computing resource pools and the number of servers in each pool are merely exemplary and do not limit the present application to these specific numbers. When a user submits a software computation, the computing resource pool corresponding to that software or that user may be fully used up, or its remaining resources may be insufficient to run the application. In that case, the application is determined to be a queued job. When a queued job exists in the cluster, the scheduler is activated, that is, the queuing reason of the queued job is determined so that resources can be scheduled according to that reason. Specifically, the scheduler may check periodically (for example, once per minute) whether the cluster contains queued jobs, so that no job waits in the queue for a long time unnoticed.
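The check described in Step 110 can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the `Job` fields, the `QueueReason` values, and the `scan_once` helper are hypothetical names, and resource amounts are counted in servers as in the example above. A real scheduler would call something like `scan_once` once per polling period (for example, every minute).

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class QueueReason(Enum):
    INSUFFICIENT_POOL_RESOURCES = auto()  # the case that Step 120 reacts to
    OTHER = auto()                        # any other cause; not detailed in the text


@dataclass
class Job:
    name: str
    required: int          # required resource amount, in servers (assumed unit)
    pool: str              # name of the pool the job was submitted to
    running: bool = False


def queue_reason(job: Job, free: dict[str, int]) -> Optional[QueueReason]:
    """Return why a job is queued, or None if the job is already running."""
    if job.running:
        return None
    if free[job.pool] < job.required:
        return QueueReason.INSUFFICIENT_POOL_RESOURCES
    return QueueReason.OTHER


def scan_once(jobs: list[Job], free: dict[str, int]) -> dict[str, QueueReason]:
    """One polling cycle: collect the queuing reason of every queued job."""
    reasons = {}
    for job in jobs:
        reason = queue_reason(job, free)
        if reason is not None:
            reasons[job.name] = reason
    return reasons
```

Only jobs whose reason is `INSUFFICIENT_POOL_RESOURCES` would proceed to the cross-pool lookup of Step 120.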
Step 120: When the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the required resource amount of the queued job, obtain the remaining resource amounts of the other computing resource pools in the cluster.
Here, the cluster includes the computing resource pool corresponding to the queued job and the other computing resource pools, of which there may be one or several. In the example above, if the first local computing resource pool is the pool corresponding to the queued job, then the second local computing resource pool and the cloud computing resource pool are the other computing resource pools. When the queuing reason is determined to be that the remaining resource amount of the corresponding pool is less than the required resource amount, the scheduling software obtains the remaining resource amounts of the other pools in the cluster to determine whether the job can be rescheduled. For example, when a queued job exists at the first local pool and the reason is that the remaining resources of the first local pool cannot satisfy the required resource amount of the job, the remaining resource amounts of the second local pool and the cloud pool are obtained. The remaining resource amounts of the other pools may be obtained in a preset order: for example, the remaining resource amount of the second local pool is obtained first, and the remaining resource amount of the cloud pool is obtained only if the second local pool cannot satisfy the requirement. This reduces the amount of computation spent on determining remaining resources.
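The preset-order lookup can be sketched as a lazy loop that stops at the first pool with enough remaining resources, so later pools are never queried. The function name `first_pool_with_capacity` and the `query_free` callback are assumptions made for illustration; the text does not say how remaining resources are actually measured.

```python
from typing import Callable, Iterable, Optional


def first_pool_with_capacity(
    pool_order: Iterable[str],
    query_free: Callable[[str], int],
    required: int,
) -> Optional[str]:
    """Query pools in the preset order and stop at the first one that fits.

    `query_free` stands in for whatever (possibly expensive) call reports a
    pool's remaining resource amount; it is invoked only as often as needed.
    """
    for pool in pool_order:
        if query_free(pool) >= required:
            return pool
    return None
```

With the order `["local-2", "cloud"]`, the cloud pool is queried only when the second local pool cannot hold the job.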
Step 130: When the remaining resource amount of another computing resource pool is greater than or equal to the required resource amount, and that pool matches the queued job, assign the queued job to that pool for computing.
Different simulation software has different requirements. For example, STAR-CCM+ suits many-core servers but places no requirement on clock frequency, so the cloud computing resource pool is selected for it. Optistruct suits high clock frequencies but places no requirement on the number of cores, so the first local computing resource pool is selected for it. Therefore, before the remaining resource amounts of the other pools are obtained, the pools that meet the requirements of the queued job are preferably selected as targets. After the remaining resource amounts are obtained, if the remaining resource amount of another pool is greater than or equal to the required resource amount of the queued job (that is, the pool's remaining resources satisfy the job's requirement), and that pool matches the queued job, the queued job can be assigned to that pool for computing. This reduces the number of simulation applications waiting in the queue and thus improves the effective utilization and computing efficiency of the cluster.
With the method for scheduling computing jobs provided by the present application, when a queued job exists, its queuing reason is obtained. If the reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is insufficient, the remaining resource amounts of the other computing resource pools in the cluster are obtained. If the remaining resource amount of another pool satisfies the resource requirement of the queued job and that pool matches the queued job, the queued job is assigned to that pool for computing. This improves the effective utilization of computing resources in the cluster and reduces the queuing time of computing jobs, thereby improving the computing efficiency and utilization of the cluster.
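Steps 110 to 130 can be put together in a short Python sketch. It is a minimal illustration under stated assumptions: `Pool`, `Job`, `matches`, and `reschedule` are hypothetical names, and "matching" is modeled as a simple tag-subset test (`needs` against `traits`), which is one possible reading of the text rather than the patent's definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    name: str
    total: int                       # servers in the pool
    used: int = 0
    traits: frozenset = frozenset()  # assumed tags, e.g. {"many-cores"}

    @property
    def remaining(self) -> int:
        return self.total - self.used


@dataclass
class Job:
    name: str
    required: int                    # required resource amount, in servers
    home_pool: str                   # pool the job was originally assigned to
    needs: frozenset = frozenset()   # assumed tags the target pool must offer


def matches(pool: Pool, job: Job) -> bool:
    """Assumed matching rule: the pool offers every trait the job needs."""
    return job.needs <= pool.traits


def reschedule(job: Job, pools: dict[str, Pool]) -> Optional[str]:
    """Return the name of the other pool that takes the job, or None."""
    home = pools[job.home_pool]
    # Steps 110/120: act only when the home pool lacks resources.
    if home.remaining >= job.required:
        return None
    # Step 130: first other pool with enough remaining resources that matches.
    for pool in pools.values():
        if pool is home:
            continue
        if pool.remaining >= job.required and matches(pool, job):
            pool.used += job.required
            return pool.name
    return None
```

In this sketch a job that needs a many-core pool skips a local pool that has spare servers but lacks that trait, and lands on the cloud pool instead.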
Fig. 2 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application. As shown in Fig. 2, Step 120 above may include:
Step 121: When the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job, minus the reserved resource amount, is less than the required resource amount of the queued job, obtain the remaining resource amounts of the other computing resource pools in the cluster.
Here, the reserved resource amount denotes the amount of resources held in reserve in a single computing resource pool and is positively correlated with the remaining resource amount of that pool. To protect the computing speed of a pool, a reserved resource amount may be set for each pool. This avoids running the pool in an oversaturated state and also keeps some computing resources available for new users or more urgent computing jobs. Because the reserved amount is positively correlated with the remaining amount, the more resources a pool has remaining, the more it reserves. For example, suppose a single pool has 50 servers. When resource utilization is 0 (the pool is fully idle), a single simulation application may request at most 30 servers (the reserved amount is 20 servers). When resource utilization is 80% (the remaining amount is 10 servers), a single simulation application may request at most 5 servers (the reserved amount is 5 servers). Preferably, a reservation ratio may be preset for each level of resource utilization, where the ratio is the reserved share of the remaining resources: in the example above, 40% (20 of 50) at a utilization of 0 and 50% (5 of 10) at a utilization of 80%.
When the remaining resource amount of the pool minus the reserved resource amount is less than the required resource amount of the queued job, that is, when the amount the pool can currently offer to a single simulation application is less than the required amount, the remaining resource amounts of the other pools in the cluster are obtained so that the application can be rescheduled.
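The reservation arithmetic of Step 121 can be worked through in a short sketch. The two table entries below are the two sample points from the text (40% at a utilization of 0, 50% at 80%); treating them as a step table that applies from each threshold upward is an assumption, as are the names `RESERVE_TABLE`, `usable`, and `needs_other_pool`. Integer percentages are used to avoid floating-point rounding.

```python
# (utilization lower bound in %, share of the remaining resources reserved in %)
# Only these two points come from the text; the step-table shape is assumed.
RESERVE_TABLE = [(0, 40), (80, 50)]


def reserve_pct(utilization_pct: int) -> int:
    """Pick the reservation ratio for the highest threshold not above utilization."""
    pct = RESERVE_TABLE[0][1]
    for threshold, ratio in RESERVE_TABLE:
        if utilization_pct >= threshold:
            pct = ratio
    return pct


def usable(total: int, used: int) -> int:
    """Resources a single job may take: remaining amount minus reserved amount."""
    remaining = total - used
    utilization_pct = used * 100 // total
    reserved = remaining * reserve_pct(utilization_pct) // 100
    return remaining - reserved


def needs_other_pool(total: int, used: int, required: int) -> bool:
    """Step 121 condition: remaining minus reserved is less than the requirement."""
    return usable(total, used) < required
```

For a 50-server pool this reproduces the figures in the text: `usable(50, 0)` is 30 and `usable(50, 40)` is 5. A coarse step table like this one can make the reserved amount jump at a threshold, so a real configuration would likely need more entries than the two points the text provides.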
图3是本申请另一示例性实施例提供的一种计算作业的调度方法的流程示意图。如图3所示,上述步骤130可以包括:Fig. 3 is a schematic flowchart of a scheduling method for computing jobs provided by another exemplary embodiment of the present application. As shown in Figure 3, the
步骤131:当其他计算资源池的剩余资源量减去保留资源量后大于或等于需求资源量、且其他计算资源池与排队作业匹配时,将排队作业分配至其他计算资源池进行计算。Step 131: When the remaining resource amount of other computing resource pools minus the reserved resource amount is greater than or equal to the required resource amount, and other computing resource pools match the queued jobs, assign the queued jobs to other computing resource pools for calculation.
Here, the reserved resource amount is the amount of resources set aside in a single computing resource pool, and it is positively correlated with the remaining resource amount of that pool. To preserve the computing speed of the pools, each computing resource pool is given a reserved resource amount, which avoids oversaturated operation and leaves some computing resources for new users or more urgent computing jobs. A part of the resources must therefore also be held back when the usable resources of another computing resource pool are determined. In other words, the queued job is assigned to another computing resource pool only when that pool's remaining resource amount minus its reserved resource amount (which depends on the current resource utilization) still satisfies the required resource amount of the queued job, and the pool matches the queued job. This keeps the other computing resource pool operating normally.
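The check in step 131 can be sketched in Python as follows. The `Pool` and `Job` classes, the `find_other_pool` function, and the way a "match" is modelled (a set of software names per pool) are hypothetical; the text does not specify how matching is determined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    name: str
    remaining: int        # idle servers in the pool
    reserved: int         # servers held back (assumed to be precomputed)
    software: frozenset   # software this pool is matched to (assumed model)


@dataclass
class Job:
    software: str
    demand: int           # required resource amount, in servers


def find_other_pool(job: Job, home: Pool, pools: List[Pool]) -> Optional[Pool]:
    """Return the first pool other than `home` that can take the job, or None."""
    for pool in pools:
        if pool is home:
            continue
        usable = pool.remaining - pool.reserved
        # Both conditions of step 131: enough usable resources AND a match.
        if usable >= job.demand and job.software in pool.software:
            return pool
    return None
```

Returning the first suitable pool is a simplification; the text does not say how to choose when several pools qualify.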
Fig. 4 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application. As shown in Fig. 4, the method for scheduling computing jobs may further include:
Step 140: When the remaining resource amounts of all other computing resource pools are less than the required resource amount, stop scheduling the queued job.
If the remaining resource amounts of the other computing resource pools are all less than the required resource amount of the queued job, that is, no computing resource pool in the cluster can satisfy the queued job, the scheduling procedure can only exit: scheduling of the queued job stops and the job keeps its current queued state. In the next cycle, the method checks again whether there is a queued job and whether another computing resource pool satisfies its requirements; if both checks succeed, the queued job is scheduled.
In an embodiment, as shown in Fig. 4, after step 140, the method for scheduling computing jobs may further include:
Step 150: Schedule the computing jobs that follow the queued job.
Different simulation computing software requires different amounts of resources. If a computing resource pool has several queued jobs and the job at the front of the queue requires more than the remaining resources of every computing resource pool in the cluster, a later queued job with a smaller requirement may still fit in some computing resource pool. Such later jobs can therefore be scheduled (in the manner described in steps 110-130 above), which improves the resource utilization of the whole cluster and the computing efficiency of the simulation computing software as far as possible.
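Steps 140 and 150 together amount to skipping a job that fits nowhere and continuing with the jobs behind it. A minimal Python sketch of one scheduling cycle follows; the function name `schedule_cycle` and the data layout are hypothetical, and pool matching and reserved amounts are left out for brevity (the `usable` values are assumed to be already net of reservations).

```python
from typing import Dict, List, Tuple


def schedule_cycle(
    queue: List[Tuple[str, int]],   # (job id, required servers) in queue order
    usable: Dict[str, int],         # pool name -> usable servers (updated in place)
) -> Tuple[Dict[str, str], List[Tuple[str, int]]]:
    """Run one cycle; return (job id -> pool name, jobs still queued)."""
    placements: Dict[str, str] = {}
    still_queued: List[Tuple[str, int]] = []
    for job_id, demand in queue:
        target = next((p for p, free in usable.items() if free >= demand), None)
        if target is None:
            # Step 140: no pool can take this job, so it stays queued.
            still_queued.append((job_id, demand))
            continue  # Step 150: move on to the jobs behind it.
        usable[target] -= demand
        placements[job_id] = target
    return placements, still_queued
```

Calling the function again in the next cycle with the returned `still_queued` list corresponds to the periodic re-check described for step 140.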
Fig. 5 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application. As shown in Fig. 5, the method for scheduling computing jobs may further include:
Step 160: When the sum of the required resource amounts of the computing jobs of a single user is greater than the single-user resource limit, stop scheduling that user's computing jobs.
Because the computing resources of a pool are limited, a single user who submits too many simulation computing software jobs may occupy too many resources and prevent other users from working normally. The present application can therefore limit the total resources of a single user by setting a single-user resource limit. If the sum of the required resource amounts of the simulation computing software that a single user runs at the same time exceeds this limit, the computing jobs submitted by that user must wait in the queue, and the scheduling software stops scheduling that user's computing jobs.
Here, the single-user resource limit is positively correlated with the remaining resource amount of the corresponding computing resource pool: the more resources remain in a single computing resource pool, the higher the single-user resource limit. For example, suppose a single computing resource pool has 50 servers. When the resource utilization is 0 (completely idle), the simulation computing software jobs submitted by a single user may use at most 30 servers in total. When the resource utilization is 80%, they may use at most 5 servers in total. In that case 5 servers are reserved; these servers can be labeled so that commonly used software is prohibited from running on them, which avoids as far as possible the oversaturation that would result if they were used. Preferably, the ratio between the single-user resource limit and the remaining resource amount can be preset for each resource utilization level, for example 40% at a utilization of 0 and 50% at a utilization of 80%. (With the server counts above, these figures match the share of remaining servers withheld from the user: 20 of 50 and 5 of 10.)
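The single-user limit of step 160 can be sketched in Python under one explicit assumption: the 40% and 50% figures are read as the share of remaining servers withheld from the user, since that reading reproduces the limits of 30 and 5 servers in the example. The function names `single_user_limit` and `may_schedule` are hypothetical.

```python
from typing import Iterable


def single_user_limit(total: int, in_use: int) -> int:
    """Servers one user may occupy in a pool, derived from its remaining servers.

    Assumption: 40% of the remaining servers are withheld below 80% utilization
    and 50% from 80% upward (only these two data points appear in the text).
    """
    remaining = total - in_use
    utilization_pct = 100 * in_use // total
    withheld_ratio = 0.50 if utilization_pct >= 80 else 0.40
    return remaining - round(remaining * withheld_ratio)


def may_schedule(running_demands: Iterable[int], new_demand: int, limit: int) -> bool:
    """True if the user's running jobs plus the new job stay within the limit."""
    return sum(running_demands) + new_demand <= limit
```

A job for which `may_schedule` returns `False` would simply remain queued until the user's running jobs release enough servers.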
Fig. 6 is a schematic flowchart of a method for scheduling computing jobs provided by another exemplary embodiment of the present application. As shown in Fig. 6, before step 110, the method for scheduling computing jobs may further include:
Step 170: According to the requirements of all computing jobs and the computing characteristics of each computing resource pool in the cluster, assign each computing job to a matching computing resource pool.
Different simulation computing software has different requirements. For example, STAR-CCM suits multi-core servers but places no requirement on clock frequency, so a computing resource pool with many servers is chosen for it; Optistruct benefits from a high clock frequency but places no requirement on the number of cores or servers, so a computing resource pool with a higher clock frequency is chosen for it. To achieve the best possible computing efficiency and results, the present application can assign each simulation computing software in advance to a computing resource pool that matches it. That is, software and pools are paired according to the requirements of the software and the computing characteristics of the pools, so that, when no rescheduling is needed, every simulation computing software is processed in a well-suited pool and the simulation quality is preserved. To further improve the utilization and balance of computing resources in the cluster, the present application can also distribute the simulation computing software evenly across the computing resource pools according to their required resource amounts, subject to the matching principle, which lowers the risk of a pool becoming saturated and so reduces rescheduling. In addition, after a period of time, the present application can match the simulation computing software and the computing resource pools again, taking into account factors such as how often each software was used during that period, to further lower the risk of pool saturation and reduce rescheduling.
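One way to combine the matching principle with even distribution is a greedy assignment, sketched below in Python. The text does not prescribe an algorithm; the largest-demand-first greedy rule, the single-string "trait" used to model a match, and the name `preallocate` are all assumptions.

```python
from typing import Dict, Tuple


def preallocate(
    software: Dict[str, Tuple[str, int]],   # name -> (required trait, demand)
    pools: Dict[str, str],                  # pool name -> trait it offers
) -> Dict[str, str]:
    """Assign each software to the least-loaded pool whose trait matches."""
    load = {pool: 0 for pool in pools}
    assignment: Dict[str, str] = {}
    # Place larger demands first so later, smaller ones can even out the load.
    for name, (trait, demand) in sorted(software.items(), key=lambda kv: -kv[1][1]):
        candidates = [p for p, offered in pools.items() if offered == trait]
        if not candidates:
            continue  # no matching pool; left unassigned in this sketch
        target = min(candidates, key=lambda p: load[p])
        load[target] += demand
        assignment[name] = target
    return assignment
```

Re-running the function with demands updated from recent usage would correspond to the periodic re-matching mentioned above.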
Fig. 7 is a schematic structural diagram of a computing job scheduling device provided by an exemplary embodiment of the present application. As shown in Fig. 7, the computing job scheduling device 70 includes: a first acquiring module 71, configured to acquire the queuing reason of a queued job, where a queued job is a computing job that is waiting in a queue to be processed; a second acquiring module 72, configured to acquire the remaining resource amounts of the other computing resource pools in the cluster when the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the required resource amount of the queued job, where the cluster includes the computing resource pool corresponding to the queued job and the other computing resource pools; and a scheduling execution module 73, configured to assign the queued job to another computing resource pool for computation when the remaining resource amount of that pool is greater than or equal to the required resource amount and that pool matches the queued job.
With the computing job scheduling device provided by the present application, when a queued job exists, the first acquiring module 71 acquires the reason why the job is queued. When the reason is that the remaining resource amount of the computing resource pool corresponding to the queued job is less than the job's required resource amount, the second acquiring module 72 acquires the remaining resource amounts of the other computing resource pools in the cluster. When the remaining resource amount of another computing resource pool is greater than or equal to the required resource amount and that pool matches the queued job, the scheduling execution module 73 assigns the queued job to that pool for computation. In other words, the device determines why a job is queued; if the cause is insufficient remaining resources in the job's own pool, it looks up the remaining resources of the other pools in the cluster, and if another pool both satisfies the job's resource requirement and matches the job, it assigns the job to that pool. This makes more effective use of the cluster's computing resources and shortens the time computing jobs spend queued, which improves the computing efficiency and utilization of the cluster.
In an embodiment, the first acquiring module 71 may be further configured to: acquire the queuing reason of a queued job when a queued job is waiting in the cluster.
In an embodiment, the second acquiring module 72 may be further configured to: acquire the remaining resource amounts of the other computing resource pools in the cluster when the queuing reason is that the remaining resource amount of the computing resource pool corresponding to the queued job minus the reserved resource amount is less than the required resource amount of the queued job. Here, the reserved resource amount is the amount of resources set aside in a single computing resource pool, and it is positively correlated with the remaining resource amount of that pool.
In an embodiment, the scheduling execution module 73 may be further configured to: assign the queued job to another computing resource pool for computation when the remaining resource amount of that pool minus its reserved resource amount is greater than or equal to the required resource amount and that pool matches the queued job. Here, the reserved resource amount is the amount of resources set aside in a single computing resource pool, and it is positively correlated with the remaining resource amount of that pool.
Fig. 8 is a schematic structural diagram of a computing job scheduling device provided by another exemplary embodiment of the present application. As shown in Fig. 8, the computing job scheduling device 70 may include: a scheduling termination module 74, configured to stop scheduling the queued job when the remaining resource amounts of all other computing resource pools are less than the required resource amount. Correspondingly, the computing job scheduling device 70 may be further configured to: schedule the computing jobs that follow the queued job.
In an embodiment, the scheduling termination module 74 may be further configured to: stop scheduling the computing jobs of a single user when the sum of the required resource amounts of that user's computing jobs is greater than the single-user resource limit.
In an embodiment, as shown in Fig. 8, the computing job scheduling device 70 may include: a pre-allocation module 75, configured to assign each computing job to a matching computing resource pool according to the requirements of all computing jobs and the computing characteristics of each computing resource pool in the cluster.
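How modules 71 to 74 could cooperate is sketched below as a single Python class. This is an illustrative arrangement only: the class and method names are hypothetical, pool matching and the pre-allocation module 75 are omitted, and the `usable` values are assumed to be already net of reserved resources.

```python
from enum import Enum, auto
from typing import Dict, Optional


class QueueReason(Enum):
    INSUFFICIENT_RESOURCES = auto()  # the job's own pool cannot fit it
    OTHER = auto()                   # any other reason (not detailed in the text)


class SchedulingDevice:
    def __init__(self, usable: Dict[str, int]):
        self.usable = dict(usable)   # pool name -> usable servers

    def queuing_reason(self, home: str, demand: int) -> QueueReason:
        """Role of the first acquiring module 71."""
        if self.usable[home] < demand:
            return QueueReason.INSUFFICIENT_RESOURCES
        return QueueReason.OTHER

    def other_pools(self, home: str) -> Dict[str, int]:
        """Role of the second acquiring module 72."""
        return {p: n for p, n in self.usable.items() if p != home}

    def dispatch(self, home: str, demand: int) -> Optional[str]:
        """Roles of the scheduling execution module 73 and termination module 74."""
        if self.queuing_reason(home, demand) is not QueueReason.INSUFFICIENT_RESOURCES:
            return None
        for pool, free in self.other_pools(home).items():
            if free >= demand:
                self.usable[pool] -= demand
                return pool          # module 73: job moved to another pool
        return None                  # module 74: stop scheduling this job for now
```

A caller would invoke `dispatch` once per queued job in each scheduling cycle and leave the job queued whenever it returns `None`.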
Next, an electronic device according to an embodiment of the present application is described with reference to Fig. 9. The electronic device may be either or both of a first device and a second device, or a stand-alone device independent of them; the stand-alone device can communicate with the first device and the second device to receive the collected input signals from them.
Fig. 9 is a block diagram of an electronic device according to an embodiment of the present application.
As shown in Fig. 9, the electronic device 10 includes one or more processors 11 and a memory 12.
The processor 11 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may run the program instructions to implement the method for scheduling computing jobs of the embodiments of the present application described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include an input device 13 and an output device 14, and these components are interconnected through a bus system and/or another form of connection mechanism (not shown).
When the electronic device is a stand-alone device, the input device 13 may be a communication network connector for receiving the collected input signals from the first device and the second device.
In addition, the input device 13 may also include, for example, a keyboard and a mouse.
The output device 14 can output various information to the outside, including determined distance information and direction information. The output device 14 may include, for example, a display, a speaker, a printer, and a communication network and the remote output devices connected to it.
For simplicity, Fig. 9 shows only some of the components of the electronic device 10 that are relevant to the present application; components such as buses and input/output interfaces are omitted. Depending on the specific application, the electronic device 10 may also include any other appropriate components.
The program code of the computer program product for performing the operations of the embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example but without limitation, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The above description has been given for purposes of illustration and description. It is not intended to limit the embodiments of the present application to the forms disclosed herein. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211035050.4A CN115543554A (en) | 2022-08-26 | 2022-08-26 | Method and device for scheduling calculation jobs and computer readable storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115543554A (en) | 2022-12-30 |
Family
ID=84726457
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211035050.4A (pending; published as CN115543554A) | 2022-08-26 | 2022-08-26 | Method and device for scheduling calculation jobs and computer readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115543554A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120407202A (en) * | 2025-07-01 | 2025-08-01 | 国家超级计算天津中心 | Cluster job scheduling method |
| CN120407202B (en) * | 2025-07-01 | 2025-08-29 | 国家超级计算天津中心 | Cluster job scheduling method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | CB03 | Change of inventor or designer information | Inventor after: Lv Qinghai, Wang Jiang, Huang Yi, Li Fa. Inventor before: Wang Jiang, Huang Yi, Li Fa. |