
CN117724847A - Method, device, computing equipment and storage medium for determining VASP software parallel parameters - Google Patents

Method, device, computing equipment and storage medium for determining VASP software parallel parameters

Publication number
CN117724847A
Authority
CN
China
Prior art keywords
target
parameter value
determining
parameter
parallel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311799193.7A
Other languages
Chinese (zh)
Inventor
杨稳
刘帅
宋志方
叶晋甫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Paratera Technology Co ltd
Original Assignee
Beijing Paratera Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Paratera Technology Co ltd
Priority to CN202311799193.7A
Publication of CN117724847A
Legal status: Pending

Landscapes

  • Stored Programmes (AREA)

Abstract

The invention discloses a method, a device, a computing device, and a storage medium for determining VASP software parallel parameters. The method comprises the following steps: determining target calculation example information of the VASP software, wherein the target calculation example information comprises the number of target Brillouin zone K points of a target calculation example; acquiring node hardware configuration information of a cluster; determining the total number of parallel running cores of the VASP software and the total number of nodes applied for according to the target calculation example information and the node hardware configuration information; if the number of target Brillouin zone K points is less than or equal to a first threshold, determining a first parameter value based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores; if the number of target Brillouin zone K points is greater than the first threshold, determining the first parameter value based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores; and determining a second parameter value and a third parameter value based on the total number of parallel running cores and the first parameter value. According to the technical solution of the invention, the computational efficiency of the VASP software can be improved while making full use of the cluster hardware resources.

Description

Method, device, computing equipment and storage medium for determining VASP software parallel parameters
Technical Field
The present invention relates to the field of computer technologies, and in particular to a method for determining parallel parameters of VASP software, a device for determining parallel parameters of VASP software, a computing device, and a storage medium.
Background
In existing high-throughput materials computing scenarios, VASP software plays an important role. Computer clusters greatly improve the efficiency of VASP materials calculations, and using the software efficiently can significantly improve both the utilization of supercomputing clusters and the work output of users.
In the VASP software, a series of parameters may be set in the input file to specify information about the calculation target, such as the atomic species, the composition and structure, and the initial field. Parameter optimization refers to obtaining more accurate and stable calculation results by adjusting a series of calculation parameters. The parallel parameters are the three parameters KPAR, NPAR, and NCORE. KPAR represents the number of momentum (K-point) tasks running simultaneously, NPAR represents the number of energy band tasks computed simultaneously, and NCORE represents the number of compute cores used by a single energy band task. These three parameters affect how physical quantities such as atoms and energy bands are distributed across processes while the VASP software runs, and thereby affect the computational load, the communication volume, and the parallel efficiency of the physical cores on which the processes run.
In the prior art, optimization of the parallel parameters considers only the improvement of parallel efficiency; for example, the three parallel parameters are set with reference to the number of K points in the Brillouin zone. However, the prior art does not consider that changing the parallel parameters changes the code that is actually executed, which in turn changes the job's memory usage on each node, its memory access bandwidth, and the network communication between nodes, and thus shifts the application's performance bottleneck at run time. Focusing excessively on the number of Brillouin zone K points ignores the actual performance ceiling of the hardware, so that the VASP software either reaches a hardware performance bottleneck, which limits computational efficiency, or the job fails because memory capacity is insufficient. It follows that the optimal VASP parallel parameters for a given hardware platform cannot be obtained when attention is focused excessively on the number of Brillouin zone K points.
Therefore, there is a need for a method for determining the parallel parameters of VASP software to solve the above-mentioned problems.
Disclosure of Invention
To this end, the present invention provides a method of determining VASP software parallelism parameters and an apparatus for determining VASP software parallelism parameters to solve or at least alleviate the above-identified problems.
According to one aspect of the present invention, there is provided a method for determining parallel parameters of VASP software, the VASP software being adapted to run in a cluster comprising a plurality of nodes, the parallel parameters comprising a first parameter indicating the number of Brillouin zone K points processed in parallel, a second parameter indicating the number of energy bands processed in parallel, and a third parameter indicating the average number of cores allocated to each energy band, the method comprising: determining target calculation example information of the VASP software, wherein the target calculation example information comprises the number of target Brillouin zone K points, the number of target energy bands, and the number of atoms of the target calculation example; acquiring node hardware configuration information of the cluster, wherein the node hardware configuration information comprises one or more of the number of CPU cores, the memory capacity, and the memory access bandwidth of a single node; determining the total number of parallel running cores of the VASP software and the total number of nodes applied for according to the target calculation example information and the node hardware configuration information; if the number of target Brillouin zone K points is less than or equal to a first threshold, determining a first parameter value based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores, wherein the first threshold is related to the total number of parallel running cores; if the number of target Brillouin zone K points is greater than the first threshold, determining the first parameter value based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold; and determining a second parameter value and a third parameter value based on the total number of parallel running cores and the first parameter value.
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, determining the second parameter value and the third parameter value based on the total number of parallel running cores and the first parameter value comprises: determining the second parameter value based on the square root of the ratio of the total number of parallel running cores to the first parameter value; and determining the third parameter value based on the total number of parallel running cores, the first parameter value, and the second parameter value, wherein total number of parallel running cores = first parameter value × second parameter value × third parameter value.
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, determining the second parameter value based on the square root of the ratio of the total number of parallel running cores to the first parameter value comprises: judging whether the square root of the ratio of the total number of parallel running cores to the first parameter value is an even number; if so, determining that square root as the second parameter value; and if not, rounding that square root down to the largest even number and determining the largest even number as the second parameter value.
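The two optional steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `split_band_parallelism` and its argument names are hypothetical, and the handling of the two edge cases (a square root below 2, and a second parameter value that does not divide the remaining cores) is an assumption, since the text does not say what happens in those cases.

```python
import math


def split_band_parallelism(total_cores: int, kpar: int) -> tuple[int, int]:
    """Return (npar, ncore) for a total core count and a first parameter value."""
    if total_cores % kpar != 0:
        raise ValueError("kpar must divide total_cores")
    cores_per_kpoint_group = total_cores // kpar

    # Integer part of the square root of the ratio total_cores / kpar.
    root = math.isqrt(cores_per_kpoint_group)

    # Round down to the largest even number. An exact even square root is
    # kept unchanged; an odd or fractional root drops to the even number below.
    npar = root - (root % 2)

    # Assumption: the text does not cover a root below 2, so fall back to 1.
    npar = max(npar, 1)

    # Assumption: the text does not say what to do when no integer third
    # parameter satisfies the product relation; this sketch rejects the input.
    if cores_per_kpoint_group % npar != 0:
        raise ValueError("no integer ncore satisfies kpar * npar * ncore == total_cores")

    ncore = cores_per_kpoint_group // npar
    return npar, ncore
```

For example, with 128 cores and a first parameter value of 4, the ratio is 32, its square root is about 5.66, the largest even number below that is 4, and the third parameter value follows from the product relation as 8.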
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, determining the first parameter value based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores comprises: determining the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores as the first parameter value.
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, determining the first parameter value based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold, comprises: determining all common divisors of the number of target Brillouin zone K points and the total number of parallel running cores; selecting, from all the common divisors, one or more target common divisors that are less than or equal to the first threshold; and determining the first parameter value based on the one or more target common divisors.
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, determining the first parameter value based on the one or more target common divisors comprises: determining the largest of the one or more target common divisors as the first parameter value.
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, determining the first parameter value based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores if the number of target Brillouin zone K points is less than or equal to the first threshold comprises: if the number of target Brillouin zone K points is less than or equal to 10% of the total number of parallel running cores, determining the first parameter value based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores; and determining the first parameter value based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores if the number of target Brillouin zone K points is greater than the first threshold, such that the first parameter value is less than or equal to the first threshold, comprises: if the number of target Brillouin zone K points is greater than 10% of the total number of parallel running cores, determining the first parameter value based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores, such that the first parameter value is less than or equal to 10% of the total number of parallel running cores.
Optionally, in the method for determining parallel parameters of VASP software according to the present invention, the first threshold is 10% of the total number of parallel running cores.
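The selection of the first parameter value described in the preceding optional features can be sketched in Python as follows. The function name `choose_kpar` and its arguments are hypothetical. The 10% threshold is taken from the text and compared in integer arithmetic to avoid floating-point rounding; returning 1 when no common divisor fits under the threshold (which can only happen for fewer than 10 cores) is an assumption, since the text does not address that case.

```python
import math


def choose_kpar(num_kpoints: int, total_cores: int, threshold_percent: int = 10) -> int:
    """Pick the first parameter value from the K-point count and core count."""

    def within_threshold(value: int) -> bool:
        # value <= threshold_percent% of total_cores, without floats.
        return value * 100 <= threshold_percent * total_cores

    # Every common divisor of the two numbers divides their gcd.
    greatest = math.gcd(num_kpoints, total_cores)

    if within_threshold(num_kpoints):
        # Few K points relative to the core count: use the gcd directly.
        return greatest

    # Many K points: keep only common divisors at or below the threshold
    # and take the largest of them.
    candidates = [
        d for d in range(1, greatest + 1)
        if greatest % d == 0 and within_threshold(d)
    ]
    # Assumption: fall back to 1 if nothing fits under the threshold.
    return max(candidates, default=1)
```

With 64 cores the threshold is 6.4. Four K points fall under it, so the gcd of 4 and 64, which is 4, is used. Thirty-two K points exceed it, so the common divisors 1, 2, 4, 8, 16, and 32 are filtered down to 1, 2, and 4, and 4 is chosen.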
According to one aspect of the present invention, there is provided an apparatus for determining parallel parameters of VASP software, the VASP software being adapted to run in a cluster comprising a plurality of nodes, the parallel parameters comprising a first parameter indicating the number of Brillouin zone K points processed in parallel, a second parameter indicating the number of energy bands processed in parallel, and a third parameter indicating the average number of cores allocated to each energy band, the apparatus comprising: a first determining module, adapted to determine target calculation example information of the VASP software, wherein the target calculation example information comprises the number of target Brillouin zone K points, the number of target energy bands, and the number of atoms of the target calculation example; an acquisition module, adapted to acquire node hardware configuration information of the cluster, wherein the node hardware configuration information comprises one or more of the number of CPU cores, the memory capacity, and the memory access bandwidth of a single node; a third determining module, adapted to determine the total number of parallel running cores of the VASP software and the total number of nodes applied for according to the target calculation example information and the node hardware configuration information; a fourth determining module, adapted to determine a first parameter value based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores if the number of target Brillouin zone K points is less than or equal to a first threshold, wherein the first threshold is related to the total number of parallel running cores; a fifth determining module, adapted to determine the first parameter value based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores if the number of target Brillouin zone K points is greater than the first threshold, such that the first parameter value is less than or equal to the first threshold; and a sixth determining module, adapted to determine a second parameter value and a third parameter value based on the total number of parallel running cores and the first parameter value.
According to one aspect of the invention, there is provided a computing device comprising: at least one processor; a memory storing program instructions, wherein the program instructions are configured to be adapted to be executed by the at least one processor, the program instructions comprising instructions for performing the method of determining VASP software parallelism parameters as described above.
According to one aspect of the present invention there is provided a readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform a method of determining a VASP software parallelism parameter as described above.
According to the technical solution of the invention, a method for determining parallel parameters of VASP software is provided. Target calculation example information of the VASP software is determined, node hardware configuration information of the cluster is acquired, and the total number of parallel running cores of the VASP software and the total number of nodes applied for are determined according to the target calculation example information and the node hardware configuration information. If the number of target Brillouin zone K points is less than or equal to a first threshold, a first parameter value is determined based on the greatest common divisor of the number of target Brillouin zone K points and the total number of parallel running cores. If the number of target Brillouin zone K points is greater than the first threshold, the first parameter value is determined based on a smaller common divisor of the number of target Brillouin zone K points and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold. A second parameter value and a third parameter value are then determined based on the total number of parallel running cores and the first parameter value. In this way, the technical solution of the invention does not trigger a hardware performance bottleneck of the cluster, makes full use of the cluster's hardware resources, and improves the computational efficiency of the VASP software while doing so. It also avoids the problems of limited computational efficiency caused by the VASP software reaching a hardware performance bottleneck and of job failure caused by insufficient memory capacity.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the present invention become more readily apparent, specific embodiments of the present invention are set forth below.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which set forth the various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to fall within the scope of the claimed subject matter. The above, as well as additional objects, features, and advantages of the present disclosure will become more apparent from the following detailed description when read in conjunction with the accompanying drawings. Like reference numerals generally refer to like parts or elements throughout the present disclosure.
FIG. 1 shows a schematic diagram of a computing device 100 provided in accordance with an embodiment of the invention;
FIG. 2 illustrates a flow diagram of a method 200 for determining VASP software parallelism parameters, provided in accordance with an embodiment of the present invention;
FIG. 3 shows a comparison of test data based on respective parallel parameter combinations for three examples in a second embodiment;
FIG. 4 shows a schematic diagram of an apparatus 400 for determining VASP software parallel parameters according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In order to facilitate understanding, terms related to the present invention are explained below.
High performance computing (HPC) is a field of computing that uses massively parallel processing, high-speed networks, and advanced software tools to solve complex problems. It greatly improves the computing capacity and processing speed of computers, mainly by applying techniques such as parallel computing, big data processing, high-speed storage, and efficient algorithms. Compared with a traditional computer, an HPC system consists of a large number of high-speed processors, large-capacity memory, high-speed networks, and storage systems, and relies on key technologies such as parallel algorithms, schedulers, task distribution, data transmission, and storage management to ensure the efficiency and stability of the system.
A job scheduling system is a system for managing and scheduling task execution in a computer cluster. It is responsible for receiving task requests and distributing tasks to available computing resources according to certain policies and algorithms to achieve efficient task execution and resource utilization.
VASP (Vienna Ab initio Simulation Package) is software developed at the University of Vienna for electronic structure calculations and quantum-mechanical molecular dynamics simulations, and it is very widely used in computational materials chemistry. VASP has high computational efficiency and is currently one of the most efficient commercial packages for first-principles calculations on solid materials. It can perform large-scale, efficient parallel calculations with a relatively small amount of memory, supports multi-core, multi-node parallel computing with no limit on the number of cores or nodes, and supports use by a single user or by multiple users simultaneously.
VASP is a software package based on first principles; its basic principle is to solve the Schrödinger equation approximately in order to obtain the electronic states and energy of a system. VASP is a density functional theory (DFT) program based on pseudopotentials and a plane-wave basis set, and a VASP simulation solves the Kohn-Sham equations within the DFT framework. VASP uses the pseudopotential plane-wave method: it approximates the original potential field with a pseudopotential, uses plane waves as the basis set, and expresses each single-electron wave function as a superposition of plane waves; a plane-wave basis set requires no BSSE correction. VASP uses the projector augmented wave (PAW) method to treat atom-electron interactions approximately. VASP supports a variety of functionals based on the local density approximation (LDA) and the generalized gradient approximation (GGA), as well as hybrid functionals formed by combining the DFT exchange energy with the exact Hartree-Fock (HF) exchange energy in a certain proportion. A VASP simulation mainly comprises two parts, geometry optimization and static calculation. Geometry optimization mainly uses periodic boundary conditions and, based on density functional theory, optimizes the geometry of systems such as atoms, molecules, surfaces, and clusters to obtain a stable configuration and its structural parameters, including the lattice constants, the positions of all atoms, and the bond lengths and bond angles between atoms. Static calculation, building on the geometry optimization, computes the energy of the optimized system with higher accuracy and computes various properties of the structure.
The parallel parameters of the VASP software to be optimized are KPAR, NPAR, and NCORE. KPAR indicates the number of Brillouin zone K points processed simultaneously, NPAR indicates the number of energy bands processed simultaneously, and NCORE indicates the average number of cores allocated to each energy band. In the prior art, KPAR, NPAR, and NCORE are set only on the basis of the relation KPAR × NPAR × NCORE = total number of parallel running cores, in order to reduce the time taken by parallel runs.
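To illustrate how loosely the product relation alone constrains the choice, the following Python sketch enumerates every (KPAR, NPAR, NCORE) triple whose product equals a given core count. The function name `enumerate_parallel_settings` is hypothetical and the enumeration is only an illustration of the relation stated above, not a procedure taken from the text.

```python
def enumerate_parallel_settings(total_cores: int) -> list[tuple[int, int, int]]:
    """List all (kpar, npar, ncore) with kpar * npar * ncore == total_cores."""
    settings = []
    for kpar in range(1, total_cores + 1):
        if total_cores % kpar != 0:
            continue  # kpar must divide the core count
        remaining = total_cores // kpar
        for npar in range(1, remaining + 1):
            if remaining % npar == 0:
                # ncore is fixed once kpar and npar are chosen
                settings.append((kpar, npar, remaining // npar))
    return settings
```

For 64 cores this yields 28 candidate triples, none of which is ruled out by the product relation itself; the relation says nothing about memory capacity or memory access bandwidth.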
It should be noted that running the VASP software occupies system resources such as CPU, memory bandwidth, and I/O. Computing multi-atom, multi-electron systems places high demands on the cluster's hardware, but prior-art approaches to optimizing the parallel parameters do not consider the actual load on the cluster, so adjusting the parallel parameters may lower computational efficiency rather than raise it, and may even cause the calculation to fail. In particular, when the KPAR parameter is increased, the number of Brillouin zone K points (K points for short) processed in parallel increases correspondingly, so the parallel processes consume more memory and require much more memory access bandwidth. If memory consumption reaches the system maximum, the program reports an error, memory overflows, and the calculation fails; if the memory access bandwidth reaches the system's upper limit, the parallel speedup stops increasing, and performance may even degrade because of memory access latency.
In addition, for the same system with a fixed number of cores, the parallel efficiency of the VASP software can be improved by setting the KPAR and NPAR values. However, while these parameters increase running speed, they also occupy cluster resources: the larger the KPAR value, the more memory and memory bandwidth are used. If the memory bandwidth reaches the upper limit of the system hardware, a performance bottleneck arises, and the running efficiency of VASP no longer improves and may even decline. If the total system memory is exceeded, memory becomes insufficient and the job fails.
To address these problems with existing parallel parameter optimization approaches, the present invention provides a method for determining VASP software parallel parameters that can optimize the parallel parameters of VASP software running on a cluster.
The method for determining the VASP software parallel parameters provided by the embodiment of the invention can be executed by computing equipment, and the computing equipment can be a terminal or a server. In some embodiments, the method for determining the parallel parameters of the VASP software provided by the embodiments of the present invention may be specifically executed by a terminal.
A computing device provided by an embodiment of the present invention is described below in conjunction with fig. 1.
FIG. 1 shows a schematic diagram of a computing device 100 provided in accordance with an embodiment of the invention. As shown in FIG. 1, in a basic configuration, computing device 100 includes at least one processing unit 102 and a system memory 104. According to one aspect, the processing unit 102 may be implemented as a processor, depending on the configuration and type of computing device. The system memory 104 includes, but is not limited to, volatile storage (e.g., random access memory), non-volatile storage (e.g., read only memory), flash memory, or any combination of such memories. According to one aspect, an operating system 105 is included in system memory 104.
According to one aspect, operating system 105 is suitable, for example, for controlling the operation of computing device 100. Further, examples are practiced in connection with a graphics library, other operating systems, or any other application program and are not limited to any particular application or system. This basic configuration is illustrated in fig. 1 by those components within the dashed line. According to one aspect, computing device 100 has additional features or functionality. For example, according to one aspect, computing device 100 includes additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in fig. 1 by removable storage device 109 and non-removable storage device 110.
As set forth hereinabove, according to one aspect, the program module 103 is stored in the system memory 104. According to one aspect, the program modules 103 may include one or more applications. The invention does not limit the type of application; for example, the applications may include email and contacts applications, word processing applications, spreadsheet applications, database applications, slide show applications, drawing or computer-aided applications, web browser applications, and so on.
According to one aspect, the program module 103 may comprise means 400 for determining a VASP software parallelism parameter, the means 400 for determining a VASP software parallelism parameter comprising a plurality of program instructions adapted to perform the method 200 for determining a VASP software parallelism parameter of the invention, such that the means 400 for determining a VASP software parallelism parameter is configured to perform the method 200 for determining a VASP software parallelism parameter of the invention.
According to one aspect, the examples may be practiced in a circuit comprising discrete electronic components, a packaged or integrated electronic chip containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic components or a microprocessor. For example, examples may be practiced via a system on a chip (SOC) in which each or many of the components shown in FIG. 1 may be integrated on a single integrated circuit. According to one aspect, such SOC devices may include one or more processing units, graphics units, communication units, system virtualization units, and various application functions, all of which are integrated (or "burned") onto a chip substrate as a single integrated circuit. When operating via an SOC, the functionality described herein may be operated via dedicated logic integrated with other components of computing device 100 on a single integrated circuit (chip). Embodiments of the invention may also be practiced using other techniques capable of performing logical operations (e.g., AND, OR, and NOT), including but not limited to mechanical, optical, fluidic, and quantum techniques. In addition, embodiments of the invention may be practiced within a general purpose computer or in any other circuit or system.
According to one aspect, the computing device 100 may also have one or more input devices 112, such as a keyboard, mouse, pen, voice input device, touch input device, and the like. Output device(s) 114 such as a display, speakers, printer, etc. may also be included. The foregoing devices are examples and other devices may also be used. Computing device 100 may include one or more communication connections 116 that allow communication with other computing devices 118. Examples of suitable communication connections 116 include, but are not limited to: RF transmitter, receiver, and/or transceiver circuitry; and Universal Serial Bus (USB), parallel, and/or serial ports.
The term computer readable media as used herein includes computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information (e.g., computer readable instructions, data structures, or program modules 103). System memory 104, removable storage 109, and non-removable storage 110 are all examples of computer storage media (i.e., memory storage). Computer storage media may include Random Access Memory (RAM), Read Only Memory (ROM), electrically erasable read only memory (EEPROM), flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture that can be used to store information and that can be accessed by computing device 100. According to one aspect, any such computer storage media may be part of computing device 100. Computer storage media does not include a carrier wave or other propagated data signal.
According to one aspect, communication media is embodied by computer readable instructions, data structures, program modules 103, or other data in a modulated data signal (e.g., carrier wave or other transport mechanism) and includes any information delivery media. According to one aspect, the term "modulated data signal" describes a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, Radio Frequency (RF), infrared, and other wireless media.
In an embodiment according to the invention, the computing device 100 is configured to perform a method 200 of determining VASP software parallel parameters according to the present invention. The computing device 100 includes one or more processors and one or more readable storage media storing program instructions that, when configured to be executed by the one or more processors, cause the computing device to perform the method 200 of determining VASP software parallelism parameters in an embodiment of the present invention. The computing device 100 can optimize the parallel parameter value of the VASP software running based on the cluster by executing the method 200 for determining the parallel parameter of the VASP software in the embodiment of the invention, so as to improve the computing efficiency of the VASP software under the condition of fully utilizing the hardware resources of the cluster.
The apparatus 400 for determining the VASP software parallelism parameter resides in the computing device 100 according to one embodiment of the present invention, the apparatus 400 for determining the VASP software parallelism parameter being configured to perform the method 200 for determining the VASP software parallelism parameter according to the present invention. Wherein the apparatus 400 for determining the VASP software parallel parameters comprises a plurality of program instructions for performing the method 200 for determining the VASP software parallel parameters of the present invention, which may instruct a processor to perform the method 200 for determining the VASP software parallel parameters according to the present invention.
Fig. 2 shows a flow chart of a method 200 for determining a VASP software parallelism parameter according to an embodiment of the invention. The method 200 of determining VASP software parallelism parameters is configured to be executed in the computing device 100.
In embodiments of the present invention, the VASP software runs on a cluster (i.e., a computer cluster) that includes a plurality of nodes (i.e., computer nodes). The parallel parameters of the VASP software (i.e., the parallel parameters to be optimized) include at least a first parameter, a second parameter, and a third parameter, where the first parameter characterizes the number of Brillouin zone K points processed in parallel, the second parameter characterizes the number of energy bands processed in parallel, and the third parameter characterizes the number of cores allocated, on average, to each energy band.
In an embodiment of the present invention, the first parameter is KPAR, the second parameter is NPAR, and the third parameter is NCORE.
As shown in fig. 2, method 200 includes the following steps 210-260.
In step 210, the computing device 100 determines target case information of the VASP software, the target case information including the number of Brillouin zone K points of the target case (the target Brillouin zone K point number), the number of energy bands of the target case (the target band number), and the number of atoms of the target case.
In an embodiment of the invention, the target case may be a crystal system to be calculated by the VASP software.
The Brillouin zone is a concept from solid-state physics. Crystals are symmetric and periodic. In reciprocal space, the K-point grid (KPOINTS) divides the space into a plurality of regions within each of which the energy varies continuously; such a region is called a Brillouin zone. The Brillouin zone is used to describe the state of motion of electrons or atoms in the periodic structure of a crystal. Each symmetry point in the Brillouin zone is called a Brillouin zone K point, and each Brillouin zone K point contains all the information required for the calculation.
In step 220, the computing device 100 obtains node hardware configuration information of the cluster, where the node hardware configuration information may include one or more of the CPU core number, the memory capacity, and the memory access bandwidth of a single node.
Here, the node hardware configuration information of the cluster is hardware configuration information of each node in the cluster. The hardware configuration information of each node in the cluster is the same. The hardware resource condition of the cluster can be determined according to the node hardware configuration information of the cluster.
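As an illustration only, the two inputs gathered in steps 210 and 220 could be held in small records such as the following. The class and field names (`CaseInfo`, `NodeConfig`, and so on) are hypothetical and come neither from the VASP software nor from the method itself; the optional fields reflect that the node hardware configuration information may include only some of the listed items.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CaseInfo:
    """Target case information from step 210 (field names are hypothetical)."""
    k_points: int  # number of Brillouin zone K points
    bands: int     # number of energy bands
    atoms: int     # number of atoms


@dataclass(frozen=True)
class NodeConfig:
    """Per-node hardware configuration from step 220 (field names are hypothetical)."""
    cpu_cores: Optional[int] = None
    memory_gb: Optional[float] = None
    memory_bandwidth_gb_s: Optional[float] = None


# Values taken from the first embodiment described later in the text.
case = CaseInfo(k_points=4, bands=1277, atoms=512)
node = NodeConfig(cpu_cores=96)
```

Because every node in the cluster has the same hardware configuration, a single `NodeConfig` record is enough to describe the whole cluster in this sketch.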
In step 230, the computing device 100 determines, based on the target case information and the node hardware configuration information, the total number of parallel running cores of the VASP software and the total number of nodes applied for by the VASP software (i.e., the total number of nodes that need to be allocated to the VASP software).
Specifically, the total number of parallel running cores of the VASP software can be determined according to the target Brillouin zone K point number, the target band number, and the number of atoms of the target case, together with one or more of the CPU core number, the memory capacity, and the memory access bandwidth of a single node of the cluster. The total number of nodes applied for by the VASP software is then determined according to the total number of parallel running cores of the VASP software and the CPU core number of a single node.
In some embodiments of the present invention, the cluster includes a job scheduling system, and the cluster may use the job scheduling system to determine the total number of nodes allocated to the VASP software according to the target case information and the node hardware configuration information, so that the computing device 100 can determine the total number of nodes applied for by the VASP software. Here, the job scheduling system may determine the total number of nodes allocated to the VASP software based on a job scheduling algorithm.
In some embodiments of the present invention, the computing device 100 may itself use a job scheduling algorithm to determine the total number of nodes allocated to the VASP software (i.e., the total number of nodes applied for by the VASP software) based on the target case information and the node hardware configuration information. It should be noted that the present invention does not limit the kind of job scheduling algorithm, which may be any job scheduling algorithm in the prior art.
In an embodiment of the present invention, the total number of parallel running cores of the VASP software = the total number of nodes applied for by the VASP software × the number of CPU cores of a single node.
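A minimal sketch of the arithmetic in step 230, assuming that every applied-for node contributes all of its CPU cores. This assumption is consistent with the first embodiment, where 2 nodes of 96 cores give 192 cores. The function names are hypothetical.

```python
import math


def total_parallel_cores(nodes: int, cores_per_node: int) -> int:
    """Total cores when every applied-for node is fully used (assumption)."""
    return nodes * cores_per_node


def nodes_required(total_cores: int, cores_per_node: int) -> int:
    """Smallest number of nodes that can host total_cores cores."""
    return math.ceil(total_cores / cores_per_node)


print(total_parallel_cores(2, 96))  # 192
print(nodes_required(192, 96))      # 2
```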
In an embodiment of the present invention, the first parameter (KPAR) characterizes the number of Brillouin zone K points processed in parallel by the VASP software, which means that the K points are divided into KPAR groups for calculation. The setting of the first parameter (KPAR) must take the total number of parallel running cores into account; for this reason, in the embodiment of the present invention, the first parameter (KPAR) is set to a common divisor of the target Brillouin zone K point number and the total number of parallel running cores. Specifically, the first parameter value may be determined according to the following steps 240 and 250.
In step 240, if the target Brillouin zone K point number is less than or equal to the first threshold, the computing device 100 may determine the first parameter value based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores.
It should be understood that, when the target Brillouin zone K point number is less than or equal to the first threshold, the first parameter value determined based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores is also less than or equal to the first threshold, because the greatest common divisor cannot exceed the K point number.
In an embodiment of the invention, the first threshold is related to the total number of cores running in parallel.
In step 250, if the target Brillouin zone K point number is greater than the first threshold, the computing device 100 may determine the first parameter value based on a smaller common divisor of the target Brillouin zone K point number and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold. In this way, the first parameter value (KPAR value) is guaranteed to be less than or equal to the first threshold in both cases.
The number of Brillouin zone K points depends on the crystal system calculated by the VASP software and on the precision of the reciprocal-space K-point grid (KPOINTS). The higher the precision, the more points are sampled in the Brillouin zone and the more complete the crystal information, so the number of Brillouin zone K points is larger; conversely, the lower the precision, the smaller the number of Brillouin zone K points.
In an embodiment of the invention, the total number of parallel running cores = first parameter value × second parameter value × third parameter value.
Based on this, when the target Brillouin zone K point number of the target case is small (less than or equal to the first threshold), dividing the K points into groups according to the first parameter (KPAR) still leaves sufficient core resources for the number of energy bands processed in parallel, represented by the second parameter (NPAR), and for the number of cores allocated on average to each energy band, represented by the third parameter (NCORE). In that case the first parameter value (KPAR value) can be determined according to step 240 described above, that is: if the target Brillouin zone K point number is less than or equal to the first threshold, the first parameter value may be determined based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores.
If the target Brillouin zone K point number of the target case is large (greater than the first threshold), dividing the K points into groups according to the first parameter (KPAR) would leave insufficient core resources for the number of energy bands processed in parallel, represented by the second parameter (NPAR), and for the number of cores allocated on average to each energy band, represented by the third parameter (NCORE). In that case the first parameter value needs to be reduced while remaining a common divisor of the target Brillouin zone K point number and the total number of parallel running cores. Accordingly, the first parameter value (KPAR value) may be determined according to step 250 described above, that is: if the target Brillouin zone K point number is greater than the first threshold, the first parameter value is determined based on a smaller common divisor of the target Brillouin zone K point number and the total number of parallel running cores, so that the first parameter value is less than or equal to the first threshold.
Therefore, the first threshold measures whether, once the first parameter has been set to a common divisor of the target Brillouin zone K point number and the total number of parallel running cores, the resulting second and third parameters still ensure that each energy band in each group of energy bands of the target case is allocated sufficient core resources for calculation. By keeping the first parameter value less than or equal to the first threshold, the embodiment of the invention ensures that each energy band in each group of energy bands of the target case is allocated sufficient core resources. The invention thus takes the upper performance limit of the hardware into account and ensures that the VASP software runs without triggering a hardware performance bottleneck.
In step 260, the second parameter value and the third parameter value are determined based on the total number of parallel running cores of the VASP software and the first parameter value determined in the preceding steps.
In an embodiment of the invention, the second parameter value may be determined based on the square root of the ratio of the total number of parallel running cores to the first parameter value.
The third parameter value is determined based on the total number of parallel running cores, the first parameter value, and the second parameter value. Here, the values determined above may be substituted into the following formula: total number of parallel running cores = first parameter value × second parameter value × third parameter value, from which the third parameter value can be determined.
In addition, in view of the symmetry of the Brillouin zone, the second parameter value needs to be rounded down to an even number; see in particular the embodiments described below.
In some embodiments, the second parameter value is rounded down to an even number in the following manner: when determining the second parameter value based on the square root of the ratio of the total number of parallel running cores to the first parameter value, it is first determined whether that square root is an even number, and if so, the square root may be determined as the second parameter value. If the square root of the ratio of the total number of parallel running cores to the first parameter value is not an even number, the largest even number below the square root is taken, and that even number is determined as the second parameter value.
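The rounding rule and the product relation of step 260 can be sketched as follows. This is a literal reading of the rule under stated assumptions, not a definitive implementation: the text says the second parameter value is determined "based on" the square root, so an implementation may apply further adjustments that this sketch does not model. The function name `choose_npar_ncore` is hypothetical, and stepping down through smaller even numbers until one divides the per-group core count (so that NCORE is an integer) is an assumption of this sketch.

```python
import math


def choose_npar_ncore(total_cores: int, kpar: int) -> tuple:
    """Return (NPAR, NCORE) with total_cores == kpar * NPAR * NCORE."""
    if total_cores % kpar != 0:
        raise ValueError("kpar must divide total_cores")
    per_group = total_cores // kpar   # cores available to each K-point group
    root = math.isqrt(per_group)      # integer part of the square root
    npar = root - (root % 2)          # largest even number not above the root
    # Assumption of this sketch: try smaller even numbers until NPAR divides
    # the per-group core count, so that NCORE comes out as an integer.
    while npar >= 2 and per_group % npar != 0:
        npar -= 2
    if npar < 2:
        npar = 1                      # fallback when no even divisor exists
    ncore = per_group // npar
    return npar, ncore


print(choose_npar_ncore(96, 2))   # (6, 8): sqrt(48) is about 6.93, rounded down to 6
print(choose_npar_ncore(128, 2))  # (8, 8): sqrt(64) is exactly 8, already even
```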
In some embodiments, in step 240, if the target Brillouin zone K point number is less than or equal to the first threshold, determining the first parameter value based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores may comprise determining that greatest common divisor and taking it as the first parameter value.
In step 250, if the target Brillouin zone K point number is greater than the first threshold, determining the first parameter value based on a smaller common divisor of the target Brillouin zone K point number and the total number of parallel running cores may comprise first determining all common divisors of the target Brillouin zone K point number and the total number of parallel running cores, then selecting from them the one or more target common divisors that are less than or equal to the first threshold, and determining the first parameter value based on those target common divisors. Specifically, the largest of the target common divisors that are less than or equal to the first threshold may be determined as the first parameter value.
That is, in the embodiment of the present invention, subject to the condition that the first parameter value is less than or equal to the first threshold, the largest common divisor of the target Brillouin zone K point number and the total number of parallel running cores is determined as the first parameter value, so that the hardware performance bottleneck of the cluster is not triggered and the hardware resources of the cluster can be fully utilized.
In some embodiments, the first threshold may be 10% of the total number of cores running in parallel.
That is, in step 240, determining the first parameter value based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores when the target Brillouin zone K point number is less than or equal to the first threshold may be specifically implemented as: if the target Brillouin zone K point number is less than or equal to 10% of the total number of parallel running cores, determining the first parameter value based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores.
In step 250, determining the first parameter value based on a smaller common divisor of the target Brillouin zone K point number and the total number of parallel running cores when the target Brillouin zone K point number is greater than the first threshold, so that the first parameter value is less than or equal to the first threshold, may be specifically implemented as: if the target Brillouin zone K point number is greater than 10% of the total number of parallel running cores, determining the first parameter value based on a smaller common divisor of the target Brillouin zone K point number and the total number of parallel running cores, so that the first parameter value is less than or equal to 10% of the total number of parallel running cores.
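Steps 240 and 250 with the 10% threshold can be sketched as below. The function names and the `fraction` parameter are hypothetical; returning 1 when no common divisor fits under the threshold is an assumption of this sketch rather than something the text specifies.

```python
import math


def common_divisors(a: int, b: int) -> list:
    """All common divisors of a and b, in increasing order."""
    g = math.gcd(a, b)
    return [d for d in range(1, g + 1) if g % d == 0]


def choose_kpar(k_points: int, total_cores: int, fraction: float = 0.10) -> int:
    """Pick KPAR as a common divisor that does not exceed the first threshold."""
    threshold = fraction * total_cores
    if k_points <= threshold:
        # Step 240: the greatest common divisor is already within the threshold.
        return math.gcd(k_points, total_cores)
    # Step 250: largest common divisor that is still within the threshold.
    candidates = [d for d in common_divisors(k_points, total_cores) if d <= threshold]
    return max(candidates) if candidates else 1  # fallback is an assumption


print(choose_kpar(4, 192))   # 4  (4 <= 19.2, gcd(4, 192) = 4)
print(choose_kpar(20, 180))  # 10 (20 > 18; common divisors 1, 2, 4, 5, 10, 20)
```

The two calls use the K point numbers and core counts of the embodiments described in the text and return the KPAR values reported there.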
The optimality of the parallel parameters determined by the method 200 of determining VASP software parallel parameters according to the present invention will be described in detail below with two specific embodiments.
In a first embodiment, the target case of the VASP software is, for example, the As256Ga256 system, which has 512 atoms in total and can be run on 192 cores.
According to step 210, it may be determined that the target case information includes a target Brillouin zone K point number of 4, a target band number of 1277, and 512 atoms for the target case.
The node hardware configuration information of the cluster obtained according to step 220 is as follows: the number of CPU cores of a single node is 96.
According to step 230, the total number of nodes applied for by the VASP software is determined to be 2, and the total number of parallel running cores of the VASP software is accordingly determined to be 192.
According to steps 240-260, the first parameter value (KPAR value) is determined to be 4, the second parameter value (NPAR value) is determined to be 4, and the third parameter value (NCORE value) is determined to be 12. The first, second, and third parameter values determined here are the values of the parallel parameters optimized by the method 200 of determining VASP software parallel parameters according to the present invention.
Next, the parallel parameters in this first embodiment are evaluated by testing. With a first parameter value (KPAR value) of 4, a second parameter value (NPAR value) of 4, and a third parameter value (NCORE value) of 12, the time for a single electronic step is 117.26 s and the average memory usage ratio is 27.52%. With the parallel parameters determined according to the prior art, the measured electronic step time is 145.05 s and the average memory usage ratio is 27.64%. The method 200 for determining the parallel parameters of the VASP software therefore significantly improves the computational efficiency of the VASP software.
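To put the two reported step times in proportion, the following short calculation derives the speedup and the relative time saving from the figures above. The variable names are illustrative; only the two timings come from the text.

```python
optimized_s = 117.26  # single electronic step time with KPAR=4, NPAR=4, NCORE=12
baseline_s = 145.05   # single electronic step time with the prior-art parameters

speedup = baseline_s / optimized_s
time_saved = (baseline_s - optimized_s) / baseline_s

print(f"speedup: {speedup:.2f}x")       # speedup: 1.24x
print(f"time saved: {time_saved:.1%}")  # time saved: 19.2%
```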
Fig. 3 shows a comparison of test data based on respective parallel parameter combinations for three examples in a second embodiment.
In a second embodiment, the node hardware configuration information of the cluster is as follows: the memory capacity is 384 GB and the upper limit of the memory access bandwidth is 371 GB/s. Based on this cluster configuration, three examples demonstrate the effect of the values of the parallel parameters of the VASP software on memory usage, memory bandwidth, and network bandwidth, where KPAR=1 is the default behavior of the VASP software when no parallel parameters are set, and the best parallel parameter combination for each example is shown in fig. 3.
As shown in fig. 3, the first example is C128N3H15O3Cu1, with a target Brillouin zone K point number of 4 and 150 atoms, run on 288 cores. According to the method 200 for determining the parallel parameters of the VASP software of the present invention, the best parallel parameter combination is obtained: KPAR=4, NPAR=6. The memory access bandwidth of 298.32 GB/s obtained with this combination is the maximum value observed in the tests of this example, so the memory access bandwidth of the cluster is fully utilized without reaching the upper limit at which memory access becomes a bottleneck.
The second example is Si, with a target Brillouin zone K point number of 4 and 104 atoms, run on 192 cores. With the parallel parameter combination obtained according to the prior art, KPAR=1 and NPAR=192, the memory access bandwidth is 369.28 GB/s, which is essentially at the 371 GB/s upper limit, so the memory access performance bottleneck of the hardware is triggered and the running performance is affected. According to the method 200 of determining the parallel parameters of the VASP software of the present invention, the best parallel parameter combination is KPAR=4, NPAR=3, and the running performance with this combination is optimal.
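The bandwidth figures quoted for the first two examples can be compared against the 371 GB/s upper limit as follows. This is plain arithmetic on the reported values; the helper name `utilization` is hypothetical.

```python
PEAK_GB_S = 371.0  # upper limit of the memory access bandwidth in this embodiment


def utilization(measured_gb_s: float, peak_gb_s: float = PEAK_GB_S) -> float:
    """Fraction of the peak memory access bandwidth that a run reaches."""
    return measured_gb_s / peak_gb_s


runs = [
    ("first example, KPAR=4, NPAR=6", 298.32),
    ("second example, KPAR=1, NPAR=192", 369.28),
]
for label, measured in runs:
    print(f"{label}: {utilization(measured):.1%} of peak")
# first example, KPAR=4, NPAR=6: 80.4% of peak
# second example, KPAR=1, NPAR=192: 99.5% of peak
```

The first run leaves roughly a fifth of the bandwidth as headroom, whereas the prior-art combination in the second example runs at the limit.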
The third example is Ni3O17N2C44H14, with a target Brillouin zone K point number of 20 and 80 atoms, run on 180 cores. First, the tests of this example show that the memory usage of the task varies with the parallel parameters of the VASP software, and errors caused by insufficient memory can even occur. The parallel parameters of the VASP software therefore need to be set according to the actual hardware conditions. In addition, the number of Brillouin zone K points in this example is large; if KPAR is set equal to the number of Brillouin zone K points (e.g., the data corresponding to example numbers 9 and 10), memory usage may become excessive, so that memory is insufficient and the job cannot run. In step 250 of the method 200 for determining parallel parameters of the VASP software according to the present invention, by taking the smaller common divisor 10 of the target Brillouin zone K point number and the total number of parallel running cores of the third example, the best parallel parameter combination KPAR=10, NPAR=1 is obtained.
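The text reports KPAR and NPAR for the three examples but does not state the corresponding NCORE values. Under the relation total number of parallel running cores = KPAR × NPAR × NCORE, the implied NCORE can be derived as sketched below. The function name is hypothetical, and the derived values are arithmetic consequences of that relation, not figures quoted from fig. 3.

```python
def implied_ncore(total_cores: int, kpar: int, npar: int) -> int:
    """NCORE implied by total_cores = KPAR * NPAR * NCORE."""
    groups = kpar * npar
    if total_cores % groups != 0:
        raise ValueError("KPAR * NPAR must divide the total core count")
    return total_cores // groups


# (label, total cores, KPAR, NPAR) as reported for the second embodiment
reported = [
    ("first example", 288, 4, 6),
    ("second example", 192, 4, 3),
    ("third example", 180, 10, 1),
]
for label, cores, kpar, npar in reported:
    print(label, implied_ncore(cores, kpar, npar))
# first example 12
# second example 16
# third example 18
```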
It can be seen that, according to the method 200 for determining parallel parameters of the VASP software provided in the embodiment of the present invention, the target case information of the VASP software is determined, the node hardware configuration information of the cluster is obtained, and the total number of parallel running cores and the total number of nodes applied for by the VASP software are determined according to the target case information and the node hardware configuration information. When the target Brillouin zone K point number is less than or equal to the first threshold, the first parameter value is determined based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores; when the target Brillouin zone K point number is greater than the first threshold, the first parameter value is determined based on a smaller common divisor of the two, so that the first parameter value is less than or equal to the first threshold. The second parameter value and the third parameter value are then determined based on the total number of parallel running cores and the first parameter value. Therefore, according to the technical scheme of the invention, the hardware performance bottleneck of the cluster is not triggered, and the computational efficiency of the VASP software can be improved while the hardware resources of the cluster are fully utilized. This avoids both limited computational efficiency caused by the VASP software reaching a hardware performance bottleneck and job failure caused by insufficient memory capacity.
Fig. 4 shows a schematic diagram of an apparatus 400 for determining a VASP software parallelism parameter according to an embodiment of the invention. The apparatus 400 for determining VASP software parallelism parameters resides in the computing device 100. The apparatus 400 for determining VASP software parallelism parameters may be configured to perform the method 200 of determining VASP software parallelism parameters of the present invention.
In embodiments of the present invention, the VASP software runs on a cluster (i.e., a computer cluster) that includes a plurality of nodes (i.e., computer nodes). The parallel parameters of the VASP software (i.e., the parallel parameters to be optimized) include at least a first parameter, a second parameter, and a third parameter, where the first parameter characterizes the number of Brillouin zone K points processed in parallel, the second parameter characterizes the number of energy bands processed in parallel, and the third parameter characterizes the number of cores allocated, on average, to each energy band.
In an embodiment of the present invention, the first parameter is KPAR, the second parameter is NPAR, and the third parameter is NCORE.
As shown in fig. 4, in an embodiment of the present invention, the apparatus 400 for determining the parallel parameters of the VASP software includes a first determining module 410, an obtaining module 420, a third determining module 430, a fourth determining module 440, a fifth determining module 450, and a sixth determining module 460, which are communicatively connected in sequence.
The first determining module 410 may determine target case information of the VASP software, where the target case information includes the target Brillouin zone K point number of the target case, the target band number of the target case, and the number of atoms of the target case.
The obtaining module 420 may obtain node hardware configuration information of the cluster, where the node hardware configuration information includes one or more of the CPU core number, the memory capacity, and the memory access bandwidth of a single node.
The third determining module 430 may determine the total number of parallel running cores of the VASP software and the total number of nodes applied for according to the target case information and the node hardware configuration information.
The fourth determining module 440 may, when the target Brillouin zone K point number is determined to be less than or equal to the first threshold, determine the first parameter value based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores, where the first threshold is related to the total number of parallel running cores.
The fifth determining module 450 may, when the target Brillouin zone K point number is determined to be greater than the first threshold, determine the first parameter value based on a smaller common divisor of the target Brillouin zone K point number and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold.
The sixth determining module 460 may determine the second parameter value and the third parameter value based on the total number of parallel running cores and the first parameter value.
It should be noted that the first determining module 410, the obtaining module 420, the third determining module 430, the fourth determining module 440, the fifth determining module 450, and the sixth determining module 460 are respectively configured to perform the foregoing steps 210 to 260. For specific execution logic of each module, reference is made to the descriptions of steps 210 to 260 in the foregoing method 200, and no further description is given here.
According to the apparatus 400 for determining parallel parameters of the VASP software provided in the embodiment of the present invention, the target case information of the VASP software is determined, the node hardware configuration information of the cluster is obtained, and the total number of parallel running cores and the total number of nodes applied for by the VASP software are determined according to the target case information and the node hardware configuration information. When the target Brillouin zone K point number is less than or equal to the first threshold, the first parameter value is determined based on the greatest common divisor of the target Brillouin zone K point number and the total number of parallel running cores; when the target Brillouin zone K point number is greater than the first threshold, the first parameter value is determined based on a smaller common divisor of the two, so that the first parameter value is less than or equal to the first threshold. The second parameter value and the third parameter value are then determined based on the total number of parallel running cores and the first parameter value. Therefore, according to the technical scheme of the invention, the hardware performance bottleneck of the cluster is not triggered, and the computational efficiency of the VASP software can be improved while the hardware resources of the cluster are fully utilized. This avoids both limited computational efficiency caused by the VASP software reaching a hardware performance bottleneck and job failure caused by insufficient memory capacity.
The various techniques described herein may be implemented in hardware, in software, or in a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy diskettes, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code; the processor is configured to execute the method of determining the VASP software parallel parameters of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media comprise readable storage media and communication media. The readable storage medium stores information such as computer readable instructions, data structures, program modules, or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with examples of the invention. The required structure for a construction of such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into a plurality of sub-modules.
Unless otherwise specified, the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.

Claims (10)

1. A method of determining parallel parameters of VASP software, the VASP software being adapted to run in a cluster comprising a plurality of nodes, the parallel parameters comprising a first parameter, a second parameter, and a third parameter, wherein the first parameter characterizes the number of Brillouin zone K-points processed in parallel, the second parameter characterizes the number of energy bands processed in parallel, and the third parameter characterizes the number of cores evenly allocated to each energy band, the method comprising:
determining target calculation example information of the VASP software, wherein the target calculation example information comprises the number of target Brillouin zone K-points of the target calculation example, the number of target energy bands of the target calculation example, and the number of atoms of the target calculation example;
acquiring node hardware configuration information of the cluster, wherein the node hardware configuration information comprises one or more of CPU core number, memory capacity and access bandwidth of a single node;
determining the total number of parallel running cores of the VASP software and the total number of applied nodes according to the target calculation example information and the node hardware configuration information;
if the number of target Brillouin zone K-points is less than or equal to a first threshold, determining a first parameter value based on the greatest common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores, wherein the first threshold is related to the total number of parallel running cores;
if the number of target Brillouin zone K-points is greater than the first threshold, determining a first parameter value based on a smaller common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold; and
determining a second parameter value and a third parameter value based on the total number of parallel running cores and the first parameter value.
2. The method of claim 1, wherein determining a second parameter value and a third parameter value based on the total number of parallel running cores and the first parameter value comprises:
determining the second parameter value based on the square root of the ratio of the total number of parallel running cores to the first parameter value; and
determining the third parameter value based on the total number of parallel running cores, the first parameter value, and the second parameter value, wherein the total number of parallel running cores = first parameter value × second parameter value × third parameter value.
3. The method of claim 2, wherein determining the second parameter value based on the square root of the ratio of the total number of parallel running cores to the first parameter value comprises:
judging whether the square root of the ratio of the total number of parallel running cores to the first parameter value is an even number, and if so, determining that square root as the second parameter value; and
if the square root of the ratio of the total number of parallel running cores to the first parameter value is not an even number, rounding the square root down to the largest even number, and determining that largest even number as the second parameter value.
4. The method of any one of claims 1 to 3, wherein determining a first parameter value based on the greatest common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores comprises:
determining the greatest common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores as the first parameter value.
5. The method of any one of claims 1 to 4, wherein determining a first parameter value based on a smaller common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores, such that the first parameter value is less than or equal to the first threshold, comprises:
determining all common divisors of the number of target Brillouin zone K-points and the total number of parallel running cores; and
selecting, from all the common divisors, one or more target common divisors that are less than or equal to the first threshold, and determining the first parameter value based on the one or more target common divisors.
6. The method of claim 5, wherein determining the first parameter value based on the one or more target common divisors comprises:
determining the largest target common divisor among the one or more target common divisors as the first parameter value.
7. The method of any one of claims 1 to 6, wherein:
determining a first parameter value based on the greatest common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores if the number of target Brillouin zone K-points is less than or equal to a first threshold comprises:
if the number of target Brillouin zone K-points is less than or equal to 10% of the total number of parallel running cores, determining the first parameter value based on the greatest common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores; and
determining a first parameter value based on a smaller common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores if the number of target Brillouin zone K-points is greater than the first threshold, such that the first parameter value is less than or equal to the first threshold, comprises:
if the number of target Brillouin zone K-points is greater than 10% of the total number of parallel running cores, determining the first parameter value based on a smaller common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores, such that the first parameter value is less than or equal to 10% of the total number of parallel running cores.
8. An apparatus for determining parallel parameters of VASP software, the VASP software being adapted to run in a cluster comprising a plurality of nodes, the parallel parameters comprising a first parameter, a second parameter, and a third parameter, wherein the first parameter characterizes the number of Brillouin zone K-points processed in parallel, the second parameter characterizes the number of energy bands processed in parallel, and the third parameter characterizes the number of cores evenly allocated to each energy band, the apparatus comprising:
a first determining module, adapted to determine target calculation example information of the VASP software, wherein the target calculation example information comprises the number of target Brillouin zone K-points of the target calculation example, the number of target energy bands of the target calculation example, and the number of atoms of the target calculation example;
an acquisition module, adapted to acquire node hardware configuration information of the cluster, wherein the node hardware configuration information comprises one or more of CPU core number, memory capacity and access bandwidth of a single node;
a third determining module, adapted to determine the total number of parallel running cores of the VASP software and the total number of applied nodes according to the target calculation example information and the node hardware configuration information;
a fourth determining module, adapted to determine a first parameter value based on the greatest common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores if the number of target Brillouin zone K-points is less than or equal to a first threshold, wherein the first threshold is related to the total number of parallel running cores;
a fifth determining module, adapted to determine a first parameter value based on a smaller common divisor of the number of target Brillouin zone K-points and the total number of parallel running cores if the number of target Brillouin zone K-points is greater than the first threshold, such that the first parameter value is less than or equal to the first threshold; and
a sixth determining module, adapted to determine a second parameter value and a third parameter value based on the total number of parallel running cores and the first parameter value.
9. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are adapted to be executed by the at least one processor, the program instructions comprising instructions for performing the method of any of claims 1-7.
10. A readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform the method of any of claims 1-7.
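For readers who want a concrete reading of the square-root rule in claims 2 and 3, the following Python sketch derives the second and third parameter values from a given first parameter value. It is a hedged illustration rather than the claimed implementation: the function name is hypothetical, the floor of the square root is taken with `math.isqrt`, the fallback to 1 when the square root is below 2 is an assumption, and the sketch raises an error instead of guessing when the product relation of claim 2 cannot be met exactly, since the claims as written do not state how that case is handled.

```python
import math


def second_and_third_parameters(total_cores: int, first: int) -> tuple[int, int]:
    """Derive (second, third) so that total_cores == first * second * third."""
    if first <= 0 or total_cores % first != 0:
        raise ValueError("first parameter must be a positive divisor of total_cores")

    ratio = total_cores // first

    # Largest even integer not exceeding sqrt(ratio): take the integer floor
    # of the square root, then step down by one if that floor is odd.
    root = math.isqrt(ratio)
    second = root if root % 2 == 0 else root - 1

    # Assumption of this sketch: use 1 when no positive even value exists.
    if second < 2:
        second = 1

    # The claims require an exact product; this sketch refuses to guess otherwise.
    if ratio % second != 0:
        raise ValueError("no integer third parameter satisfies the product relation")

    third = ratio // second
    return second, third


if __name__ == "__main__":
    second, third = second_and_third_parameters(96, 2)
    print(second, third, 2 * second * third)
```

For 96 cores and a first parameter value of 2, the ratio is 48, its square root lies between 6 and 7, and the largest even number not exceeding it is 6, which leaves 8 cores per energy band.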
CN202311799193.7A 2023-12-25 2023-12-25 Method, device, computing equipment and storage medium for determining VASP software parallel parameters Pending CN117724847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311799193.7A CN117724847A (en) 2023-12-25 2023-12-25 Method, device, computing equipment and storage medium for determining VASP software parallel parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311799193.7A CN117724847A (en) 2023-12-25 2023-12-25 Method, device, computing equipment and storage medium for determining VASP software parallel parameters

Publications (1)

Publication Number Publication Date
CN117724847A true CN117724847A (en) 2024-03-19

Family

ID=90199742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311799193.7A Pending CN117724847A (en) 2023-12-25 2023-12-25 Method, device, computing equipment and storage medium for determining VASP software parallel parameters

Country Status (1)

Country Link
CN (1) CN117724847A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230253294A1 (en) * 2022-02-09 2023-08-10 Samsung Electronics Co., Ltd. Computing device and electronic device guaranteeing bandwidth per computational performance

Similar Documents

Publication Publication Date Title
CN109669774B (en) Hardware resource quantification method, hardware resource arrangement method, hardware resource quantification device and hardware resource arrangement device and network equipment
US9213584B2 (en) Varying a characteristic of a job profile relating to map and reduce tasks according to a data size
US8966038B2 (en) Virtual server system and physical server selection method
US20130318538A1 (en) Estimating a performance characteristic of a job using a performance model
US11231852B2 (en) Efficient sharing of non-volatile memory
US20200137581A1 (en) Resource utilization of heterogeneous compute units in electronic design automation
WO2024245038A1 (en) Method and apparatus for scheduling virtual cloud computing resources
CN117724847A (en) Method, device, computing equipment and storage medium for determining VASP software parallel parameters
US11429299B2 (en) System and method for managing conversion of low-locality data into high-locality data
CN118519768A (en) Method, device, equipment and storage medium for overflowing data to shared buffer memory
Kim et al. Coordinating compaction between lsm-tree based key-value stores for edge federation
US9501328B2 (en) Method for exploiting parallelism in task-based systems using an iteration space splitter
Labasan et al. Power and performance tradeoffs for visualization algorithms
US20240134708A1 (en) Bin Packing
US10365997B2 (en) Optimizing DRAM memory based on read-to-write ratio of memory access latency
WO2024012153A1 (en) Data processing method and apparatus
CN116302527A (en) A social network data analysis method, system and electronic equipment
CN114546643A (en) NUMA-aware parallel computing method and system for ARM architecture
CN106940682B (en) Embedded system optimization method based on-chip programmable memory
CN103246563B (en) A kind of multilamellar piecemeal dispatching method with storage perception
Cho et al. Performance Benchmark of Cahn–Hilliard Equation Solver with Implementation of Semi-implicit Fourier Spectral Method
Shi et al. Integrating theoretical modeling and experimental measurement for soft resource allocation in multi-tier web systems
US20240202030A1 (en) Proportional performance metric control for physical functions of a memory device
EP4528505A1 (en) Data processing method, apparatus, device, and system
US20250045102A1 (en) Systems, methods, and apparatus for assigning compute tasks to computational devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination