CN115904617A - GPU virtualization implementation method based on SR-IOV technology - Google Patents
GPU virtualization implementation method based on SR-IOV technology
- Publication number
- CN115904617A (CN202211419575.8A)
- Authority
- CN
- China
- Prior art keywords
- virtual
- virtual machine
- host
- gpu
- graphics
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
 
Landscapes
- Stored Programmes (AREA)
Abstract
The invention belongs to the technical field of computer virtualization and provides a GPU virtualization implementation method based on SR-IOV technology, comprising the following steps: S1, dividing the independent resources of a physical GPU through SR-IOV technology; S2, installing a virtual graphics card driver in each user-layer virtual machine; S3, establishing a communication and transmission mechanism between the Host OS of the Hypervisor layer and the user-layer virtual machines; and S4, after a user-layer virtual machine initiates an API request, transmitting the API request to the Host OS through the Hypervisor layer, where the Host OS processes requests centrally, ordered by event priority and then by timestamp; each virtual machine can access only its own independent resources, which guarantees resource isolation and security in use. The method uses a layered design, flexibly divides the physical GPU resources according to the number of virtual machines, and specifically defines the function and workflow of each part of the virtualization scheme, thereby helping to address the low resource utilization and unguaranteed resource isolation of traditional GPU virtualization schemes.
    Description
Technical Field
      The invention belongs to the technical field of computer virtualization, and particularly relates to a GPU virtualization implementation method based on an SR-IOV technology.
    Background
      In recent years, as demands on the utilization and security of graphics card resources in multi-user scenarios have grown, GPU virtualization technology has emerged. GPU virtualization abstracts the hardware resources of a GPU so that they can be shared among multiple users, with the aim of maximizing GPU resource utilization. As a system-level software and hardware solution, and with users' demands on GPU resources diversifying, GPU virtualization is widely applied in cloud sharing platforms, VDI, remote assistance and similar fields. The technology provides flexible management of user PCs while making full use of the GPU's internal computing resources, scales well, and isolates video memory, computing power and faults between users.
      At present, GPU virtualization is mainly implemented by API redirection. In this scheme, hardware details are not considered: API requests that call the graphics library inside a virtual machine are intercepted and sent directly to the physical GPU for processing. However, this greatly increases the CPU's burden of managing all the virtual machines, and, particularly when several virtual machines are in use at the same time, it is difficult to guarantee context isolation and frame buffer isolation while the GPU is time-shared.
      In addition, to reduce the system load, another scheme divides the physical GPU directly into several modules. In use, each divided portion of the physical GPU is provided to an upper-layer virtual machine in passthrough mode, and each virtual machine installs a real physical graphics card driver. However, this scheme suits only deployments with a fixed number of virtual machines. Because the physical GPU resources are divided in advance, resources are wasted whenever the number of virtual machines is smaller than the number of GPU partitions. Moreover, because each virtual machine has a real physical graphics card driver installed, there is a security risk: a single virtual machine may illegally access physical GPU resources belonging to other partitions.
    Disclosure of Invention
      In view of the foregoing problems, the object of the present invention is to provide a GPU virtualization implementation method based on SR-IOV technology, so as to solve the low resource utilization and the unguaranteed security of resource isolation in conventional GPU virtualization schemes.
      The invention adopts the following technical scheme:
      The GPU virtualization implementation method based on SR-IOV technology comprises the following steps:
      S1, dividing the independent resources of a physical GPU through SR-IOV technology;
      S2, installing a virtual graphics card driver in each user-layer virtual machine;
      S3, establishing a communication and transmission mechanism between the Host OS of the Hypervisor layer and the user-layer virtual machines;
      S4, after a user-layer virtual machine initiates an API request, transmitting the API request to the Host OS through the Hypervisor layer, where the Host OS processes requests centrally, ordered by event priority and then by timestamp; each virtual machine can access only its own independent resources, which guarantees resource isolation and security in use.
      Further, the specific process of step S1 is as follows:
      S11, determining the number of partitions of the physical GPU from the number of configured virtual machines; the number of partitions is always even, so if the number of virtual machines is odd, 1 is added to it;
      S12, dividing the address space in the physical GPU into a first space and a second space, wherein the first space holds the configurable control register space, the display output module (connector), and the non-configurable computing-power and video encoding/decoding scheduling unit, and the second space is provided for access by the upper-layer graphics controller and stores graphics context and temporary data;
      S13, dividing the first space and the second space evenly according to the number of partitions of the physical GPU;
      S14, establishing the mapping between virtual addresses and physical addresses through an MMIO page table.
      Further, the specific process of step S2 is as follows:
      S21, configuring the virtual machines in non-passthrough mode;
      S22, placing the virtual machines on the Hypervisor layer and assigning each a number, wherein the virtual machine numbered 0 is the Host OS with the highest privilege level;
      S23, when the virtual graphics card driver is installed, each user-layer virtual machine with a number greater than 0 stores its own virtual machine identifier and records the address space allocated to it and mapped through the MMIO page table; the virtual graphics card driver provides the graphics library API functions of the standard graphics library OpenGL to upper-layer applications and is used to intercept and forward the API requests of the user-layer virtual machine.
      Further, the specific process of step S3 is as follows:
      S31, automatically creating Virtio default-type virtual devices in the Hypervisor layer, in one-to-one correspondence with the virtual machines, wherein the front-end driver of each Virtio default-type virtual device is provided by the virtual graphics card driver, and the back-end driver only receives the information sent by the front-end driver and forwards it to the Host OS;
      S32, the back-end driver sets up a ring queue (VirtQueue) to exchange data between the Host OS and the virtual machine.
      Further, the specific process of step S4 is as follows:
      S41, a user-layer program that displays or renders graphics first calls graphics middleware; the graphics middleware decomposes the display or rendering task into basic drawing requirements and then calls the underlying standard graphics library OpenGL, generating API requests for drawing basic graphics;
      S42, after the virtual graphics card driver of the user-layer virtual machine receives the API request through an API interface identical to the one requested, it records the function parameters, the event priority determined from the drawing task, the order in which the API was called and its timestamp, the configured virtual GPU identifier, and software state information;
      S43, packaging the information recorded by the virtual graphics card driver into the API request and transmitting it, through the Virtio default-type virtual device created by the Hypervisor layer, to the back-end driver of the Hypervisor layer;
      S44, the back-end driver of the Hypervisor layer transmits the API requests of all user virtual machines in turn to the Host OS numbered 0 through the ring queue (VirtQueue);
      S45, the Host OS establishes several independent transmission channels to receive and parse the API requests, processes higher-priority API requests first and API requests of equal priority in timestamp order, and calls the native graphics card driver in separate threads to access the physical GPU;
      S46, the physical GPU processes the API requests passed by the Host OS and returns the results to the Host OS as return values; the Host OS passes them to the corresponding virtual machine through the VirtQueue of the Hypervisor layer.
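Steps S42 and S43 can be pictured as building one record per intercepted call and serializing it before it crosses the Virtio device. The following is a minimal sketch under stated assumptions: the `ApiRequest` type, its field names, and the JSON wire format are all hypothetical and are not specified by the patent.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ApiRequest:
    """Hypothetical record of what S42 says the driver stores for each call."""
    function: str          # name of the intercepted graphics library function
    params: list           # function parameters
    priority: int          # event priority derived from the drawing task
    sequence: int          # order in which the API was called
    timestamp: float       # time of the call
    vgpu_id: int           # configured virtual GPU identifier
    state: dict = field(default_factory=dict)  # software state information

    def pack(self) -> bytes:
        # Assumed wire format: UTF-8 JSON. The patent does not name one.
        return json.dumps(asdict(self)).encode("utf-8")


def unpack(raw: bytes) -> ApiRequest:
    """Rebuild the request on the receiving (back-end or Host OS) side."""
    return ApiRequest(**json.loads(raw.decode("utf-8")))


request = ApiRequest("glDrawArrays", [4, 0, 3], priority=1, sequence=7,
                     timestamp=123.5, vgpu_id=2, state={"context": "main"})
wire = request.pack()
```

Keeping the priority, timestamp and virtual GPU identifier inside the packed request is what lets the Host OS order and route requests in S45 and S46 without any extra lookup.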
      The beneficial effects of the invention are as follows: the GPU virtualization implementation method based on SR-IOV technology uses a layered design, so each of the layered modules can be handled and improved independently. In use, each virtual machine has a virtual graphics card driver installed, the Hypervisor layer establishes communication between the virtual machines and the Host OS, and the independent resources of the physical GPU are re-divided; the Host OS receives the API requests of all virtual machines and processes them centrally; computing resources and video encoding/decoding resources are time-share scheduled among the different API requests through direct access to the physical GPU, and the computed results are passed back to the corresponding virtual machine. The scheme of the invention can maximize GPU resource utilization, reduce the load on the CPU, and ensure security in use under resource isolation.
    Drawings
      FIG. 1 is a flowchart of a GPU virtualization implementation method based on SR-IOV technology according to an embodiment of the present invention;
      FIG. 2 is a framework diagram for implementing the GPU virtualization implementation method according to the embodiment of the present invention.
    Detailed Description
      In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
      In order to illustrate the technical means of the present invention, the following description is given by way of specific examples.
      FIG. 1 shows the flow of a GPU virtualization implementation method based on SR-IOV technology according to an embodiment of the present invention, and FIG. 2 shows the implementation framework of that embodiment; for convenience of description, only the parts related to the embodiment are shown.
      As shown in fig. 1 and 2, the method for implementing GPU virtualization based on SR-IOV technology includes the following steps:
      S1, dividing the independent resources of the physical GPU through SR-IOV technology.
      The physical GPU's independent resources, such as video memory, controllers and connection modules, are divided through SR-IOV technology. When an upper-layer I/O operation accesses a resource of the physical GPU, it does so through a separate physical address space and a partitioned virtual MMIO page table. Shared resources of the physical GPU, such as computing power and video encoding/decoding, are time-share scheduled based on task priority and timestamp.
      The specific process of this step is as follows:
      S11, determining the number of partitions of the physical GPU from the number of configured virtual machines; the number of partitions is always even, so if the number of virtual machines is odd, 1 is added to it.
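The rounding rule in S11 is small enough to state as code. This is a minimal sketch; the function name `partition_count` is hypothetical, and rejecting a count below 1 is an added assumption rather than something the text states.

```python
def partition_count(num_vms: int) -> int:
    """Return the number of physical GPU partitions for a given VM count.

    Per S11 the result is always even: an odd VM count is rounded up by one.
    """
    if num_vms < 1:
        # Assumption: at least one virtual machine must be configured.
        raise ValueError("at least one virtual machine must be configured")
    return num_vms + 1 if num_vms % 2 else num_vms


for vms in (1, 3, 4):
    print(vms, "->", partition_count(vms))
```

With three virtual machines the GPU is therefore divided four ways, which leaves one partition unused.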
      S12, dividing the address space in the physical GPU into a first space and a second space, wherein the first space holds the configurable control register space, the display output module (connector), and the non-configurable computing-power and video encoding/decoding scheduling unit, and the second space is provided for access by the upper-layer graphics controller and stores graphics context and temporary data.
      Taking a GPU with 32G of video memory as an example, 4G of video memory forms the first space, which holds the configurable control register space (crtc), the display output module (connector), and the non-configurable computing-power and video encoding/decoding scheduling unit; the remaining 28G of video memory is the second space, which is provided for access by the upper-layer graphics controller and stores graphics context and temporary data.
      S13, dividing the first space and the second space evenly according to the number of partitions of the physical GPU.
      After the number of physical GPU partitions has been determined, the 28G video memory space is divided evenly, as are the configurable control register space (crtc) and the physical space of the display output module (connector) within the 4G video memory space. Taking 4 virtual machines as an example, each virtual machine has an independent 7G space for graphics context and temporary data, and a separate 1G space for the control register space (crtc) and the display output module (connector).
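The arithmetic of the example above (32G total, 4G first space, 4 partitions) can be checked with a short sketch. The `Partition` type, the function name, the reading of "G" as GiB, and the contiguous layout (first space at offset 0, second space directly after it) are all assumptions made for illustration.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # assumption: the text's "G" is read as GiB


@dataclass(frozen=True)
class Partition:
    index: int
    control_base: int   # start of this partition's share of the first space
    control_size: int
    context_base: int   # start of this partition's share of the second space
    context_size: int


def split_video_memory(total: int, first_space: int, partitions: int) -> list[Partition]:
    """Divide the first and second spaces evenly among the partitions (S13)."""
    second_space = total - first_space
    if partitions <= 0 or first_space % partitions or second_space % partitions:
        raise ValueError("spaces must divide evenly among the partitions")
    control_size = first_space // partitions
    context_size = second_space // partitions
    # Assumed layout: the first space starts at offset 0, the second follows it.
    return [
        Partition(i, i * control_size, control_size,
                  first_space + i * context_size, context_size)
        for i in range(partitions)
    ]


parts = split_video_memory(32 * GIB, 4 * GIB, 4)
print(parts[0].control_size // GIB, parts[0].context_size // GIB)  # prints "1 7"
```

Each of the four partitions receives 1G of the first space and 7G of the second space, matching the figures in the text.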
      S14, establishing the mapping between virtual addresses and physical addresses through an MMIO page table.
      To keep upper-layer virtual machine access isolated and secure, and to avoid direct access to the physical address space, the mapping between virtual addresses and physical addresses is established through an MMIO page table. The virtual space that an upper-layer virtual machine sees and accesses is mapped to actual physical addresses by the MMIO page table. The MMIO page table is created and maintained by the native graphics card driver in the host operating system (Host OS). Each time the Host OS obtains a virtual address that a virtual machine needs to access, it resolves that address to the actual physical address based on the divided physical space and the MMIO page table.
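The translation and isolation described for S14 can be sketched as a per-VM lookup table. This is only an illustration: the `MmioPageTable` class, the 4 KiB page size, and the choice to raise `PermissionError` on an unmapped access are assumptions, not details from the patent.

```python
PAGE_SIZE = 4096  # assumed page granularity


class MmioPageTable:
    """Toy page table kept by the Host OS: (vm_id, virtual page) -> physical page."""

    def __init__(self) -> None:
        self._entries: dict[tuple[int, int], int] = {}

    def map_range(self, vm_id: int, virt_base: int, phys_base: int, size: int) -> None:
        """Map a page-aligned virtual range of one VM onto its physical partition."""
        if virt_base % PAGE_SIZE or phys_base % PAGE_SIZE or size % PAGE_SIZE:
            raise ValueError("bases and size must be page-aligned")
        for offset in range(0, size, PAGE_SIZE):
            virt_page = (virt_base + offset) // PAGE_SIZE
            self._entries[(vm_id, virt_page)] = (phys_base + offset) // PAGE_SIZE

    def translate(self, vm_id: int, virt_addr: int) -> int:
        """Resolve a VM's virtual address; refuse anything outside its mappings."""
        key = (vm_id, virt_addr // PAGE_SIZE)
        if key not in self._entries:
            raise PermissionError(f"VM {vm_id} has no mapping for {virt_addr:#x}")
        return self._entries[key] * PAGE_SIZE + virt_addr % PAGE_SIZE


table = MmioPageTable()
table.map_range(vm_id=1, virt_base=0x0000, phys_base=0x10000, size=0x2000)
print(hex(table.translate(1, 0x1004)))  # prints "0x11004"
```

Because the table is keyed by virtual machine, a request from virtual machine 2 for the same virtual address fails instead of reaching the memory of virtual machine 1.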
      When the physical GPU acts as an independent PCIe (Peripheral Component Interconnect Express) device performing I/O transmission and communication with the Host OS, SR-IOV technology can virtualize it into multiple PCIe devices, each occupying its own independent resources such as video memory, a controller and a connector. That is, the directly connected upper-layer Host OS no longer sees a single GPU device but several GPU units connected at the same time, each called a VF (Virtual Function).
      Implementing SR-IOV partitioning for a GPU first requires support in the hardware design: the hardware must be able to divide the device into a single PF (Physical Function) and its VFs (Virtual Functions). The Guest OS of an upper-layer user machine reaches the PF and its associated VFs mainly through I/O access.
      The PF accessed by the upper-layer Host OS contains all PCI functions of the GPU graphics card; it can directly discover, manage and process all of the PF's memory spaces and processing units, has fully configured resources, and can be used to configure or control the VF devices that have been divided. A VF, as a lightweight PCIe function, includes part of the GPU's physical functions and may share a partitioned physical resource with its associated PF; that is, the VF includes all storage addresses of its partitioned video memory, control unit and connector.
      The SR-IOV technology creates the PF and VF modules through configuration registers. The number of VF modules to be dynamically divided is first determined from the number of upper-layer virtual machines, so the VF modules must be re-divided whenever that number changes. The number of VFs assigned is determined by the CX_MAX_VF_N parameter designed into the PF module. In addition, once the VF management registers are confirmed to be programmable, the VF assignment can be changed at reset by writing to the PF's InitialVFs register. The written CX_MAX_VF_N value ensures that an SR-IOV-capable device initializes the InitialVFs register.
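The relationship between the virtual machine count, CX_MAX_VF_N and the InitialVFs register can be illustrated with a simplified software model. This is not the PCIe register interface: the `PhysicalFunctionModel` class and its methods are hypothetical, and treating CX_MAX_VF_N as an upper bound that seeds InitialVFs on reset is an assumption drawn from the parameter's name.

```python
from dataclasses import dataclass


@dataclass
class PhysicalFunctionModel:
    cx_max_vf_n: int       # parameter name from the text; assumed to be an upper bound
    initial_vfs: int = 0   # modeled value of the InitialVFs register
    enabled_vfs: int = 0   # number of VFs currently divided out

    def reset(self) -> None:
        """On reset, seed InitialVFs from CX_MAX_VF_N and drop all VFs."""
        self.initial_vfs = self.cx_max_vf_n
        self.enabled_vfs = 0

    def redivide(self, num_vms: int) -> bool:
        """Re-divide the VFs for a new VM count; return True if the division changed."""
        if not 0 <= num_vms <= self.initial_vfs:
            raise ValueError(f"cannot provide {num_vms} VFs (limit {self.initial_vfs})")
        changed = num_vms != self.enabled_vfs
        self.enabled_vfs = num_vms
        return changed


pf = PhysicalFunctionModel(cx_max_vf_n=8)
pf.reset()
pf.redivide(4)   # four upper-layer virtual machines are configured
```

Calling `redivide` again with the same count reports no change, which mirrors the statement that re-division is needed only when the number of upper-layer virtual machines changes.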
      GPU virtualization configuration is initialized, and the related configuration stored, mainly through display firmware such as vBIOS and UEFI. First, the local CPU (the host) overwrites the default values of the endpoint's internal SR-IOV registers via the local bus controller through the data bus interface. The remote high-privilege port then discovers the PF of the physical GPU, reads the SR-IOV function registers, and configures and enables the VF modules divided from it, matching ports to VFs. After the remote high-privilege port has been matched with the VF function modules, independent resources such as video memory, control units and connectors are allocated by configuring the PF and VF BARs. Finally, the remote high-privilege port writes the message signaled interrupt tables of the PF and VFs and enables internal message signaled interrupts, which completes the SR-IOV configuration of the GPU.
      This differs from the more common use of SR-IOV for network card virtualization, which only has to virtualize the I/O port in design and use: the original single I/O port is decomposed so that multiple virtual I/O channels can communicate. GPU virtualization with SR-IOV requires the physical GPU hardware to divide independent resources, and the functions exposed externally include independent access to video memory, to the controller, and to the connection module. In addition, after a network card is virtualized, upper-layer user software only needs to reach the different virtualized spaces through different address spaces. For GPU virtualization, the upper user layer or operating system distinguishes the virtualized graphics cards mainly by an assigned identifier. The identifier is set by hardware at division time and becomes visible to the upper layer when the device is loaded after configuration; any upper layer that accesses the fixed hardware resources through virtualization must carry the assigned identifier.
      After virtual division, the virtual hardware GPUs are connected directly to the Host OS over the same PCI Express link, and the Host OS addresses the different virtual GPUs mainly through GPU hardware information that carries the identifier. For shared resources such as computing power and video encoding/decoding, the hardware polls and schedules the different virtualized GPUs according to their identifiers.
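One way to picture identifier-based polling of a shared resource is a round-robin pass over the virtual GPU identifiers. The sketch below is an assumption about what "polling" means here; the function name and the one-job-per-turn policy are hypothetical and are done in hardware in the scheme described.

```python
from collections import deque


def poll_schedule(pending: dict[int, list[str]]) -> list[tuple[int, str]]:
    """Round-robin over virtual GPU identifiers, taking one pending job per turn.

    `pending` maps an identifier to its queued jobs for a shared resource
    (for example compute or video encode/decode).
    """
    queues = {vf_id: deque(jobs) for vf_id, jobs in sorted(pending.items())}
    order: list[tuple[int, str]] = []
    while any(queues.values()):          # an empty deque is falsy
        for vf_id, queue in queues.items():
            if queue:
                order.append((vf_id, queue.popleft()))
    return order


print(poll_schedule({1: ["encode-a", "encode-b"], 2: ["compute-c"]}))
# prints "[(1, 'encode-a'), (2, 'compute-c'), (1, 'encode-b')]"
```

No virtual GPU can monopolize the shared unit under this policy, since each identifier gets at most one job per polling round.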
      S2, installing a virtual graphics card driver in each user-layer virtual machine.
      When several virtual machines access the same physical GPU, a separate virtual graphics card driver must be installed in each Guest OS. The virtual graphics card driver exposes the same API interface functions as the graphics libraries OpenGL, OpenCL and Vulkan. It intercepts and redirects the calls that upper-layer graphics middleware, such as Qt, OSG and MiniGUI, make to the underlying graphics library. The specific process of this step is as follows:
      S21, configuring the virtual machines in non-passthrough mode.
      To guarantee security in use, all configured user-layer virtual machines are in non-passthrough mode, so an application's API requests cannot access the physical GPU directly. In non-passthrough mode, the virtual graphics card driver acts as kernel device driver code responsible for software and hardware interaction, and implements communication between the Guest OS and the graphics device.
      S22, placing the virtual machines on the Hypervisor layer and assigning each a number, wherein the virtual machine numbered 0 is the Host OS with the highest privilege level.
      Before the virtual graphics card driver is installed, the virtual machine is placed on the Hypervisor layer (the virtual machine management program) and given a virtual machine number. The virtual machine numbered 0, that is, the Host OS with the highest privilege level, is responsible for managing the other virtual machines and is connected directly to the hardware GPU.
      S23, when the virtual graphics card driver is installed, each user-layer virtual machine with a number greater than 0 stores its own virtual machine identifier and records the address space allocated to it and mapped through the MMIO page table; the virtual graphics card driver provides the graphics library API functions of the standard graphics library OpenGL to upper-layer applications and is used to intercept and forward the API requests of the user-layer virtual machine.
      The main functions of the virtual graphics card driver are: presenting a virtual graphics device that the Guest OS can recognize; during initialization, handling device identification, resource mapping and related setup (for example, finding the device, accessing its configuration space, and implementing interrupt control); assisting in mapping the hardware resources of the graphics device into the system resource space; and serving as the transmission channel for related data and commands when virtualization is used for graphics processing.
      S3, establishing a communication and transmission mechanism between the Host OS of the Hypervisor layer and the user-layer virtual machines.
      The Hypervisor layer is an intermediate software layer that runs between the underlying physical GPU and the upper-layer virtual machines and coordinates the upper-layer virtual machines' access to the same physical resources. A Virtio device is created on the Hypervisor layer to emulate an independent graphics card device. On the one hand, API requests in which a virtual machine calls the graphics library or performs scientific computing are passed to the Host OS; on the other hand, results processed by the GPU are passed back to the corresponding virtual machine through the Hypervisor layer.
      All virtual machines are installed on the Hypervisor layer. A Hypervisor is a virtual machine management program that runs as a software layer between the hardware and the virtual machines' operating systems; commonly used Hypervisors include Xen, VMware and KVM. The Hypervisor used in this embodiment is KVM, which provides memory management, storage, guest image formats, live migration, device drivers, scalability and similar functions.
      Specifically, the process of this step is as follows:
      S31, automatically creating Virtio default-type virtual devices on the Hypervisor layer, in one-to-one correspondence with the virtual machines, wherein the front-end driver of each Virtio default-type virtual device is provided by the virtual graphics card driver, and the back-end driver only receives the information sent by the front-end driver and forwards it to the Host OS.
      The KVM Hypervisor layer automatically creates the Virtio default-type virtual devices, emulating one independent PCI device for each virtual machine in one-to-one correspondence. The front-end driver of each device is provided by the virtual graphics card driver; the back-end driver, which only receives the information sent by the front-end driver and forwards it to the Host OS, is implemented by the virtual machine management system (Hypervisor).
      S32, the back-end driver sets up a ring queue (VirtQueue) to exchange data between the Host OS and the virtual machine.
      A communication mechanism between the Guest OS of each virtual machine and the Host OS is established at the Hypervisor layer. Instead of traditional socket communication, which requires repeated authentication and consumes considerable system resources, the mechanism uses a ring-queue-based Virtio channel for message delivery. Virtio serves as a virtual I/O interface driver, and a PCI device is emulated for the virtual machine by creating a Virtio virtual device. Data is exchanged between the Host OS and the virtual machine through the ring-queue-based VirtQueue, which reduces frequent copying of data between the host and the virtual machine.
      After a virtual machine starts, the Hypervisor layer creates a Virtio device visible to that virtual machine. The virtual graphics card driver communicates with the Virtio device in the Hypervisor layer, and the Virtio device passes the data on to the Host OS, completing the I/O of data between the virtual machine and the Host OS.
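The ring-queue idea behind the VirtQueue can be shown with a fixed-capacity circular buffer. This is a deliberately reduced sketch: a real Virtio split virtqueue consists of a descriptor table plus available and used rings in shared memory, none of which is modeled here, and the `RingQueue` class is hypothetical.

```python
class RingQueue:
    """Fixed-capacity circular buffer: the producer writes at tail, the consumer reads at head."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._head = 0    # next slot to read
        self._tail = 0    # next slot to write
        self._count = 0   # number of occupied slots

    def push(self, item) -> bool:
        """Enqueue an item; return False instead of overwriting when full."""
        if self._count == len(self._slots):
            return False
        self._slots[self._tail] = item
        self._tail = (self._tail + 1) % len(self._slots)
        self._count += 1
        return True

    def pop(self):
        """Dequeue the oldest item, or return None when the queue is empty."""
        if self._count == 0:
            return None
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return item


queue = RingQueue(capacity=2)
queue.push(b"request-1")   # front-end (guest) side enqueues
queue.pop()                # back-end side dequeues
```

Because slots are reused in place as head and tail wrap around, the buffer is allocated once, which is the property the text relies on to reduce repeated copying between host and virtual machine.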
      S4, after a user-layer virtual machine initiates an API request, the API request is transmitted to the Host OS through the Hypervisor layer, and the Host OS processes requests centrally, ordered by event priority and then by timestamp; each virtual machine can access only its own independent resources, which guarantees resource isolation and security in use.
      The Host OS centrally processes the API requests passed through the Hypervisor layer, and guarantees resource isolation and security in use by means of the established correspondence between the independent resources divided at the physical layer and the virtual machines. Graphics library API requests sent by the virtual machines are tagged with an event priority and a timestamp before transmission, and these two values determine the order in which the Host OS processes them. Each virtual machine can access only its own independent physical resources after SR-IOV division, which guarantees resource isolation.
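The ordering rule (priority first, then timestamp) maps naturally onto a priority queue. The sketch below uses Python's `heapq`; the `HostScheduler` class is hypothetical, and treating a larger number as a higher priority is an assumption, since the text does not define the priority scale.

```python
import heapq


class HostScheduler:
    """Orders incoming API requests by priority, then by timestamp."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, float, int, int, str]] = []
        self._arrival = 0  # tie-breaker for identical priority and timestamp

    def submit(self, priority: int, timestamp: float, vm_id: int, call: str) -> None:
        # Assumption: a larger number means a higher priority, so it is negated
        # to make heapq (a min-heap) pop the highest priority first.
        heapq.heappush(self._heap, (-priority, timestamp, self._arrival, vm_id, call))
        self._arrival += 1

    def drain(self) -> list[tuple[int, str]]:
        """Return (vm_id, call) pairs in the order the Host OS would process them."""
        ordered = []
        while self._heap:
            _, _, _, vm_id, call = heapq.heappop(self._heap)
            ordered.append((vm_id, call))
        return ordered


scheduler = HostScheduler()
scheduler.submit(priority=1, timestamp=10.0, vm_id=1, call="glClear")
scheduler.submit(priority=5, timestamp=20.0, vm_id=2, call="glDrawArrays")
scheduler.submit(priority=5, timestamp=15.0, vm_id=3, call="glDrawElements")
print(scheduler.drain())
# prints "[(3, 'glDrawElements'), (2, 'glDrawArrays'), (1, 'glClear')]"
```

The two priority-5 requests are served before the priority-1 request, and between them the earlier timestamp wins.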
      The specific process of this step is as follows:
      S41, a user-layer program for displaying or rendering graphics first calls the graphics middleware; the graphics middleware decomposes the display or graphics rendering task into basic graphics drawing requirements and then calls the underlying standard graphics library OpenGL to generate API requests for drawing basic graphics.
      After the virtual machine is configured on the Hypervisor layer of the KVM and the virtual graphics card driver is installed, a user-layer program for displaying or rendering graphics may first call graphics middleware such as Qt or OSG. The graphics middleware decomposes tasks such as display or graphics rendering into basic graphics drawing requirements and calls the underlying standard graphics library OpenGL for drawing. Because the virtual graphics card driver exposes the same API interface as the standard graphics library, it receives the basic graphics drawing API calls of the upper-layer application.
      When the virtual machine calls the virtual graphics card driver, the driver provides the graphics library API of the standard graphics library OpenGL to the upper-layer application. Typically, a graphics card driver provides both the graphics library API and the graphics library API driver to the upper-layer application. In the virtualization design of the invention, the actual graphics drawing and rendering are performed by the Host OS, so the Guest OS of the virtual machine does not include a graphics library API driver; it only intercepts the graphics library API calls made by the upper-layer application and forwards them to the Host OS.
      The virtual graphics card driver provides the upper layer with the graphics library API functions of the standard graphics library OpenGL, which can be used to specify objects and operations and to create interactive three-dimensional applications. The main content of the OpenGL graphics library API includes object management, state management, texture mapping, vertex operations, the shading language compiler, fragment operations, and so on. Object management allows applications to explicitly store data in video memory; vertex operations set vertex attributes and start graphics rendering; texture mapping handles three-dimensional texture mapping; the shading language compiler interface schedules the OpenGL shading language compiler to compile shaders; state management maintains the state of the OpenGL context; fragment operations perform alpha blending, depth testing, and stencil testing on fragments.
      S42, after the virtual graphics card driver of the user layer virtual machine receives the API request through an API interface identical to that of the current API request, it records the function parameters of the determined drawing task, the event priority, the order in which the API was called and its timestamp, the configured virtual GPU identifier, and the software state information.
      The virtual graphics card driver intercepts and forwards the graphics library API calls made by user-layer functions. It only needs to collect the configuration parameters and information related to each function call, package them, and send them to the Host OS in a uniform format; this allows the Host OS to process the API requests of multiple virtual machine systems centrally.
      The configuration parameters and information intercepted by the virtual graphics card driver and provided to the Host OS include the following:
      (1) Function parameters with definite semantics. By parsing the function parameters of the API request, the Host OS can obtain the semantics of the function, the target address contained in the function, and the type and structure of the data. The function parameters of a specific API request ensure that the Host OS can execute that request correctly.
      (2) The call order and timestamp of the API requests in the data stream. In multi-user mode, the calling and execution order of the API requests issued by the user layer of a virtual machine must be unambiguous, because different APIs depend on one another. In addition, the timestamp ensures that the Host OS executes the requests in the correct order, which in turn ensures the integrity and correctness of the result.
      (3) Saved state information of the GPU hardware, the driver, the user libraries, and other software. While an API request and the driver are executing, the state information of the GPU hardware, the driver, the user libraries, and other software must be saved and exchanged. This includes caches, memory, kernel grids, kernel threads, modules and functions, texture operations, contexts, streams and events, and so on.
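      The three kinds of recorded information can be pictured as a single record that the virtual graphics card driver fills in and serializes before sending. The following is a minimal sketch under stated assumptions: the field names, the `ApiRequest` class, and the use of JSON as the wire format are all hypothetical, since the original text does not specify a layout or encoding.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ApiRequest:
    """Hypothetical record of one intercepted graphics library call."""
    vgpu_id: int      # configured virtual GPU identifier
    function: str     # name of the intercepted API function
    params: list      # function parameters, item (1)
    priority: int     # event priority
    sequence: int     # order in which the API was called, item (2)
    timestamp: float  # time at which the driver received the call, item (2)
    state: dict = field(default_factory=dict)  # saved software state, item (3)


def pack(request: ApiRequest) -> bytes:
    """Serialize the record for transmission (JSON is an assumed encoding)."""
    return json.dumps(asdict(request), sort_keys=True).encode("utf-8")


def unpack(payload: bytes) -> ApiRequest:
    """Rebuild the record on the receiving side."""
    return ApiRequest(**json.loads(payload.decode("utf-8")))
```

      For example, a call such as `glViewport(0, 0, 640, 480)` would be recorded with `function="glViewport"` and `params=[0, 0, 640, 480]`, then packed and handed to the Virtio front-end driver.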
      S43, the information recorded by the virtual graphics card driver is packaged into the API request and transmitted, through the Virtio virtual emulated device created by the Hypervisor layer, to the back-end driver of the Hypervisor layer.
      S44, the back-end driver of the Hypervisor layer transmits the API requests of all user virtual machines in sequence, through the ring queue VirtQueue, to the Host OS numbered 0.
      S45, the Host OS establishes a plurality of independent transmission channels to receive and parse the API requests, processes higher-priority requests first and requests of equal priority in timestamp order, and calls the native graphics card driver in thread mode to access the physical GPU.
      The parsing process mainly comprises two parts: on the one hand, the API requests of the virtual machines are sorted by priority and timestamp and processed in that order; on the other hand, the specific graphics drawing task in each API request is determined and the virtualized hardware with the matching identifier is called to process it, and the associated independent video memory resource can also be used to preserve the graphics context.
      The Host OS orders the request sequences of the different virtual machines, such as graphics drawing API requests, mainly by the timestamps attached by the virtual graphics card driver on each virtual machine when it receives the request. In addition, extremely important tasks, such as emergencies or long-running events, are marked with high priority and are handled first by means such as preemption.
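      The ordering rule described above (higher priority first, then earlier timestamp) can be sketched with a binary heap. The class `RequestScheduler` and its method names are hypothetical, and the sketch covers only the ordering of pending requests; preemption of a request that is already running is not modeled.

```python
import heapq
import itertools


class RequestScheduler:
    """Hypothetical host-side queue: higher priority first, then earlier timestamp."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker for identical priority and timestamp

    def submit(self, vm_id: int, priority: int, timestamp: float, call: str) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest value first.
        heapq.heappush(
            self._heap, (-priority, timestamp, next(self._arrival), vm_id, call)
        )

    def next_request(self):
        """Return (vm_id, call) for the next request to execute, or None if idle."""
        if not self._heap:
            return None
        _, _, _, vm_id, call = heapq.heappop(self._heap)
        return vm_id, call
```

      The arrival counter guarantees a stable order when two requests carry the same priority and timestamp, so the comparison never falls through to the request payload.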
      The Host OS manages the native graphics card driver, which comprises the device driver, the graphics library driver, hardware initialization, and the software and hardware management functions. Because the Host OS communicates directly with the actual physical GPU, it must periodically feed the state and driver information of the hardware GPU back to the Guest OS so that the state of the Host OS and the Guest OS stays synchronized.
      After parsing a virtual machine's API request for drawing or rendering graphics, the Host OS calls the native graphics card driver in thread mode to access the physical GPU and carry out the requested function.
      S46, the physical GPU processes the API requests transmitted by the Host OS and returns the processing results to the Host OS as return values; the Host OS transmits each processing result to the corresponding virtual machine through the VirtQueue of the Hypervisor layer.
      The physical GPU processes the various API requests transmitted by the Host OS, such as 2D/3D graphics drawing, graphics rendering, and scientific computing. The processing results are returned to the Host OS as return values, and the Host OS transmits them to the corresponding virtual machine through the VirtQueue of the Hypervisor layer.
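      The return path of S45 and S46 can be sketched as a worker thread that executes each request and places the return value on the queue belonging to the originating virtual machine. Everything named here is an assumption for illustration: `host_worker`, the `execute` callable standing in for the native graphics card driver, and the use of `queue.Queue` in place of a per-VM VirtQueue.

```python
import queue
import threading
from typing import Callable, Dict


def host_worker(
    requests: "queue.Queue",
    result_queues: Dict[int, "queue.Queue"],
    execute: Callable[[str], object],
) -> None:
    """Execute requests one by one and route each result back to its own VM."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: stop the worker
            break
        vm_id, call = item
        # `execute` is a hypothetical stand-in for the native driver call.
        result_queues[vm_id].put(execute(call))
```

      A host would start the worker with `threading.Thread(target=host_worker, args=(requests, result_queues, execute))` and enqueue `(vm_id, call)` pairs; because each virtual machine reads only its own result queue, results cannot be delivered to the wrong guest in this sketch.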
      In summary, the present invention provides a method for implementing GPU virtualization based on SR-IOV technology. The virtualization system is designed in layers: a user layer, a Hypervisor layer, a Host OS, and a physical GPU hardware layer. At the hardware layer, a physical module is designed based on SR-IOV technology; it partitions independent resources such as video memory, controllers, and connection modules, and time-shares shared resources such as computing power and video encoding and decoding. At the user layer, a virtual graphics card driver is installed in each virtual machine to intercept and redirect the API requests by which the virtual machine calls the graphics library or scientific computing functions. The Hypervisor layer transmits these API requests to the Host OS and, in the other direction, transmits the results processed by the GPU back to the corresponding virtual machine. In the Host OS, the API requests transmitted through the Hypervisor layer are processed centrally, and the correspondence between the independent resources partitioned at the physical layer and the virtual machines is established, which guarantees resource isolation and security during use.
      The layered design ensures security during use and subsequent maintainability. When a new virtual machine is configured, a communication relation with the Host OS is established through the Hypervisor layer, and the independent resources of the physical GPU are partitioned again. A virtual graphics card driver is installed in the virtual machine; on this basis, application programs that call the standard graphics library API or scientific computing APIs can send API requests to the Host OS through the virtual graphics card driver. The Host OS accesses the physical GPU directly and stores the API requests of each individual virtual machine in that virtual machine's partition of the physical GPU, which achieves context isolation and frame buffering. The physical GPU time-shares its computing resources and video encoding and decoding resources among the different API requests, and the computed results are transmitted to the corresponding virtual machine system through the Host OS. The method maximizes the utilization of GPU resources, reduces the load on the CPU, and ensures secure use under resource isolation.
      The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
    Claims (5)
1. A GPU virtualization implementation method based on SR-IOV technology is characterized by comprising the following steps:
      S1, partitioning the independent resources of a physical GPU through SR-IOV technology;
      S2, installing a virtual graphics card driver in a user layer virtual machine;
      S3, establishing, at a Hypervisor layer, a communication transmission mechanism between a Host OS and the user layer virtual machine;
      S4, after the user layer virtual machine initiates an API request, the API request is transmitted to the Host OS through the Hypervisor layer; the Host OS processes the requests centrally, ordered by event priority and then by timestamp; each virtual machine can access only its corresponding independent resources, which guarantees resource isolation and security during use.
    2. The method for implementing GPU virtualization according to claim 1, wherein the specific process of step S1 is as follows:
      S11, determining the partition number of the physical GPU based on the number of configured virtual machines, the partition number being fixed as an even number; if the number of virtual machines is odd, 1 is added to it;
      S12, dividing the address space in the physical GPU into a first space and a second space, wherein the first space is used for configuring the control register space and for connecting the display sending module and the non-configurable computing power and video encoding and decoding scheduling unit, and the second space is accessible to the upper-layer graphics controller and is used for storing graphics contexts and temporary data;
      S13, dividing the first space and the second space evenly according to the partition number of the physical GPU;
      S14, establishing a mapping relation between virtual addresses and physical addresses through an MMIO page table.
    3. The SR-IOV technology-based GPU virtualization implementation method of claim 2, wherein the specific process of step S2 is as follows:
      S21, configuring the virtual machine in non-passthrough mode;
      S22, setting up the virtual machines on the Hypervisor layer and numbering them, wherein the virtual machine numbered 0 is the Host OS with the highest privilege level;
      S23, when the virtual graphics card driver is installed, each user layer virtual machine numbered greater than 0 stores its own virtual machine identifier and records the address space allocated to it and mapped through the MMIO page table; the virtual graphics card driver provides the graphics library API functions of the standard graphics library OpenGL to the upper-layer application and is used for intercepting and forwarding the API requests of the user layer virtual machine.
    4. The method for implementing GPU virtualization based on SR-IOV technology of claim 3, wherein the specific process of step S3 is as follows:
      S31, automatically creating Virtio virtual devices in the Hypervisor layer, wherein the Virtio virtual devices correspond one-to-one with the virtual machines, the front-end driver of each Virtio virtual device is provided by the virtual graphics card driver, and the back-end driver is used only for receiving the information transmitted by the front-end driver and forwarding it to the Host OS;
      S32, setting up a ring queue VirtQueue by the back-end driver to exchange data between the Host OS and the virtual machine.
    5. The method for implementing GPU virtualization according to claim 4, wherein the specific process of step S4 is as follows:
      S41, a user-layer program for displaying or rendering graphics first calls the graphics middleware; the graphics middleware decomposes the display or graphics rendering task into basic graphics drawing requirements and then calls the underlying standard graphics library OpenGL to generate API requests for drawing basic graphics;
      S42, after the virtual graphics card driver of the user layer virtual machine receives the API request through an API interface identical to that of the current API request, it records the function parameters of the determined drawing task, the event priority, the order in which the API was called and its timestamp, the configured virtual GPU identifier, and the software state information;
      S43, the information recorded by the virtual graphics card driver is packaged into the API request and transmitted, through the Virtio virtual emulated device created by the Hypervisor layer, to the back-end driver of the Hypervisor layer;
      S44, the back-end driver of the Hypervisor layer transmits the API requests of all user virtual machines in sequence, through the ring queue VirtQueue, to the Host OS numbered 0;
      S45, the Host OS establishes a plurality of independent transmission channels to receive and parse the API requests, processes higher-priority requests first and requests of equal priority in timestamp order, and calls the native graphics card driver in thread mode to access the physical GPU;
      S46, the physical GPU processes the API requests transmitted by the Host OS and returns the processing results to the Host OS as return values, and the Host OS transmits each processing result to the corresponding virtual machine through the ring queue VirtQueue of the Hypervisor layer.
    Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202211419575.8A CN115904617A (en) | 2022-11-14 | 2022-11-14 | GPU virtualization implementation method based on SR-IOV technology | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| CN115904617A (en) | 2023-04-04 | 
Family
ID=86480716
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN202211419575.8A Pending CN115904617A (en) | 2022-11-14 | 2022-11-14 | GPU virtualization implementation method based on SR-IOV technology | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN115904617A (en) | 
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN117873735A (en) * | 2024-03-11 | 2024-04-12 | 湖南马栏山视频先进技术研究院有限公司 | GPU scheduling system under virtualized environment | 
| CN117873735B (en) * | 2024-03-11 | 2024-05-28 | 湖南马栏山视频先进技术研究院有限公司 | GPU scheduling system under virtualized environment | 
| CN118035965A (en) * | 2024-04-12 | 2024-05-14 | 清华大学 | Method and device for computing power by using graphic processor cooperatively by multiple users | 
| CN118035965B (en) * | 2024-04-12 | 2024-06-11 | 清华大学 | Method and device for collaborative use of graphics processor computing power by multiple users | 
| CN119003098A (en) * | 2024-10-23 | 2024-11-22 | 上海壁仞科技股份有限公司 | Method, computing device, medium and program product for constructing trusted execution environment | 
| CN119003098B (en) * | 2024-10-23 | 2025-02-25 | 上海壁仞科技股份有限公司 | Method, computing device, medium and program product for building a trusted execution environment | 
| CN119621251A (en) * | 2025-02-17 | 2025-03-14 | 麒麟软件有限公司 | A method for jailhouse to realize SR-IOV virtual device discovery | 
| CN119621251B (en) * | 2025-02-17 | 2025-05-06 | 麒麟软件有限公司 | A method for jailhouse to realize SR-IOV virtual device discovery | 
| CN120429067A (en) * | 2025-06-26 | 2025-08-05 | 摩尔线程智能科技(上海)有限责任公司 | GPU virtualized video memory management method, device, storage medium, and program product | 
| CN120429067B (en) * | 2025-06-26 | 2025-09-12 | 摩尔线程智能科技(上海)有限责任公司 | GPU (graphics processing unit) virtualized video memory management method, device, storage medium and program product | 
Similar Documents
| Publication | Publication Date | Title | 
|---|---|---|
| CN115904617A (en) | GPU virtualization implementation method based on SR-IOV technology | |
| US20250191111A1 (en) | Reconfigurable virtual graphics and compute processor pipeline | |
| CN107077377B (en) | Equipment virtualization method, device and system, electronic equipment and computer program product | |
| US9798565B2 (en) | Data processing system and method having an operating system that communicates with an accelerator independently of a hypervisor | |
| US10191759B2 (en) | Apparatus and method for scheduling graphics processing unit workloads from virtual machines | |
| KR100733852B1 (en) | Computer system | |
| EP1674987B1 (en) | Systems and methods for exposing processor topology for virtual machines | |
| US9146762B2 (en) | Specialized virtual machine to virtualize hardware resource for guest virtual machines | |
| US11010859B2 (en) | Display resource scheduling method and device for embedded system | |
| US20190303190A1 (en) | Managing virtual machine instances utilizing a virtual offload device | |
| WO2018119951A1 (en) | Gpu virtualization method, device, system, and electronic apparatus, and computer program product | |
| CN106406977A (en) | Virtualization implementation system and method of GPU (Graphics Processing Unit) | |
| CN100570562C (en) | Graphics card, virtual machine system using the graphics card, and display processing method | |
| CN101419558A (en) | CUDA graphic subsystem virtualization method | |
| CN111966504B (en) | Task processing method in graphics processor and related equipment | |
| US12105648B2 (en) | Data processing method, apparatus, and device | |
| CN114138423B (en) | Virtualization construction system and method based on domestic GPU graphics card | |
| CN113032103A (en) | VF (variable frequency) resource dynamic scheduling method based on SR-IOV (scheduling request-input/output) function of high-speed network card | |
| CN113419845A (en) | Calculation acceleration method and device, calculation system, electronic equipment and computer readable storage medium | |
| EP4471587A1 (en) | Device virtualization method and related device | |
| CN113568734A (en) | Virtualization method and system based on multi-core processor, multi-core processor and electronic equipment | |
| CN118467093A (en) | Resource processing method, device, apparatus, readable storage medium and program product | |
| US8402191B2 (en) | Computing element virtualization | |
| CN120429067B (en) | GPU (graphics processing unit) virtualized video memory management method, device, storage medium and program product | |
| CN118885301B (en) | Hardware-accelerated digital GPU simulation method and system | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||