CN118262029B - Rendering subdivision method, electronic device, storage medium and computer program product
- Publication number
- CN118262029B, CN202410692387.5A
- Authority
- CN
- China
- Prior art keywords
- address information
- storage address
- output variable
- variable
- characteristic
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/10—Geometric effects
- G06T15/20—Perspective computation
- G06T15/205—Image-based rendering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/50—Lighting effects
- G06T15/55—Radiosity
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Geometry (AREA)
- Memory System Of A Hierarchy Structure (AREA)
- Devices For Executing Special Programs (AREA)
Abstract
Embodiments of the present application disclose a rendering subdivision method, an electronic device, a storage medium, and a computer program product. The rendering subdivision method includes: obtaining a first table, each row of which contains a first feature of one output variable of a first tessellation control shader (TCS) and first storage address information of that output variable; obtaining a second table, each row of which contains a second feature of one input variable of a first tessellation evaluation shader (TES) and second storage address information of a storage location allocated for that input variable; and matching the first features of the first table against the second features of the second table, and storing the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
Description
Technical Field
The embodiments of the present application relate to the field of graphics rendering technology, and in particular to a rendering subdivision method, an electronic device, a storage medium, and a computer program product.
Background Art
With the development of graphics rendering technology, tessellation has become an important part of the graphics rendering pipeline run by the graphics processing unit (GPU). It decomposes an input target primitive into smaller primitives, thereby providing finer geometric transformations. However, as requirements for rendering subdivision keep rising, many aspects of tessellation can still be improved.
Summary of the Invention
According to one aspect of the embodiments of the present application, a rendering subdivision method is provided, including: obtaining a first table, each row of which contains a first feature of one output variable of a first tessellation control shader (TCS) and first storage address information of that output variable; obtaining a second table, each row of which contains a second feature of one input variable of a first tessellation evaluation shader (TES) and second storage address information of a storage location allocated for that input variable; and matching the first features of the first table against the second features of the second table, and storing the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
In the above solution, the matching based on the first feature of the first table and the second feature of the second table, and the storing of the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table, include: comparing the first feature of a target output variable in the first table with the second feature of each input variable in the second table; and, in response to the second table containing a target input variable whose second feature is the same as the first feature of the target output variable, storing the first storage address information of the target output variable to the storage location indicated by the second storage address information of the target input variable; where the target output variable is any output variable in the first table, and the target input variable is the input variable in the second table that matches the target output variable.
In the above solution, the method further includes: in response to the second table containing no input variable whose second feature is the same as the first feature of the target output variable, outputting indication information, the indication information indicating a matching failure.
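The comparison and the failure indication described above can be sketched as follows. This is only an illustration under stated assumptions: features are modelled as plain strings, the allocated storage locations as entries of a Python dict, and the names `match_tables`, `buffer`, and the extra output `vc` are hypothetical and do not come from the application.

```python
from typing import Dict, List, Tuple


def match_tables(
    tcs_table: List[Tuple[str, int]],  # rows: (first feature, first storage address info)
    tes_table: List[Tuple[str, int]],  # rows: (second feature, second storage address info)
    buffer: Dict[int, int],            # hypothetical stand-in for the allocated storage locations
) -> List[str]:
    """Write each output's address into the slot of the matching input.

    Returns the features of outputs for which no input matched, standing in
    for the indication information that signals a matching failure.
    """
    slot_by_feature = {feature: slot for feature, slot in tes_table}
    unmatched = []
    for feature, address in tcs_table:
        slot = slot_by_feature.get(feature)
        if slot is None:
            unmatched.append(feature)  # no input variable has the same feature
        else:
            buffer[slot] = address     # store the first address info at the second address
    return unmatched


buffer: Dict[int, int] = {}
failures = match_tables([("va", 0), ("vb", 4), ("vc", 8)], [("va", 0), ("vb", 1)], buffer)
print(buffer)    # {0: 0, 1: 4}
print(failures)  # ['vc']
```

Building a lookup keyed by feature is a design choice of this sketch; a row-by-row comparison, as the text describes it, would give the same result.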
In the above solution, obtaining the first table includes: loading the stored first table, or running the first TCS to obtain the first table; and obtaining the second table includes: loading the stored second table, or running the first TES to obtain the second table.
In the above solution, the method further includes: obtaining the first table and generating a first instruction, the first instruction being used to store the corresponding output variable based on the first storage address information in the first table.
In the above solution, the method further includes: obtaining the second table and generating a second instruction, the second instruction being used to read the corresponding first storage address information based on the second storage address information in the second table, and to read the corresponding output variable based on that first storage address information; the corresponding output variable serves as an input variable of the first TES.
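Taken together, the first and second instructions amount to a store followed by an indirect load. A minimal sketch, assuming video memory is modelled as a dict keyed by offset and the input buffer as a dict keyed by ID; `video_memory`, `input_buffer`, and both function names are hypothetical.

```python
video_memory = {}  # hypothetical model of video memory: offset -> value
input_buffer = {}  # hypothetical model of the TES input buffer: ID -> offset


def first_instruction(offset, value):
    """Store a TCS output variable at its first storage address."""
    video_memory[offset] = value


def second_instruction(buffer_id):
    """Read the first storage address from the buffer slot, then read the output variable through it."""
    offset = input_buffer[buffer_id]
    return video_memory[offset]


first_instruction(0, 1.5)     # the TCS writes an output variable at offset 0
input_buffer[0] = 0           # matching step: slot 0 now holds that offset
print(second_instruction(0))  # 1.5
```

In this model the TES side never needs to know the offset in advance; it only needs its own buffer ID, which is what lets the two shaders be prepared separately.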
In the above solution, the method further includes: obtaining a third table, each row of which contains a third feature of one input variable of a second TES and third storage address information of a storage location allocated for that input variable; and matching the first features of the first table against the third features of the third table, and storing the first storage address information of each output variable in the first table to the storage location indicated by the third storage address information of the matching input variable in the third table.
In the above solution, the method further includes: obtaining a fourth table, each row of which contains a fourth feature of one output variable of a second TCS and fourth storage address information of that output variable; and matching the fourth features of the fourth table against the second features of the second table, and storing the fourth storage address information of each output variable in the fourth table to the storage location indicated by the second storage address information of the matching input variable in the second table.
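Because each table belongs to a single shader, one TCS table can be matched against several TES tables, and one TES table against several TCS tables, without regenerating either. The sketch below illustrates this with made-up table contents; the helper `link` and all values are assumptions for illustration only.

```python
def link(tcs_table, tes_table):
    """Return {input slot: output address} for every feature present in both tables."""
    return {tes_table[name]: addr for name, addr in tcs_table.items() if name in tes_table}


first_table = {"va": 0, "vb": 4}     # first TCS: feature -> first storage address info
fourth_table = {"va": 16, "vb": 20}  # second TCS: feature -> fourth storage address info
second_table = {"va": 0, "vb": 1}    # first TES: feature -> second storage address info
third_table = {"vb": 0, "va": 1}     # second TES: feature -> third storage address info

print(link(first_table, second_table))   # {0: 0, 1: 4}
print(link(first_table, third_table))    # {1: 0, 0: 4}
print(link(fourth_table, second_table))  # {0: 16, 1: 20}
```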
In the above solution, the first feature is the name of the output variable, and the second feature is the name of the input variable.
In the above solution, the first storage address information indicates the storage location of the output variable in the video memory of a graphics card; the second storage address information indicates the location of the buffer allocated for the input variable; the buffer is located in the video memory or in a register of the graphics card.
In the above solution, if the first storage address information and the second storage address information both point to storage locations in the video memory, they point to different storage locations in the video memory.
According to another aspect of the embodiments of the present application, an electronic device is provided, including a graphics processing unit (GPU). The GPU is configured to: obtain a first table, each row of which contains a first feature of one output variable of a first tessellation control shader (TCS) and first storage address information of that output variable; obtain a second table, each row of which contains a second feature of one input variable of a first tessellation evaluation shader (TES) and second storage address information of a storage location allocated for that input variable; and match the first features of the first table against the second features of the second table, and store the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
In the above solution, the GPU includes a compiler and a driver, wherein:
the compiler is configured to: compile the first TCS to obtain the first table, each row of which contains a first feature of one output variable of the first TCS and first storage address information of that output variable; and compile the first TES to obtain the second table, each row of which contains a second feature of one input variable of the first TES and second storage address information of a storage location allocated for that input variable;
the driver is configured to: match the first features of the first table against the second features of the second table, and store the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
According to yet another aspect of the embodiments of the present application, a computer-readable storage medium is provided, in which a computer program is stored; when the computer program is loaded and executed by a processor, it implements any of the methods described above.
According to a further aspect of the embodiments of the present application, a computer program product is provided, including a computer program; when the computer program is loaded and executed by a processor, it implements any of the methods described above.
The embodiments of the present application provide a rendering subdivision method, an electronic device, a storage medium, and a computer program product. The rendering subdivision method includes: obtaining a first table, each row of which contains a first feature of one output variable of a first tessellation control shader (TCS) and first storage address information of that output variable; obtaining a second table, each row of which contains a second feature of one input variable of a first tessellation evaluation shader (TES) and second storage address information of a storage location allocated for that input variable; and matching the first features of the first table against the second features of the second table, and storing the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table. The method matches the first table, composed of the output variables of the TCS, against the second table, composed of the input variables of the TES, to determine whether the TCS and the TES match, and, when they do, passes the output variables of the TCS to the TES. In this way, if a TCS or TES is already cached, it is used directly for matching and does not need to be compiled again, which improves tessellation performance.
Brief Description of the Drawings
In the drawings, which are not necessarily drawn to scale, the same reference numerals may describe similar components in different views. The same numerals with different letter suffixes may represent different instances of similar components. The drawings illustrate, by way of example and not limitation, the various embodiments discussed in this document.
FIG. 1 is a first schematic flowchart of a rendering subdivision method provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a rendering pipeline provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of the geometry stage provided in an embodiment of the present application;
FIG. 4 is an exemplary schematic diagram of subdivision into triangles provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a tessellation shader provided in an embodiment of the present application;
FIG. 6 is an exemplary schematic diagram of a TCS provided in an embodiment of the present application;
FIG. 7 is an exemplary schematic diagram of a TES provided in an embodiment of the present application;
FIG. 8 is an exemplary schematic diagram of the output table of a TCS provided in an embodiment of the present application;
FIG. 9 is an exemplary schematic diagram of the input table of a TES provided in an embodiment of the present application;
FIG. 10 is a schematic flowchart of matching the first table and the second table provided in an embodiment of the present application;
FIG. 11 is an exemplary schematic diagram of the matching between the output table of FIG. 8 and the input table of FIG. 9 provided in an embodiment of the present application;
FIG. 12 is an exemplary schematic diagram of the first instruction provided in an embodiment of the present application;
FIG. 13 is an exemplary schematic diagram of the second instruction provided in an embodiment of the present application;
FIG. 14 is a second schematic flowchart of a rendering subdivision method provided in an embodiment of the present application;
FIG. 15 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
FIG. 16 is a schematic structural diagram of a computer program product provided in an embodiment of the present application.
Detailed Description
To facilitate understanding, the present application is described more fully below with reference to the accompanying drawings, which show preferred embodiments of the present application. The present application can, however, be implemented in many different forms and is not limited to the embodiments described here. Rather, these embodiments are provided so that the disclosure of the present application is thorough and complete. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present disclosure. The terms used in the specification are only for the purpose of describing specific embodiments and are not intended to limit the present disclosure. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
The embodiments of the present application provide a rendering subdivision method that matches a first table, composed of the output variables of a TCS, against a second table, composed of the input variables of a TES, to determine whether the TCS and the TES match. When they match, the output variables of the TCS are passed to the TES. In this way, if a TCS or TES is already cached, it is used directly for matching and does not need to be compiled again, which improves tessellation performance.
The present application is described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, which shows a schematic flowchart of a rendering subdivision method provided by an embodiment of the present application, the method may include:
Step 101: obtain a first table; each row in the first table contains a first feature of one output variable of a first tessellation control shader (TCS) and first storage address information of that output variable;
Step 102: obtain a second table; each row in the second table contains a second feature of one input variable of a first tessellation evaluation shader (TES) and second storage address information of a storage location allocated for that input variable;
Step 103: match the first features of the first table against the second features of the second table, and store the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
It should be noted that rendering here may refer to graphics rendering. Graphics rendering may refer to the process of converting a three-dimensional model into a two-dimensional image by computer algorithms; it simulates the propagation and interaction of light in the real world in order to present a realistic image.
In some embodiments, graphics rendering can be implemented by the rendering pipeline shown in FIG. 2. As shown in FIG. 2, the rendering pipeline can be abstracted into three stages: the application stage, the geometry stage, and the rasterization stage. The application stage is the data preparation stage, in which the central processing unit (CPU) of an electronic device such as a computer or terminal sends data to the graphics processing unit (GPU), for example vertex data, camera position, view frustum data, scene model data, and light sources. In some embodiments, the application stage also processes this data to improve rendering performance, for example by culling invisible objects. The most important output of this stage is the geometric information required for rendering, namely the rendering primitives, which can be points, lines, faces, and so on. The geometry stage runs on the GPU and is mainly used to transform the vertex coordinates received from the application stage into screen space.
In some embodiments, the geometry stage, as shown in FIG. 3, includes a vertex shader, a tessellation shader, a geometry shader, clipping, and screen mapping. The vertex shader, tessellation shader, and geometry shader are programmable stages, also called the programmable rendering pipeline; that is, they are shaders that users can program themselves. The vertex shader operates on vertices: it is invoked once for each input vertex, and its main function is to perform coordinate system transformations. The tessellation shader is an optional stage that uses tessellation to subdivide the primitives to be rendered, thereby providing finer geometric transformations. There are several subdivision modes, for example the triangle, quadrilateral, and isoline modes of the Open Graphics Library (OpenGL). As an example, FIG. 4 shows a schematic diagram of the triangle subdivision mode in an embodiment of the present application.
In some embodiments, as shown in FIG. 5, the subdivision process can be divided into three stages: the tessellation control shader (TCS), the tessellation primitive generator (TPG), and the tessellation evaluation shader (TES). The TCS outputs the subdivision parameters and passes them to the TPG; these parameters control how many smaller primitives the primitive to be rendered is subdivided into. The TCS also outputs to the TES the vertex data of the primitive to be rendered after it has been processed by the vertex shader. The TPG is a fixed hardware module that performs the subdivision according to the subdivision parameters provided by the TCS. The TES uses the vertex data passed by the TCS to compute vertex data for each of the smaller primitives that the TPG produces by subdivision, and passes the computed vertex data to the next stage of the rendering pipeline. It should be noted that the TCS and the TES can each be implemented as a program. As examples, FIG. 6 shows an exemplary implementation of the TCS provided in an embodiment of the present application, and FIG. 7 shows an exemplary implementation of the TES provided in an embodiment of the present application.
In some embodiments, the rasterization stage runs on the GPU. Its main task is to decide which pixels of each rendering primitive should be drawn on the screen; it interpolates the per-vertex data obtained in the previous stage and then performs per-pixel processing.
As described above, the TCS passes data to the TES during the tessellation stage, so matching the outputs (variables) of the TCS with the inputs (variables) of the TES is a problem the GPU has to solve. Matching here may mean associating a given output variable of the TCS with a given input variable of the TES, that is, putting the outputs of the TCS in one-to-one correspondence with the inputs of the TES.
In some embodiments, the TCS and the TES are matched at compile time, as follows. The GPU compiler first compiles the TCS and, when compilation finishes, generates a TCS output table in which each row contains the name and the offset of one TCS output. The GPU driver then passes the TCS output table and the TES shader source code (program code that must be compiled by the GPU compiler before use) to the GPU compiler, which matches the inputs of the TES with the outputs of the TCS according to the TCS output table and compiles the TES.
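The compile-time variant above can be sketched as follows. This is a toy model only: the two `compile_*` functions are hypothetical stand-ins for the GPU compiler, shaders are reduced to lists of variable names, and the 4-byte spacing of offsets is an assumption.

```python
def compile_tcs(output_names):
    """Pretend compilation: assign each output a 4-byte-spaced offset and return the TCS output table."""
    return {name: 4 * i for i, name in enumerate(output_names)}


def compile_tes(input_names, tcs_output_table):
    """Pretend compilation: resolve every TES input to a TCS offset while compiling."""
    return {name: tcs_output_table[name] for name in input_names}


table_a = compile_tcs(["va", "vb"])
table_b = compile_tcs(["vb", "va"])
print(compile_tes(["va", "vb"], table_a))  # {'va': 0, 'vb': 4}
print(compile_tes(["va", "vb"], table_b))  # {'va': 4, 'vb': 0}
```

In this model the compiled TES embeds offsets taken from one particular TCS output table, so pairing the same TES source with a different TCS means compiling it again.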
In other embodiments, as in the method shown in FIG. 1, a first table for the TCS and a second table for the TES are obtained separately. The first table is a specific example of the TCS output table described above, and the second table is the input table of the TES. The first features of the output variables in the first table are then matched against the second features of the input variables in the second table, so that the output variables of the TCS are passed to the TES.
As an example, FIG. 8 shows an exemplary schematic diagram of the first table provided in an embodiment of the present application. Here, va is the first feature of one output variable of the TCS, and offset (0) is the first storage address information of that output variable; vb is the first feature of another output variable of the TCS, and offset (4) is the first storage address information of that output variable.
As an example, FIG. 9 shows an exemplary schematic diagram of the second table provided in an embodiment of the present application. Here, va is the second feature of one input variable of the TES, and ID(0) is the second storage address information of the storage location allocated for that input variable; vb is the second feature of another input variable of the TES, and ID(1) is the second storage address information of the storage location allocated for that input variable.
The first feature may be the name of the output variable, and the second feature may be the name of the input variable. For example, va and vb in FIG. 8 and FIG. 9 are names.
The first storage address information indicates the storage location of the output variable in the video memory of the graphics card; the second storage address information indicates the location of the buffer allocated for the input variable; the buffer is located in the video memory or in a register of the graphics card.
Furthermore, if the first storage address information and the second storage address information both point to storage locations in the video memory, they point to different storage locations in the video memory.
也即,第一表中的输出变量是被存储在显存内的。所谓第一存储地址信息所指向的显存的存储位置,并且第一存储地址信息以位置偏移(Offset)形式实现。第二表中为输入变量分配的存储位置是缓冲区(buffer)内,该buffer可以位于显存或寄存器内。其中,该显存和寄存器包含在显卡内。应该理解的是,显卡内还可以包括GPU。所谓第二存储地址信息所指向的缓冲区(buffer)内的存储位置,并且第二存储地址信息以buffer内的ID形式实现。That is, the output variables in the first table are stored in the video memory. The so-called first storage address information refers to the storage location of the video memory, and the first storage address information is implemented in the form of a position offset. The storage location allocated for the input variables in the second table is in a buffer, which can be located in the video memory or register. Among them, the video memory and register are included in the graphics card. It should be understood that the graphics card can also include a GPU. The so-called second storage address information refers to the storage location in the buffer, and the second storage address information is implemented in the form of an ID in the buffer.
应该理解的是,若第一存储地址信息和第二存储地址信息均指向的存储位置均在显存内,那么,第一存储地址信息和第二存储地址信息所指向的存储位置是不同的。It should be understood that if the storage locations pointed to by the first storage address information and the second storage address information are both within the video memory, then the storage locations pointed to by the first storage address information and the second storage address information are different.
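As an illustration of the two tables described above, the following minimal sketch models each row as a small record. The class and field names (`TcsOutputRow`, `TesInputRow`, `offset`, `buffer_id`) are hypothetical and do not appear in the application; the values va, vb, offsets 0 and 4, and IDs 0 and 1 follow the example of FIG. 11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcsOutputRow:
    """One row of the first table (hypothetical layout)."""
    name: str    # first feature: name of the TCS output variable
    offset: int  # first storage address information: offset in video memory


@dataclass(frozen=True)
class TesInputRow:
    """One row of the second table (hypothetical layout)."""
    name: str       # second feature: name of the TES input variable
    buffer_id: int  # second storage address information: ID within the buffer


# Example tables using the names and values from the FIG. 11 example.
tcs_output_table = [TcsOutputRow("va", 0), TcsOutputRow("vb", 4)]
tes_input_table = [TesInputRow("va", 0), TesInputRow("vb", 1)]
```

Keeping the offset and the buffer ID in separate tables is what later allows either shader to be swapped without touching the other shader's table.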
In some embodiments, obtaining the first table may include: loading the stored first table, or running the first TCS to obtain the first table.
Obtaining the second table may include: loading the stored second table, or running the first TES to obtain the second table.
It should be noted that if the first TCS is being used for the first time, the first TCS must be run (that is, compiled), which generates a TCS output table. If the first TCS has been used before and has been stored in the video memory or another storage location (that is, it is cached), it can be loaded directly when the first TCS is needed again. The same applies to the second table and is not repeated here.
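The load-or-compile choice described above can be sketched as follows. This is only an illustration under stated assumptions: `compile_tcs` is a hypothetical stand-in for the real compiler, the shader "source" is reduced to a list of output names, and the 4-unit spacing between offsets is assumed.

```python
# Hypothetical cache keyed by a shader identifier.
table_cache = {}
compile_count = 0


def compile_tcs(output_names):
    """Stand-in for compiling a TCS: returns its output table.

    Assumption: each output occupies 4 units of video memory.
    """
    global compile_count
    compile_count += 1
    return [(name, 4 * index) for index, name in enumerate(output_names)]


def obtain_first_table(shader_key, output_names):
    """Load the table from the cache if present, otherwise compile and cache it."""
    if shader_key in table_cache:
        return table_cache[shader_key]
    table = compile_tcs(output_names)
    table_cache[shader_key] = table
    return table


first_use = obtain_first_table("tcs_1", ["va", "vb"])   # compiled
second_use = obtain_first_table("tcs_1", ["va", "vb"])  # loaded from cache
```

The second call returns the cached table, so the stand-in compiler runs only once.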
In some embodiments, as shown in FIG. 10, performing matching based on the first feature of the first table and the second feature of the second table, and storing the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table, may include:
Step 1001: compare the first feature of a target output variable in the first table with the second feature of each input variable in the second table;
Step 1002: in response to the second table containing a target input variable whose second feature is the same as the first feature of the target output variable, store the first storage address information of the target output variable to the storage location indicated by the second storage address information of the target input variable;
where the target output variable is any output variable in the first table, and the target input variable is the input variable in the second table that matches the target output variable.
It should be noted that the target output variable may be any output variable in the first table, and the target input variable is the input variable in the second table that matches it. In other words, every output variable in the first table is matched against the input variables in the second table in the same way, so only the target output variable is described as an example. Specifically, the target output variable is used as an index to traverse the input variables in the second table. If the second table contains a target input variable whose second feature is the same as the first feature of the target output variable, the first storage address information of the target output variable is stored to the storage location indicated by the second storage address information of that target input variable, which completes the matching between the target output variable and the target input variable. The remaining TCS output variables are matched with TES input variables in the same way.
For example, FIG. 11 shows a schematic diagram of matching a first table with a second table according to an embodiment of the present application. In FIG. 11, the first table is the one shown in FIG. 8 and the second table is the one shown in FIG. 9. The first feature va of an output variable in the first table is the same as the second feature va of an input variable in the second table, so the first storage address information of that output variable is stored to the storage location indicated by the second storage address information of that input variable; that is, the 0 in offset(0) is stored at ID(0). Likewise, the first feature vb of an output variable in the first table is the same as the second feature vb of an input variable in the second table, so the 4 in offset(4) is stored at ID(1).
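Steps 1001 and 1002 can be sketched as a nested traversal. This is a minimal illustration, not the implementation of the application: the function name `match_tables` is hypothetical, rows are modeled as `(name, value)` tuples, and the buffer is modeled as a dictionary from ID to stored offset.

```python
def match_tables(tcs_output_table, tes_input_table, buffer):
    """For each TCS output, find the TES input with the same name and
    store the output's offset at that input's ID in the buffer."""
    for out_name, offset in tcs_output_table:          # target output variable
        for in_name, buffer_id in tes_input_table:     # step 1001: compare names
            if in_name == out_name:
                buffer[buffer_id] = offset             # step 1002: store offset at ID
                break


# Values from the FIG. 11 example.
tcs_output_table = [("va", 0), ("vb", 4)]
tes_input_table = [("va", 0), ("vb", 1)]
buffer = {}
match_tables(tcs_output_table, tes_input_table, buffer)
# buffer now maps ID 0 to offset 0 and ID 1 to offset 4.
```

A production driver would more likely build a name-to-ID lookup once instead of traversing the second table for every output; the nested loop is kept here because it mirrors the traversal described in the text.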
In some embodiments, the method may further include:
in response to the second table containing no input variable whose second feature is the same as the first feature of the target output variable, outputting indication information, where the indication information indicates a matching failure.
That is, if the second table contains no second feature that is the same as the first feature of the target output variable, indication information indicating the matching failure is output. The indication information may be presented in any form, such as text, numbers, or audio. When no second feature identical to the first feature of the target output variable can be found in the second table, the output of the TCS does not match the input of the TES, and the output of the TCS must not be passed to the TES, so as to avoid subsequent errors. Afterwards, the matching between the TCS and the TES can be performed again.
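The failure case can be sketched as follows. The exception type `MatchFailure` and the function `match_or_report` are hypothetical names chosen for this illustration; the application only requires that some indication of the failure be output, in any form. The sketch checks all names first so that nothing is written to the buffer when the TCS outputs and TES inputs do not match.

```python
class MatchFailure(Exception):
    """Hypothetical indication that a TCS output has no matching TES input."""


def match_or_report(tcs_output_table, tes_input_table, buffer):
    ids_by_name = dict(tes_input_table)  # name -> buffer ID
    missing = [name for name, _ in tcs_output_table if name not in ids_by_name]
    if missing:
        # Nothing is written to the buffer, so no TCS output reaches the TES.
        raise MatchFailure("no TES input matches TCS output(s): " + ", ".join(missing))
    for name, offset in tcs_output_table:
        buffer[ids_by_name[name]] = offset


buffer = {}
failure = None
try:
    # "vc" has no counterpart in the TES input table.
    match_or_report([("va", 0), ("vc", 4)], [("va", 0), ("vb", 1)], buffer)
except MatchFailure as err:
    failure = str(err)
```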
In some embodiments, the method may further include:
upon obtaining the first table, generating a first instruction, where the first instruction is used to store each output variable based on the corresponding first storage address information in the first table.
It should be noted that, as described above, the first TCS may be a program which, once compiled, stores the output variables themselves in the video memory. Running the first TCS therefore generates an instruction for storing the output variables, namely the first instruction. It should be understood that the TCS has at least one output variable, so the first instruction may include at least one sub-instruction for storing an output variable. For example, the first instruction may be as shown in FIG. 12, which is a schematic diagram of the first instruction generated from the TCS shown in FIG. 6.
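The relationship between the first table and the first instruction can be sketched as below. The tuple encoding of sub-instructions and the function names are hypothetical, video memory is modeled as a `bytearray`, and each output is assumed to be a single 32-bit float; the application does not specify the variable types or the instruction format.

```python
import struct


def generate_first_instruction(tcs_output_table):
    """One store sub-instruction per TCS output variable."""
    return [("store", name, offset) for name, offset in tcs_output_table]


def execute_first_instruction(instruction, output_values, video_memory):
    """Write each output value to video memory at its offset.

    Assumption: every output is one little-endian 32-bit float.
    """
    for _op, name, offset in instruction:
        struct.pack_into("<f", video_memory, offset, output_values[name])


video_memory = bytearray(8)
first_instruction = generate_first_instruction([("va", 0), ("vb", 4)])
execute_first_instruction(first_instruction, {"va": 1.5, "vb": 2.5}, video_memory)
```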
In some embodiments, the method may further include:
upon obtaining the second table, generating a second instruction, where the second instruction is used to read the corresponding first storage address information based on the second storage address information in the second table, and to read the corresponding output variable based on that first storage address information; the output variable so read serves as an input variable of the first TES.
It should be noted that the second instruction can be understood by analogy with the first instruction and is not described again here. For example, the second instruction may be as shown in FIG. 13, which is a schematic diagram of the second instruction generated from the TES shown in FIG. 7.
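The two-level read performed by the second instruction can be sketched as follows. As before, the names and the tuple encoding are hypothetical, the buffer is modeled as a dictionary from ID to offset, video memory as a `bytearray`, and each variable is assumed to be a single 32-bit float.

```python
import struct


def generate_second_instruction(tes_input_table):
    """One load sub-instruction per TES input variable."""
    return [("load", name, buffer_id) for name, buffer_id in tes_input_table]


def execute_second_instruction(instruction, buffer, video_memory):
    """Read the offset stored at each ID, then read the value at that offset."""
    inputs = {}
    for _op, name, buffer_id in instruction:
        offset = buffer[buffer_id]                                # first read: ID -> offset
        (value,) = struct.unpack_from("<f", video_memory, offset)  # second read: offset -> value
        inputs[name] = value
    return inputs


# State after the TCS has stored its outputs and matching has filled the buffer.
video_memory = bytearray(8)
struct.pack_into("<f", video_memory, 0, 1.5)  # va at offset 0
struct.pack_into("<f", video_memory, 4, 2.5)  # vb at offset 4
buffer = {0: 0, 1: 4}

second_instruction = generate_second_instruction([("va", 0), ("vb", 1)])
tes_inputs = execute_second_instruction(second_instruction, buffer, video_memory)
```

Because the TES instruction refers only to buffer IDs, it does not depend on where a particular TCS placed its outputs.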
In some embodiments, the method may further include:
obtaining a third table, where each row in the third table contains a third feature of an input variable in a second TES and third storage address information of the storage location allocated for that input variable;
performing matching based on the first feature of the first table and the third feature of the third table, and storing the first storage address information of each output variable in the first table to the storage location indicated by the third storage address information of the matching input variable in the third table.
It should be noted that the third table is the input table corresponding to the second TES, and the second TES is different from the first TES. That is, when a different TES is to be matched with the first TCS, the first table is matched with the third table, so that the first storage address information of each output variable in the first table is stored to the storage location indicated by the third storage address information of the matching input variable in the third table. The specific steps are the same as for matching the first table with the second table and are not repeated here.
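Rematching the same first table against a different TES can be illustrated with the short sketch below. The contents of the third table are invented for illustration (a second TES that declares its inputs in a different order and uses non-contiguous IDs), and `match_tables` is a hypothetical helper.

```python
def match_tables(output_table, input_table):
    """Return a buffer (ID -> offset) linking each output to its matching input."""
    ids_by_name = dict(input_table)
    buffer = {}
    for name, offset in output_table:
        if name not in ids_by_name:
            raise LookupError("no input matches output " + name)
        buffer[ids_by_name[name]] = offset
    return buffer


first_table = [("va", 0), ("vb", 4)]   # output table of the first TCS
second_table = [("va", 0), ("vb", 1)]  # input table of the first TES
third_table = [("vb", 0), ("va", 2)]   # input table of a second TES (hypothetical)

buffer_for_first_tes = match_tables(first_table, second_table)
buffer_for_second_tes = match_tables(first_table, third_table)
```

Only the buffer contents differ between the two pairings; the first table, and therefore the compiled TCS and its offsets, is reused unchanged.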
In some embodiments, the method may further include:
obtaining a fourth table, where each row in the fourth table contains a fourth feature of an output variable in a second TCS and fourth storage address information of that output variable;
performing matching based on the fourth feature of the fourth table and the second feature of the second table, and storing the fourth storage address information of each output variable in the fourth table to the storage location indicated by the second storage address information of the matching input variable in the second table.
It should be noted that the fourth table is the output table corresponding to the second TCS, and the second TCS is different from the first TCS. That is, when a different TCS is to be matched with the first TES, the second table is matched with the fourth table, so that the fourth storage address information of each output variable in the fourth table is stored to the storage location indicated by the second storage address information of the matching input variable in the second table. The specific steps are the same as for matching the first table with the second table and are not repeated here.
With the rendering subdivision method provided in the embodiments of the present application, when the ARB_separate_shader_objects extension is implemented, data is passed between the TCS and the TES through a buffer. When a different TCS and/or TES is used and a cache exists (that is, a temporarily stored copy), matching can be performed directly without recompiling the TCS and/or TES, which improves tessellation performance.
Referring to FIG. 14, an embodiment of the present application further provides an electronic device 140, which includes a graphics processing unit (GPU) 1401. The GPU 1401 is configured to: obtain a first table, where each row in the first table contains a first feature of an output variable in a first tessellation control shader (TCS) and first storage address information of the output variable; obtain a second table, where each row in the second table contains a second feature of an input variable in a first tessellation evaluation shader (TES) and second storage address information of the storage location allocated for the input variable; and perform matching based on the first feature of the first table and the second feature of the second table, storing the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
In some embodiments, the GPU may include a compiler and a driver, wherein:
the compiler is configured to: compile the first TCS to obtain the first table, where each row in the first table contains a first feature of an output variable in the first TCS and first storage address information of the output variable; and compile the first TES to obtain the second table, where each row in the second table contains a second feature of an input variable in the first TES and second storage address information of the storage location allocated for the input variable;
the driver is configured to: perform matching based on the first feature of the first table and the second feature of the second table, and store the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
In other embodiments, the GPU may include a driver, wherein:
the driver is configured to: load the cache associated with the first TCS to obtain the first table, where each row in the first table contains a first feature of an output variable in the first TCS and first storage address information of the output variable; load the cache associated with the first TES to obtain the second table, where each row in the second table contains a second feature of an input variable in the first TES and second storage address information of the storage location allocated for the input variable; and perform matching based on the first feature of the first table and the second feature of the second table, storing the first storage address information of each output variable in the first table to the storage location indicated by the second storage address information of the matching input variable in the second table.
That is, the rendering subdivision method runs on the GPU: the first table and the second table are obtained by loading a cache or by compiling, and the two tables are then matched so that the output variables of the TCS are passed to the input variables of the TES.
To aid understanding of the present application, FIG. 15 shows a schematic flowchart of a rendering subdivision method provided in an embodiment of the present application.
The specific flow may include the following:
Compile the TCS or load it from the cache to obtain the TCS output table (that is, the first table or the fourth table), in which each row contains the name and the offset of a TCS output. Generate the first instruction, which stores each output to the video memory at the corresponding offset.
Compile the TES or load it from the cache to obtain the TES input table (that is, the second table or the third table), in which each row contains the name of a TES input and the ID in the buffer assigned to that input. Generate the second instruction, which reads the offset from the ID position of the buffer and then reads the TES input from the video memory at that offset. The buffer may be located in the video memory or in a register. It should be noted that the IDs in the input table may be contiguous, such as ID(0) and ID(1), corresponding to contiguous storage locations in the buffer, or non-contiguous, such as ID(0) and ID(2), corresponding to non-contiguous storage locations in the buffer. In addition, the buffer may occupy a contiguous or a non-contiguous storage area in the video memory or the register.
The driver performs matching according to the TCS output table and the TES input table: for each output in the TCS output table, it looks up the ID of the corresponding entry in the TES input table and loads the offset of that output into the corresponding ID position of the buffer.
If a different TCS is used and a cache exists, load the cache and return to the matching step to perform matching again; if no cache exists, return to the compilation step to compile the TCS and create the cache.
If a different TES is used and a cache exists, load the cache and return to the matching step to perform matching again; if no cache exists, return to the compilation step to compile the TES and create the cache.
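The overall flow above can be sketched end to end as follows. This is an illustrative model under stated assumptions, not the driver of the application: the `Driver` class, its method names, and the `FAKE_TABLES` dictionary standing in for the compiler are all hypothetical, and the buffer is modeled as a dictionary from ID to offset.

```python
class Driver:
    """Hypothetical driver model: caches shader tables and matches them."""

    def __init__(self, compile_shader):
        self.compile_shader = compile_shader  # stand-in for the compiler
        self.cache = {}
        self.compiles = 0

    def obtain_table(self, shader_key):
        """Load the table from the cache, or compile and create the cache."""
        if shader_key not in self.cache:
            self.cache[shader_key] = self.compile_shader(shader_key)
            self.compiles += 1
        return self.cache[shader_key]

    def link(self, tcs_key, tes_key):
        """Match the TCS output table against the TES input table."""
        outputs = self.obtain_table(tcs_key)
        ids_by_name = dict(self.obtain_table(tes_key))
        buffer = {}
        for name, offset in outputs:
            if name not in ids_by_name:
                raise LookupError("matching failed for output " + name)
            buffer[ids_by_name[name]] = offset  # load offset at the ID position
        return buffer


# Precomputed tables standing in for real compilation results.
FAKE_TABLES = {
    "tcs_a": [("va", 0), ("vb", 4)],
    "tes_a": [("va", 0), ("vb", 1)],
    "tes_b": [("vb", 0), ("va", 2)],  # non-contiguous IDs
}

driver = Driver(FAKE_TABLES.__getitem__)
buffer_1 = driver.link("tcs_a", "tes_a")  # compiles tcs_a and tes_a
buffer_2 = driver.link("tcs_a", "tes_b")  # compiles only tes_b, rematches
buffer_3 = driver.link("tcs_a", "tes_a")  # everything cached, rematch only
```

Across the three pairings each shader is compiled exactly once; switching shaders afterwards costs only a rematch.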
Based on the above description, in the tessellation scheme provided in the embodiments of the present application, the GPU driver matches the outputs of the TCS with the inputs of the TES. When the TCS or the TES changes, the shaders therefore do not need to be recompiled; only the matching needs to be performed again. Under the ARB_separate_shader_objects extension, when a different TCS or TES is used, the driver only needs to rematch through the TCS output table and the TES input table without recompiling the TCS and the TES, which greatly improves GPU performance.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The present application further provides a computer program product 160, as shown in FIG. 16, which includes a computer program 1600; when executed by a processor, the computer program 1600 implements the method provided in the above embodiments.
It should be noted that terms such as "first" and "second" are used to distinguish similar objects and do not necessarily describe a specific order or sequence.
In addition, it should be understood that the devices and methods disclosed in the embodiments provided in the present application may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the coupling, direct coupling, or communication connection between the components shown or discussed may be implemented through certain interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or of another form.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
The above is only a specific implementation of the present invention, and the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.
Claims (15)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410692387.5A CN118262029B (en) | 2024-05-30 | 2024-05-30 | Rendering subdivision method, electronic device, storage medium and computer program product |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118262029A (en) | 2024-06-28 |
| CN118262029B (en) | 2024-08-13 |
Family
ID=91611627
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410692387.5A Active CN118262029B (en) | 2024-05-30 | 2024-05-30 | Rendering subdivision method, electronic device, storage medium and computer program product |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118262029B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104183008A (en) * | 2014-07-31 | 2014-12-03 | 浙江大学 | Shader classification method and device based on surface signal fitting and tessellation and graphics rendering method |
| CN115576689A (en) * | 2022-10-10 | 2023-01-06 | 阿里云计算有限公司 | Cloud rendering processing method, device and storage medium |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9922442B2 (en) * | 2012-07-18 | 2018-03-20 | Arm Limited | Graphics processing unit and method for performing tessellation operations |
| US9786098B2 (en) * | 2015-07-06 | 2017-10-10 | Mediatek Inc. | Apparatus for performing tessellation operation and methods utilizing the same |
| US10643296B2 (en) * | 2016-01-12 | 2020-05-05 | Qualcomm Incorporated | Systems and methods for rendering multiple levels of detail |
| US10650566B2 (en) * | 2017-02-15 | 2020-05-12 | Microsoft Technology Licensing, Llc | Multiple shader processes in graphics processing |
| CN110415161B (en) * | 2019-07-19 | 2023-06-27 | 龙芯中科(合肥)技术有限公司 | Graphics processing method, device, equipment and storage medium |
| CN113947515B (en) * | 2020-07-17 | 2024-12-03 | 芯原微电子(上海)股份有限公司 | Subdivision curve data processing implementation method, system, medium and vector graphics processing device |
| CN114391155B (en) * | 2020-08-20 | 2025-09-26 | 华为技术有限公司 | GPU shader program iterative calling method, GPU, compiler and GPU driver |
| CN114663272B (en) * | 2022-02-22 | 2024-04-09 | 荣耀终端有限公司 | Image processing method and electronic equipment |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118262029A (en) | 2024-06-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR100742419B1 (en) | Systems and methods to run shader driven compilation to render assets | |
| EP2277106B1 (en) | Software rasterization optimization | |
| US7619630B2 (en) | Preshaders: optimization of GPU pro | |
| US7777750B1 (en) | Texture arrays in a graphics library | |
| KR100860427B1 (en) | Method for optimizing stream processing programs and information storage medium including instructions adapted to execute the same, and graphics processing subsystem including the information storage medium | |
| CN110084875B (en) | Using a compute shader as a front-end for a vertex shader | |
| US20080266296A1 (en) | Utilization of symmetrical properties in rendering | |
| US20090040222A1 (en) | Multi-pass shading | |
| US20100046846A1 (en) | Image compression and/or decompression | |
| KR102266962B1 (en) | Compiler-assisted technologies to reduce memory usage in the graphics pipeline | |
| JP2023525725A (en) | Data compression method and apparatus | |
| CN118262029B (en) | Rendering subdivision method, electronic device, storage medium and computer program product | |
| US12394010B2 (en) | Pipeline delay elimination with parallel two level primitive batch binning | |
| US20240070961A1 (en) | Vertex index routing for two level primitive batch binning | |
| US10062140B2 (en) | Graphics processing systems | |
| CN115563426A (en) | System and method for analyzing and displaying 3D file by browser | |
| CN120765448A (en) | Geometric processing method, graphic processor and computer equipment | |
| CN119273826A (en) | WebGL application dynamic translation method and device for WebGPU |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CP03 | Change of name, title or address | Address after: B655, 4th Floor, Building 14, Cuiwei Zhongli, Haidian District, Beijing, 100036; Patentee after: Mole Thread Intelligent Technology (Beijing) Co.,Ltd.; Country or region after: China; Address before: 209, 2nd Floor, No. 31 Haidian Street, Haidian District, Beijing; Patentee before: Moore Threads Technology Co., Ltd.; Country or region before: China |