KR20040011918A

KR20040011918A - Operation method of pixel cache architecture in three-dimensional graphic accelerator

Info

Publication number: KR20040011918A
Application number: KR1020020045234A
Authority: KR
Inventors: 이길환; 박우찬; 김일산; 한탁돈; 김신덕
Original assignee: 학교법인연세대학교
Priority date: 2002-07-31
Filing date: 2002-07-31
Publication date: 2004-02-11
Anticipated expiration: 2022-07-31
Also published as: KR100441080B1

Abstract

The present invention provides a method of operating a pixel cache structure that can significantly reduce the bandwidth problem and access latency of the frame buffer. The method comprises: detecting existing fragment information having the same texture coordinate value as a newly input fragment by accessing a pixel cache and an NT buffer simultaneously; comparing the depth value of the newly input fragment with the depth value of the detected existing fragment; if the comparison shows that the depth value of the newly input fragment is larger than that of the existing fragment, storing the information of the newly input fragment in the NT buffer and waiting to process the next newly input fragment; if the comparison shows that the depth value of the newly input fragment is smaller than that of the existing fragment, storing the information of the newly input fragment in the pixel cache; and performing a read operation on the color value for the newly input fragment from the pixel cache, performing alpha blending on the color values, and storing the alpha-blended color value in the pixel cache.

Description

Operation method of pixel cache architecture in three-dimensional graphic accelerator

The present invention relates to a three-dimensional graphics accelerator, and more particularly to a pixel cache structure with a cache loading method based on depth test results.

With the recent introduction of 3D graphics techniques to the PC, 3D graphics hardware that can efficiently process large numbers of polygons and special effects such as lighting has become mainstream in the market, driven by the demand for more realistic and visually rich scenes.

In particular, as games that use 3D scenes to give users a strong sense of realism have become popular, research in this area has advanced considerably.

As a result of this research, the performance of 3D graphics hardware has improved rapidly.

FIG. 1 is a diagram showing a three-dimensional graphics processing procedure according to the prior art.

As shown in FIG. 1, 3D application software passes through an API (Application Program Interface) (100), is accelerated in real time by hardware in the 3D graphics accelerator (200), and the result is sent to the display unit (500).

The 3D graphics accelerator (200) consists broadly of geometry processing (300) and rendering processing (400).

The geometry processing (300) mainly transforms objects in the 3D coordinate system according to the viewpoint and projects them onto a 2D coordinate system.

When all the 3D data input for one frame has been processed in this way, the color values stored in the frame buffer are sent to the display unit (500) and output.

3D graphics images are composed mainly of points, lines, and polygons, and most 3D rendering hardware is structured to process triangles at high speed.

Accordingly, the rendering processing (400) is pipelined for high performance and consists broadly of triangle setup processing performed per triangle, edge-walk processing performed per edge, and pixel rasterization processing performed per pixel.

The triangle setup processing computes, for each input triangle, the values to be used in the edge-walk processing and the pixel rasterization processing.

The edge-walk processing finds the start point and end point of each span along the edges of the triangle.

Here, the start and end points of a span are the two points where a given scanline intersects the edges of the triangle, and the span is the set of pixels between these two points.

The pixel rasterization processing generates, by interpolation over the span, the final color value of each pixel that makes up the span.
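As a rough illustration of the span and the interpolation described above, the following Python sketch linearly interpolates a color across the pixels between a span's start and end points. The function name `interpolate_span` and the representation of colors as tuples of floats are assumptions made for illustration; the text does not specify how the interpolation is implemented.

```python
def interpolate_span(x_start, x_end, color_start, color_end):
    """Return a list of (x, color) pairs for every pixel in [x_start, x_end].

    Colors are tuples of floats (hypothetical representation) and are
    linearly interpolated per channel between the two span end points.
    """
    width = x_end - x_start
    pixels = []
    for x in range(x_start, x_end + 1):
        # t runs from 0.0 at the start point to 1.0 at the end point.
        t = 0.0 if width == 0 else (x - x_start) / width
        color = tuple(a + (b - a) * t for a, b in zip(color_start, color_end))
        pixels.append((x, color))
    return pixels
```

A real rasterizer would typically interpolate depth and texture coordinates in the same way and would use incremental (per-pixel delta) arithmetic rather than recomputing `t` for each pixel.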

Since the present invention concerns the operation of this pixel rasterization processing, only that operation is examined here.

FIG. 2 is a diagram showing the pipeline of general pixel rasterization processing.

Referring to FIG. 2, the input fragment information includes a color value generated by interpolation, three-dimensional position coordinates (x, y, z), texture coordinates, and so on.

First, the texture read/filter unit (401) uses the texture coordinates of the input fragment information to read four or eight texels from the texture cache (419) and filters them to produce a single texel. A texel is the smallest unit of texture data.
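The text does not name the filter used to reduce four texels to one. A common choice for four texels is bilinear filtering, and the Python sketch below assumes that choice purely for illustration. The function name `bilinear_filter`, the argument layout, and the tuple-of-floats color representation are all hypothetical.

```python
def bilinear_filter(t00, t10, t01, t11, fu, fv):
    """Blend four neighbouring texels into one (assumed bilinear filter).

    t00 and t10 are the texels at (u, v) and (u+1, v); t01 and t11 are
    the texels at (u, v+1) and (u+1, v+1). fu and fv, each in [0, 1],
    are the fractional parts of the texture coordinates.
    """
    def lerp(a, b, t):
        # Per-channel linear interpolation between two colors.
        return tuple(x + (y - x) * t for x, y in zip(a, b))

    top = lerp(t00, t10, fu)        # blend along u on the first row
    bottom = lerp(t01, t11, fu)     # blend along u on the second row
    return lerp(top, bottom, fv)    # blend the two rows along v
```

The eight-texel case mentioned in the text would correspond to applying such a filter on two mipmap levels and blending the two results, again under the assumption that a trilinear-style filter is meant.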

Next, the texture blending unit (403) blends the color values of the texels read from the texture cache and defines the generated texel as a single color.

The alpha test unit (405) then tests the transparency, that is, the alpha value, of the fragment associated with the newly input texel.

The depth read unit (407) and the depth test unit (409) then read the depth value of the fragment at the current position from the pixel cache (421) and compare that depth value with the depth value of the newly input fragment.

If the comparison shows that the depth value of the newly input fragment is smaller than the depth value of the current fragment, the depth value of the newly input fragment is stored in the pixel cache (421).

If the comparison shows that the depth value of the newly input fragment is larger than the depth value of the current fragment, the newly input fragment is discarded from the pipeline.

The color read unit (413) then reads the color value from the pixel cache (421), and the alpha blending unit (415) alpha-blends the color value obtained by that read with the color value produced by the texture blending unit (403).

Finally, the color write unit (417) stores the color value alpha-blended by the alpha blending unit (415) in the pixel cache (421).
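The conventional depth test and alpha blending steps can be sketched as follows. This is a minimal illustration, not the described hardware: the dictionary-based `pixel_cache`, the function name `conventional_pixel_stage`, the treatment of equal depths as a failure, and the use of the standard "source over" blend formula are all assumptions, since the text specifies none of them.

```python
def conventional_pixel_stage(pixel_cache, pos, new_depth, new_color, alpha):
    """Depth-test one fragment, then alpha-blend and write it back.

    pixel_cache maps a pixel position to a (depth, color) pair
    (hypothetical layout). Returns True if the fragment was kept.
    """
    old_depth, old_color = pixel_cache[pos]
    if new_depth >= old_depth:
        # The new fragment is not closer than the stored one: discard it.
        # (Equal depths are treated as a failure here; this is an assumption.)
        return False
    # Assumed blend: alpha * new + (1 - alpha) * old, per channel.
    blended = tuple(alpha * n + (1.0 - alpha) * o
                    for n, o in zip(new_color, old_color))
    pixel_cache[pos] = (new_depth, blended)
    return True
```

In this conventional flow a fragment that fails the depth test still causes the stored depth to be read, which is the traffic the proposed structure aims to handle better.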

In such rasterization processing, a large share of the total memory traffic arises from texture data transfers and frame memory (423) accesses in the pixel processing pipeline, so not only the memory bandwidth but also the latency of frame memory (423) accesses becomes an important performance factor.

Most recently announced 3D graphics accelerators (200) place cache memories in between, as shown in FIG. 2, to reduce the traffic caused by texture transfers and frame buffer writes. Even so, as the screen resolution rises the number of pixels to be processed also increases, so the traffic of the texture cache (419) and the pixel cache (421) rises sharply as well.

Memory is one of the key factors affecting performance in 3D graphics hardware, yet analysis of memory data and access patterns and research on memory structures have been largely neglected.

A previously published paper by Mitra and Chiuh only briefly mentions texture traffic and memory bank utilization within the rasterization stage as part of its dynamic workload analysis. Beyond that, only studies addressing the problems caused by texture data, such as texture caches and texture prefetching techniques, have been published, and research on pixel caches is scarcer still.

The present invention was therefore devised to solve the above problems, and its object is to provide a pixel cache structure that can significantly reduce the bandwidth problem and access latency of the frame buffer.

FIG. 1 is a diagram showing a three-dimensional graphics processing procedure according to the prior art.

FIG. 2 is a diagram showing the pipeline of general pixel rasterization processing.

FIG. 3 is a diagram showing the pipeline of pixel rasterization processing according to the present invention.

* Explanation of reference numerals for the main parts of the drawings *

100 : API
200 : 3D graphics accelerator
300 : geometry processing
400 : rendering processing
401 : texture read/filter unit
403 : texture blending unit
405 : alpha test unit
407 : depth read unit
409 : depth test unit
411 : depth write unit
413 : color read unit
415 : alpha blending unit
417 : color write unit
419 : texture cache
421 : pixel cache
422 : NT buffer
423 : frame memory
500 : display unit

To achieve the above object, the method of operating a pixel cache structure in a three-dimensional graphics accelerator according to the present invention is characterized by comprising: detecting existing fragment information having the same texture coordinate value as a newly input fragment by accessing a pixel cache and an NT buffer simultaneously; comparing the depth value of the newly input fragment with the depth value of the detected existing fragment; if the comparison shows that the depth value of the newly input fragment is larger than that of the existing fragment, storing the information of the newly input fragment in the NT buffer and waiting to process the next newly input fragment; if the comparison shows that the depth value of the newly input fragment is smaller than that of the existing fragment, storing the information of the newly input fragment in the pixel cache; and performing a read operation on the color value for the newly input fragment from the pixel cache, performing alpha blending on the color values, and storing the alpha-blended color value in the pixel cache.

Another feature is that the pixel cache stores only depth data that passed the depth test, and the NT buffer stores only depth data that failed the depth test.

A further feature is that data stored in the NT buffer is moved to the pixel cache if it passes the depth test and is kept in the NT buffer if it fails the depth test.

Yet another feature is that the method further comprises, when the fragment information is present in neither the pixel cache nor the NT buffer, accessing the frame memory to obtain the fragment information and storing it in the NT buffer.

Other objects, features, and advantages of the present invention will become apparent from the detailed description of the embodiments with reference to the accompanying drawings.

A preferred embodiment of the pixel cache structure in a three-dimensional graphics accelerator according to the present invention is described below with reference to the accompanying drawings.

FIG. 3 is a diagram showing the pipeline of pixel rasterization processing according to the present invention.

As shown in FIG. 3, the pixel rasterization pipeline is the same as the conventional one. The only difference is that the cache structure consists of the texture cache (419), the pixel cache (421), and the NT buffer (422).

The texture cache (419) and the pixel cache (421) are configured like ordinary caches, and the NT (Not Tested) buffer (422) is configured as a fully associative cache of 8 or 16 entries.
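A small fully associative buffer of the kind described for the NT buffer can be modeled as below. This is only a software sketch under stated assumptions: the class name `NTBuffer`, keying entries by pixel position, and the least-recently-used (LRU) replacement policy are all hypothetical, since the text gives the entry count (8 or 16) and the associativity but no replacement policy.

```python
from collections import OrderedDict


class NTBuffer:
    """Small fully associative buffer keyed by pixel position (sketch)."""

    def __init__(self, num_entries=8):
        self.num_entries = num_entries
        self.entries = OrderedDict()  # position -> stored fragment data

    def lookup(self, pos):
        """Return the stored data for pos, or None on a miss."""
        if pos in self.entries:
            self.entries.move_to_end(pos)  # mark as most recently used
            return self.entries[pos]
        return None

    def insert(self, pos, data):
        """Store data for pos, evicting the LRU entry if the buffer is full."""
        if pos in self.entries:
            self.entries.move_to_end(pos)
        elif len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)  # assumed LRU eviction
        self.entries[pos] = data

    def remove(self, pos):
        """Remove and return the entry for pos (None if absent)."""
        return self.entries.pop(pos, None)
```

Because any entry can hold any position, a lookup in hardware compares the requested address against all 8 or 16 tags at once, which is practical only for a buffer this small.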

The operation of the pixel rasterization pipeline in this configuration is described in detail below with reference to the drawings.

FIG. 4 is a flowchart showing the operation of the pixel cache structure according to the present invention.

Referring to FIG. 4, when the information of a newly input fragment, including its color value, three-dimensional position coordinates, and texture coordinates, is received, the texture read/filter unit (401) reads multiple texels for the texture coordinates from the texture cache (419) and filters them to produce a single texel. The texture blending unit (403) then blends the multiple color values into one (S10).

Next, the alpha test unit (405) tests the transparency of the generated texel, and the depth read unit (407) accesses the pixel cache (421) and the NT buffer (422) simultaneously to find the information of the existing fragment at the position of the newly input fragment (S20).

If the desired data is in neither the pixel cache (421) nor the NT buffer (422), that is, on a cache miss, the depth read unit (407) fetches the information of the existing fragment from the frame memory (423) and stores it in the NT buffer (422), not in the pixel cache (421) (S50).

The depth test unit (409) then compares the depth value of the existing fragment read from the pixel cache (421) or the NT buffer (422) with the depth value of the newly input fragment (S40)(S60).

That is, if the information of the existing fragment is in the pixel cache (421), its depth value is read from the pixel cache (421); if it is not in the pixel cache (421) but is in the NT buffer (422), its depth value is read from the NT buffer (422). In either case the value is then compared with the depth value of the newly input fragment.

If the comparison shows that the depth value of the newly input fragment is smaller than the depth value of the existing fragment, the depth write unit (411) stores the information of the newly input fragment in the pixel cache (421) (S80).

If the comparison shows that the depth value of the newly input fragment is larger than the depth value of the existing fragment, the depth write unit (411) stores the information of the newly input fragment in the NT buffer (422) (S70).

Once the information of the newly input fragment has been stored in the NT buffer (422), rendering of the current fragment stops and the pipeline waits to process the next newly input fragment (S100).

When the information of the newly input fragment has been stored in the pixel cache (421), the color read unit (413) reads the color value from the pixel cache (421), and the alpha blending unit (415) alpha-blends the color value obtained by that read with the color value produced by the texture blending unit (403). The color write unit (417) then stores the color value alpha-blended by the alpha blending unit (415) in the pixel cache (421) (S90).
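The placement decision in steps S20 through S100 can be sketched in Python as follows. This is a simplified model under several assumptions: the three stores are plain dictionaries mapping a pixel position to a depth value, capacity and eviction are ignored, the two lookups are done one after the other rather than simultaneously, and equal depths are treated as a failure. On a failed test the text stores the new fragment's information in the NT buffer; this sketch only reports that destination and does not model the write itself, because the exact contents written are not detailed in the text. The function name `process_fragment` is hypothetical.

```python
def process_fragment(pos, new_depth, pixel_cache, nt_buffer, frame_memory):
    """Return (source, destination) for one incoming fragment.

    source      -- where the existing depth was found
    destination -- "pixel_cache" if the depth test passed, else "nt_buffer"
    """
    # S20: probe the pixel cache and the NT buffer.
    if pos in pixel_cache:
        source, old_depth = "pixel_cache", pixel_cache[pos]
    elif pos in nt_buffer:
        source, old_depth = "nt_buffer", nt_buffer[pos]
    else:
        # S50: miss in both; fetch from frame memory into the NT buffer only.
        source, old_depth = "frame_memory", frame_memory[pos]
        nt_buffer[pos] = old_depth

    # S40/S60: depth comparison (equal depths count as a failure: assumption).
    if new_depth < old_depth:
        # S80: passed; the entry now lives in the pixel cache.
        nt_buffer.pop(pos, None)
        pixel_cache[pos] = new_depth
        return source, "pixel_cache"  # S90 (color read, blend, write) follows
    # S70/S100: failed; nothing is loaded into the pixel cache.
    return source, "nt_buffer"
```

The point of the design is visible in the miss path: data fetched from frame memory enters the pixel cache only after a fragment at that position passes the depth test, so positions whose fragments keep failing never displace pixel cache contents.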

With this method of operation, cache pollution caused by data that failed the depth test is reduced and data that is likely to be reused is stored in the pixel cache. This improves the efficiency of the pixel cache's storage space and also raises the pixel cache hit rate.

That is, the likelihood of reuse is judged from the depth test result, and data that failed the depth test is kept in the NT buffer (422) for a certain period. Although data that failed the depth test is less likely to be reused than data that passed, it is more likely to be reused than other data that has not been used in a depth test, so the access latency and bandwidth incurred when the failed data is reused can be reduced.

FIG. 5 is a graph showing the miss rate of the pixel cache with and without the NT buffer.

In FIG. 5, the entry sizes are 64 bytes and 128 bytes, and the NT buffer (422) is sized to hold one or two fragment data items.

Pixel cache (421) sizes of 16 KB, 32 KB, and 64 KB are shown as examples.

As FIG. 5 shows, the miss rate is lower with the NT buffer (422) than without it. In this embodiment the size of the NT buffer (422) is limited to one or two fragment data items, but increasing the size of the NT buffer (422) is expected to yield a larger benefit.

The pixel cache structure in a three-dimensional graphics accelerator according to the present invention, as described above, has the following effects.

First, only depth data that passed the depth test, that is, data with a high probability of being used again, is loaded into the cache, which reduces cache pollution and thereby raises the cache hit rate.

Second, the technique for selecting which data to load into the cache is simple. The depth test is a standard part of rendering and the selection only inspects its result, so the hardware cost is low and the time needed to decide whether to load data into the cache is short.

Those skilled in the art will appreciate from the above description that various changes and modifications are possible without departing from the technical idea of the present invention.

Therefore, the technical scope of the present invention should not be limited to what is described in the embodiments but should be determined by the claims.

Claims

A method of operating a pixel cache structure in a three-dimensional graphics accelerator, comprising:

detecting existing fragment information having the same texture coordinate value as a newly input fragment by accessing a pixel cache and an NT buffer simultaneously;

comparing the depth value of the newly input fragment with the depth value of the detected existing fragment;

if, as a result of the comparison, the depth value of the newly input fragment is larger than the depth value of the existing fragment, storing the information of the newly input fragment in the NT buffer and waiting to process the next newly input fragment;

if, as a result of the comparison, the depth value of the newly input fragment is smaller than the depth value of the existing fragment, storing the information of the newly input fragment in the pixel cache; and

performing a read operation on the color value for the newly input fragment from the pixel cache, performing alpha blending on the color values, and storing the alpha-blended color value in the pixel cache.

The method of claim 1, wherein the pixel cache stores only depth data that passed the depth test, and the NT buffer stores only depth data that failed the depth test.

The method of claim 1, wherein data stored in the NT buffer is moved to the pixel cache if it passes the depth test and is kept in the NT buffer if it fails the depth test.

The method of claim 1, further comprising, when the fragment information is present in neither the pixel cache nor the NT buffer, accessing a frame memory to obtain the fragment information and storing the fragment information in the NT buffer.