KR20240106637A

KR20240106637A - Electronic device and method for page fault handling

Info

Publication number: KR20240106637A
Application number: KR1020220189597A
Authority: KR
Inventors: 임영빈; 이봉원; 최승혁
Original assignee: 울산과학기술원
Priority date: 2022-12-29
Filing date: 2022-12-29
Publication date: 2024-07-08

Abstract

According to an embodiment, an electronic device may obtain a faulted page from a cache memory area. Provided is a page fault handling method comprising the steps of: determining whether to perform prefetching accompanied by a smart network interface card; applying a record of the host's virtual memory offsets accessed by a thread to the Leap algorithm to obtain a major offset; determining a first page batch to be prefetched by the host and a second page batch to be prefetched by the smart network interface card; transmitting a request batch to a remote memory server; and receiving, by the smart network interface card, the second page batch.

Description

Page fault handling device and method {ELECTRONIC DEVICE AND METHOD FOR PAGE FAULT HANDLING}

The following relates to a device and method for handling page faults.

Conventional network interface cards (NICs) have been built on application-specific integrated circuits (ASICs). This makes them easy to mass-produce and inexpensive, but their internal logic cannot be changed. To address this limitation, a new type of card, the smart network interface card (SmartNIC), was developed. A SmartNIC has its own onboard processor and the computing resources that processor needs, so it can process various workloads on behalf of the host.

As demand for artificial intelligence has grown, the amount of data to be processed has increased, and with it the memory demanded by individual hosts. Research has therefore recently been conducted on memory disaggregation techniques that allow a host to go beyond its own physical limits. These studies mainly use RDMA to make the memory of other nodes usable. However, they incur high CPU overhead on the host and suffer from high page access latency.

The background technology described above was possessed or acquired by the inventors in the process of deriving the disclosure of the present application, and is not necessarily known technology disclosed to the general public before this application.

An electronic device according to an embodiment includes: a memory storing computer-executable instructions; a processor that accesses the memory and executes the instructions; a host; and a smart network interface card (Smart NIC). The instructions may: determine, based on metadata about a fault page of the host, whether to perform prefetching accompanied by the smart network interface card; based on a determination to perform the prefetching, obtain a major offset by applying a record of the host's virtual memory offsets accessed by a thread to the Leap algorithm; determine, based on the major offset, a first page batch to be prefetched by the host and a second page batch to be prefetched by the smart network interface card; transmit, to a remote memory server, a request batch based on metadata for the determined first page batch and metadata for the determined second page batch; and cause the host to receive the first page batch and the smart network interface card to receive the second page batch from the remote memory server responding to the request.

The processor may obtain a caching record of the fault page stored in a page information table of the host, return the fault page to the host from the cache memory region in which it is stored based on the caching record, and update, in a thread information table of the host, the hit information of the thread corresponding to the returned fault page.

The processor may decide to perform prefetching accompanied by the smart network interface card when no metadata about the fault page exists in a cache information table of the host.

The processor may obtain, for each of a plurality of threads running on the host, records of the offsets of the host's virtual memory accessed by that thread, and may obtain the major offset by applying the offset records of each of the plurality of threads to the Leap algorithm, which is a majority vote algorithm.

The processor may select pages to be prefetched based on the major offset, determine the first page batch from among the selected pages based on an expected access time to the remote memory server, and determine the selected pages other than the first page batch to be the second page batch.

The processor may determine, from among the determined second page batch, pages to be stored in the cache memory area of the host and pages to be stored in the cache memory area of the smart network interface card, and may deliver metadata for the second page batch to a prefetcher of the smart network interface card through an RDMA (remote direct memory access) offloader of the host.

The processor may transmit the request batch corresponding to the first page batch to the remote memory server through an RDMA executor of the host, and may transmit the request batch corresponding to the second page batch to the remote memory server through an RDMA executor of the smart network interface card.

The processor may obtain an expected latency time from a scheduler of the smart network interface card based on the interval between pages stored in the remote memory server and, when the expected latency time exceeds a predetermined threshold time, transmit the requests included in the request batch to the remote memory server individually.
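The threshold rule in the preceding paragraph can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `plan_transmission` and its parameters are hypothetical, and the expected latency is taken as an input because the source does not state how the scheduler computes it from the page interval.

```python
from typing import List, Sequence


def plan_transmission(
    requests: Sequence[int],
    expected_latency_us: float,
    threshold_us: float,
) -> List[List[int]]:
    """Return the groups of requests to post to the remote memory server.

    Hypothetical helper: if the scheduler's expected latency exceeds the
    threshold, every request is sent on its own; otherwise the whole
    request batch is sent as a single group.
    """
    if expected_latency_us > threshold_us:
        # Latency too high: split the batch into individual requests.
        return [[r] for r in requests]
    # Latency acceptable: keep the batch together.
    return [list(requests)]
```

For example, with a threshold of 10 microseconds, an expected latency of 12 microseconds splits a three-request batch into three single-request groups, while 8 microseconds keeps it as one group.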

The processor may measure the number of requests in a sending queue of the smart network interface card and, when the measured number of requests exceeds a predetermined threshold value, change the rate at which at least one of the first page batch or the second page batch is received.

When the second page batch is received from the remote memory server, the processor may deliver to the host those pages of the second page batch that are to be stored in the cache memory area of the host, and may update, in the thread information table of the host, the hit information of the threads corresponding to the delivered pages.

The processor may write pages to be evicted, from among the pages included in the received first page batch and second page batch, to a write memory region of the host, and may deliver an eviction request batch based on metadata for the pages to be evicted to the smart network interface card, and the smart network interface card may deliver the pages corresponding to the eviction request batch to the remote memory server.

FIG. 1 is a diagram illustrating a page fault handling electronic device and a remote memory server according to an embodiment.
FIG. 2 is a flowchart illustrating a page fault handling method according to an embodiment.
FIG. 3 is a diagram illustrating a host and a smart network interface card of a page fault handling electronic device according to an embodiment.
FIG. 4 is a diagram illustrating the host and the smart network interface card of the electronic device in a cache hit situation according to an embodiment.
FIGS. 5 and 6 are diagrams illustrating the host and the smart network interface card of the electronic device in a cache miss situation according to an embodiment.
FIG. 7 is a diagram illustrating network scheduling by the host and the smart network interface card of the page fault handling electronic device according to an embodiment.
FIG. 8 is a diagram illustrating page eviction by the host and the smart network interface card of the page fault handling electronic device according to an embodiment.

Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only and may be changed and implemented in various forms. Accordingly, the actual implementation is not limited to the specific embodiments disclosed, and the scope of this specification includes changes, equivalents, and substitutes that fall within the technical idea described through the embodiments.

Terms such as "first" and "second" may be used to describe various components, but these terms should be interpreted only as distinguishing one component from another. For example, a first component may be named a second component, and similarly, a second component may be named a first component.

When a component is referred to as being "connected" to another component, it should be understood that it may be directly connected or coupled to the other component, or that other components may exist in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to specify the presence of the described features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In this document, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B or C", "at least one of A, B and C", and "at least one of A, B, or C" may include any one of the items listed together in the corresponding phrase, or all possible combinations thereof.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related technology and, unless clearly defined in this specification, should not be interpreted in an idealized or overly formal sense.

Hereinafter, embodiments are described in detail with reference to the attached drawings. In the description with reference to the drawings, identical components are given the same reference numerals regardless of the figure in which they appear, and overlapping descriptions of them are omitted.

FIG. 1 is a diagram illustrating a page fault handling electronic device and a remote memory server according to an embodiment.

The electronic device 100 according to an embodiment is a device for handling page faults and may include a processor 110, a memory 120, a host 130, and a smart network interface card (Smart NIC) 140.

The processor 110 may, for example, execute software to control at least one other component (e.g., a hardware or software component) of the electronic device 100 connected to the processor 110, and may perform various data processing or computation. According to an embodiment, as at least part of the data processing or computation for page fault handling, the processor 110 may store instructions or data in the memory 120, process the instructions or data stored in the memory 120, and store the resulting data in the memory 120.

The memory 120 may include computer-executable instructions. The memory 120 may temporarily and/or permanently store various data and/or information required for page fault handling. For example, the memory 120 may store at least one of page information, thread information, or metadata about a page batch. Page information may include the virtual address, offset, type, and state of a page. Thread information may include the ID, offset, and hit information of a thread. A page batch may be a batch containing at least two pages.
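The records described above could be modeled roughly as follows. This is an illustrative sketch only: the class names, field names, and the `PageState` values are assumptions made for the example, since the source lists the kinds of fields (virtual address, offset, type, state; thread ID, offset, hit information) but not their concrete representation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class PageState(Enum):
    """Hypothetical page states; the source does not enumerate them."""
    REMOTE = "remote"
    CACHED = "cached"
    EVICTING = "evicting"


@dataclass
class PageInfo:
    virtual_address: int
    offset: int
    page_type: str      # "type" in the text; renamed to avoid the builtin
    state: PageState


@dataclass
class ThreadInfo:
    thread_id: int
    offset: int
    hits: int = 0       # hit information, modeled here as a simple counter


@dataclass
class PageBatch:
    pages: List[PageInfo]

    def __post_init__(self) -> None:
        # The text defines a page batch as containing at least two pages.
        if len(self.pages) < 2:
            raise ValueError("a page batch must contain at least two pages")
```

Enforcing the two-page minimum in `__post_init__` keeps an invalid batch from being constructed in the first place.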

The host 130 may include the kernel area of the electronic device 100. For example, the host 130 may include the portion of the memory space allocated to one or more processes that remains after excluding the user area. The host 130 may be distinguished from the smart network interface card 140 in that its instructions are processed by the CPU.

For example, the host 130 may prefetch a page batch that includes the faulted page. Specifically, the host 130 may receive a page batch including the faulted page from the remote memory server 150 and may prefetch the received page batch. Prefetching may refer to the process of delaying the eviction time of pages in the cache memory area of the electronic device 100, or of storing candidate pages (for example, the pages of a page batch other than the faulted page) in advance, in order to reduce the overhead incurred when pages repeatedly move between a memory area (e.g., the cache memory area) and a disk area (e.g., the remote memory server 150). The host 130 may offload the prefetching process described above to the smart network interface card 140. The host 130 may also offload remote direct memory access (RDMA) to the smart network interface card 140. Here, RDMA may refer to at least one of requesting a page batch from the remote memory server 150 or receiving a page batch from the remote memory server 150, for the purpose of prefetching a page batch that includes the fault page.

The smart network interface card 140 may be a network interface card that has its own onboard processor and the computing resources that processor needs, and can therefore process various workloads on behalf of the host 130. For example, the smart network interface card 140 may prefetch a page batch that includes the faulted page. Specifically, the smart network interface card 140 may perform remote direct memory access (RDMA) on behalf of the host 130. By doing so, the smart network interface card 140 may reduce both the CPU utilization of the host 130 and the page access latency of the host 130.

The remote memory server 150 may be a server that is separate from the electronic device 100 and stores memory. Specifically, the remote memory server 150 may store a plurality of memories and may transfer them to the electronic device 100 through at least one of a network interface card of the electronic device 100 or the smart network interface card 140.

FIG. 2 is a flowchart illustrating a page fault handling method according to an embodiment.

In step 210, the electronic device (e.g., the electronic device 100 of FIG. 1) may determine, based on metadata about a fault page of the host (e.g., the host 130 of FIG. 1), whether to perform prefetching accompanied by the smart network interface card (e.g., the smart network interface card 140 of FIG. 1). For example, the metadata about the fault page may be structured data describing the fault page, and may include at least one of the address of the fault page or the size of the fault page. Through prefetching, the electronic device may reduce the latency caused by memory access.

The electronic device according to an embodiment may determine whether to perform prefetching using the smart network interface card. For example, when the fault page is stored in the cache memory, the electronic device may return the fault page to the process from the location where it is cached. In other words, when the fault page is stored in the cache memory, the electronic device may perform prefetching of the page batch that includes the fault page. In contrast, when the fault page is not stored in the cache memory, the electronic device may decide to perform prefetching accompanied by the smart network interface card. The prefetching process accompanied by the smart network interface card is described in detail below with reference to FIGS. 5 and 6.
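The hit/miss decision of step 210 can be sketched as a lookup in a cache information table. The sketch below is illustrative and makes several assumptions: the tables are plain dictionaries, the function name `decide_on_fault` and the returned action strings are hypothetical, and the hit information is modeled as a per-thread counter.

```python
from typing import Dict, Optional, Tuple


def decide_on_fault(
    fault_addr: int,
    thread_id: int,
    cache_info: Dict[int, bytes],
    thread_hits: Dict[int, int],
) -> Tuple[str, Optional[bytes]]:
    """Decide how to serve a page fault (hypothetical helper).

    cache_info maps a page address to the cached page contents and
    stands in for the host's cache information table. thread_hits
    stands in for the hit information of the thread information table.
    """
    page = cache_info.get(fault_addr)
    if page is not None:
        # Cache hit: return the cached page and record the hit.
        thread_hits[thread_id] = thread_hits.get(thread_id, 0) + 1
        return "return_cached_page", page
    # Cache miss: no metadata for the fault page, so prefetching
    # accompanied by the smart network interface card is chosen.
    return "prefetch_with_smartnic", None
```

A hit returns the cached contents immediately and bumps the thread's hit counter; a miss only reports the decision and leaves the actual fetch to later steps.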

In step 220, based on the decision to perform prefetching, the electronic device may obtain a major offset by applying a record of the host's virtual memory offsets accessed by a thread to the Leap algorithm. For example, the Leap algorithm may be an algorithm that, each time a page fault occurs, obtains the major offset by applying a majority vote algorithm to the swap cache area offsets stored in the offset record of the faulted pages. The swap cache area may be an area in which pages are temporarily stored before they are evicted to disk.

The electronic device according to an embodiment may record the host's virtual memory offsets accessed by a thread and obtain the major offset of the thread requesting the fault page. Here, the offset may be an integer representing the displacement, within the swap cache area, from the fault page to the last of the pages to be prefetched. The electronic device may obtain the offset of each of a plurality of threads executing in the time interval in which the page fault occurred, and may obtain the major offset by applying the offsets of the plurality of threads to the Leap algorithm. By obtaining the major offset from the offsets of the plurality of threads, the electronic device can efficiently determine the page batch to be prefetched. For example, the electronic device may increase the cache hit ratio by determining the page batch to be prefetched through a major offset that reflects the host's virtual memory offsets accessed by the threads.
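The text describes the major offset as the result of a majority vote over recorded offsets. One common way to compute a majority element is the Boyer-Moore majority vote, shown below as a minimal sketch. Whether the described device uses exactly this procedure is an assumption, and the function name `major_offset` is hypothetical.

```python
from typing import Iterable, Optional


def major_offset(offsets: Iterable[int]) -> Optional[int]:
    """Return the offset that occurs in more than half of the records.

    Uses a Boyer-Moore majority vote followed by a verification pass.
    Returns None when no offset holds a strict majority.
    """
    history = list(offsets)
    candidate: Optional[int] = None
    count = 0
    for off in history:
        if count == 0:
            # Adopt a new candidate when the previous one is cancelled out.
            candidate, count = off, 1
        elif off == candidate:
            count += 1
        else:
            count -= 1
    # The vote only yields a candidate; confirm it is a true majority.
    if candidate is not None and history.count(candidate) * 2 > len(history):
        return candidate
    return None
```

For the offset history `[2, 2, 3, 2, 5, 2, 2]` the function returns 2, since 2 appears in five of seven records. For `[1, 2, 3]` no offset holds a majority, so the result is `None` and a caller would need a fallback policy, which the source does not specify.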

In step 230, the electronic device may determine, based on the major offset, a first page batch to be prefetched by the host and a second page batch to be prefetched by the smart network interface card. For example, the electronic device may select the pages to prefetch based on the major offset. The pages to prefetch may include the fault page and may include other pages in the swap cache area. The electronic device may determine the first page batch from among the pages selected on the basis of the major offset, based on at least one of an expected access time to the remote memory server (e.g., the remote memory server 150 of FIG. 1) or the urgency of the fault page. The electronic device may determine the selected pages other than the first page batch to be the second page batch. The urgency of the fault page may indicate the degree to which the fault page must be handled quickly by the host. The expected access time to the remote memory server may indicate the time that at least one of the host or the smart network interface card of the electronic device is expected to take to access the remote memory server. The expected access time to the remote memory server is described in detail later with reference to FIG. 7.
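The split into a host batch and a smart NIC batch might look like the sketch below. The concrete policy is an assumption made for illustration: the host always takes the urgent fault page plus as many further pages as fit within a time budget, given the expected access time per page, and the smart NIC takes the rest. The function name `split_batches` and the `host_budget_us` parameter are hypothetical; the source only says that the expected access time or the urgency of the fault page informs the split.

```python
from typing import List, Tuple


def split_batches(
    fault_page: int,
    major_offset: int,
    expected_access_us: float,
    host_budget_us: float,
) -> Tuple[List[int], List[int]]:
    """Split the pages to prefetch into (first_batch, second_batch).

    Pages are identified by page number. The selected pages run from
    the fault page to fault_page + major_offset, matching the offset
    definition as the displacement to the last page to be prefetched.
    """
    if expected_access_us <= 0:
        raise ValueError("expected_access_us must be positive")
    step = 1 if major_offset >= 0 else -1
    selected = [fault_page + i * step for i in range(abs(major_offset) + 1)]
    # Number of pages the host can fetch within its budget; the fault
    # page itself is always served by the host because it is urgent.
    affordable = int(host_budget_us // expected_access_us)
    host_count = max(1, min(len(selected), affordable))
    return selected[:host_count], selected[host_count:]
```

With a fault at page 100, a major offset of 4, an expected access time of 5 microseconds, and a host budget of 10 microseconds, the host batch is pages 100 and 101 and the smart NIC batch is pages 102 to 104.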

In step 240, the electronic device may transmit, to the remote memory server, a request batch based on the metadata for the determined first page batch and the metadata for the determined second page batch. For example, the electronic device may transmit the request batch corresponding to the first page batch to the remote memory server through the RDMA executor of the host, and may transmit the request batch corresponding to the second page batch to the remote memory server through the RDMA executor of the smart network interface card. The request batch may include the information with which the electronic device requests a page batch from the remote memory server.

In step 250, the host may receive the first page batch and the smart network interface card may receive the second page batch from the remote memory server responding to the request. When the second page batch is received from the remote memory server, the electronic device may deliver to the host those pages of the second page batch that are to be stored in the cache memory area of the host. The electronic device may update, in the thread information table of the host, the hit information of the threads corresponding to the delivered pages. The electronic device may then perform prefetching based on the received page batches. By transmitting the request batch of step 240 to the remote memory server, the electronic device reads the fault page synchronously and reads the pages to be prefetched asynchronously, which may reduce CPU overhead.
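The combination of a synchronous read of the fault page with asynchronous reads of the prefetch pages can be illustrated with a thread pool. This is only an analogy in ordinary Python: `fetch_remote` is a stand-in for an RDMA read and returns fake contents, and `handle_miss` is a hypothetical name. In the described device the asynchronous part is carried out by the smart network interface card rather than by host threads.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List, Tuple


def fetch_remote(page_no: int) -> str:
    """Stand-in for an RDMA read; returns fake page contents."""
    return f"page-{page_no}"


def handle_miss(
    fault_page: int,
    prefetch_pages: List[int],
    executor: ThreadPoolExecutor,
) -> Tuple[str, Dict[int, "Future[str]"]]:
    """Read the fault page synchronously and the others asynchronously."""
    # Kick off the prefetch reads first so they proceed in the background.
    pending = {p: executor.submit(fetch_remote, p) for p in prefetch_pages}
    # The faulting thread waits only for the page it actually needs.
    fault_data = fetch_remote(fault_page)
    return fault_data, pending
```

A caller would create the pool once, for example `with ThreadPoolExecutor(max_workers=2) as pool:`, call `handle_miss(7, [8, 9], pool)`, resume the faulting thread with the returned data, and collect each future's `result()` later when the prefetched pages are placed in the cache.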

FIG. 3 is a diagram illustrating a host and a smart network interface card of a page fault handling electronic device according to an embodiment.

An electronic device (e.g., the electronic device 100 of FIG. 1) according to an embodiment may include a host 300 and a smart network interface card 320.

The host 300 may include a cache information table 302, a thread information table 304, a cache memory region 306, a write memory region 308, an RDMA offloader 310, and an RDMA executor 312.

The cache information table 302 may store metadata about pages that the host 300 prefetches and pages that it evicts. The thread information table 304 may include at least one of the virtual memory offsets of the host accessed by a thread or hit information (a hit metric). The cache memory region 306 may include batches for offloading a request batch (e.g., the request batch of FIG. 2) to the smart network interface card 320 and page batches received from a remote memory server (e.g., the remote memory server 150 of FIG. 1). The write memory region 308 may include at least one of pages written from the host 300 to the remote memory server or pages written from the smart network interface card 320 to the remote memory server. The RDMA offloader 310 may transmit the request batch for the page batch to be prefetched to the smart network interface card 320. The RDMA executor 312 may perform remote direct memory access (RDMA) to the remote memory server.

The smart network interface card 320 may include a prefetcher 322, a scheduler 324, an RDMA executor 332, a cache memory region 326, and a write memory region 328.

From the determined second page batch, the prefetcher 322 may determine the pages to be stored in the cache memory region 306 of the host 300 and the pages to be stored in the cache memory region 326 of the smart network interface card 320. The prefetcher 322 may also receive the request batch for the page batch to be prefetched, sent by the RDMA offloader 310 of the host 300. The scheduler 324 may dynamically schedule the delivery of the request batch to the remote memory server. The scheduler is described below with reference to FIG. 7. Upon receiving the request batch from the host 300, the RDMA executor 332 may perform remote direct memory access (RDMA) to the remote memory server. In other words, the RDMA executor 332 may receive page batches from the remote memory server on behalf of the host 300.

FIG. 4 is a diagram illustrating a host and a smart network interface card of an electronic device in a cache hit situation according to an embodiment.

An electronic device (e.g., the electronic device 100 of FIG. 1) according to an embodiment may obtain the page on which a fault occurred (the fault page) from a cache memory region. The electronic device may prefetch the obtained fault page.

For example, the electronic device may use metadata about the fault page to obtain the caching record of the fault page stored in the cache information table 404 of the host 400. The caching record of the fault page may include a record of whether the fault page is stored in a cache memory region. Based on this caching record, the electronic device may return the fault page to the host 400 from the cache memory region in which it is stored.

Specifically, the cache memory region in which the fault page is stored may be at least one of the cache memory region 406 of the host 400 or the cache memory region 412 of the smart network interface card 410. If the fault page is stored in the cache memory region 412 of the smart network interface card 410, the electronic device may return the fault page from the cache memory region 412 to the host 400 by means of the RDMA offloader 408. Thereafter, the electronic device may update the hit information (hit metric) of the thread for the fault page, which is stored in the thread information table 402. By updating the hit information of the thread, the electronic device may record an offset for another prefetching operation.
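
The cache hit path described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class `Device`, the labels `HOST` and `NIC`, and the use of plain dictionaries for the tables and cache regions are all hypothetical, and the transfer performed by the RDMA offloader is stood in for by a dictionary lookup.

```python
from dataclasses import dataclass, field

HOST, NIC = "host", "nic"  # hypothetical labels for where a page is cached


@dataclass
class Device:
    cache_info: dict = field(default_factory=dict)   # page -> HOST or NIC (caching record)
    host_cache: dict = field(default_factory=dict)   # host cache memory region
    nic_cache: dict = field(default_factory=dict)    # SmartNIC cache memory region
    thread_hits: dict = field(default_factory=dict)  # thread id -> hit count

    def handle_fault(self, thread_id, page):
        """Return the cached page data, or None on a cache miss."""
        where = self.cache_info.get(page)
        if where is None:
            return None  # no caching record: the cache miss path takes over
        if where == HOST:
            data = self.host_cache[page]
        else:
            # Assumption: stands in for the RDMA offloader returning the page to the host.
            data = self.nic_cache[page]
        # Update the hit information of the faulting thread.
        self.thread_hits[thread_id] = self.thread_hits.get(thread_id, 0) + 1
        return data
```

A `None` result corresponds to the case in which no metadata about the fault page exists in the cache information table.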

FIGS. 5 and 6 are diagrams illustrating a host and a smart network interface card of an electronic device in a cache miss situation according to an embodiment.

An electronic device (e.g., the electronic device 100 of FIG. 1) according to an embodiment may perform remote direct memory access (RDMA) to a remote memory server (e.g., the remote memory server 150 of FIG. 1) through the smart network interface card 610. For example, the electronic device may receive, through the smart network interface card 610, a page batch (e.g., a plurality of pages including the fault page) stored in the remote memory server. The electronic device may prefetch the page batches received through the smart network interface card 610.

When metadata about the fault page does not exist in the cache information table 502 of the host 500, the electronic device according to an embodiment may decide to perform prefetching accompanied by the smart network interface card 610. The electronic device may obtain a major offset by applying the offset record 504 of each of a plurality of threads running on the host 500 to the leaf algorithm, which is a majority voting algorithm.
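
The text characterizes the leaf algorithm only as a majority voting algorithm. As an illustration of that idea, the sketch below applies the well-known Boyer-Moore majority vote to a record of offsets; treating this as the voting step is an assumption, and the function name `major_offset` and the flat list input are hypothetical.

```python
def major_offset(offsets):
    """Return the offset that appears in more than half of the record, or None.

    Assumption: Boyer-Moore majority vote stands in for the voting step
    of the leaf algorithm, which the text does not spell out.
    """
    candidate, count = None, 0
    for off in offsets:
        if count == 0:
            candidate, count = off, 1
        elif off == candidate:
            count += 1
        else:
            count -= 1
    # Second pass: confirm the surviving candidate is a true majority.
    if candidate is not None and offsets.count(candidate) * 2 > len(offsets):
        return candidate
    return None
```

For a record such as `[2, 2, 1, 2, 3, 2, 2]`, the offset 2 accounts for five of seven entries and is returned; a record without a majority yields `None`, in which case a caller might skip prefetching.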

The electronic device may determine a page batch 506 based on the major offset. The page batch 506 may include a first page batch to be prefetched by the host 500 and a second page batch to be prefetched by the smart network interface card. Thereafter, from the determined second page batch, the electronic device may determine the pages to be stored in the cache memory region of the host and the pages to be stored in the cache memory region of the smart network interface card.
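
A minimal sketch of this split is shown below. It assumes that candidate pages are generated by stepping from the fault page in increments of the major offset and that the host takes a fixed number of leading pages; the text does not fix either rule, so `split_batches`, `total`, and `host_count` are hypothetical.

```python
def split_batches(fault_page, major_offset, total, host_count):
    """Split the pages to prefetch into a host batch and a SmartNIC batch.

    Assumptions: pages are addressed by integer page numbers, and the
    first `host_count` pages (starting with the fault page) go to the host.
    """
    pages = [fault_page + i * major_offset for i in range(total)]
    first_batch = pages[:host_count]    # prefetched by the host
    second_batch = pages[host_count:]   # prefetched by the smart network interface card
    return first_batch, second_batch
```

In practice `host_count` would be derived from the expected access time to the remote memory server, which this sketch leaves to the caller.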

The electronic device may perform remote direct memory access (RDMA), through the RDMA executor 508, for the first page batch, which includes the fault page. For example, the electronic device may transmit a request batch corresponding to the first page batch to the remote memory server through the RDMA executor 508 of the host 500. However, the embodiment is not limited thereto; the electronic device may transmit a request batch based on metadata for the first page batch and the second page batch to the smart network interface card, so that the smart network interface card performs the remote direct memory access (RDMA).

The electronic device according to an embodiment may transmit a request batch based on metadata for the first page batch and/or the second page batch to the smart network interface card 610. For example, the electronic device may deliver the metadata for the first page batch and/or the second page batch to the prefetcher 612 of the smart network interface card 610 through the RDMA offloader 604 of the host 600.

Through the RDMA executor 614 of the smart network interface card 610, the electronic device may have the smart network interface card 610 perform remote direct memory access (RDMA) in place of the host 600. For example, the electronic device may transmit a request batch corresponding to the first page batch and/or the second page batch to the remote memory server through the RDMA executor 614 of the smart network interface card 610.

When the first page batch and/or the second page batch is received from the remote memory server, the electronic device may deliver to the host 600 those pages of the second page batch that are to be stored in the cache memory region 602 of the host 600. The electronic device may update, in the thread information table of the host, the hit information of the threads corresponding to the delivered pages.

By having the smart network interface card 610 perform remote direct memory access (RDMA) in place of the host 600, the electronic device may increase the cache hit ratio and reduce CPU overhead. For example, having the smart network interface card perform RDMA reduces the number of instructions to be processed by the host 600 (e.g., instructions related to remote direct memory access), which reduces CPU overhead. In addition, by generating a page batch that includes more pages, the electronic device may prefetch more pages and thereby increase the cache hit ratio.

FIG. 7 is a diagram illustrating network scheduling of a host and a smart network interface card of a page fault handling electronic device according to an embodiment.

An electronic device (e.g., the electronic device 100 of FIG. 1) according to an embodiment may perform network scheduling for remote direct memory access (RDMA). For example, network scheduling for RDMA may mean scheduling the operations in which the electronic device transmits a request batch to a remote memory server (e.g., the remote memory server 150 of FIG. 1) and receives a page batch from the remote memory server that responded to the request.

Based on the interval of the pages stored in the remote memory server, the electronic device may obtain an expected latency time from the scheduler of the smart network interface card 710 (e.g., the scheduler 324 of FIG. 3). The scheduler may schedule the operations in which the electronic device transmits the request batch to the remote memory server and receives the page batch from the remote memory server that responded to the request. The expected latency time may represent the time by which a response from the remote memory server to a request from at least one of the host 700 or the smart network interface card 710 of the electronic device is delayed.

When the expected latency time exceeds a predetermined threshold time, the electronic device may transmit the requests included in the request batch to the remote memory server individually. By dynamically transmitting the request batch to the remote memory server based on the expected latency time, the electronic device may reduce the overhead caused by the remote direct memory access process and protect application traffic. In addition, based on the expected access time, the electronic device may determine, from the page batch, the first page batch to be prefetched by the host 700.
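
The threshold rule above can be illustrated with a short sketch. The function name `plan_dispatch`, the microsecond units, and the representation of a dispatch plan as a list of groups are assumptions made for illustration only.

```python
def plan_dispatch(requests, expected_latency_us, threshold_us):
    """Decide how a request batch is sent to the remote memory server.

    Returns a list of groups; each group is transmitted as one unit.
    Assumption: latencies are expressed in microseconds.
    """
    if expected_latency_us > threshold_us:
        # Expected latency too high: send every request on its own.
        return [[r] for r in requests]
    # Otherwise keep the whole batch together.
    return [list(requests)]
```

With three requests and an expected latency above the threshold, the plan contains three single-request groups; at or below the threshold it contains one group of three.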

The electronic device may have the smart network interface card 710 receive the first page batch and/or the second page batch from the remote memory server that responded to the request. The electronic device may change the rate at which page batches are received based on a sending queue 718. For example, the electronic device may measure the number of requests in the sending queue 718 of the smart network interface card 710. When the measured number of requests exceeds a predetermined threshold, the electronic device may change the rate at which the first page batch and/or the second page batch is received. For example, the electronic device may obtain an exponentially weighted moving average (EWMA) from the measured number of requests. When the EWMA exceeds the threshold, the electronic device may change the rate at which pages to be prefetched are requested or received until the EWMA falls below the threshold.
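
A minimal sketch of the EWMA check follows. The text does not give a smoothing factor or a threshold, so `alpha`, `threshold`, and the class name `SendQueueMonitor` are hypothetical; the update rule is the standard one, ewma = alpha * sample + (1 - alpha) * ewma.

```python
class SendQueueMonitor:
    """Tracks an EWMA of the number of requests in the sending queue."""

    def __init__(self, alpha, threshold):
        self.alpha = alpha          # assumed smoothing factor in (0, 1]
        self.threshold = threshold  # assumed queue-depth threshold
        self.ewma = 0.0

    def observe(self, queued_requests):
        # Standard EWMA update with the newly measured queue depth.
        self.ewma = self.alpha * queued_requests + (1 - self.alpha) * self.ewma
        return self.ewma

    def should_throttle(self):
        # Slow down prefetch requests while the average stays above the threshold.
        return self.ewma > self.threshold
```

A caller would sample the queue depth periodically, call `observe`, and reduce the prefetch request rate for as long as `should_throttle()` returns `True`.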

Specifically, the smart network interface card 710 may include a page batch to be written 714 and a page batch to be read 716. The page batch to be written 714 may include pages stored in the write memory region 712 of the smart network interface card 710. The page batch to be read 716 may include those pages of the page batch 702 that are to be prefetched by the smart network interface card 710. When the number of requests in the sending queue 718 exceeds a predetermined threshold, the electronic device may delay requests for the page batch to be written 714. In other words, the electronic device may reduce requests for the page batch to be written 714 until the number of requests in the sending queue 718 falls below the predetermined threshold, thereby minimizing the delay of requests for the page batch to be read 716. However, the embodiment is not limited thereto; the electronic device may also minimize the delay of requests for the page batch to be read 716 by adjusting the number or size of pages to be prefetched based on the number of requests in the sending queue 718.
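
The write-delay rule can be sketched as a small selection function. Serving pending reads before writes is an assumption added for illustration (the text only states that writes are delayed while the sending queue is over the threshold), and `pick_next` and its parameters are hypothetical names.

```python
from collections import deque


def pick_next(read_q, write_q, send_queue_len, threshold):
    """Choose the next request to post to the sending queue.

    Writes are held back while the sending queue holds more requests
    than the threshold; reads are never held back.
    """
    writes_allowed = send_queue_len <= threshold
    if read_q:
        # Assumption: pending reads are always served before writes.
        return ("read", read_q.popleft())
    if writes_allowed and write_q:
        return ("write", write_q.popleft())
    return None  # nothing to send, or writes are currently delayed
```

For example, with an empty read queue and a congested sending queue the function returns `None`, so queued writes wait until the queue drains.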

FIG. 8 is a diagram illustrating page eviction by a host and a smart network interface card of a page fault handling electronic device according to an embodiment.

An electronic device (e.g., the electronic device 100 of FIG. 1) according to an embodiment may write, to the write memory region 802 of the host 800, the pages to be evicted from among the pages included in at least one of the received first page batch or second page batch. For example, after writing the pages to be evicted to the write memory region 802 of the host 800, the electronic device may update the information on those pages in the cache information table. The electronic device may generate an eviction request batch based on metadata for the pages to be evicted.

The electronic device may transmit the eviction request batch to the smart network interface card 810. For example, the eviction request batch may be delivered to the write memory region 812 of the smart network interface card 810 by the RDMA offloader 804 of the host 800. The electronic device may transmit the eviction request batch to the smart network interface card 810 when the number of pages stored in the write memory region 802 exceeds a predetermined threshold.

By means of the RDMA executor 814 of the smart network interface card 810, the electronic device may transmit the pages corresponding to the eviction request batch to a remote memory server (e.g., the remote memory server 150 of FIG. 1). For example, the electronic device may evict the pages by transmitting and writing the pages corresponding to the eviction request batch to the remote memory server. Once the eviction has been performed, the electronic device may delete the information on the evicted pages from the cache information table.
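
The eviction flow of FIG. 8 can be sketched as follows. This is an illustrative model only: the class `EvictionPath`, the `"evicting"` marker, and the dictionaries standing in for the write memory region and the remote memory server are hypothetical, and the hand-off through the RDMA offloader and RDMA executor is collapsed into a single `flush` step.

```python
class EvictionPath:
    """Models staging evicted pages and flushing them to remote memory."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold  # assumed page-count threshold
        self.cache_info = {}         # page -> state (cache information table)
        self.host_write_region = {}  # page -> data staged for eviction
        self.remote = {}             # stands in for the remote memory server

    def evict(self, page, data):
        # Stage the page in the host write memory region and record it.
        self.host_write_region[page] = data
        self.cache_info[page] = "evicting"
        if len(self.host_write_region) > self.flush_threshold:
            self.flush()

    def flush(self):
        # The eviction request batch is built from page metadata (here, page numbers).
        batch = list(self.host_write_region)
        for page in batch:
            # Assumption: stands in for the SmartNIC writing the page via RDMA.
            self.remote[page] = self.host_write_region.pop(page)
            # Eviction finished: drop the page from the cache information table.
            del self.cache_info[page]
```

With `flush_threshold=1`, the first evicted page is only staged; the second pushes the staged count over the threshold and both pages are written out together.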

The embodiments described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using a general-purpose computer or a special-purpose computer, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and software applications that run on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described as being used, but those skilled in the art will recognize that a processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on a computer-readable recording medium.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination, and the program instructions recorded on the medium may be specially designed and configured for the embodiment or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described above with reference to a limited set of drawings, those skilled in the art may apply various technical modifications and variations based on them. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A method of handling a page fault, performed by an electronic device, the method comprising:
determining, based on metadata about a fault page of a host, whether to perform prefetching accompanied by a smart network interface card (Smart NIC);
based on a determination to perform the prefetching, obtaining a major offset by applying a record of virtual memory offsets of the host accessed by a thread to a leaf algorithm;
determining, based on the major offset, a first page batch to be prefetched by the host and a second page batch to be prefetched by the smart network interface card;
transmitting, to a remote memory server, a request batch based on metadata for the determined first page batch and metadata for the determined second page batch; and
receiving, by the host, the first page batch and, by the smart network interface card, the second page batch from the remote memory server that responded to the request.

The method of claim 1, wherein determining whether to perform the prefetching comprises:
obtaining a caching record of the fault page stored in a page information table of the host;
based on the caching record, returning the fault page to the host from a cache memory region in which the fault page is stored; and
updating, in a thread information table of the host, hit information of a thread corresponding to the returned fault page.

The method of claim 1, wherein determining whether to perform the prefetching comprises:
determining to perform prefetching accompanied by the smart network interface card when metadata about the fault page does not exist in a cache information table of the host.

The method of claim 1, wherein obtaining the major offset comprises:
obtaining, for each of a plurality of threads running on the host, an offset record of the virtual memory of the host accessed by that thread; and
obtaining the major offset by applying the offset record of each of the plurality of threads to the leaf algorithm, which is a majority voting algorithm.

The method of claim 1, wherein determining the first page batch and the second page batch comprises:
selecting pages to be prefetched based on the major offset;
determining the first page batch from among the selected pages based on an expected access time to the remote memory server; and
determining, as the second page batch, the selected pages other than the determined first page batch.

The method of claim 5, further comprising:
determining, from the determined second page batch, pages to be stored in a cache memory region of the host and pages to be stored in a cache memory region of the smart network interface card; and
delivering metadata about the second page batch to a prefetcher of the smart network interface card through a remote direct memory access (RDMA) offloader of the host.

The method of claim 1, wherein transmitting the request batch to the remote memory server comprises:
transmitting the request batch corresponding to the first page batch to the remote memory server through an RDMA executor of the host; and
transmitting the request batch corresponding to the second page batch to the remote memory server through an RDMA executor of the smart network interface card.

The method of claim 7, further comprising:
obtaining an expected latency time from a scheduler of the smart network interface card based on an interval of pages stored in the remote memory server; and
transmitting the requests included in the request batch to the remote memory server individually when the expected latency time exceeds a predetermined threshold time.

The method of claim 1, wherein receiving the first page batch and the second page batch comprises:
measuring the number of requests included in a sending queue of the smart network interface card; and
changing the rate of reception of at least one of the first page batch or the second page batch when the measured number of requests exceeds a predetermined threshold.

The method of claim 1, wherein receiving the first page batch and the second page batch comprises:
when the second page batch is received from the remote memory server, delivering to the host those pages of the second page batch that are to be stored in a cache memory region of the host; and
updating, in a thread information table of the host, hit information of threads corresponding to the delivered pages.

The page fault handling method of claim 1, further comprising:
writing pages to be evicted, among the pages included in the received first page batch and second page batch, to a write memory region of the host;
transmitting, to the smart network interface card, an eviction request batch based on metadata about the pages to be evicted; and
transmitting, by the smart network interface card, the pages corresponding to the eviction request batch to the remote memory server.
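
The three eviction steps can be sketched as follows. Dictionaries stand in for the host write memory region and for the remote memory server, and the page number is used as the only metadata; these modeling choices and the name `evict_pages` are assumptions, not details from the specification.

```python
from typing import Dict, List


def evict_pages(
    victims: Dict[int, bytes],
    write_region: Dict[int, bytes],
    remote_memory: Dict[int, bytes],
) -> List[int]:
    """Model the eviction path: host staging, request batch, SmartNIC transfer.

    Returns the eviction request batch (page numbers, in ascending order).
    """
    # Host: stage the pages to be evicted in the write memory region.
    write_region.update(victims)
    # Host to SmartNIC: the eviction request batch carries only metadata
    # (here, just the page numbers).
    eviction_batch = sorted(victims)
    # SmartNIC: move each staged page to the remote memory server.
    for page in eviction_batch:
        remote_memory[page] = write_region.pop(page)
    return eviction_batch
```

In this model the host never contacts the remote memory server during eviction; it only fills the write region and hands the metadata batch to the SmartNIC.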

A computer program combined with hardware and stored in a computer-readable recording medium to execute the method of any one of claims 1 to 11.

A page fault handling electronic device comprising:
a memory storing computer-executable instructions;
a processor that accesses the memory and executes the instructions;
a host; and
a smart network interface card (SmartNIC),
wherein the instructions, when executed, cause the processor to:
determine, based on metadata about a fault page of the host, whether to perform prefetching with the smart network interface card;
obtain, based on the determination to perform the prefetching, a major offset by applying a record of the virtual memory offsets of the host accessed by a thread to a leaf algorithm;
determine, based on the major offset, a first page batch to be prefetched by the host and a second page batch to be prefetched by the smart network interface card;
transmit, to a remote memory server, a request batch based on metadata about the determined first page batch and metadata about the determined second page batch; and
receive, at the host, the first page batch from the remote memory server responding to the request batch, and receive, at the smart network interface card, the second page batch.

The electronic device of claim 13,
wherein the processor is configured to:
obtain a caching record of the fault page stored in a page information table of the host;
return, based on the caching record, the fault page to the host from the cache memory area in which the fault page is stored; and
update, in a thread information table of the host, hit information of the thread corresponding to the returned fault page.
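
A minimal sketch of this cache-hit path is given below. The `HostState` layout (a page information table mapping a page to the name of the cache area that holds it, and a thread information table holding a hit counter per thread) is a hypothetical data model chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HostState:
    # Page information table: page number -> name of the cache area holding it.
    page_info: Dict[int, str] = field(default_factory=dict)
    # Cache areas, e.g. "host" and "smartnic": page number -> page contents.
    caches: Dict[str, Dict[int, bytes]] = field(default_factory=dict)
    # Thread information table: thread id -> number of cache hits.
    thread_hits: Dict[int, int] = field(default_factory=dict)


def handle_fault_from_cache(
    state: HostState, thread_id: int, page: int
) -> Optional[bytes]:
    """Serve a fault page from a cache area if a caching record exists."""
    area = state.page_info.get(page)
    if area is None:
        # No caching record: the caller falls back to the prefetch path.
        return None
    data = state.caches[area][page]
    # Record the hit for the faulting thread.
    state.thread_hits[thread_id] = state.thread_hits.get(thread_id, 0) + 1
    return data
```

A return value of `None` corresponds to the case in which no record for the fault page exists and prefetching with the smart network interface card is considered instead.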

The electronic device of claim 13,
wherein the processor is configured to determine to perform prefetching with the smart network interface card when metadata about the fault page does not exist in a cache information table of the host.

The electronic device of claim 13,
wherein the processor is configured to:
obtain, for each of a plurality of threads running on the host, a record of the virtual memory offsets of the host accessed by that thread; and
obtain the major offset by applying the offset record of each of the plurality of threads to the leaf algorithm, which is a majority voting algorithm.
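
The claim identifies the leaf algorithm only as a majority voting algorithm. As a minimal sketch, the Boyer-Moore majority vote is used below as a stand-in, applied to the differences between consecutive offsets in one thread's access record. Both the choice of Boyer-Moore and the use of consecutive differences are assumptions; the function names are hypothetical.

```python
from typing import Iterable, List, Optional


def majority_vote(values: Iterable[int]) -> Optional[int]:
    """Boyer-Moore majority vote.

    Returns the value that makes up more than half of the input,
    or None if no such value exists.
    """
    items = list(values)
    candidate, count = None, 0
    for v in items:
        if count == 0:
            candidate, count = v, 1
        elif v == candidate:
            count += 1
        else:
            count -= 1
    # Second pass: confirm that the candidate is a strict majority.
    if candidate is not None and items.count(candidate) * 2 > len(items):
        return candidate
    return None


def major_offset(offset_history: List[int]) -> Optional[int]:
    """Majority stride between consecutive offsets accessed by one thread."""
    deltas = [b - a for a, b in zip(offset_history, offset_history[1:])]
    return majority_vote(deltas)
```

For the access record `[100, 102, 104, 90, 106, 108]` the consecutive differences are `2, 2, -14, 16, 2`, so `major_offset` returns `2`. Per-thread results can be collected with a dictionary comprehension such as `{tid: major_offset(h) for tid, h in histories.items()}`, where `histories` is a hypothetical mapping from thread id to offset record.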

The electronic device of claim 13,
wherein the processor is configured to:
select pages to be prefetched based on the major offset;
determine the first page batch among the selected pages based on an expected access time for the remote memory server; and
determine the selected pages other than the determined first page batch as the second page batch.
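
The selection and split can be sketched as follows. The claim does not state how the expected access time determines the first page batch; this sketch assumes, purely for illustration, that the host takes as many of the nearest pages as fit into a time budget and leaves the rest to the SmartNIC. The `window` and `host_budget_us` parameters and both function names are hypothetical.

```python
from typing import List, Tuple


def select_prefetch_pages(fault_page: int, major_offset: int, window: int) -> List[int]:
    """Pages at fault_page + k * major_offset for k = 1..window."""
    return [fault_page + k * major_offset for k in range(1, window + 1)]


def split_batches(
    pages: List[int], expected_access_us: float, host_budget_us: float
) -> Tuple[List[int], List[int]]:
    """Split selected pages into (first batch for host, second batch for SmartNIC).

    The host batch holds as many leading pages as fit into the budget,
    given the expected access time per page (assumed policy).
    """
    if expected_access_us <= 0:
        raise ValueError("expected_access_us must be positive")
    host_count = min(len(pages), int(host_budget_us // expected_access_us))
    return pages[:host_count], pages[host_count:]
```

With a fault at page 100, a major offset of 2, and a window of 4, the selected pages are `[102, 104, 106, 108]`; an expected access time of 10 and a budget of 25 then give `[102, 104]` as the first page batch and `[106, 108]` as the second.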

The electronic device of claim 17,
wherein the processor is configured to:
determine, from the determined second page batch, pages to be stored in a cache memory area of the host and pages to be stored in a cache memory area of the smart network interface card; and
pass metadata about the second page batch to a prefetcher of the smart network interface card through a remote direct memory access (RDMA) offloader of the host.

The electronic device of claim 13,
wherein the processor is configured to:
transmit the request batch corresponding to the first page batch to the remote memory server through an RDMA executor of the host; and
transmit the request batch corresponding to the second page batch to the remote memory server through an RDMA executor of the smart network interface card.

The electronic device of claim 19,
wherein the processor is configured to:
obtain an expected latency from a scheduler of the smart network interface card based on an interval of pages stored in the remote memory server; and
individually transmit the requests included in the request batch to the remote memory server when the expected latency exceeds a predetermined threshold time.

The electronic device of claim 13,
wherein the processor is configured to:
measure the number of requests in a send queue of the smart network interface card; and
change a reception rate of at least one of the first page batch or the second page batch when the measured number of requests exceeds a predetermined threshold.

The electronic device of claim 13,
wherein the processor is configured to:
when the second page batch is received from the remote memory server, deliver to the host those pages of the second page batch that are to be stored in a cache memory area of the host; and
update, in a thread information table of the host, hit information of the threads corresponding to the delivered pages.

The electronic device of claim 13,
wherein the processor is configured to:
write pages to be evicted, among the pages included in the received first page batch and second page batch, to a write memory region of the host; and
transmit, to the smart network interface card, an eviction request batch based on metadata about the pages to be evicted,
wherein the smart network interface card transmits the pages corresponding to the eviction request batch to the remote memory server.