Disclosure of Invention
To address the above defects, the technical task of the invention is to provide an optimization method and system for PCIe ECAM address mapping that significantly reduce the number of page table entries, improve the TLB hit rate, and improve the overall memory access efficiency of the system.
The technical scheme adopted for solving the technical problems is as follows:
A method for optimizing PCIe ECAM address mapping, the implementation of the method comprising:
PCIe device region division: a rule for PCIe device region division is formulated, and based on this rule PCIe devices are divided into high-frequency access region devices and low-frequency access region devices according to the device distribution in the PCIe bus topology;
Page table allocation, including static pre-allocation of page tables and dynamic adjustment of page table allocation;
The static pre-allocation of page tables is that, based on the PCIe device region division, devices in the high-frequency access region undergo centralized memory mapping at large-page granularity, and devices in the low-frequency access region undergo independent memory mapping at small-page granularity;
The dynamic adjustment of page table allocation is that an access frequency threshold is set, the access frequency of the PCIe device configuration space is monitored while the system runs, and page tables are merged or split through the memory management unit.
The method provides a mixed-granularity PCIe ECAM address mapping scheme that combines mixed-granularity mapping with PCIe configuration space management. It improves memory access efficiency by flexibly adjusting the mapping granularity of different PCIe bus devices: the PCIe devices are divided into regions, the access frequency of each device on the PCIe bus is monitored, and page table granularities of different sizes (such as 4 KB, 64 KB and 2 MB) are used dynamically in the ECAM address mapping so as to balance memory footprint, TLB efficiency and memory access performance. Traditional fixed-granularity mapping (such as all 4 KB pages) leads to explosive growth of page table entries when the number of devices is large; mixed granularity significantly reduces the number of page table entries by flexibly adjusting the page size, thereby improving the TLB hit rate and the overall memory access efficiency of the system.
Further, the large page uses 64 KB or 2 MB page table granularity, and the small page uses 4 KB page table granularity.
Further, in the PCIe device region division,
PCIe devices with 2 or more downstream devices are divided into high-frequency access region devices, and PCIe devices with fewer than 2 downstream devices are divided into low-frequency access region devices.
Further, the centralized memory mapping refers to summing the address spaces that each high-frequency access region device needs to have mapped, and performing memory space mapping at large-page granularity with the summed result as the mapping size.
Further, in monitoring the access frequency of the PCIe device configuration space while the system runs,
the access frequency and continuity of the PCIe device configuration space are counted through a performance counter (such as perf); a page merging operation is triggered when the access frequency of a device in the low-frequency access region is observed to exceed the set frequency threshold, and a page splitting operation is triggered when the access frequency of a device in the high-frequency access region is observed to fall below the set frequency threshold.
Further, in the page table merging,
when a device in the low-frequency access region is accessed frequently and continuously and its access frequency exceeds the set threshold, the device is re-divided into the high-frequency access region, its allocated small-page space is released, and memory mapping is requested again within the centrally allocated large-page space.
Further, in the page table splitting,
when a device in the high-frequency access region exhibits fragmented access and its access frequency falls below the set threshold, the device is re-divided into the low-frequency access region, its memory mapping space within the large page is released, and independent memory mapping is performed again at small-page granularity.
The invention also claims an optimization system for PCIe ECAM address mapping, which comprises a PCIe device region division module and a page table allocation module,
The PCIe device region division module is used for formulating the rule of PCIe device region division and, based on the rule, dividing PCIe devices into high-frequency access region devices and low-frequency access region devices according to the device distribution in the PCIe bus topology;
The page table allocation module comprises static pre-allocation of page tables and dynamic adjustment of page table allocation;
based on the PCIe device region division, devices in the high-frequency access region undergo centralized memory mapping at large-page granularity, and devices in the low-frequency access region undergo independent memory mapping at small-page granularity;
an access frequency threshold is set, the access frequency of the PCIe device configuration space is monitored during system operation, and page tables are merged or split through the memory management unit;
The system specifically realizes the optimization of PCIe ECAM address mapping through the above method.
The invention also claims an optimization device for PCIe ECAM address mapping, which comprises at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to implement the method described above.
The invention also claims a computer readable medium having stored thereon computer instructions which, when executed by a processor, implement the above-described method.
Compared with the prior art, the optimization method and the optimization system for PCIe ECAM address mapping have the following beneficial effects:
1. Through the mixed-granularity optimization method for PCIe ECAM address mapping, the number of page table entries occupied by the PCIe ECAM address mapping is reduced, and the memory occupied by page tables is reduced accordingly.
2. Through the mixed-granularity optimization method for PCIe ECAM address mapping, a large page enables a single TLB (translation lookaside buffer) entry to cover a larger address range, improving the TLB hit rate.
3. Through the mixed-granularity optimization method for PCIe ECAM address mapping, a large page covers more device configuration space, the number of page table levels traversed is reduced, and the efficiency of configuration register access is improved.
Detailed Description
The invention will be further illustrated with reference to specific examples.
The embodiment of the invention provides an optimization method for PCIe ECAM address mapping, which divides PCIe devices into regions, monitors the access frequency of each device on the PCIe bus, and dynamically uses page table granularities of different sizes (such as 4 KB, 64 KB and 2 MB) in the ECAM address mapping so as to balance memory footprint, TLB efficiency and access performance. Traditional fixed-granularity mapping (such as all 4 KB pages) leads to explosive growth of page table entries when the number of devices is large; mixed granularity significantly reduces the number of page table entries by flexibly adjusting the page size, thereby improving the TLB hit rate and the overall memory access efficiency of the system.
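For illustration only, assuming one full ECAM segment covering 256 buses with the standard 4 KB of configuration space per function, the reduction in page table entries can be estimated as follows:
256 buses x 32 devices x 8 functions x 4 KB = 256 MB of ECAM space
256 MB / 4 KB page  = 65,536 page table entries (fixed small-page mapping)
256 MB / 64 KB page = 4,096 page table entries
256 MB / 2 MB page  = 128 page table entries
A mixed-granularity mapping falls between these bounds, since only the low-frequency access region retains 4 KB mappings.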
The specific implementation of the mixed-granularity PCIe ECAM address mapping method is as follows:
1. PCIe device region division:
A rule for PCIe device region division is formulated, and based on the rule PCIe devices are divided into high-frequency access region devices and low-frequency access region devices according to the device distribution in the PCIe bus topology.
2. Page table allocation including static pre-allocation of page tables and dynamic adjustment of page table allocation.
Based on the PCIe device region division, devices in the high-frequency access region undergo centralized memory mapping at large-page (64 KB/2 MB) granularity, and devices in the low-frequency access region undergo independent memory mapping at small-page (4 KB) granularity.
The dynamic adjustment of page table allocation is that an access frequency threshold is set, the access frequency of the PCIe device configuration space is monitored while the system runs, and page tables are merged or split through the memory management unit.
In order to enable those skilled in the art to better understand and practice the present method, the following describes the implementation of the present method in detail with reference to Figs. 1 and 2. Embodiments of the method include the steps of:
S1, PCIe device region division.
A rule for device region division is formulated, and PCIe devices are divided into high-frequency access region devices and low-frequency access region devices according to the device distribution in the PCIe bus topology.
As shown in Fig. 2, for the device distribution in the PCIe bus topology and based on the preset rule, PCIe devices with 2 or more downstream devices are divided into high-frequency access region devices, and PCIe devices with fewer than 2 downstream devices are divided into low-frequency access region devices. The device region division result is as follows, and an illustrative sketch of the division rule is given after the list:
High-frequency access region devices: A, B, D, H.
Low-frequency access region devices: E, F, G, L, M, N, I, J, P, Q, C, K.
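A minimal sketch of the division rule in step S1 is given below in C. The types and names are hypothetical and do not correspond to the Linux kernel's struct pci_dev; only the classification rule itself follows the embodiment.

/* Illustrative sketch only: classification of PCIe devices into the
 * high-frequency and low-frequency access regions of step S1.
 * The structure and its fields are hypothetical. */
#include <stddef.h>

enum ecam_region { ECAM_HIGH_FREQ, ECAM_LOW_FREQ };

struct ecam_dev {
    const char *name;           /* e.g. "A", "B", ... as in Fig. 2 */
    size_t downstream_count;    /* number of downstream devices in the topology */
    enum ecam_region region;
};

/* Devices with 2 or more downstream devices go to the high-frequency
 * access region; all others go to the low-frequency access region. */
static void ecam_classify(struct ecam_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].region = (devs[i].downstream_count >= 2)
                             ? ECAM_HIGH_FREQ
                             : ECAM_LOW_FREQ;
}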
S2, static pre-allocation of page tables.
Based on the device region division result in step S1, devices A, B, D and H in the high-frequency access region undergo centralized memory mapping at large-page (64 KB/2 MB) granularity, where centralized memory mapping refers to summing the address spaces that the four devices A, B, D and H need mapped, and performing memory space mapping at large-page granularity with the summed result as the mapping size.
Devices E, F, G, L, M, N, I, J, P, Q, C and K in the low-frequency access region undergo independent memory mapping at small-page (4 KB) granularity.
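A minimal sketch of the static pre-allocation in step S2 is shown below. The map_config_space() helper and the constants are assumptions introduced for illustration; a real implementation would go through the kernel's ECAM/ioremap paths.

/* Illustrative sketch of step S2. map_config_space() is a hypothetical
 * helper that maps 'size' bytes of ECAM space at 'phys' using page table
 * entries of 'page_size' granularity. */
#include <stddef.h>
#include <stdint.h>

#define ECAM_FN_SIZE   0x1000UL      /* 4 KB of configuration space per function */
#define SMALL_PAGE     0x1000UL      /* 4 KB */
#define LARGE_PAGE     0x200000UL    /* 2 MB (64 KB is the alternative large page) */

void *map_config_space(uint64_t phys, size_t size, size_t page_size);  /* hypothetical */

static size_t round_up(size_t v, size_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/* High-frequency region: sum the configuration space of devices A, B, D, H
 * and map the total collectively at large-page granularity. */
void *map_high_freq_region(uint64_t ecam_base, const size_t *cfg_sizes, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += cfg_sizes[i];
    return map_config_space(ecam_base, round_up(total, LARGE_PAGE), LARGE_PAGE);
}

/* Low-frequency region: each device is mapped independently at 4 KB granularity. */
void *map_low_freq_dev(uint64_t dev_cfg_phys)
{
    return map_config_space(dev_cfg_phys, ECAM_FN_SIZE, SMALL_PAGE);
}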
S3, dynamically adjusting page table allocation.
An access frequency threshold is set, a monitoring module monitors the access frequency of the PCIe device configuration space in real time while the system runs, and page tables are merged or split through the memory management unit in the Linux operating system kernel.
The monitoring module counts the access frequency and continuity of the PCIe device configuration space through a performance counter (such as perf). When the access frequency of a device in the low-frequency access region is observed to exceed the set frequency threshold, a page merging operation is triggered, and when the access frequency of a device in the high-frequency access region is observed to fall below the set frequency threshold, a page splitting operation is triggered.
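The threshold decision made by the monitoring module can be sketched as follows. The counter value is assumed to come from a perf-style sampling window, and the trigger functions are hypothetical placeholders for the merge and split operations described below.

/* Illustrative sketch of the threshold check in step S3. All names are hypothetical. */
enum ecam_region { ECAM_HIGH_FREQ, ECAM_LOW_FREQ };

struct ecam_dev_stats {
    enum ecam_region region;        /* current region of the device */
    unsigned long accesses;         /* config-space accesses in the sampling window */
};

void trigger_page_merge(struct ecam_dev_stats *st);   /* hypothetical, see page merging below */
void trigger_page_split(struct ecam_dev_stats *st);   /* hypothetical, see page splitting below */

static void ecam_check_threshold(struct ecam_dev_stats *st, unsigned long threshold)
{
    if (st->region == ECAM_LOW_FREQ && st->accesses > threshold)
        trigger_page_merge(st);      /* promote the device to the high-frequency region */
    else if (st->region == ECAM_HIGH_FREQ && st->accesses < threshold)
        trigger_page_split(st);      /* demote the device to the low-frequency region */
}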
Page merging: if devices C and K are accessed frequently and continuously and their access frequency exceeds the set threshold, the devices are re-divided into the high-frequency access region, their allocated small-page space is released, and memory mapping is requested again within the centrally allocated large-page space.
Page splitting: if device H in the large-page region exhibits fragmented access and its access frequency falls below the set threshold, device H is re-divided into the low-frequency access region, its memory mapping space within the large page is released, and independent memory mapping is performed again at small-page granularity.
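For completeness, the merge and split paths can be sketched as follows. The map/unmap helpers are the same hypothetical interface as in the step S2 sketch; in practice the remapping would be carried out through the memory management unit support in the Linux kernel.

/* Illustrative sketch of page merging and page splitting in step S3.
 * The helpers and constants are hypothetical. */
#include <stddef.h>
#include <stdint.h>

#define SMALL_PAGE  0x1000UL      /* 4 KB */

void *map_config_space(uint64_t phys, size_t size, size_t page_size);  /* hypothetical */
void unmap_config_space(void *virt, size_t size);                      /* hypothetical */

/* Page merging: a low-frequency device (e.g. C or K) whose access frequency
 * exceeds the threshold releases its 4 KB mapping and reuses a slot inside
 * the centrally allocated large-page region. */
void *ecam_merge(void *old_small_virt, void *large_region_base, size_t offset_in_region)
{
    unmap_config_space(old_small_virt, SMALL_PAGE);
    return (char *)large_region_base + offset_in_region;
}

/* Page splitting: a high-frequency device (e.g. H) with fragmented,
 * infrequent access abandons its slot in the large-page region and is
 * re-mapped independently at 4 KB granularity; a real implementation would
 * split the underlying huge mapping through the kernel's MMU support. */
void *ecam_split(uint64_t dev_cfg_phys)
{
    return map_config_space(dev_cfg_phys, SMALL_PAGE, SMALL_PAGE);
}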
The method is suitable for scenarios with high-frequency access to PCIe devices (such as data center GPUs and NVMe storage), and can remarkably improve the overall throughput and response speed of the system.
The embodiment of the invention also provides an optimization system for PCIe ECAM address mapping, which comprises a PCIe device region division module and a page table allocation module,
The PCIe device region division module is used for formulating the rule of PCIe device region division and, based on the rule, dividing PCIe devices into high-frequency access region devices and low-frequency access region devices according to the device distribution in the PCIe bus topology;
The page table allocation module comprises static pre-allocation of page tables and dynamic adjustment of page table allocation;
based on the PCIe device region division, devices in the high-frequency access region undergo centralized memory mapping at large-page granularity, and devices in the low-frequency access region undergo independent memory mapping at small-page granularity;
an access frequency threshold is set, the access frequency of the PCIe device configuration space is monitored during system operation, and page tables are merged or split through the memory management unit;
The system specifically optimizes PCIe ECAM address mapping through the optimization method for PCIe ECAM address mapping described in the above embodiment.
The embodiment of the invention also provides an optimization device for PCIe ECAM address mapping, which comprises at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
The at least one processor is configured to invoke the machine-readable program to implement the optimization method for PCIe ECAM address mapping described in the foregoing embodiments.
The embodiment of the invention also provides a computer readable medium, wherein the computer readable medium stores computer instructions, and the computer instructions, when executed by a processor, cause the processor to execute the optimization method for PCIe ECAM address mapping described in the above embodiment. Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.
Examples of storage media for providing program code include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs, DVD+RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer by a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it is understood that the program code read out from the storage medium may be written into a memory provided in an expansion board inserted into a computer or into a memory provided in an expansion unit connected to the computer, and then a CPU or the like mounted on the expansion board or the expansion unit is caused to perform part or all of the actual operations based on the instructions of the program code, thereby realizing the functions of any of the above embodiments.
While the invention has been illustrated and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the disclosed embodiments, and it will be appreciated by those skilled in the art that the features of the various embodiments described above may be combined to produce further embodiments of the invention, which are also within the scope of the invention.