
WO2018113030A1 - Technology to implement a bifurcated non-volatile memory express driver

Technology to implement a bifurcated non-volatile memory express driver

Info

Publication number
WO2018113030A1
Authority
WO
WIPO (PCT)
Prior art keywords
driver
kernel
memory
queue
bifurcated
Prior art date
Application number
PCT/CN2016/113701
Other languages
English (en)
Inventor
Changpeng LIU
Cunyin CHANG
Ziye Yang
Qihua DAI
Original Assignee
Intel Corporation
Priority date
Filing date
Publication date
Application filed by Intel Corporation
Priority to CN201680091055.5A (CN109983443B)
Priority to DE112016007538.3T (DE112016007538T5)
Publication of WO2018113030A1

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10Program control for peripheral devices
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659Command handling arrangements, e.g. command buffers, queues, command scheduling
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0679Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/4401Bootstrapping
    • G06F9/4411Configuring for operating with peripheral devices; Loading of device drivers
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026PCI express

Definitions

  • Embodiments generally relate to non-volatile memory technology. More particularly, embodiments relate to providing high input-output performance using a bifurcated non-volatile memory (NVM) express (NVMe) driver to manage one or more of a file system storage and an NVMe device storage.
  • NVMe technology allows levels of parallelism found in modern solid state drives (SSDs) to be utilized by host hardware and software.
  • NVMe technology may be used to store data in data centers, reduce input-output overhead and provide various performance improvements, including multiple long command queues and reduced latency, in comparison to previous logical device interfaces.
  • Software stacks for existing NVMe solutions may be challenged to satisfy ever-increasing performance requirements (e.g., power consumption and processing capacities) due to resource-demanding applications (e.g., read intensive applications and/or write intensive applications).
  • FIG. 1 is an illustration of an example of a bifurcated non-volatile memory (NVM) express (NVMe) driver architecture according to an embodiment
  • FIG. 2 is a flowchart of an example of a method of initializing a kernel space driver according to an embodiment
  • FIG. 3 is a flowchart of an example of a method of initializing a bifurcated driver according to an embodiment
  • FIG. 4 is a block diagram of an example of high read transactions per second (TPS) for a database application according to an embodiment
  • FIG. 5 is a block diagram of an example of an application with large file access usage according to an embodiment
  • FIG. 6 is a block diagram of an example of two independent applications sharing an NVMe device according to an embodiment
  • FIG. 7 is a block diagram of an example of a processor according to an embodiment.
  • FIG. 8 is a block diagram of an example of a computing system according to an embodiment.
  • FIG. 1 shows an illustration of an example of a bifurcated non-volatile memory (NVM) express (NVMe) driver architecture 100 according to an embodiment.
  • The bifurcated NVMe driver architecture 100 may combine a user space NVMe polling mode driver (e.g., INTEL’s storage performance development kit) with a traditional kernel driver that may work in interrupt mode, in order to mitigate central processing unit (CPU) utilization bottlenecks due to NVMe solid state drive (SSD) performance boosts and to leverage performance improvements provided by a fast storage backend (e.g., an NVMe storage device and/or NVMe over fabric (NVMF)).
  • the bifurcated NVMe driver architecture 100 may include a host memory 102, including one or more of a volatile memory or a non-volatile memory.
  • the non-volatile memory may include a file system device storage 104 and a block device storage 106 (e.g., one or more NVMe devices) .
  • the bifurcated NVMe driver architecture 100 may further include a memory controller 108 (e.g., an NVMe controller) communicatively coupled to the non-volatile memory, including the file system device storage 104 and the block device storage 106.
  • host memory 102 includes at least some portion of one or more of the file system device storage 104 and the block device storage 106, while another portion of one or more of the file system device storage 104 or the block device storage 106 may be used by one or more applications 110, 112.
  • the memory controller 108 may use a bifurcated driver 114 to process one or more memory access requests to access the non-volatile memory, including one or more of the file system device storage 104 or the block device storage 106.
  • the bifurcated driver 114 may include a kernel space driver 116 communicatively coupled to a kernel file system 118, and a user space driver 120 (e.g., an INTEL Storage Performance Development Kit (SPDK) NVMe driver) communicatively coupled to the block device storage 106 (e.g., one or more NVMe devices) via a block service daemon 122, and configuration tools 124 (e.g., out-of-band configuration tool, or NVMe configuration tool to use a dedicated channel for managing network devices) .
  • the bifurcated NVMe driver architecture 100 may, for out-of-band management, separate data from commands and events so that data may travel through a host-to-controller interface, while commands and events may travel through management port Ethernet cables.
  • the bifurcated NVMe driver architecture 100 may include a driver bifurcator (e.g., configurable logic and/or fixed-functionality hardware logic) to generate the bifurcated driver 114.
  • the driver bifurcator may be executed, managed and/or reside in the kernel (e.g., operating system) .
  • the driver bifurcator may retrieve user configuration information used to configure the bifurcated driver 114, the kernel space driver 116 and the user space driver 120.
  • the user configuration information may include metadata identifying a number of administrative queue pairs 126 (e.g., submission queues and completion queues) , input-output queues 128, 130, 132, 134 and a number of bifurcator namespace pairs, including a kernel namespace 136 and a user namespace 138 to use for one or more of the kernel space driver 116 or the user space driver 120.
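  • As a rough illustration only, the C sketch below models the kind of user configuration record described above, i.e., how the admin queue pair, input-output queues and bifurcator namespace pair might be split between the two drivers; the structure and field names are assumptions for this sketch and are not defined by this document.

```c
/* Hypothetical shape of the user configuration information: how the
 * admin queue pair, input-output queues and bifurcator namespace pair
 * are split between the kernel space driver and the user space driver. */
#include <stdint.h>

struct bifurcation_config {
    uint16_t num_admin_queue_pairs;  /* admin submission/completion queue pairs */
    uint16_t num_kernel_io_queues;   /* IO queue pairs claimed by the kernel space driver */
    uint16_t num_user_io_queues;     /* IO queue pairs claimed by the user space driver */
    uint32_t kernel_namespace_id;    /* namespace assigned to the kernel space driver */
    uint32_t user_namespace_id;      /* namespace assigned to the user space driver */
};
```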
  • One or more memory access requests may be received from one or more applications 110, 112 modified to use the bifurcated driver 114.
  • the one or more applications 110, 112 may include one or more of a read intensive application or a write intensive application.
  • the driver bifurcator may initialize the kernel space driver 116.
  • the kernel space driver 116 may provide support for use of the kernel file system 118 to access a file system storage device, such as for example, block device storage 106.
  • the driver bifurcator may generate a data structure (e.g., a bifurcation data structure) in a shared memory space 140 of the host memory 102.
  • the host memory 102 may include a kernel, wherein the kernel is to manage the kernel space driver via the bifurcated driver.
  • the shared memory space 140 may be accessible by a bifurcator namespace pair, the bifurcator namespace pair including a kernel namespace and a user namespace.
  • the data structure may include data structure metadata including one or more of submission queue physical addresses, completion queue physical addresses, queue size or queue doorbell registers physical address.
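  • A minimal sketch of how such a bifurcation data structure might be laid out in the shared memory space 140 follows; the document only specifies that it carries submission/completion queue physical addresses, queue sizes and doorbell register physical addresses, so all type and field names here are hypothetical.

```c
/* Hypothetical layout of the bifurcation data structure kept in the
 * shared memory space 140: enough queue metadata for the user space
 * driver to attach to queues created by the kernel space driver. */
#include <stdint.h>

#define BIFURCATION_MAX_QUEUES 16   /* illustrative limit, not from the document */

struct bifurcation_queue_info {
    uint64_t sq_phys_addr;        /* submission queue physical address */
    uint64_t cq_phys_addr;        /* completion queue physical address */
    uint16_t queue_size;          /* entries per queue */
    uint64_t doorbell_phys_addr;  /* queue doorbell registers physical address */
};

struct bifurcation_shared_state {
    uint32_t kernel_namespace_id;  /* e.g., kernel namespace 136 */
    uint32_t user_namespace_id;    /* e.g., user namespace 138 */
    uint16_t num_queues;
    struct bifurcation_queue_info queues[BIFURCATION_MAX_QUEUES];
};
```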
  • the shared memory space 140 may be accessible by the bifurcator namespace pair, including the kernel namespace 136 and the user namespace 138.
  • the driver bifurcator may generate, in the host memory 102, the kernel namespace 136, one or more kernel queue pairs 126, 128, 130, the user namespace 138 and one or more user space queue pairs 132, 134.
  • the driver bifurcator may assign the kernel namespace 136 and the one or more kernel queue pairs 126, 128, 130 to the kernel space driver 116, update the data structure with kernel queue pair metadata identifying the kernel namespace 136 and the one or more kernel queue pairs 126, 128, 130, and register the device to the kernel, wherein the device is the NVMe block device storage.
  • the one or more kernel queue pairs 126, 128, 130 may include one or more of an admin queue pair 126 that includes an admin submission queue and an admin completion queue, and one or more of kernel input-output queue pairs 128, 130 that may include a kernel submission queue and a kernel completion queue.
  • the one or more user space queue pairs 132, 134 may comprise a user space submission queue and a user space completion queue.
  • the block device storage 106 may be accessed via the block service daemon 122.
  • the driver bifurcator may initialize, using the data structure metadata, the user space driver 120.
  • the user space driver 120 may be a polling mode driver.
  • the driver bifurcator may retrieve the user configuration information and the data structure metadata, generate the device based on the user configuration information and the data structure metadata, update the data structure with the user space queue pair metadata, and register to the user space driver 120 a block device interface (e.g., the block service daemon 122) to the device.
  • the bifurcated driver 114 and/or kernel may direct read requests and write requests to one or more of the kernel space driver 116 or the user space driver 120 to perform corresponding read operations and/or write operations.
  • the user space driver 120 may be communicatively coupled to the memory controller 108 (e.g., NVMe controller) .
  • the bifurcated driver 114 may receive the one or more memory access requests from the one or more applications 110, 112 to access one or more of the file system device storage or the NVMe block device storage, process the one or more memory access requests, update one or more of the kernel namespace 136, the user namespace 138, the one or more kernel queue pairs 126, 128, 130 or the one or more user space queue pairs 132, 134 based on the one or more memory access requests, and synchronize, using the memory controller 108 (e.g., NVMe controller), the data structure in the shared memory space 140.
  • the bifurcated driver 114, the kernel and/or the kernel space driver 116 may direct the one or more memory access requests to the user space driver 120.
  • the kernel space driver 116 may operate a controller of the user space driver 120.
  • the bifurcated driver 114, the kernel and/or the kernel space driver 116 may direct the one or more memory access requests to the kernel space driver 116.
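  • The read/write routing described above can be pictured with the short C sketch below, in which read requests are handed to the user space (polling) path and write requests to the kernel space (file system) path; the request structure and callback names are illustrative assumptions.

```c
/* Hypothetical request dispatcher for the bifurcated driver: read
 * requests go to the user space driver, write requests go to the
 * kernel space driver, matching the routing described above. */
#include <stdint.h>

enum bif_op { BIF_READ, BIF_WRITE };

struct bif_request {
    enum bif_op op;
    uint64_t lba;        /* logical block address */
    uint32_t num_blocks;
    void *buffer;
};

typedef int (*bif_submit_fn)(struct bif_request *req);

int bif_dispatch(struct bif_request *req,
                 bif_submit_fn submit_user_space,   /* polled user space path */
                 bif_submit_fn submit_kernel_space) /* kernel file system path */
{
    if (req->op == BIF_READ)
        return submit_user_space(req);
    return submit_kernel_space(req);
}
```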
  • Disk read/write commands may be sent to the kernel space driver 116 through the kernel file system 118 and/or to the user space driver 120 (e.g., a SPDK disk driver) through the bifurcated driver 114 based on a user’s configuration information (e.g., one or more applications and application types, including read intensive and/or write intensive type applications).
  • Without implementing the bifurcated driver 114, the memory controller 108 may not be usable (e.g., shared) by another driver (e.g., a kernel driver).
  • An application 110 may use the kernel space driver 116 for file system storage while another application 112 (e.g., a high performance application) may use the user space driver 120 for block device storage.
  • Both application 110 (e.g., a traditional storage application) and application 112 (e.g., a high performance application) may share the same NVMe device, such as block device storage 106.
  • Traditional storage applications, such as application 110 may use portable operating system interface (POSIX) compatible system calls, while high performance applications, such as application 112, may realize higher input/output operations per second (IOPS) with limited CPU usage.
  • the kernel, the bifurcated driver 114 or the kernel space driver 116 may initialize one or more storage devices, including PCI device probing, generation of one or more namespaces and queue pairs, and memory allocation for related data structures.
  • the kernel space driver 116 may claim the bifurcator namespaces 136, 138 and input-output queue pairs 128, 130, 132, 134 through the user’s configuration information. For example, users may allocate the kernel namespace 136 and the kernel queue pairs 126, 128, 130 to the kernel driver, so the kernel driver may probe the kernel namespace 136 and then allocate an interrupt routine function to the kernel queue pairs 126, 128, 130.
  • the kernel space driver 116 may update the PCI device information, including the bifurcator namespace 136, 138 and kernel queue pair 126, 128, 130 information, as well as the user’s configuration information, into the shared memory space 140.
  • When the user space driver 120 is started (e.g., initialized), the user space driver 120 may not perform PCI device initialization work (e.g., tasks), but may read the PCI device information from the shared memory space 140 and generate a logical NVMe device.
  • the user space driver 120 may use specified namespaces information and input-output queues assigned to the user space driver 120.
  • the input-output queues 128, 130, 132, 134 may be used for input-output submission and completion polling.
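  • Because the user space driver polls its input-output queues rather than taking interrupts, a simplified completion-polling loop in the style of an NVMe phase-bit check might look like the following; the entry layout and function signature are assumptions made for this sketch.

```c
/* Simplified completion-queue polling: reap entries whose phase bit
 * matches the expected phase, advance the head, and flip the expected
 * phase on wrap-around. Structure fields are illustrative only. */
#include <stdbool.h>
#include <stdint.h>

struct cq_entry {
    uint16_t command_id;
    uint16_t status;
    bool phase;   /* toggled by the device when a new completion is posted */
};

int poll_completions(volatile struct cq_entry *cq, uint16_t queue_size,
                     uint16_t *head, bool *expected_phase)
{
    int reaped = 0;
    while (cq[*head].phase == *expected_phase) {
        /* ...complete the command identified by cq[*head].command_id... */
        reaped++;
        if (++(*head) == queue_size) {   /* wrap and flip expected phase */
            *head = 0;
            *expected_phase = !*expected_phase;
        }
    }
    return reaped;
}
```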
  • the bifurcated driver 114 runtime overhead may be small or nonexistent, and may improve the disk input-output performance of an application that uses a file system.
  • A kernel space driver 116 and a user space driver 120 (e.g., an NVMe driver) attempting to share an NVMe device, such as, for example, block device storage 106, may operate to support an application 110 that may use a file system 118 (e.g., legacy storage or a file system storage intensive application).
  • NVMe device read performance may be a key factor for application performance based on whether an application is read intensive or write intensive.
  • NVMe device writes may have more file system dependence, in contrast to NVMe device reads.
  • An application that uses more read operations and fewer write operations on an NVMe device, such as, for example, block device storage 106, may take advantage of the bifurcated driver 114.
  • an application, such as, for example, application 112, may be more read intensive than write intensive (e.g., a high performance storage application such as a database application).
  • the application 112 may use a local and/or distributed storage for a hash table in a data deduplication system.
  • a hash key may be written once when there is no duplication, but read each time during hash table lookup. Accordingly, the performance of hash table lookup may be a key factor of the performance of the data deduplication system.
  • the local and/or distributed storage for a database may be used for metadata and/or database raw data.
  • the metadata may be stored once, but read multiple times. And the read performance of the metadata storage may be key for the database performance.
  • the application 112 may configure the bifurcated driver 114 to send read requests to the user space driver (e.g., SPDK NVMe driver) .
  • the bifurcated driver 114 may send write requests to the kernel space file system.
  • NVMe write requests may be directed to the kernel space driver through the file system
  • NVMe read requests may be directed to the user space driver through the block interface, and use the shared memory to synchronize the map of NVMe read/write addresses.
  • the NVMe read performance may be guaranteed by the user space driver.
  • the runtime overhead to maintain the map in the shared memory may be small, in contrast with a traditional kernel space driver handling NVMe reads.
  • the development overhead/cost for NVMe read/write synchronization may be outweighed by the cost and time to develop a user space file system.
  • two applications such as applications 110, 112 may share an NVMe device, such as block device storage 106, where application 110 uses the file system support from the kernel file system 118, and application 112 does not need file system support.
  • the bifurcated driver 114 may direct NVMe read/write commands from application 110 to the kernel space driver 116 to use the kernel file system 118, and direct NVMe read/write commands from application 112 to the user namespace 138 to use the user space driver 120.
  • Using the bifurcated driver 114 may result in no or limited runtime overhead to initialize the configuration, without development overhead/cost. Without the benefits of using a bifurcated driver, such as bifurcated driver 114, two or more applications may be unable to share one or more NVMe devices (e.g., hardware) using multiple namespaces, due to hardware register conflicts between a traditional kernel driver and a user space driver (e.g., NVMe driver).
  • a user space file system may be developed and implemented to use an NVMe driver (e.g., INTEL SPDK NVMe driver) in the user namespace to attempt to speed up NVMe reads, or read and write operations.
  • developing and implementing a user space file system may come at significant time and cost, with risks to the stability and quality of the file system.
  • when using a traditional kernel space driver to perform NVMe writes and NVMe reads through a mature kernel space file system, NVMe read and/or NVMe read and write operation performance may not be guaranteed.
  • relatively costly components may be required to overcome these deficiencies and to achieve maximum NVMe hardware performance, which may otherwise be addressed using a bifurcated driver, such as bifurcated driver 114.
  • Developing a user space file system and/or using higher performing components may also increase costs due to an inefficient software stack that results from either approach.
  • user space driver 120 may serve one or more virtual machines and may provide block service to an NVMe device, such as block device storage 106, in a storage network.
  • The user space NVMe driver, however, may limit storage to block interface type devices.
  • kernel space driver 116 may be used for features provided by a kernel (e.g., file system support) and usable by applications, such as application 110.
  • applications designed for high performance reads with low CPU utilization and no or limited file system requirements, such as application 112, may use the user space driver 120, while neither the kernel space driver 116 nor the user space driver 120 alone may satisfy requirements for applications that may need both relatively equal file system support (e.g., for write intensive operations) and block level device support.
  • FIG. 2 is a flowchart of an example of a method of initializing a kernel space driver according to an embodiment.
  • the method 200 may be implemented as a module or related component in a set of logic instructions stored in a non-transitory machine- or computer-readable storage medium such as random access memory (RAM), read only memory (ROM), programmable ROM (PROM), firmware, flash memory, etc., in configurable logic such as, for example, programmable logic arrays (PLAs), field programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), in fixed-functionality hardware logic using circuit technology such as, for example, application specific integrated circuit (ASIC), complementary metal oxide semiconductor (CMOS) or transistor-transistor logic (TTL) technology, or any combination thereof.
  • computer program code to carry out operations shown in the method 200 may be written in any combination of one or more programming languages, including an object oriented programming language such as JAVA, SMALLTALK, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
  • Illustrated processing block 202 provides for initializing the kernel driver coupled to a kernel file system.
  • Illustrated processing block 204 provides for loading user’s configuration information, including queue and namespace splitting information.
  • Illustrated processing block 206 provides for generating multiple NVMe hardware queue pairs, and when namespaces do not exist, generating one or more namespaces (e.g., one or more bifurcator namespace pairs) , in a data structure in a shared memory space of a host memory.
  • the host memory may include one or more of a volatile memory or a non-volatile memory, wherein the host memory is to include a kernel, wherein the kernel is to manage the kernel space driver via the bifurcated driver.
  • the non-volatile memory may include a file system device storage and a NVMe block device storage.
  • the shared memory space may be accessible by a bifurcator namespace pair, including a kernel namespace and a user namespace, one or more kernel queue pairs, the user namespace and one or more user space queue pairs.
  • Illustrated processing block 206 further provides for generating, in the host memory, the kernel namespace, one or more kernel queue pairs, the user namespace and one or more user space queue pairs.
  • Illustrated processing block 208 provides for assigning the kernel namespace and the one or more kernel queue pairs to the kernel space driver.
  • Illustrated processing block 210 provides for allocating an input-output queue interrupt routine for the command completion process.
  • Illustrated processing block 212 provides for assigning queue pairs to the user space driver.
  • Illustrated processing block 214 provides for exporting a memory controller identity structure (e.g., NVMe controller identity structure) , input-output queue structure and namespace identity data structure to shared memory, and updating the data structure (s) with kernel queue pair metadata identifying the kernel namespace and the one or more kernel queue pairs.
  • Illustrated processing block 216 provides for registering the device to the kernel, wherein the device is the NVMe block device storage.
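  • The kernel-side initialization flow of method 200 may be summarized by the stubbed C sketch below, in which each helper is a hypothetical placeholder for the processing block named in its message rather than a real kernel or SPDK interface.

```c
/* Stubbed walk-through of method 200 (blocks 202-216): each helper is
 * an illustrative placeholder for the step named in its message. */
#include <stdio.h>

static int load_user_config(void)             { puts("202/204: init kernel driver, load queue/namespace split"); return 0; }
static int create_queues_and_namespaces(void) { puts("206: generate NVMe queue pairs and bifurcator namespaces in shared memory"); return 0; }
static int assign_kernel_resources(void)      { puts("208: assign kernel namespace and kernel queue pairs"); return 0; }
static int install_irq_routine(void)          { puts("210: allocate IO queue interrupt routine for completions"); return 0; }
static int assign_user_queues(void)           { puts("212: assign remaining queue pairs to the user space driver"); return 0; }
static int export_to_shared_memory(void)      { puts("214: export controller/namespace identity and queue metadata"); return 0; }
static int register_block_device(void)        { puts("216: register the NVMe block device to the kernel"); return 0; }

int main(void)
{
    if (load_user_config() || create_queues_and_namespaces() ||
        assign_kernel_resources() || install_irq_routine() ||
        assign_user_queues() || export_to_shared_memory() ||
        register_block_device())
        return 1;
    return 0;
}
```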
  • the bifurcated driver may be as described above regarding bifurcated driver 114, as shown in FIG. 1.
  • the method 300 may be implemented as a module or related component in a set of logic instructions stored in a non-transitory machine-or computer-readable storage medium such as RAM, ROM, PROM, firmware, flash memory, etc., in configurable logic such as, for example, PLAs, FPGAs, CPLDs, in fixed-functionality hardware logic using circuit technology such as, for example, ASIC, CMOS or TTL technology, or any combination thereof.
  • Illustrated processing block 302 provides for initializing the user space driver, using the data structure metadata from the shared memory space.
  • Illustrated processing block 304 provides for loading/retrieving from shared memory user’s configuration information and memory controller identity information (e.g., NVMe controller identity information) including, for example, the data structure metadata, including the memory controller identity structure (e.g., NVMe controller identity structure) .
  • Illustrated processing block 306 provides for loading the input-output queue data structure, input-output queue structure and namespace identity data structure, and kernel queue pair metadata.
  • Illustrated processing block 308 provides for generating a device (e.g., an NVMe block device) based on the user configuration information and the data structure metadata (e.g., namespace information) .
  • the user space driver may be a polling mode driver.
  • Illustrated processing block 310 provides for polling the thread for input-output and completion updates, and updating the data structure with the user space queue pair metadata.
  • Illustrated processing block 312 provides for registering to the user space driver a block device interface (e.g. block service daemon) to the device, wherein the device is the NVMe block device storage.
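  • The user-space initialization flow of method 300 may likewise be summarized by the stubbed C sketch below; note that no PCI device initialization step appears, since the user space driver consumes the metadata already exported to shared memory. All helpers are hypothetical placeholders.

```c
/* Stubbed walk-through of method 300 (blocks 302-312): the user space
 * driver attaches to metadata already exported by the kernel space
 * driver instead of performing PCI device initialization itself. */
#include <stdio.h>

static int load_shared_identity(void)     { puts("302/304: read user config and controller identity from shared memory"); return 0; }
static int load_queue_metadata(void)      { puts("306: load IO queue and namespace identity structures"); return 0; }
static int create_logical_device(void)    { puts("308: generate the logical NVMe block device"); return 0; }
static int start_polling(void)            { puts("310: start IO submission/completion polling, publish user queue metadata"); return 0; }
static int register_block_interface(void) { puts("312: register the block device interface (block service daemon)"); return 0; }

int main(void)
{
    if (load_shared_identity() || load_queue_metadata() ||
        create_logical_device() || start_polling() ||
        register_block_interface())
        return 1;
    return 0;
}
```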
  • FIG. 4 is a block diagram 400 of an example of high read transactions per second (TPS) for a database application according to an embodiment.
  • the bifurcated driver 402 may provide the user space driver 404 and kernel space driver 406 simultaneous access to an NVMe device, such as described above regarding block device storage 106, as shown in FIG. 1, to provide high TPS for a database application 408.
  • the user space driver 404 and kernel space driver 406 may share a namespace 410 of the device, so that the user space driver 404 and kernel space driver 406 may read/write a logical block addressing (LBA) (e.g., offset of the disk capacity, specifying the location of blocks of data stored at a storage device) .
  • the kernel space driver 406 may provide read/write support using a kernel file system 412 for small sized dedicated database application files (e.g., average file capacity <1GB), such as write ahead log files 414 and redo log files 416.
  • database application 408 may use the kernel space driver 406 for writes, and the database application 408 may use the user space driver 404 for reads.
  • Directing which of the user space driver 404 and the kernel space driver 406 is used may be based on one or more of the read intensity or write intensity of an application and/or the files used by the application, to increase read TPS (e.g., for read intensive workloads).
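  • One hedged way to express the FIG. 4 routing policy in code is a small per-application configuration record, as sketched below; the structure, field names and the namespace identifier value are illustrative assumptions, not part of the described system.

```c
/* Hypothetical per-application routing for the FIG. 4 scenario: the
 * database's write-ahead/redo log writes go through the kernel space
 * driver and file system, while reads go through the user space driver;
 * both drivers share the same namespace (e.g., namespace 410). */
#include <stdint.h>

enum driver_path { PATH_KERNEL_FS, PATH_USER_SPACE };

struct app_routing {
    const char *app_name;
    uint32_t shared_namespace_id;   /* namespace shared by both drivers */
    enum driver_path read_path;
    enum driver_path write_path;
};

static const struct app_routing database_routing = {
    .app_name            = "database",
    .shared_namespace_id = 1,               /* illustrative value */
    .read_path           = PATH_USER_SPACE, /* high-TPS reads, polled */
    .write_path          = PATH_KERNEL_FS,  /* WAL/redo log files via kernel file system */
};
```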
  • FIG. 5 is a block diagram 500 of an example of an application 502 with large file access usage according to an embodiment.
  • a bifurcated driver 504 may include a user space driver 506 and a kernel space driver 508.
  • the application 502 may use large file access to store (e.g., writes) and access (e.g., reads) metadata in a key-value engine 510, which may be local as opposed to remote.
  • the block device storage may be as described above regarding block device storage 106, as shown in FIG. 1.
  • FIG. 6 is a block diagram 600 of an example of two independent applications 602, 604 sharing an NVMe device according to an embodiment.
  • the NVMe device may be as described above regarding block device storage 106, as shown in FIG. 1.
  • the bifurcated driver 606 may include a kernel space driver 608 with kernel file system 610 support and a user space driver 612, and provide and/or generate multiple logical partitions, including a kernel namespace 614 and a user namespace 616, assigned to respective applications 602, 604 serviced by the kernel space driver 608 and the user space driver 612.
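  • A comparable sketch for the FIG. 6 scenario pins each application to its own logical partition (namespace) so the two drivers never contend for the same hardware registers; again, every name and value below is an illustrative assumption.

```c
/* Hypothetical namespace partitioning for the FIG. 6 scenario: each
 * application gets its own logical partition, serviced either by the
 * kernel space driver (with file system support) or the user space driver. */
#include <stdint.h>

struct namespace_assignment {
    const char *app_name;
    uint32_t namespace_id;
    int uses_kernel_file_system;   /* 1 = kernel space driver, 0 = user space driver */
};

static const struct namespace_assignment assignments[] = {
    { "legacy-app",           1, 1 },  /* application 602: kernel namespace 614 */
    { "high-performance-app", 2, 0 },  /* application 604: user namespace 616 */
};
```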
  • FIG. 7 is a block diagram 700 of an example of a processor core 701 according to one embodiment.
  • the processor core 701 may be the core for any type of processor, such as a micro-processor, an embedded processor, a digital signal processor (DSP) , a network processor, or other device to execute code. Although only one processor core 701 is illustrated in FIG. 7, a processing element may alternatively include more than one of the processor core 701 illustrated in FIG. 7.
  • the processor core 701 may be a single-threaded core or, for at least one embodiment, the processor core 701 may be multithreaded in that it may include more than one hardware thread context (or “logical processor” ) per core.
  • FIG. 7 also illustrates a memory 707 coupled to the processor core 701.
  • the memory 707 may be any of a wide variety of memories (including various layers of memory hierarchy) as are known or otherwise available to those of skill in the art.
  • the memory 707 may include one or more code 713 instruction(s) to be executed by the processor core 701, wherein the code 713 may implement the method 200 (FIG. 2) and/or method 300 (FIG. 3), already discussed.
  • the processor core 701 follows a program sequence of instructions indicated by the code 713. Each instruction may enter a front end portion 711 and be processed by one or more decoders 721.
  • the decoder 721 may generate as its output a micro operation such as a fixed width micro operation in a predefined format, or may generate other instructions, microinstructions, or control signals which reflect the original code instruction.
  • the illustrated front end portion 711 also includes register renaming logic 725 and scheduling logic 731, which generally allocate resources and queue the operation corresponding to the convert instruction for execution.
  • the processor core 701 is shown including execution logic 751 having a set of execution units 755-1 through 755-N. Some embodiments may include a number of execution units dedicated to specific functions or sets of functions. Other embodiments may include only one execution unit or one execution unit that may perform a particular function.
  • the illustrated execution logic 751 performs the operations specified by code instructions.
  • back end logic 761 retires the instructions of the code 713.
  • the processor core 701 allows out of order execution but requires in order retirement of instructions.
  • Retirement logic 765 may take a variety of forms as known to those of skill in the art (e.g., re-order buffers or the like) . In this manner, the processor core 701 may be transformed during execution of the code 713, at least in terms of the output generated by the decoder, the hardware registers and tables utilized by the register renaming logic 725, and any registers (not shown) modified by the execution logic 751.
  • a processing element may include other elements on chip with the processor core 701.
  • a processing element may include memory control logic along with the processor core 701.
  • the processing element may include I/O control logic and/or may include I/O control logic integrated with memory control logic.
  • the processing element may also include one or more caches.
  • the memory may be external to the processor (e.g., external memory) , and/or may be coupled to the processor by, for example, a memory bus.
  • the memory may be implemented as main memory.
  • the memory may include, for example, volatile memory, non-volatile memory, and so on, or combinations thereof.
  • the memory may include dynamic random access memory (DRAM) configured as one or more memory modules such as, for example, dual inline memory modules (DIMMs), small outline DIMMs (SODIMMs), etc., and read-only memory (ROM) (e.g., programmable read-only memory (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), etc.).
  • the memory may include an array of memory cells arranged in rows and columns, partitioned into independently addressable storage locations.
  • the processor and/or operating system may use a secondary memory storage with the memory to improve performance, capacity and flexibility.
  • Non-limiting examples of non-volatile memory may include any or a combination of: solid state memory (such as planar or 3-dimensional (3D) NAND flash memory or NOR flash memory), 3D cross point memory, storage devices that use chalcogenide phase change material (e.g., chalcogenide glass), byte addressable non-volatile memory devices, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory (e.g., ferroelectric polymer memory), ferroelectric transistor random access memory (Fe-TRAM), ovonic memory, nanowire memory, electrically erasable programmable read-only memory (EEPROM), other various types of non-volatile random access memories (RAMs), and magnetic storage memory.
  • 3D cross point memory may comprise a transistor-less stackable cross point architecture in which memory cells sit at the intersection of word lines and bit lines and are individually addressable and in which bit storage is based on a change in bulk resistance.
  • a memory module with non-volatile memory may comply with one or more standards promulgated by the Joint Electron Device Engineering Council (JEDEC), such as JESD218, JESD219, JESD220-1, JESD223B, JESD223-1, or other suitable standard (the JEDEC standards cited herein are available at jedec.org).
  • Volatile memory is a storage medium that requires power to maintain the state of data stored by the medium.
  • volatile memory may include various types of random access memory (RAM) , such as dynamic random access memory (DRAM) or static random access memory (SRAM) .
  • DRAM of the memory modules complies with a standard promulgated by JEDEC, such as JESD79F for Double Data Rate (DDR) SDRAM, JESD79-2F for DDR2 SDRAM, JESD79-3F for DDR3 SDRAM, or JESD79-4A for DDR4 SDRAM (these standards are available at www.jedec.org).
  • Such standards (and similar standards) may be referred to as DDR- based standards and communication interfaces of the storage devices that implement such standards may be referred to as DDR-based interfaces.
  • FIG. 8 shows a block diagram of a computing system 800 in accordance with an embodiment. Shown in FIG. 8 is a multiprocessor system 800 that includes a first processing element 870 and a second processing element 880. While two processing elements 870 and 880 are shown, it is to be understood that an embodiment of the system 800 may also include only one such processing element.
  • the system 800 is illustrated as a point-to-point interconnect system, wherein the first processing element 870 and the second processing element 880 are coupled via a point-to-point interconnect 850. It should be understood that any or all of the interconnects illustrated in FIG. 8 may be implemented as a multi-drop bus rather than point-to-point interconnect.
  • each of processing elements 870 and 880 may be multicore processors, including first and second processor cores (i.e., processor cores 874a and 874b and processor cores 884a and 884b) .
  • Such cores 874a, 874b, 884a, 884b may be configured to execute instruction code in a manner similar to that discussed above in connection with FIG. 7.
  • the cores may execute one or more instructions such as a read instruction, a write instruction, an erase instruction, a move instruction, an arithmetic instruction, a control instruction, and so on, or combinations thereof.
  • the cores may, for example, execute one or more instructions to move data (e.g., program data, operation code, operand, etc.).
  • the instructions may include any code representation such as, for example, binary code, octal code, and/or hexadecimal code (e.g., machine language) , symbolic code (e.g., assembly language) , decimal code, alphanumeric code, higher-level programming language code, and so on, or combinations thereof.
  • hexadecimal code may be used to represent an operation code (e.g., opcode) of an x86 instruction set including a byte value “00” for an add operation, a byte value “8B” for a move operation, a byte value “FF” for an increment/decrement operation, and so on.
  • Each processing element 870, 880 may include at least one shared cache 899a, 899b.
  • the shared cache 899a, 899b may store data (e.g., instructions) that are utilized by one or more components of the processor, such as the cores 874a, 874b and 884a, 884b, respectively.
  • the shared cache 899a, 899b may locally cache data stored in a memory 832, 834 for faster access by components of the processor.
  • the shared cache 899a, 899b may include one or more mid-level caches, such as level 2 (L2) , level 3 (L3) , level 4 (L4) , or other levels of cache, a last level cache (LLC) , and/or combinations thereof.
  • While shown with only two processing elements 870, 880, it is to be understood that the scope of the embodiments is not so limited. In other embodiments, one or more additional processing elements may be present in a given processor. Alternatively, one or more of processing elements 870, 880 may be an element other than a processor, such as an accelerator or a field programmable gate array.
  • additional processing element(s) may include additional processor(s) that are the same as the first processor 870, additional processor(s) that are heterogeneous or asymmetric to the first processor 870, accelerators (such as, e.g., graphics accelerators or digital signal processing (DSP) units), field programmable gate arrays, or any other processing element.
  • processing elements 870, 880 may reside in the same die package.
  • the first processing element 870 may further include memory controller logic (MC) 872 and point-to-point (P-P) interfaces 876 and 878.
  • the second processing element 880 may include a MC 882 and P-P interfaces 886 and 888.
  • MCs 872 and 882 couple the processors to respective memories, namely a memory 832 and a memory 834, which may be portions of main memory locally attached to the respective processors. While the MCs 872 and 882 are illustrated as integrated into the processing elements 870, 880, for alternative embodiments the MC logic may be discrete logic outside the processing elements 870, 880 rather than integrated therein.
  • the first processing element 870 and the second processing element 880 may be coupled to an I/O subsystem 890 via P-P interconnects 876 and 886, respectively.
  • the I/O subsystem 890 includes P-P interfaces 894 and 898.
  • I/O subsystem 890 includes an interface 892 to couple I/O subsystem 890 with a high performance graphics engine 838.
  • bus 849 may be used to couple the graphics engine 838 to the I/O subsystem 890.
  • a point-to-point interconnect may couple these components.
  • I/O subsystem 890 may be coupled to a first bus 816 via an interface 896.
  • the first bus 816 may be a Peripheral Component Interconnect (PCI) bus, or a bus such as a PCI Express bus or another third generation I/O interconnect bus, although the scope of the embodiments is not so limited.
  • various I/O devices 814 may be coupled to the first bus 816, along with a bus bridge 818 which may couple the first bus 816 to a second bus 820.
  • the second bus 820 may be a low pin count (LPC) bus.
  • Various devices may be coupled to the second bus 820 including, for example, a keyboard/mouse 812, communication device (s) 826, and a data storage unit 819 such as a disk drive or other mass storage device which may include code 830, in one embodiment.
  • the illustrated code 830 may implement the method 200 (FIG. 2) and/or method 300 (FIG. 3) , already discussed, and may be similar to the code 713 (FIG. 7) , already discussed.
  • an audio I/O 824 may be coupled to second bus 820 and a battery 810 may supply power to the computing system 800.
  • instead of the point-to-point architecture of FIG. 8, a system may implement a multi-drop bus or another such communication topology.
  • the elements of FIG. 8 may alternatively be partitioned using more or fewer integrated chips than shown in FIG. 8.
  • Example 1 may include a system comprising a host memory including one or more of a volatile memory or a non-volatile memory, and a memory controller communicatively coupled to the host memory, the memory controller including a bifurcated driver to process one or more memory access requests to access the memory, wherein the bifurcated driver includes a kernel space driver communicatively coupled to a kernel file system, and a user space driver communicatively coupled to a device.
  • Example 2 may include the system of Example 1, further comprising a driver bifurcator to generate the bifurcated driver, wherein the driver bifurcator retrieves user configuration information used to configure the bifurcated driver, wherein the user configuration information includes a number of input-output queues and a number of namespaces to use for the kernel space driver and the user space driver, wherein the one or more memory access requests are to be received from one or more applications modified to use the bifurcated driver, and wherein the one or more applications include one or more of a read intensive application or a write intensive application.
  • Example 3 may include the system of Example 2, wherein the driver bifurcator is to initialize the kernel space driver, wherein the driver bifurcator is to generate a data structure in a shared memory space of the host memory, wherein the host memory is to include a kernel, wherein the kernel is to manage the kernel space driver via the bifurcated driver, wherein the shared memory space is accessible by a namespace pair, the bifurcator namespace pair including a kernel namespace and a user namespace, wherein the non-volatile memory includes a file system device storage and a non-volatile memory (NVM) express (NVMe) block device storage.
  • the driver bifurcator is to further generate, in the host memory, the kernel namespace, one or more kernel queue pairs, the user namespace and one or more user space queue pairs, assign the kernel namespace and the one or more kernel queue pairs to the kernel space driver, update the data structure with kernel queue pair metadata identifying the kernel namespace and the one or more kernel queue pairs, and register the device to the kernel, wherein the device is the NVMe block device storage.
  • Example 4 may include the system of Example 3, wherein the data structure includes data structure metadata including one or more of submission queue physical addresses, completion queue physical addresses, queue size or queue doorbell registers physical address.
  • Example 5 may include the system of Example 3, wherein the one or more kernel queue pairs include one or more of an admin queue pair that includes an admin submission queue and an admin completion queue, and one or more of a kernel input-output queue pair that includes a kernel submission queue and a kernel completion queue, and wherein the one or more user space queue pairs comprise a user space submission queue and a user space completion queue.
  • Example 6 may include the system of Example 3, wherein the driver bifurcator is to initialize, using the data structure metadata, the user space driver, wherein the user space driver is a polling mode driver, wherein the driver bifurcator is to retrieve the user configuration information and the data structure metadata, generate the device based on the user configuration information and the data structure metadata, update the data structure with the user space queue pair metadata, and register to the user space driver a block device interface to the device.
  • Example 7 may include the system of Example 3, wherein the bifurcated driver is communicatively coupled to the memory controller, and wherein the bifurcated driver is to receive the one or more memory access requests from the one or more applications to access one or more of the file system device storage or the NVMe block device storage, process, using the bifurcated driver, the one or more memory access requests, update one or more of the kernel namespace, the user namespace, the one or more kernel queue pairs or the one or more user space queue pairs based on the one or more memory access requests, and synchronize, using the memory controller, the data structure.
  • Example 8 may include the system of Example 3, wherein the volatile memory includes random access memory (RAM), including one or more of dynamic random access memory (DRAM), static random access memory (SRAM), or synchronous dynamic random access memory (SDRAM), and wherein the non-volatile memory includes non-volatile random access memories including one or more of solid state memory, 3D cross point memory, one or more storage devices using chalcogenide phase change material, byte addressable non-volatile memory devices, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory including ferroelectric polymer memory, ferroelectric transistor random access memory (Fe-TRAM), ovonic memory, nanowire memory, or electrically erasable programmable read-only memory (EEPROM), wherein the solid state memory includes one or more of planar or 3-dimensional (3D) NAND flash memory or NOR flash memory, and wherein the 3D cross point memory comprises a transistor-less stackable cross point architecture in which memory cells sit at the intersection of word lines and bit lines and are individually addressable and in which bit storage is based on a change in bulk resistance.
  • Example 9 may include the system of Example 8, wherein, when the one or more memory access requests is a read request, the bifurcated driver is to direct the one or more memory access requests to the user space driver.
  • Example 10 may include the system of any one of Examples 2 to 8, wherein, when the one or more memory access requests is a write request, the bifurcated driver is to direct the one or more memory access requests to the kernel space driver.
  • Example 11 may include a method of sharing a non-volatile memory (NVM) express (NVMe) device comprising generating, using a processor coupled to a host memory, a bifurcated driver, wherein the host memory is to include one or more of a volatile memory or a non-volatile memory, wherein the bifurcated driver includes a kernel space driver communicatively coupled to a kernel file system, and a user space driver communicatively coupled to a device, and processing, using the bifurcated driver, one or more memory access requests to access the non-volatile memory.
  • Example 12 may include the method of Example 11, further comprising receiving the one or more memory access requests from one or more applications modified to use the bifurcated driver, wherein generating the bifurcated driver comprises retrieving, using a driver bifurcator, user configuration information used to configure the bifurcated driver, wherein the user configuration information includes a number of input-output queues and a number of namespaces to use for the kernel space driver and the user space driver, wherein the one or more applications include one or more of a read intensive application or a write intensive application, and wherein the one or more memory access requests include one or more of a read request or a write request.
  • Example 13 may include the method of Example 12, further comprising initializing the kernel space driver, comprising generating a data structure in a shared memory space of the host memory, wherein the host memory is to include a kernel, wherein the kernel is to manage the kernel space driver via the bifurcated driver, wherein the shared memory space is accessible by a namespace pair, including a kernel namespace and a user namespace, wherein the non-volatile memory includes a file system device storage and a NVMe block device storage.
  • Initializing the kernel space driver may further comprise generating, in the host memory, the kernel namespace, one or more kernel queue pairs, the user namespace and one or more user space queue pairs, assigning the kernel namespace and the one or more kernel queue pairs to the kernel space driver, updating the data structure with kernel queue pair metadata identifying the kernel namespace and the one or more kernel queue pairs, and registering the device to the kernel, wherein the device is the NVMe block device storage.
  • Example 14 may include the method of Example 13, wherein the data structure includes data structure metadata including one or more of submission queue physical addresses, completion queue physical addresses, queue size or queue doorbell registers physical address.
  • Example 15 may include the method of Example 13, wherein the one or more kernel queue pairs include one or more of an admin queue pair that includes an admin submission queue and an admin completion queue, and one or more of a kernel input-output queue pair that includes a kernel submission queue and a kernel completion queue, and wherein the one or more user space queue pairs comprise a user space submission queue and a user space completion queue.
  • Example 16 may include the method of Example 13, further comprising initializing, using the data structure metadata, the user space driver, wherein the user space driver is a polling mode driver, initializing the user space driver comprising retrieving the user configuration information and the data structure metadata, generating the device based on the user configuration information and the data structure metadata, updating the data structure with the user space queue pair metadata, and registering to the user space driver a block device interface to the device.
  • Example 17 may include the method of Example 13, further comprising receiving, by the bifurcated driver, the one or more memory access requests from the one or more applications to access one or more of the file system device storage or the NVMe block device storage, respectively, processing, using the bifurcated driver, the one or more memory access requests, updating, using a memory controller communicatively coupled to the bifurcated driver, one or more of the kernel namespace, the user namespace, the one or more kernel queue pairs or the one or more user space queue pairs based on the one or more memory access requests, and synchronizing, using the memory controller, the data structure.
  • Example 18 may include the method of Example 13, wherein the volatile memory includes random access memory (RAM), including one or more of dynamic random access memory (DRAM), static random access memory (SRAM), or synchronous dynamic random access memory (SDRAM), and wherein the non-volatile memory includes non-volatile random access memories including one or more of solid state memory, 3D cross point memory, one or more storage devices using chalcogenide phase change material, byte addressable non-volatile memory devices, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory including ferroelectric polymer memory, ferroelectric transistor random access memory (Fe-TRAM), ovonic memory, nanowire memory, or electrically erasable programmable read-only memory (EEPROM), wherein the solid state memory includes one or more of planar or 3-dimensional (3D) NAND flash memory or NOR flash memory, and wherein the 3D cross point memory comprises a transistor-less stackable cross point architecture in which memory cells sit at the intersection of word lines and bit lines and are individually addressable and in which bit storage is based on a change in bulk resistance.
  • Example 19 may include the method of any one of Examples 12 to 18, further comprising: when the one or more memory access requests is a read request, directing, using the bifurcated driver, the one or more memory access requests to the user space driver, and when the one or more memory access requests is a write request, directing, using the bifurcated driver, the one or more memory access requests to the kernel space driver.
  • Example 20 may include at least one computer readable storage medium comprising a set of instructions, which when executed by a computing device, cause the computing device to generate, using a processor coupled to a host memory, a bifurcated driver, wherein the host memory is to include one or more of a volatile memory or a non-volatile memory, wherein the bifurcated driver includes a kernel space driver communicatively coupled to a kernel file system, and a user space driver communicatively coupled to a device, and process, using the bifurcated driver, one or more memory access requests to access the non-volatile memory.
  • Example 21 may include the at least one computer readable storage medium of Example 20, wherein the instructions, when executed, cause a computing device to receive the one or more memory access requests from one or more applications modified to use the bifurcated driver, generate, using a driver bifurcator, the bifurcated driver comprising retrieving user configuration information used to configure the bifurcated driver, wherein the user configuration information includes a number of input-output queues and a number of namespaces to use for the kernel space driver and the user space driver, wherein the one or more applications include one or more of a read intensive application or a write intensive application, and wherein the one or more memory access requests include one or more of a read request or a write request.
  • Example 22 may include the at least one computer readable storage medium of Example 21, wherein the instructions, when executed, further cause the computing device to initialize the kernel space driver, wherein initializing the kernel space driver further causes the computing device to generate a data structure in a shared memory space of the host memory, wherein the host memory is to include a kernel, wherein the kernel is to manage the kernel space driver via the bifurcated driver, wherein the shared memory space is accessible by a namespace pair, including a kernel namespace and a user namespace, wherein the non-volatile memory includes a file system device storage and a non-volatile memory (NVM) express (NVMe) block device storage.
  • To initialize the kernel space driver, the instructions, when executed, may further cause the computing device to generate, in the host memory, the kernel namespace, one or more kernel queue pairs, the user namespace and one or more user space queue pairs, assign the kernel namespace and the one or more kernel queue pairs to the kernel space driver, update the data structure with kernel queue pair metadata identifying the kernel namespace and the one or more kernel queue pairs, and register the file system device storage to the kernel, wherein the device is the NVMe block device storage.
  • Example 23 may include the at least one computer readable storage medium of Example 22, wherein the data structure includes data structure metadata including one or more of submission queue physical addresses, completion queue physical addresses, queue sizes, or queue doorbell register physical addresses (a sketch of one possible metadata layout appears after these examples).
  • Example 24 may include the at least one computer readable storage medium of Example 22, wherein the one or more kernel queue pairs include one or more of an admin queue pair comprising an admin submission queue and an admin completion queue, and one or more of a kernel input-output queue pair comprising a kernel submission queue and a kernel completion queue, and wherein the one or more user space queue pairs comprise a user space submission queue and a user space completion queue.
  • Example 25 may include the at least one computer readable storage medium of Example 22, wherein the instructions, when executed, cause a computing device to initialize, using the data structure metadata, the user space driver, wherein the user space driver is a polling mode driver, wherein initializing the user space driver further causes the computing device to retrieve the user configuration information and the data structure metadata, generate the device based on the user configuration information and the data structure metadata, update the data structure with the user space queue pair metadata, and register to the user space driver a block device interface to the device.
  • Example 26 may include the at least one computer readable storage medium of Example 22, wherein the volatile memory includes random access memory (RAM), including one or more of dynamic random access memory (DRAM), static random access memory (SRAM), or synchronous dynamic random access memory (SDRAM), and wherein the non-volatile memory includes non-volatile random access memories including one or more of solid state memory, 3D cross point memory, one or more storage devices using chalcogenide phase change material, byte addressable non-volatile memory devices, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory including ferroelectric polymer memory, ferroelectric transistor random access memory (Fe-TRAM), ovonic memory, nanowire memory, or electrically erasable programmable read-only memory (EEPROM), wherein the solid state memory includes one or more of planar or 3-dimensional (3D) NAND flash memory or NOR flash memory, wherein the 3D cross point memory comprises at least
  • Example 27 may include the at least one computer readable storage medium of any one of Examples 22 to 26, wherein the instructions, when executed, cause a computing device to receive, by the bifurcated driver, the one or more memory access requests from the one or more applications to access one or more of the file system device storage or the NVMe block device storage, process, using the bifurcated driver, the one or more memory access requests, update, using the memory controller communicatively coupled to the bifurcated driver, one or more of the kernel namespace, the user namespace, the one or more kernel queue pairs or the one or more user space queue pairs based on the one or more memory access requests, and synchronize, using the memory controller, the data structure, wherein, when the one or more memory access requests is a read request, the instructions cause the computing device to direct, using the bifurcated driver, the one or more memory access requests to the user space driver, and wherein, when the one or more memory access requests is a write request, the instructions cause the computing device to direct, using the bifurcated driver, the one or more memory access requests to the kernel space driver (a dispatch sketch appears after these examples).
  • Example 28 may include a system comprising means for generating, using a processor coupled to a host memory, a bifurcated driver, wherein the host memory is to include one or more of a volatile memory or a non-volatile memory, wherein the bifurcated driver includes a kernel space driver communicatively coupled to a kernel file system, and a user space driver communicatively coupled to a device, and processing, using the bifurcated driver, one or more memory access requests to access the non-volatile memory.
  • Example 29 may include the system of Example 28, further comprising means for receiving the one or more memory access requests from one or more applications modified to use the bifurcated driver, wherein means for generating the bifurcated driver comprises retrieving, using a driver bifurcator, user configuration information used to configure the bifurcated driver, wherein the user configuration information includes a number of input-output queues and a number of namespaces to use for the kernel space driver and the user space driver, wherein the one or more applications include one or more of a read intensive application or a write intensive application, and wherein the one or more memory access requests include one or more of a read request or a write request.
  • Example 30 may include the system of Example 28, further comprising means for initializing the kernel space driver, comprising means for generating a data structure in a shared memory space of the host memory, wherein the host memory is to include a kernel, wherein the kernel is to manage the kernel space driver via the bifurcated driver, wherein the shared memory space is accessible by a bifurcator namespace pair, the bifurcator namespace pair including a kernel namespace and a user namespace, wherein the non-volatile memory includes a file system device storage and an NVMe block device storage.
  • Initializing the kernel space driver may further comprise means for generating, in the host memory, the kernel namespace, one or more kernel queue pairs, the user namespace and one or more user space queue pairs, means for assigning the kernel namespace and the one or more kernel queue pairs to the kernel space driver, means for updating the data structure with kernel queue pair metadata identifying the kernel namespace and the one or more kernel queue pairs, and means for registering the file system device storage to the kernel, wherein the device is the NVMe block device storage.
  • Example 31 may include the system of Example 30, wherein the data structure includes data structure metadata including one or more of submission queue physical addresses, completion queue physical addresses, queue sizes, or queue doorbell register physical addresses.
  • Example 32 may include the system of Example 30, wherein the one or more kernel queue pairs include one or more of an admin queue pair comprising an admin submission queue and an admin completion queue, and one or more of a kernel input-output queue pair comprising a kernel submission queue and a kernel completion queue, and wherein the one or more user space queue pairs comprise a user space submission queue and a user space completion queue.
  • Example 33 may include the system of Example 30, wherein the volatile memory includes random access memory (RAM), including one or more of dynamic random access memory (DRAM), static random access memory (SRAM), or synchronous dynamic random access memory (SDRAM), and wherein the non-volatile memory includes non-volatile random access memories including one or more of solid state memory, 3D cross point memory, one or more storage devices using chalcogenide phase change material, byte addressable non-volatile memory devices, ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory including ferroelectric polymer memory, ferroelectric transistor random access memory (Fe-TRAM), ovonic memory, nanowire memory, or electrically erasable programmable read-only memory (EEPROM), wherein the solid state memory includes one or more of planar or 3-dimensional (3D) NAND flash memory or NOR flash memory, wherein the 3D cross point memory comprises at least one of a transistor-less
  • Example 34 may include the system of any one of Examples 30 to 33, further comprising means for initializing, using the data structure metadata, the user space driver, wherein the user space driver is a polling mode driver, wherein initializing the user space driver comprises means for retrieving the user configuration information and the data structure metadata, means for generating the device based on the user configuration information and the data structure metadata, means for updating the data structure with the user space queue pair metadata, and means for registering to the user space driver a block device interface to the device.
  • Initializing the user space driver further comprises means for receiving, by the bifurcated driver, the one or more memory access requests from the one or more applications to access one or more of the file system device storage or the NVMe block device storage, means for processing, using the bifurcated driver, the one or more memory access requests, means for updating, using the memory controller communicatively coupled to the bifurcated driver, one or more of the kernel namespace, the user namespace, the one or more kernel queue pairs or the one or more user space queue pairs based on the one or more memory access requests, and means for synchronizing, using the memory controller, the data structure, wherein, when the one or more memory access requests is a read request, directing, using the bifurcated driver, the one or more memory access requests to the user space driver, and wherein, when the one or more memory access requests is a write request, directing, using the bifurcated driver, the one or more memory access requests to the kernel space driver.
  • Embodiments are applicable for use with all types of semiconductor integrated circuit ( “IC” ) chips.
  • Examples of these IC chips include but are not limited to processors, controllers, chipset components, programmable logic arrays (PLAs), memory chips, network chips, systems on chip (SoCs), SSD/NAND controller ASICs, and the like.
  • Signal conductor lines are represented with lines in the figures. Some may be drawn differently to indicate more constituent signal paths, have a number label to indicate a number of constituent signal paths, and/or have arrows at one or more ends to indicate primary information flow direction. This, however, should not be construed in a limiting manner.
  • Any represented signal lines may actually comprise one or more signals that may travel in multiple directions and may be implemented with any suitable type of signal scheme, e.g., digital or analog lines implemented with differential pairs, optical fiber lines, and/or single-ended lines.
  • Example sizes/models/values/ranges may have been given, although embodiments are not limited to the same. As manufacturing techniques (e.g., photolithography) mature over time, it is expected that devices of smaller size could be manufactured.
  • Well known power/ground connections to IC chips and other components may or may not be shown within the figures, for simplicity of illustration and discussion, and so as not to obscure certain aspects of the embodiments.
  • Arrangements may be shown in block diagram form in order to avoid obscuring embodiments, and also in view of the fact that specifics with respect to the implementation of such block diagram arrangements are highly dependent upon the computing system within which the embodiment is to be implemented, i.e., such specifics should be well within the purview of one skilled in the art.
  • The term “coupled” may be used herein to refer to any type of relationship, direct or indirect, between the components in question, and may apply to electrical, mechanical, fluid, optical, electromagnetic, electromechanical or other connections.
  • Terms such as “first” may be used herein only to facilitate discussion, and carry no particular temporal or chronological significance unless otherwise indicated.
  • As used herein, a list of items joined by the term “one or more of” may mean any combination of the listed terms.
  • For example, the phrase “one or more of A, B or C” may mean A; B; C; A and B; A and C; B and C; or A, B and C.
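The shared data structure and queue-pair metadata recited in Examples 15, 23, 24, 31 and 32 can be pictured as a small record kept in the shared memory region. The C sketch below is illustrative only, written under the assumption that one admin queue pair and a handful of kernel and user space I/O queue pairs are tracked; every struct and field name (bifur_queue_md, bifur_shared_state, and so on) is hypothetical and not taken from the application.

    #include <stdint.h>

    /* Hypothetical per-queue metadata kept in shared memory: the submission and
     * completion queue physical addresses, the queue size and the physical
     * address of the queue doorbell register. */
    struct bifur_queue_md {
        uint64_t sq_phys_addr;       /* submission queue physical address */
        uint64_t cq_phys_addr;       /* completion queue physical address */
        uint32_t queue_size;         /* entries per queue                 */
        uint64_t doorbell_phys_addr; /* queue doorbell register address   */
    };

    /* Hypothetical shared structure written by the kernel space driver during
     * initialization and later read (and extended) by the user space driver. */
    struct bifur_shared_state {
        struct bifur_queue_md admin_qp;        /* admin SQ/CQ pair             */
        struct bifur_queue_md kernel_io_qp[4]; /* kernel I/O SQ/CQ pairs       */
        struct bifur_queue_md user_io_qp[4];   /* user space SQ/CQ pairs       */
        uint32_t kernel_ns_id;                 /* namespace used by the kernel */
        uint32_t user_ns_id;                   /* namespace used by user space */
    };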
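The initialization split described in Examples 16, 22, 25 and 30 can be read as two cooperating routines: the kernel space driver creates the queues, records their metadata in the shared structure and registers the file system device, after which the user space polling mode driver retrieves that metadata, builds the NVMe block device and registers a block device interface. The skeleton below shows only the ordering of those steps; every helper (create_kernel_queues, register_block_iface, the namespace identifiers, and so on) is an invented stand-in, not the application's API.

    #include <stdint.h>
    #include <stdio.h>

    /* Invented stand-ins for the real work; each would touch the NVMe device
     * and the shared metadata structure in an actual implementation. */
    static int create_kernel_queues(void)        { return 0; } /* admin + kernel I/O SQ/CQ   */
    static int publish_queue_metadata(void)      { return 0; } /* record addresses/doorbells */
    static int register_fs_device(uint32_t ns)   { (void)ns; return 0; }
    static int read_queue_metadata(void)         { return 0; } /* retrieve shared metadata   */
    static int create_user_queues(void)          { return 0; } /* user space SQ/CQ pairs     */
    static int register_block_iface(uint32_t ns) { (void)ns; return 0; }

    /* Kernel-side initialization: create queues, publish their metadata,
     * register the file system device storage to the kernel. */
    static int bifur_kernel_init(void)
    {
        if (create_kernel_queues() || publish_queue_metadata())
            return -1;
        return register_fs_device(1u /* kernel namespace id, illustrative */);
    }

    /* User space polling mode driver initialization: retrieve the metadata,
     * build the NVMe block device, register a block device interface. */
    static int bifur_user_init(void)
    {
        if (read_queue_metadata() || create_user_queues())
            return -1;
        return register_block_iface(2u /* user namespace id, illustrative */);
    }

    int main(void)
    {
        if (bifur_kernel_init() == 0 && bifur_user_init() == 0)
            puts("bifurcated driver initialized (sketch)");
        return 0;
    }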
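The request routing in Examples 19, 27 and 34 amounts to a simple dispatch decision: read requests go to the user space polling driver, write requests to the kernel space driver. The sketch below shows only that decision; the request descriptor and the two submit functions are hypothetical placeholders.

    #include <stdio.h>

    enum req_kind { REQ_READ, REQ_WRITE };

    /* Hypothetical request descriptor; only the kind matters for dispatch. */
    struct mem_request {
        enum req_kind kind;
        /* ... LBA, length, data buffer ... */
    };

    static int submit_to_user_space(const struct mem_request *r)   { (void)r; return 0; }
    static int submit_to_kernel_space(const struct mem_request *r) { (void)r; return 0; }

    /* Bifurcated-driver dispatch: read requests go to the user space (polling)
     * path, write requests to the kernel space (file system) path. */
    static int bifur_dispatch(const struct mem_request *r)
    {
        return (r->kind == REQ_READ) ? submit_to_user_space(r)
                                     : submit_to_kernel_space(r);
    }

    int main(void)
    {
        struct mem_request rd = { REQ_READ };
        struct mem_request wr = { REQ_WRITE };
        printf("read -> user space: %d, write -> kernel space: %d\n",
               bifur_dispatch(&rd), bifur_dispatch(&wr));
        return 0;
    }

In practice the decision could also weigh the read or write intensity of the application and of the files it uses, as the abstract and Example 21 note, rather than looking only at the type of each individual request.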

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Systems, apparatuses and methods may provide a bifurcated driver to manage one or more non-volatile memory (NVM) express (NVMe) devices and/or file system storage. More particularly, systems, apparatuses and methods may provide a way to deliver high input-output performance using a bifurcated NVMe driver that includes a user space driver and a kernel space driver. The systems, apparatuses and methods may allow the bifurcated driver to direct access requests to the user space driver and/or the kernel space driver based on one or more of the read intensity or the write intensity of an application and/or of the files used by the application, in order to provide high transactions-per-second (TPS) performance.
PCT/CN2016/113701 2016-12-23 2016-12-30 Technologie pour mettre en oeuvre un pilote express de mémoire non volatile bifurquée WO2018113030A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201680091055.5A CN109983443B (zh) 2016-12-23 2016-12-30 实现分叉非易失性存储器快速驱动器的技术
DE112016007538.3T DE112016007538T5 (de) 2016-12-23 2016-12-30 Technologie zur realisierung eines binärverzweigten non-volatile-memory-express-treibers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2016111736 2016-12-23
CNPCT/CN2016/111736 2016-12-23

Publications (1)

Publication Number Publication Date
WO2018113030A1 true WO2018113030A1 (fr) 2018-06-28

Family

ID=62624352

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/113701 WO2018113030A1 (fr) 2016-12-23 2016-12-30 Technologie pour mettre en oeuvre un pilote express de mémoire non volatile bifurquée

Country Status (3)

Country Link
CN (1) CN109983443B (fr)
DE (1) DE112016007538T5 (fr)
WO (1) WO2018113030A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444113A (zh) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 非易失性存储介质共享方法、装置、电子设备及存储设备
US11409439B2 (en) 2020-11-10 2022-08-09 Samsung Electronics Co., Ltd. Binding application to namespace (NS) to set to submission queue (SQ) and assigning performance service level agreement (SLA) and passing it to a storage device
US12130767B2 (en) 2020-03-31 2024-10-29 Samsung Electronics Co., Ltd. Scaling performance in a storage server with storage devices
US12405824B2 (en) 2020-11-10 2025-09-02 Samsung Electronics Co., Ltd. System architecture providing end-to-end performance isolation for multi-tenant systems

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112667565B (zh) * 2020-12-30 2021-12-03 湖南博匠信息科技有限公司 一种基于fuse的存储单元文件管理方法及系统
CN114968385B (zh) * 2022-05-23 2024-08-06 斑马网络技术有限公司 基于微内核操作系统的驱动协调方法、设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090265515A1 (en) * 2008-04-16 2009-10-22 Hiroshi Kyusojin Information Processing Apparatus, Information Processing Method, and Computer Program
US20100250836A1 (en) * 2009-03-25 2010-09-30 Anobit Technologies Ltd Use of Host System Resources by Memory Controller
US8533376B1 (en) * 2011-07-22 2013-09-10 Kabushiki Kaisha Yaskawa Denki Data processing method, data processing apparatus and robot
CN104683430A (zh) * 2013-07-08 2015-06-03 英特尔公司 用于从远程可访问存储设备进行初始化的技术

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101458748B (zh) * 2005-04-22 2011-12-07 微软公司 载入内核组件以创建安全计算环境的方法
US8713312B2 (en) * 2008-12-07 2014-04-29 Trend Micro Incorporated Method and system for detecting data modification within computing device
US9535870B2 (en) * 2013-09-18 2017-01-03 HGST Netherlands B.V. Acknowledgement-less protocol for solid state drive interface

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090265515A1 (en) * 2008-04-16 2009-10-22 Hiroshi Kyusojin Information Processing Apparatus, Information Processing Method, and Computer Program
US20100250836A1 (en) * 2009-03-25 2010-09-30 Anobit Technologies Ltd Use of Host System Resources by Memory Controller
US8533376B1 (en) * 2011-07-22 2013-09-10 Kabushiki Kaisha Yaskawa Denki Data processing method, data processing apparatus and robot
CN104683430A (zh) * 2013-07-08 2015-06-03 英特尔公司 用于从远程可访问存储设备进行初始化的技术

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444113A (zh) * 2019-01-16 2020-07-24 阿里巴巴集团控股有限公司 非易失性存储介质共享方法、装置、电子设备及存储设备
CN111444113B (zh) * 2019-01-16 2023-06-13 阿里巴巴集团控股有限公司 非易失性存储介质共享方法、装置、电子设备及存储设备
US12130767B2 (en) 2020-03-31 2024-10-29 Samsung Electronics Co., Ltd. Scaling performance in a storage server with storage devices
US11409439B2 (en) 2020-11-10 2022-08-09 Samsung Electronics Co., Ltd. Binding application to namespace (NS) to set to submission queue (SQ) and assigning performance service level agreement (SLA) and passing it to a storage device
US12405824B2 (en) 2020-11-10 2025-09-02 Samsung Electronics Co., Ltd. System architecture providing end-to-end performance isolation for multi-tenant systems

Also Published As

Publication number Publication date
CN109983443A (zh) 2019-07-05
DE112016007538T5 (de) 2019-09-26
CN109983443B (zh) 2024-03-08

Similar Documents

Publication Publication Date Title
US11625321B2 (en) Apparatuses and methods for memory address translation during block migration using depth mapping table based on mapping state
US11836380B2 (en) NVMe direct virtualization with configurable storage
CN114402291B (zh) 用于将数据绑定到存储器命名空间的存储器系统
WO2018113030A1 (fr) Technologie pour mettre en oeuvre un pilote express de mémoire non volatile bifurquée
US11194750B2 (en) Memory sub-system with multiple ports having single root virtualization
KR20220041937A (ko) 메모리 유형에 대한 페이지 테이블 후크
US20210049101A1 (en) MEMORY TIERING USING PCIe CONNECTED FAR MEMORY
US20170255565A1 (en) Method and apparatus for providing a contiguously addressable memory region by remapping an address space
CN107621959A (zh) 电子装置及其软件训练方法、计算系统
CN112017700A (zh) 用于存储器装置的动态功率管理网络
KR20230094964A (ko) 이종 메모리 타겟의 인터리빙
US11385926B2 (en) Application and system fast launch by virtual address area container
US20250258765A1 (en) Managing metadata associated with memory access operations in a memory device
US12379859B2 (en) Storage controller mapping physical function to virtual machine and method of operating electronic system including the same
CN120752607A (zh) 具有文件系统管理器的数据存储装置
WO2021034599A1 (fr) Systèmes de mémoire hiérarchiques
EP4018323A1 (fr) Systèmes de mémoire hiérarchique
EP4018321A1 (fr) Systèmes de mémoire hiérarchiques
WO2021034654A1 (fr) Systèmes de hiérarchisation des mémoires

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16924497

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16924497

Country of ref document: EP

Kind code of ref document: A1