
WO2018002999A1 - Storage device and storage apparatus - Google Patents

Storage device and storage apparatus

Info

Publication number
WO2018002999A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage
block
data
controller
ssd
Prior art date
Application number
PCT/JP2016/069077
Other languages
English (en)
Japanese (ja)
Inventor
英通 小関
藤本 和久
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to PCT/JP2016/069077
Publication of WO2018002999A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/08: Digital input from, or digital output to, record carriers, from or to individual record carriers, e.g. punched card, memory card, integrated circuit [IC] card or smart card
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation

Definitions

  • the present invention relates to storage device control.
  • the storage apparatus has a large number of storage devices for storing data and a storage controller for controlling the storage devices, and is intended to provide a large-capacity data storage space to a computer.
  • Typical storage devices include an HDD (Hard Disk Drive) and an SSD (Solid State Drive).
  • An SSD uses flash memory (FM) as its storage medium.
  • the reclamation process is a process of generating a reusable block by erasing one or more blocks.
  • When an SSD erases a block, the data on any pages still in use must first be moved to another block. Therefore, as described in Patent Document 1, for example, an SSD generally has a physical storage capacity larger than the size of the logical address space provided to an initiator such as a host or a storage controller. The physical storage area exceeding the size of the logical address space is called a spare area. The SSD performs reclamation processing using the spare area.
  • An SSD requires spare blocks (a spare area) for reclamation processing, and if the spare blocks are exhausted, operation cannot be continued. As is well known, there is an upper limit on the number of times each block can be erased. When the number of erasures of a block exceeds this upper limit, the block can no longer be used and the spare area shrinks.
  • The number of erasures can be predicted to some extent by observing the usage trend of the SSD, so operation can be continued by retiring an SSD whose erase count has grown and moving its data to another storage device. The real problem is a failure that occurs suddenly (at an unexpected timing).
  • The quality of FM blocks or dies varies. If the quality of certain blocks (or dies) is innately worse than that of others, some of them will fail suddenly, and the resulting depletion of the spare area may make it impossible to continue operation. It is difficult to detect such a sudden loss of blocks in advance.
  • A storage device includes a device controller that provides a storage controller with a logical storage space of a predetermined size, and a nonvolatile semiconductor storage medium having a plurality of blocks that are the units of data erasure. Each block is configured so that its cells can be changed from a state of operating in a first mode capable of storing n-bit information to a second mode capable of storing m-bit information (n < m).
  • The device controller manages, as spare capacity, the portion of the usable storage area in the blocks that exceeds the amount required for allocation to the logical storage space.
  • The usable storage area is increased by changing some of the blocks operating in the first mode so that they operate in the second mode.
  • The drawings further include a flowchart of data read/write processing of the SSD controller in the second embodiment, a flowchart of FM diagnosis processing of the SSD controller in the second embodiment, a diagram explaining the outline of the capacity rebalance process, a flowchart of drive monitoring processing of the storage controller in the second embodiment, a diagram explaining the structure of the logical-physical conversion table in the second embodiment, and a diagram explaining the structure of the block management table in the second embodiment.
  • the description is based on the premise that the storage device is an SSD.
  • the nonvolatile semiconductor storage medium included in the SSD is a flash memory (FM).
  • the flash memory is assumed to be a type of flash memory that is read / written in units of pages, typically a NAND flash memory.
  • the flash memory may be another type of flash memory instead of the NAND type.
  • other types of non-volatile semiconductor storage media such as phase change memory may be employed.
  • FIG. 1 is a time-series representation of the capacity change of the SSD according to the first embodiment.
  • the SSD in this embodiment is equipped with an FM capable of changing the cell mode.
  • The description will be made on the assumption that each cell can be operated in either a mode capable of storing n-bit data or a mode capable of storing m-bit data (where n < m).
  • The mode in which a cell can store n-bit (2-bit) data is called the MLC (Multi-Level Cell) mode, and the mode in which a cell can store m-bit (3-bit) data is called the TLC (Triple-Level Cell) mode.
  • the MLC mode and the TLC mode are collectively referred to as “cell mode” or “mode”.
  • the SSD has a logical address space (LBA space) provided to an initiator such as a storage controller and a physical address space (PBA space) for storing actual data.
  • the size of the logical address space is the logical capacity
  • the size of the physical address space is the physical capacity.
  • a rectangular object 1000 represents a logical address space
  • objects 1001 and 1002 represent areas on the physical address space.
  • the lengths of the objects 1000, 1001, and 1002 represent the size (capacity) of the space. Therefore, in the description of FIG. 1, the objects 1000, 1001, and 1002 are referred to as a logical capacity 1000, a physical capacity 1001, and a physical capacity 1002, respectively.
  • the SSD provides the logical capacity 1000 to the initiator and has a physical address space (physical storage area) having a size equal to the sum of the physical capacities 1001 and 1002.
  • the physical capacities 1001 and 1002 are constructed in the MLC mode.
  • the physical capacity 1001 is the total capacity of physical areas (a set of cells) reserved for allocation to the logical address space, and is equal to the logical capacity 1000.
  • the physical capacity 1002 is an area that the SSD has beyond the physical capacity 1001 and is a kind of surplus area.
  • The physical capacity 1002 is the amount of physical storage area that the SSD holds for this purpose, and is referred to as the "spare capacity" in this specification.
  • the state at time t2 shows a situation when a failure occurs in some blocks in the SSD and the spare capacity is reduced.
  • The physical storage area corresponding to the physical capacity 1004 has become unusable due to block failures. Even if some blocks become unusable, the logical capacity 1000 cannot be reduced, and a physical capacity equal to the logical capacity 1000 must still be allocated to it. Therefore, when some blocks become unusable, the spare capacity is reduced by the amount of the unusable blocks. As a result, the spare capacity 1002 that existed at time t1 is reduced to the spare capacity 1003 at time t2. If further block failures occur from here, there is a high possibility that the spare capacity will be exhausted. At this point, the FM cells remain in the MLC mode.
  • The SSD controller, having recognized that the risk of spare capacity exhaustion is high, expands (recovers) the spare capacity by changing some cells from the MLC mode to the TLC mode.
  • the state at time t3 indicates the internal state of the SSD after the mode change is performed.
  • the SSD controller changes a part of blocks constituting the physical capacity 1001 (or physical capacity 1003) from the MLC mode to the TLC mode. Therefore, the MLC mode area is reduced from the physical capacity 1001 (and physical capacity 1003) to the physical capacity 1005. On the other hand, in the area of the difference between the physical capacity 1001 and the physical capacity 1005, the capacity has increased 1.5 times due to the TLC mode.
  • the areas correspond to physical capacities 1006 and 1007.
  • The SSD controller allocates the physical capacities 1005 and 1006 (an area whose size equals the logical capacity 1000) to user data storage, and uses the physical capacity 1007 as the spare capacity. As a result, the spare capacity has expanded from the physical capacity 1003 at time t2 to the physical capacity 1007.
  • In this way, the SSD controller can expand the physical capacity by changing the cell mode when the spare capacity decreases. As a result, the reduced spare capacity can be recovered, and the risk of an operation stoppage due to exhaustion of the spare capacity can be avoided. Furthermore, throughout this series of processes the logical capacity provided to the initiator such as the storage controller is not changed, so the storage controller can continue operation without being affected by the internal state of the SSD.
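  • To make the capacity transition of FIG. 1 concrete, the following is a minimal sketch (not part of the patent text) of the spare-capacity arithmetic, assuming 2-bit MLC and 3-bit TLC cells so that a block converted from MLC to TLC gains 1.5 times its capacity; the per-block capacity, block counts, and helper names are illustrative assumptions.

```python
# Illustrative sketch of the spare-capacity arithmetic in FIG. 1.
# Assumptions (not from the patent text): per-block MLC capacity, block counts,
# and helper names are hypothetical example values.

MLC_BLOCK_GB = 1.5          # capacity of one block operated in MLC (2 bit/cell) mode
TLC_FACTOR = 3 / 2          # a TLC (3 bit/cell) block holds 1.5x the data of an MLC block

def spare_capacity(mlc_blocks, tlc_blocks, logical_gb):
    """Spare capacity = usable physical capacity - capacity pledged to the logical space."""
    physical_gb = mlc_blocks * MLC_BLOCK_GB + tlc_blocks * MLC_BLOCK_GB * TLC_FACTOR
    return physical_gb - logical_gb

logical_gb = 1000
mlc_blocks, tlc_blocks = 800, 0                              # time t1: everything in MLC mode
print(spare_capacity(mlc_blocks, tlc_blocks, logical_gb))    # 200.0 GB of spare

mlc_blocks -= 100                                            # time t2: 100 blocks fail suddenly
print(spare_capacity(mlc_blocks, tlc_blocks, logical_gb))    # 50.0 GB, near exhaustion

mlc_blocks, tlc_blocks = mlc_blocks - 200, 200               # time t3: convert 200 blocks MLC -> TLC
print(spare_capacity(mlc_blocks, tlc_blocks, logical_gb))    # 200.0 GB recovered
```

  • Note that the logical capacity passed to the function never changes, which mirrors the point above that the initiator is unaffected by the internal mode change.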
  • FIG. 2 is a diagram illustrating a configuration example of the storage system 10000 including the storage apparatus 1 according to the first embodiment.
  • the storage device 1 includes a storage controller 10 and a plurality of SSDs 21 connected to the storage controller 10.
  • the SSD 21 is a storage device for storing write data from an initiator such as the host 2, and is a storage device that employs a nonvolatile semiconductor memory such as a flash memory as a storage medium.
  • the internal configuration of the SSD 21 will be described later.
  • the SSD 21 is connected to the storage controller 10 by a transmission line (SAS link) conforming to the SAS (Serial Attached SCSI) standard, a transmission line (PCI link) conforming to the PCI (Peripheral Component Interconnect) standard, or the like.
  • the storage apparatus 1 of this embodiment can be equipped with an HDD (Hard Disk Drive) 25 in addition to the SSD 21.
  • the HDD 25 is a storage device that uses a magnetic disk as a recording medium.
  • the HDD 25 is also connected to the storage controller 10 like the SSD 21.
  • the HDD 25 is also connected to the storage controller 10 by a SAS link or the like.
  • storage devices such as the SSD 21 and the HDD 25 installed in the storage apparatus 1 may be referred to as “drives”.
  • One or more hosts 2 are connected to the storage controller 10.
  • a management host 5 is connected to the storage controller 10.
  • the storage controller 10 and the host 2 are connected via a SAN (Storage Area Network) 3 formed using a fiber channel as an example.
  • the storage controller 10 and the management host 5 are connected via a LAN (Local Area Network) 6 formed using Ethernet as an example.
  • The storage controller 10 includes at least a processor (CPU) 11, a host interface (denoted "host I/F" in the figure) 12, a disk interface (denoted "disk I/F" in the figure) 13, a memory 14, and a management I/F 15.
  • The processor 11, host I/F 12, disk I/F 13, memory 14, and management I/F 15 are interconnected via an internal switch (internal SW) 16.
  • Although only one of each of these components is shown in FIG. 2, a plurality of them may be mounted in the storage controller 10 in order to achieve high performance and high availability. Further, instead of the internal SW 16, the components may be connected to each other via a common bus.
  • the disk I / F 13 has at least an interface controller and a transfer circuit.
  • the interface controller is a component for converting a protocol (SAS in one example) used by the SSD 21 into a communication protocol (PCI-Express as an example) used in the storage controller 10.
  • the transfer circuit is used when the storage controller 10 transfers data (read, write) to the SSD 21.
  • the host I / F 12 has at least an interface controller and a transfer circuit, like the disk I / F 13.
  • The interface controller included in the host I/F 12 converts between the communication protocol (for example, Fibre Channel) used on the data transfer path between the host 2 and the storage controller 10 and the communication protocol used inside the storage controller 10.
  • the processor 11 performs various controls of the storage device 1.
  • the memory 14 is used to store programs executed by the processor 11 and various management information of the storage device 1 used by the processor 11.
  • the memory 14 is also used for temporarily storing I / O target data for the SSD 21.
  • the storage area in the memory 14 used for temporarily storing the I / O target data for the SSD 21 is referred to as “cache”.
  • the memory 14 is configured by a volatile storage medium such as DRAM or SRAM. However, as another embodiment, the memory 14 may be configured by using a nonvolatile memory.
  • FIG. 3 is a diagram for explaining the volume configuration of the storage system.
  • the capacity virtualization function is a technique for providing a virtual capacity larger than the physical capacity of the storage apparatus to the host computer as a virtual volume. Details will be described below with reference to FIG.
  • the storage device 1 is equipped with SSDs 21-1 to 21-3.
  • the SSD 21-1 has a logical address space presented to the storage controller 10 and a physical address space for storing actual data.
  • the size of the logical address space is the logical capacity
  • the size of the physical address space is the physical capacity.
  • the association between the area on the logical address space and the area on the physical address space can be dynamically changed, and is managed by a logical-physical conversion table 1100 described later.
  • the physical address space is composed of a plurality of blocks 211 described later.
  • the block 211 is used in either the MLC mode (“M” in the figure) or the TLC mode (“T” in the figure).
  • the SSD 21 can change the cell mode for each block 211.
  • the storage controller 10 forms a RAID group (RG) 30-1 using the logical address space provided by the SSDs 21-1 to 21-3. Although not shown, the RAID group 30-2 is configured using another SSD. Further, the storage controller 10 makes the two RAID groups 30-1 and 30-2 belong to a management unit called a pool 35.
  • the pool 35 is a set of storage areas that can be allocated to virtual chunks of a virtual volume to be described later.
  • the storage controller 10 manages the storage area of the RAID group by dividing it into partitions of a predetermined size. This partition is called “chunk”. A chunk 31 is created in the RAID group 30-1, and a chunk 32 is created in the RAID group 30-2.
  • the storage device 1 is connected to the host computer 2 and provides the virtual volume 40 to the host computer 2.
  • the virtual volume 40 is a virtual volume formed by the capacity virtualization function.
  • When the storage controller 10 receives a write request for the virtual volume 40 from the host computer 2, it allocates an arbitrary chunk in the pool 35 to the virtual chunk 41 of the virtual volume 40 and writes the data associated with the write request to that chunk.
  • the storage device 1 manages a plurality of SSDs 21 as one RAID group.
  • Even if one (or two) of the SSDs 21 in the RAID group fails and data access becomes impossible, the data stored in the failed SSD 21 can be recovered using the data in the remaining SSDs 21.
  • SSD # 0 (20-0) to SSD # 3 (20-3) respectively represent logical address spaces provided by the SSD 21 to the storage controller 10.
  • The storage controller 10 forms one RAID group 30 from a plurality of SSDs 21 (four in the example of FIG. 4), and divides the logical address space of each SSD belonging to the RAID group 30 (SSD #0 (20-0) to SSD #3 (20-3)) into a plurality of fixed-size storage areas called stripe blocks (301) for management.
  • FIG. 4 shows an example in which the RAID level of the RAID group 30 (representing the data redundancy method in the RAID technology and generally having RAID levels of RAID1 to RAID6) is RAID5.
  • boxes such as “0”, “1”, and “P” in the RAID group 20 represent stripe blocks.
  • a number such as “1” assigned to each stripe block is referred to as a “stripe block number”.
  • a stripe block described as “P” in the stripe block is a stripe block in which redundant data (parity) is stored, and this is called a “parity stripe”.
  • a stripe block in which numbers (0, 1, etc.) are written is a stripe block in which data written from an initiator such as the host 2 (data that is not redundant data) is stored. This stripe block is called “data stripe”.
  • In FIG. 4, the stripe block located at the head of SSD #3 (20-3) is the parity stripe 301-3.
  • Its redundant data is generated by performing a predetermined operation (for example, an exclusive OR (XOR)) on the data stored in the data stripes located at the head of each of SSD #0 (20-0) to SSD #2 (20-2) (stripe blocks 301-0, 301-1, and 301-2).
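  • As a simple illustration of how such a parity stripe can be generated and used, the sketch below XORs the data stripes of one stripe line and rebuilds a lost stripe from the parity; the stripe size and contents are arbitrary example values, not data from the patent.

```python
# Minimal RAID5-style parity illustration: parity = XOR of all data stripes
# on the same stripe line. Stripe contents are made-up example bytes.

def xor_stripes(stripes):
    """Byte-wise XOR of equally sized stripe blocks."""
    result = bytearray(len(stripes[0]))
    for stripe in stripes:
        for i, b in enumerate(stripe):
            result[i] ^= b
    return bytes(result)

d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\x0f"   # data stripes 301-0..301-2
parity = xor_stripes([d0, d1, d2])                    # parity stripe 301-3

# If one SSD fails, its stripe is recovered from the remaining stripes and the parity.
recovered_d1 = xor_stripes([d0, d2, parity])
assert recovered_d1 == d1
```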
  • As in the stripe line 300 shown in FIG. 4, a stripe line is configured according to the rule that each stripe block belonging to one stripe line is located at the same position (address) in the logical address space of the SSDs 21-0 to 21-3.
  • the “chunk” described above is an area composed of a plurality of stripe lines continuously arranged in the RAID group, as shown in FIG. Further, the number of data stripes included in each chunk in the storage device 1 is the same.
  • one chunk 31 is a region composed of a plurality of stripe lines, but one chunk 31 may be configured to have only one stripe line.
  • a virtual chunk is a partition of a predetermined size on the storage space of a virtual volume.
  • One chunk is mapped to one virtual chunk.
  • When the storage device 1 receives a data write request for a virtual chunk from the host 2, it stores the data in the chunk mapped to that virtual chunk.
  • the size of the virtual chunk is equal to the total size of all data stripes included in the chunk.
  • the storage controller 10 manages the storage area (chunk) allocated to the virtual chunk by recording the mapping between the virtual chunk and the chunk in a virtual volume management table 500 described later.
  • The storage controller 10 determines the storage area (chunk) on the SSD 20 to which data is to be written only when a write request for an area on a virtual chunk is received from the host 2. The chunk determined here is chosen from among the chunks not yet assigned to any virtual chunk (unused chunks).
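  • This allocate-on-first-write behavior can be sketched as follows; the table is reduced to a dictionary, and the chunk size, pool contents, and function names are illustrative assumptions rather than the patent's data structures.

```python
# Sketch of allocating a chunk to a virtual chunk only on the first write.
# CHUNK_SIZE, the pool contents, and all names are illustrative assumptions.

CHUNK_SIZE = 0x500                           # virtual-chunk size in LBAs (example value)

virtual_chunk_map = {}                       # virtual chunk # -> (RAID group #, chunk #)
unused_chunks = [(0, 10), (0, 11), (1, 3)]   # chunks not yet assigned to any virtual chunk

def handle_write(lba):
    """Return the chunk backing the written LBA, allocating one if necessary."""
    vchunk = lba // CHUNK_SIZE
    if vchunk not in virtual_chunk_map:      # no chunk mapped yet
        if not unused_chunks:
            raise RuntimeError("pool exhausted")
        virtual_chunk_map[vchunk] = unused_chunks.pop(0)
    return virtual_chunk_map[vchunk]

print(handle_write(0x0520))   # first write to virtual chunk 1 -> allocates (0, 10)
print(handle_write(0x0600))   # same virtual chunk -> reuses (0, 10)
```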
  • the contents of these programs and management tables will be described below.
  • FIG. 5 is a diagram for explaining the configuration of the virtual volume management table.
  • the virtual volume management table 500 is a table for managing the mapping relationship between the virtual chunks in each virtual volume defined in the storage apparatus 1 and the chunks.
  • The virtual volume management table 500 has columns of virtual volume # 501, pool # 502, virtual volume LBA range 503, virtual chunk number 504, RAID group number 505, and chunk number 506. Each row (record) of the virtual volume management table 500 expresses that the chunk specified by the RAID group number 505 and the chunk number 506 is mapped to the virtual chunk specified by the virtual volume # 501 and the virtual chunk number 504.
  • not only the virtual volume management table 500 but also each row of a table for managing various information is referred to as a “record”.
  • When no chunk is allocated to a virtual chunk, NULL (an invalid value) is stored in the RAID group number 505 and the chunk number 506 of that record.
  • Pool # 502 stores the identification number of the pool to which the chunk that can be allocated to the virtual volume belongs. That is, the chunks that can be allocated to the virtual chunks of the virtual volume identified by the virtual volume # 501 are limited to the chunks (or RAID groups) belonging to the pool # 502 in principle.
  • The virtual volume LBA range 503 is information indicating which range on the virtual volume the virtual chunk specified by the virtual chunk number 504 corresponds to. As an example, in row (record) 500-1 of FIG. 5, the virtual volume LBA range 503 is "0x0500 to 0x09FF" and the virtual chunk number 504 is "2", which indicates that virtual chunk 2 of virtual volume #0 corresponds to the area from LBA 0x0500 to 0x09FF.
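  • A record-lookup sketch of this table follows; the second record mirrors the example row 500-1 described above, while the first record, the field names, and the RAID group/chunk numbers are informal stand-ins introduced only for illustration.

```python
# Sketch: looking up which chunk backs an LBA on a virtual volume, using
# records shaped like the rows of the virtual volume management table 500.
# Field names, the first record, and the RAID group/chunk numbers are assumptions.

records = [
    {"virtual_volume": 0, "pool": 0, "lba_range": (0x0000, 0x04FF),
     "virtual_chunk": 1, "raid_group": 0, "chunk": 5},
    {"virtual_volume": 0, "pool": 0, "lba_range": (0x0500, 0x09FF),
     "virtual_chunk": 2, "raid_group": 0, "chunk": 7},   # shaped like row 500-1
]

def lookup(volume, lba):
    for rec in records:
        lo, hi = rec["lba_range"]
        if rec["virtual_volume"] == volume and lo <= lba <= hi:
            return rec["raid_group"], rec["chunk"]
    return None                      # unallocated virtual chunk (NULL in the table)

print(lookup(0, 0x0600))             # -> (0, 7), the chunk mapped to virtual chunk 2
```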
  • FIG. 6 is a diagram for explaining the configuration of the pool management table.
  • the pool is managed by a pool management table 550.
  • the pool management table 550 includes columns of pool # 551, RG # 552, chunk # 553, RAID group LBA 554, status 555, and remaining capacity 556.
  • each record is for storing information about a chunk.
  • RG # 552 of each record represents the RAID group number of the RAID group to which the chunk belongs
  • Pool # 551 represents the pool number of the pool to which the chunk, that is, the RAID group specified by RG # 552, belongs.
  • the RAID group LBA 554 of each record is information indicating in which range on the RAID group the chunk is positioned.
  • the status 555 is information indicating whether the chunk is assigned to the virtual chunk (whether mapped). When “assigned” is stored in the status 555, it indicates that the chunk is assigned to the virtual chunk. Conversely, when “unallocated” is stored in the status 555, it means that the chunk is not allocated to the virtual chunk.
  • the remaining capacity 556 is a total value of unused capacity of the RAID group in the pool, and is equal to the total value of the capacity of chunks whose status 555 is “unallocated”. Note that the storage apparatus 1 according to the present embodiment manages the remaining capacity 556 for each RAID group.
  • the storage apparatus 1 may manage the remaining capacity for each pool.
  • FIG. 7 is a diagram for explaining the configuration of a RAID group management table.
  • the RAID group is managed by a RAID group management table 650.
  • the RAID group management table 650 includes columns of RG # 651, RAID level 652, drive number 653, drive attribute 654, RAID group LBA655, drive logical capacity 656, and average data compression rate 657.
  • RG # 651 stores the RAID group number of the RAID group.
  • the RAID level 652 indicates the RAID configuration of the RAID group.
  • the drive number 653 stores an identifier of the SSD 21 belonging to the RAID group specified by RG # 651.
  • the drive attribute 654 indicates whether the drive specified by the drive number 653 is an active drive or a spare drive.
  • the active drive means a drive that currently stores user data, and “active” is set in the drive attribute 654 of the active drive.
  • A spare drive is a drive that starts operation as an alternative drive when an active drive fails. "Spare" is set in the drive attribute 654 of a spare drive.
  • The RAID group LBA 655 is information indicating where on the RAID group each area of the SSD 21 specified by the drive number 653 is positioned.
  • The drive logical capacity 656 indicates the capacity (logical capacity) of the drive.
  • The average data compression rate 657 is information indicating how much the data transferred by the storage controller 10 is reduced when the drive has a data compression function. In the first embodiment, it is assumed that the data compression function is disabled, so "N/A" indicating a disabled state is stored in FIG. 7.
  • FIG. 8 is a diagram illustrating a configuration example of the SSD 21.
  • the SSD 21 includes an SSD controller 200 and a plurality of FM chips 210.
  • the SSD controller 200 includes a processor (CPU) 201, a disk I / F 202, an FM chip I / F 203, a memory 204, a parity operation circuit 206, and a compression / decompression circuit 207, which are interconnected via an internal connection switch 205. Has been.
  • the disk I / F 202 is an interface controller for performing communication between the SSD 21 and the storage controller 10.
  • the disk I / F 202 is connected to the disk I / F 13 of the storage controller 10 via a transmission line (SAS link or PCI link).
  • the FM chip I / F 203 is an interface controller for performing communication between the SSD controller 200 and the FM chip 210.
  • the FM chip I / F 203 has a function of generating ECC (Error Correcting Code), error detection using the ECC, and error correction.
  • a BCH code, an LDPC (Low Density Parity Check) code, or the like may be used.
  • When data is transmitted (written) from the SSD controller 200 to the FM chip 210, the FM chip I/F 203 generates an ECC, adds the generated ECC to the data, and writes the data with the ECC added to the FM chip 210.
  • When the SSD controller 200 reads data from the FM chip 210, the data with the ECC added is read from the FM chip 210 and arrives at the FM chip I/F 203.
  • The FM chip I/F 203 performs a data error check using the ECC (it generates an ECC from the data and checks whether the generated ECC matches the ECC added to the data), and when a data error is detected, it performs data correction using the ECC.
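  • The generate-append-verify flow can be sketched as below. A real FM interface uses a correcting code such as BCH or LDPC; here a plain CRC checksum stands in purely to show where the ECC is generated, stored, and checked, so this sketch can only detect errors, not correct them.

```python
# Illustrative sketch of the ECC flow at the FM chip I/F: generate on write,
# append to the data, verify on read. The CRC below is a stand-in for a real
# correcting code (BCH/LDPC) and can only detect, not correct, errors.
import zlib

def fm_write(page_data: bytes) -> bytes:
    ecc = zlib.crc32(page_data).to_bytes(4, "big")   # "ECC" generated on write
    return page_data + ecc                            # stored together on the FM chip

def fm_read(stored: bytes) -> bytes:
    page_data, ecc = stored[:-4], stored[-4:]
    if zlib.crc32(page_data).to_bytes(4, "big") != ecc:
        raise IOError("uncorrectable error")          # reported to the upper layer
    return page_data

stored = fm_write(b"user data page")
assert fm_read(stored) == b"user data page"
```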
  • the CPU 201 performs processing related to various commands coming from the storage controller 10.
  • the memory 204 stores a program executed by the processor 201 and various management information. A part of the memory 204 is also used as a buffer for temporarily storing write data transmitted from the storage controller 10 together with a write command and data read from the FM chip 210.
  • an area used as a buffer in the area of the memory 204 is referred to as a “buffer area”.
  • a volatile memory such as a DRAM is used.
  • a nonvolatile memory may be used for the memory 204.
  • At least an SSD control program, a logical-physical conversion table 1100, a block management table 1150, and a configuration information management table 1300 are stored in the memory 204 of the SSD 21.
  • the parity operation circuit 206 is a circuit for creating parity data in the SSD 21.
  • the SSD 21 has a function of configuring a RAID group from a plurality of FM chips 210 and performing data recovery using RAID technology.
  • the parity calculation circuit 206 is hardware for generating redundant data (parity) in RAID technology.
  • the redundant data generated by the parity operation circuit 206 is expressed as “parity” or “parity data”, while the ECC generated by the FM chip I / F 203 is expressed as “ECC”.
  • the compression / decompression circuit 207 is a circuit for compressing and decompressing data. However, in the first embodiment, an example in which the SSD 21 does not compress data will be described. Therefore, in the SSD 21 according to the first embodiment, the compression / decompression circuit 207 may not be provided. A usage example of the compression / decompression circuit 207 will be described in a second embodiment.
  • FM chip 210 is a non-volatile semiconductor memory chip such as a NAND flash memory.
  • the FM chip 210 has a plurality of dies 213, and each die 213 has a plurality of cells 214.
  • the cell 214 is a memory element including a transistor or the like, and each cell 214 can hold one or a plurality of bits of data.
  • Write data from the SSD controller 200 is stored in the cell 214.
  • the amount of information (number of bits) that can be stored in the cell 214 can be changed by an instruction from the SSD controller 200.
  • Reading and writing of data in the flash memory cannot be performed in units of cells 214; they are performed in units of an area of a predetermined size (for example, 8 KB) called a page, which is a set of a plurality of cells 214.
  • Data erasure is performed for each block 211 that is a set of pages.
  • In general, the term SSD refers to a storage device having the same form factor as an HDD. In this specification, however, the SSD means any storage device including a plurality of flash memories and a controller that controls them, and its external shape is not limited to a general HDD or SSD form factor.
  • a nonvolatile semiconductor memory such as NOR or NAND may be used for the flash memory.
  • Instead of the flash memory, various other semiconductor memories may be used, such as a magnetoresistive memory (MRAM: Magnetic Random Access Memory), a resistance change memory (ReRAM: Resistance Random Access Memory), or a ferroelectric memory (FeRAM: Ferroelectric Random Access Memory).
  • FIG. 9 is an explanatory diagram of data arrangement in the SSD. This is equivalent to the case where the RAID group 30 in FIG. 4 is replaced with the SSD 21 and the SSD is replaced with the FM chip.
  • the unit of the stripe block (401-0, 401-1, 401-2, 401-3, etc.) is one physical page.
  • the RAID group may be configured such that the block 211, the die 213, or the FM chip 210 is a unit of a stripe block.
  • When write data is written to a data stripe (physical page), the SSD 21 generates the parity belonging to the same stripe line, writes the write data to the data stripe, and stores the parity in the parity stripe. Since a physical page cannot be overwritten, when a data stripe and parity stripe are updated, an unused physical page in the chip is selected and the data and parity are written to that unused physical page.
  • the physical capacity 1001 is equal to the size of the logical capacity 1000
  • the physical address space corresponding to the physical capacity 1001 is an area where write data from the storage controller 10 serving as the initiator can be stored.
  • the physical address space is a space composed of data stripes in the SSD 21 and does not include the capacity of the parity stripe in FIG.
  • The ECC generated by the FM chip I/F 203 is not included in the physical address space. Therefore, the total of the physical capacities 1001 and 1002 is equal to the total capacity of the blocks 211 (or cells 214) used as data stripes among the blocks 211 in the SSD 21.
  • FIG. 10 is a diagram for explaining the configuration of the configuration information management table.
  • the configuration information management table 1300 mainly stores information related to the capacity of the SSD 21.
  • the configuration information management table 1300 includes columns of logical capacity 1301, block status 1302, number of blocks 1303, capacity 1304, spare capacity 1305, FM mode change threshold 1306, and average data compression ratio 1307.
  • the logical capacity 1301 indicates the size of the logical address space provided by the SSD.
  • the block status 1302, the number of blocks 1303, and the capacity 1304 indicate what state the FM block is currently in.
  • “normal (MLC mode)” indicates a block in a normal state and operating in the MLC mode
  • “normal (TLC mode)” indicates a block in a normal state and operating in the TLC mode
  • "Failure" indicates a block in a failure state.
  • the configuration information management table 1300 illustrated in FIG. 10 indicates that 1000 blocks are operating in the normal state and in the MLC mode, and the capacity thereof is 1500 GB.
  • the number of blocks 1303 and the capacity 1304 in the row in which the block status 1302 is “normal (MLC mode)” respectively store the number of blocks that are operating in the normal state and in the MLC mode, and the total capacity of the blocks.
  • the number of blocks 1303 and the capacity 1304 in the row in which the block status 1302 is “normal (TLC mode)” respectively store the number of blocks operating in the normal state and the TLC mode, and the total capacity of the blocks.
  • the number of blocks 1303 and the capacity 1304 in the row where the block status 1302 is “failed” respectively store the number of blocks of the failed block and the total capacity of the blocks.
  • the block number 1303 and the capacity 1304 of the row in which the block status 1302 is “normal (MLC mode)” are respectively referred to as “the number of blocks in the normal state (MLC mode)” and “the block capacity in the normal state (MLC mode)”.
  • the number of blocks 1303 and the capacity 1304 of the row whose block status 1302 is “normal (TLC mode)” are “the number of blocks in the normal state (TLC mode)” and “the block capacity in the normal state (TLC mode)”, respectively.
  • the number of blocks 1303 and the capacity 1304 in the row whose block status 1302 is “failed” are respectively referred to as “number of blocks in failure state” and “block capacity in failure state”.
  • the spare capacity 1305 indicates the size of the spare capacity of the SSD.
  • When the block capacity in the normal state (MLC mode) is M, the block capacity in the normal state (TLC mode) is T, and the logical capacity is L, the spare capacity 1305 is a value calculated by the following equation (1):
  • Spare capacity 1305 = M + T - L  (1)
  • In the initial state, about 30% to 40% of the physical area of the SSD is secured as the spare capacity, as in a general SSD. If some blocks fail and become unusable while the SSD 21 is in use, the value of M (or T) decreases. Since L is constant (the logical capacity does not change), the spare capacity 1305 decreases as a result.
  • The FM mode change threshold 1306 is a value used by the SSD controller 200 to determine that a block mode change is necessary. When the value of the spare capacity 1305 falls below the FM mode change threshold 1306, the SSD controller 200 determines that there is a risk of spare blocks being exhausted, and changes the mode of some of the MLC mode blocks in the FM.
  • The average data compression rate 1307 is information indicating how much the data is reduced by compression when the data compression function is enabled. In the first embodiment, however, an example in which data compression is not performed is described, so "N/A" indicating a disabled state is stored in FIG. 10.
  • For example, the SSD 21 having the configuration information management table 1300 of FIG. 10 currently has a total physical capacity of 1875 GB, of which 150 GB is in a failure state and the remaining 1725 GB is in a normal state; of the latter, 1500 GB is operating in the MLC mode and 225 GB in the TLC mode.
  • This SSD 21 provides a capacity of 1000 GB to the user or the storage controller 10 as its logical capacity, and has a spare capacity of 725 GB.
  • the physical capacity (capacity 1304, spare capacity 1305) shown in FIG. 10 represents the capacity of the block 211 used as the data stripe.
  • the number of blocks 1303 in FIG. 10 also represents the number of blocks 211 used as data stripes.
  • As another embodiment, the SSD 21 may record the capacity of all blocks 211 in the SSD 21 in the capacity 1304 and the spare capacity 1305, and record the number of all blocks 211 in the SSD 21 in the number of blocks 1303.
  • FIG. 11 is a diagram for explaining the configuration of the logical-physical conversion table.
  • the logical / physical conversion table 1100 is a table for managing the mapping between logical pages and physical pages managed by the SSD 21.
  • the SSD 21 employs a flash memory as a storage medium.
  • the minimum access (read, write) unit of the flash memory (FM chip 210) is a page (physical page).
  • the size of the physical page is, for example, 8 KB.
  • the SSD 21 according to the first embodiment manages the logical address space provided to the storage controller 10 by dividing the logical address space into areas of the same size as the physical page. An area having the same size as the physical page is called a “logical page”.
  • the SSD 21 according to the first embodiment maps one physical page to one logical page.
  • the SSD 21 manages each block in all the FM chips 210 with a unique identification number in the SSD 21, and this identification number is called a block number (block #).
  • Each physical page in a block is managed with a number that is unique within the block; this number is called a physical page number (physical page #).
  • By specifying the block # and the physical page #, a physical page in the SSD 21 is uniquely identified.
  • the SSD 21 manages each logical page in the SSD 21 by assigning a unique identification number in the SSD. This identification number is called a logical page number (logical page #).
  • the logical-physical conversion table 1100 stores information on block # and physical page # of a physical page mapped to a certain logical page for each logical page.
  • the logical-physical conversion table 1100 has columns of SSD LBA 1101, logical page # 1102, status 1103, block # 1104, and physical page # 1105, as shown in FIG. Each record of the logical-physical conversion table 1100 stores information about the logical page specified by the logical page # 1102.
  • the SSD LBA 1101 stores the LBA (range) on the logical address space provided by the SSD 21 to the storage controller 10 corresponding to the logical page.
  • the SSD 21 can convert the LBA included in the access request into a logical page # using the SSD LBA 1101 and the logical page # 1102.
  • In block # 1104 and physical page # 1105, information for specifying the physical page mapped to the logical page (that is, its block # and physical page #) is stored.
  • Status 1103 stores information indicating whether a physical page is mapped to a logical page. No physical page is mapped to the logical page of the SSD 21 in the initial state.
  • A physical page is mapped to a logical page when the logical page is first written by a write request.
  • When "allocated" is stored in the status 1103, it indicates that a physical page is mapped to the logical page.
  • When "unallocated" is stored in the status 1103, it means that no physical page is mapped to the logical page (at this time, NULL (an invalid value) is stored in the block # 1104 and physical page # 1105 corresponding to the logical page).
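  • The two-step translation described here (SSD LBA to logical page #, then logical page # to block # and physical page #) can be sketched as follows; the 8 KB page size matches the example in the text, while the mapping entries are illustrative assumptions.

```python
# Sketch of the logical-physical translation: an LBA offset is first converted
# to a logical page #, then the table gives the block # and physical page #.
# The mapping entries below are made-up example values.

PAGE_SIZE = 8 * 1024                      # logical page size in bytes (example from the text)

# logical page # -> (block #, physical page #); None means "unallocated" (NULL in the table)
logical_to_physical = {0: (3, 12), 1: (3, 13), 2: None}

def translate(byte_offset):
    logical_page = byte_offset // PAGE_SIZE
    mapping = logical_to_physical.get(logical_page)
    if mapping is None:
        return None                        # access to an unwritten logical page
    block, physical_page = mapping
    return block, physical_page

print(translate(9000))                     # offset 9000 falls in logical page 1 -> (3, 13)
```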
  • FIG. 12 is a diagram illustrating the configuration of the block management table.
  • the block management table 1150 is a table for managing the states of blocks and physical pages. Each record in the block management table 1150 stores information about a physical page in the SSD 21.
  • the block management table 1150 has columns of block # 1151, FM cell mode 1152, physical page # 1153, status 1154, and erase count 1155.
  • The block # 1151, physical page # 1153, and status 1154 correspond to the block # 1104, physical page # 1105, and status 1103 in the logical-physical conversion table 1100, respectively. That is, when a physical page is allocated to a logical page, the block # and physical page # of the allocated physical page are stored in the block # 1104 and physical page # 1105 of the logical-physical conversion table 1100, and "allocated" is stored in the status 1103. At the same time, "allocated" is also stored in the status 1154 of the allocated physical page in the block management table 1150.
  • When a block fails, the SSD 21 manages the block as an unusable block, and therefore stores "failure (blocked)" in the status 1154 of each physical page belonging to the block.
  • The FM cell mode 1152 is information indicating which mode the cells of the block are in, and stores either "MLC" or "TLC". The erase count 1155 stores the cumulative number of times the block has been erased.
  • FIG. 13 is a flowchart of the storage controller task.
  • the storage controller task is realized by the CPU 11 of the storage controller 10 executing a storage control program.
  • the storage controller 10 periodically executes this task process.
  • the storage controller 10 determines whether a read or write request has been received from the host computer 2 (S10). If no request has been received (S10: No), S20 is performed next.
  • the storage controller 10 determines whether this request is a read or a write (S40). If this request is a read (S40: Read), the storage controller 10 executes a read process (S50), and then executes S20. Details of the read process will be described later (see FIG. 15). If this request is a write (S40: write), the storage controller 10 executes a write process (S60), and then executes S20. Details of the write processing will be described later (see FIG. 14).
  • the storage controller 10 executes a drive monitoring process (S20), and then makes a determination in S30. Details of the drive monitoring process will be described later (FIG. 20).
  • the storage controller 10 determines whether or not a request for stopping the storage apparatus 1 has been received (S30). When the stop request has been received (S30: Yes), the storage controller 10 executes the stop process of the storage apparatus 1 and ends the process (END). When the stop request has not been received (S30: No), the storage controller 10 repeats the process from S10 again.
  • FIG. 14 is a flowchart of storage controller write processing. Similar to the storage controller task, the write processing is also realized by the CPU 11 of the storage controller 10 executing the storage control program.
  • the host 2 transmits a write request and write data to the storage controller 10 (S61).
  • When the storage controller 10 receives a write request from the host 2, it starts the write process.
  • The storage controller 10 refers to the virtual volume management table 500 and the pool management table 550 to determine whether a chunk has been allocated to the virtual chunk that includes the write destination address of the virtual volume specified by the write request (S62).
  • If no chunk has been allocated yet, the storage controller 10 allocates a chunk to the virtual chunk (S63) and then executes S64.
  • If a chunk has already been allocated, the storage controller 10 executes S64 without performing S63.
  • the storage controller 10 stores the write data in the cache and generates parity (S64).
  • the parity generated here is parity data of a parity stripe belonging to the same stripe line as the write data. Further, when it is necessary to read data or parity before update from the storage device (SSD 21) for parity generation, the processing is also performed.
  • the storage controller 10 transmits a write command and write data to the write destination storage device (S65).
  • the parity is also written to the storage device.
  • the storage controller 10 receives a write completion notification from the write destination storage device (S66).
  • the storage controller 10 transmits a completion response to the write request to the host 2 (S67).
  • the host 2 receives a completion response to the write request from the storage controller 10 (S68), and ends the processing (END).
  • the flow of the write process described here is an example, and each step may be executed in a different order.
  • For example, instead of transmitting the completion response to the host 2 after S66, the storage controller 10 may transmit the completion response to the host 2 immediately after storing the write data in the cache (S64), and transmit the write data and parity to the storage device afterwards.
  • the storage apparatus 1 can store the write data transmitted from the host 2.
  • FIG. 15 is a flowchart of the storage controller read processing. The read process is also realized by the CPU 11 of the storage controller 10 executing a storage control program.
  • the host 2 sends a read request to the storage controller 10 (S51).
  • When the storage controller 10 receives a read request from the host 2, it starts the read process. First, the storage controller 10 identifies the virtual chunk including the address of the virtual volume specified by the read request, identifies the chunk assigned to that virtual chunk, and identifies the read destination storage device from among the storage devices constituting the chunk (S52). The storage controller 10 transmits a read command to the identified storage device (S53), receives the read data from the storage device (S54), and stores the read data in the cache (S55). The storage controller 10 then transmits a completion response to the read request and the read data to the host 2 (S56). The host 2 receives the completion response and the read data from the storage controller 10 (S57), and the processing ends (END).
  • the storage apparatus 1 can respond with read data in response to a read request from the host 2.
  • FIG. 16 is a flowchart of the SSD controller task.
  • the SSD controller task is performed by the CPU 201 of the SSD controller 200 executing the SSD control program.
  • the SSD controller task is executed periodically.
  • each process will be described with the SSD controller 200 as the subject. However, unless otherwise specified, it means that each process is executed by the CPU 201.
  • The SSD controller 200 first performs processing for a read or write request from the storage controller 10 acting as the initiator (S100); details of the read/write processing will be described later (FIG. 17). Next, FM diagnosis processing for checking for block failures is performed (S120), then an FM depletion recovery process for restoring the spare area as necessary is executed (S140), and then the determination of S160 is performed. Details of the FM diagnosis process will be described later (FIG. 18), and the FM depletion recovery process will also be described later (FIG. 19).
  • the SSD controller 200 determines whether or not a request to stop the SSD 21 has been received from the storage controller 10. When the stop request has been received (S160: Yes), the SSD controller 200 executes the stop process of the SSD 21 and ends the process (END). When the stop request has not been received (S160: No), the SSD controller 200 repeats the process from S100 again.
  • the SSD 21 can store the write data transmitted from the storage controller 10 and read the read data. Further, the state of the FM chip 210 can be monitored, and the recovery process of the spare capacity can be executed according to the result.
  • FIG. 17 is a flowchart of data read / write processing of the SSD controller. Similar to the SSD controller task, the CPU 201 of the SSD controller 200 executes the SSD control program, thereby realizing data read / write processing.
  • the data access size specified by the access request (read request or write request) requested by the storage controller 10 to the SSD 21 is the size of one logical page. An example will be described in which the range on the logical address space specified by the access request matches the logical page boundary.
  • the SSD controller 200 determines whether a read or write request has been received from the storage controller 10 as an initiator (S101). If the request has not been received (S101: No), the SSD controller 200 ends this process. When the request from the storage controller 10 is received (S101: Yes), the SSD controller 200 determines the content of the request (S102).
  • If the request is a read command, the SSD controller 200 transfers data from the physical page storing the read target data to the buffer area based on the information in the logical-physical conversion table 1100 (S103). Next, the SSD controller 200 determines whether the data was read normally, based on the error detection result of the ECC circuit included in the FM chip I/F 203 (S104). If the data could not be read normally, the SSD controller 200 determines that it is in an uncorrectable error state in which correction by the ECC is impossible (S104: Yes), and proceeds to S105. On the other hand, if the data was read normally (S104: No), the SSD controller 200 transfers the data in the buffer area to the storage controller 10 (S117) and ends the series of processing.
  • In S105, the SSD controller 200 determines that the block to which the physical page storing the read target data belongs is in a failure state, and closes the block. Next, the SSD controller 200 performs data recovery processing for the block using the parity data (S106), and transfers the read target data, taken from the recovered data, to the storage controller 10 (S108). Next, in order to store the recovered data, the SSD controller 200 secures one block's worth of physical pages whose status 1154 is "unallocated", based on the information in the block management table 1150, and stores the recovered data in the secured block (S109).
  • After S109, the SSD controller 200 updates the various tables (the logical-physical conversion table 1100, the block management table 1150, and the configuration information management table 1300) (S110). In particular, in S110, the SSD controller 200 adds 1 to the number of blocks 1303 in the failure state in the configuration information management table 1300, and adds the capacity of one block to the block capacity 1304 in the failure state. When the closed block was in the MLC mode, the SSD controller 200 subtracts 1 from the number of blocks 1303 in the normal state (MLC mode) and subtracts the capacity of one block from the block capacity 1304 in the normal state (MLC mode).
  • Further, the SSD controller 200 updates (reduces) the spare capacity 1305. After S110, the SSD controller 200 ends the process.
  • When the request received in S102 is a write command (S102: write command), the SSD controller 200 first stores the write target data in the buffer area (S111). Then, based on the information in the block management table 1150, it identifies a physical page whose status 1154 is "unallocated" and stores the data in the buffer area in the identified physical page (S113). Furthermore, the SSD controller 200 generates the parity for the same stripe line as the physical page in which the data was stored, selects one unallocated physical page to store the generated parity, and writes the parity data to the selected physical page (S114). Thereafter, the SSD controller 200 returns a notification (response) of completion of the processing for the write command to the storage controller 10 (S115).
  • After S115, the SSD controller 200 performs S110.
  • the SSD controller 200 updates the logical-physical conversion table 1100 and the block management table 1150 in S110.
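  • Because a physical page cannot be overwritten, the write path above always picks an unallocated physical page and then repoints the logical page. A minimal sketch of that out-of-place write follows; the table structures are simplified stand-ins for tables 1100 and 1150, parity generation (S114) is omitted, and the "invalid" marking of the old page is a simplification used to show what reclamation later collects.

```python
# Sketch of the out-of-place write in S111-S113: pick an unallocated physical
# page, write the data there, then repoint the logical page. All structures
# and values are simplified, illustrative stand-ins.

flash = {}                                     # (block, page) -> data
page_status = {(0, 0): "unallocated", (0, 1): "unallocated", (0, 2): "allocated"}
logical_map = {7: (0, 2)}                      # logical page 7 currently lives at (0, 2)

def write_logical_page(logical_page, data):
    target = next(p for p, s in page_status.items() if s == "unallocated")
    flash[target] = data                       # program the new physical page
    page_status[target] = "allocated"
    old = logical_map.get(logical_page)
    logical_map[logical_page] = target         # update the logical-physical mapping
    if old is not None:
        page_status[old] = "invalid"           # old page becomes garbage for reclamation

write_logical_page(7, b"new data")
print(logical_map[7], page_status[(0, 2)])     # (0, 0) invalid
```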
  • FIG. 18 is a flowchart of the FM diagnosis process of the SSD controller. Similar to the SSD controller task, the CPU 201 of the SSD controller 200 executes the SSD control program, thereby realizing FM diagnosis processing.
  • First, the SSD controller 200 determines whether there is a block that needs to be diagnosed (S121). For example, this processing may be performed periodically at specific time intervals (cycles), when a specific command such as an execution instruction is received from the storage controller 10, or when a specific event occurs, such as the number of block erasures or the number of page accesses reaching N.
  • If there is no block that needs to be diagnosed, the SSD controller 200 ends the process (END).
  • If there is such a block, the SSD controller 200 selects the block to be inspected (S122) and reads the data of the pages in that block (S123). Thereafter, S124 is performed. In S123, the SSD controller 200 does not need to read all pages; for example, it may read only the physical pages allocated to logical pages, or only specific physical pages such as those with even or odd physical page numbers.
  • the SSD controller 200 determines whether there is a page in which an uncorrectable error has occurred due to a sudden hardware failure or the like. If there is a page where an uncorrectable error has occurred (S124: Yes), the SSD controller 200 closes the block (S127). Next, the parity data is used to recover the data in the block (S128), and the recovered data is stored in another block (S129). Thereafter, various tables are updated (S130). The processing of S127 to S130 is the same as S105 to S110 except that data transmission to the storage controller 10 (S108) is not performed. Thereafter, S121 is performed again.
  • If there is no page in which an uncorrectable error has occurred (S124: No), the SSD controller 200 executes S125.
  • In S125, the SSD controller 200 determines whether the number of error bits that occurred on the inspection target page is greater than a predetermined threshold. If it is not greater than the threshold (S125: No), the SSD controller 200 performs the process from S121 again. On the other hand, if the number of error bits is greater than the threshold (S125: Yes), the SSD controller 200 performs a refresh process on the block (S126), and then executes S130.
  • "Refresh" means a process of reading the data stored in a physical page (or block) and moving it to another physical page (or block). Therefore, in S130, the logical-physical conversion table 1100 and the block management table 1150 are updated. However, when S130 is executed after S126, no block has been closed, so the configuration information management table 1300 is not updated.
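  • The per-block branching of the diagnosis flow (uncorrectable error leads to closing and rebuilding the block; too many correctable error bits leads to a refresh) can be summarized in the sketch below; the threshold value, block model, and helper names are illustrative assumptions.

```python
# Sketch of the FM diagnosis decision per block (S124-S126): close and rebuild
# on an uncorrectable error, refresh when correctable error bits exceed a
# threshold. The block model and ERROR_BIT_THRESHOLD are illustrative.

ERROR_BIT_THRESHOLD = 8

def diagnose_block(block):
    """block: dict with 'uncorrectable' (bool) and 'error_bits' (max per page)."""
    if block["uncorrectable"]:
        return "close_and_recover"      # S127-S130: close block, rebuild data from parity
    if block["error_bits"] > ERROR_BIT_THRESHOLD:
        return "refresh"                # S126: move the data to another block
    return "ok"

print(diagnose_block({"uncorrectable": False, "error_bits": 12}))  # refresh
print(diagnose_block({"uncorrectable": True, "error_bits": 0}))    # close_and_recover
```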
  • FIG. 19 is a flowchart of the FM depletion recovery process of the SSD controller. Similar to the SSD controller task, the CPU 201 of the SSD controller 200 executes the SSD control program, thereby realizing FM depletion recovery processing.
  • First, the SSD controller 200 refers to the configuration information management table 1300 to determine whether the spare capacity is close to exhaustion (S141). Specifically, when the value of the spare capacity 1305 is below the FM mode change threshold 1306, the SSD controller 200 determines that the spare capacity is close to exhaustion. If it is determined in S141 that exhaustion is not near (S141: No), the process ends.
  • If exhaustion is near (S141: Yes), the SSD controller 200 determines, based on the block management table 1150, whether there is a block whose cell mode can be changed (S142). If the blocks whose status 1154 is not "failure" all have an FM cell mode 1152 of "TLC" (S142: No), the SSD controller 200 determines that there are no cells whose mode can be changed, and ends the process. On the other hand, if there is a block whose status 1154 is not "failure" and whose FM cell mode 1152 is "MLC" (S142: Yes), the SSD controller 200 determines that there are cells whose mode can be changed, and executes S143.
  • in S143 the SSD controller 200 expands the physical capacity by changing the cell mode of some blocks from MLC to TLC.
  • the SSD controller 200 may determine the number of blocks whose cell mode is changed arbitrarily; however, at least enough blocks are changed from MLC mode to TLC mode for the reserve capacity 1305 to exceed the FM mode change threshold 1306.
  • the cell mode of a block in use (a block having a physical page assigned to a logical page) may also be changed. In that case, the SSD controller 200 first moves the data of the physical pages allocated to logical pages to unused physical pages, and then changes the cell mode. Because this move changes the mapping between logical and physical pages, the logical-physical conversion table 1100, the block management table 1150, and so on are updated (in S144, described later).
  • the SSD controller 200 updates the configuration information management table 1300 and, if necessary, updates the logical-physical conversion table 1100 and the block management table 1150 (S144). Further, the SSD controller 200 notifies the storage controller 10 that the FM cell mode has been changed and that the cause is a block failure (S145), and ends this processing.
  • the notification transmitted from the SSD 21 to the storage controller 10 in S145 is referred to as “mode change notification”.
  • the mode change notification includes the information in the configuration information management table 1300, so the storage controller 10 can grasp the state of the SSD 21 in detail by receiving it. A sketch of this recovery flow follows below.
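  • The following sketch illustrates one possible shape of S141–S145, assuming simplified per-block capacities and a callback for the notification of S145; the dictionary fields only loosely mirror the configuration information management table 1300 and the block management table 1150.

```python
# Illustrative sketch of the FM depletion recovery flow (S141-S145).
# Field names and per-block capacities are assumptions, not the actual tables.
MLC_BLOCK_CAPACITY = 2  # arbitrary units per block in MLC mode
TLC_BLOCK_CAPACITY = 3  # the same block holds more when operated in TLC mode

def fm_depletion_recovery(config, blocks, notify_storage_controller):
    # S141: is the spare (reserve) capacity nearly exhausted?
    if config["reserve_capacity"] >= config["fm_mode_change_threshold"]:
        return
    # S142: is there a non-failed MLC block left to convert?
    candidates = [b for b in blocks
                  if b["status"] != "failure" and b["cell_mode"] == "MLC"]
    if not candidates:
        return
    # S143: convert just enough blocks for the reserve to exceed the threshold.
    # (In-use blocks would first have their valid pages moved elsewhere; omitted.)
    for block in candidates:
        block["cell_mode"] = "TLC"
        config["reserve_capacity"] += TLC_BLOCK_CAPACITY - MLC_BLOCK_CAPACITY
        if config["reserve_capacity"] >= config["fm_mode_change_threshold"]:
            break
    # S144: update the management tables (represented here by config/blocks).
    # S145: tell the storage controller that the cell mode changed due to a block failure.
    notify_storage_controller({"event": "mode_change", "cause": "block_failure",
                               "config": dict(config)})
```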
  • FIG. 20 is a flowchart of storage controller drive monitoring processing. Similar to the storage controller task, the drive monitoring process is also realized by the CPU 11 of the storage controller 10 executing the storage control program.
  • Storage controller 10 determines whether a mode change notification has been received from SSD controller 200 (S21). If no mode change notification has been received (S21: No), this process ends. If a mode change notification has been received (S21: Yes), the storage controller 10 determines whether there is room for expanding the physical capacity (S22).
  • when S22 is performed, the storage controller 10 has received the information of the configuration information management table 1300 from the SSD 21 (SSD controller 200) together with the mode change notification. The storage controller 10 therefore uses this information to determine whether there is room for expanding the physical capacity, that is, whether the number of normal-state (MLC mode) blocks 1303 is greater than 0. If a normal block operating in MLC mode remains (S22: Yes), it determines that there is still room for expansion and ends this process. On the other hand, if it determines in S22 that there is no room for expansion (S22: No), the storage controller 10 next executes S23.
  • in S23 the storage controller 10 judges that the relevant SSD 21 has no means left for avoiding depletion of its spare capacity, so a further block failure would very likely exhaust the spare capacity completely; it therefore copies the data to a spare drive and blocks the relevant SSD.
  • with known RAID technology it is possible to execute access request processing for a drive and data movement to a spare drive in parallel (even when these two processes run concurrently, control can be applied so that data corruption and access to incorrect data are prevented).
  • an access request (read or write request) to the SSD 21 may be accepted in parallel with the processing of S23. Thereafter, this process is terminated.
  • as a modification of S22, the storage controller 10 may instead determine whether the number of normal-state (MLC mode) blocks 1303 is larger than a predetermined threshold. Alternatively, when the determination of S21 is affirmative (a mode change notification has been received), the storage controller 10 may always execute S23 without performing S22, because an SSD that issues a mode change notification is likely to become unusable soon as its spare capacity is exhausted. A sketch of this monitoring flow follows below.
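  • A minimal sketch of S21–S23, assuming a simple dictionary-shaped notification and a callback that copies the data to the spare drive and blocks the SSD; the optional threshold parameter corresponds to the modification of S22 mentioned above.

```python
# Hypothetical sketch of drive monitoring (S21-S23) on the storage controller side.
def drive_monitoring(notification, copy_to_spare_and_block, mlc_block_threshold=0):
    if notification is None:                       # S21: no mode change notification
        return
    config = notification["config"]
    # S22: is there still room to expand? (normal MLC-mode blocks remaining)
    if config.get("normal_mlc_blocks", 0) > mlc_block_threshold:
        return
    # S23: no headroom left - copy the SSD's data to a spare drive and block the SSD.
    # Host reads/writes may continue in parallel under RAID control.
    copy_to_spare_and_block(notification["ssd_id"])
```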
  • as described above, in the storage device (SSD) and storage apparatus according to the present embodiment, when the reserve capacity decreases and falls below a predetermined threshold, some of the blocks operating in MLC mode (more generally, a mode in which each cell can store n-bit data) are changed to operate in TLC mode (a mode in which each cell can store m-bit data, where m > n), thereby increasing the reserve capacity. This prevents the situation in which the spare capacity of the SSD is depleted and the SSD can no longer be used.
  • furthermore, when the SSD according to the present embodiment changes some of the blocks operated in MLC mode to TLC mode, it transmits a mode change notification to the storage controller of the storage apparatus. On receiving this notification, the storage controller moves the data of the SSD that issued it to a spare drive. An SSD that issues a mode change notification, that is, an SSD that has already changed the mode of its cells, has little spare capacity left and is highly likely to become unusable if a further failure occurs. Because the storage controller moves the data of such an SSD to the spare drive, a situation in which the data becomes inaccessible after a further failure is prevented.
  • alternatively, the storage controller could move the data to the spare drive immediately, without the SSD changing the cell mode. In that case, however, if a further failure occurred, the SSD might become unusable before the data could be moved to the spare drive. Therefore, as explained in the present embodiment, moving the data to the spare drive after the SSD has first changed the cell mode and secured a certain amount of spare capacity allows the move to be performed more reliably.
  • next, a second embodiment will be described. The configuration of the storage apparatus according to the second embodiment is the same as that described in the first embodiment.
  • the hardware configuration of the SSD according to the second embodiment is also almost the same as that described in the first embodiment. Therefore, the components of the storage apparatus and the SSD are referred to using the same terms (and reference numbers) as in the first embodiment.
  • one difference is that, while the SSD according to the first embodiment does not necessarily include the compression/decompression circuit 207, the SSD according to the second embodiment always includes the compression/decompression circuit 207 or an equivalent function.
  • compression means a process of reducing the data size while maintaining the meaning of the data using a reversible compression algorithm such as the LZW algorithm.
  • when the SSD 21 receives write data from the storage controller 10, it compresses the data using the compression/decompression circuit 207 and stores the compressed data in the FM 210.
  • Data compressed by the compression / decompression circuit 207 is referred to as “compressed data”.
  • when the SSD 21 receives a read request from the storage controller 10, it reads the compressed data from the FM 210, decompresses it using the compression/decompression circuit 207, and returns the data to the storage controller 10. That is, in the storage apparatus according to the present embodiment, data compression and decompression are performed transparently to the host and the storage controller 10.
  • the compression rate in the present embodiment is defined as follows: when data of size x is compressed into data of size y, the compression rate is the value obtained by calculating “y ÷ x”.
  • the smaller the data size after compression, the smaller the value of the compression rate (it takes a value close to 0).
  • the larger the data size after compression, the larger the value of the compression rate (it takes a value close to 1). Therefore, in this embodiment a “small (low) compression rate” means that compression efficiency is good and the data size after compression is small, whereas a “large (high) compression rate” means that compression efficiency is poor and the data size after compression is not much reduced.
  • FIG. 21 is a diagram showing an outline of the second embodiment. Similar to FIG. 1, the change in the internal capacity of the SSD is expressed in time series. At time t1, the physical capacity 1021 is the size of the physical storage area required for allocation to the logical address space, and the physical capacity 1022 is a spare capacity. At time t1, the physical area corresponding to the physical capacities 1021 and 1022 is constructed in the MLC mode.
  • the logical capacity is determined in expectation that the size of data stored in the FM is reduced. Therefore, in the initial state, the logical capacity 1020 of the SSD is set to a value larger than the physical capacity 1021. This is because when the data actually stored in the physical capacity is reduced, an amount of data exceeding the physical capacity can be stored in the SSD.
  • the logical capacity is determined on the basis of the average compression rate obtained when typical data is compressed by the compression/decompression circuit 207 (referred to in this specification as the “assumed compression rate”). The assumed compression rate is a predetermined constant.
  • in the initial state, if the assumed compression rate of the data is a, the logical capacity 1020 is equal to the value obtained by calculating the physical capacity 1021 ÷ a.
  • the state at time t1 is a state in which data is written in the entire logical address space (or almost the entire region) and the compression rate of each data is the assumed compression rate (a). At this time, an amount of physical area corresponding to the physical capacity 1021 is mapped to the entire logical address space.
  • the logical capacity may also be determined based on an index other than the average compression rate. For example, it may be determined on the assumption that all data is compressed at the minimum compression rate b achievable by the compression/decompression circuit 207; in that case the logical capacity is the physical capacity 1021 ÷ b. If b is 0.25, for example, the logical capacity is four times the physical capacity 1021. A small sizing example follows below.
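  • A small sizing example, with made-up capacities, assuming the compression-rate definition given above:

```python
# Illustrative logical-capacity sizing (values are made up).
physical_capacity = 800           # e.g. GiB usable for the logical address space
assumed_compression_rate = 0.5    # a: compressed size / original size
logical_capacity = physical_capacity / assumed_compression_rate
print(logical_capacity)           # 1600.0 -> logical space twice the physical space

min_compression_rate = 0.25       # b: best rate the compression circuit can achieve
print(physical_capacity / min_compression_rate)  # 3200.0 -> four times the physical space
```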
  • time t2 represents the case in which the data already stored at t1 has been overwritten with update data whose pattern is more difficult to compress (that is, the data compression rate has deteriorated).
  • when the data compression rate deteriorates, the data size after compression increases.
  • as a result, the physical capacity required to store the user data that was stored at t1 increases from the physical capacity 1021 to the physical capacity 1024.
  • correspondingly, the physical capacity 1022 reserved as the spare capacity is reduced to the physical capacity 1025.
  • FIG. 26 is a configuration example of the logical-physical conversion table 1100 ′ included in the SSD 21 according to the second embodiment.
  • the SSD 21 according to the second embodiment includes a logical-physical conversion table 1100 'instead of the logical-physical conversion table 1100 (FIG. 11) included in the SSD 21 according to the first embodiment.
  • of the logical-physical conversion table 1100′, the columns LBA 1101 through physical page # 1105 are the same as those of the logical-physical conversion table 1100 described in the first embodiment.
  • the logical / physical conversion table 1100 ′ further includes columns of Offset 1106 and Length 1107.
  • Offset 1106 represents a relative address (offset) where the top of the physical page is 0, and Length 1107 represents the length of the area.
  • the unit of the value stored in Offset 1106 and Length 1107 is a byte.
  • each row (record) of the logical-physical conversion table 1100′ indicates that, to the logical page identified by logical page # 1102, an area of Length 1107 (bytes) starting at the position of Offset 1106 (bytes) within the physical page identified by block # 1104 and physical page # 1105 is allocated. A data-structure sketch follows below.
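  • The record layout can be pictured with the following mock-up; it is only an illustration of the mapping described above, not the on-device format, and the example numbers are invented.

```python
# Mock-up of one record of the logical-physical conversion table 1100'
# (byte-granular mapping of a logical page onto part of a physical page).
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogicalPageMapping:
    lba: int                           # start LBA of the logical page (LBA 1101)
    logical_page_no: int               # logical page # 1102
    block_no: Optional[int]            # block # 1104 (None if unallocated)
    physical_page_no: Optional[int]    # physical page # 1105
    offset: Optional[int] = None       # Offset 1106: bytes from the top of the physical page
    length: Optional[int] = None       # Length 1107: bytes occupied by the compressed data

# Example: a logical page whose compressed form occupies 5,000 bytes starting
# 8,192 bytes into physical page 3 of block 42 (all numbers illustrative).
row = LogicalPageMapping(lba=0x0000, logical_page_no=0,
                         block_no=42, physical_page_no=3,
                         offset=8192, length=5000)
```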
  • FIG. 27 is a configuration example of the block management table 1150 ′ included in the SSD 21 according to the second embodiment.
  • the SSD 21 according to the second embodiment has a block management table 1150′ instead of the block management table 1150 (FIG. 12) included in the SSD 21 according to the first embodiment.
  • the block management table 1150 ′ has columns of Offset 1156 and Length 1157 in addition to the columns of the block management table 1150 in the first embodiment.
  • Offset 1156 and Length 1157 are the same information as Offset 1106 and Length 1107 of the logical-physical conversion table 1100 ′.
  • Offset 1156 represents the relative address (offset) when the top position of the physical page is 0, and Length 1157 represents the length of the area.
  • the unit of the value stored in Offset 1156 and Length 1157 is a byte.
  • each row of the block management table 1150′ indicates that, within the physical page identified by block # 1151 and physical page # 1153, the area specified by Offset 1156 and Length 1157 is allocated to a logical page.
  • the difference between the configuration information management table 1300 in the second embodiment and the first embodiment will be described. Since the format of the configuration information management table 1300 included in the SSD 21 according to the second embodiment is the same as that described in the first embodiment, the illustration is omitted. However, the definition of the reserve capacity 1305 in the second embodiment is different from that described in the first embodiment.
  • let M be the capacity of the normal (MLC mode) blocks, T the capacity of the normal (TLC mode) blocks, and L the logical capacity.
  • in the second embodiment, the reserve capacity 1305 is the value calculated by the following equation (2).
  • Reserve capacity 1305 = Min(M + T − L × a, M + T − A) … (2), where a is the assumed compression rate, A is the total capacity of the physical pages allocated to the logical address space, and Min(α, β) denotes the smaller of α and β.
  • the SSD 21 according to the second embodiment secures about 30% to 40% of the physical area of the SSD as spare capacity in the initial state.
  • in the initial state, A is 0, so the reserve capacity is equal to “M + T − L × a”.
  • when a block fails, the value of M (or T) decreases, so the value of “M + T − L × a”, that is, the reserve capacity 1305, decreases. A computation example of equation (2) follows below.
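  • The following sketch simply evaluates equation (2); the sample values are arbitrary and only show how a block failure reduces the reserve capacity.

```python
# Reserve capacity per equation (2): Min(M + T - L*a, M + T - A).
def reserve_capacity(m_mlc, t_tlc, logical_capacity, assumed_rate, allocated_physical):
    return min(m_mlc + t_tlc - logical_capacity * assumed_rate,
               m_mlc + t_tlc - allocated_physical)

# Initial state: A = 0, so the first term decides the reserve.
print(reserve_capacity(m_mlc=700, t_tlc=300, logical_capacity=1600,
                       assumed_rate=0.5, allocated_physical=0))      # 200.0

# After some MLC blocks fail, M drops and the reserve shrinks with it.
print(reserve_capacity(m_mlc=650, t_tlc=300, logical_capacity=1600,
                       assumed_rate=0.5, allocated_physical=0))      # 150.0
```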
  • FIG. 22 is a flowchart of the data read / write processing of the SSD controller according to the second embodiment.
  • the main differences from FIG. 17 are that a data compression process is added before the data is stored in the FM in the write processing (S112), and that the data is decompressed before being transferred to the storage controller 10 in the read processing (S107 and S116).
  • when the management tables are updated, the Offset 1106, Length 1107, Offset 1156, and Length 1157 columns are also updated.
  • when the spare capacity 1305 is updated, the SSD controller 200 calculates it based on equation (2) described above.
  • the compression rate of the SSD 21 may fluctuate. Therefore, in the management information update process (S110) performed after S115, the SSD controller 200 obtains the data compression rate and updates the content of the average data compression rate 1307.
  • the compression rate can be obtained, for example, by the following calculation. Let X be the number of logical pages in the logical address space to which physical pages are allocated, and Y be the number of physical pages allocated to logical pages. Then “Y ÷ X” is the compression rate. Both X and Y can be counted by referring to the logical-physical conversion table 1100′.
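  • As a toy illustration of this counting, assuming each mapping record simply carries a (block #, physical page #) reference or None when unallocated:

```python
# Toy example of deriving the compression rate from the logical-physical table:
# X = logical pages with a physical page allocated, Y = distinct physical pages used.
mappings = [  # (logical page #, (block #, physical page #) or None) - illustrative only
    (0, (42, 3)), (1, (42, 3)), (2, (42, 4)), (3, None),
]
x = sum(1 for _, phys in mappings if phys is not None)
y = len({phys for _, phys in mappings if phys is not None})
print(y / x)   # 2 physical pages serving 3 logical pages -> compression rate ~0.67
```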
  • the process illustrated in FIG. 22 is an example, and may be executed by a procedure other than this.
  • in the flow of FIG. 22, compressed data is written to a physical page each time a write command is received, so an unused area (an area in which no received data has been written) remains in the latter half of the physical page. Since the minimum write unit of the FM 210 is a physical page, data cannot be written into this unused area later. Therefore, the SSD controller 200 may instead accumulate the compressed write data in a buffer area and perform S115 (returning the completion notification) before executing S113 and S114, and then execute S113 and S114 once one physical page or more of compressed data has accumulated in the buffer area. In this way, a large amount of compressed data can be stored in one physical page. A sketch of this buffered write path follows below.
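  • The buffering idea can be sketched as follows, assuming a 16 KiB physical page and zlib as a stand-in compressor; power-loss protection of the buffered data and the page-mapping updates are omitted.

```python
# Illustrative sketch of buffering compressed write data until a full
# physical page can be written (page size and compressor are assumptions).
import os
import zlib

PHYSICAL_PAGE_SIZE = 16 * 1024  # bytes, assumed

class CompressedWriteBuffer:
    def __init__(self, write_physical_page):
        self._buf = bytearray()
        self._write_physical_page = write_physical_page  # writes exactly one page

    def write(self, data: bytes) -> None:
        # Compress and buffer; the real flow would return the completion
        # notification (S115) here, before the FM write (S113/S114) happens.
        self._buf += zlib.compress(data)
        while len(self._buf) >= PHYSICAL_PAGE_SIZE:       # flush only full pages
            page = bytes(self._buf[:PHYSICAL_PAGE_SIZE])
            self._write_physical_page(page)
            del self._buf[:PHYSICAL_PAGE_SIZE]

pages_written = []
buf = CompressedWriteBuffer(pages_written.append)
for _ in range(8):
    buf.write(os.urandom(8 * 1024))     # poorly compressible data, ~64 KiB in total
print(len(pages_written), "full physical page(s) written")  # typically 4; the rest stays buffered
```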
  • FIG. 23 is a flowchart of the FM depletion recovery process of the SSD controller according to the second embodiment.
  • in the second embodiment, deterioration of the compression rate of the stored data is an additional trigger for the spare capacity recovery process performed by changing the cell mode, besides the block failures described in the first embodiment.
  • the difference between the FM depletion recovery process according to the second embodiment and the process described in the first embodiment (FIG. 19) is that, after the FM cell mode is changed and the internal information is updated, a step (S146) for determining the cause of the spare capacity shortage is added.
  • in S146 the SSD controller 200 refers to the average data compression rate 1307 and the number of failed blocks 1303 (or the failed block capacity 1304) in the configuration information management table 1300 and determines whether the spare capacity was exhausted because the data compression rate deteriorated or because the number of failed blocks increased. If the cause is deterioration of the data compression rate (S146: deterioration of the data compression rate), the SSD controller 200 notifies the storage controller 10 to that effect (S147). One possible form of this decision is sketched below.
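  • One purely illustrative form of the S146 decision follows; the threshold and field names are assumptions, since the embodiment does not prescribe how the two causes are weighed.

```python
# Hypothetical cause analysis for S146/S147 (threshold and field names assumed).
def analyze_spare_shortage(config, compression_rate_limit=0.7):
    if config["average_data_compression_rate"] > compression_rate_limit:
        return "compression_rate_deterioration"   # S147: report this to the storage controller
    if config["failed_block_count"] > 0:
        return "block_failure"
    return "unknown"

print(analyze_spare_shortage({"average_data_compression_rate": 0.85,
                              "failed_block_count": 1}))
```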
  • when the storage controller 10 detects, in the drive monitoring process, an SSD whose cell mode was changed because the data compression rate deteriorated, it performs a capacity rebalancing process that moves part of the data of that SSD to another SSD, in order to avoid depletion of the spare capacity due to further deterioration of the data compression rate.
  • the outline of capacity rebalancing will be described with reference to FIG. 24.
  • capacity rebalancing is performed by moving data in units of chunks, not in units of SSDs.
  • RG 30-1 is the RAID group to which the SSD that changed its cell mode belongs, and it stores a large amount of data.
  • the RG 30-2 is a RAID group with a small data storage amount. Therefore, the storage controller 10 reduces the data storage amount of the RG 30-1 by moving the data of the chunk 31 belonging to the RG 30-1 to the chunk 32 belonging to the RG 30-2. Along with this, a reduction in the data storage amount of the SSD belonging to RG 30-1 is realized. Note that the above-described chunk movement is performed transparently to the host 2 by the capacity virtualization function of the storage controller.
  • FIG. 25 is a flowchart of the drive monitoring process of the storage controller in the second embodiment.
  • the storage controller 10 determines whether a cell mode change notification has been received from the SSD controller 200 (S21). If no notification has been received (S21: No), this process ends. If a notification has been received (S21: Yes), the storage controller 10 analyzes its cause (S24). If the cause is a block failure (S24: block failure), the storage controller 10 executes S22 and S23 and ends the process after S23. Since the processing contents of S22 and S23 have already been described in the first embodiment, their description is omitted here.
  • if the cause is deterioration of the data compression rate, the storage controller 10 determines whether capacity rebalancing can be executed (S25). More specifically, in S25 the storage controller 10 refers to the pool management table 550 and determines whether a RAID group with sufficient capacity (a RAID group whose remaining capacity 556 is equal to or greater than a predetermined threshold) exists. If another RAID group with sufficient capacity exists (S25: Yes), the storage controller 10 moves some chunks of data to that RAID group (S26) and ends this process.
  • if no RAID group with sufficient capacity exists (S25: No), the storage controller 10 displays a notification on the display screen of the management host 5 requesting that capacity be added to the pool, prompting the user to add a storage area (S27). After the user adds a storage area (RAID group) to the pool, the storage controller 10 moves some chunks of data into the pool to which the storage area has been added (S28). After S28, this process ends. A sketch of this monitoring flow follows below.
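  • A hedged sketch of this second-embodiment monitoring flow (S21, S24–S28) follows; the notification format, the pool and RAID-group structures, the remaining-capacity threshold, and the callbacks for chunk movement and user notification are all assumptions standing in for the storage control program's real interfaces.

```python
# Illustrative sketch of drive monitoring in the second embodiment (S21, S24-S28).
def drive_monitoring_v2(notification, pool, actions, remaining_capacity_threshold=100):
    if notification is None:                                  # S21
        return
    cause = notification["cause"]                             # S24
    if cause == "block_failure":
        # Same handling as in the first embodiment (S22/S23).
        if notification["config"].get("normal_mlc_blocks", 0) == 0:
            actions["copy_to_spare_and_block"](notification["ssd_id"])
        return
    # Cause is deterioration of the data compression rate.
    roomy_groups = [rg for rg in pool["raid_groups"]          # S25: pool management table 550
                    if rg["remaining_capacity"] >= remaining_capacity_threshold
                    and rg["id"] != notification["raid_group_id"]]
    if roomy_groups:
        actions["move_chunks"](notification["raid_group_id"], roomy_groups[0]["id"])  # S26
    else:
        actions["request_capacity_add"]()                     # S27: prompt the administrator
        # S28 would move chunks after the user adds a RAID group to the pool.
```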
  • in the above description, only the physical capacity is increased when the cell mode is changed; however, the logical capacity may be increased at the same time.
  • in the example above, the SSD compresses data using the compression/decompression circuit 207 to reduce the amount of data stored in the FM. Instead, the CPU 201 may be configured to perform the data compression and decompression in software using a lossless compression algorithm.
  • the amount of data stored in the FM may also be reduced by performing deduplication. Deduplication is a technique that reduces the amount of data by searching the entire storage area for identical data and deleting all but one copy, and it can be regarded as a data compression process in a broad sense; in that case, deterioration of the data compression rate should be read as deterioration of the deduplication rate.
  • the SSD may also be configured to further reduce the amount of data stored in the FM by using both the deduplication process and compression with a lossless compression algorithm.
  • Each program (the storage control program and the SSD control program) that causes a CPU to execute the processing described above may be provided by a program distribution server or on a computer-readable storage medium and installed in each device that executes the program.
  • the computer-readable storage medium is a non-transitory medium, for example a non-volatile storage medium such as an IC card, an SD card, or a DVD.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

According to one embodiment, a storage device comprises a device controller that provides a logical storage space of a predetermined size to a storage controller, and a non-volatile semiconductor storage medium having a plurality of blocks, which are the units of data erasure. Each block is configured so that the cells in the block can be switched from operating in a first mode, in which the cells can store n-bit information, to operating in a second mode, in which the cells can store m-bit information (where n < m). The device controller manages, as reserve capacity, the portion of the combined available storage area of the plurality of blocks other than the storage area required for allocation to the logical storage space, and, if the reserve capacity falls below a predetermined threshold, the device controller switches some of the blocks from operation in the first mode to operation in the second mode, thereby increasing the available storage areas.
PCT/JP2016/069077 2016-06-28 2016-06-28 Dispositif et équipement de mémoire WO2018002999A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/069077 WO2018002999A1 (fr) 2016-06-28 2016-06-28 Dispositif et équipement de mémoire

Publications (1)

Publication Number Publication Date
WO2018002999A1 true WO2018002999A1 (fr) 2018-01-04

Family

ID=60787096

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/069077 WO2018002999A1 (fr) 2016-06-28 2016-06-28 Dispositif et équipement de mémoire

Country Status (1)

Country Link
WO (1) WO2018002999A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162267A (zh) * 2018-02-13 2019-08-23 点序科技股份有限公司 快闪存储器存储装置与写入管理方法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009090731A1 (fr) * 2008-01-16 2009-07-23 Fujitsu Limited Dispositif de mémoire à semi-conducteurs, appareil de commande et procédé de commande
JP2011003111A (ja) * 2009-06-22 2011-01-06 Hitachi Ltd フラッシュメモリを用いたストレージシステムの管理方法及び計算機
JP2012068986A (ja) * 2010-09-24 2012-04-05 Toshiba Corp メモリシステム

Similar Documents

Publication Publication Date Title
JP6073471B2 (ja) ストレージ装置
JP6283771B2 (ja) ストレージ装置
JP6381529B2 (ja) ストレージ装置および記憶制御方法
JP5937697B2 (ja) ストレージシステム
JP5192352B2 (ja) 記憶装置及びデータ格納領域管理方法
JP6000376B2 (ja) 特性の異なる複数種類のキャッシュメモリを有する情報処理装置
JP6216897B2 (ja) ストレージシステム
JP6062060B2 (ja) ストレージ装置、ストレージシステム、及びストレージ装置制御方法
WO2014141411A1 (fr) Système et procédé de stockage pour la commande d&#39;un système de stockage
US20140189203A1 (en) Storage apparatus and storage control method
WO2011024239A1 (fr) Système de stockage comprenant une pluralité d&#39;ensembles flash
WO2016181528A1 (fr) Dispositif de stockage
JP6817340B2 (ja) 計算機
US10013322B2 (en) Storage apparatus and storage apparatus control method
WO2017109931A1 (fr) Système informatique
CN111124950A (zh) 数据管理装置、数据管理方法和存储介质
JP2017199043A (ja) ストレージ装置とシステム及び方法とプログラム
WO2018002999A1 (fr) Dispositif et équipement de mémoire
JP6605762B2 (ja) 記憶ドライブの故障により消失したデータを復元する装置
JP6163588B2 (ja) ストレージシステム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16907232

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16907232

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP