
Novel cache memory structure and method

Info

Publication number
WO1997049037A1
Authority
WO
WIPO (PCT)
Prior art keywords
cache
data
cache memory
lru
entry
Application number
PCT/US1997/010155
Other languages
English (en)
Inventor
Marvin Lautzenheiser
Original Assignee
Zitel Corporation
Application filed by Zitel Corporation
Priority to AU34839/97A
Publication of WO1997049037A1


Classifications

    • GPHYSICS; G06 COMPUTING OR CALCULATING; COUNTING; G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/128: Replacement control using replacement algorithms adapted to multidimensional cache systems, e.g. set-associative, multicache, multiset or multilevel
    • G06F 12/0866: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches; for peripheral storage systems, e.g. disk cache
    • G06F 12/123: Replacement control using replacement algorithms with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list
    • G06F 2212/312: Providing disk cache in a specific location of a storage system; in storage controller

Definitions

  • This invention relates to a high performance computer data storage device utilizing a combination of solid state storage and one or more mass memories, such as a rotating magnetic disk device.
  • A typical caching system uses a single solid state memory unit as a holding area for a string of magnetic disks, allowing certain information to be stored in a high speed cache memory and thereby increasing performance as compared to the use solely of lower speed disk memories; that is, for some percentage of accesses the data is contained in the high speed cache memory, allowing faster access than when that data is stored only on a disk drive.
  • Host computer 101 communicates with the entire string 102 of disks 102-1 through 102-N via cache unit 103 and host interface 104, such as a Small Computer Systems Interface (SCSI). All data going to or from disk string 102 passes through the cache-to-disk data path consisting of host interface 104, cache unit 103, and disk interface 105.
  • Cache unit 103 manages the caching of data and services requests from host computer 101.
  • Major components of cache unit 103 include microprocessor 103-1, cache management hardware 103-2, cache management firmware 103-3, address lookup table 103-4, and solid state cache memory 103-5.
  • The prior art cache system of Figure 1 is intended to hold frequently accessed data in a solid state memory area so as to give more rapid access to that data than would be achieved if the same data were accessed from the disk media.
  • Such cache systems are quite effective when attached to certain host computers and under certain work loads.
  • The single cache memory 103-5 is used in conjunction with all disks in disk string 102. Data from any of the disks may reside in cache memory 103-5 at any given time. The most frequently accessed data is given precedence for caching regardless of the disk drive on which it resides.
  • The determination of whether or not the data is in cache memory 103-5, and the location of that data in cache memory 103-5, is usually via hashing schemes and table search operations. Hashing schemes and table searches can introduce time delays of their own which can defeat the purpose of the cache unit itself. Performance is very sensitive to cache-hit rates. Due to caching overhead and queuing times, a low hit rate in a typical string oriented cache system can result in overall performance that is poorer than that of an equivalently configured uncached string of disks.
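To make that overhead concrete, the following C sketch (not taken from any particular prior-art controller) shows the kind of hash-and-probe cache directory lookup that string-oriented cache units typically use; the bucket count, structure fields, and function names are illustrative assumptions. Each lookup may have to walk a collision chain before the controller even knows whether the requested block is cached, which is the table-search delay described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative prior-art style cache directory: hypothetical names and sizes. */
#define HASH_BUCKETS 4096

struct dir_entry {
    uint32_t disk_block;        /* block address on disk          */
    uint32_t cache_frame;       /* location of the data in cache  */
    struct dir_entry *next;     /* collision chain                */
};

static struct dir_entry *bucket[HASH_BUCKETS];

/* Each lookup may traverse a collision chain of unpredictable length,
 * so the time to decide hit-or-miss grows with cache size and load.  */
static struct dir_entry *lookup(uint32_t disk_block)
{
    struct dir_entry *e = bucket[disk_block % HASH_BUCKETS];
    while (e != NULL && e->disk_block != disk_block)
        e = e->next;            /* the table search described in the text */
    return e;                   /* NULL means cache miss */
}
```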
  • The size of cache memory 103-5 relative to the capacity of disk drives 102 is generally low.
  • An apparently obvious technique to remedy a low hit rate is to increase the cache memory 103-5 size.
  • With limited cache memory 103-5 capacity, a multitude of requests over a variety of data segments exhausts the capability of the cache system to retain the desirable data in cache memory 103-5. Often, data which would be reused in the near future is decached prematurely to make room in cache memory 103-5 for handling new requests from host computer 101. The result is a reduced cache hit rate.
  • a reduced hit rate increases the number of disk accesses; increased disk accesses increase the contention on the data path.
  • A self-defeating cycle is instituted. "Background" cache-ahead operations are limited since the data transferred during such cache-ahead operations travels over the same data path as, and often conflicts with, data transferred to service direct requests from host computer 101.
  • the data path between cache unit 103 and disk string 102 can easily be overloaded. All data to and from any of the disks in disk string 102, whether for satisfying requests from host computer 101 or for cache management purposes, travels across the cache-to-disk path. This creates a bottleneck if a large amount of prefetching of data from disk string 102 to cache memory 103-5 occurs.
  • Each attempt to prefetch data from disk string 102 into cache memory 103-5 potentially creates contention for the path with data being communicated between any of the disk drives of disk string 102 and host computer 101.
  • Prefetching of data into cache memory 103-5 must therefore be judiciously limited; as a result, increasing the size of cache memory 103-5 beyond a certain limit does not produce corresponding improvements in the performance of the cache system. This initiates a string of related phenomena.
  • Cache-ahead management is often limited to fetching an extra succeeding track of data from disk wherever a read command from the host cannot be fulfilled from the cached data. This technique helps minimize the tendency of cache-ahead to increase the queuing of requests waiting for the path between cache memory 103-5 and disk string 102.
  • One of the concepts on which caching is based is that data accesses tend to be concentrated within a given locality within a reasonably short time frame.
  • Cache memory 103-5 is generally volatile; the data is lost if power to the unit is removed. This characteristic, coupled with the possibility of unexpected power outages, has generally imposed a write-through design for handling data transferred from host computer 101 to the cached string. In such a design, all writes from the host are written directly to disk; handled at disk speed, these operations are subject to all the inherent time delays of seek, latency, and lower transfer rates commonly associated with disk operations.
  • a solid state storage device has high-speed response, but at a relatively high cost per megabyte of storage.
  • a rotating magnetic disk, optical disk, or other mass media provides high storage capacity at a relatively low cost per megabyte, but with a low-speed response.
  • The teachings of this invention provide a hybrid solid state and mass storage device which gives near solid state speed at a cost per megabyte approaching that of the mass storage device.
  • Embodiments will be described with regard to magnetic disk media. However, it is to be understood that the teachings of this invention are equally applicable to other types of mass storage devices, including optical disk devices, and the like.
  • The hardware features include: one or more rotating magnetic disk media; an ample solid state storage capacity; private channels between the disks and the solid state storage device; and high speed microprocessors to gather the intelligence, make data management decisions, and carry out the various data management tasks.
  • The firmware features include the logic for gathering the historical data, making management decisions, and instructing the hardware to carry out the various data management operations.
  • Important aspects of the firmware include making the decisions regarding the retention of data in the solid state memory based on usage history gathered during the device's recent work load experience; and a comprehensive, intelligent, plateau-based methodology for dynamically distributing the solid state memory for the usage of the data stored on, or to be stored on, the various disks.
  • This distribution of solid state memory is work load sensitive and constantly dynamic; it is accomplished in such a way as to guarantee full utilization of the solid state memory while at the same time ensuring that the data for all disks are handled in such a way as to retain the most useful data for each disk in the solid state memory for the most appropriate amount of time.
  • The multiple plateau-based cache distribution methodology is illustrated in Figures 57 and 58, and described in the section entitled Plateau Cache Illustration.
  • The hardware and firmware features are combined in a methodology which incorporates simultaneity of memory management and data storage operations.
  • The hybrid storage media of this invention performs at near solid state speeds for many types of computer workloads while practically never performing at less than normal magnetic disk speeds for any workload.
  • One or more rotating magnetic disk media are used to give the device a large capacity; the solid state storage is used to give the device a high-speed response capability.
  • By associating the solid state media directly with the magnetic disk devices, private data communication lines are established which avoid contention between normal data transfers between the host and the device and transfers between the solid state memory and the disks.
  • The private data channels permit virtually unlimited conversation between the two storage media.
  • Utilization of ample solid state memory permits efficient maintenance of data for multiple, simultaneously active data streams.
  • Management of the storage is via one or more microprocessors which utilize historical and projected data accesses to perform intelligent placement of data. No table searches are employed in the time-critical path.
  • Host accesses to data stored in the solid state memory are at solid state speeds; host accesses to data stored on the magnetic disks are at disk device speeds. Under most conditions, all data sent from the host to the device is handled at solid state speeds.
  • Intelligence is embodied to cause the device to dynamically shift its operating priorities in order to maintain performance at a high level, including the optimization of the handling of near-future host-device I/O operations.
  • GLOSSARY OF TERMS ADDRESS TRANSLATION The method for converting a sector address into a disk bin address and a sector offset within the bin.
  • ADDRESS TRANSLATION TABLE The table which maintains the relationship between disk bin identifiers and solid state memory addresses; also holds statistics about frequency of bin accesses, recent bin accesses, or other information as required.
  • ADT TABLE See ADDRESS TRANSLATION TABLE
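As a rough illustration of how such an address translation table can avoid hashing and searching, the C sketch below converts a sector address arithmetically into a disk bin identifier and offset, then uses the bin identifier to index the table directly. The field names, bin size, and null value are assumptions for illustration, not the actual ADT layout of this device.

```c
#include <stdint.h>

#define SECTORS_PER_BIN 64      /* assumed bin size; see BIN SIZE below         */
#define NOT_CACHED 0xFFFFFFFFu  /* null value: bin has no cache bin assigned    */

/* One ADT line per disk bin: cache assignment plus usage statistics. */
struct adt_entry {
    uint32_t ssd_bin;       /* cache bin holding this disk bin, or NOT_CACHED   */
    uint32_t access_count;  /* frequency-of-access statistic                    */
    uint32_t last_access;   /* recency statistic (e.g., an I/O sequence number) */
};

/* Address translation: a pure arithmetic step, then one indexed read.
 * No hashing and no search is involved in the time-critical path.     */
static uint32_t sector_to_bin(uint32_t sector, uint32_t *offset)
{
    *offset = sector % SECTORS_PER_BIN;  /* sector offset within the bin */
    return sector / SECTORS_PER_BIN;     /* disk bin identifier          */
}

static int is_cached(const struct adt_entry *adt, uint32_t disk_bin)
{
    return adt[disk_bin].ssd_bin != NOT_CACHED;
}
```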
  • BACKGROUND ACTIVITY Any of the tasks done by the described device's controller which are not done in immediate support of the host's activity. For example, the writing of modified data from cache to disk, prefetch reading of data from disk into cache, etc.
  • BACKGROUND SEEK The first step in a background sweep write of modified data from cache to disk.
  • BACKGROUND SWEEP The collective activities which write data from cache to disk as background tasks.
  • BACKGROUND OPERATION See BACKGROUND ACTIVITY.
  • BACKGROUND TASK See BACKGROUND ACTIVITY.
  • BACKGROUND WRITE The writing of modified data from cache to disk as a background task.
  • BAL Bin Address List.
  • BATTERY BACKUP The hardware module and its controller which assures the described device of power for an orderly shutdown when outside power is interrupted.
  • BEGGING FOR A BIN The canvassing of the cache chains for all the drives to find a cache bin that can be reused.
  • BIN An arbitrary number of contiguous sectors occupying space on either a disk drive or in cache memory in which data is stored.
  • BIN ADDRESS LIST (BAL) : A list of one or more disk bin addresses associated with a host command, and which relate the command's requirements to disk bins, cache bins, and sector addresses .
  • BIN SIZE The number of sectors considered to be in a disk bin and also the number of sectors considered to be in a cache bin; this may or may not be equal to the actual number of sectors in a disk track.
  • BUYING A BIN The acquisition of a cache bin from the global cache chain to use for data for a given drive, such cache bin received in exchange for a cache bin currently in the LRU position of the cache chain of that drive.
  • CACHE The solid state memory area which holds user data within the cache system of this invention.
  • CACHE-AHEAD FACTOR At each cache bin read hit or re-hit, cached data sufficient to satisfy a number of I/O's may remain in front of, and/or behind, the current location of the data involved in the current I/O. When either of these two remaining areas contain valid data for less than a set number of I/O's, the cache-ahead is activated. That minimum number of potential I/O's is the cache-ahead factor, or the proximity factor.
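A minimal sketch of the proximity test just described, assuming the valid cached data and the current request are expressed as sector ranges within one cache bin; the parameter names and boundary handling are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a cache-ahead should be scheduled after a read hit.
 * valid_lo..valid_hi  : range of valid cached sectors in this cache bin
 * req_lo..req_hi      : range of sectors satisfied by the current I/O
 * io_size             : size of the current host I/O, in sectors
 * proximity_factor    : minimum number of further I/O's that must remain */
static bool cache_ahead_needed(uint32_t valid_lo, uint32_t valid_hi,
                               uint32_t req_lo, uint32_t req_hi,
                               uint32_t io_size, uint32_t proximity_factor)
{
    uint32_t ahead  = (valid_hi > req_hi) ? (valid_hi - req_hi) : 0;
    uint32_t behind = (req_lo > valid_lo) ? (req_lo - valid_lo) : 0;

    /* Fewer than 'proximity_factor' I/O's of valid data remain on one side:
     * schedule a background cache-ahead for the neighbouring bin.           */
    return (ahead  < proximity_factor * io_size) ||
           (behind < proximity_factor * io_size);
}
```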
  • CACHE BIN A data bin in cache memory.
  • CACHE CHAIN The logical bidirectional chain which maintains references to a set of cache bins in a certain sequence; in the described device, in the order of most-recently-used (MRU) to least-recently-used (LRU).
  • CACHE CHAIN STATUS An attribute based on the number of cache bins in a given cache chain, either for a given drive or global, and which is associated with a cache chain which indicates that the cache chain is in a specified condition. Such cache bins contain no modified data since cache bins containing modified data are removed from the cache chain and placed in the modified pool. The cache chain status is used to control decisions regarding the way in which the device manages the cache and other resources .
  • CACHE HIT A host initiated read or write command which can be serviced entirely by utilizing currently cached data and/or currently assigned cache bins.
  • CACHE HIT RATE The proportion of all host I/O's which have been, or are being, serviced as cache hits.
  • CACHE MEMORY The solid state memory for retaining data. See SOLID STATE MEMORY.
  • CACHE MISS A host initiated read or write command which cannot be serviced entirely by utilizing currently cached data and/or currently assigned cache bins.
  • CACHE READ HIT A host initiated read command which can be serviced entirely by utilizing currently cached data.
  • CACHE READ MISS A host initiated read command which cannot be serviced entirely by utilizing currently cached data.
  • CACHE STATUS See CACHE CHAIN STATUS.
  • CACHE WRITE HIT A host initiated write command which can be serviced entirely by utilizing currently assigned cache bins.
  • CACHE WRITE MISS A host initiated write command which cannot be serviced entirely by utilizing currently assigned cache bins .
  • CHAIN A method utilized in the LRU table to logically connect cache bins in a desired sequence; in the case of this invention, to connect the cache bins in most-recently-used to least-recently-used order. Chaining is also used in any table management in which the logical order is dynamic and does not always match the physical order of the data sets in the table.
  • CHAINING See CHAIN.
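The sketch below illustrates index-based chaining of the kind just defined: each LRU-table line carries forward and backward links that are cache bin numbers rather than pointers, and a bin can be moved to the MRU position of its chain without any searching. The structure layout and the NIL convention are assumptions for illustration.

```c
#include <stdint.h>

#define NIL 0xFFFFu   /* null link value */

/* One LRU-table line per cache bin; links are bin numbers, not pointers. */
struct lru_line {
    uint16_t next;    /* toward the LRU end */
    uint16_t prev;    /* toward the MRU end */
};

struct cache_chain {
    uint16_t mru;     /* most-recently-used bin  */
    uint16_t lru;     /* least-recently-used bin */
};

/* Unlink a bin from its chain and reinsert it at the MRU position,
 * e.g. after a cache read hit on that bin.                          */
static void rechain_to_mru(struct lru_line *t, struct cache_chain *c, uint16_t bin)
{
    /* unlink the bin from its current position */
    if (t[bin].prev != NIL) t[t[bin].prev].next = t[bin].next; else c->mru = t[bin].next;
    if (t[bin].next != NIL) t[t[bin].next].prev = t[bin].prev; else c->lru = t[bin].prev;

    /* insert the bin at the MRU end of the chain */
    t[bin].prev = NIL;
    t[bin].next = c->mru;
    if (c->mru != NIL) t[c->mru].prev = bin; else c->lru = bin;
    c->mru = bin;
}
```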
  • CLEAN BIN; CLEAN CACHE BIN A cache bin containing only unmodified data; that is, data that matches the data in the corresponding disk bin.
  • CLEAN DATA Data currently resident in a cache bin which exactly matches the data stored in the disk bin to which the cache bin is assigned.
  • CLEANING The act of writing data from cache memory to disk, such data having been written from the host into cache and which has not yet been written from cache to disk.
  • CONTROL TABLE Any of the various tables which maintain records which control the location and retention of data stored in cache or on disk, and which are used to control the activities of the described device.
  • DATA BIN (DISK) See DISK BIN.
  • DATA CHANNEL See CHANNEL
  • PROXIMITY A group of contiguous sectors within a cache bin, and of a size which matches the most recent host read for data from that cache bin.
  • DECACHE The removal of the logical references which relate some or all of the data in a cache bin to the data in a disk bin. When such references are all removed, the decached data in the cache bin is no longer available from cache, but can be retrieved from the disk on which it is stored.
  • DIRTY BIN; MODIFIED BIN A cache bin which contains data which has been written to the cache by the host, and which data has not yet been written from the cache to the corresponding disk drive.
  • DIRTY DATA; MODIFIED DATA Data which has been written to the cache by the host, and which data has not yet been written from the cache to the corresponding disk drive. In other words, data in a cache bin that does not match the data in the corresponding disk bin.
  • DISCONNECT The action of removing the current logical connection on a channel between two devices, thus freeing the channel for use by other devices which have access to the channel .
  • DISK BIN; DRIVE BIN A data bin on a disk.
  • DISK BIN ADDRESS The address of the first sector of data in a given bin on disk. These addresses correspond to physical locations on the rotating magnetic disk. Each sector address as specified in an I/O operation can be converted into a disk bin address and a sector offset within that bin.
  • DISK CACHE That portion of the described device's cache memory which is assigned to data corresponding to data stored on, or intended to be stored on, a specific disk drive.
  • DISK DRIVE See DISK.
  • DISK SECTOR ADDRESS The address of a physical sector on the magnetic disk device.
  • DISK SERVER The logical section of the caching device which handles the writes to, and reads from, the rotating magnetic disk.
  • DISK TRACK A complete data track on a disk; one complete band on one platter of the disk device; this terminology is generally not meaningful to the logic of the described device.
  • DMA Direct Memory Access; that is, memory-to-memory transfer without the involvement of the processor.
  • DRAM Dynamic random access memory; the chip or chips that are used for solid state memory devices.
  • DRIVE See DISK.
  • DRIVE BIN See DISK BIN.
  • DRIVE CACHE Collectively the cache bins which are currently assigned to a given drive.
  • DRIVE CACHE CHAIN The logical chain of cache bins which are currently assigned to maintain data for a given disk drive of the described device. Such cache bins are available to be decached and reused only in very special circumstances, as opposed to those cache bins which have migrated into the global cache chain.
  • DRIVE CACHE CHAIN STATUS A term describing a drive's cache condition based on the number of unmodified cache bins assigned to that given drive. See CACHE CHAIN STATUS.
  • DRIVE MODE An attribute of each drive based on the number of cache bins which contain modified data which indicates that the amount of such cache is at a specified level . Used to control decisions regarding the way in which the device manages the drive, the cache, and other resources .
  • DRIVE NORMAL MODE The condition of the cache assigned to a specific drive in which the described storage device can use its normal priorities with respect to the management of data for that drive in order to reach its optimal performance level. In this mode, the background sweep is dormant; and cache ahead, recycle, and read ahead operations take place as circumstances indicate they are appropriate.
  • DRIVE SATURATED MODE The drive mode in which no cache ahead operations, no recycling, and no read ahead operations are permitted on the drive. The global logic will not allow the number of modified cache bins assigned to a given drive to exceed the number that places the device in this mode.
  • DRIVE SWEEP MODE The drive mode in which the sweep has been activated for the drive based on the number of cache bins assigned to the drive and which contain modified data; in this mode the sweep shares resources with the cache ahead operations.
  • DRIVE TIMEOUT MODE The drive mode in which the sweep has been activated for the drive based on the time since a cache bin assigned to the drive was first written to by the host and which still contains modified data; in this mode the sweep shares resources with the cache ahead operations.
  • DRIVE URGENT MODE The drive mode in which the sweep is active. Due to the large number of cache bins containing modified data, no cache ahead operations are permitted on the drive, and recycling is limited to the more frequently accessed cache bins.
  • DRIVE STATUS See DRIVE CACHE CHAIN STATUS.
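As an illustration of how a drive mode might be derived from the count of modified cache bins (the determination summarized later in Table CM3 and Figure 9), the C sketch below applies three assumed thresholds; the actual limits come from the device's configuration table, and timeout mode, which is time-based rather than count-based, is handled separately.

```c
enum drive_mode { DRIVE_NORMAL, DRIVE_SWEEP, DRIVE_URGENT, DRIVE_SATURATED };

/* Threshold values are placeholders; the real limits come from the
 * configuration table (see Table TCD).                              */
#define SWEEP_THRESHOLD      64
#define URGENT_THRESHOLD    128
#define SATURATED_THRESHOLD 192

/* Choose the drive mode from the number of cache bins assigned to the
 * drive that currently hold modified data.                            */
static enum drive_mode set_drive_mode(unsigned modified_bins)
{
    if (modified_bins >= SATURATED_THRESHOLD) return DRIVE_SATURATED;
    if (modified_bins >= URGENT_THRESHOLD)    return DRIVE_URGENT;
    if (modified_bins >= SWEEP_THRESHOLD)     return DRIVE_SWEEP;
    return DRIVE_NORMAL;
}
```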
  • EDAC Error Detection And Correction.
  • EEPROM Electrically Erasable Programmable Read-Only Memory.
  • EPROM Erasable Programmable Read-Only Memory.
  • EXCESS CACHE CHAIN STATUS The term used to describe a cache chain, either global or local, when that chain contains more than the desired maximum number of cache bins. The global cache chain will be in this status whenever the described device is powered on.
  • FIRMWARE The collective set of logical instructions and control tables which control the described device's activities.
  • GAP ELIMINATION The actions taken to make the cached data within a given cache bin contiguous. This can be done by reading data from the disk into the gap, or by taking whatever actions are necessary to allow one of the cached areas to be decached.
  • GAP READ The elimination of a gap in cached, modified data in a single cache bin by reading intervening data from the disk.
  • GAP TABLE A control table which contains a line which references each gap of each cache bin which currently contains a gap. This table will usually be null, or will have no lines in it, since the logic of the described device eliminates gaps as soon as feasible after their creation by the host.
  • GAP WRITE The elimination of a gap in cached, modified data in a single cache bin by writing some or all of the modified data in the cache bin from cache to the disk.
  • GLOBAL CACHE Collectively the cache bins which are currently assigned to a given drive, but which have been placed in the global cache chain. Cache bins in the global cache chain are more readily accessible for decaching and reassignment to other drives.
  • GLOBAL CACHE CHAIN The logical chain of cache bins which, although they may be currently assigned to maintain data for the various disk drives of the described device, are readily available to be decached and reused for caching data for any of the drives .
  • GLOBAL CACHE CHAIN STATUS A term describing the global cache condition based on the number of unmodified cache bins currently in the global cache chain. See CACHE CHAIN STATUS.
  • GLOBAL DRIVE MODE A term describing a controlling factor of the drive operations; determined by the total number of modified cache bins assigned to all drives.
  • GLOBAL NORMAL DRIVE MODE See GLOBAL NORMAL MODE.
  • GLOBAL NORMAL MODE The condition, based on the total number of cache bins containing modified data, in which the described storage device can use its normal priorities with respect to the management of disk drive activities in order to reach its optimal performance level. In this mode, the background sweep, recycling, and cache ahead operations are under individual drive control.
  • GLOBAL POOL See MODIFIED POOL.
  • GLOBAL SATURATED MODE The global mode in which a very large number of cache bins contain modified data. All cache aheads, all recycling, and all read aheads are prohibited. The sweep is operating for all drives which have any modified cache bins assigned to them.
  • GLOBAL URGENT MODE The global mode in which a large number of cache bins contain modified data. The background sweep is forced on for all drives which have any modified cache bins, cache aheads are prohibited, but read aheads are permitted under individual drive control.
  • GLOBAL LRU The term used to reference that cache bin in the least-recently-used position of the global cache chain.
  • GLOBAL MODE See GLOBAL DRIVE MODE.
  • GLOBAL MODIFIED CACHE POOL The pool of cache bins each of which contains one or more sectors of modified, or dirty, data.
  • GLOBAL STATUS See GLOBAL CACHE CHAIN STATUS .
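The following sketch illustrates how the global mode could take precedence over drive-level permissions (the relationship summarized later in Table CM2). The specific permission set and rules shown are assumptions chosen to match the glossary text, not the patent's exact control tables.

```c
#include <stdbool.h>

enum global_mode { GLOBAL_NORMAL, GLOBAL_URGENT, GLOBAL_SATURATED };

/* Permissions derived from the two modes; the precedence shown here
 * (global mode overrides drive-level permissions) follows the glossary
 * text, but the exact rule set is illustrative, not Table CM2 itself.  */
struct permissions {
    bool cache_ahead;
    bool read_ahead;
    bool recycle;
    bool sweep_forced;
};

static struct permissions derive_permissions(enum global_mode g, bool drive_allows_ahead)
{
    struct permissions p = { false, false, false, false };

    switch (g) {
    case GLOBAL_NORMAL:      /* background work under individual drive control */
        p.cache_ahead = p.read_ahead = p.recycle = drive_allows_ahead;
        break;
    case GLOBAL_URGENT:      /* sweep forced on; cache-aheads prohibited       */
        p.read_ahead   = drive_allows_ahead;
        p.sweep_forced = true;
        break;
    case GLOBAL_SATURATED:   /* all background fetching and recycling stopped  */
        p.sweep_forced = true;
        break;
    }
    return p;
}
```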
  • HASHING A procedure used in many computer programs which is used to quickly determine the approximate location of some desired information. Used extensively in conventional caching devices to reduce the amount of searching to determine where, if at all, data for a given disk location may be located in cache. This methodology usually results in a search to locate the exact data location, such search prone to becoming very time consuming for large caches. The present invention uses no such hashing and/or search scheme, and is not subject to such time delays.
  • HOLE See GAP.
  • HOST The computer to which the caching device is attached.
  • HOST COMMAND Any of the logical instructions sent from the host to the described device to instruct the device to take some kind of action, such as to send data to the host (a READ command) , to accept data from the host for storing (a WRITE command) , among others.
  • HOST SERVER The portion of the caching device which interfaces with the host computer.
  • I/O SIZE The size of a host I/O request as a number of sectors .
  • LAD See LEAST ACTIVE DRIVE.
  • LEAST ACTIVE DRIVE The least active disk drive based on recent host I/O activity and short term history of host I/O activity.
  • LEAST ACTIVE DRIVE CHAIN See LEAST ACTIVE DRIVE LIST.
  • LEAST ACTIVE DRIVE LIST The chained lines of the ADT-DISKS table which maintain the drive activity information.
  • LEAST-RECENTLY-USED TABLE See LRU TABLE.
  • LINK A term used to describe a line in a chained table, and which is tied, forward, backward, or both ways via pointers, to another line or other lines in the table.
  • LOCKED CACHE BIN The term used to describe a cache bin which is part of an ongoing I/O, either between the host and the solid state memory, or between the solid state memory and the rotating media storage device, or both.
  • LOGICAL SPINDLE A spindle of a disk drive, or a logical portion thereof which has been designated a spindle for purposes of the described device.
  • LRU Least-Recently-Used, as pertains to that data storage cache bin which has not been accessed for the longest period of time.
  • MAGNETIC DISK See DISK.
  • MAGNETIC DISK MEDIA A device which utilizes one or more spinning disks on which to store data.
  • MARGINAL CACHE CHAIN STATUS The cache chain status, either global or local, in which the number of cache bins in the cache chain is approaching the smallest permissible. The device's control logic will attempt to keep the cache chain from becoming smaller, but will allow it to do so if no other course of action is available to handle the host activity.
  • MODE See DRIVE MODE and GLOBAL DRIVE MODE.
  • MODIFIED BIN See DIRTY BIN.
  • MODIFIED BINS TABLE See MOD TABLE.
  • MODIFIED CACHE BIN See DIRTY BIN.
  • MODIFIED CACHE POOL See MODIFIED POOL.
  • MODIFIED DATA See DIRTY DATA.
  • MODIFIED POOL; MODIFIED CACHE POOL A generally unordered list of all the cache bins which, at a given moment in time, contain modified data.
  • MOST ACTIVE DRIVE The most active drive based on recent host I/O activity as reflected in the LEAST ACTIVE DRIVE LIST of the ADT table.
  • MRU Most-Recently-Used, as pertains to that data storage track which has been accessed in the nearest time past.
  • NORMAL CACHE CHAIN STATUS The cache chain status, either global or local, in which the number of cache bins in the cache chain is within the preset limits which are deemed to be the best operating range for a drive's cache chain.
  • NORMAL MODE See DRIVE NORMAL MODE and GLOBAL NORMAL MODE.
  • NULL NULL VALUE: A value in a table field which indicates the field should be considered to be empty; depending on usage, will be zero, or will be the highest value the bit structure of the field can accommodate.
  • NULL VALUE See NULL.
  • PERIPHERAL STORAGE DEVICE Any of several data storage devices attached to a host for purposes of storing data. In the described device, may be a disk drive, but is not limited thereto.
  • PHYSICAL TRACK See DISK TRACK.
  • PLATEAU The plateaus represent various amounts of cache which may be assigned to the disks or to global cache and which are protected for a given set of conditions. In this embodiment, the plateaus are fixed at initialization time, and are the same for all drives. In logical extensions of this invention, the plateaus for the various drives would not necessarily be equal, and they may be dynamically adjusted during the operation of the device.
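A simplified sketch of how fixed plateau levels might be used: a drive whose cache chain is still within a protected plateau keeps its bins, while a drive holding more bins than its top plateau is the preferred donor when another drive needs a bin. The three-level structure and the values are assumptions; the actual plateau behavior is illustrated in Figures 57 and 58.

```c
#include <stdbool.h>

/* Plateau levels (in cache bins) protected for each drive's chain; in this
 * embodiment they are fixed at initialization and equal for all drives.
 * The values and the three-level structure here are only an example.      */
static const unsigned plateau[3] = { 32, 96, 256 };

/* A drive whose chain is still at or below a protected plateau may keep
 * (or acquire) bins at that level.                                        */
static bool drive_protected_at(unsigned chain_bins, unsigned level)
{
    return level < 3 && chain_bins <= plateau[level];
}

/* Prefer as donor the drive holding the most bins above the top plateau. */
static int preferred_donor(const unsigned chain_bins[], unsigned ndrives)
{
    int donor = -1;
    unsigned most = 0;
    for (unsigned d = 0; d < ndrives; d++) {
        if (chain_bins[d] > plateau[2] && chain_bins[d] - plateau[2] > most) {
            most = chain_bins[d] - plateau[2];
            donor = (int)d;
        }
    }
    return donor;  /* -1 means no drive exceeds its protected plateaus */
}
```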
  • PREFETCH See CACHE AHEAD.
  • PRIVATE CHANNEL See PRIVATE DATA CHANNEL.
  • PRIVATE DATA CHANNEL In the described device, a physical data path used to move data between the solid state memory and a disk drive, but not used to move data between the host computer and the described device; therefore, private to the device.
  • PROXIMITY A term for expressing the "nearness" of the data in a cache bin which is currently being accessed by the host to either end of the said cache bin.
  • PROXIMITY FACTOR See CACHE-AHEAD FACTOR.
  • QUEUED HOST COMMAND Information concerning a host command which has been received by the described device, but which could not be immediately handled by the device; for example, a read cache miss.
  • QUEUED READ CACHE MISS Information concerning a host read command which has been received by the described device, but, which could not be immediately fulfilled by the device.
  • QUEUED READ COMMAND See QUEUED READ CACHE MISS.
  • QUEUED SEEK CACHE MISS Information concerning a host seek command which has been received by the described device, and for which the data at the specified disk address is not currently stored in cache.
  • QUEUED WRITE CACHE MISS Information concerning a host write command which has been received by the described device, but which could not be immediately handled by the device.
  • READ AHEAD The reading of data from a disk bin into a cache bin as a part of the reading of requested data from disk into cache resulting from a read cache miss. The data so read will be the data from the end of the requested data to the end of the disk bin. This is not to be confused with a cache ahead operation.
  • READ CACHE HIT See CACHE READ HIT.
  • READ CACHE MISS See CACHE READ MISS.
  • READ COMMAND A logical instruction sent from the host to the described device to instruct the device to send data to the host .
  • READ FETCH The reading of data from a disk bin into a cache bin in order to satisfy the host request for data which is not currently in cache. The data so read will be the data from the beginning of the requested data to the end of the requested data within the disk bin. This is not to be confused with a read ahead or a cache ahead operation.
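The sketch below illustrates the distinction between the read fetch and the read ahead on a read cache miss, assuming the missing sectors lie within a single disk bin; the bin size and function names are illustrative.

```c
#include <stdint.h>

#define SECTORS_PER_BIN 64   /* assumed bin size, as in the earlier sketch */

struct sector_range { uint32_t first, count; };

/* On a read cache miss, the device reads the requested sectors (the read
 * fetch) and may continue to the end of the same disk bin (the read ahead).
 * 'req_first'/'req_count' describe the missing part of the host request,
 * expressed as absolute sector addresses within one disk bin.               */
static void plan_miss_reads(uint32_t req_first, uint32_t req_count,
                            struct sector_range *fetch, struct sector_range *ahead)
{
    uint32_t bin_end = ((req_first / SECTORS_PER_BIN) + 1) * SECTORS_PER_BIN;

    fetch->first = req_first;                 /* read fetch: the requested data */
    fetch->count = req_count;

    ahead->first = req_first + req_count;     /* read ahead: rest of the bin    */
    ahead->count = (ahead->first < bin_end) ? bin_end - ahead->first : 0;
}
```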
  • READ HIT See CACHE READ HIT.
  • READ MISS See CACHE READ MISS.
  • READ QUEUE A temporary queue of read cache miss commands; the information in this queue is used to cache the requested data and then to control the transmission of that data to the host .
  • RECONNECT The action of reestablishing a logical data path connection on a channel between two devices, thus enabling the channel to be used for transmitting data. Used in handling queued read cache misses and queued writes.
  • RECYCLE The term used to describe the retention of data in a cache bin beyond that bin's logical arrival at the global cache LRU position; such retention may be based on a number of factors, including whether or not some data in the cache bin was read at some time since the data in the cache bin was most recently cached, or since the data was last retained in cache as the result of a recycling action.
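A minimal sketch of a recycling decision of the kind described, assuming each cache bin carries a flag recording whether it was read since it was cached or last recycled, plus a recycle count compared against a configured limit; the field names and the limit are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-bin recycling information; field names are assumptions. */
struct recycle_info {
    bool    read_since_cached;   /* any read hit since caching or last recycle */
    uint8_t recycle_count;       /* times this bin has already been recycled   */
};

/* Called when a bin reaches the global LRU position.  Returns true if the
 * bin should be rechained into a cache chain instead of being reused; the
 * recycle limit is a configuration parameter (see Table TCF).              */
static bool should_recycle(struct recycle_info *r, uint8_t recycle_limit)
{
    if (r->read_since_cached && r->recycle_count < recycle_limit) {
        r->read_since_cached = false;   /* must earn the next recycle again */
        r->recycle_count++;
        return true;
    }
    return false;                       /* eligible for decaching and reuse */
}
```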
  • ROTATING MAGNETIC DISK See DISK.
  • ROTATING STORAGE MEDIA A data storage device such as a magnetic disk.
  • SATURATED MODE SATURATED: See SATURATED DRIVE MODE and SATURATED GLOBAL DRIVE MODE.
  • SCSI Small Computer System Interface; the name applied to the protocol for interfacing devices, such as a disk device to a host computer.
  • SCSI CONTROL CHANNEL A physical connection between devices which uses the SCSI protocol, and is made up of logical controllers connected by a cable.
  • SECTOR The logical sub-unit of a disk track or disk bin; the smallest addressable unit of data on a disk.
  • SECTOR ADDRESS The numerical identifier of a disk sector, generally indicating the sequential location of the sector on the disk.
  • SECTOR OFFSET In the described device, the relative location of a given sector within a cache bin or disk bin.
  • SEEK The action of positioning the read/write head of a disk drive to some specific sector address. Usually done by a host in preparation for a subsequent read or write command.
  • SEEK CACHE MISS In the described device, a seek command from the host for which the data of the corresponding disk bin is not cached. The described device will place the information about the seek in a seek queue and attempt to execute a background read ahead in response to a seek cache miss.
  • SEEK COMMAND A logical instruction sent from the host to the described device to instruct the device to position the read/write head of the disk to some specific sector address. In the described device, this is handled as an optional cache ahead operation.
  • SEEK QUEUE A temporary queue of host seek miss commands which are waiting to be satisfied by background cache ahead actions, should time permit.
  • SEGMENT See SECTOR.
  • SERIAL PORT A means for communicating with a device, external to the described device, such as a terminal or personal computer, which in this context may be used to reset operating parameters, reconfigure the device, or make inquiries concerning the device's operations.
  • SOLID STATE STORAGE See SOLID STATE MEMORY.
  • SOLID STATE STORAGE DEVICE See SOLID STATE MEMORY.
  • SPINDLE See LOGICAL SPINDLE.
  • SSD See SOLID STATE MEMORY.
  • SSD BIN ADDRESS The address in the solid state memory at which the first byte of the first sector currently corresponding to a given disk bin resides.
  • STATUS See DRIVE CACHE CHAIN STATUS and GLOBAL CACHE CHAIN STATUS .
  • STEALING A BIN The acquisition of a cache bin from the global cache chain, or indirectly from another drive's cache chain, for use for data for a given drive when that drive does not give back a cache bin in exchange.
  • SWEEP See BACKGROUND SWEEP.
  • SWEEP MODE See DRIVE SWEEP MODE.
  • TABLE SEARCH A technique used in some devices to find references to certain data, such as the location of cached data. This procedure is often time consuming, and in general, it is not used in the described device in any of the time-critical paths.
  • TIMEOUT In the described device, a timeout occurs when, for a given drive, some cache bin has been holding dirty, or modified, data for more than some preset length of time. The occurrence of a timeout will place the drive in timeout mode if the sweep for that drive is not already active.
  • TIMEOUT MODE See DRIVE TIMEOUT MODE.
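The timeout check might be sketched as follows, assuming a tick-based clock and an array of first-modified timestamps for the drive's modified cache bins; both are illustrative assumptions rather than the device's actual bookkeeping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if any modified cache bin for this drive has held dirty data
 * longer than the preset limit, i.e. the drive should enter timeout mode
 * and start the sweep.  'now' and 'limit_ticks' use the same tick unit.    */
static bool drive_timed_out(const uint32_t first_modified_tick[],
                            unsigned nbins, uint32_t now, uint32_t limit_ticks)
{
    for (unsigned i = 0; i < nbins; i++) {
        if (now - first_modified_tick[i] > limit_ticks)
            return true;
    }
    return false;
}
```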
  • WRITE CACHE HIT See CACHE WRITE HIT.
  • WRITE CACHE MISS See CACHE WRITE MISS.
  • WRITE COMMAND A logical instruction sent from the host to the described device to instruct the device to accept data from the host for storing in the device.
  • WRITE QUEUE A temporary queue of host write miss commands which are waiting to be satisfied due to the extremely heavy load of write activities which have temporarily depleted the supply of cache bins available for reuse. This queue will usually be null, or empty, and if it is not, the described device will be operating in the saturated mode.
  • WRITE THROUGH A technique used in some caching devices to allow host write commands to bypass cache and write directly to the disk drive.
  • Figure 1 depicts the logic for a typical prior-art cached disk, computer data storage system.
  • Figure 2 depicts an overall view of the hardware component of one embodiment of the present invention.
  • Figure 3 depicts an overall view of one embodiment of the present invention which uses cached disks as a computer data storage unit .
  • CACHE ILLUSTRATIONS These diagrams depict the cache of selected embodiments of the present invention as related to the cache statuses, and the modified pools as related to the various drive operating modes.
  • Figure 4 depicts one embodiment of drive cache structure as its various sizes relate to the drive cache statuses.
  • Figure 5 depicts one embodiment of the pool of modified cache bins associated with a drive, showing how the pool's size relates to the drive cache modes.
  • Figure 6 depicts one embodiment of the global cache structure, as its various sizes relate to the global cache statuses.
  • Figure 7 depicts the composite pool of modified cache bins, showing how the pool's size relates to the global cache modes.
  • FIRMWARE - CACHE MANAGEMENT MODULES These modules handle the cache management of the present invention. Each module may be invoked from one or more places in the firmware as needed. They may be called as a result of an interrupt, from within the background controller, or as a result of, or as part of, any activity.
  • Figure 8 shows one embodiment of a cache-ahead determination procedure that determines which cache bin, if any, should be scheduled for a cache-ahead operation.
  • Figure 9 shows one embodiment of a set drive mode procedure for setting the operating mode for a specific drive's cache based on the number of modified cache bins assigned to that drive.
  • Figure 10 shows one embodiment of a set drive cache chain status module, which uses the information about the number of cache bins currently assigned to the specified drive to set a cache chain status for that drive.
  • Figure 11 shows an exemplary procedure for locating a cache bin to reuse.
  • Figure 12 shows an exemplary method by which a specific drive gets a cache bin from the global cache to use for its current caching requirements when the drive can afford to give a cache bin to the global chain in return.
  • Figure 13 shows an exemplary method by which a specific drive gets a cache bin from the global cache to use for its current caching requirements when no drive can give a bin to global.
  • Figure 14 shows an exemplary method by which a specific drive indirectly gets a cache bin from another drive's cache to use for its own current caching requirements.
  • Figure 15 shows an exemplary method by which a specific drive gets a cache bin for its use from any drive when none is available in either the drive's own cache chain or in the global cache.
  • Figure 16 shows one embodiment of logic for determining the least active drive whose cache chain is the best candidate for supplying a bin for use by another drive.
  • Figure 17 shows an exemplary procedure for setting the operating mode for the global cache based on the total number of modified cache bins.
  • Figure 18 depicts one embodiment of a set global cache chain status module, which uses the information about the number of cache bins currently assigned to the global portion of cache to set a global cache chain status.
  • Figure 19 shows exemplary logic for determining if a current host request is a cache hit or must be handled as a cache miss.
  • Figure 20 depicts exemplary logic for setting up a bin address list based on a host command.
  • Figure 21 depicts exemplary logic for translating the sector address of a host command into a bin identifier.
  • Figure 22 shows an exemplary method for updating and rechaining the least-active disk list.
  • Figure 23 shows an exemplary method by which a cache bin which has just been involved in a cache read hit is rechained to the MRU position of that drive's cache chain, if that action is appropriate.
  • Figure 24 depicts exemplary logic for moving a newly modified cache bin from a drive's cache chain to the pool of modified cache bins.
  • Figure 25 depicts exemplary logic for moving a newly modified cache bin from the global cache chain to the pool of modified cache bins.
  • Figure 26 depicts one embodiment of a module which determines whether or not sufficient cache bins are available for reuse to handle a current requirement .
  • These modules are invoked, directly or indirectly, by a host interrupt. These modules may call other modules which are described in a different section.
  • Figure 27 shows exemplary logic of the host-command interrupt management .
  • Figure 28 depicts exemplary logic of handling the completion of a host command.
  • Figure 29 depicts exemplary logic for handling a read command from the host when all the data to satisfy the command is found to be in the cache memory.
  • Figure 30 depicts exemplary logic for the host interrupt handling of a read command when some or all of the data required to satisfy the command is not in the cache memory.
  • Figure 31 depicts exemplary logic for handling a seek command from the host when the addressed bin is not currently cached.
  • Figure 32 depicts exemplary logic for the host interrupt handling of a write command when all the data bins related to that write are found to be in the cache memory.
  • Figure 33 depicts exemplary logic for the host interrupt handling of a write command when some or all of the data bins related to that write are not in the cache memory.
  • These modules make up the executive control and the control loop which runs at all times during which an interrupt is not being processed. These modules may call other modules which are described in a different section.
  • Figure 34 depicts one embodiment of the initiation of a cache-ahead operation, if one is scheduled for the given drive.
  • Figure 35 shows one embodiment of the main firmware control for the described device.
  • Figure 36 shows one embodiment of the operations which take place when the described device is initially powered on.
  • Figure 37 shows one embodiment of the background operation control loop for a drive.
  • Figure 38 depicts an exemplary procedure which shuts down the described device when it is powered off.
  • Figure 39 shows one embodiment of the logic for eliminating gaps in the modified portions of the cached data in a cache bin.
  • Figure 40 shows one embodiment of the handling of the queued commands for a given drive.
  • Figure 41 depicts, for one embodiment of this invention, the methods by which a module rechains cache bins from the LRU or near-LRU positions of the global cache chain to the MRU or LRU positions of the specific, private disk cache chains; the movement is based on recycling information on each cache bin reflecting that bin's activity since first being cached or since it was last successfully recycled.
  • Figure 42 depicts one embodiment of the handling of a queued read cache miss operation.
  • Figure 43 depicts one embodiment of the method of fetching missing data from disk.
  • Figure 44 depicts one embodiment of the handling of a queued seek cache miss operation.
  • Figure 45 depicts one embodiment of the logic for determining if the cache associated with a given drive has included any modified cache bins for more than the specified time limit.
  • Figure 46 depicts one embodiment of the initiation of a background write from cache to disk, if one is appropriate for the given drive at the current time.
  • Figure 47 depicts one embodiment of a method for identifying a modified cache bin assigned to a specified disk, which cache bin to write from cache to disk at this time.
  • Figure 48 depicts one embodiment of the handling of a queued write cache miss operation.
  • These modules are invoked, directly or indirectly, by an interrupt from one of the disk drives. These modules may call other modules which are described in a different section.
  • Figure 49 shows one embodiment of logic for handling the termination of a cache-ahead operation.
  • Figure 50 shows one embodiment of logic of the drive-interrupt management.
  • Figure 51 depicts exemplary actions to be taken when a read from, or write to, a drive has completed, such read or write initiated for the purpose of eliminating a gap or gaps in the modified cached data of a cache bin.
  • Figure 52 depicts exemplary logic for the termination of a seek which was initiated for the purpose of writing modified data from a cache bin to its corresponding disk drive.
  • Figure 53 depicts exemplary logic for handling the termination of a background write from cache to disk.
  • MODULES ENTERED VIA INTERNAL INTERRUPTS In one embodiment, these modules are invoked, directly or indirectly, by an interrupt from within the described device itself. These modules may call other modules which are described in a different section.
  • Figure 54 depicts exemplary handling of a power-off interrupt .
  • Figure 55 depicts exemplary logic for initiation of the background sweep for writing from cache to disk when the device is in its power-down sequence.
  • MODULES ENTERED VIA SERIAL PORT INTERRUPTS In exemplary embodiments, this module is invoked, directly or indirectly, by an interrupt from a device attached to the serial port of the described device. These modules may call other modules which are described in a different section.
  • Figure 56 depicts exemplary logic for handling the communications with a peripheral attached to the device via the serial port .
  • Figure 57 is a graph illustrating eight cases of the cache assignments of a three-plateau configuration.
  • Figure 58 is a graph illustrating eight cases of the cache assignments of a five-plateau configuration.
  • Table CM1 depicts exemplary operating rules based on the drive modes.
  • Table CM2 depicts exemplary control precedence for the global modes and the drive modes.
  • Table CM3 summarizes exemplary rules for setting the drive modes .
  • Table CM4 summarizes exemplary rules for setting the global mode.
  • Table CS1 summarizes exemplary rules for setting the drive cache statuses .
  • Table CS2 summarizes exemplary rules for setting the global cache status.
  • Table CS3 gives an exemplary set of the possible status conditions and the corresponding actions required to acquire a cache bin for reuse.
  • Table CS4 gives an exemplary set of bin acquisition methods based on the combinations of cache statuses.
  • Table LB1 gives an exemplary set of cache bin locking rules for operations involving host reads.
  • Table LB2 gives an exemplary set of cache bin locking rules for operations involving host writes.
  • Table LB3 gives an exemplary set of cache bin locking rules for operations involving caching activities.
  • Table LB4 gives an exemplary set of cache bin locking rules for operations involving sweep activities.
  • Table LB5 gives an exemplary set of cache bin locking rules for operations involving gaps.
  • CONTROL TABLE EXAMPLES Tables TCA through TCG give an example of a Configuration Table which defines one exemplary configuration of the present invention.
  • Table TCA gives an exemplary set of sizing parameters for one configuration of the described device and some basic values derived therefrom.
  • Table TCB gives an exemplary set of drive cache status parameters for one configuration of the described device and some basic values derived therefrom.
  • Table TCC gives an exemplary set of global cache status parameters for one configuration of the described device and some basic values derived therefrom.
  • Table TCD gives an exemplary set of drive mode parameters for one configuration of the described device and some basic values derived therefrom.
  • Table TCE gives an exemplary set of global mode parameters for one configuration of the described device and some basic values derived therefrom.
  • Table TCF gives an exemplary set of recycling parameters for one configuration of the described device.
  • Table TCG gives an exemplary set of drive activity control parameters for one configuration of the described device.
  • Table TLB gives an example of the unindexed values of an LRU table at the completion of system initialization at power up time.
  • Table TLC gives an exemplary snapshot of portions of an LRU table that are indexed by spindle number, taken at the completion of system initialization at power up time.
  • Table TLD gives an exemplary snapshot of portions of an LRU table that are indexed by cache bin number, taken at the completion of system initialization at power up time.
  • Tables TLE through TLG give exemplary snapshots of some portions of a Least-Recently-Used Table taken during the operation of the present invention.
  • Table TLE gives an example of the unindexed values of an LRU table at an arbitrary time during the operation of the present invention.
  • Table TLF gives an exemplary snapshot of portions of an LRU table that are indexed by spindle number, taken at an arbitrary time during the operation of the present invention.
  • Table TLG gives an exemplary snapshot of portions of an LRU table that are indexed by cache bin number, taken at an arbitrary time during the operation of the present invention.
  • Tables TAB through TAD give an example of an initial Address Translation (ADT) Table.
  • Table TAB gives an example of the unindexed values of an ADT table at the completion of system initialization at power up time.
  • Table TAC gives an exemplary snapshot of portions of an ADT table that are indexed by spindle number, taken at the completion of system initialization at power up time.
  • Table TAD gives an exemplary snapshot of portions of an ADT table that are indexed by disk bin number, taken at the completion of system initialization at power up time.
  • Tables TAE through TAG give exemplary snapshots of some portions of an Address Translation Table taken during the operation of the described device.
  • Table TAE gives an example of the unindexed values of an ADT table at an arbitrary time during the operation of the described device.
  • Table TAF gives an exemplary snapshot of portions of an ADT table that are indexed by spindle number, taken at an arbitrary time during the operation of the described device.
  • Table TAG gives an exemplary snapshot of portions of an ADT table that are indexed by disk bin number, taken at an arbitrary time during the operation of the described device.
  • Tables TGB through TGD give an example of an initial GAP Table.
  • Table TGB gives an example of the unindexed values of a GAP table at the completion of system initialization at power up time.
  • Table TGC gives an exemplary snapshot of portions of a GAP table that are indexed by spindle number, taken at the completion of system initialization at power up time.
  • Table TGD gives an exemplary snapshot of portions of a GAP table that are indexed by gap number, taken at the completion of system initialization at power up time.
  • Tables TGE through TGG give exemplary snapshots of some portions of a GAP table taken at an arbitrary time during the operation of the described device.
  • Table TGE gives an example of the unindexed values of a GAP table taken at an arbitrary time during the operation of the described device.
  • Table TGF gives an exemplary snapshot of portions of a GAP table that are indexed by spindle number, taken at an arbitrary time during the operation of the described device.
  • Table TGG gives an exemplary snapshot of portions of a GAP table that are indexed by gap number, taken at an arbitrary time during the operation of the described device.
  • Table TMD gives an exemplary snapshot of portions of a Modified Bins table taken at the completion of system initialization at power up time.
  • Table TMG gives an exemplary snapshot of some portions of a Modified Bins Table taken during the operation of the described device.
  • The present invention is a computer peripheral data storage device consisting of a combination of solid state memory and one or more mass storage devices, such as rotating magnetic disks; such a device has the large capacity of magnetic disks with near solid state speed at a cost per megabyte approaching that of magnetic disk media.
  • This invention derives its large storage capacity from the rotating magnetic disk media. Its high speed performance stems from the combination of a private channel between the two storage media, one or more microprocessors utilizing a set of unique data management algorithms, a unique prefetch procedure, parallel activity capabilities, and an ample solid state memory.
  • This hybrid storage media gives overall performance near that of solid state memory for most types of computer workloads while practically never performing at less than normal magnetic disk speeds for any workload.
  • To the host computer, the present invention appears to be one or more directly addressable entities, such as magnetic disks.
  • By associating a solid state memory and one or more magnetic disk devices, private data communication lines are established within the device which avoid contention with normal data transfers between the host and the device, and transfers between the solid state memory and the disk media.
  • These private data channels permit unrestricted data transfers between the two storage media with practically no contention with the communication between the host computer and the present invention.
  • Utilization of ample solid state memory permits efficient retention of data for multiple, simultaneously active data streams. Management of the storage is via microprocessors which anticipate data accesses based on historical activity. Data is moved into the solid state memory from the one or more mass memory devices based on management algorithms which insure that no table searches need be employed in the time-critical path.
  • Host computer accesses to data stored in the solid state memory are at near solid state speeds; accesses to data stored in the mass memory but not in the solid state memory are at near mass memory speeds. All data sent from the host to the device is transferred at solid state speeds limited only by the channel capability.
  • One embodiment of the present invention includes a power backup system which includes a rechargeable battery; this backup system is prepared to maintain power on the device should the outside power be interrupted. If such a power interruption occurs, the device manager takes whatever action is necessary to place all updated data onto mass storage before shutting down the entire device. Information about functional errors and operational statistics are maintained by the diagnostic module-error logger. Access to this module is via a device console and/or an attached personal type computer. The console and/or personal computer are the operator's access to the unit for such actions as powering the unit on and off, reading or resetting the error logger, inquiring of the unit's statistics, and modifying the device's management parameters and configuration.
  • Memory device 200 is a self-contained module which includes interfaces with certain external devices. Its primary contact is with host computer 201 via host interface 204.
  • Host interface 204 comprises, for example, a dedicated SCSI control processor which handles communications between host computer 201 and memory manager 205.
  • An operator interface is provided via the console 207, which allows the user to exercise overall control of the memory device 200.
  • Memory manager 205 handles all functions necessary to manage the storage of data in, and retrieval of data from, disk drive 210 (or high capacity memory devices) and solid state memory 208, the two storage media.
  • the memory manager 205 consists of one or more microprocessors 205-1, associated firmware 205-2, and management tables, such as Address Translation (ADT) Table 205-3 and Least Recently Used (LRU) Table 205-4.
  • ADT Address Translation
  • LRU Least Recently Used
  • Solid state memory 208 is utilized for that data which memory manager 205, based on its experience, deems most useful to host computer 201, or most likely to become useful in the near future.
  • Magnetic disk 210 is the ultimate storage for all data, and provides the needed large storage capacity. It may include one or more disk drives
  • Disk interface 209 serves as a separate dedicated control processor (such as an SCSI processor) for
  • a separate disk interface 209-1 through 209-N is associated with each disk drive 210-1 through
  • 210-N. Information about functional errors and operational statistics are maintained by diagnostic module-error logger 206. Access to module 206 is obtained through console 207. Console
  • 207 serves as the operator's access to the memory device 200 for such actions as powering the system on and off, reading or resetting the error logger, or inquiring of system statistics.
  • the memory device 200 includes power backup system 203 which
  • Backup system 203 is prepared to maintain power to memory device 200 should normal power be interrupted. If such a power interruption occurs, the memory manager 205 takes whatever action is necessary to place all updated data stored in solid state memory 208 onto magnetic disk
  • Figure 3 depicts a hardware controller block diagram of one embodiment of this invention.
  • hardware controller 300 provides three I/O ports, 301, 302, and 303.
  • I/O ports 301 and 302 are single-ended or differential wide or
  • Each of I/O ports 303-1 through 303-N is a single-ended SCSI port used to connect controller 300 to disk drive 210 (which in this embodiment is a
  • Cache memory 308 (corresponding to memory 208) is a large, high-speed memory used to store, on a dynamic basis, the currently active and potentially active data.
  • the storage capacity of cache memory 308 can be selected at any convenient size and, in the embodiment depicted in Figure 3, comprises 64 Megabytes of storage.
  • Cache memory 308 is organized as 16 Megawords; each word consists of four data bytes (32 bits) and seven bits of error-correcting code.
  • the storage capacity of cache memory 308 is selected to be within the range of approximately one-half of one percent (0.5%) to 100 percent of the storage capacity of the one or more magnetic disks 210 ( Figure 2) with which it operates.
  • a small portion of cache memory 308 is used to store the tables required to manage the caching operations; alternatively, a different memory (not shown, but accessible by microcontroller 305) is used for this purpose.
  • Error Detection and Correction (EDAC) circuitry 306 performs error detecting and correcting functions for cache memory 308.
  • EDAC circuitry 306 generates a seven-bit error-correcting code for each 32-bit data word written to cache memory 308; this information is written to cache memory 308 along with the data word from which it was generated.
  • the error-correcting code is examined by EDAC circuitry 306 when data is retrieved from cache memory 308 to verify that the data has not been corrupted since last written to cache memory 308.
  • the modified Hamming code chosen for this embodiment allows EDAC circuitry 306 to correct all single-bit errors that occur and detect all double-bit and many multiple-bit errors that occur.
  • Error logger 307 is used to provide a record of errors that are detected by EDAC circuitry
  • The information recorded by error logger 307 is retrieved by microcontroller 305 for analysis and/or display. This information is sufficiently detailed to permit identification by microcontroller 305 of the specific bit in error (for single-bit errors) or the specific word in error (for double-bit errors).
  • EDAC circuitry 306 detects a single-bit error
  • the bit in error is corrected as the data is transferred to whichever interface requested the data (processor/cache interface logic 316, host/cache interface logic 311 or 312, and disk/cache interface logic 313) .
  • a signal is also sent to microcontroller 305 to permit handling of this error condition (which involves analyzing the error based on the contents of error logger 307, attempting to scrub (correct) the error, and analyzing the results of the scrub to determine if the error was soft or hard) .
  • EDAC circuitry 306 detects a double-bit error
  • a signal is sent to microcontroller 305.
  • Microcontroller 305 will recognize that some data has been corrupted. If the corruption has occurred in the ADT or LRU tables, an attempt is made to reconstruct the now-defective table from the other, then relocate both tables to a different portion of cache memory 308. If the corruption has occurred in an area of cache memory 308 that holds user data, microcontroller 305 attempts to salvage as much data as possible (transferring appropriate portions of cache memory 308 to disk drives 210-1 through 210-N, for example) before refusing to accept new data transfer commands.
  • Microcontroller 305 includes programmable control processor 314 (for example, a 68360 microcontroller available from Motorola) , 64 kilobytes of EPROM memory 315, and hardware to allow programmable control processor 314 to control the following: I/O ports 301, 302, and 303, cache memory 308, EDAC 306, error logger 307, host/cache interface logic 311 and 312, disk/cache interface logic 313, processor/cache interface logic 316, and serial port 309. Programmable control processor 314 performs the functions dictated by software programs that have been converted into a form that it can execute directly.
  • programmable control processor 314 for example, a 68360 microcontroller available from Motorola
  • Programmable control processor 314 performs the functions dictated by software programs that have been converted into a form that it can execute directly.
  • the host/cache interface logic sections 311 and 312 are essentially identical. Each host/cache interface logic section contains the DMA, byte/word, word/byte, and address register hardware that is required for the corresponding I/O port (301 for 311, 302 for 312) to gain access to cache memory 308. Each host/cache interface logic section also contains hardware to permit control via microcontroller 305. In this embodiment I/O ports 301 and 302 have data path widths of eight bits (byte). Cache memory 308 has a data path width of 32 bits (word). Disk/cache interface logic 313 is similar to host/cache interface logic sections 311 and 312.
  • Disk/cache interface logic 313 also contains hardware to permit control via microcontroller 305.
  • I/O port 303 has a data path width of eight bits (byte) .
  • Processor/cache interface logic 316 is similar to host/cache interface logic sections 311 and 312 and disk/cache interface logic 313. It contains the DMA, half-word/word, word/half-word, and address register hardware that is required for programmable control processor 314 to gain access to cache memory 308.
  • Processor/cache interface logic 316 also contains hardware to permit control via microcontroller 305.
  • Serial port 309 allows the connection of an external device (for example, a small computer) to provide a human interface to the system 200.
  • Serial port 309 permits initiation of diagnostics, reporting of diagnostic results, setup of system 200 operating parameters, monitoring of system 200 performance, and reviewing errors recorded inside system 200.
  • serial port 309 allows the transfer of different and/or improved software programs from the external device to the control program storage (when memory 315 is implemented with EEPROM rather than EPROM, for example).
  • firmware provides an active set of logical rules which is a real-time, full-time manager of the device's activities. Among its major responsibilities are the following: 1. Initialization of the device at power up.
  • Cache management including the movement of data between cache memory and the various integral disk drives.
  • control is transferred to the firmware executive controller. See Figure 35.
  • the first task of the executive is to test the various hardware components and initialize the entire set of control tables. See Figure 36. After completion of initialization, the executive enters a closed loop which controls the background tasks associated with each disk drive. See Figure 37. When power to the device is interrupted, the executive initiates a controlled shutdown. See Figure 38. Between power up and shutdown, the system reacts to host commands and, most importantly, is proactive in making independent decisions about its best course of action to maintain the most efficient operation. CONTROL TABLES
  • the activities of the present invention are controlled by firmware which in turn is highly dependent on a set of logical tables.
  • the configuration of the device and records of the activities and data whereabouts are maintained in tables which are themselves stored in memory in the described device.
  • the Configuration Table (CFG) is made up of unindexed items which describe the configuration of the device and some of the values defining the device's rules for operation.
  • the Address Translation (ADT) table: the primary function of this table is to maintain records of which disk bins' data is cached in which cache bins at each instant. It also maintains some historical records of disk bins' activities.
  • the Least Recently Used (LRU) table: this table is central to the logic of managing the cache bins of the device. It maintains information on which portions of cache bins contain valid data, which portions contain modified data, the order of most recent usage of the data in the cache bins, the recycling control information, and any other information necessary to the operation of the device.
  • the Gap (GAP) table: this table works in conjunction with the LRU table in keeping track of the modified portions of data within cache bins.
  • This table comes into play only when there is more than one non-contiguous modified portion of modified data within any one cache bin.
  • the Modified Bins (MOD) table: this table keeps a bit-map type record of all disk bins, with an indicator of whether or not the cache bin currently related to the disk bin contains modified data which is waiting to be written to the disk.
  • the Configuration table describes an operational version of the present invention and gives the basic rules for its operation.
  • CFG-SECSIZE size in bytes, of the sectors on the disk drives.
  • CFG-CACHMB size in megabytes, of entire cache.
  • CFG-CACHBINS size in bins, of the entire cache.
  • CFG-GSEXCPCT lower limit (pct) of all cache, global excess status
    CFG-GSEXCESB lower limit (bins) of global chain in excess status = CFG-GSEXCPCT * CFG-CACHBINS
    DRIVE MODE PARAMETERS
    CFG-DMSWEEP lower limit (bins) of modified bins for sweep mode
  • CFG-DRVSIZE Definition Capacity, in megabytes, of each disk drive.
  • CFG-BINSIZE Size, in sectors, of each cache bin and each disk bin.
  • Initialization Set at time CFG table is created; may be reset via communication with the serial port when the device is totally inactive, offline, and has no data stored in its cache. In one example, this is preset to a number which creates a bin size of approximately 32KB. This is approximately 64 sectors if the sector size is 512 bytes.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-DSNORPCT Definition The minimum cache size assigned to a drive when that drive is in normal status; expressed as a percentage of all cache .
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-DSEXCPCT Definition The lower limit of the drive minimum cache size when the drive is in excess status; expressed as a percentage of the total cache, distributed over all drives. This also, and more importantly, defines the upper limit of a drive's normal status.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Calculated at device startup time as a portion of the entire cache in the device.
  • CFG-GSMARGB Definition Absolute lower limit of number of cache bins in the global cache chain when in global marginal status. The lowest number of cache bins permitted to be assigned to the global cache chain at any time; the logic of the device always keeps the number of cache bins in the global cache chain greater than this number.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-GSEXCPCT Definition The lower limit (% of total cache) of the amount of the total cache in the global cache chain when in global cache excess status.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-DMURGPCT Definition The percent of a drive's average share of all cache bins, which when modified, puts that drive into urgent mode.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-DMURGNTB Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Set based on other parameters.
  • CFG-DMURGNTB = CFG-DMURGPCT * CFG-CACHBINS / CFG-DRIVES
    CFG-DMSATPCT
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Set based on other parameters.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-LAD-ADJUST Definition The value of the total host I/O count that, when attained, causes the counts of I/O's for each drive to be adjusted downward by the least-active-drive tally adjustment factor.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • CFG-LAD-ADJUST Definition The divisor to be used for adjusting the count-relative tally of host I/O's for each drive.
  • Initialization Preset to predefined number; can be reset via serial port communication when the device is totally inactive, offline, and has no data stored in its cache.
  • FORMAT OF ADDRESS TRANSLATION ADT TABLE The Address Translation Table, working with the LRU, GAP, and MOD tables, maintains the information required to manage the caching operations.
  • ADT Address Translation Table
  • ADT-DISKS the indexed section containing information pertaining to each logical spindle included in the described device.
  • ADT-BINS the indexed section of the ADT table containing information pertaining to each logical disk bin of the entire described storage device.
  • the unindexed segment of the ADT table contains fields whose values are dynamically variable values; these fields are used primarily as counters of the device activities.
  • ADTC-READS The total number of host reads from the device since the device was powered on or since this field was last reset.
  • ADTC-WRITES The total number of host writes to the device since the device was powered on or since this field was last reset .
  • ADT-DISKS THE FIRST INDEXED SEGMENT OF THE ADT TABLE
  • the first tabular section of the ADT table contains information relating to each logical spindle. There is one line in this section for each logical spindle, and each line is referenced by logical spindle number.
  • ADTD-LINE-BEG The first line number within the ADT-BINS table which relates to the referenced logical disk spindle. This is set during system configuration and is not changed during the storage device's operation.
  • ADTD-LINE-BEG is used as an offset to locate lines in the ADT-BINS table that are associated with a specific logical spindle.
  • ADTD-HEAD-POS The current position of the read/write head of the logical spindle. This is kept as a logical disk bin number and is updated each time the referenced disk head is repositioned.
  • ADTD-SWEEP-DIR For each logical disk spindle, the direction in which the current sweep of the background writes is progressing. This is updated each time the sweep reverses its direction across the referenced disk. 4) ADTD-DISK-ACCESSES. A count of the number of times this logical disk spindle has been accessed by the host since the last time this field was reset. This is an optional field, but if present, is incremented each time the logical spindle is accessed for either a read or a write. This field may be used to influence the amount of cache assigned to each spindle at any given time.
  • ADTD-DISK-READS A count of the number of times this logical disk spindle has been accessed by a host read operation since the last time this field was reset. This field may be used to influence the amount of cache assigned to each spindle at any given time.
  • ADTD-DISK-WRITES A count of the number of times this logical disk spindle has been accessed by a host write operation since the last time this field was reset. This field may be used to influence the amount of cache assigned to each spindle at any given time.
  • ADTD-LAD-USAGE A count related to the number of times this logical disk spindle has been accessed by the host. This field is incremented each time the logical spindle is accessed for either a read or a write, and it is recalculated when the current total count of host I/O's for all drives reaches a preset limit. This field is used to maintain balance among the various drives and the management of the amount of cache assigned to each spindle at any given time.
  • ADTD-LINK-MORE the pointer to the ADTD line relating to the drive which has the next higher usage factor in the least-active-drive list. This is part of the bidirectional chaining of the LAD list lines. If this drive is the most active of all in the chain, ADTD-LINK-MORE will contain a null value.
  • ADTD-LINK-LESS the pointer to the ADTD line relating to the drive which has the next lower usage factor in the least-active-drive list. This is part of the bidirectional chaining of the LAD list lines. If this drive is the least active of all in the chain, ADTD-LINK-LESS will contain a null value.
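The per-spindle fields just listed can be pictured as a small record. The C sketch below is illustrative only; the field types, the NULL_LINE marker, and the structure name are assumptions and are not taken from the patent.

    #include <stdint.h>

    #define NULL_LINE 0xFFFFu   /* assumed "null" marker for chain ends */

    typedef struct {
        uint32_t line_beg;       /* ADTD-LINE-BEG: first ADT-BINS line for this spindle        */
        uint32_t head_pos;       /* ADTD-HEAD-POS: current head position, as a disk bin number */
        int8_t   sweep_dir;      /* ADTD-SWEEP-DIR: direction of the background sweep          */
        uint32_t disk_accesses;  /* ADTD-DISK-ACCESSES                                         */
        uint32_t disk_reads;     /* ADTD-DISK-READS                                            */
        uint32_t disk_writes;    /* ADTD-DISK-WRITES                                           */
        uint32_t lad_usage;      /* ADTD-LAD-USAGE: tally ordering the least-active-drive list */
        uint16_t link_more;      /* ADTD-LINK-MORE: next more-active drive, or NULL_LINE       */
        uint16_t link_less;      /* ADTD-LINK-LESS: next less-active drive, or NULL_LINE       */
    } adt_disk_line;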
  • the lines are grouped based on logical spindle number so that all lines related to bins of the first logical spindle are in the first section of the table, followed by all lines for bins of the second logical spindle, and so on for all logical spindles in the storage device.
  • the ADT-BINS line number, adjusted by the offset for the specific logical spindle, directly corresponds to a logical disk bin number on the disk.
  • When the host wants to access or modify data on the disk, it does so by referencing a starting disk sector address and indicating the number of sectors to be accessed or modified. For caching purposes, the starting sector address is converted into a logical disk bin number and sector offset within that logical disk bin.
  • a disk sector address is converted into a logical disk bin number and a sector offset by dividing it by the number of sectors per logical disk bin. The remainder is the offset into the bin.
  • the quotient is the disk bin identifier and is the index into the ADT table. Using this index, the condition of the specified disk bin can be determined directly from data in the ADT table; no search is required to determine cache-hits or misses.
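As a minimal sketch of that translation (the names are illustrative; sectors_per_bin would come from the configured bin size, for example 64 sectors for a 32KB bin with 512-byte sectors):

    #include <stdint.h>

    typedef struct {
        uint32_t disk_bin;       /* quotient: index (plus spindle offset) into ADT-BINS */
        uint32_t sector_offset;  /* remainder: first sector needed within that bin      */
    } bin_address;

    static bin_address sector_to_bin(uint32_t sector_address, uint32_t sectors_per_bin)
    {
        bin_address b;
        b.disk_bin      = sector_address / sectors_per_bin;
        b.sector_offset = sector_address % sectors_per_bin;
        return b;
    }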
  • Each ADT-BINS line contains at least the following item:
  • ADTB-CACHE-BIN This field contains the number of the logical cache bin which contains the data for the logical disk bin corresponding to this ADT table line number. By design, the value in ADTB-CACHE-BIN also points to the line in the LRU table related to the cached disk bin. A null value is stored in this field to indicate that the data which is stored in, or is destined to be stored in, the logical disk bin is not in cache memory. It is by means of this field that cache-hits can be serviced completely without any table search. This field is updated each time data for a logical disk bin is entered into or removed from the cache memory.
  • each ADT-BINS line may contain one or more activity monitoring fields such as, but not limited to, the following:
  • ADTB-BIN-ACCESSES A count of the number of times this logical disk bin of this spindle has been accessed by the host since the last time this field was reset.
  • ADTB-BIN-READS A count of the number of times this logical disk bin of this spindle has been accessed by a host read operation since the last time this field was reset.
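Putting the ADTB fields together, a hit or miss can be read directly from the table entry for a disk bin. The sketch below is illustrative; the NULL_BIN marker and the field widths are assumptions.

    #include <stdint.h>

    #define NULL_BIN 0xFFFFFFFFu   /* assumed "null" value for an uncached disk bin */

    typedef struct {
        uint32_t cache_bin;      /* ADTB-CACHE-BIN: cache bin (and LRU line) holding this disk bin */
        uint16_t bin_accesses;   /* ADTB-BIN-ACCESSES (optional activity counter) */
        uint16_t bin_reads;      /* ADTB-BIN-READS    (optional activity counter) */
    } adt_bin_line;

    /* No search: the disk bin number (plus the spindle's ADTD-LINE-BEG offset)
     * indexes the table directly, and the entry itself says hit or miss. */
    static int cached_bin_for(const adt_bin_line *adt_bins,
                              uint32_t line_beg, uint32_t disk_bin,
                              uint32_t *cache_bin)
    {
        *cache_bin = adt_bins[line_beg + disk_bin].cache_bin;
        return *cache_bin != NULL_BIN;
    }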
  • the LRU table maintains the basic information pertaining to all cache bins. This includes a variety of information, generally of the following nature: 1. The assignment of cache bins to logical spindles;
  • the LRU table also provides certain redundancies for the data kept in the ADT table, thus contributing to system reliability. It is by means of the LRU table, the ADT table, and the GAP table that the system determines which cache bin to overwrite when cache space is needed for an uncached disk bin.
  • the unindexed portion of the LRU table contains global data required to manage the caching process .
  • the tabular portions provide the actual cache management information and are composed of pointers for LRU chaining purposes, pointers into the ADT and GAP tables, and the recycle control registers or flags. For reference purposes, these three segments will be referenced as follows:
  • LRU-CONTROL the unindexed section of the LRU table which contains overall, global information concerning the cache bins.
  • LRU-DISKS the indexed section which contains information pertaining to the cache bins associated with each logical spindle included in the device.
  • LRU-BINS the indexed section of the LRU table which contains information pertaining to each cache bin of the entire storage device.
  • LRUC-GLOBAL-STATUS - current status of the global cache chain.
    LRUC-GLOBAL-DIRTY - total number of modified cache bins.
    LRUC-GLOBAL-MODE - current operating mode for the global cache.
    LRU-DISKS. FIRST INDEXED SEGMENT OF LRU TABLE. SPINDLES INFORMATION
  • LRUD-BINS-DIRTY number of modified cache bins assigned to the spindle.
  • LRUD-DISK-LRU pointer to oldest bin in the spindle's cache chain.
  • LRUD-DISK-MRU pointer to newest bin in the spindle's cache chain.
  • LRUB-CHAIN - flag indicating the bin is in the global or in a drive chain.
  • LRUB-VALID-LOW lowest sector in cache bin containing valid data.
  • LRUB-LOCK-RD-FETCH locked for fetch from disk for host read.
  • LRUB-LOCK-RD-AHEAD locked for read ahead based on host read.
  • the unindexed items pertain to the cache management, and include the following single-valued items.
  • LRUC-GLOBAL-BINS the total number of cache bins currently in the global cache chain.
  • LRUC-GLOBAL-LRU the pointer to the oldest bin in the global cache chain.
  • This LRU-CONTROL element points to the LRU-BINS table line whose corresponding cache data area is considered to be in the global chain and which has been left untouched for the longest period of time by a host read, cache ahead, or a cleaning operation. If there is new read activity for the referenced cache bin, it is updated and promoted to the MRU position for its spindle; if the new activity is a result of a host write, the cache bin is logically placed in the modified bins pool.
  • the GLOBAL LRU cache bin is the first candidate for overwriting when new data must be placed in the cache for any spindle. When such overwriting does occur, this bin will be removed from the global chain and placed in one of the spindle's local LRU chains or in the modified pool.
  • LRUC-GLOBAL-MRU the pointer to the LRU-BINS table line whose corresponding cache bin is in the global chain and which is considered to be the most recently used of the global cache bins.
  • GLOBAL-MRU is updated every time a cache bin of any spindle is demoted from any local spindle's LRU chain, when a cache bin of the modified pool is cleaned by writing its modified data to its disk, or when, for any reason, a cache bin is chained to this position.
  • This relates to the number of cache bins which contain unmodified data and which are currently assigned to the global cache and, therefore, are currently linked into the global cache chain.
  • the status is reset each time a cache bin is placed in or removed from the global cache chain.
  • the global status is always excess, normal, marginal, or minimal.
  • LRUC-GLOBAL-DIRTY the total number of modified cache bins, for all spindles, regardless of the spindle to which they are assigned.
  • These cache bins contain data which has been written from the host into cache and are currently waiting to be written from cache to disk.
  • LRUC-GLOBAL-MODE the current operating mode for the global cache. This relates to the total number of cache bins assigned to all disk drives and which currently contain modified data. The mode is reset each time a cache bin for any drive is moved into or out of the modified pool. The global mode is always normal, urgent, or saturated.
  • the first tabular section of the LRU table contains information relating to the cache bins assigned to each logical spindle. There is one line in this section for each logical spindle, and each line is referenced by logical spindle number.
  • LRUD-BINS-CHAIN the number of cache bins allocated to the corresponding spindle's private cache chain. This field is used to maintain the number of cache bins containing clean data and currently allocated to this spindle's private cache. This count excludes those cache bins assigned to this spindle but which are allocated to the global cache (and thus, linked into the global chain).
  • LRUD-BINS-DIRTY the number of cache bins currently assigned to the corresponding spindle, each of which contains some modified, or dirty, data which is currently awaiting a write to disk. This number is increased by one whenever an unmodified cache bin associated with this spindle is updated by the host, and it is decreased by one whenever data in a modified cache bin associated with this spindle is copied to the disk.
  • LRUD-DISK-LRU points to the spindle's cache bin (and to the corresponding line in the LRUB table) which is in the spindle's cache chain and which has been untouched by host activity for the longest period of time. It is updated when new activity for the referenced cache bin makes it no longer the least-recently-used of the referenced spindle.
  • the referenced cache bin is the next candidate for demoting to global cache when this spindle must give up a cache bin.
  • a cache bin demoted from the LRU position of any local cache chain enters the global chain at the MRU position.
  • LRUD-DISK-MRU points to the cache bin (and to the corresponding line in the LRUB table) which has been most recently referenced by host read activity.
  • the referenced cache bin will always be in the spindle's private cache.
  • LRUD-DISK-MRU is updated each time a cache bin of the referenced spindle is touched by a read from the host, when data for the spindle is read from disk into cache as part of a cache ahead operation, or when a cache bin is promoted based on the recycling procedures.
  • the address of the accessed cache bin is placed in LRUD-DISK-MRU and the LRU-BINS chains are updated in the LRU-BINS table.
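The re-chaining just described amounts to unlinking a bin from its place in the doubly linked chain and relinking it at the MRU end. The sketch below assumes the bin is already on the spindle's private chain and uses the LRUB-LINK-OLD / LRUB-LINK-NEW pointers described later in this section; NULL_LINE and the structure layouts are assumptions of this sketch.

    #include <stdint.h>

    #define NULL_LINE 0xFFFFu

    typedef struct {
        uint16_t link_old;   /* LRUB-LINK-OLD: next older bin in the same chain */
        uint16_t link_new;   /* LRUB-LINK-NEW: next newer bin in the same chain */
    } lru_bin_links;

    typedef struct {
        uint16_t disk_lru;   /* LRUD-DISK-LRU: oldest bin in the spindle's chain */
        uint16_t disk_mru;   /* LRUD-DISK-MRU: newest bin in the spindle's chain */
    } lru_disk_line;

    static void promote_to_mru(lru_bin_links *bins, lru_disk_line *disk, uint16_t bin)
    {
        uint16_t older = bins[bin].link_old;
        uint16_t newer = bins[bin].link_new;

        if (disk->disk_mru == bin)          /* already the most recently used */
            return;

        /* Unlink the bin from its current position in the chain. */
        if (older != NULL_LINE) bins[older].link_new = newer;
        if (newer != NULL_LINE) bins[newer].link_old = older;
        if (disk->disk_lru == bin) disk->disk_lru = newer;

        /* Relink it at the MRU end of the spindle's chain. */
        bins[bin].link_old = disk->disk_mru;
        bins[bin].link_new = NULL_LINE;
        bins[disk->disk_mru].link_new = bin;
        disk->disk_mru = bin;
    }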
  • When a bin is used from the LRU position of the global cache to fulfill the requirement for a cache bin for a given spindle, that global cache bin is allocated to the specific spindle at that time and either chained into the spindle's local chain or placed in the modified pool.
  • Such action may require the receiving spindle, or some other spindle, to give its LRU cache bin to the top (MRU) of the global cache chain. Giving up the LRU cache bin in such a fashion does not decache the data in the bin; the data remains valid for the spindle from whose chain it was removed until the cache bin reaches the global LRU position and is reused for caching some other logical disk bin.
  • LRUD-DISK-STATUS the current status of the cache chain for the disk drive. This relates to the number of cache bins which contain unmodified data and which are currently assigned to a given drive and, therefore, are currently linked into the drive's private cache chain. The status is reset each time a cache bin is placed in or removed from the disk's private cache chain. The drive status is always excess, normal, marginal, or minimal.
  • LRUD-DISK-MODE the current operating mode for the disk drive. This relates to the number of cache bins assigned to the disk drive and which currently contain modified data.
  • the mode is reset each time a cache bin for the given drive is moved into or out of the modified pool, or when a sweep timeout occurs.
  • the drive mode is always normal, timeout, sweep, urgent, or saturated.
  • Each LRU-BINS table line contains pointer fields plus other control fields.
  • LRUB-DISK-ID the identity of the spindle for this cache bin.
  • LRUB-CHAIN - flag indicating whether the cache bin is in the global cache chain or in the drive cache chain. This is a one-bit marker where a one indicates the cache bin is in the global chain, and a zero indicates the cache bin is in one of the drive's cache chains. When the cache bin is in the modified pool, this field has no meaning.
  • LRUB-LINK-OLD the pointer to the LRUB line relating to the next-older (in usage) cache bin for the same drive. This is part of the bidirectional chaining of the LRU table lines. If this bin is the oldest of all in the chain, LRUB-LINK-OLD will contain a null value.
  • LRUB-LINK-NEW the pointer to the LRUB line relating to the next newer (in usage) cache bin for the same drive. This is the other half of the bidirectional chaining of LRU table lines. If this bin is the newest of all in the chain, LRUB-LINK-NEW will contain a null value.
  • LRUB-VALID-LOW the number of the lowest sector within the cache bin containing valid data. This is a bin-relative number.
  • LRUB-VALID-HIGH the number of the highest sector within the cache bin containing valid data. This is a bin-relative number.
  • LRUB-MOD-LOW the number of the lowest sector within the cache bin containing modified data, if any. This is a bin-relative number.
  • LRUB-MOD-HIGH the number of the highest sector within the cache bin containing modified data. This is a bin-relative number.
  • LRUB-MOD-GAP a pointer into the GAP table if any gaps consisting of uncached data exist within the modified portion of the currently cached portion within this cache bin. If one or more such gaps exist, this field points to the GAP table line containing information pertaining to the first of such gaps. If no such gaps exist, this field will contain a null value. Since normal workloads create few gaps and a background task is dedicated to the clearing of gaps, there will be very few, if any, gaps at any given instant during normal operations.
  • LRUB-LOCKED - a set of flags which indicate whether or not the cache bin is currently locked. This set of flags indicates whether or not the corresponding cache bin is currently the target of some operation, such as being acquired from the disk, being modified by the host, or being written to the disk by the cache controller; such operation making the cache bin unavailable for certain other operations.
  • the following sub-fields each indicate some specific reason for which the cache bin is locked; such a lock may restrict some other specific operations involving this cache bin. More than one lock may be set for a given bin at any one time. For purposes of quickly determining if a cache bin is locked, these flags are treated as one field made up of eight sub-fields of one bit each. If found to be locked, the individual lock bits are inspected for the reason(s) for the lock.
  • LRUB-LOCK-RDHIT - a flag which, when set, indicates the cache bin is locked by a host read hit.
  • LRUB-LOCK-RDMISS - a flag which, when set, indicates the cache bin is locked by a host read miss.
  • LRUB-LOCK-WRHIT - a flag which, when set, indicates the cache bin is locked by a host write hit.
  • LRUB-LOCK-WRMISS - a flag which, when set, indicates the cache bin is locked by a host write miss.
  • LRUB-LOCK-RD-FETCH - a flag which, when set, indicates the cache bin is locked for a fetch from disk for a host read miss.
  • LRUB-LOCK-GAP-READ - a flag which, when set, indicates the cache bin is locked for a read from disk to eliminate a gap in modified data.
  • LRUB-LOCK-GAP-WRITE - a flag which, when set, indicates the cache bin is locked for a write to disk to eliminate a gap in modified data.
  • LRUB-LOCK-SWEEP - a flag which, when set, indicates the cache bin is locked by the sweep for writing modified data to disk.
  • LRUB-RECYCLE a field whose value indicates the desirability of recycling, or retaining in cache, the data currently resident in the cache bin. Its management and usage is as described in the recycling section of this document. The higher the value in this field, the more desirable it is to retain the data in this cache bin in cache when the cache bin reaches the LRU position in its spindle's LRU chain.
  • This field may be one or more bits in size; for purposes of this description, it will be assumed to be four bits, allowing for a maximum value of 15.
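Taken together, one LRU-BINS line might be laid out roughly as follows. This is an illustrative sketch only: the field widths, the bit assignments within LRUB-LOCKED, and the type names are assumptions, not the patent's layout.

    #include <stdint.h>

    /* Lock-reason bits within LRUB-LOCKED (bit assignment is illustrative). */
    enum {
        LOCK_RDHIT     = 1 << 0,   /* LRUB-LOCK-RDHIT     */
        LOCK_RDMISS    = 1 << 1,   /* LRUB-LOCK-RDMISS    */
        LOCK_WRHIT     = 1 << 2,   /* LRUB-LOCK-WRHIT     */
        LOCK_WRMISS    = 1 << 3,   /* LRUB-LOCK-WRMISS    */
        LOCK_RD_FETCH  = 1 << 4,   /* LRUB-LOCK-RD-FETCH  */
        LOCK_GAP_READ  = 1 << 5,   /* LRUB-LOCK-GAP-READ  */
        LOCK_GAP_WRITE = 1 << 6,   /* LRUB-LOCK-GAP-WRITE */
        LOCK_SWEEP     = 1 << 7    /* LRUB-LOCK-SWEEP     */
    };

    typedef struct {
        uint8_t  disk_id;        /* LRUB-DISK-ID                                  */
        uint8_t  in_global;      /* LRUB-CHAIN: 1 = global chain, 0 = drive chain */
        uint16_t link_old;       /* LRUB-LINK-OLD                                 */
        uint16_t link_new;       /* LRUB-LINK-NEW                                 */
        uint8_t  valid_low;      /* LRUB-VALID-LOW  (bin-relative sector)         */
        uint8_t  valid_high;     /* LRUB-VALID-HIGH (bin-relative sector)         */
        uint8_t  mod_low;        /* LRUB-MOD-LOW    (bin-relative sector)         */
        uint8_t  mod_high;       /* LRUB-MOD-HIGH   (bin-relative sector)         */
        uint16_t mod_gap;        /* LRUB-MOD-GAP: first GAP table line, or null   */
        uint8_t  locked;         /* LRUB-LOCKED: the eight one-bit lock reasons   */
        uint8_t  recycle;        /* LRUB-RECYCLE: 0..15 retention desirability    */
    } lru_bin_line;

    /* The flags are tested as one field first, then bit by bit if non-zero. */
    static int bin_is_locked(const lru_bin_line *b) { return b->locked != 0; }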
  • the GAP table maintains the information required to manage the gaps in the valid, modified data within bins of cached data. For each spindle, the gaps are chained such that all gaps for a given cache bin are grouped into a contiguous set of links. There are three sections in the GAP table: the unindexed, or single-valued, items, and two sets of indexed, or tabular, segments. For reference purposes, these three segments will be referenced as follows:
  • GAP-CONTROL the unindexed section of the GAP table.
  • GAP-DISKS the indexed section containing information pertaining to each logical spindle included in the described device.
  • GAP-GAPS the indexed section containing detailed information pertaining to each gap that currently exists.
  • the unindexed items of the GAP table include the following single-valued items.
  • GAP-GAPS The total number of gaps that currently exist for all logical spindles of the described storage device.
  • GAP-UNUSED-FIRST The line number of the first unused line in the GAPS portion of the GAP table.
  • GAP-UNUSED-NUMBER The number of unused lines in the GAPS portion of the GAP table.
  • the first tabular section of the GAP table contains summary information about the gaps relating to cache bins assigned to each logical spindle. There is one line in this section for each logical spindle, and it is indexed by logical spindle number. For the described version of this invention, this portion of the GAP table will contain four lines; however, any number of logical spindles could be included within the constraints of and consistent with the other elements of the device.
  • GAPD-NUMBER The number of gaps that currently exist for this logical spindle. This is increased by one whenever a new gap is created in a cache bin assigned to this logical spindle, and it is decreased by one whenever a gap for this logical spindle has been eliminated. If no gaps exist in cache bins assigned to this logical spindle, this value is set to zero.
  • GAPD-FIRST A pointer which contains the GAP table line number of the first gap for the logical spindle.
  • GAPD-LAST A pointer which contains the GAP table line number of the last gap for the logical spindle.
  • the second tabular section of the GAP table contains detail information about the gaps that exist. There is one line in this section for each gap in any cache bin assigned to any logical spindle, and it is indexed by arbitrary line number. Lines of the GAP-GAPS table are chained in such a way as to ensure that all GAP-GAPS lines relating to a given cache bin are chained into contiguous links. It is likely that, at any given time, very few, if any, of these lines will contain real information.
  • the design of the storage device operations is to minimize the number of gaps that exist at any given moment, and when they do exist, the device gives a high priority to the background task of eliminating the gaps. 1) GAPG-DISK.
  • GAPG-BIN The cache bin number in which this gap exists.
  • the value in this field acts as an index into the LRU table for the cache bin in which this gap exists.
  • the value in this field is null if this line of the GAP-GAPS table is not assigned to any cache bin; when this value is null, it indicates that this line is available to be used for some new gap should such a gap be created by caching activities.
  • GAPG-SECTOR-BEG The sector number in the bin identified in GAPG-BIN which is the first that contains non-valid data; this sector is the beginning of the gap. The value in this field is meaningless if this line of the GAP-GAPS table is not assigned to any spindle.
  • GAPG-SECTOR-END The sector number in the bin identified in GAPG-BIN which is the last that contains non-valid data; this sector is the end of the gap. The value in this field is meaningless if this line of the GAP-GAPS table is not assigned to any spindle.
  • GAPG-PREV A pointer to a line of the GAP-GAPS table which contains details pertaining to another gap of the same cache bin of the same logical spindle, which gap precedes this gap in the gap chain for the same cache bin for the same logical spindle. If this line of the table does not currently represent a gap, this field is used to maintain the position of this table line in the available gaps chain. Note that the firmware will maintain the gap chains in such a fashion so as to ensure that all gaps for a given cache bin assigned to a given spindle will be connected in contiguous links of the spindle gap chain.
  • this field points to the preceding unused line of the GAP-GAPS table. If this is the first link in the unused gap chain, this value will be set to null.
  • GAPG-NEXT A pointer to a line of the GAP-GAPS table which contains details pertaining to another gap of the same cache bin of the same logical spindle, which gap follows this gap in the gap chain for this logical spindle and cache bin. If this line of the table does not currently represent a gap, this field is used to maintain the position of this table line in the available gaps chain. If this line of the GAP-GAPS table is not assigned to any spindle, the value in this field points to the succeeding unused line of the GAP-GAPS table. If this is the last link in the unused gap chain, this value will be set to null.
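One GAP-GAPS line therefore carries the gap's location plus two chain links that double as free-list links when the line is unused. The C layout below is an illustrative sketch; the types and the NULL_LINE marker are assumptions.

    #include <stdint.h>

    #define NULL_LINE 0xFFFFu

    typedef struct {
        uint8_t  disk;         /* GAPG-DISK: logical spindle owning the gap                   */
        uint16_t bin;          /* GAPG-BIN: cache bin (and LRU line) containing the gap       */
        uint8_t  sector_beg;   /* GAPG-SECTOR-BEG: first non-valid sector of the gap          */
        uint8_t  sector_end;   /* GAPG-SECTOR-END: last non-valid sector of the gap           */
        uint16_t prev;         /* GAPG-PREV: previous gap for the same bin, or free-list link */
        uint16_t next;         /* GAPG-NEXT: next gap for the same bin, or free-list link     */
    } gap_line;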
  • the Modified Bins (MOD) Table working with the ADT, LRU, and GAP tables, maintains the information required to manage the background sweep operations of the described device.
  • For each disk bin, the MOD table contains one bit. This bit is a one if modified data currently resides in the cache bin corresponding to the disk bin which relates to the said bit, and the bit is a zero if no such modified data exists for the disk bin.
  • the bits are accessed in word-size groups, for example, 16 bits per access. If the entire computer word is zero, it is known that there is no modified data in the cache bins corresponding to the disk bins represented by those 16 bits.
  • If the computer word is non-zero, there is modified data in cache for at least one of the related disk bins. Since the bits of the MOD table are maintained in the same sequence as the disk bins, and the starting point related to each disk is known, the MOD table can be used to relate disk bins to modified cached data. This information, along with the information in the ADT, LRU, and GAP tables, gives a method of quickly determining which cache bin's data should be written to disk at the next opportunity. See the descriptions of the ADT table items which contain information about disk drive sizes and current read/write head positions. There is one segment in the MOD table; it is indexed by an arbitrary value which is calculated from a disk bin number. Each line of the MOD table is a single 16-bit word. Each word, or line, contains 16 one-bit flags representing the condition of the corresponding 16 disk bins with respect to modified data. The reference name for the field is MOD-FLAGS. FORMAT OF CACHE BINS (BAL) TABLE
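A minimal sketch of that word-at-a-time scan follows. The bit ordering within each 16-bit word and the function name are assumptions of this sketch.

    #include <stdint.h>

    /* Return the first disk bin at or after start_bin (and before end_bin)
     * whose MOD-FLAGS bit says it has modified data waiting, or -1 if none. */
    static long first_modified_bin(const uint16_t *mod_flags,
                                   uint32_t start_bin, uint32_t end_bin)
    {
        uint32_t bin = start_bin;
        while (bin < end_bin) {
            uint16_t word = mod_flags[bin / 16];
            if (word == 0) {                     /* 16 clean bins: skip the whole word */
                bin = (bin / 16 + 1) * 16;
                continue;
            }
            if (word & (1u << (bin % 16)))       /* this bin has dirty data in cache */
                return (long)bin;
            bin++;
        }
        return -1;
    }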
  • a temporary BAL table is created for each host I/O.
  • the BAL table for a given host I/O is made up of a list of the disk bins involved in the I/O and their corresponding cache bins, if any.
  • a BAL table will contain one line for each disk bin which is part of the I/O.
  • a BAL table contains the information necessary for fulfilling the host I/O; this includes the entire set of information required to handle the I/O.
  • the indexed portion of the table gives details about each disk bin involved in the host I/O.
  • the unindexed portion of the BAL table contains data describing the I/O.
  • the tabular portion provides actual cache information.
  • these two portions will be referenced as follows: 1.
  • BAL-COMMAND the unindexed section of the BAL table which contains overall information concerning the I/O.
  • BAL-BINS the indexed section which contains information pertaining to each disk bin associated with the host I/O.
  • the unindexed items in the BAL-COMMAND section describe the host I/O, and include the following single-valued items.
  • BALC-DISK-ID the logical spindle to which the host I/O was addressed.
  • BALC-ADDRESS the logical sector number of the first sector of data, on the specified logical spindle, to be transferred for the host I/O.
  • BALC-SIZE the number of sectors of data to be transferred in the host I/O.
  • BALC-HIT A flag indicating whether or not all the data (for a read command), or the whole data area (for a write command), for the entire host I/O is represented in cache.
  • the indexed items in the BAL-BINS section describe the details about each disk bin involved in the host I/O. There is one line in the table for each disk bin involved in the host I/O. Each line of a BAL table contains the following items. 1. BALB-DBIN - the disk bin number.
  • BALB-CBIN The corresponding cache bin, if any, in which data of the disk bin is currently cached.
  • BALB-BEGSEC The beginning sector number within the disk bin required for the host I/O.
  • BALB-ENDSEC The final sector number within the disk bin required for the host I/O.
  • BALB-VALID-LOW The beginning sector number within the cache bin which is in cache and which is required for the host I/O. For a cache hit, this will match the value in BALB-BEGSEC if this is the first bin involved in the I/O. For a cache hit on subsequent bins, if any, this will indicate the first sector (sector 0) of the bin.
  • BALB-VALID-HIGH The final sector number within the cache bin which is in cache and which is required for the host I/O. For a cache hit, this will match the value in BALB-ENDSEC if this is the last bin involved in the I/O; for a cache hit on
  • BALB-GAPS A marker indicating whether or not there are any gaps in the required cached area of this cache bin.
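As a sketch of how such a per-command list might be built, the fragment below walks the disk bins spanned by the I/O, records the first and last sector needed in each, and looks up the cache bin from the ADT entries. It is illustrative only: the structure layouts, the NULL_BIN marker, and the simplified hit test (which ignores the BALB-VALID-LOW/HIGH ranges and gaps that the full test would also check) are assumptions.

    #include <stdint.h>

    #define NULL_BIN     0xFFFFFFFFu
    #define MAX_BAL_BINS 32

    typedef struct {
        uint32_t dbin;      /* BALB-DBIN   */
        uint32_t cbin;      /* BALB-CBIN, or NULL_BIN if the disk bin is uncached */
        uint32_t begsec;    /* BALB-BEGSEC */
        uint32_t endsec;    /* BALB-ENDSEC */
    } bal_bin_line;

    typedef struct {
        uint8_t      disk_id;   /* BALC-DISK-ID */
        uint32_t     address;   /* BALC-ADDRESS */
        uint32_t     size;      /* BALC-SIZE    */
        int          hit;       /* BALC-HIT (simplified here) */
        uint32_t     nbins;
        bal_bin_line bins[MAX_BAL_BINS];
    } bal_table;

    /* adt_cache_bin[] holds the ADTB-CACHE-BIN value for each disk bin of the
     * addressed spindle (see the ADT-BINS sketch earlier). */
    static void build_bal(bal_table *bal, const uint32_t *adt_cache_bin,
                          uint32_t sectors_per_bin)
    {
        uint32_t first = bal->address;
        uint32_t last  = bal->address + bal->size - 1;
        uint32_t dbin;

        bal->nbins = 0;
        bal->hit   = 1;
        for (dbin = first / sectors_per_bin; dbin <= last / sectors_per_bin; dbin++) {
            bal_bin_line *line;
            if (bal->nbins >= MAX_BAL_BINS)
                break;                          /* command larger than this sketch allows */
            line = &bal->bins[bal->nbins++];
            line->dbin   = dbin;
            line->cbin   = adt_cache_bin[dbin];
            line->begsec = (dbin == first / sectors_per_bin) ? first % sectors_per_bin : 0;
            line->endsec = (dbin == last  / sectors_per_bin) ? last  % sectors_per_bin
                                                             : sectors_per_bin - 1;
            if (line->cbin == NULL_BIN)
                bal->hit = 0;                   /* any uncached bin spoils the hit */
        }
    }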
  • the caching operations of the described device are based, in part, on the concepts of drive modes and cache statuses. DRIVE MODES
  • the two types of drive modes are the global drive mode and a mode for each individual drive.
  • the modes are based on the number of modified cache bins assigned to each drive.
  • the global mode is based on the total number of modified cache bins for all drives. See Figure 9.
  • the purpose of the modes is to control the described device's actions with respect to its discretionary disk activity such as the background sweep, cache ahead, read ahead, and the cache management of recycling.
  • An illustration of a drive cache with respect to drive operating modes is given in Figure 5.
  • An illustration of the global cache with respect to global operating modes is given in Figure 7.
  • the cache of each drive is always in one of the defined drive modes.
  • the possible drive modes are normal, timeout, sweep, urgent, and saturated.
  • Table CM3 shows the rules for setting the drive modes.
  • CFG-DMURGNTB ≤ dp < CFG-DMSATURB : urgent : may affect performance
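A hedged reconstruction of the threshold tests behind Table CM3 is sketched below, where dp is the drive's count of modified cache bins. The exact boundary comparisons and the handling of the timeout rule are assumptions; only the ordering of the plateaus (normal, timeout, sweep, urgent, saturated) is taken from the text.

    typedef enum { DM_NORMAL, DM_TIMEOUT, DM_SWEEP, DM_URGENT, DM_SATURATED } drive_mode;

    static drive_mode set_drive_mode(unsigned dp, int sweep_timer_expired,
                                     unsigned cfg_dmsweep,   /* CFG-DMSWEEP  */
                                     unsigned cfg_dmurgntb,  /* CFG-DMURGNTB */
                                     unsigned cfg_dmsaturb)  /* CFG-DMSATURB */
    {
        if (dp >= cfg_dmsaturb)  return DM_SATURATED; /* writes dominate; host transfers may wait */
        if (dp >= cfg_dmurgntb)  return DM_URGENT;    /* background writes may affect performance */
        if (dp >= cfg_dmsweep)   return DM_SWEEP;     /* enough dirty bins to run the sweep       */
        if (sweep_timer_expired) return DM_TIMEOUT;   /* idle too long: flush a few dirty bins    */
        return DM_NORMAL;
    }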
  • the global cache of the described device is always in one of the defined global modes.
  • the possible global modes are normal, urgent, and saturated.
  • Table CM4 shows the rules for setting the global drive mode. See Figure 17.
  • the operating rules for a given drive are based on that drive's operating mode.
  • the drive mode governs various cache activities relating to the corresponding disk drives; in particular, it governs, drive by drive, the activation of the background sweep modules, the cache-ahead modules, recycling, and the read-ahead modules.
  • Table CM1 summarizes the drive mode -operating rules relationships.
  • the operating rules for the drives may be overridden by the rules based on the global operating mode. If the global mode so indicates, a drive may be forced to operate in a manner other than would be indicated by its own cache mode.
  • Table CM2 summarizes the relationships between the global modes and the drive modes. Global mode cedes control of individual drives to drive mode except as noted in table CM2. In case of conflict between the rules of the two types of modes, the global mode operating rules override the drive mode operating rules.
  • cache chain statuses There are two types of cache chain statuses, global cache status and a drive cache status for each drive.
  • the purpose of the statuses is to help to manage the cache activities, such as to provide plateaus for amounts of cache assigned to each drive under the varying balance of workloads among the drives.
  • the global cache status is based on the number of cache bins in the global cache.
  • Each drive's cache status is based on the number of bins in each drive's cache chain. While there could be any reasonable number of cache statuses, for purposes of this discussion, there will be assumed to be four; these are given names of minimal, marginal, normal, and excess. In the relationships between the statuses, excess is considered above, or higher than, normal; normal is considered above, or higher than, marginal; and marginal is considered above, or higher than, minimal.
  • One of the functions of the statuses is to facilitate the reallocation of cache from one drive to another as different drives become the most active of the drives. As a drive becomes the target of many host I/O's in a short period of time, the cache assigned to that drive will be drawn from the other drives in an orderly fashion. The other drive which is the least active of those in the highest status will give up cache first.
  • cache from the next least active drive will be drawn off until that drive reaches that same lower level .
  • The sizes of the categories of cache chains on which the statuses are based are shown in Figures 4 and 6. These sizes are chosen such that they enable the described device to act in an efficient manner based on the current cache conditions.
  • Figure 57 illustrates the growth and recession of the cache allocations using the three plateaus as described herein.
  • Figure 58 illustrates the cache allocations with an assumed five plateau configuration, a logical extension of the described concepts.
  • Each cache chain status defines a condition or situation under which the corresponding component is operating and how that component interacts with other components.
  • the minimal status is the one in which the component cannot under any condition give up a cache bin.
  • this status generally defines the number of cache bins required to handle one host I/O of the largest acceptable size.
  • this status generally defines the number of cache bins required to maintain the cache chains intact. Assuming more than one drive is configured into the described device, not all components can simultaneously be in the minimal status except during a period of a large number of writes by the host.
  • the marginal cache status is the one in which the component has sufficient cache bins available to operate but is not operating in the optimal manner. Assuming more than one drive is configured into the described device, not all components can simultaneously be in the marginal status except during a period of a large number of writes by the host.
  • the marginal cache status defines the smallest the cache chain for a given drive may become when the described device is operating in the generally normal fashion. In other words, each drive will usually have at least the marginal amount of cache protected from depletion by the needs of other drives.
  • the normal cache status is the one which the device logic desires to maintain the component for best overall device performance.
  • a very active drive will generally operate with a number of cache bins hovering in the neighborhood of the upper limit of the normal status.
  • a very inactive drive will generally operate with a number of cache bins hovering in the neighborhood of the lower limit of the normal status.
  • the excess cache status is the one in which the component has more than the desired maximum cache bins assigned to it for optimal overall device performance.
  • the global cache chain will begin operation in this status when the device is powered up. As various drives become active, the global status will move into the normal status. A drive will not likely ever be in the excess status.
  • the primary purpose of the excess status is to delineate the upper bound of normal cache status. This is important to maintaining the balance of the caches assigned to the various drives under changing workloads. DRIVE CACHE CHAIN STATUS DETERMINATION
  • the cache chain for each drive is always in one of the defined drive cache chain statuses. Table CS1 shows the rules for setting the drive cache statuses.
  • the global cache chain is always in one of the defined global cache chain statuses.
  • Table CS2 shows the rules for setting the global cache status.
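The status determinations of Tables CS1 and CS2 amount to comparing a chain's current bin count against three configured plateau boundaries. The sketch below is illustrative; which CFG parameter supplies each boundary (for the global chain, for example, CFG-GSMARGB and CFG-GSEXCESB) and the exact comparison directions are assumptions.

    typedef enum { CS_MINIMAL, CS_MARGINAL, CS_NORMAL, CS_EXCESS } chain_status;

    static chain_status set_chain_status(unsigned bins_in_chain,
                                         unsigned marginal_limit, /* lower bound of marginal */
                                         unsigned normal_limit,   /* lower bound of normal   */
                                         unsigned excess_limit)   /* lower bound of excess   */
    {
        if (bins_in_chain >= excess_limit)   return CS_EXCESS;
        if (bins_in_chain >= normal_limit)   return CS_NORMAL;
        if (bins_in_chain >= marginal_limit) return CS_MARGINAL;
        return CS_MINIMAL;
    }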
  • the global and drive cache chain statuses interact to determine where to find the cache bin to be reused.
  • the drive may acquire the bin via one of three types of actions. See Figure 11.
  • Stealing is the preferred method for a drive to obtain a cache bin when a drive needs one for any purpose, for a read of any nature or a write.
  • a drive which needs a cache bin to reuse may, depending on the cache chain statuses, "steal" a bin from the global cache chain. Since the drive is stealing the cache bin, it need not give up a cache bin; the global cache chain can provide a cache bin in some manner. The data in the global LRU cache bin is decached and the cache bin is made available for the drive's use.
  • the global cache may be compensated by taking a cache bin from some other drive.
  • the global cache will usually take a bin from the least active drive in order to maintain the global cache within the normal status.
  • the global cache chain is in normal status. 2. No other drive has a cache chain status better than marginal, 3. The cache chain status of the stealing drive is not excess.
  • the following set of conditions for stealing will generally occur only during the startup time of the device when nothing is in cache. These conditions are:
  • the global cache chain is in excess status.
  • the set of conditions for stealing a cache bin with compensation will be the normal ones encountered during the operation of the described device. These conditions are: 1. The global cache chain is not in the excess status .
  • a bin is to be stolen from the least active cache chain which has the highest cache status.
  • the global chain is considered to be the most active. See Figures 11 through 16. The following general logic is used: 1. The best, least active chain is identified.
  • the method for obtaining that cache bin depends, to some extent, on the intended usage of the cache bin.
  • the first set of conditions for buying a cache bin:
  • the buying drive's cache chain must be in the excess status; the cache bin may be used for either a read or a write.
  • the second set of conditions for buying a cache bin is a combination of the following:
  • the global cache chain must be in the minimal or marginal status.
  • the buying drive's cache chain is not in the excess status.
  • No other drive in the least active drive list has a status better than marginal.
  • the third set of conditions for buying a cache bin is a combination of the following:
  • the cache bin is to be used for a read operation.
  • the global cache chain must be in the minimal status.
  • the buying drive's cache chain is in the minimal status.
  • No other drive in the least active drive list has a status better than marginal.
  • the buying drive's LRU cache bin is rechained to the MRU position of the global cache chain.
  • the global LRU cache bin is rechained to the MRU position of the cache chain of the drive requiring a cache bin.
  • the data in the global LRU cache bin is decached and that cache bin is made available for the drive's use. If no drive is found to be able to donate a cache bin, the system is overloaded with modified bins, i.e. in a saturated state. In this condition, the management of all drives is actively trying to write modified bins to the corresponding drives. A drive begging for a bin must wait until a cache bin becomes available from some drive's cache, even from its own. DECACHE DATA FROM A CACHE BIN
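A much-simplified restatement of the steal / buy / beg decision is sketched below. The condition tests compress the rules above into a single function and are assumptions of this sketch (the full logic also distinguishes reads from writes and applies further conditions to each case); the chain_status enum repeats the values used in the earlier status sketch.

    typedef enum { CS_MINIMAL, CS_MARGINAL, CS_NORMAL, CS_EXCESS } chain_status;
    typedef enum { ACQ_STEAL, ACQ_STEAL_COMPENSATED, ACQ_BUY, ACQ_BEG } acquire_method;

    static acquire_method choose_acquire_method(chain_status global_status,
                                                chain_status my_status,
                                                chain_status best_other_status)
    {
        /* Steal outright: the global chain can spare a bin with no payback. */
        if (global_status == CS_EXCESS ||
            (global_status == CS_NORMAL &&
             best_other_status <= CS_MARGINAL &&
             my_status != CS_EXCESS))
            return ACQ_STEAL;

        /* Steal with compensation: take the global LRU bin, then replenish the
         * global chain from the least active drive in the highest status. */
        if (best_other_status > CS_MARGINAL)
            return ACQ_STEAL_COMPENSATED;

        /* Buy: trade this drive's own LRU bin to the global MRU in exchange. */
        if (my_status == CS_EXCESS || global_status <= CS_MARGINAL)
            return ACQ_BUY;

        /* Beg: the system is saturated with modified bins; wait for a clean bin. */
        return ACQ_BEG;
    }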
  • the references to the corresponding drive bin in the ADT table, LRU table, and GAP table, if any, are updated to show that the drive bin whose data was cached in the given cache bin is no longer cached.
  • a bin that is a candidate for decaching will never have any references in the MODIFIED BINS table since that would indicate the host has written some data into this cache bin which has not yet been written to the drive.
  • such a cache bin would be in the modified pool, and not in the global cache chain.
  • the cache bin to be decached will be the cache bin currently at the LRU position of the global cache chain. This generally will be a cache bin whose data has not been accessed recently.
  • the cache bin chosen is not necessarily the one with the absolutely longest time since access; due to the dynamic and ever-changing assignment of the described device's cache into a global cache chain and multiple private cache chains for each drive, there may be some cache bin, assigned to a drive other than the one currently involved in the reason for the decaching, which has been cached but unreferenced for a longer period of time. This exception is intentional; it is part of the design, which is intended to prevent activity related to any one drive from creating excessive interference with the caching performance of data for another drive, and it enhances the effectiveness of the described caching scheme.
  • the primary condition which must be satisfied in order for data in a cache bin to be decached is that the bin must be inactive; that is, it is not at this instant the subject of any host or background activity. It is highly unlikely that the global LRU bin would have any activity since most activities would reposition it out of the global cache chain.
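As a concrete picture of this bookkeeping, the following C fragment sketches the decaching of one cache bin. The structure fields and names are placeholders and are not the ADT, LRU, or GAP layouts described elsewhere in this document.

    #define NO_CACHE_BIN (-1)
    #define NO_DISK_BIN  (-1)
    #define NO_GAP       (-1)

    /* Illustrative table rows; the real table layouts differ. */
    struct adt_entry { int cache_bin; };                  /* one per disk bin  */
    struct lru_entry { int disk_bin; int gap_pointer; };  /* one per cache bin */

    /* Decache the data held in cache bin `cb`, assumed clean and inactive:
     * the disk bin's reference is dropped so the cache bin can be reused. */
    void decache_bin(struct adt_entry *adt, struct lru_entry *lru, int cb)
    {
        int db = lru[cb].disk_bin;
        if (db != NO_DISK_BIN)
            adt[db].cache_bin = NO_CACHE_BIN;   /* disk bin no longer cached   */
        lru[cb].disk_bin    = NO_DISK_BIN;      /* cache bin no longer owns it */
        lru[cb].gap_pointer = NO_GAP;           /* no gap bookkeeping remains  */
    }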
  • a gap is created when data destined for more than one portion of a disk bin are cached in a single cache bin, the cached portions of data are both in the modified or dirty condition, and the cached portions are not contiguous. This is dealt with by bookkeeping in the LRU and GAP tables. It is generally desirable to eliminate gaps as much as possible since they complicate the process of determining cache hits. There are several possibilities for the relationship of the location of new data sent from the host with respect to previously cached data.
  • the new data's locality may relate to the localities of previously cached data in a variety of ways.
  • the new data may share no cache bins with the old, in which case, no gaps in the modified data will occur. If the new data does share cache bins with previously cached data, it may share the cache bins with old data in several ways. If the new data is contiguous to, or overlapping, the old modified, cached data, no gaps can exist. If all the previously cached data is clean, no gap is allowed to be created since some or all of the old data may be logically decached to avoid the gap.
  • a gap may be eliminated in either of two ways, the method being selected to be the most beneficial to the overall operation of the device. The decision of which method to use in eliminating a gap is based on the relative sizes of the dirty data segments, the current drive mode, and the global mode. If the modes indicate that there is a relatively large amount of modified data to be written from cache to disk, the decision will be to write data to eliminate the gap.
  • the data that should be located in the intervening space may be read from the disk into cache and marked dirty even when it is not dirty; the gap will no longer exist. This method is chosen when the drive and global modes are both normal, and the ratio of the gap size to the size of the smaller of the adjacent cached areas is less than a predetermined value.
  • the data occupying the modified areas within the cache bin may be written from cache to disk; in this case the gap is eliminated by cleaning the dirty pieces of data and then decaching one of the cached areas.
  • the data to be decached is selected based on the relative sizes and directions within the cache bin of the cached portions so as to retain the data more likely to be useful in cache. This is usually the larger of the two segments, but may, under some circumstances, be the smaller. Regardless of the method chosen to eliminate a gap, or gaps, in the modified data within a given cache bin, the process involves several steps. See Figures 39, 51, and 50. 1. The GAP table is updated to show a gap elimination is in progress on a given cache bin. 2. The LRU table is updated to show a gap elimination is in progress on a given cache bin.
  • a read from, or a write to, a disk drive is initiated.
  • the device manager continues to handle other tasks, both for the host and background.
  • the gap elimination module completes the LRU table and GAP table bookkeeping required to eliminate the gap.
  • a gap is an area within a bin, its size being the number of sectors between the end of the modified data area preceding the gap and the beginning of the modified data area following the gap.
  • gapsize is the number of sectors in the gap.
  • forward-size is the size of the cached portion within the cache bin which is contiguous to the forward-most sector of the gap.
  • backward-size is the size of the cached portion within the cache bin which is contiguous to the rearward-most sector of the gap.
  • forward-ratio is gapsize divided by forward-size.
  • backward-ratio is gapsize divided by backward-size.
  • gapwrite is a preset ratio of the gapsize to forward-size or backward-size, which ratio, when exceeded, causes the gap to be eliminated by writing the modified data from cache to disk rather than by reading the gap data from disk into cache.
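Using the terms just defined, the choice between the two elimination methods can be sketched as follows. The names are illustrative, the adjacent cached areas are assumed non-empty, and the real decision also weighs the drive and global modes described above.

    #include <stdbool.h>

    typedef enum { ELIMINATE_BY_READ, ELIMINATE_BY_WRITE } gap_action_t;

    /* Sketch of the gap-elimination decision. */
    gap_action_t choose_gap_action(int gapsize, int forward_size, int backward_size,
                                   double gapwrite, bool modes_normal)
    {
        double forward_ratio  = (double)gapsize / forward_size;
        double backward_ratio = (double)gapsize / backward_size;

        /* Ratio of the gap to the smaller adjacent cached area, which is the
         * larger of the two ratios. */
        double ratio = forward_ratio > backward_ratio ? forward_ratio
                                                      : backward_ratio;

        /* Read the gap from disk (and mark it dirty) only when both modes are
         * normal and the gap is small relative to its neighbours. */
        if (modes_normal && ratio < gapwrite)
            return ELIMINATE_BY_READ;

        /* Otherwise write the dirty segments to disk and decache one of them. */
        return ELIMINATE_BY_WRITE;
    }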
  • One of the goals of the present invention is to have in cache, at any given time, that data which is expected to be accessed by the host in the near future. Much of the data retained in cache has been placed there either directly by the host or has been read from the disk as a direct result of a host activity. In addition to that data, it is desirable to anticipate future host requests for data to be read from the device, and to prefetch that data from the disk into cache. There are several aspects to the prefetching of data. Some of these have a positive effect and some have a negative effect on performance. On the positive side, successful prefetching into cache of data that is actually requested by the host precludes read cache misses. This improves the overall device's performance.
  • because a read cache-ahead action is a background type of activity and uses only the private channel between disk and cache, it has a minimal negative impact on the caching device's response time to host I/O activity.
  • the cache-ahead is given a lower priority than any incoming host I/O request.
  • the controller checks for the desirability for a cache-ahead after every host I/O which is a read operation regardless of whether the I/O was a cache hit or a cache miss.
  • a major factor in limiting the cache-ahead activity is the lack of need for its operation following most host I/O's.
  • the described caching device determines the number of data segments of the same size as the current host I/O which remain between the location of the end of the current host I/O data and each end of the cached bin containing that data. If this computed number of data segments is more than a predetermined number, the cache unit can handle that number of sequential, contiguous host read I/O's within that cache bin before there is a need to fetch data for the succeeding bin from the disk into the cache memory.
  • the caching device will attempt to initiate action to fetch the data from the succeeding disk drive bin so that the service to the host can proceed with the least disk-imposed delays. Conversely, if the caching device were to ignore the above-described locality factor and always fetch the next data bin after every cache read-miss, many unneeded bins of data would be fetched from disk into cache memory.
  • the forward bin is cached at this time. If the succeeding bin is already cached, the bin preceding the host I/O is considered; if it is not already cached, and the proximity factor favors caching, the data from the preceding bin is cached at this time. Of course, if both of these candidate bins had been cached previously, the cache-ahead module has no need to do any caching. A very important benefit accrues from this cache-ahead, cache-back feature. If related bins are going to be accessed by the host in a sequential mode, that sequence will be either in a forward or backward direction from the first one accessed in a given disk area.
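The proximity test itself reduces to a small calculation. The sketch below shows only the forward direction; the function, parameter names, and threshold are illustrative.

    #include <stdbool.h>

    /* Decide whether a forward cache-ahead is warranted.  `io_end_offset` is
     * the sector offset, within the cache bin, of the end of the current host
     * read; `io_size` is the size of that read in sectors. */
    bool need_forward_cache_ahead(int bin_size_sectors, int io_end_offset,
                                  int io_size, int proximity_threshold)
    {
        if (io_size <= 0)
            return false;

        /* Number of further host reads of the same size that still fit between
         * the end of the current read and the end of this cache bin. */
        int segments_remaining = (bin_size_sectors - io_end_offset) / io_size;

        /* Few segments remain, so fetch the succeeding disk bin now. */
        return segments_remaining <= proximity_threshold;
    }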
  • the background sweep and prefetches for a given drive alternately use the resources. See Figure 37.
  • any scheduled prefetches can proceed without concern for the sweep.
  • the sweep can proceed to use the resources as needed.
  • the cache-ahead proceeds as shown in Figure 34.
  • the cache management can do a cache-ahead in two steps.
  • the first step would entail a seek to position the read-write head of the disk drive to the proper disk track. If the drive were needed for servicing a higher priority operation, the cache-ahead could be aborted at this point. If no such operation existed at the end of the seek, the cache management would proceed to read the data from disk into cache to complete the cache-ahead operation.
  • the present invention is designed to handle operations between the host and the cache and between the disks and cache simultaneously.
  • a system of cache bin locks is incorporated to ensure that no data is overwritten or decached while it is the object of some kind of I/O activity.
  • the cache bin lock flags are included in the LRU table; see the section on the LRU format for a description of those flags. At any given time, a bin may be locked for more than one reason; all locks must be considered in determining if simultaneous operations may proceed. The most restrictive lock prevails. If a cache bin is found locked by a background task, there is no problem, since background tasks can be delayed. If a host I/O request involves a locked cache bin, there can be one of three results based on the lock flags of a bin: the new operation may proceed, the new operation may be delayed (queued for later completion), or, in rare cases, it may proceed using an alternate cache bin. The following notes discuss the various considerations and are referenced in the tables describing the handling of potential conflicts.
  • GNA A background operation usually will not be initiated if its target cache bin is currently locked for any reason.
  • GNB Multiple operations involving a given cache bin will be handled in the order received or initiated, all other conditions being equal . This is especially important when one or both operations are modifying the data in the cache bin.
  • GNC Use of a newly assigned, alternate cache bin for a host I/O results in the decaching of the cache bin currently assigned to the disk bin. Only clean bins can be decached.
  • Bins that are the subjects of cache aheads or read aheads may be abandoned at the end of those operations if so doing will contribute to the overall performance of the present invention.
  • a cache bin that contains any modified sectors is considered dirty, and resides in the pool of modified bins.
  • a read hit refers to the existence of both the cache bin and the currently valid (cached) sectors in that cache bin.
  • Gaps are taken into account when determining read hits.
  • a locked bin effectively causes a read hit to be handled as a read miss since a locked bin will delay fulfilling the host request.
  • a read miss may result in a read fetch, or in both a read fetch and a read ahead.
  • the LRU table is updated for valid sectors at the time each read fetch and each read ahead is completed.
  • a write hit refers to the cache bin only; the existence or absence of valid sectors is not considered for the write hit/miss determination.
  • WMA. A host write immediately marks the sectors being written from the host to the cache as modified in the LRU table, makes the target cache bin dirty, and removes the cache bin from its cache chain (placing the cache bin in the pool of modified bins).
  • WMB. A host write does not mark the sectors being written from the host to the cache as valid in the LRU table until the operation is completed.
  • WMC. A host write may modify the currently valid sectors in cache bins, may extend the valid area, create a gap, or do some combination of these.
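Notes WMA and WMB can be pictured with the fragment below. The bitmap layout, the bin size of 256 sectors, and the helper names are assumptions made only for illustration.

    #include <stdbool.h>

    #define SECTORS_PER_BIN 256                 /* illustrative size only */

    struct lru_bin {
        unsigned char modified[SECTORS_PER_BIN / 8];  /* WMA: set at once       */
        unsigned char valid[SECTORS_PER_BIN / 8];     /* WMB: set on completion */
        bool dirty;
    };

    static void set_bit(unsigned char *map, int s) { map[s / 8] |= 1u << (s % 8); }

    /* WMA: as the host write begins, mark the affected sectors modified and
     * the bin dirty.  The caller also removes the bin from its cache chain
     * and places it in the pool of modified bins. */
    void host_write_begin(struct lru_bin *bin, int first_sector, int count)
    {
        for (int s = first_sector; s < first_sector + count; s++)
            set_bit(bin->modified, s);
        bin->dirty = true;
    }

    /* WMB: only when the transfer completes are the sectors marked valid. */
    void host_write_complete(struct lru_bin *bin, int first_sector, int count)
    {
        for (int s = first_sector; s < first_sector + count; s++)
            set_bit(bin->valid, s);
    }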
  • RFA reads data from the disk into cache.
  • a read fetch occurs only as a result of a read miss, and the primary purpose of the fetch is to satisfy the direct requirements of the read miss .
  • RFB uses an assigned cache bin but does not mark the sectors being read from disk into cache as valid in the LRU table until the disk read (fetch) operation is completed.
  • a read fetch occupies the cache to disk I/O path and precludes other, simultaneous operations requiring that same path.
  • a bin locked for a read fetch cannot be decached to accommodate use of an alternate cache bin, since an active host read is waiting both for the data being fetched and for data which was resident in the present invention prior to this new host write.
  • a read ahead refers to the reading of data from disk into cache of the portion of the data in the disk bin succeeding the sectors covered by the read fetch which satisfied the read miss which triggered the read ahead.
  • RAB uses an assigned cache bin but does not mark the sectors being read from disk into cache as valid in the LRU table until the disk read operation is completed.
  • a read ahead bin may be abandoned if it is clean, and a subsequent, overlapping (in time) host I/O operation can be handled more quickly by assigning and using another cache bin instead of the read ahead bin. In this case, the bin that is the subject of a read ahead will be abandoned prior to or at the end of that operation, and the abandoned cache bins made available for immediate reuse.
  • a read ahead occupies the cache to disk I/O path and precludes other, simultaneous operations requiring that same path.
CACHE AHEAD NOTES
  • CAA A cache ahead is the result of the proximity logic determining that the data in a disk bin adjacent to data currently residing in a cache bin should be read from disk into cache.
  • a cache ahead assigns a cache bin to a disk bin; the cache bin is always clean and is immediately placed in the cache chain of the owning drive, but no sectors will be marked valid in the cache bin until the cache ahead is completed.
  • a cache ahead bin may be abandoned if a subsequent, overlapping (in time) host I/O operation can be handled more quickly by assigning and using another cache bin instead of the cache ahead bin. In this case, the bin that is the subject of a cache ahead will be abandoned prior to or at the end of that operation, and the abandoned cache bins made available for immediate reuse.
  • CAD A cache ahead occupies the cache to disk I/O path and precludes other, simultaneous operations requiring that same path.
  • a sweep write for a clean cache bin is a contradiction of terms, and cannot occur.
  • a sweep only deals with a dirty bin.
  • SWB A sweep write operation does not alter any data in a cache bin, and does not change the valid area of the cache bin.
  • a sweep write occupies the cache to disk I/O path and precludes other operations requiring that same path.
  • SWD. A host write may occur on a bin locked for a sweep write; however, the bin and its (newly) modified sectors must be recorded as dirty when the sweep and host write are completed, rather than being completely clean as at the completion of an uncontested sweep write.
  • a gap can occur only in modified portions of the data in a cache bin, and thus the bin is dirty.
  • a gap read for a clean cache bin is a contradiction of terms, and cannot occur.
  • a gap read is presumed to alter data in a cache bin.
  • a gap read occupies the cache to disk I/O path and precludes other, simultaneous operations requiring that same path.
  • GRD. A gap read uses an assigned cache bin but does not mark the sectors being read from disk into cache as valid in the LRU table until the disk read operation is completed.
  • a gap can occur only in modified portions of the data in a cache bin, and thus the bin is dirty.
  • a gap write for a clean cache bin is a contradiction of terms, and cannot occur.
  • GWB A gap write takes the sectors that are about to be decached out of the valid area when the gap write is initiated.
  • GWC A gap write occupies the cache to disk I/O path and precludes other, simultaneous operations requiring that same path.
OPERATION  CONDITION  LOCKED BY    COMMENT     NOTES
sweep      clean      any          impossible  SWA
sweep      dirty      cache ahead  impossible  CAB
sweep      dirty      read ahead   impossible  GNA
sweep      dirty      read fetch   impossible  GNA
sweep      dirty      sweep        impossible  GNE
sweep      dirty      gap read     impossible  GNA
sweep      dirty      gap write    impossible  GNA
sweep      dirty      read hit     proceed     SWB
sweep      dirty      read miss    impossible  RMA,RFB,RAD
sweep      dirty      write hit    impossible  GNA
sweep      dirty      write miss   impossible  GNA
  • An important goal of the described methodology is the retention in cache, at any given moment in time, of data which is most likely to be requested by the host in the near future.
  • One of the mechanisms of the present invention for accomplishing this goal is the recycling of cached bins based on recent history of usage of the data in each cache bin. Recycling, in its simplest form, is the granting of a "free ride" through the MRU-to-LRU cycle. Whenever a cache bin containing data previously accessed by the host is re-accessed by a read command from the host, information associated with that cache bin is updated in such a way as to indicate that bin's recent activity.
  • a write operation by the host does not contribute to the recycling of a bin, since a write miss is usually handled at the same speed as a hit, and, therefore, a much smaller benefit would accrue from recycling based on host write activity. It is likely the present invention's performance benefits as much or more from the availability for reuse of cache bins whose primary activity was host writes as it would from recycling such cache bins.
  • the recycling information is inspected whenever the cache bin reaches or nears the LRU position of the global cache chain. Normally, when a cache bin reaches the global LRU position, it is the primary candidate for decaching of its data when the cache bin is needed for caching some other data.
  • the LRU cache bin may be placed at the MRU position of the drive's private cache instead of being decached and reassigned.
  • This action provides the currently cached data in that cache bin one or more "free rides" down through the private and global cache chains, or in other words, that data's time in cache is increased.
  • the recycling management decides whether to place the recycled cache at the MRU position of its drive cache chain, or at the MRU position of the global cache chain.
  • if the resulting value is less than one half, the cache bin is not to be recycled.
  • if the resulting value is one half or more, the integer portion of the resulting value is placed in the recycle register, and the cache bin is recycled.
  • the recycle register is divided by a preset factor based on the urgent mode. See the recycling control parameters in the CFG table description.
  • if the resulting value is one half or more, the integer portion of the resulting value is placed in the recycle register, and the cache bin is recycled.
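These rules amount to the small test sketched below, applied when a bin arrives at the LRU position. The register width, divisor handling, and names are illustrative.

    #include <stdbool.h>

    /* Recycling test: `recycle_register` counts recent read hits on the bin
     * (capped elsewhere); `factor` is the preset divisor for the current mode. */
    bool try_recycle(unsigned *recycle_register, unsigned factor)
    {
        double value = (double)*recycle_register / factor;

        if (value < 0.5)
            return false;                     /* not recycled; may be decached */

        *recycle_register = (unsigned)value;  /* keep only the integer portion */
        return true;                          /* rechain at an MRU position    */
    }

For example, a register value of 3 divided by a factor of 4 gives 0.75, so the bin is recycled and the register drops to 0; the bin will not earn another free ride unless the host reads it again.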
  • the data from the host is placed in cache and the cache bins are assigned to the specified disk and placed in the modified bins pool.
  • the data will be written from the cache to its specified disk in the background, minimizing the impact of the disk operations on the time required to service the host I/O.
  • the modules that handle this background activity are collectively known as the background sweep modules.
  • To limit the sweep activity, and thus limit contention for the spindles, only the data in those portions of cache bins which have been modified are written from SSD to disk during a sweep.
  • Each disk's sweep is influenced by the mode in which the drive is currently operating, and is triggered independently of the other drives' conditions except when the global cache is saturated.
  • the background sweep modules do not usually copy data from cache to disk as soon as the data has been modified. Rather, the sweep modules remain dormant until some set of conditions justify its operation.
  • the background sweep for a given disk can be awakened by any of three sets of circumstances. These circumstances are:
  • the drive enters the sweep mode.
  • a drive is placed in the sweep mode when the number of cache bins containing modified data for that drive exceeds a preset threshold;
  • the drive enters the timeout mode.
  • a drive is placed in timeout mode when a specified amount of time has elapsed since the data in the oldest modified cache bin was written from the host to the specified disk's cache;
  • the global cache is in saturated mode, and there is some modified data waiting to be written from this disk's cache to the disk. Global cache is placed in saturated mode when some prespecified amount of all cache bins contains modified data.
  • a drive is placed in sweep mode when its count of modified cache bins surpasses some preset number.
  • the modified data from some number of cache bins will be written to the disk.
  • the number of cache bins from which data is to be written to disk is equal to the number of cache bins containing modified data at the time the count placed the drive in sweep mode. Since more cache bins may be written into by the host while the sweep is active, this sweep may not reduce the number of modified cache bins to zero. It is important that this limitation on the number of cache bins to be written exists since, otherwise, the sweep could be caught up in a lengthy set of repetitious writes of one cache bin.
  • the drive may be placed in the timeout mode.
  • a timeout occurs when data in a cache bin has been modified, the corresponding bin on disk has not yet been updated after a certain minimum time has elapsed, and the sweep for that disk has not been activated by the modified count. See Figure 45.
  • when a timeout occurs for a given disk's cache, by definition there will be data in at least one cache bin which needs to be copied to disk. At this time, the given disk's cache will be placed in the timeout mode.
  • a sweep which has been initiated by a timeout, unlike a sweep triggered by the counter, will write all modified cache bins for that disk drive to the disk before the drive sweep is terminated. Note, however, that this is a background activity, and as such, still has a lower priority than the handling of host commands.
  • the global cache may be forced into the saturated mode.
  • the global mode overrides the individual drive modes and conditions, and the sweep is activated for all drives for which there exist any modified cache bins.
  • the sweep for each drive behaves as though there had been a timeout. This method of triggering all the sweeps is for the purpose of making maximum use of the simultaneous cache-to-disk operations. As soon as the global crisis is past, the drive sweep operations will revert to individual drive control.
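The three triggers can be summarized in a single test; the parameter names stand in for the configured thresholds and are illustrative.

    #include <stdbool.h>

    /* Should the background sweep run for this drive right now? */
    bool sweep_should_run(int modified_bins, int sweep_threshold,
                          long oldest_modified_age, long timeout_age,
                          bool global_saturated)
    {
        if (global_saturated && modified_bins > 0)
            return true;                        /* global saturation override */
        if (modified_bins > sweep_threshold)
            return true;                        /* drive enters sweep mode    */
        if (modified_bins > 0 && oldest_modified_age >= timeout_age)
            return true;                        /* drive enters timeout mode  */
        return false;                           /* remain dormant             */
    }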
  • the background activities for each drive are handled; see Figure 37.
  • a write from cache to disk may be initiated.
  • the modified cache bin corresponding to the disk bin which is nearest, but not directly at, the disk read-write head position is identified. See Figure 47. If the read-write head for the drive is not currently located at the address of the disk bin corresponding to the modified bin, a seek is initiated on the drive to the identified disk bin, and the sweep takes no further action at this time. If no other activity directly involving the disk drive occurs before the sweep again is given an opportunity to write a modified cache bin, the same cache bin will be identified, and this time, the head position will be proper to continue with the write.
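A much simplified version of that selection is sketched below. It only picks the modified bin nearest the head; the seek-first, write-later behaviour and the handling of a bin directly under the head are left to the prose above.

    #include <stdlib.h>

    /* Return the index of the modified disk bin nearest the current read-write
     * head position, or -1 when there are no modified bins.  The flat list of
     * disk bin numbers is an illustrative stand-in for the modified bins pool. */
    int nearest_modified_bin(const int *modified_disk_bins, int count, int head_pos)
    {
        int best = -1;
        int best_dist = 0;

        for (int i = 0; i < count; i++) {
            int dist = abs(modified_disk_bins[i] - head_pos);
            if (best < 0 || dist < best_dist) {
                best = i;
                best_dist = dist;
            }
        }
        return best;
    }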
  • the command is a read and it can be serviced entirely from cache, it is serviced by the read-hit portion of the controller (see description of read-hit handling) .
  • the command is considered a read cache miss, the information required to service it is queued, and the storage device disconnects from the host. See Figure 30.
  • the background manager will note the existence of the queued task and will complete the handling of the read miss. See Figures 37, 40, and 42.
  • the command is a write and all bins involved in the operation are already in cache, the command is serviced by the write-hit portion of the controller. See Figure 32 and description of write-hit handling. If any portion of the write involves an uncached bin or bins, the command is turned over to the write-miss portion of the controller.
  • this drive is rechained upward in the LAD list. Additionally, the total disk accesses count is incremented. If this makes the total accesses reach the LAD maximum, the total accesses field is reset to zero, and the access count for each drive is recalculated to be its current value divided by the LAD adjustment factor.
  • This overall LAD procedure is designed to temporarily favor the most active drive (s) but not allow one huge burst of activity by one drive to dominate the cache management for an overly long period of time.
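The accounting behind this can be pictured as follows; the array form of the access counts and the parameter names are assumptions for illustration.

    /* Record one host access against a drive and periodically scale all drive
     * counts down so that an old burst of activity fades with time. */
    void lad_record_access(int drive, unsigned access_count[], int num_drives,
                           unsigned *total_accesses, unsigned lad_maximum,
                           unsigned lad_adjustment_factor)
    {
        access_count[drive]++;            /* this drive moves up in the LAD list */
        (*total_accesses)++;

        if (*total_accesses >= lad_maximum) {
            *total_accesses = 0;
            for (int d = 0; d < num_drives; d++)
                access_count[d] /= lad_adjustment_factor;   /* assumed nonzero */
        }
    }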
  • the analysis of a host command includes creation of a bin address list (BAL) which contains the locations of each bin involved in the operation (see description of bin address list setup).
  • the list will contain the bin's current location in cache, if it already resides there; or where it will reside in cache after this command, and related caching activity have been completed.
  • the space into which it is to be put in cache is located, and the current bin resident in that space is decached.
  • the analysis includes setting the cache hit/miss flag so that the controller logic can be expedited. See Figure 19.
  • the controller segment which sets up the bin address list uses the I/O sector address and size to determine the disk bin identifying numbers for each bin involved in the I/O operation as described in the section below. See Figure 20. The number of bins involved is also determined, and for each, the portion of the bin which is involved in the operation is calculated.
  • a sector address can be converted into a bin address by dividing it by the bin size. The quotient will be the bin number, and the remainder will be the offset into the bin where the sector resides. See Figure 21.
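This conversion is plain integer arithmetic, as the following sketch shows; the function and parameter names are illustrative.

    /* Convert a sector address into a disk bin number and an offset within
     * that bin. */
    void sector_to_bin(unsigned long sector, unsigned long bin_size,
                       unsigned long *bin, unsigned long *offset)
    {
        *bin    = sector / bin_size;   /* quotient:  the bin number           */
        *offset = sector % bin_size;   /* remainder: sector offset in the bin */
    }

For example, with a bin size of 256 sectors, sector 1,000 maps to bin 3 at offset 232.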
CACHE HIT/MISS DETERMINATION - READ
  • Each line of the bin address list is inspected and, if, for a given disk bin, a corresponding cache bin is shown in the ADT table to exist, that information is copied into the corresponding line of the BAL list.
  • the valid sectors information in the LRU table for the bin is compared with the sectors required for the current I/O.
CACHE HIT/MISS DETERMINATION - WRITE
  • Each line of the bin address list is inspected and, if, for a given disk bin, a corresponding cache bin is shown in the ADT table to exist, that information is copied into the corresponding line of the BAL list. If any bins are not in cache, the cache-miss marker is set.
CACHE READ-HIT OPERATION
  • Refer to Figure 29.
  • an I/O read command must have been received from the host. The command will have been analyzed and the bin address table will have been set up, and it has been determined that all data required to fulfill the read command is available in cache. With this preliminary work completed, the host read command can be satisfied by using each line of the bin address table as a subcommand control.
  • a cache read hit is satisfied entirely from the cached data without disconnecting from the host.
  • the caching firmware executes several operations at this time:
1. The requested data must be sent to the host. Since all required portions of all affected bins are already in cache, all required data can be sent directly from the cache to the host.
2. The affected bins will be rechained to become the most-recently-used (MRU) cache bins in the LRU cache chain for the drive. This may involve moving the cache bin(s) from the global cache chain to the specific drive's cache chain; a sketch of this rechaining operation follows this list.
3. The LRU table is updated to reflect the fact that this data has been accessed by the host; if the recycling register value for any cache bin involved in the read hit has not reached its maximum allowable value, it is increased by one to provide for the possible recycling of the cache bin when that bin reaches the LRU position of the cache chain.
4. Proximity calculations are performed to determine the desirability of scheduling a potential read-ahead of an adjacent disk bin. Refer to the discussion on cache-ahead and see Figure 8.
5. The statuses of the global cache chain and the specified drive cache chain must be updated.
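Step 2 above is a conventional doubly linked list operation. The sketch below uses illustrative structure names rather than the actual LRU table layout; the same routines serve both the global chain and the private drive chains.

    /* Cache chains as doubly linked lists threaded through the LRU table.
     * -1 marks an empty end; `next` points toward MRU, `prev` toward LRU. */
    struct chain    { int lru, mru; };
    struct lru_link { int next, prev; };

    static void unlink_bin(struct chain *c, struct lru_link *l, int bin)
    {
        if (l[bin].prev >= 0) l[l[bin].prev].next = l[bin].next;
        else                  c->lru = l[bin].next;
        if (l[bin].next >= 0) l[l[bin].next].prev = l[bin].prev;
        else                  c->mru = l[bin].prev;
    }

    static void link_mru(struct chain *c, struct lru_link *l, int bin)
    {
        l[bin].next = -1;
        l[bin].prev = c->mru;
        if (c->mru >= 0) l[c->mru].next = bin;
        c->mru = bin;
        if (c->lru < 0) c->lru = bin;
    }

    /* A read hit on `bin`, currently in chain `from` (for example the global
     * chain), rechains it at the MRU end of the owning drive's chain `to`. */
    void rechain_to_drive_mru(struct chain *from, struct chain *to,
                              struct lru_link *l, int bin)
    {
        unlink_bin(from, l, bin);
        link_mru(to, l, bin);
    }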
CACHE READ-MISS OPERATION
  • a cache read-miss (Figure 30) is satisfied in part or wholly from the disk. In order to reach this module, an I/O read command must have been received from the host. The command will have been analyzed and the bin address table will have been set up, and it has been determined that some or all data required to fulfill the read command are not available in cache. A cache read-miss is handled in several steps: 1. See Figure 30. The information required to handle the read command is saved in a read queue for later use.
  • the storage device logically disconnects from the host.
  • Steps 3 through 6 are repeated for each disk bin involved in the host request.
  • this module will, if the bin was linked into either the drive cache chain or the global cache chain, remove the affected cache bin from the cache chain and place it in the modified pool. In each case, the corresponding LRU table is updated to reflect any changes resulting from the existence of this new data. If the new data created any gaps in the modified portions of the bins, the GAP table is also updated accordingly in order to handle any needed post-transfer staging of partial bins.
UPDATING THE GAP TABLE
  • Adding a gap reference for a given cache bin is handled in several steps; if no previous gaps exist for the cache bin, the LRUB table gap pointer item will be null. To indicate a gap exists in the modified data of the cache bin, the pointer is set to the number of the first available line in the GAP table. The referenced GAP table line is then filled in with the information about the gap.
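A sketch of that bookkeeping follows. The table layouts, the linear search for a free line, and the chaining of multiple gaps per bin are illustrative assumptions.

    #define NO_GAP (-1)

    struct gap_entry { int in_use; int start_sector; int size; int next; };
    struct lru_row   { int gap_pointer; };   /* NO_GAP when the bin has no gaps */

    /* Record a new gap for `cache_bin`; returns the GAP table line used,
     * or NO_GAP if the table is full. */
    int add_gap(struct lru_row *lru, struct gap_entry *gap, int gap_lines,
                int cache_bin, int start_sector, int size)
    {
        for (int line = 0; line < gap_lines; line++) {
            if (!gap[line].in_use) {
                gap[line].in_use = 1;
                gap[line].start_sector = start_sector;
                gap[line].size = size;
                gap[line].next = lru[cache_bin].gap_pointer;  /* chain any others */
                lru[cache_bin].gap_pointer = line;
                return line;
            }
        }
        return NO_GAP;
    }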
  • a write miss is usually handled entirely within the cache.
  • the host command will have been analyzed and the bin address list will have been set up. With this preliminary work completed, the host write command can be satisfied by using each line of the bin address list as a subcommand control. Since this is a cache-miss, at least one of the addressed disk bins has no related cache bin assigned to it. One of two conditions will exist: 1) the device is operating in global saturated mode, or 2) it is not operating in global saturated mode.
CACHE WRITE-MISS OPERATION WHEN NOT SATURATED
  • If global cache is not operating in saturated mode, all data can be sent directly from the host to the cache without disconnecting from the host.
  • cache bins are selected as needed and assigned to the drive. See Figures 33 and 26. As data is written into each cache bin, the bin, if not already in the modified pool, is removed from its current cache chain and placed in the modified pool, and the MOD table is updated. See Figures 24 and 25.
  • the corresponding LRU table is updated to reflect any changes resulting from the existence of this new data. If the new data created any gaps in the modified portions of the bins, the GAP table is also updated accordingly in order to handle any needed post-transfer staging of partial bins. Since the writing into cache bins may change the status and modes of both the drive cache and the global cache, the drive status, the drive mode, the global status, and the global mode are all updated.
  • the corresponding ADT, LRU and GAP tables are updated to reflect any changes resulting from the existence of this new data.
  • the new data may change the status and modes of both the drive and the global cache, so the drive status, the drive mode, the global status, and the global mode are all updated.
  • the queued seek will not be carried out as long as the disk is busy handling host read cache misses, or if the drive is busy on background sweeps or cache-ahead actions, or if the cache modes or statuses indicate it would be intrusive to the overall caching performance to use a cache bin for the seek action.
  • When the read from disk to cache completes, the disk will generate an interrupt. See Figure 50.
  • the seek termination module will then update the control tables to show the subject disk bin's data is now in a cache bin.
  • the cache bin is rechained as the MRU bin for the drive. See Figure 52.
INTERNAL INTERRUPTS
  • While the present invention is in operation, it monitors its power source. Should the power to the unit be interrupted, for any reason, the device goes through a controlled power-down sequence, initiated as depicted in Figure 54.
POWER DOWN CONTROL
  • As depicted in the diagram of Figure 38, this portion of the firmware is invoked when the unit senses that the voltage on the power line to it has dropped.
  • Since some of the data in the device may be in cache bins in a modified state and awaiting transfer to one or more of the disks, power must be maintained to the cache memory until the modified portions have been written to their respective disks. Thus, a failure of the line power causes the device to switch to the battery backup module.
  • the battery backup module provides power while the memory device goes through an intelligent shutdown process. If the host is in the process of a data transfer with the memory device when power drops, the shutdown controller allows the transfer in progress to be completed. It then blocks any further transactions with the host from being initiated. The shutdown controller then must initiate a background sweep for each disk to copy any modified portions of cache bins from the solid state memory to the disk so that such data will not be lost when power is completely shut off to the control and memory circuits.
  • once this final sweep completes, the modified data in the solid state memory will also reside on the disks. At this point the disk spindles can be powered down, reducing the load on the battery. Most power outages are of a short duration. Therefore, the controller continues to supply battery power to the control circuits and the solid state memory for some number of seconds. If the outside power is restored in this time period, the controller will power the spindles back up and switch back to outside power. In this case, the operation can proceed without having to reestablish the historical data in the solid state memory. In any case, no data is at risk since it is all stored on the rotating magnetic disks before a final shutdown.
  • the final background sweep copies modified portions of data in cache bins from the solid state memory to the magnetic disk. See Figure 55. There will usually be only a few such cache bins, or portions of cache bins, to copy for each drive since the number of cache bins that can reach this state is intentionally limited by the logical operation of the system.
  • the final sweep makes use of logic developed for the normal timeout operation of the background sweep. The sweep is initiated in much the same manner as for a timeout during normal operation. If no cache bins for a given drive contain data which need to be copied, the sweep for that drive is left in the dormant state, and no further sweep action is required for the drive.
  • the sweep control sets up and initiates a background write event for the drive. Writes for all drives are executed simultaneously until all data from modified cache bins has been written from cache to the respective drives. When no modified cache bins remain to be copied, the sweep is finished.
  • the present invention includes a capability for utilizing an external terminal which can communicate with the device's executive control via a serial port.
  • This communication facility can handle several types of activities. See Figure 56.
  • the serial port may make inquiries to obtain data about the workloads the device has been encountering, such as the numbers of I/O's by the host, the current cache condition, the history of the caching operations, etc. This information is sufficient to allow external analysis to determine, as a function of time, levels of performance, frequency of occurrence of various situations such as the background sweep, cache-aheads, and the modes of operation.
  • the serial port can be used to initiate self tests and to obtain the results thereof.
  • the serial port may, under certain circumstances, modify the configuration of the device. For example, a disk drive may be removed from, or added to, the configuration. Another example is the resetting of some of the information in the configuration table, such as the various cache management parameters.
  • the serial port may be used by the device's executive to report current operational conditions such as hardware failures or perceived problems, such as excessive disk retries during I/O's between the cache and the disks.
  • control tables or segments of control tables are listed here. These tables and segments represent conditions which could occur at initialization and during the operation of the present invention. A very brief description of each pertinent table field is given here for convenience in interpreting the table data; see the various table format descriptions for more detailed information. Note that the asterisk (*) is used throughout to indicate a null value, and a dash (-) is used to indicate the field has no meaning in that particular circumstance.
CONFIGURATION TABLE
  • the CONFIGURATION table is made up of seven sections.
1. SIZING PARAMETERS AND DERIVED VALUES
  • CFG-DRIVEB 62,500 bins capacity of each disk drive.
  • CFG-DRIVES 4 spindles number of disk drives on device.
  • CFG-DSNORMB 820 bins lower limit of drive normal status.
  • CFG-GSEXCPCT 50 percent lower limit percent of all cache in global excess status.
  • CFG-GSEXCESB 4,096 bins lower limit of global chain in excess status. CFG-GSEXCPCT * CFG-CACHBINS
  • CFG-DMSATPCT 80 percent lower limit percent of modified bins for saturated mode.
  • CFG-DMSATURB 1,638 bins lower limit of modified bins for saturated mode. CFG-DMSATPCT * CFG-CACHBINS /
  • CFG-GMSATPCT 60 percent lower limit percent of all bins modified for saturated mode.
  • the complete LRU table is made up of three sections:
1. LRU-CONTROL unindexed counters of device activity
2. LRU-DISKS indexed by logical spindle
3. LRU-BINS indexed by logical disk bin
LEAST RECENTLY USED CONTROL TABLE, INITIAL
  • LRUC-TOTAL-MOD 0 bins number modified cache bins
  • LRUC-GLOBAL-BINS 8,184 bins number in global cache chain
  • LRUC-GLOBAL-LRU 0 line oldest bin in global cache chain
  • LRUC-GLOBAL-MRU 8,183 line newest bin in global cache chain
  • the values in the following table are dynamically variable; while there are many possible valid sets of values, the values in this table represent one possible set of values at a given point in time in the operation of the present invention.
  • LRUC-TOTAL-MOD 40 bins number modified cache bins
  • LRUC-GLOBAL-BINS 1,002 bins number global cache chain bins
  • LRUC-GLOBAL-LRU 110 line oldest bin in global cache chain
  • LRUC-GLOBAL-MRU 2,883 line newest bin in global cache chain
  • the values are dynamically variable; there are many possible valid sets of values depending on the specific implementation of the present invention, its configuration, and its work load prior to the capture of these LRU table values.
  • the values in this table represent a small sample of one possible set of values at a given point in time in the operation of the present invention. Only a few selected LRU lines are shown.
ADDRESS TRANSLATION (ADT) TABLE
  • ADTC-ACCESSES 0 accesses number of host accesses to device
  • ADTC-READS 0 accesses number of host reads to device
  • ADTC-WRITES 0 accesses number of host writes to device
  • ADTD-LINE-BEG first ADT-BINS line for referenced spindle
  • ADTD-HEAD-POS current position of read/write head of spindle
  • ADTD-SWEEP-DIR current direction of sweep
  • ADTD-DISK-ACCESSES number of host accesses since last reset
  • ADTD-DISK-READS number of host read accesses since last reset
  • ADTD-DISK-WRITES number of host write accesses since last reset
  • ADTD-LAD-USAGE function based on number of host accesses
  • ADTD-LINK-MORE chain pointer to more active drive in LAD list
  • ADTB-CACHE-BIN logical cache bin containing disk bin data
  • ADTB-BIN-ACCESSES number of host accesses since last reset
  • ADTC-ACCESSES 10,000,000 number of host accesses to device
  • ADTC-READS 8,000,000 number of host reads to device
  • ADTC-WRITES 2,000,000 number of host writes to device
  • the values in this table represent a sample of one possible set of ADT values at a given point in time in the operation of the present invention.
  • the line numbers, disk bin numbers, and disk numbers are implicit and are not in the ADT table, but they are included here for clarity.


Abstract

A data storage device (100) comprising solid state memory (208) and disk memory (210) provides fast, near solid state response times for a large body of work, with improved responsiveness, by means of hardware and algorithms which place and retain data on the most appropriate medium on the basis of actual and projected activity. A search-free method of determining the location of data is used. A solid state memory of sufficient capacity allows the retention of useful, active data as well as the prestaging of data into the solid state memory. The transfer of updated data from the solid state memory to disk, and of prestaged data from disk to the solid state memory, is performed as an opportune and unobtrusive background task. A locking mechanism ensures data integrity while permitting simultaneous data operations between the host computer and the solid state memory, and between the solid state memory and the disk memory. Private paths between the solid state memory and the disks prevent transfers between these media from conflicting with transmissions between the host computer and the described storage device.
PCT/US1997/010155 1996-06-18 1997-06-12 Nouvelle structure d'antememoire et procede WO1997049037A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU34839/97A AU3483997A (en) 1996-06-18 1997-06-12 Novel cache memory structure and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US66573796A 1996-06-18 1996-06-18
US08/665,737 1996-06-18

Publications (1)

Publication Number Publication Date
WO1997049037A1 true WO1997049037A1 (fr) 1997-12-24

Family

ID=24671378

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/010155 WO1997049037A1 (fr) 1996-06-18 1997-06-12 Nouvelle structure d'antememoire et procede

Country Status (2)

Country Link
AU (1) AU3483997A (fr)
WO (1) WO1997049037A1 (fr)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0066766A2 (fr) * 1981-06-05 1982-12-15 International Business Machines Corporation Dispositif de contrôle d'entrée-sortie avec une antémémoire dynamiquement réglable
WO1984002013A1 (fr) * 1982-11-15 1984-05-24 Storage Technology Corp Division adaptative en domaines d'un espace d'antememoire
US5307473A (en) * 1991-02-20 1994-04-26 Hitachi, Ltd. Controller for storage unit and method of controlling storage unit
WO1992015933A1 (fr) * 1991-03-05 1992-09-17 Zitel Corporation Systeme d'antememoire et procede d'exploitation de ce systeme
US5434992A (en) * 1992-09-04 1995-07-18 International Business Machines Corporation Method and means for dynamically partitioning cache into a global and data type subcache hierarchy from a real time reference trace

Also Published As

Publication number Publication date
AU3483997A (en) 1998-01-07

Similar Documents

Publication Publication Date Title
US4875155A (en) Peripheral subsystem having read/write cache with record access
US5594885A (en) Method for operating a cache memory system using a recycled register for identifying a reuse status of a corresponding cache entry
AU673488B2 (en) Cache memory system and method of operating the cache memory system
US5590300A (en) Cache memory utilizing address translation table
US4779189A (en) Peripheral subsystem initialization method and apparatus
US5325509A (en) Method of operating a cache memory including determining desirability of cache ahead or cache behind based on a number of available I/O operations
US5881311A (en) Data storage subsystem with block based data management
US6389509B1 (en) Memory cache device
US6928518B2 (en) Disk drive employing adaptive flushing of a write cache
US7111134B2 (en) Subsystem and subsystem processing method
JP4819369B2 (ja) ストレージシステム
US20240419364A1 (en) Tiering Data Strategy for a Distributed Storage System
US6889288B2 (en) Reducing data copy operations for writing data from a network to storage of a cached data storage system by organizing cache blocks as linked lists of data fragments
US9785564B2 (en) Hybrid memory with associative cache
US20050086437A1 (en) Method and system for a cache replacement technique with adaptive skipping
US5555399A (en) Dynamic idle list size processing in a virtual memory management operating system
US20030212865A1 (en) Method and apparatus for flushing write cache data
US20130262752A1 (en) Efficient use of hybrid media in cache architectures
US5717884A (en) Method and apparatus for cache management
US7032093B1 (en) On-demand allocation of physical storage for virtual volumes using a zero logical disk
US20040049638A1 (en) Method for data retention in a data cache and data storage system
US6782444B1 (en) Digital data storage subsystem including directory for efficiently providing formatting information for stored records
EP0156179B1 (fr) Méthode de protection de mémoire volatile primaire dans un système de mémoire à étages
US20110252205A1 (en) Managing Access Commands By Multiple Level Caching
US5845318A (en) Dasd I/O caching method and application including replacement policy minimizing data retrieval and storage costs

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 98503124

Format of ref document f/p: F

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase