Kumar et al., 2015 - Google Patents
Fusion: Design tradeoffs in coherent cache hierarchies for acceleratorsKumar et al., 2015
View PDF- Document ID
- 8515970280107087326
- Author
- Kumar S
- Shriraman A
- Vedula N
- Publication year
- Publication venue
- Proceedings of the 42Nd Annual International Symposium on Computer Architecture
External Links
Snippet
Chip designers have shown increasing interest in integrating specialized fixed-function coprocessors into multicore designs to improve energy efficiency. Recent work in academia [11, 37] and industry [16] has sought to enable more fine-grain offloading at the granularity of …
- 230000004927 fusion 0 title description 56
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0808—Multiuser, multiprocessor or multiprocessing cache systems with cache invalidating means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
- G06F12/0846—Cache with multiple tag or data arrays being simultaneously accessible
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/25—Using a specific main memory architecture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/16—Constructional details or arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/885—Monitoring specific for caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1012—Design facilitation
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Kumar et al. | Fusion: Design tradeoffs in coherent cache hierarchies for accelerators | |
| Zhang et al. | Victim replication: Maximizing capacity while hiding wire delay in tiled chip multiprocessors | |
| Cantin et al. | Improving multiprocessor performance with coarse-grain coherence tracking | |
| Ramos et al. | Modeling communication in cache-coherent SMP systems: a case-study with Xeon Phi | |
| Zebchuk et al. | A tagless coherence directory | |
| Kim et al. | Subspace snooping: Filtering snoops with operating system support | |
| Volos et al. | Bump: Bulk memory access prediction and streaming | |
| Koukos et al. | Building heterogeneous unified virtual memories (uvms) without the overhead | |
| Cantin et al. | Coarse-grain coherence tracking: RegionScout and region coherence arrays | |
| Dublish et al. | Cooperative caching for GPUs | |
| Loghi et al. | Cache coherence tradeoffs in shared-memory MPSoCs | |
| Ros et al. | DiCo-CMP: Efficient cache coherency in tiled CMP architectures | |
| Park et al. | Location-aware cache management for many-core processors with deep cache hierarchy | |
| Aggarwal et al. | Power-efficient DRAM speculation | |
| Shim et al. | Library cache coherence | |
| Hossain et al. | Improving support for locality and fine-grain sharing in chip multiprocessors | |
| Shukla et al. | Tiny directory: Efficient shared memory in many-core systems with ultra-low-overhead coherence tracking | |
| Fensch et al. | Designing a physical locality aware coherence protocol for chip-multiprocessors | |
| Foglia et al. | Exploiting replication to improve performances of NUCA-based CMP systems | |
| Barredo et al. | Planar: a programmable accelerator for near-memory data rearrangement | |
| Vermij et al. | An architecture for integrated near-data processors | |
| Salapura et al. | Improving the accuracy of snoop filtering using stream registers | |
| Shi et al. | LDAC: Locality-aware data access control for large-scale multicore cache hierarchies | |
| Soltaniyeh et al. | Classifying data blocks at subpage granularity with an on-chip page table to improve coherence in tiled cmps | |
| Ros et al. | Cache coherence protocols for many-core CMPs |