Cicek et al., 2022 - Google Patents
Energy efficient boosting of GEMM accelerators for DNN via reuse
- Document ID
- 10388025999366812905
- Author
- Cicek N
- Shen X
- Ozturk O
- Publication year
- 2022
- Publication venue
- ACM Transactions on Design Automation of Electronic Systems (TODAES)
Snippet
Reuse-centric convolutional neural network (CNN) acceleration speeds up CNN inference by reusing computations for similar neuron vectors in the CNN's input layer or activation maps. This new paradigm of optimizations is, however, largely limited by the overheads in neuron …
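The reuse idea in the snippet can be illustrated with a minimal sketch. This is not the authors' method: the similarity rule used here (rows that are equal after rounding), the function name `gemm_with_reuse`, and the `decimals` parameter are all assumptions chosen for illustration. The sketch multiplies each group of similar neuron vectors by the weights only once and copies that result to every member of the group, so the output is an approximation of the full product.

```python
import numpy as np

def gemm_with_reuse(X, W, decimals=1):
    """Approximate X @ W by computing one product per group of similar rows.

    Assumption for illustration: two rows of X count as "similar" when they
    are identical after rounding to `decimals` places.
    """
    keys = np.round(X, decimals)
    # reps holds one representative per group; inverse maps each row to its group.
    reps, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)      # keep a flat index across NumPy versions
    partial = reps @ W                 # one matrix product row per group
    return partial[inverse], len(reps) # reuse the group result for every member

# Three neuron vectors, two of which are nearly identical.
X = np.array([[1.00, 2.00],
              [1.01, 2.02],
              [5.00, 6.00]])
W = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
out, n_groups = gemm_with_reuse(X, W)
```

In this toy input the first two rows fall into the same group, so only two row products are computed for three rows. How groups are actually formed, and what that grouping costs, is the overhead the snippet alludes to and is not modeled here.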
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Cao et al. | Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity | |
| Mittal et al. | A survey of deep learning on CPUs: Opportunities and co-optimizations | |
| Albericio et al. | Cnvlutin: Ineffectual-neuron-free deep neural network computing | |
| Chung et al. | Linqits: Big data on little clients | |
| Aluru et al. | A review of hardware acceleration for computational genomics | |
| Gong et al. | Save: Sparsity-aware vector engine for accelerating dnn training and inference on cpus | |
| US20080250227A1 (en) | General Purpose Multiprocessor Programming Apparatus And Method | |
| Kim et al. | Accelerating large-scale graph-based nearest neighbor search on a computational storage platform | |
| Lee et al. | Anna: Specialized architecture for approximate nearest neighbor search | |
| Chen et al. | On-the-fly parallel data shuffling for graph processing on OpenCL-based FPGAs | |
| Wang et al. | Accelerating generalized linear models with MLWeaving: A one-size-fits-all system for any-precision learning | |
| Cicek et al. | Energy efficient boosting of gemm accelerators for dnn via reuse | |
| Cong et al. | Best-effort FPGA programming: A few steps can go a long way | |
| Soltaniyeh et al. | An accelerator for sparse convolutional neural networks leveraging systolic general matrix-matrix multiplication | |
| Han et al. | Distme: A fast and elastic distributed matrix computation engine using gpus | |
| Cicek et al. | General reuse-centric CNN accelerator | |
| Chen et al. | fgSpMSpV: A fine-grained parallel SpMSpV framework on HPC platforms | |
| Lin et al. | Hitgnn: High-throughput gnn training framework on cpu+ multi-fpga heterogeneous platform | |
| Yesil et al. | Hardware accelerator design for data centers | |
| Lee et al. | Similarity search on automata processors | |
| Lee et al. | MVP: An efficient CNN accelerator with matrix, vector, and processing-near-memory units | |
| Gupta et al. | Store-n-learn: Classification and clustering with hyperdimensional computing across flash hierarchy | |
| Qararyah et al. | An efficient hybrid deep learning accelerator for compact and heterogeneous CNNs | |
| Jeon et al. | XEM: Tensor accelerator for AB21 supercomputing artificial intelligence processor | |
| Sharafeddin et al. | On the effectiveness of accelerating MapReduce functions using the Xilinx Vivado HLS tool |