Brandalero et al., 2018 - Google Patents

Accelerating error-tolerant applications with approximate function reuse

Document ID: 8128257482633115814
Author: Brandalero M; da Silveira L; Souza J; Beck A
Publication year: 2018
Publication venue: Science of Computer Programming

Snippet

Function reuse is a promising approach to accelerate single-threaded applications and exceed the limits of instruction-level parallelism. This approach exploits the observation that certain functions are executed several times with the same inputs, producing the same …
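
The idea in the snippet can be illustrated with a small software sketch. It is not the mechanism from the paper: the `ReuseTable` class, the `truncate_mantissa` helper, and the `dropped_bits` parameter are hypothetical names, and matching inputs by truncating low-order mantissa bits is only one assumed way to make reuse approximate. With `dropped_bits=0` the table reuses results only for identical inputs; with a larger value, nearby inputs map to the same entry and share a result, trading accuracy for more hits.

```python
import math
import struct


def truncate_mantissa(x: float, dropped_bits: int) -> float:
    """Zero the lowest `dropped_bits` bits of a float64 (assumed approximation scheme)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    mask = ~((1 << dropped_bits) - 1) & 0xFFFFFFFFFFFFFFFF
    (y,) = struct.unpack("<d", struct.pack("<Q", bits & mask))
    return y


class ReuseTable:
    """Hypothetical reuse table: caches results of a pure function keyed by its inputs."""

    def __init__(self, func, dropped_bits=0):
        self.func = func
        self.dropped_bits = dropped_bits  # 0 means exact reuse
        self.table = {}
        self.hits = 0
        self.misses = 0

    def __call__(self, *args):
        # Inputs that agree after truncation share one table entry.
        key = tuple(truncate_mantissa(a, self.dropped_bits) for a in args)
        if key in self.table:
            self.hits += 1
            return self.table[key]  # reuse: the function body is skipped
        self.misses += 1
        result = self.func(*args)
        self.table[key] = result
        return result


# Exact reuse: only a repeated identical input hits.
exact = ReuseTable(math.sqrt)
exact(2.0)
exact(2.0)
exact(2.0000001)

# Approximate reuse: a nearby input also hits and gets the stored result.
approx = ReuseTable(math.sqrt, dropped_bits=40)
approx(2.0)
approx_result = approx(2.0000001)
```

In this sketch the approximate table answers the second call with the stored `sqrt(2.0)`, so the returned value carries a small error relative to the true `sqrt(2.0000001)`. That is acceptable only for error-tolerant code, and only for functions without side effects.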

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL

Similar Documents

Publication	Publication Date	Title
Silberstein et al.	2008	Efficient computation of sum-products on GPUs through software-managed cache
Fowers et al.	2012	A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
Chen et al.	2014	GPU-accelerated sparse LU factorization for circuit simulation with performance modeling
Krommydas et al.	2016	OpenDwarfs: Characterization of dwarf-based benchmarks on fixed and reconfigurable architectures
Wong et al.	2016	Approximating warps with intra-warp operand value similarity
Yang et al.	2012	A unified optimizing compiler framework for different GPGPU architectures
Alappat et al.	2022	Execution‐Cache‐Memory modeling and performance tuning of sparse matrix‐vector multiplication and Lattice quantum chromodynamics on A64FX
Ahn et al.	2005	Scatter-add in data parallel architectures
Stevens et al.	2018	AxBA: An approximate bus architecture framework
Cebrian et al.	2015	ParVec: vectorizing the PARSEC benchmark suite
Cooke et al.	2015	A tradeoff analysis of FPGAs, GPUs, and multicores for sliding-window applications
Liu et al.	2019	AxMemo: Hardware-compiler co-design for approximate code memoization
Zardoshti et al.	2016	Adaptive sparse matrix representation for efficient matrix–vector multiplication
Neves et al.	2020	Compiler-assisted data streaming for regular code structures
Brandalero et al.	2018	Accelerating error-tolerant applications with approximate function reuse
de la Cruz et al.	2014	Modeling stencil computations on modern HPC architectures
AlAhmadi et al.	2019	Performance characteristics for sparse matrix-vector multiplication on GPUs
Pereira et al.	2011	Spectral method characterization on FPGA and GPU accelerators
Atoofian	2020	Approximate cache in GPGPUs
Shivdikar	2021	SMASH: Sparse matrix atomic scratchpad hashing
Hofmann et al.	2015	Performance analysis of the Kahan-enhanced scalar product on current multicore processors
Kulkarni et al.	2017	Low overhead CS-based heterogeneous framework for big data acceleration
Al-Hashimi et al.	2017	On the power characteristics of mergesort: An empirical study
Jin et al.	2018	Nuclear Reactor Simulation on OpenCL FPGA: a Case Study of RSBench
van der Sanden	2011	Evaluating the performance and portability of OpenCL