Preiss et al., 2009 - Google Patents

Advanced clockgating schemes for fused-multiply-add-type floating-point units

Preiss et al., 2009

Document ID: 12002409084096265461
Author: Preiss J; Boersma M; Mueller S
Publication year: 2009
Publication venue: 2009 19th IEEE Symposium on Computer Arithmetic

External Links

Cited by

Snippet

The paper introduces fine-grain clockgating schemes for fused multiply-add-type floating- point units (FPU). The clockgating is based on instruction type, precision and operand values. The presented schemes focus on reducing the power at peak performance, where …

Continue reading at www.academia.edu (PDF) (other versions)

230000001603 reducing 0 abstract description 11

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/533—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
- G06F1/3237—Power saving by disabling clock generation or distribution
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/499—Denomination or exception handling, e.g. rounding, overflow
- G06F7/49905—Exception handling
- G06F7/4991—Overflow or underflow
- G06F7/49915—Mantissa overflow or underflow in handling floating-point numbers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/483—Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system, floating-point numbers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
- G06F7/505—Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/3001—Arithmetic instructions
- G06F9/30014—Arithmetic instructions with variable precision
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/3804—Details
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/535—Indexing scheme relating to groups G06F7/535 - G06F7/5375
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/78—Power analysis and optimization

Similar Documents

Publication	Publication Date	Title
KR101566257B1 (en)	2015-11-05	Reducing power consumption in a fused multiply-add (fma) unit responsive to input data values
Vangal et al.	2006	A 6.2-GFlops floating-point multiply-accumulator with conditional normalization
US7137021B2 (en)	2006-11-14	Power saving in FPU with gated power based on opcodes and data
Mueller et al.	2005	The vector floating-point unit in a synergistic processor element of a Cell processor
US11068238B2 (en)	2021-07-20	Multiplier circuit
Bruintjes et al.	2012	Sabrewing: A lightweight architecture for combined floating-point and integer arithmetic
Preiss et al.	2009	Advanced clockgating schemes for fused-multiply-add-type floating-point units
US8566383B2 (en)	2013-10-22	Distributed residue-checking of a floating point unit
Del Barrio et al.	2014	Ultra-low-power adder stage design for exascale floating point units
Samanth et al.	2019	Power reduction of a functional unit using rt-level clock-gating and operand isolation
Kuang et al.	2013	Energy-efficient multiple-precision floating-point multiplier for embedded applications
US8244783B2 (en)	2012-08-14	Normalizer shift prediction for log estimate instructions
Banerjee et al.	2005	Novel low-overhead operand isolation techniques for low-power datapath synthesis
Curran et al.	2006	4GHz+ low-latency fixed-point and binary floating-point execution units for the POWER6 processor
Bo et al.	2012	Reducing power consumption of floating-point multiplier via asynchronous technique
Pimentel et al.	2014	Hybrid floating-point modules with low area overhead on a fine-grained processing core
Lasith et al.	2014	Efficient implementation of single precision floating point processor in FPGA
Ratković et al.	2016	A fully parameterizable low power design of vector fused multiply-add using active clock-gating techniques
Gok et al.	2017	VaLHALLA: Variable latency history aware local-carry lazy adder
Hockert et al.	2012	Improving floating-point performance in less area: Fractured floating point units (FFPUs)
Gilani et al.	2012	Virtual floating-point units for low-power embedded processors
Dieter et al.	2007	Low-cost microarchitectural support for improved floating-point accuracy
Zhang et al.	2008	A power-efficient floating-point co-processor design
Mahmoud et al.	2017	Low energy pipelined Dual Base (decimal/binary) Multiplier, DBM, design
Lazo	2021	Design and implementation of an out-of-order execution engine of floating-point arithmetic operations