Palamadai Natarajan et al., 2017 - Google Patents

Autotuning divide‐and‐conquer stencil computations

Document ID
721709892673851483
Author
Palamadai Natarajan E
Mehri Dehnavi M
Leiserson C
Publication year
2017
Publication venue
Concurrency and Computation: Practice and Experience

Snippet

This paper explores autotuning strategies for serial divide‐and‐conquer stencil computations, comparing the efficacy of traditional “heuristic” autotuning with that of “pruned‐exhaustive” autotuning. We present a pruned‐exhaustive autotuner called Ztune that …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G06F17/30424 Query processing
    • G06F17/30442 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G06F17/30389 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4441 Reducing the execution time required by the program code
    • G06F8/4442 Reducing the number of cache misses; Data prefetching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G06F8/41 Compilation
    • G06F8/45 Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/456 Parallelism detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30289 Database design, administration or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G06F17/5009 Computer-aided design using simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/44 Arrangements for executing specific programmes
    • G06F9/455 Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring

Similar Documents

Publication, Title
Wang et al. Gunrock: GPU graph analytics
Gao et al. Estimating GPU memory consumption of deep learning models
US10437573B2 (en) General purpose distributed data parallel computing using a high level language
Sundaram et al. Graphmat: High performance graph analytics made productive
US8239847B2 (en) General distributed reduction for data parallel computing
Xie et al. Fast iterative graph computation with block updates
Reguly et al. Loop tiling in large-scale stencil codes at run-time with OPS
Gareev et al. High-performance generalized tensor operations: A compiler-oriented approach
Tang et al. Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency
Christen et al. Patus for convenient high-performance stencils: evaluation in earthquake simulations
Elafrou et al. Sparsex: A library for high-performance sparse matrix-vector multiplication on multicore platforms
Sevenich et al. Using domain-specific languages for analytic graph databases
Izquierdo-Carrasco et al. A generic vectorization scheme and a GPU kernel for the phylogenetic likelihood library
Mustafa A survey of performance tuning techniques and tools for parallel applications
Devarajan et al. Vidya: Performing code-block I/O characterization for data access optimization
Wernsing et al. Elastic computing: A portable optimization framework for hybrid computers
Goda et al. Out-of-order execution of database queries
Sankaran et al. Benchmarking the linear algebra awareness of tensorflow and pytorch
Satish et al. Can traditional programming bridge the ninja performance gap for parallel computing applications?
Abdelfattah et al. Addressing irregular patterns of matrix computations on GPUs and their impact on applications powered by sparse direct solvers
Palamadai Natarajan et al. Autotuning divide‐and‐conquer stencil computations
Lukarski Parallel sparse linear algebra for multi-core and many-core platforms: Parallel solvers and preconditioners
Palkowski et al. Tiling Nussinov’s RNA folding loop nest with a space-time approach
AlOnazi et al. Asynchronous task-based parallelization of algebraic multigrid
Hofmann et al. Performance analysis of the Kahan-enhanced scalar product on current multicore processors