SparseBench
Introduction
SparseBench is a hybrid MPI+OpenMP parallel sparse solver benchmark for evaluating the performance of iterative sparse linear solvers and sparse matrix-vector operations. It is designed to measure performance on both single-node and distributed memory systems.
Dwarf Classification
SparseBench implements the Sparse Linear Algebra dwarf from the Berkeley taxonomy. The core computational pattern is the sparse matrix-vector multiply (SpMV), which is characterized by indirect memory access through index arrays and low arithmetic intensity. This makes SpMV typically memory-bandwidth bound, with performance strongly dependent on the memory subsystem and data layout.
Key Features
- Kernels: Conjugate Gradient (CG) iterative solver and a standalone sparse matrix-vector multiplication (SpMV) benchmark; GMRES and Chebyshev Filter Diagonalization are planned
- Sparse matrix formats: CRS (Compressed Row Storage), SCS (Sell-C-Sigma for vectorization), CCRS (Compressed CRS for memory efficiency)
- Parallelism: MPI for distributed memory, OpenMP for shared memory
- Matrix input: Generated 3D stencil matrices, Matrix Market files (.mtx), binary format (.bmx)
- Optimized MPI communication: a single `MPI_Neighbor_alltoallv` call per iteration through matrix localization
- Precision: single and double precision floating point
Getting Started
Clone the SparseBench repository and follow the build and usage instructions in the README.
Performance Characteristics
SparseBench kernels are memory-bandwidth bound due to the low arithmetic intensity of sparse matrix-vector operations. The indirect addressing pattern in SpMV leads to irregular memory access, making cache utilization and data layout critical for performance. The Sell-C-Sigma (SCS) format improves SIMD vectorization and cache efficiency compared to standard CRS on modern processors. Key tuning dimensions include the sparse matrix format, the number of MPI ranks vs. OpenMP threads, and thread affinity settings.
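A hybrid launch that exercises the rank/thread and affinity tuning dimensions might look like the following (the binary name and the 4×8 split are purely illustrative assumptions, not SparseBench's actual CLI; the `OMP_*` variables are standard OpenMP environment variables, and `mpirun` options vary by MPI implementation):

```shell
# 4 MPI ranks x 8 OpenMP threads per rank, threads pinned to cores.
# "./sparsebench" is a placeholder binary name; see the README.
export OMP_NUM_THREADS=8
export OMP_PROC_BIND=close   # keep a rank's threads on nearby cores
export OMP_PLACES=cores
mpirun -np 4 ./sparsebench
```

For memory-bandwidth-bound kernels, sweeping the rank/thread split and pinning threads so each rank's working set stays within one NUMA domain typically matters more than core count alone.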
Citations
TODO
Credits & License
SparseBench is developed by the Erlangen National High Performance Computing Center (NHR@FAU) at the University of Erlangen-Nuremberg.
Licensed under the MIT License.