Why C and not C++?#

Our focus is on the algorithmic level. C lets us implement algorithms in a straightforward, explicit way that anyone with basic programming knowledge can follow. A language like C++ would shift attention away from the algorithms themselves, introducing abstractions and implicit behavior that obscure what’s really happening under the hood. C also has practical advantages: it is arguably the most portable language in existence, runs on virtually any processor architecture and operating system, and delivers solid, reliable performance.

Do you support alternative programming models?#

Given the requirements for portability and simplicity, and with a focus on state-of-the-art performance, we generally avoid offering multiple programming models. Instead, we stick to the de facto standard HPC programming models: OpenMP for thread-level parallelism, MPI for distributed-memory parallel programming, and CUDA-like models for GPU accelerators. There is one exception: to explore portable SIMD code, we also provide an ISPC version for RabbitCT.

Why Make and not CMake?#

The build system follows the same guiding principles of simplicity and portability. We use standard GNU Make, available on virtually every platform, and provide a Makefile that works out of the box while remaining easy to extend and adapt. Build configuration is handled through plain text files, keeping settings transparent and easy to inspect. The Makefile supports multiple toolchains and enforces a consistent directory structure throughout the project. Integration with Clang tools is supported via an auto-generated .clangd file for IDE features such as code completion and diagnostics, and the third-party bear tool can be used to generate a compile commands database for full compatibility with the Clang ecosystem.
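A trimmed-down sketch of such a Makefile (variable names and directory layout are illustrative, not the project's actual configuration) shows how little machinery plain GNU Make needs for a multi-file C project:

```make
# Hypothetical build configuration; the real project's variables
# and layout may differ. Toolchain settings can be overridden on
# the command line, e.g. `make CC=clang`.
CC      ?= gcc
CFLAGS  ?= -O3 -Wall -Wextra
SRC     := $(wildcard src/*.c)
OBJ     := $(patsubst src/%.c,build/%.o,$(SRC))

all: build/app

# Link all objects into the final binary.
build/app: $(OBJ)
	$(CC) $(CFLAGS) -o $@ $^

# Pattern rule: compile each source file into build/.
build/%.o: src/%.c | build
	$(CC) $(CFLAGS) -c -o $@ $<

build:
	mkdir -p build

clean:
	rm -rf build

.PHONY: all clean
```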

But there are already proxy app collections. Why another?#

The earliest proxy app collections — such as the NAS Parallel Benchmarks, the Mantevo Proxy Apps, and the SPEC HPC benchmarks — were created primarily for hardware characterization, procurement preparation, and early-stage algorithm porting. These projects aggregate contributions from many developers, resulting in implementations that span multiple languages and vary widely in code quality and optimization. Since applications are often extracted from production codes, they inherit the complexity and technical debt of their donor codebases, and tend to be used as black boxes with little scrutiny applied to the underlying source code.

Our approach is different. We craft each implementation from scratch, with readability, correctness, and performance as equal priorities. The result is clean, consistent code that clearly exposes the underlying algorithms without unnecessary complexity. We actively track developments in parallel programming and hardware architecture to ensure our implementations reflect the state of the art.

These codes serve three primary use cases: performance-oriented research, where clean baselines are essential for meaningful analysis; hardware characterization and benchmarking, where transparency yields reliable and reproducible results; and teaching, where clarity is paramount to conveying algorithmic and architectural concepts effectively.