Top 23 C++ parallel-computing Projects
-
Project mention: Adding 70-language translation to an image API without paying per word | dev.to | 2026-06-09
Runtime: CTranslate2 with int8 quantization. This is the key piece. It shrinks the model to ~1.3 GB and runs CPU inference fast. Do not run raw transformers on CPU for this.
-
kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
-
Project mention: Delivering the Missing Building Blocks for Nvidia CUDA Kernel Fusion in Python | news.ycombinator.com | 2025-07-16
There’s an extensive change-log supporting the CCCL 3.0 release on GitHub from 3 hours ago: https://github.com/NVIDIA/cccl/releases/tag/v3.0.0
-
Kratos
Kratos Multiphysics (A.K.A Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos has BSD license and is written in C++ with extensive Python interface. (by KratosMultiphysics)
-
libfork
A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines
-
coros
An easy-to-use and fast library for task-based parallelism, utilizing coroutines. (by mtmucha)
-
ForkUnion
Lower-latency OpenMP-style minimalistic scoped thread-pool designed for 'Fork-Join' parallelism in Rust and C++, avoiding memory allocations, mutexes, CAS-primitives, and false-sharing on the hot path 🍴
As mentioned in the docstring above, using STL's `std::hardware_destructive_interference_size` won't help you. On ARM, this issue becomes even more pronounced, so concurrency-heavy code should ideally be compiled multiple times for different coherence protocols and leverage "dynamic dispatch", similar to how I & others handle SIMD instructions in libraries that need to run on a very diverse set of platforms.
[1] https://github.com/ashvardanian/ForkUnion/blob/46666f6347ece...
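The false-sharing concern in the comment above can be sketched in a few lines of C++17. This is a minimal illustration, not ForkUnion's code: `assumed_cache_line`, `padded_counter`, and `count_in_parallel` are hypothetical names, the 128-byte constant is an assumption rather than a value taken from the library, and plain `std::thread` stands in for a real scoped thread pool (so, unlike ForkUnion, it allocates on every call).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: 128 bytes is at least the destructive-interference distance on
// the target CPUs. The value is illustrative and should be tuned per platform.
constexpr std::size_t assumed_cache_line = 128;

// Each counter is aligned and padded to its own slot, so two threads never
// write to the same cache line.
struct alignas(assumed_cache_line) padded_counter {
    std::uint64_t value = 0;
};

// Fork `threads` workers, let each one bump only its own slot, then join
// them all and sum the slots on the calling thread.
inline std::uint64_t count_in_parallel(std::size_t threads,
                                       std::uint64_t increments_per_thread) {
    assert(threads > 0);
    std::vector<padded_counter> slots(threads);  // over-aligned storage (C++17)
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (std::size_t t = 0; t < threads; ++t) {
        workers.emplace_back([&slots, t, increments_per_thread] {
            for (std::uint64_t i = 0; i < increments_per_thread; ++i)
                slots[t].value += 1;
        });
    }
    for (auto &worker : workers) worker.join();  // the "join" half of fork-join

    std::uint64_t total = 0;
    for (auto const &slot : slots) total += slot.value;
    return total;
}
```

Calling `count_in_parallel(4, 1000)` adds up four independent slots. Hard-coding the alignment is the simplest option; the comment above argues for selecting it per target at run time instead.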
-
ConcurrentDeque
Fast, generalized, implementation of the Chase-Lev lock-free work-stealing deque for C++17
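For readers new to work stealing, the access pattern such a deque supports can be sketched as follows. This is a mutex-guarded stand-in, not the lock-free Chase-Lev algorithm and not this library's API: `locked_work_deque`, `push`, `pop`, `steal`, and `demo_ends` are hypothetical names chosen for illustration. It only shows that the owning thread works on the newest task while other threads take the oldest one.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical, lock-based illustration of work-stealing deque semantics.
// A real Chase-Lev deque provides the same two-ended access without a mutex.
template <typename T>
class locked_work_deque {
public:
    // Owner thread: add a task at the bottom.
    void push(T item) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(std::move(item));
    }

    // Owner thread: take the most recently pushed task (LIFO).
    std::optional<T> pop() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) return std::nullopt;
        T item = std::move(items_.back());
        items_.pop_back();
        return item;
    }

    // Any other thread: take the oldest task from the top (FIFO).
    std::optional<T> steal() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (items_.empty()) return std::nullopt;
        T item = std::move(items_.front());
        items_.pop_front();
        return item;
    }

private:
    std::mutex mutex_;
    std::deque<T> items_;
};

// Usage: after pushing 1, 2, 3 a thief sees the oldest task and the owner
// sees the newest one.
inline void demo_ends() {
    locked_work_deque<int> tasks;
    tasks.push(1);
    tasks.push(2);
    tasks.push(3);
    assert(tasks.steal().value() == 1);
    assert(tasks.pop().value() == 3);
}
```

The lock keeps the sketch short and obviously correct; removing it is exactly the hard part that a lock-free implementation solves.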
-
Lazy
Light-weight header-only library for parallel function calls and continuations in C++ based on Eric Niebler's talk at CppCon 2019.
-
C++ parallel-computing discussion
C++ parallel-computing related posts
-
Myths Programmers Believe about CPU Caches
-
Show HN: Coros – A Modern C++ Library for Task Parallelism
-
rodin alternatives - mfem and FreeFem-sources
7 projects | 8 Mar 2023
-
Learn PDE constrained optimization
-
Open source FEA tools instead of ANSYS Workbench and APDL
-
Eighty Years of the Finite Element Method: Birth, Evolution, and Future
-
Fortran on GPU
-
Index
What are some of the best open-source parallel-computing projects in C++? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | Taskflow | 12,017 |
| 2 | CTranslate2 | 4,525 |
| 3 | kokkos | 2,572 |
| 4 | cccl | 2,388 |
| 5 | mfem | 2,184 |
| 6 | dealii | 1,679 |
| 7 | Vc | 1,536 |
| 8 | Kratos | 1,289 |
| 9 | dolfinx | 1,148 |
| 10 | libfork | 879 |
| 11 | oneMath | 766 |
| 12 | RAJA | 585 |
| 13 | parlaylib | 437 |
| 14 | coros | 333 |
| 15 | ForkUnion | 332 |
| 16 | feelpp | 330 |
| 17 | PothosCore | 314 |
| 18 | CPURasterizer | 200 |
| 19 | axom | 191 |
| 20 | ConcurrentDeque | 162 |
| 21 | cppRouting | 121 |
| 22 | Lazy | 111 |
| 23 | Bulk | 94 |