# Building the Python Wheels
Proteus now builds Python wheels as a package family rather than a single monolithic wheel:
- `proteus-python` - pure-Python shim package
  - installs the host backend by default
- `proteus-python-backend-host` - native host backend wheel for Linux and macOS
- `proteus-python-backend-cu12` - native CUDA 12 superset backend wheel for Linux x86_64
The user-facing import surface remains:
```python
import proteus
```
At runtime, the shim discovers installed backend entry points and prefers the highest-priority backend:
1. `cuda12`
2. `host`
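For intuition, here is a minimal sketch of that selection, assuming each backend wheel registers itself in the `proteus.backends` entry-point group described under Packaging Model. This is an illustration, not the shim's actual code:

```python
# Illustrative sketch only -- not the shim's implementation. Assumes backend
# wheels register under the "proteus.backends" entry-point group and that
# "cuda12" outranks "host".
from importlib.metadata import entry_points

PRIORITY = ("cuda12", "host")  # highest priority first

def select_backend():
    installed = {ep.name: ep for ep in entry_points(group="proteus.backends")}
    for name in PRIORITY:
        if name in installed:
            return installed[name].load()  # import the backend package
    raise ImportError("no Proteus backend wheel is installed")
```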
## Packaging Model
The shim package is built from the repository root with setuptools.
The backend packages are built from dedicated subprojects:
- `packaging/python/backend-host`
- `packaging/python/backend-cu12`
Those backend projects use scikit-build-core and the repository CMake build,
but they install their native payload into backend-local packages instead of
directly under `proteus/`:

- `proteus_backend_host/`
- `proteus_backend_cu12/`
Each backend package contains:
- backend-local `_proteus.*`
- backend-local `libproteus.*`
- repaired vendored LLVM/Clang shared libraries
- backend registration metadata under the `proteus.backends` entry-point group
  (sketched below)
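The backend projects carry their registration metadata in their build configuration; the setuptools form below is a hypothetical sketch used only to show the group/name shape, not the actual project files:

```python
# Hypothetical registration shape. The real backends are built with
# scikit-build-core and declare this in project metadata; only the
# "proteus.backends" group and the backend-local module name are taken
# from this document.
from setuptools import setup

setup(
    name="proteus-python-backend-host",
    entry_points={
        "proteus.backends": [
            "host = proteus_backend_host",
        ],
    },
)
```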
The shim package exports:
- `proteus.active_backend`
- `proteus.available_backends()`
- the selected backend's native API re-exported from `proteus`
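For example (a hedged illustration; the names follow the exports above, while the concrete values and return types shown are assumptions):

```python
import proteus

# Illustrative values only; actual types may differ.
print(proteus.available_backends())  # e.g. ["cuda12", "host"] when both are installed
print(proteus.active_backend)        # e.g. "cuda12" (highest-priority installed backend)
```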
## Build-Time Requirements
All backend wheel builds require:
- Python with `build`
- CMake
- `pybind11`
- `scikit-build-core`
- a usable LLVM/Clang installation discovered via `LLVM_INSTALL_DIR`
Platform-specific requirements:
- macOS host backend:
  - Homebrew LLVM 22
  - `delocate`
- Linux host backend:
  - the `manylinux_2_28` LLVM container
  - `auditwheel`
- Linux CUDA backend:
  - the CUDA-capable `manylinux_2_28` LLVM container
  - CUDA Toolkit 12 with `libnvptxcompiler_static.a`
  - `auditwheel`
## Local Builds
### Shim Wheel
```bash
python3 -m venv /tmp/proteus-wheel-venv
/tmp/proteus-wheel-venv/bin/python -m pip install -U pip build setuptools setuptools-scm
/tmp/proteus-wheel-venv/bin/python -m build --wheel --outdir dist .
```
### Host Backend Wheel on macOS arm64
```bash
python3 -m venv /tmp/proteus-wheel-venv
/tmp/proteus-wheel-venv/bin/python -m pip install -U pip build scikit-build-core pybind11 delocate
brew install llvm@22

MACOSX_DEPLOYMENT_TARGET=14.0 \
LLVM_INSTALL_DIR=/opt/homebrew/opt/llvm \
/tmp/proteus-wheel-venv/bin/python -m build --wheel \
  --outdir wheelhouse \
  packaging/python/backend-host
```
### Host Backend Wheel on Linux
```bash
bash packaging/python/image-scripts/build-manylinux-llvm-container.sh
python -m pip install -U pip build cibuildwheel

CIBW_MANYLINUX_X86_64_IMAGE=ghcr.io/olympus-hpc/proteus-manylinux-llvm:22.1.3 \
python -m cibuildwheel packaging/python/backend-host --output-dir wheelhouse
```
### CUDA 12 Backend Wheel on Linux
```bash
bash packaging/python/image-scripts/build-manylinux-cuda-llvm-container.sh
python -m pip install -U pip build cibuildwheel

CIBW_MANYLINUX_X86_64_IMAGE=ghcr.io/olympus-hpc/proteus-manylinux-cuda-llvm:12.4.1-22.1.3 \
python -m cibuildwheel packaging/python/backend-cu12 --output-dir wheelhouse
```
## Linux Container Images
The Linux wheel workflows use prebuilt GHCR images.
Host backend image inputs:
- `packaging/python/image-scripts/manylinux-llvm.Dockerfile`
- `packaging/python/image-scripts/build-llvm-manylinux.sh`
- `packaging/python/image-scripts/build-manylinux-llvm-container.sh`
CUDA backend image inputs:
- `packaging/python/image-scripts/manylinux-cuda-llvm.Dockerfile`
- `packaging/python/image-scripts/build-llvm-manylinux.sh`
- `packaging/python/image-scripts/build-manylinux-cuda-llvm-container.sh`
The CUDA image installs:
- LLVM `22.1.3`
- CUDA Toolkit `12.4`
- static `libnvptxcompiler_static.a`
## Installed-Wheel Verification
Do not treat build-tree imports as sufficient. Install from built wheels into a clean environment.
### Default Host Install
```bash
/tmp/proteus-wheel-venv/bin/python -m pip install \
  --no-index \
  --find-links dist \
  --find-links wheelhouse \
  dist/proteus_python-*.whl

export PROTEUS_CLANGXX_BIN=/path/to/clang++-22

/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_backend_loader.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_host_cpp_smoke.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_host_cpp_validation.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_host_cpp_std_headers.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_invalid_clang_override.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_wheel_layout.py
```
### CUDA Superset Install
Install the shim first, then the CUDA backend wheel from the local wheelhouse:
```bash
/tmp/proteus-wheel-venv/bin/python -m pip install \
  --no-index \
  --find-links dist \
  --find-links wheelhouse \
  dist/proteus_python-*.whl

/tmp/proteus-wheel-venv/bin/python -m pip install \
  --no-index \
  --find-links wheelhouse \
  proteus-python-backend-cu12==<version>
```
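As a hedged sanity check that the CUDA backend wins selection once both wheels are installed (assuming the shim exports described under Packaging Model):

```python
import proteus

# Assumes the documented cuda12 > host priority; on a CPU-only machine the
# cuda12 superset backend should still load for host-only functionality.
assert "cuda12" in proteus.available_backends()
print(proteus.active_backend)  # expected: "cuda12"
```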
On CPU-only machines, validate the host path:
```bash
export PROTEUS_CLANGXX_BIN=/path/to/clang++-22

/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_backend_loader.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_host_cpp_smoke.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_host_cpp_validation.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_host_cpp_std_headers.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_invalid_clang_override.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_wheel_layout.py
```
On Linux systems with an NVIDIA GPU and driver, additionally validate:
```bash
export CUDA_HOME=/usr/local/cuda

/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_gpu_cpp_smoke.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_gpu_cpp_launch_validation.py
/tmp/proteus-wheel-venv/bin/python bindings/python/tests/test_gpu_cpp_pointer_validation.py
```
## Runtime Contract
### Host Backend
- ships Proteus and the LLVM/Clang runtime libraries it links against
- does not ship a host C++ compiler toolchain
- still requires a host `clang++` whose major version matches the bundled
  LLVM/Clang runtime for `frontend="cpp"`, `target="host"`
- for the current wheel line, use `clang++-22` or equivalent and set
  `PROTEUS_CLANGXX_BIN` when it is not the default `clang++` on `PATH`
  (see the sketch after the table below)
- a mismatched `clang++` may fail while resolving builtin/system headers
  during the in-process frontend compile
| Install | Backend | Target | Required compiler/toolchain |
|---|---|---|---|
| `pip install proteus-python` | `proteus-python[host]` | Host CPU | LLVM/Clang 22.x |
| `pip install proteus-python[cuda12]` | `proteus-python[cuda12]` | Host CPU + NVIDIA CUDA GPU | LLVM/Clang 22.x, or the NVIDIA toolchain when using NVCC for host/device compilation |
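A minimal sketch of satisfying that contract before first use; the compiler path is illustrative:

```python
import os

# Hypothetical path: point Proteus at a clang++ whose major version matches
# the bundled LLVM/Clang runtime (clang++-22 for the current wheel line).
os.environ["PROTEUS_CLANGXX_BIN"] = "/usr/bin/clang++-22"

import proteus  # the shim discovers and selects an installed backend at runtime
```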
### CUDA 12 Backend
- is a superset backend: host functionality remains available
- does not vendor `libcuda.so`, `libcudart`, or the CUDA Toolkit
- requires an installed NVIDIA driver for CUDA functionality
- requires a matching CUDA 12 toolkit root for runtime compilation
CUDA toolkit resolution remains:
1. `PROTEUS_CUDA_HOME`
2. `CUDA_HOME`
3. `CUDA_PATH`
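Sketched in Python to mirror the documented lookup order (the actual resolution happens in the native runtime and may differ in details):

```python
import os

def resolve_cuda_home():
    # Documented environment-variable precedence, highest first.
    for var in ("PROTEUS_CUDA_HOME", "CUDA_HOME", "CUDA_PATH"):
        root = os.environ.get(var)
        if root:
            return root
    return None  # no CUDA toolkit root configured
```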
For wheel-targeted CUDA builds on Linux, Proteus now resolves `libcuda.so.1`
with `dlopen()` at runtime instead of linking the wheel directly against the
CUDA driver shared library. This keeps the wheel compatible with manylinux
repair and lets host-only functionality work on machines without an NVIDIA
driver.
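The effect is observable from Python: `ctypes.CDLL` goes through `dlopen()` on Linux, so a hedged check for driver availability looks like this:

```python
import ctypes

def cuda_driver_present() -> bool:
    # Failure here just means no NVIDIA driver library is resolvable at
    # runtime; host-only functionality is unaffected.
    try:
        ctypes.CDLL("libcuda.so.1")
        return True
    except OSError:
        return False
```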
## CI Workflow
The wheel workflow is defined in `.github/workflows/ci-wheels.yml`.
It produces three artifact groups:
- shim wheel
- host backend wheels
- CUDA backend wheels
Current target matrix:
- shim: pure Python
- host backend:
  - macOS arm64
  - `manylinux_2_28` x86_64
- CUDA backend:
  - `manylinux_2_28` x86_64
The workflow builds all backend wheels with cibuildwheel, then performs
explicit installed-wheel validation using the newly built shim and backend
wheels from the local artifact directories.