MLIR Frontend API
The MLIR frontend API lets you provide MLIR source code as a string and compile it through Proteus at runtime. It is intended for users who already produce MLIR or want direct access to Proteus's MLIR lowering path.
Unlike the annotation interface, this path does not require compiling the application with Clang. The application can be built with any compatible compiler because Proteus parses and lowers the MLIR source at runtime.
If you want to provide C++ source strings instead of MLIR, see the C++ frontend API. If you want to construct IR programmatically rather than provide source strings, see the DSL API.
Overview
MLIRJitModule is constructed from a target string plus the MLIR source code:
- target: one of "host", "cuda", or "hip"
- source: text containing a top-level MLIR module
For host targets, retrieve entry points with getFunction() and execute them
with run().
For CUDA and HIP targets, retrieve GPU entry points with getKernel() and
launch them with grid dimensions, block dimensions, dynamic shared memory size,
stream, and kernel arguments.
Host Example
Here is a minimal host example that compiles an MLIR function and calls it from C++:
#include <proteus/MLIRJitModule.h>
using namespace proteus;
static constexpr const char *Code = R"mlir(
module {
  func.func @add(%a: i32, %b: i32) -> i32 {
    %sum = arith.addi %a, %b : i32
    return %sum : i32
  }
}
)mlir";
MLIRJitModule Module{"host", Code};
auto Add = Module.getFunction<int(int, int)>("add");
int Result = Add.run(40, 2);
The function name passed to getFunction() must match the MLIR symbol name.
GPU Example
GPU modules use gpu.module and gpu.func operations.
The top-level module must be a GPU container module, and device lowering expects
exactly one top-level gpu.module.
The gpu.module symbol name is not significant.
#include <proteus/MLIRJitModule.h>
using namespace proteus;
static constexpr const char *Code = R"mlir(
module attributes {gpu.container_module} {
  gpu.module @device_code {
    gpu.func @write42(%out: !llvm.ptr) kernel {
      %c42 = arith.constant 42 : i32
      llvm.store %c42, %out : i32, !llvm.ptr
      gpu.return
    }
  }
}
)mlir";
MLIRJitModule Module{"cuda", Code};
auto Write42 = Module.getKernel<void(int *)>("write42");
int *DeviceBuffer = ...;
Write42.launch(
    /* GridDim */ {1, 1, 1},
    /* BlockDim */ {1, 1, 1},
    /* ShmemSize */ 0,
    /* Stream */ nullptr,
    DeviceBuffer);
Use target "hip" instead of "cuda" to compile the same shape of MLIR for
HIP, assuming Proteus was built with HIP support.
Device Module Requirements
For CUDA and HIP MLIR input, Proteus lowers a single device module to device
LLVM IR.
The input must contain exactly one top-level gpu.module.
If no device module is present, or if multiple gpu.module operations are
present, compilation fails with a diagnostic.
Kernel functions should be represented as gpu.func operations marked
kernel.
Retrieve them with getKernel() by their gpu.func symbol name.
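As an illustrative sketch of the single-module requirement, input shaped like the following would be rejected with a diagnostic because it contains two top-level gpu.module operations (the bodies are elided):

```mlir
// Invalid: more than one top-level gpu.module (illustrative sketch).
module attributes {gpu.container_module} {
  gpu.module @first_device_code {
    // ...
  }
  gpu.module @second_device_code {
    // ...
  }
}
```

To compile kernels from both modules, merge their gpu.func operations into a single gpu.module.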
Kernel Function Attributes
For GPU kernels, you can set supported function attributes before launching.
For example, Proteus exposes
JitFuncAttribute::MaxDynamicSharedMemorySize:
auto Kernel = Module.getKernel<void(int *)>("shmem_plain");
Kernel.setFuncAttribute(JitFuncAttribute::MaxDynamicSharedMemorySize,
49 * 1024);
Kernel.launch({1, 1, 1}, {1, 1, 1}, 49 * 1024, nullptr, Out);
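The shmem_plain kernel above is not defined in this guide; as a rough sketch, assuming the standard MLIR gpu dialect, a kernel that accesses dynamically allocated shared memory could look like the following (the kernel body is elided):

```mlir
module attributes {gpu.container_module} {
  gpu.module @device_code {
    gpu.func @shmem_plain(%out: !llvm.ptr) kernel {
      // Bind the dynamic shared memory region allocated at launch time.
      %shmem = gpu.dynamic_shared_memory
          : memref<?xi8, #gpu.address_space<workgroup>>
      // ... use %shmem, write results to %out ...
      gpu.return
    }
  }
}
```

Setting MaxDynamicSharedMemorySize before launch permits dynamic shared memory allocations (here, 49 * 1024 bytes) beyond the default per-kernel limit on backends that require an explicit opt-in.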