Recording artifacts

This page describes the files and data generated by Mneme during kernel recording and replay.


Example of artifacts collected during recording:

The listing below shows the artifact files generated by a single mneme record run:

./record-example-dir/
├── 15941914485064662553.json
├── DeviceState.epilogue.15941914485064662553.16313427880266313990.mneme
├── DeviceState.prologue.15941914485064662553.16313427880266313990.mneme
└── RecordedIR_15941914485064662553.bc
Mneme generates files of the following types:

  1. Recording database: A JSON file describing the recorded kernel and its associated metadata.
  2. GPU device memory state: Binary blobs describing the memory state of the GPU.
  3. LLVM IR: File(s) containing the LLVM IR of the recorded kernel(s).
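As an illustration, the three artifact types can be told apart by file name alone. The sketch below groups the files of a recording directory using patterns inferred from the example listing above; the patterns and the `classify_artifacts` helper are assumptions made for illustration, not part of Mneme.

```python
import re
from collections import defaultdict
from pathlib import Path

# File name patterns inferred from the example listing; this is an
# assumption for illustration, not an official Mneme naming specification.
PATTERNS = {
    "database": re.compile(r"^(?P<static>\d+)\.json$"),
    "device_state": re.compile(
        r"^DeviceState\.(?P<phase>prologue|epilogue)"
        r"\.(?P<static>\d+)\.(?P<dynamic>\d+)\.mneme$"
    ),
    "llvm_ir": re.compile(r"^RecordedIR_(?P<static>\d+)\.bc$"),
}


def classify_artifacts(directory):
    """Group the file names found in `directory` by artifact type."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for kind, pattern in PATTERNS.items():
            if pattern.match(path.name):
                groups[kind].append(path.name)
                break
        else:
            # No pattern matched: keep the file visible instead of dropping it.
            groups["unknown"].append(path.name)
    return dict(groups)
```

Calling `classify_artifacts("./record-example-dir")` on the directory shown above would place the JSON file under `database`, both `.mneme` files under `device_state`, and the `.bc` file under `llvm_ir`.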


Recording database

Each invocation of mneme record produces one or more recording databases, each represented by a JSON file. The file is named <static-hash>.json, where the static hash is a numeric identifier that uniquely represents the source code of a single GPU kernel. The database serves as the entry point for replay and tuning workflows.

The database describes:

  • the recorded kernel,
  • the LLVM IR required for replay, and
  • one or more recorded execution instances of that kernel.


Example

Below is an example of a recording database generated by mneme record:

{
  "DemangledName": "void vecAdd_test<long>(long*, long*, unsigned long)",
  "KernelName": "_Z11vecAdd_testIlEvPT_S1_m",
  "StaticHash": 15941914485064662553,
  "Modules": [
    "RecordedIR_15941914485064662553.bc"
  ],
  "VASize": 4294967296,
  "instances": {
    "16313427880266313990": {
      "GridDims": { "x": 40000, "y": 1, "z": 1 },
      "BlockDims": { "x": 256, "y": 1, "z": 1 },
      "SharedMem": 0,
      "Prologue": "DeviceState.prologue.*.mneme",
      "Epilogue": "DeviceState.epilogue.*.mneme",
      "Occurrences": 1
    }
  }
}

Top-level fields

  • KernelName: Mangled name of the recorded GPU kernel.
  • DemangledName: Human-readable name of the kernel, useful for inspection and debugging.
  • StaticHash: A stable numeric identifier uniquely representing the kernel’s source code.
  • Modules: LLVM IR modules (.bc) required to replay and recompile the kernel.
  • VASize: Size of the virtual address space allocated during recording.
  • instances: A mapping from instance identifiers to recorded kernel executions. Each instance corresponds to a specific dynamic execution context.
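The top-level fields can be read with any JSON parser. The following is a minimal sketch in Python; `load_database` and `summarize` are hypothetical helper names, not part of Mneme. Note that the StaticHash in the example is larger than 2^63, so it only fits an unsigned 64-bit integer. Python's json module parses it exactly, but parsers that map numbers to double-precision floats would lose digits.

```python
import json


def load_database(path):
    """Read a recording database (<static-hash>.json) into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize(db):
    """Collect the top-level fields described above (hypothetical helper)."""
    return {
        "kernel": db["DemangledName"],
        "mangled": db["KernelName"],
        # Python integers are arbitrary precision, so the hash stays exact.
        "static_hash": db["StaticHash"],
        "modules": list(db["Modules"]),
        "va_size": db["VASize"],
        # Instance identifiers are JSON object keys, hence strings.
        "instance_ids": sorted(db["instances"]),
    }
```

For the example database, `summarize(load_database("15941914485064662553.json"))` reports one module and one instance identifier.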

Kernel instances (instances)

Each entry under instances corresponds to the same kernel but to a different dynamic execution context, typically resulting from different launch configurations or runtime parameters. In Mneme terminology, such executions are identified by distinct dynamic hashes. Each dynamic hash is described by:

  • the launch configuration (grid and block dimensions),
  • the shared memory size,
  • the paths to the device memory snapshots, and
  • an occurrence count indicating how often this instance was observed.

These instance identifiers are used directly by the mneme replay command via the -rid option.
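To make the per-instance fields concrete, the sketch below derives the launch size from an instance entry. It uses the field names from the example database; `launch_summary` is a hypothetical helper, not a Mneme API.

```python
def launch_summary(instance):
    """Derive launch sizes from one entry of the "instances" mapping."""
    grid = instance["GridDims"]
    block = instance["BlockDims"]
    blocks = grid["x"] * grid["y"] * grid["z"]
    threads_per_block = block["x"] * block["y"] * block["z"]
    return {
        "blocks": blocks,
        "threads_per_block": threads_per_block,
        "total_threads": blocks * threads_per_block,
        "shared_mem": instance["SharedMem"],
        "occurrences": instance["Occurrences"],
    }


# The single instance from the example database above.
instances = {
    "16313427880266313990": {
        "GridDims": {"x": 40000, "y": 1, "z": 1},
        "BlockDims": {"x": 256, "y": 1, "z": 1},
        "SharedMem": 0,
        "Prologue": "DeviceState.prologue.*.mneme",
        "Epilogue": "DeviceState.epilogue.*.mneme",
        "Occurrences": 1,
    }
}

for dynamic_hash, instance in instances.items():
    # The key is the identifier that mneme replay accepts through -rid.
    print(dynamic_hash, launch_summary(instance))
```

For the example instance, this gives 40000 blocks of 256 threads each, or 10,240,000 threads in total.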


GPU device memory state

For each recorded kernel execution, Mneme captures the state of GPU device memory both before kernel launch (prologue state) and after kernel execution (epilogue state), and stores these states persistently.

At replay time, Mneme restores device memory to the recorded prologue state by remapping the recorded memory layout into the replay execution. The kernel is then executed in isolation, with the expectation that its execution transforms device memory to a state equivalent to the recorded epilogue state.

This mechanism enables faithful reproduction of kernel behavior and forms the foundation for replay, verification, and performance experimentation.

The prologue and epilogue files contain serialized descriptors of GPU memory contents and associated metadata, including the number of memory regions, their sizes, and kernel argument mappings.


Recorded LLVM IR

Mneme stores the recorded kernel code as LLVM IR files (.bc), which are fully compatible with standard LLVM tooling. These files serve as the input to replay and tuning workflows and may be inspected or transformed directly using tools such as opt or custom LLVM passes.


Artifact lifecycle

Recorded artifacts follow a well-defined lifecycle:

  • created by mneme record
  • consumed by mneme replay
  • removed by mneme clean
  • relocated by mneme move and mneme copy