Recording artifacts¶
This page describes the files and data generated by Mneme during kernel recording and replay.
Example of artifacts collected during recording:¶
Below are some of the artifact files generated by a recording run performed with mneme record:
./record-example-dir/
├── 15941914485064662553.json
├── DeviceState.epilogue.15941914485064662553.16313427880266313990.mneme
├── DeviceState.prologue.15941914485064662553.16313427880266313990.mneme
└── RecordedIR_15941914485064662553.bc
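The file names in this listing appear to encode the hashes that tie the artifacts together. The sketch below is a minimal illustration, assuming the snapshot pattern is DeviceState.<prologue|epilogue>.<static-hash>.<instance-id>.mneme. That pattern is inferred from the example listing only, and `parse_snapshot_name` is a hypothetical helper, not part of Mneme.

```python
import re

# Assumed naming pattern, inferred from the example listing above:
#   DeviceState.<prologue|epilogue>.<static-hash>.<instance-id>.mneme
SNAPSHOT_RE = re.compile(
    r"^DeviceState\.(prologue|epilogue)\.(\d+)\.(\d+)\.mneme$"
)


def parse_snapshot_name(filename):
    """Return (phase, static_hash, instance_id), or None if the name does not match."""
    match = SNAPSHOT_RE.match(filename)
    if match is None:
        return None
    phase, static_hash, instance_id = match.groups()
    return phase, int(static_hash), int(instance_id)


# File names copied from the example directory listing.
files = [
    "15941914485064662553.json",
    "DeviceState.epilogue.15941914485064662553.16313427880266313990.mneme",
    "DeviceState.prologue.15941914485064662553.16313427880266313990.mneme",
    "RecordedIR_15941914485064662553.bc",
]

# Keep only the files that match the assumed snapshot pattern.
snapshots = [parsed for parsed in map(parse_snapshot_name, files) if parsed]
for phase, static_hash, instance_id in snapshots:
    print(phase, static_hash, instance_id)
```

Grouping the parsed tuples by static hash and instance identifier gives a quick way to check that every prologue snapshot has a matching epilogue.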
Recording database¶
Each invocation of mneme record produces one or more recording databases, each
represented by a JSON file. The file is named <static-hash>.json, where the
static hash is a numeric identifier that uniquely represents the source code
of a single GPU kernel. The database serves as the entry point for replay and
tuning workflows.
The database describes:
- the recorded kernel,
- the LLVM IR required for replay, and
- one or more recorded execution instances of that kernel.
Example¶
Below is an example of a recording database generated by mneme record:
{
  "DemangledName": "void vecAdd_test<long>(long*, long*, unsigned long)",
  "KernelName": "_Z11vecAdd_testIlEvPT_S1_m",
  "StaticHash": 15941914485064662553,
  "Modules": [
    "RecordedIR_15941914485064662553.bc"
  ],
  "VASize": 4294967296,
  "instances": {
    "16313427880266313990": {
      "GridDims": { "x": 40000, "y": 1, "z": 1 },
      "BlockDims": { "x": 256, "y": 1, "z": 1 },
      "SharedMem": 0,
      "Prologue": "DeviceState.prologue.*.mneme",
      "Epilogue": "DeviceState.epilogue.*.mneme",
      "Occurrences": 1
    }
  }
}
Top-level fields¶
- KernelName: Mangled name of the recorded GPU kernel.
- DemangledName: Human-readable name of the kernel, useful for inspection and debugging.
- StaticHash: A stable numeric identifier uniquely representing the kernel’s source code.
- Modules: LLVM IR modules (.bc) required to replay and recompile the kernel.
- VASize: Size of the virtual address space allocated during recording.
- instances: A mapping from instance identifiers to recorded kernel executions. Each instance corresponds to a specific dynamic execution context.
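As a rough illustration of how these top-level fields can be read, the sketch below parses a trimmed copy of the example database with Python's standard json module. The variable names are illustrative only. Note that StaticHash is larger than 2^53, so a JSON parser that stores numbers as double-precision floats would round it. Python keeps it exact.

```python
import json

# Trimmed copy of the example database; the "instances" entries are omitted here.
DATABASE_TEXT = """
{
  "DemangledName": "void vecAdd_test<long>(long*, long*, unsigned long)",
  "KernelName": "_Z11vecAdd_testIlEvPT_S1_m",
  "StaticHash": 15941914485064662553,
  "Modules": ["RecordedIR_15941914485064662553.bc"],
  "VASize": 4294967296,
  "instances": {}
}
"""

db = json.loads(DATABASE_TEXT)

# Python integers are arbitrary precision, so the 64-bit hash survives parsing.
static_hash = db["StaticHash"]
expected_db_name = f"{static_hash}.json"  # <static-hash>.json, as described above

# Doubles represent every integer exactly only up to 2**53.
needs_wide_integers = static_hash > 2**53

print(db["DemangledName"])
print("database file:", expected_db_name)
print("modules:", ", ".join(db["Modules"]))
print("VASize (GiB):", db["VASize"] / 2**30)
print("hash exceeds double precision:", needs_wide_integers)
```

When reading the database from a language whose JSON parser defaults to floating-point numbers, it may be safer to parse StaticHash as a string or as a 64-bit unsigned integer.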
Kernel instances (instances)¶
Each entry under instances corresponds to the same kernel but to a
different dynamic execution context, typically resulting from different
launch configurations or runtime parameters. In Mneme terminology, such
executions are identified by distinct dynamic hashes. Every dynamic hash is described by:
- launch configuration (grid and block dimensions),
- shared memory size,
- paths to device memory snapshots, and
- occurrence count indicating how often this instance was observed.
These instance identifiers are used directly by the mneme replay command via the -rid option.
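To make the instance fields concrete, the sketch below walks a copy of the launch fields from the example database and derives the launch size of each instance. The helpers `volume` and `summarize` are hypothetical names used for illustration. The only link to Mneme assumed here is that the dictionary keys are the identifiers passed to -rid.

```python
# Launch fields copied from the example database; snapshot paths are omitted.
instances = {
    "16313427880266313990": {
        "GridDims": {"x": 40000, "y": 1, "z": 1},
        "BlockDims": {"x": 256, "y": 1, "z": 1},
        "SharedMem": 0,
        "Occurrences": 1,
    }
}


def volume(dims):
    """Number of elements in a 3D launch dimension."""
    return dims["x"] * dims["y"] * dims["z"]


def summarize(instance):
    """Derive block and thread counts from the recorded launch configuration."""
    blocks = volume(instance["GridDims"])
    threads_per_block = volume(instance["BlockDims"])
    return {
        "blocks": blocks,
        "threads_per_block": threads_per_block,
        "total_threads": blocks * threads_per_block,
    }


for instance_id, instance in instances.items():
    summary = summarize(instance)
    print(
        f"instance {instance_id}: "
        f"{summary['blocks']} blocks x {summary['threads_per_block']} threads "
        f"= {summary['total_threads']} threads, "
        f"seen {instance['Occurrences']} time(s)"
    )
```

For the example instance this works out to 40000 × 256 = 10,240,000 threads.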
GPU Device Memory State¶
For each recorded kernel execution, Mneme captures the state of GPU device memory both before kernel launch (prologue state) and after kernel execution (epilogue state), and stores these states persistently.
At replay time, Mneme restores device memory to the recorded prologue state by remapping the recorded memory layout into the replay execution. The kernel is then executed in isolation, with the expectation that its execution transforms device memory to a state equivalent to the recorded epilogue state.
This mechanism enables faithful reproduction of kernel behavior and forms the foundation for replay, verification, and performance experimentation.
The prologue and epilogue files contain serialized descriptors of GPU memory contents and associated metadata, including the number of memory regions, their sizes, and kernel argument mappings.
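The record, restore, and compare cycle described above can be pictured with a toy model. The sketch below is not Mneme's file format or API. It assumes device memory can be modeled as a dictionary of Python lists and uses a hypothetical `vec_add` function as a stand-in for the recorded kernel.

```python
import copy


def vec_add(memory, n):
    """Hypothetical stand-in for a recorded kernel: out[i] = a[i] + b[i]."""
    for i in range(n):
        memory["out"][i] = memory["a"][i] + memory["b"][i]


# "Recording": snapshot the toy memory before and after the original launch.
device_memory = {"a": [1, 2, 3], "b": [10, 20, 30], "out": [0, 0, 0]}
prologue = copy.deepcopy(device_memory)
vec_add(device_memory, 3)
epilogue = copy.deepcopy(device_memory)


def replay_and_verify(prologue, epilogue, kernel, *args):
    """Restore the prologue, run the kernel in isolation, compare with the epilogue."""
    memory = copy.deepcopy(prologue)  # restore the recorded pre-launch state
    kernel(memory, *args)             # execute only the kernel under test
    return memory == epilogue         # equivalent to the recorded post-launch state?


def broken_vec_add(memory, n):
    """A deliberately wrong variant, to show a failing verification."""
    for i in range(n):
        memory["out"][i] = memory["a"][i] - memory["b"][i]


print("original kernel matches:", replay_and_verify(prologue, epilogue, vec_add, 3))
print("broken kernel matches:", replay_and_verify(prologue, epilogue, broken_vec_add, 3))
```

The same comparison is what makes tuning safe in principle: a transformed kernel that still reproduces the epilogue state from the prologue state has preserved the recorded behavior for that instance.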
Recorded LLVM IR¶
Mneme stores the recorded kernel code as LLVM IR files (.bc), which are
fully compatible with standard LLVM tooling. These files serve as the
input to replay and tuning workflows and may be inspected or transformed
directly using tools such as opt or custom LLVM passes.
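As a small example of working with the recorded IR through standard tooling, the sketch below checks whether a file starts with the raw LLVM bitcode magic bytes and builds an llvm-dis command line to turn it into textual IR. The helper names are hypothetical. The sketch assumes the .bc file is raw bitcode rather than the wrapper format, and that llvm-dis is on PATH if the command is actually run.

```python
import os
import shutil
import subprocess

# Raw LLVM bitcode files start with the bytes 'B', 'C', 0xC0, 0xDE.
RAW_BITCODE_MAGIC = b"BC\xc0\xde"


def looks_like_llvm_bitcode(header):
    """Check the first four bytes against the raw bitcode magic number."""
    return header[:4] == RAW_BITCODE_MAGIC


def disassemble_command(bc_path, ll_path):
    """Build an llvm-dis invocation that writes textual IR to ll_path."""
    return ["llvm-dis", bc_path, "-o", ll_path]


if __name__ == "__main__":
    # File name taken from the example listing; adjust the path as needed.
    bc_path = "RecordedIR_15941914485064662553.bc"
    command = disassemble_command(bc_path, "RecordedIR.ll")
    print(" ".join(command))

    # Only touch the file system and run the tool when both are available.
    if os.path.exists(bc_path) and shutil.which("llvm-dis"):
        with open(bc_path, "rb") as handle:
            if looks_like_llvm_bitcode(handle.read(4)):
                subprocess.run(command, check=True)
```

The resulting .ll file is plain text, so it can be read directly or passed on to opt.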
Artifact lifecycle¶
Recorded artifacts follow a well-defined lifecycle:
- created by mneme record
- consumed by mneme replay
- removed by mneme clean
- relocated by mneme move and mneme copy