Nexus: HSA Packet Source Code Extractor
Intercepts HSA packets from a running process and extracts, per kernel, assembly and HIP source into a structured trace (e.g. JSON). Use for kernel-level inspection and assembly/source correlation.
When to Use
- User needs to see which kernels ran and their assembly or HIP source
- Debugging or analyzing GPU dispatch and code generation
- Inspecting assembly-to-source mapping for a HIP (or ROCm) application
Instructions
- Ensure the target runs on AMD ROCm and uses HSA (e.g. HIP application or ROCm runtime).
- Choose execution path:
- If a Nexus MCP server is available, use its tools:
list_kernelsto enumerate kernels in a trace, andextract_kernel_codeto get assembly and HIP/source mapping (signature, files, lines). Seenexus/nexus/mcp/server.pyfor tool parameters and schemas. - Otherwise use the Python API from the environment where Nexus is installed.
- If a Nexus MCP server is available, use its tools:
Python API (recommended when no MCP)
from nexus import Nexus
nexus = Nexus(log_level=1)
trace = nexus.run(["python", "my_gpu_script.py"])
# Or run a binary:
# trace = nexus.run(["./my_hip_app"])
for kernel in trace:
print(kernel.name, len(kernel.assembly), "instructions")
for i, asm_line in enumerate(kernel.assembly, 1):
print(f" {i}. {asm_line}")
for line_no, hip_line in zip(kernel.lines or range(1, len(kernel.hip)+1), kernel.hip):
print(f" {line_no}: {hip_line}")
# Access by kernel name
k = trace["vector_add(float const*, float const*, float*, int)"]
print(k.assembly, k.hip, k.signature, k.files, k.lines)
# Save/load trace
trace.save("trace.json")
loaded = Nexus.load("trace.json")
Set log_level (0–4) to control verbosity. Use relative paths for the run command and output file so the skill is portable.
Environment-based usage (no Python API)
When the process cannot be launched via nexus.run():
- Set
HSA_TOOLS_LIBto the Nexus shared library path (e.g.build/lib/libnexus.soor the installed path). - Set
NEXUS_OUTPUT_FILEto the output JSON path. - Set
NEXUS_LOG_LEVEL(0–4) if needed. - Run the application as usual; it will be traced and the output file will contain the kernel data.
Optional: NEXUS_EXTRA_SEARCH_PREFIX (colon-separated) for HIP source search; TRITON_DISABLE_LINE_INFO=0 for Triton kernel line info.
Workflow
- Identify the command that runs the GPU workload (e.g.
python script.pyor./app). - If using the Python API: create
Nexus(log_level=...), callnexus.run([...]), then iteratetraceand optionallytrace.save(...). - If using the env method: set
HSA_TOOLS_LIBandNEXUS_OUTPUT_FILE, then run the app; open the JSON and parse thekernelsstructure. - Use kernel
signature,assembly,hip,files, andlinesto analyze what ran and map assembly back to source. - Use relative paths for commands and output files.
Notes
- Nexus is intended for research/analysis; ensure the target environment has the Nexus library and compatible ROCm/HSA stack.
- For Triton kernels, enable line info via
TRITON_DISABLE_LINE_INFO=0when using the Python API.
