feat examples/llm: add optional --metrics-json output for reproducible run timings #468

Open
salignatmoandal wants to merge 1 commit into zml:master from salignatmoandal:feat/metrics-json-llm

@salignatmoandal

This PR adds an optional --metrics-json=<path> flag to examples/llm to export structured timing metrics for reproducible performance tracking.

What it adds

  • New CLI flag: --metrics-json=<path>
  • JSON metrics output with:
      • model
      • backend
      • seqlen
      • tokenizer_load_ms
      • weights_load_ms
      • compile_ms
      • generation_ms (non-interactive --prompt mode)

Why

This enables benchmark automation and easier regression detection across commits/backends by providing machine-readable run metrics instead of relying only on logs.
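As a rough illustration of the regression-detection use case, here is a minimal Python sketch of how a CI job might compare two metrics files. It is not part of this PR: the helper names (`load_metrics`, `find_regressions`), the 10% tolerance, and the choice to skip fields missing from either run are all assumptions made for the example. Only the field names come from the list above.

```python
import json

# Timing fields listed in this PR's description; values are assumed to be milliseconds.
TIMING_FIELDS = ("tokenizer_load_ms", "weights_load_ms", "compile_ms", "generation_ms")


def load_metrics(path):
    """Read one metrics file written via --metrics-json=<path>."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_regressions(baseline, current, tolerance=0.10):
    """Return {field: (baseline_ms, current_ms)} for timings that got slower
    than the baseline by more than `tolerance` (hypothetical default: 10%)."""
    regressions = {}
    for field in TIMING_FIELDS:
        # Assumption: a field may be missing (e.g. generation_ms outside
        # --prompt mode), in which case it is skipped rather than flagged.
        if field not in baseline or field not in current:
            continue
        if current[field] > baseline[field] * (1 + tolerance):
            regressions[field] = (baseline[field], current[field])
    return regressions
```

A job could call `find_regressions(load_metrics("base.json"), load_metrics("head.json"))` and fail the build when the result is non-empty; the file names here are placeholders.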

Behavior notes

  • No behavior change when --metrics-json is not provided.
  • Metrics are primarily intended for non-interactive runs (--prompt).

Example output snippet

{
  "model": "hf://meta-llama/Llama-3.1-8B-Instruct",
  "backend": "cuda_fa3",
  "seqlen": 2048,
  "tokenizer_load_ms": 118,
  "weights_load_ms": 8420,
  "compile_ms": 15640,
  "generation_ms": 972
}
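For consumers of this file, a typed loader can catch schema drift early. The sketch below is an illustration only, assuming the fields shown in the snippet above, integer millisecond values, and that generation_ms can be absent outside --prompt mode. `RunMetrics`, `parse_metrics`, and `startup_ms` are hypothetical names, not something this PR adds.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunMetrics:
    """Mirror of the JSON object shown above (field names taken from the PR)."""
    model: str
    backend: str
    seqlen: int
    tokenizer_load_ms: int
    weights_load_ms: int
    compile_ms: int
    # Assumption: only present for non-interactive --prompt runs.
    generation_ms: Optional[int] = None

    @property
    def startup_ms(self) -> int:
        """Hypothetical derived value: total time spent before generation starts."""
        return self.tokenizer_load_ms + self.weights_load_ms + self.compile_ms


def parse_metrics(text: str) -> RunMetrics:
    """Parse a metrics JSON document; raises TypeError on unknown or missing keys."""
    data = json.loads(text)
    return RunMetrics(**data)
```

Rejecting unknown keys is a deliberate choice for this sketch, so that a change to the output format surfaces as an error instead of being silently ignored.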
