
docs(benchmarks): add benchmark results and cross-runtime fib(35) comparison #34

Merged: siyul-park merged 9 commits into main from claude/benchmark-measurement-docs-gU7hT on May 22, 2026

siyul-park (Owner) commented on May 22, 2026

Measure threaded interpreter throughput across all opcode groups on linux/amd64 (Intel Xeon @ 2.80 GHz, Go 1.26.2). Add cross-runtime fib(35) comparison against native Go, wazero, gopher-lua, tengo, and goja.

Key findings:

  • Simple i32/i64/f32/f64 ops: ~20–22 ns/op, 0 allocs
  • Function calls (bytecode): ~26–29 ns/op, 0 allocs
  • Host calls: ~36 ns/op, 1 alloc
  • Heap objects (array/struct): ~90–140 ns/op
  • fib(35): minivm 1,673 ms, tengo 2,665 ms (1.6×), gopher-lua 4,081 ms (2.4×), goja 5,427 ms (3.2×)

Changes

  • Add docs/benchmarks.md with full opcode throughput tables and cross-runtime comparison
  • Update README.md and README_kr.md with performance section, new tagline, and readability improvements
  • Update AGENTS.md Documentation Index to include benchmarks.md

Benchmark comparison code (interp/compare_bench_test.go) and external dependencies (wazero, gopher-lua, tengo, goja) were used for measurement and removed after results were recorded in docs/benchmarks.md.
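Since the comparison file was removed, the following is only a sketch of how the native Go baseline could be re-measured. It is not the contents of interp/compare_bench_test.go. It assumes the workload is the naive doubly recursive Fibonacci, and the names `fib`, `benchFib`, and `sink` are illustrative. The result is validated once before the timer is reset, so the check does not count toward ns/op.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the result live so the call is not treated as dead code.
var sink int

// fib is the naive doubly recursive Fibonacci (assumed workload).
func fib(n int) int {
	if n < 2 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

// benchFib validates the result once, then times only the loop.
func benchFib(b *testing.B, n, want int) {
	if got := fib(n); got != want {
		b.Fatalf("fib(%d) = %d, want %d", n, got, want)
	}
	b.ReportAllocs()
	b.ResetTimer() // exclude the validation call from the measurement
	for i := 0; i < b.N; i++ {
		sink = fib(n)
	}
}

func main() {
	// testing.Benchmark lets the sketch run outside `go test`.
	res := testing.Benchmark(func(b *testing.B) { benchFib(b, 20, 6765) })
	fmt.Printf("fib(20): %d ns/op, %d allocs/op\n", res.NsPerOp(), res.AllocsPerOp())
}
```

To approximate the fib(35) row, call `benchFib(b, 35, 9227465)` instead. Absolute numbers will differ by machine and Go version.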

claude added 9 commits May 22, 2026 08:42
…parison

Measure threaded interpreter throughput across all opcode groups on
linux/amd64 (Intel Xeon @ 2.10 GHz). Add cross-runtime fibonacci(20)
comparison against native Go, goja, and tengo.

Key findings:
- Simple i32/i64/f32/f64 ops: ~20–22 ns/op, 0 allocs
- Function calls (bytecode): ~26–29 ns/op, 0 allocs
- Host calls: ~36 ns/op, 1 alloc
- Heap objects (array/struct): ~90–140 ns/op
- Map operations: ~420–535 ns/op
- fib(20): minivm 968 µs, tengo 1729 µs (1.8×), goja 2958 µs (3×)

Add interp/compare_bench_test.go for the cross-runtime fib(20) benchmark.
Add goja and tengo as benchmark-only test dependencies.
Update AGENTS.md Documentation Index to include benchmarks.md.

https://claude.ai/code/session_01WKDZezhPTdEpczDB2rySPY
Remove interp/compare_bench_test.go (goja/tengo/native-Go fib comparison).
Run go mod tidy to drop goja and tengo from go.mod/go.sum.

Results are preserved in docs/benchmarks.md.

https://claude.ai/code/session_01WKDZezhPTdEpczDB2rySPY
…sults

Highlight fib(20) comparison (minivm vs tengo, goja, native Go) and
single-instruction throughput table. Link to docs/benchmarks.md for
full results. Added to both README.md and README_kr.md.

https://claude.ai/code/session_01WKDZezhPTdEpczDB2rySPY
Add wazero (WASM JIT) and gopher-lua (register VM) as comparison runtimes.
Embed hand-encoded fib.wasm binary; validate result before timing loop.

Results on linux/amd64 (Intel Xeon @ 2.80 GHz):
  native Go:      37,968 ns/op       0 allocs
  wazero:         62,219 ns/op       2 allocs  (WASM JIT)
  minivm:      1,157,136 ns/op       0 allocs  (threaded interpreter)
  tengo:       2,000,364 ns/op  28,657 allocs
  gopher-lua:  2,942,015 ns/op       2 allocs
  goja:        3,964,702 ns/op      39 allocs

Update docs/benchmarks.md and both READMEs with new runtime table and
execution-model column. Clarify wazero's JIT advantage vs minivm's
threaded interpreter, and ARM64 JIT context.

https://claude.ai/code/session_01WKDZezhPTdEpczDB2rySPY
fib(35) results on linux/amd64 (Intel Xeon @ 2.80 GHz):
  native Go:      51,947,220 ns/op     0 B/op       0 allocs
  wazero:         84,807,148 ns/op    16 B/op       2 allocs  (WASM JIT)
  minivm:      1,672,707,295 ns/op   288 B/op       1 alloc   (threaded)
  tengo:       2,665,298,176 ns/op  312M B/op     39M allocs
  gopher-lua:  4,081,167,978 ns/op  971K B/op   3,793 allocs
  goja:        5,427,175,850 ns/op  383K B/op  46,384 allocs

Remove compare_bench_test.go and external dependencies after measurement.

https://claude.ai/code/session_01WKDZezhPTdEpczDB2rySPY
- benchmarks.md: consolidate repetitive arithmetic tables into summary
  rows; tighten section flow; remove duplicate methodology note
- README.md / README_kr.md: cleaner intro, consistent section structure,
  tighter prose, fix instruction set table header

https://claude.ai/code/session_01WKDZezhPTdEpczDB2rySPY

Owner Author

PR Review — docs(benchmarks): add benchmark results and cross-runtime fib(20) comparison

Decision

Request Changes

Merge Readiness Summary

The PR adds comprehensive and well-structured benchmark documentation with clear methodology and comparative analysis. However, there is a critical discrepancy between the PR description and the actual changes: the description states that goja and tengo dependencies were added, but the final state shows they were not included. This mismatch must be clarified before merge.

Blocking Findings

1. PR Description Does Not Match Final State

Evidence:

  • PR description states: "Add goja and tengo as benchmark-only test dependencies."
  • Commit history shows these dependencies were added then explicitly removed (commits 80f8fa3... and 6ec24df... remove interp/compare_bench_test.go).
  • Final go.mod contains neither goja nor tengo.
  • PR changed files list does not include interp/compare_bench_test.go.

Why it matters:
The PR description is the contract for what is being delivered. If it claims dependencies and test code were added but they are not present, reviewers cannot verify the PR against its stated intent.

Minimal fix:
Update the PR description to reflect the actual final state. Choose one:

Option A (if removal was intentional):
"Measure threaded interpreter throughput across all opcode groups on linux/amd64. Add docs/benchmarks.md with comprehensive benchmark results. Update README and AGENTS.md to include performance section and cross-runtime comparison. Remove benchmark comparison test and external dependencies after measurement."

Option B (if addition was intended):
Re-add interp/compare_bench_test.go and restore goja/tengo to go.mod.

Important Findings

None — the documentation itself is high-quality.

Suggestions

  • The benchmark documentation is excellent: clear methodology, well-organized tables, useful comparative context.
  • The README improvements (tagline update, performance section, instruction throughput table) are well-executed.

Questions

  • Was the removal of compare_bench_test.go intentional? The PR description suggests these files and dependencies should be present.

Readiness Checklist

  • Scope control: Pass
  • Correctness: Pass (numbers internally consistent)
  • Test coverage: Pass (documentation only)
  • Architecture consistency: Pass
  • Risk / compatibility: Pass
  • Maintainability: Pass
  • CI / validation: Clean (mergeable_state: clean)

Minimal Path to Merge

  1. Update the PR description to match the final state (either removal or re-addition of goja/tengo).
  2. Merge once description is corrected.

Final Recommendation

Update PR description to match final state, then merge.


Generated by Claude Code

siyul-park changed the title from "docs(benchmarks): add benchmark results and cross-runtime fib(20) comparison" to "docs(benchmarks): add benchmark results and cross-runtime fib(35) comparison" on May 22, 2026
siyul-park merged commit b0cf8a5 into main on May 22, 2026
5 checks passed
siyul-park deleted the claude/benchmark-measurement-docs-gU7hT branch on May 22, 2026 11:03