siyul-park · May 25, 2026
diff --git a/README.md b/README.md
@@ -34,30 +34,30 @@ go get github.com/siyul-park/minivm
 
 ## Performance
 
-Recursive `fib(35)` — linux/amd64, Intel Xeon @ 2.80 GHz, Go 1.26.2:
+Recursive `fib(35)` — linux/amd64, Intel Xeon @ 2.10 GHz, Go 1.26.2:
 
 | Runtime | ns/op | B/op | allocs/op | vs native Go | execution model |
 |---|---|---|---|---|---|
-| native Go | 51,947,220 | 0 | 0 | 1× | compiled |
-| wazero | 84,807,148 | 16 | 2 | 1.6× | WASM JIT |
-| **minivm** | **1,672,707,295** | **288** | **1** | **32×** | **threaded interpreter** |
-| tengo | 2,665,298,176 | 312,800,180 | 39,088,180 | 51× | bytecode VM |
-| gopher-lua | 4,081,167,978 | 971,008 | 3,793 | 79× | register VM |
-| goja | 5,427,175,850 | 383,488 | 46,384 | 105× | bytecode VM |
+| native Go | 56,441,552 | 0 | 0 | 1× | compiled |
+| wazero | 84,601,941 | 16 | 2 | 1.5× | WASM → native JIT |
+| **minivm** | **1,320,092,108** | **244** | **1** | **23×** | **threaded interpreter** |
+| tengo | 2,276,648,719 | 312,797,200 | 39,088,175 | 40× | bytecode VM |
+| gopher-lua | 3,002,897,021 | 971,008 | 3,793 | 53× | register VM |
+| goja | 3,962,089,181 | 380,400 | 46,377 | 70× | bytecode VM |
 
-Among interpreters without JIT, minivm is fastest in this benchmark: **1.6× tengo, 2.4× gopher-lua, 3.2× goja**. Allocation count stays near zero regardless of recursion depth; tengo accumulates 39M allocations at fib(35).
+Among interpreters without JIT, minivm is fastest in this benchmark: **1.7× faster than tengo, 2.3× faster than gopher-lua, and 3.0× faster than goja**. minivm's allocation count stays near zero (1 alloc/op here) regardless of recursion depth; tengo accumulates 39M allocations at `fib(35)`.
 
-wazero's lead is structural: it compiles WASM to native x86-64 at module load. minivm closes this gap on ARM64, where JIT promotes hot segments to native code.
+wazero's advantage is structural: it compiles the WebAssembly module to native x86-64 at load time. minivm narrows this gap on ARM64, where its JIT promotes hot numeric segments to native code.
 
 Single-instruction throughput (threaded interpreter):
 
 | Workload | ns/op |
 |---|---|
-| i32/i64/f32/f64 arithmetic | ~20–22 |
+| i32/i64/f32/f64 arithmetic | ~17–25 |
 | branches (`br`, `br_if`) | ~20–24 |
 | bytecode function call | ~26–29 |
 | host function call | ~36 |
-| array / struct operations | ~90–140 |
+| array / struct operations | ~82–117 |
 
 Full results: [`docs/benchmarks.md`](docs/benchmarks.md)
 

diff --git a/README_kr.md b/README_kr.md
@@ -34,30 +34,30 @@ go get github.com/siyul-park/minivm
 
 ## 성능
 
-재귀 `fib(35)` — linux/amd64, Intel Xeon @ 2.80 GHz, Go 1.26.2:
+재귀 `fib(35)` — linux/amd64, Intel Xeon @ 2.10 GHz, Go 1.26.2:
 
 | 런타임 | ns/op | B/op | allocs/op | vs native Go | 실행 모델 |
 |---|---|---|---|---|---|
-| native Go | 51,947,220 | 0 | 0 | 1× | 컴파일 |
-| wazero | 84,807,148 | 16 | 2 | 1.6× | WASM JIT |
-| **minivm** | **1,672,707,295** | **288** | **1** | **32×** | **스레디드 인터프리터** |
-| tengo | 2,665,298,176 | 312,800,180 | 39,088,180 | 51× | 바이트코드 VM |
-| gopher-lua | 4,081,167,978 | 971,008 | 3,793 | 79× | 레지스터 VM |
-| goja | 5,427,175,850 | 383,488 | 46,384 | 105× | 바이트코드 VM |
+| native Go | 56,441,552 | 0 | 0 | 1× | 컴파일 |
+| wazero | 84,601,941 | 16 | 2 | 1.5× | WASM → 네이티브 JIT |
+| **minivm** | **1,320,092,108** | **244** | **1** | **23×** | **스레디드 인터프리터** |
+| tengo | 2,276,648,719 | 312,797,200 | 39,088,175 | 40× | 바이트코드 VM |
+| gopher-lua | 3,002,897,021 | 971,008 | 3,793 | 53× | 레지스터 VM |
+| goja | 3,962,089,181 | 380,400 | 46,377 | 70× | 바이트코드 VM |
 
-JIT 없는 인터프리터 중 이 벤치마크에서는 minivm이 가장 빠릅니다: **tengo 대비 1.6×, gopher-lua 대비 2.4×, goja 대비 3.2×**. 재귀 깊이에 관계없이 할당 수가 거의 0에 가깝고, tengo는 fib(35)에서 3,900만 번 할당합니다.
+JIT 없는 인터프리터 중 이 벤치마크에서는 minivm이 가장 빠릅니다: **tengo 대비 1.7×, gopher-lua 대비 2.3×, goja 대비 3.0×**. 재귀 깊이에 관계없이 할당 수가 거의 0에 가깝고, tengo는 fib(35)에서 3,900만 번 할당합니다.
 
-wazero가 앞서는 이유는 모듈 로드 시점에 WebAssembly를 x86-64 네이티브 코드로 JIT 컴파일하기 때문입니다. ARM64에서는 minivm도 핫 세그먼트를 네이티브 코드로 승격하므로 이 격차가 좁혀집니다.
+wazero가 앞서는 이유는 모듈 로드 시점에 WebAssembly를 x86-64 네이티브 코드로 JIT 컴파일하기 때문입니다. ARM64에서는 minivm도 핫 숫자 세그먼트를 네이티브 코드로 승격하므로 이 격차가 좁혀집니다.
 
 단일 명령어 처리량 (스레디드 인터프리터):
 
 | 워크로드 | ns/op |
 |---|---|
-| i32/i64/f32/f64 산술 | ~20–22 |
+| i32/i64/f32/f64 산술 | ~17–25 |
 | 분기 (`br`, `br_if`) | ~20–24 |
 | 바이트코드 함수 호출 | ~26–29 |
 | 호스트 함수 호출 | ~36 |
-| 배열 / 구조체 연산 | ~90–140 |
+| 배열 / 구조체 연산 | ~82–117 |
 
 전체 측정 결과: [`docs/benchmarks.md`](docs/benchmarks.md)
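
The "vs native Go" column and the bolded minivm comparisons in the updated `fib(35)` table follow directly from the ns/op figures. As a quick check, the sketch below recomputes them in Go. The `result` type and the `slowdown` helper are hypothetical names introduced for illustration and are not part of minivm; the ns/op values are copied from the added (`+`) table rows.

```go
package main

import (
	"fmt"
	"math"
)

// result is one row of the fib(35) table: a runtime name and its ns/op.
// The type and helper names here are illustrative, not part of minivm.
type result struct {
	name string
	nsOp float64
}

// slowdown returns ns / baseline rounded to the given number of decimals.
func slowdown(ns, baseline float64, decimals int) float64 {
	scale := math.Pow(10, float64(decimals))
	return math.Round(ns/baseline*scale) / scale
}

func main() {
	// ns/op values copied from the updated table.
	rows := []result{
		{"native Go", 56441552},
		{"wazero", 84601941},
		{"minivm", 1320092108},
		{"tengo", 2276648719},
		{"gopher-lua", 3002897021},
		{"goja", 3962089181},
	}
	native, minivm := rows[0].nsOp, rows[2].nsOp

	// Print each runtime's time relative to native Go and to minivm.
	for _, r := range rows {
		fmt.Printf("%-10s  vs native Go: %5.1fx  vs minivm: %4.1fx\n",
			r.name, slowdown(r.nsOp, native, 1), slowdown(r.nsOp, minivm, 1))
	}
}
```

Rounded to the precision used in the README, these ratios should match the table (for example 23× for minivm against native Go, and 1.7× for tengo against minivm), so the same helper can be rerun whenever the benchmark numbers are refreshed.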