NVFP4-first inference engine for consumer Blackwell. OMMA.SF.16864, KV64 cache (128K context), multi-kernel dispatch. DenForge engine backend. RTX 5070 Ti (GB203-300-A1, SM120).
Updated May 20, 2026 - C++