nvidia-h20

Here is 1 public repository matching this topic...

CarrotSwordsman / H20-LLM-Cookbook

Reproducible benchmark suite and tuned Triton fused-MoE configs for LLM inference on the NVIDIA H20. Covers 24 configs and 36 performance data points, with a geometric-mean speedup of 1.09× and a peak speedup of 1.74×.

benchmark triton moe hopper h20 kernel-tuning mixture-of-experts bf16 llm fp8 vllm llm-inference qwen mixtral deepseek sglang triton-kernels nvidia-h20 fused-moe

Updated Jun 2, 2026
Python
