# vram
Here are 4 public repositories matching this topic...
Expert streaming inference engine for MoE models larger than VRAM — run 235B+ models on consumer GPUs
Updated Mar 30, 2026 - C
Research into CUDA Unified Memory as a VRAM extension for LLM inference
Updated Jun 12, 2026 - C
GDDR6X VRAM Temperature reader for Ampere/Ada (3000 and 4000 series)
Updated Apr 28, 2026 - C