hbm

Thermal-aware batch controller for vLLM/TensorRT-LLM. Keeps HBM thermal throttling from degrading p99 latency on H100/H200. Monitors temperature via nvidia-smi, automatically cuts batch size at 85°C, and migrates cold KV-cache entries to DRAM. Prometheus and Grafana dashboards included. Reported p99 latency at 128K context: 4.2 s down to 2.1 s.

Updated Apr 13, 2026
Python
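
The control loop this description implies (read a temperature, cut the batch size at 85°C, recover when the device cools) might look roughly like the sketch below. It is not taken from the repository: the class name `ThermalBatchController`, the halving policy, the 80°C recovery threshold, and the use of the `temperature.gpu` query field are all assumptions made for illustration. Only the 85°C cut point comes from the description.

```python
import subprocess
from dataclasses import dataclass, field


def read_gpu_temp_c(index: int = 0) -> float:
    """Read the GPU temperature in Celsius by querying nvidia-smi.

    Assumption: the core temperature stands in for HBM temperature here;
    the repository may read a different sensor.
    """
    out = subprocess.run(
        [
            "nvidia-smi",
            f"--id={index}",
            "--query-gpu=temperature.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return float(out.strip().splitlines()[0])


@dataclass
class ThermalBatchController:
    """Hypothetical controller: halve the batch when hot, grow it when cool."""

    max_batch: int = 64
    min_batch: int = 1
    cut_at_c: float = 85.0      # threshold named in the description
    recover_at_c: float = 80.0  # assumed hysteresis band, not from the source
    batch: int = field(init=False)

    def __post_init__(self) -> None:
        # Start at full batch size until a hot reading arrives.
        self.batch = self.max_batch

    def step(self, temp_c: float) -> int:
        """Update and return the batch size for one temperature reading."""
        if temp_c >= self.cut_at_c:
            # Too hot: halve the batch, but never go below min_batch.
            self.batch = max(self.min_batch, self.batch // 2)
        elif temp_c <= self.recover_at_c:
            # Cool again: grow by one request, capped at max_batch.
            self.batch = min(self.max_batch, self.batch + 1)
        # Between the two thresholds the batch size is held steady.
        return self.batch
```

In a serving loop one would call `ctrl.step(read_gpu_temp_c())` on a timer and pass the result to the scheduler's batch limit. The gap between the two thresholds is there to stop the batch size from oscillating around a single cut point.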

wnsgus00114-droid / cips-afo-thesis

A.F.O artifact for bridge-sensitive bottleneck attribution and control in hierarchical LLM memory paths (gem5-based reproducibility bundle).

simulation computer-architecture hbm memory-hierarchy gem5 artifact-evaluation llm-inference bottleneck-analysis

Updated May 11, 2026
Python

dancinlab / hexa-chip

🔲 Chip substrate — 28-verb semiconductor stack (architecture / design / EDA / process / packaging / NPU / PIM / 3D / photonic / RTL-gen / yield / consciousness-chip).

chip pim hbm semiconductor 3d-ic photonic npu advanced-packaging n6-invariant hexa-family korea-fab

Updated May 14, 2026
Python

manishklach / hbm_fragmentation_guard_sim

Reference simulator + benchmark harness for HBM residency control and fragmentation metrics.

benchmark simulator gpu allocator prefetch memory-management hbm fragmentation

Updated Apr 2, 2026
Python
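
The description does not say which fragmentation metrics the simulator reports. As an illustration only, one commonly used measure of external fragmentation is one minus the ratio of the largest free block to the total free space. The function name `external_fragmentation` and the input format (a list of free-block sizes) are assumptions, not the repository's API.

```python
from typing import Sequence


def external_fragmentation(free_blocks: Sequence[int]) -> float:
    """Return 1 - largest_free / total_free for a list of free-block sizes.

    0.0 means all free memory is one contiguous block; values near 1.0
    mean the free memory is split into many small pieces.
    """
    total_free = sum(free_blocks)
    if total_free == 0:
        # No free memory at all: treat as unfragmented by convention.
        return 0.0
    return 1.0 - max(free_blocks) / total_free
```

For example, free blocks of sizes 4, 4, and 8 give a value of 0.5: half of the free memory cannot be used by a single allocation as large as the total.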

Here are 7 public repositories matching this topic...

ModelEngine-Group / unified-cache-management

harvard-acc / DreamRAM

sp4s-s / sparse-jax-atn

manishklach / thermal-ctrl-harness

wnsgus00114-droid / cips-afo-thesis

dancinlab / hexa-chip

manishklach / hbm_fragmentation_guard_sim
