Stars

2 results for sponsorable starred repositories written in Python
- A high-throughput and memory-efficient inference and serving engine for LLMs (a minimal usage sketch follows this list)
- Context parallel attention that accelerates DiT model inference with dynamic caching (homepage: https://wavespeed.ai/)
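
The listing itself does not name the repositories, but the first description matches the GitHub tagline of the vLLM project, so the sketch below assumes that engine's Python API; the model name, prompt, and sampling values are placeholder assumptions, not taken from the listing.

```python
# Minimal offline-inference sketch, assuming the first entry is vLLM
# (its GitHub tagline matches the description above). The model name,
# prompt, and sampling values are placeholders.
from vllm import LLM, SamplingParams

prompts = ["The capital of France is"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=32)

# The engine batches requests and manages KV-cache memory internally,
# which is what the "high-throughput and memory-efficient" tagline refers to.
llm = LLM(model="facebook/opt-125m")

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, output.outputs[0].text)
```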