xlite-dev
Repositories
- sglang Public Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
- ffpa-attn Public
FFPA: Extends FlashAttention-2 via Split-D for large headdims, 1.5x~3x vs SDPA, up to 430T on H200.
- cache-dit Public Forked from vipshop/cache-dit
A PyTorch-native inference engine with cache, parallelism, quantization for Diffusion Transformers.
- .github Public
- deepseek-v4-for-copilot Public Forked from Vizards/deepseek-v4-for-copilot
Pick DeepSeek V4 from the Copilot Chat model picker, and keep everything else Copilot already gives you.
- LeetCUDA Public
LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.
- diffusers Public Forked from huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
- svdquant-kernels Public Forked from ultism/svdquant-kernels
Cross-architecture CUDA kernels for SVDQuant (W4A4 with low-rank correction)
- flash-attention Public Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
- cutlass Public Forked from NVIDIA/cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra