Stars
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
A modular graph-based Retrieval-Augmented Generation (RAG) system
Awesome LLM compression research papers and tools.
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
GitHub Pages template based upon HTML and Markdown for personal, portfolio-based websites.
A header-only C++ library for sketching in randomized linear algebra
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Master programming by recreating your favorite technologies from scratch.
FlashMLA: Efficient Multi-head Latent Attention Kernels
A collection of AWESOME things about domain adaptation
Fully open reproduction of DeepSeek-R1
Official style files for papers submitted to venues of the Association for Computational Linguistics
For releasing code related to compression methods for transformers, accompanying our publications