Stars
Tensors and Dynamic neural networks in Python with strong GPU acceleration
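The "dynamic" part means the autograd graph is built as operations execute. A minimal sketch (the device fallback is illustrative):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(3, 3, device=device, requires_grad=True)
y = (x ** 2).sum()   # the graph is built on the fly as ops run
y.backward()         # reverse-mode autodiff
print(x.grad)        # dy/dx_i = 2*x_i
```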
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Making large AI models cheaper, faster and more accessible
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
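A minimal sketch of wrapping a model in the DeepSpeed engine; the config values (batch size, ZeRO stage, learning rate) are illustrative assumptions, not recommendations:

```python
import torch
import deepspeed

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "zero_optimization": {"stage": 2},   # partition optimizer state + gradients
}

model = torch.nn.Linear(10, 10)
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Training steps then go through the engine:
# loss = engine(inputs).sum(); engine.backward(loss); engine.step()
```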
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
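A minimal sketch of loading a pretrained backbone ("resnet50" is one of many available model names):

```python
import timm
import torch

model = timm.create_model("resnet50", pretrained=True).eval()

x = torch.randn(1, 3, 224, 224)   # dummy ImageNet-sized input
with torch.no_grad():
    logits = model(x)             # (1, 1000) class logits
```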
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
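A minimal sketch of the three transformations named here (differentiate, JIT, vectorize):

```python
import jax
import jax.numpy as jnp

def loss(w, x):
    return jnp.sum((x @ w) ** 2)

grad_loss = jax.grad(loss)                    # differentiate w.r.t. w
fast_grad = jax.jit(grad_loss)                # JIT-compile for CPU/GPU/TPU
batched = jax.vmap(loss, in_axes=(None, 0))   # vectorize over a batch of x

w = jnp.ones((4, 2))
x = jnp.ones((8, 4))                          # batch of 8 inputs
print(fast_grad(w, x[0]))
print(batched(w, x))
```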
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
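A minimal text-to-image sketch; the checkpoint id is an assumption, and any diffusers-compatible text-to-image model works in its place:

```python
import torch
from diffusers import DiffusionPipeline

# Assumed checkpoint id; substitute any text-to-image model on the Hub.
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")                      # assumes a CUDA GPU

image = pipe("an astronaut riding a horse").images[0]
image.save("astronaut.png")
```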
Fast and memory-efficient exact attention
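A minimal sketch of the core kernel (assumes an Ampere-or-newer CUDA GPU and half-precision inputs):

```python
import torch
from flash_attn import flash_attn_func

# q/k/v are (batch, seqlen, nheads, headdim) fp16/bf16 CUDA tensors.
q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)   # causal mask for autoregressive models
```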
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
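A minimal encode/decode round trip:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")    # the GPT-4 / GPT-3.5 encoding
tokens = enc.encode("tiktoken is fast")
print(tokens)                                 # list of token ids
assert enc.decode(tokens) == "tiktoken is fast"
```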
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
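A minimal sketch of the PyTorch integration (one process per GPU, launched with e.g. `horovodrun -np 4 python train.py`):

```python
import torch
import horovod.torch as hvd

hvd.init()
torch.cuda.set_device(hvd.local_rank())

model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Start all workers from identical weights, then allreduce gradients on step().
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())
```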
Ongoing research training transformer models at scale
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
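A minimal sketch of adapting one layer with loralib (the 768-dim sizes and rank are illustrative):

```python
import torch.nn as nn
import loralib as lora

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = lora.Linear(768, 768, r=8)   # rank-8 low-rank update

model = Model()
lora.mark_only_lora_as_trainable(model)          # freeze everything but A/B

# Checkpoints only need the (tiny) LoRA weights:
state = lora.lora_state_dict(model)
```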
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
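A minimal sketch of Apex's amp API (now largely superseded by torch.cuda.amp, but still illustrative):

```python
import torch
from apex import amp

model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# "O1" = patch ops to run in fp16 where safe, keep fp32 master weights.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

loss = model(torch.randn(4, 10, device="cuda")).sum()
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()        # loss scaling avoids fp16 gradient underflow
optimizer.step()
```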
Home of StarCoder: fine-tuning & inference!
A PyTorch native platform for training generative AI models
Data Structure and Algorithm notes, with solutions to LeetCode and LintCode problems
PyTorch extensions for high performance and large scale training.
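A minimal FSDP sketch with FairScale (assumes `torch.distributed` is already initialized, e.g. via `torchrun`):

```python
import torch
from fairscale.nn import FullyShardedDataParallel as FSDP

# Shards parameters, gradients, and optimizer state across workers.
model = FSDP(torch.nn.Linear(1024, 1024).cuda())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```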
An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…
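A minimal FP8 sketch (assumes a Hopper-or-newer GPU; layer sizes are illustrative):

```python
import torch
import transformer_engine.pytorch as te

layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(8, 1024, device="cuda")

# Run the forward pass with FP8 compute where supported.
with te.fp8_autocast(enabled=True):
    y = layer(x)
```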
slime is an LLM post-training framework for RL scaling.
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models