Stars
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A modular graph-based Retrieval-Augmented Generation (RAG) system
Fully open reproduction of DeepSeek-R1
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
Implementation related to the paper "Deep Complex Networks"
For releasing code related to compression methods for transformers, accompanying our publications
Source code for our paper: "Learning a Variational Network for Reconstruction of Accelerated MRI Data"
Implementation of Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer in PyTorch.
Code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"
A numerical library for High-Dimensional option Pricing problems, including Fourier transform methods, Monte Carlo methods and the Deep Galerkin method
My implementation of the gMLP model from the paper "Pay Attention to MLPs".
ReCoDe project to showcase an implementation of the Euler-Maruyama numerical method to solve Stochastic Differential Equations