Stars
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Understand Human Behavior to Align True Needs
Set of tools to assess and improve LLM security.
🚀 Efficient implementations of state-of-the-art linear attention models
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
A library for advanced large language model reasoning
Minimalistic large language model 3D-parallelism training
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-…
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
Muon is an optimizer for hidden layers in neural networks
Minimalistic 4D-parallelism distributed training framework for educational purposes
YaRN: Efficient Context Window Extension of Large Language Models
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Stanford NLP Python library for Representation Finetuning (ReFT)
[NeurIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official codebase for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unita…