Stars
Scalable toolkit for efficient model reinforcement
Towards a Unified View of Large Language Model Post-Training
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
A very simple GRPO implementation for reproducing R1-like LLM thinking.
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
How to optimize some algorithms in CUDA.
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Ollama Function Calling with Search API
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
LLM notes covering model inference, transformer model structure, and LLM framework code analysis.
A repository aimed at pruning DeepSeek V3, R1 and R1-zero to a usable size
Source code for Self-Evaluation Guided MCTS for online DPO.
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Tools for merging pretrained large language models.
A flexible and efficient training framework for large-scale alignment tasks
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
😇 A PyTorch-like deep learning framework. Just for fun.