- Colossal-AI Team @hpcaitech
- Shanghai, China
- https://www.linkedin.com/in/tongli3701/
Stars
slime is an LLM post-training framework for RL Scaling.
Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP
A Survey of Reinforcement Learning for Large Reasoning Models
An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…
A very simple GRPO implementation for reproducing R1-like LLM thinking.
Fully open reproduction of DeepSeek-R1
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
verl: Volcano Engine Reinforcement Learning for LLMs
Scalable RL solution for advanced reasoning of language models
An Open Large Reasoning Model for Real-World Solutions
A flexible and efficient training framework for large-scale alignment tasks
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
Building an open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported). Demo: https://huggingface.co/spaces/pseudotensor/open-strawberry
Writing AI Conference Papers: A Handbook for Beginners
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Efficient Triton Kernels for LLM Training
Empowering RAG with a memory-based data interface for all-purpose applications!
A simple, easy-to-hack GraphRAG implementation
MINT-1T: A one trillion token multimodal interleaved dataset.
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.