
AI agents running research on single-GPU nanochat training automatically

Python 58,317 8,085 Updated Mar 26, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,307 424 Updated Mar 27, 2026
Python 1,246 127 Updated Feb 28, 2026

slime is an LLM post-training framework for RL Scaling.

Python 5,005 670 Updated Mar 27, 2026

Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP

Python 101 9 Updated Aug 20, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,403 128 Updated Nov 9, 2025

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…

Python 798 96 Updated Mar 13, 2025

A very simple GRPO implementation for reproducing r1-like LLM thinking.

Python 1,621 130 Updated Nov 21, 2025

Fully open reproduction of DeepSeek-R1

Python 25,970 2,414 Updated Nov 24, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Python 193 6 Updated Mar 20, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,442 163 Updated Mar 20, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,261 3,519 Updated Mar 27, 2026

Scalable RL solution for advanced reasoning of language models

Python 1,831 107 Updated Mar 18, 2025

An Open Large Reasoning Model for Real-World Solutions

Python 1,539 80 Updated Feb 13, 2026

A flexible and efficient training framework for large-scale alignment tasks

Python 452 39 Updated Oct 23, 2025

Large Reasoning Models

Python 806 47 Updated Dec 3, 2024

[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Python 421 18 Updated Apr 25, 2025
Python 130 20 Updated Jun 18, 2024

[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 30,771 4,398 Updated Mar 27, 2026

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 285 30 Updated May 26, 2024

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,839 130 Updated Jan 17, 2025

O1 Replication Journey

2,001 61 Updated Jan 14, 2025

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 698 49 Updated Jan 20, 2025

Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://huggingface.co/spaces/pseudotensor/open-strawberry

Python 187 18 Updated Oct 15, 2024

Code for Quiet-STaR

Python 741 91 Updated Aug 21, 2024

Writing AI Conference Papers: A Handbook for Beginners

3,560 127 Updated Jul 16, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,903 367 Updated Dec 17, 2025

Efficient Triton Kernels for LLM Training

Python 6,241 507 Updated Mar 27, 2026