TongLi3701

🎯

Focusing

Tong Li TongLi3701


Post-train for Generative AI

60 followers · 163 following

Shanghai, China
10:44 (UTC +08:00)
https://www.linkedin.com/in/tongli3701/

Achievements

x3 x3


Lists (10)


Stars

yifeizhangcs / awesome-agentic-commerce

5 Updated Mar 29, 2026

karpathy / autoresearch

AI agents running research on single-GPU nanochat training automatically

Python 62,797 8,786 Updated Mar 26, 2026

EverMind-AI / MSA

Memory Sparse Attention - an end-to-end trainable memory framework for 100M-token contexts

2,399 133 Updated Mar 29, 2026

Gen-Verse / OpenClaw-RL

OpenClaw-RL: Train any agent simply by talking

Python 4,481 448 Updated Mar 31, 2026

ChenmienTan / RL2

Python 1,249 128 Updated Feb 28, 2026

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python 5,056 677 Updated Mar 29, 2026

liangyuwang / Tiny-FSDP

Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP

Python 101 9 Updated Aug 20, 2025

TsinghuaC3I / Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,411 128 Updated Nov 9, 2025

ScienceOne-AI / DeepSeek-671B-SFT-Guide

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…

Python 800 95 Updated Mar 13, 2025

lsdefine / simple_GRPO

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,624 129 Updated Nov 21, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,967 2,411 Updated Nov 24, 2025

InternLM / OREAL

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Python 192 6 Updated Mar 20, 2025

Unakar / Logic-RL

Reproduce R1-Zero on logic puzzles

Python 2,443 163 Updated Mar 20, 2025

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,357 3,545 Updated Apr 1, 2026

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

Python 1,835 108 Updated Mar 18, 2025

AIDC-AI / Marco-o1

An Open Large Reasoning Model for Real-World Solutions

Python 1,539 80 Updated Feb 13, 2026

alibaba / ChatLearn

A flexible and efficient training framework for large-scale alignment tasks

Python 452 39 Updated Oct 23, 2025

SimpleBerry / LLaMA-O1

Large Reasoning Models

Python 805 47 Updated Dec 3, 2024

mit-han-lab / vila-u

[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Python 420 18 Updated Apr 25, 2025

BrendanGraham14 / mcts-llm

Python 131 20 Updated Jun 18, 2024

HKUDS / LightRAG

[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 31,296 4,472 Updated Mar 30, 2026

waterhorse1 / LLM_Tree_Search

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 285 30 Updated May 26, 2024

openreasoner / openr

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,839 130 Updated Jan 17, 2025

GAIR-NLP / O1-Journey

O1 Replication Journey

2,000 61 Updated Jan 14, 2025

THUDM / ReST-MCTS

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 698 49 Updated Jan 20, 2025

pseudotensor / open-strawberry

Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://huggingface.co/spaces/pseudotensor/open-strawberry

Python 186 18 Updated Oct 15, 2024