jianzhu

steve jianzhu

🎯

Focusing

88 followers · 20 following

Beijing, China

Achievements

Stars

karpathy / autoresearch

AI agents running research on single-GPU nanochat training automatically

Python 55,948 7,796 Updated Mar 21, 2026

anyofai / anyofai.github.io

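Latest 2026 ChatGPT top-up and subscription tutorial (117 yuan/month): this article focuses on five ways to get a ChatGPT Plus membership: buying a standalone ChatGPT Plus account, having a third party top up your ChatGPT account, sharing a ChatGPT Plus account with other users, paying for the ChatGPT membership with an Apple gift card, and subscribing to ChatGPT Plus with a foreign virtual credit card.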

CSS 863 32 Updated Mar 18, 2026

HKUDS / nanobot

"🐈 nanobot: The Ultra-Lightweight OpenClaw"

Python 36,252 6,203 Updated Mar 25, 2026

lasgroup / SDPO

Reinforcement Learning via Self-Distillation (SDPO)

Python 683 65 Updated Feb 18, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 335,912 65,666 Updated Mar 25, 2026

farion1231 / cc-switch

A cross-platform desktop All-in-One assistant tool for Claude Code, Codex, OpenCode, openclaw & Gemini CLI.

Rust 33,383 2,007 Updated Mar 25, 2026

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python 50,298 6,602 Updated Mar 25, 2026

Continual-Intelligence / SEAL

Self-Adapting Language Models

Python 1,730 305 Updated Aug 1, 2025

wizard-III / Archer2.0

Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature convergence and unlock greater RL potential.

Python 30 2 Updated Oct 10, 2025

AlmondGod / tinyworlds

A minimal implementation of DeepMind's Genie world model

Python 1,191 97 Updated Feb 28, 2026

X-PLUG / MobileAgent

Mobile-Agent: The Powerful GUI Agent Family

Python 8,308 835 Updated Mar 25, 2026

anordin95 / a-conceptual-overview-of-asyncio

Python 260 11 Updated Jan 14, 2026

MikeWangWZHL / PAPO

Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"

Python 125 9 Updated Feb 4, 2026

JIA-Lab-research / ARPO

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python 153 10 Updated May 29, 2025

yfzhang114 / r1_reward

✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Python 282 22 Updated May 9, 2025

MoonshotAI / Kimi-VL

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

1,169 74 Updated Jul 15, 2025

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,203 3,501 Updated Mar 25, 2026

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,967 2,416 Updated Nov 24, 2025

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python 1,236 57 Updated Aug 27, 2025

lukDev / awr_pytorch

PyTorch implementation of AWR.

Python 4 1 Updated Apr 29, 2020

NVlabs / COAT

[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

Python 262 25 Updated Aug 9, 2025

DigiRL-agent / digiq

Python 118 8 Updated Apr 8, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,972 288 Updated May 15, 2025

simplescaling / s1

s1: Simple test-time scaling

Python 6,650 765 Updated Jun 25, 2025

deepseek-ai / DeepSeek-R1

91,987 11,750 Updated Jun 27, 2025

LeslieTrue / SFTvsRL

Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Python 318 18 Updated Apr 28, 2025

deepseek-ai / DeepSeek-V3

Python 102,379 16,606 Updated Aug 28, 2025

princeton-nlp / LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Jupyter Notebook 519 45 Updated Oct 20, 2024

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 55,584 9,465 Updated Nov 12, 2025

nikhilvyas / SOAP

Python 258 16 Updated Dec 2, 2024