- Shanghai Jiao Tong University
- Shanghai, China
- gszfwsb.github.io
- @ShaoboWang6
Starred repositories
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
slime is an LLM post-training framework for RL Scaling.
Official PyTorch implementation of the paper "Grounding and Enhancing Informativeness and Utility in Dataset Distillation" (InfoUtil) in ICLR 2026.
Official PyTorch implementation of the paper "Rethinking LLM Evaluation: Can We Evaluate LLMs with 200× Less Data" (EssenceBench) in ICLR 2026.
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.
Edit Banana: A framework for converting statistical formats into editable ones.
AI agents running research on single-GPU nanochat training automatically
Post-training with Tinker
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Official implementation of "MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning"
Assignments for CS146S: The Modern Software Dev (Stanford University Fall 2025)
Qwen3.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"
Reinforcement Learning via Self-Distillation (SDPO)
The GitHub repo for our survey paper: A Survey of Linear Attention: Algorithm, Theory, Application, and Infrastructure
f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.
Automated video uploading to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili
Shaping capabilities with token-level pretraining data filtering
Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".