
Starred repositories

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 4,986 447 Updated Apr 3, 2026

slime is an LLM post-training framework for RL Scaling.

Python 5,122 692 Updated Apr 5, 2026

Official PyTorch implementation of the paper "Grounding and Enhancing Informativeness and Utility in Dataset Distillation" (InfoUtil) in ICLR 2026.

2 Updated Feb 28, 2026

Official PyTorch implementation of the paper "Rethinking LLM Evaluation: Can We Evaluate LLMs with 200× Less Data" (EssenceBench) in ICLR 2026.

Python 3 1 Updated Mar 18, 2026

Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞

Python 10,366 1,157 Updated Apr 4, 2026

General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.

Python 3,268 601 Updated Mar 24, 2026

MBA AI Agent course: a 2-day intensive training, from LLM fundamentals to multi-agent systems

41 7 Updated Mar 14, 2026

Edit Banana: A framework for converting statistical formats into editable.

Python 4,692 302 Updated Apr 3, 2026

AI agents running research on single-GPU nanochat training automatically

Python 66,021 9,442 Updated Mar 26, 2026
Python 1,881 118 Updated Sep 30, 2025

Training API and CLI

Python 368 43 Updated Mar 19, 2026

Post-training with Tinker

Python 3,029 367 Updated Apr 5, 2026

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 158,825 32,737 Updated Apr 5, 2026

Official implementation of "MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning"

Python 10 2 Updated Aug 23, 2025
Python 34 2 Updated Mar 26, 2026

Assignments for CS146S: The Modern Software Dev (Stanford University Fall 2025)

Python 3,400 796 Updated Nov 10, 2025

Qwen3.5 is the large language model series developed by Qwen team, Alibaba Cloud.

2,467 135 Updated Mar 2, 2026

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 6,173 482 Updated Apr 1, 2026

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

Jupyter Notebook 186 14 Updated Jul 23, 2025

Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"

Python 83 7 Updated Mar 18, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 725 80 Updated Feb 18, 2026

The GitHub repo for our survey paper "A Survey of Linear Attention: Algorithm, Theory, Application, and Infrastructure"

8 Updated Feb 6, 2026

f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

HTML 157,492 20,624 Updated Apr 5, 2026

Automated video uploads to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili

Python 9,709 1,752 Updated Mar 26, 2026

Shaping capabilities with token-level pretraining data filtering

Python 92 6 Updated Jan 28, 2026

Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".

Python 122 2 Updated Feb 7, 2026
Python 13 Updated Sep 30, 2025