gszfwsb
😈
Making alchemy
Starred repositories

Python 30 Updated May 12, 2026

Experimenting Heuristic Learning with ImageNet

Python 52 4 Updated May 18, 2026

The agent that grows with you

Python 155,043 24,852 Updated May 18, 2026

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,182 496 Updated May 16, 2026

slime is an LLM post-training framework for RL Scaling.

Python 5,708 796 Updated May 14, 2026

Official PyTorch implementation of the paper "Grounding and Enhancing Informativeness and Utility in Dataset Distillation" (InfoUtil) in ICLR 2026.

2 Updated Feb 28, 2026

Official PyTorch implementation of the paper "Rethinking LLM Evaluation: Can We Evaluate LLMs with 200× Less Data" (EssenceBench) in ICLR 2026.

Python 3 1 Updated Mar 18, 2026

Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞

Python 12,242 1,433 Updated Apr 23, 2026

General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.

Python 4,260 748 Updated May 13, 2026

MBA AI Agent course: a 2-day intensive training, from LLM fundamentals to multi-agent systems

51 12 Updated Mar 14, 2026

Edit Banana: A framework for converting statistical formats into editable.

Python 5,189 350 Updated Apr 30, 2026

AI agents running research on single-GPU nanochat training automatically

Python 81,569 11,857 Updated Mar 26, 2026
Python 1,928 122 Updated Sep 30, 2025

Training API and CLI

Python 467 56 Updated May 14, 2026

Post-training with Tinker

Python 3,300 419 Updated May 18, 2026

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 160,700 33,240 Updated May 18, 2026

Official implementation of "MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning"

Python 10 2 Updated Aug 23, 2025
Python 40 5 Updated Mar 26, 2026

Assignments for CS146S: The Modern Software Dev (Stanford University Fall 2025)

Python 3,628 877 Updated Nov 10, 2025

Qwen3.6 is the large language model series developed by Qwen team, Alibaba Group.

3,391 220 Updated May 11, 2026

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 8,561 654 Updated Apr 28, 2026

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

Jupyter Notebook 188 14 Updated Jul 23, 2025

Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"

Python 164 11 Updated Mar 18, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 876 95 Updated Feb 18, 2026

The GitHub repo for our survey paper: A Survey of Linear Attention: Algorithm, Theory, Application, and Infrastructure

10 Updated Feb 6, 2026

f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

HTML 162,431 21,145 Updated May 17, 2026

Automated video uploads to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili

Python 11,125 1,990 Updated May 17, 2026

Shaping capabilities with token-level pretraining data filtering

Python 93 7 Updated Jan 28, 2026