🍒
  • beijing

A project implementing various agentic RL based on the Slime post-training framework

Python 312 14 Updated Apr 11, 2026

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,664 252 Updated Jan 8, 2026

Deep dive into Claude Code internals: architecture, agent loop, context engineering, the tool system, and more.

HTML 2 Updated Mar 31, 2026

The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.

Rust 182,151 107,404 Updated Apr 12, 2026

Lightweight coding agent that runs in your terminal

Rust 74,763 10,573 Updated Apr 12, 2026

A construction kit for reinforcement learning environment management.

Python 408 54 Updated Apr 12, 2026

Official repository for DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Python 615 59 Updated Apr 6, 2026

Bash is all you need: a nano Claude Code-like "agent harness", built from 0 to 1

TypeScript 52,028 8,500 Updated Apr 7, 2026

MCore-Bridge: Providing Megatron-Core model definitions for state-of-the-art large models and making Megatron training as simple as Transformers.

Python 41 5 Updated Apr 12, 2026

Harbor is a framework for running agent evaluations and creating and using RL environments.

Python 1,417 897 Updated Apr 12, 2026

😼 Use a clash/mihomo-based proxy environment elegantly

Shell 11,755 1,340 Updated Mar 31, 2026

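The latest 2026 guide to paying for a ChatGPT subscription (117 yuan/month): this article focuses on five ways to get a ChatGPT Plus membership, including buying a standalone ChatGPT Plus account, having someone top up your ChatGPT account on your behalf, sharing a ChatGPT Plus account with others, paying for ChatGPT membership with an Apple gift card, and subscribing to ChatGPT Plus with a foreign virtual credit card.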

CSS 919 34 Updated Mar 18, 2026

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Python 647 70 Updated Apr 4, 2026

[KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for kernel generation

Python 155 15 Updated Mar 29, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,834 505 Updated Apr 11, 2026

xiaohongshu-skills

Python 1,017 150 Updated Mar 28, 2026

Measuring how well CLI agents like Claude Code or Codex CLI can post-train base LLMs on a single H100 GPU in 10 hours

Python 259 27 Updated Apr 7, 2026

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Python 915 66 Updated Mar 4, 2026

Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.

Python 15,091 2,034 Updated Apr 12, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…

Python 13,650 1,343 Updated Apr 12, 2026

A curated skill collection for academic writing and research

Shell 778 61 Updated Apr 6, 2026

Research Writing Assistant

Python 642 60 Updated Mar 29, 2026

RedSearcher's framework for deep search agent trajectory synthesis, QA filtering, and model evaluation, supporting ReACT and DeepSeek-style agent loops.

Python 11 2 Updated Feb 26, 2026

Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"

Python 90 8 Updated Mar 18, 2026

Moonshot's most powerful model

1,734 192 Updated Jan 31, 2026

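Sharing AI infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more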

Jupyter Notebook 1,621 131 Updated Apr 8, 2026

Qwen3.5 is the large language model series developed by the Qwen team at Alibaba Cloud.

2,561 145 Updated Mar 2, 2026

GLM-5: From Vibe Coding to Agentic Engineering

2,640 256 Updated Apr 9, 2026

REDSearch: A scalable, cost-efficient framework for long-horizon search agents. Features complex task synthesis, optimized mid-training, post-training (SFT and Agentic RL)

85 5 Updated Feb 26, 2026
Python 7 Updated Jan 30, 2026