- Shenzhen, China
- https://guixianjin.github.io/
- https://scholar.google.com/citations?user=qhu2Y6IAAAAJ&hl=zh-CN
Stars
Lightweight and Scalable Post-training: The Ray-Free, Debug-Friendly Alignment Stack with Megatron-native simplicity.
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…
An agentic skills framework & software development methodology that works.
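🦸 AI Coding Superpowers · Chinese Enhanced Edition: a complete Chinese localization of superpowers (116k+ ⭐) plus 6 original skills from China, so that 16 AI coding tools, including Claude Code / Copilot CLI / Hermes Agent / Cursor / Windsurf / Kiro / Gemini CLI, actually get work done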
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
Stable and Efficient Reinforcement Learning for Trillion-Parameter LLMs
Post-training with Tinker
Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption
Fast, small, and fully autonomous AI personal assistant infrastructure, any OS, any platform — deploy anywhere, swap anything 🦀
A lightweight, powerful framework for multi-agent workflows
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Minimalistic 4D-parallelism distributed training framework for educational purposes
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
A high-performance RL training-inference weight synchronization framework, designed to enable second-level parameter updates from training to inference in RL workflows
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
sail-sg / VocabularyParallelism
Forked from NVIDIA/Megatron-LM
Vocabulary Parallelism
SGLang is a high-performance serving framework for large language models and multimodal models.
An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
Understanding R1-Zero-Like Training: A Critical Perspective
A high-performance and light-weight router for vLLM large scale deployment
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
AI agents running research on single-GPU nanochat training automatically