- The Chinese University of Hong Kong
- Hong Kong
- https://jun-jie-huang.github.io/
Stars
AI agents that automatically run research on single-GPU nanochat training
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
This is the repo for the paper TerminalTraj: Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments
[KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for kernel generation
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap".
Agent S: an open agentic framework that uses computers like a human
My learning notes for ML SYS.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
General technology for enabling AI capabilities w/ LLMs and MLLMs
The official implementation of "ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning"
Repository-level Repair Agent Based on SWE-Bench—JoyCode Agent
PreServe: Intelligent Management for LMaaS Systems via Hierarchical Prediction [ICSE'26]
A comprehensive review of code-domain benchmarks for LLM research.
Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!
Agentless🐱: an agentless approach to automatically solve software development problems
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Code implementation of synthetic continued pretraining