View Jun-jie-Huang's full-sized avatar


AI agents running research on single-GPU nanochat training automatically

Python 53,208 7,408 Updated Mar 21, 2026

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 5,528 435 Updated Mar 24, 2026

[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"

Python 163 9 Updated Mar 2, 2026

This is the repo for the paper TerminalTraj: Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments

7 Updated Feb 10, 2026

[KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for Kernel Generations

Python 150 7 Updated Mar 24, 2026
Jupyter Notebook 214 3 Updated Dec 19, 2025

Moonshot's most powerful model

1,524 165 Updated Jan 31, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 333,533 65,016 Updated Mar 24, 2026

Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap".

Python 130 4 Updated Mar 24, 2026

Agent S: an open agentic framework that uses computers like a human

Python 10,563 1,228 Updated Feb 21, 2026

My learning notes for ML SYS.

Python 5,761 373 Updated Mar 19, 2026

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 57,056 4,716 Updated Mar 24, 2026

General technology for enabling AI capabilities w/ LLMs and MLLMs

Python 4,310 370 Updated Mar 23, 2026

The official implementation of "ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning"

Python 379 47 Updated Jan 16, 2026

Repository-level Repair Agent Based on SWE-Bench—JoyCode Agent

Python 325 20 Updated Oct 11, 2025

PreServe: Intelligent Management for LMaaS Systems via Hierarchical Prediction [ICSE'26]

Jupyter Notebook 6 Updated Oct 20, 2025

Democratizing Reinforcement Learning for LLMs

Python 5,276 523 Updated Mar 24, 2026
Python 48 9 Updated Oct 28, 2025

A comprehensive review of code-domain benchmarks in LLM research.

210 16 Updated Sep 22, 2025

Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.

Python 859 69 Updated Dec 26, 2025

SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution

Python 103 6 Updated Sep 24, 2025

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 3,470 475 Updated Mar 24, 2026

The theory of mind module for the SWE agent

Python 92 12 Updated Jan 13, 2026

Agentless🐱: an agentless approach to automatically solve software development problems

Python 2,022 228 Updated Dec 22, 2024

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization

Python 12 Updated Aug 20, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

Python 5,418 483 Updated Mar 24, 2026

Code implementation of synthetic continued pretraining

Jupyter Notebook 157 16 Updated Jan 6, 2025

Reproducing R1 for Code with Reliable Rewards

Python 299 18 Updated May 5, 2025

The open source coding agent.

TypeScript 129,258 13,680 Updated Mar 24, 2026