Rulin Shao RulinShao

292 followers · 50 following

University of Washington, Meta AI
Seattle, WA
https://rulinshao.github.io/
@RulinShao

Stars

NVIDIA-NeMo / ProRL-Agent-Server

Agentic RL on Any Harness at Scale

Python 569 60 Updated Jun 17, 2026

aisa-group / InferenceBench

Benchmarking Open-Ended Inference Optimization by AI Agents

Python 27 4 Updated May 16, 2026

gpakosz / .tmux

Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍

Shell 25,091 3,590 Updated Jun 14, 2026

facebookresearch / ProgramBench

Can Language Models Rebuild Programs From Scratch?

Python 767 51 Updated Jun 18, 2026

self-evolving / repo

TypeScript 44 4 Updated Jun 17, 2026

FrontierCS / Frontier-CS

A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science.

C++ 244 38 Updated Jun 17, 2026

WecoAI / aideml

AIDE: AI-Driven Exploration in the Space of Code. The machine learning engineering agent that automates AI R&D.

Python 1,323 194 Updated May 2, 2026

JuliusBrussee / caveman

🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman

JavaScript 74,115 4,174 Updated Jun 12, 2026

PolarSeeker / OpenSeeker

OpenSeeker: A search agent with open-source data and models

Python 749 56 Updated May 22, 2026

addyosmani / agent-skills

Production-grade engineering skills for AI coding agents.

Shell 62,345 6,761 Updated Jun 16, 2026

Intent-Lab / VisionClaw

Real-time AI assistant for Meta Ray-Ban smart glasses -- voice + vision + agentic actions via Gemini Live and OpenClaw

2,386 449 Updated May 6, 2026

openai / parameter-golf

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 5,131 3,334 Updated May 4, 2026

stanford-iris-lab / meta-harness-tbench2-artifact

Meta-Harness: 76.4% on Terminal-Bench 2.0 (Claude Opus 4.6)

Python 1,101 161 Updated Mar 26, 2026

andrewyng / context-hub

JavaScript 13,610 1,187 Updated May 31, 2026

benchflow-ai / skillsbench

SkillsBench evaluates how well skills work and how effective agents are at using them.

PDDL 1,370 317 Updated Jun 18, 2026

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 16,575 1,647 Updated Mar 4, 2026

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes for ML SYS.

Python 6,537 445 Updated Jun 18, 2026

TIGER-AI-Lab / OpenResearcher

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Python 784 78 Updated Jun 10, 2026

Tencent-Hunyuan / CL-bench

CL-bench: A Benchmark for Context Learning

Python 560 29 Updated May 12, 2026

open-tinker / OpenTinker

OpenTinker is an RL-as-a-Service infrastructure for foundation models

Python 675 63 Updated Mar 21, 2026

test-time-training / discover

Python 589 86 Updated May 24, 2026

Continual-Intelligence / SEAL

Self-Adapting Language Models

Python 1,779 308 Updated Aug 1, 2025

StarTrail-org / RAG-DS-Serve

[AAAI26]: DS SERVE: The Largest Open Vector Store over Pretraining Data; A Framework for Efficient and Scalable Neural Retrieval

Python 51 5 Updated Jan 28, 2026

openai / prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 2,143 127 Updated Jun 1, 2023

google-deepmind / superhuman

Lean 759 77 Updated Jun 5, 2026

caixd-220529 / LifelongAgentBench

Code repo for "LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners"

Python 91 6 Updated May 30, 2025

gastownhall / gastown

Gas Town - multi-agent workspace manager

Go 15,953 1,487 Updated Jun 17, 2026

VoltAgent / awesome-agent-skills

A curated collection of 1000+ agent skills from official dev teams and the community, compatible with Claude Code, Codex, Gemini CLI, Cursor, and more.

25,677 2,731 Updated Jun 16, 2026

Orchestra-Research / AI-Research-SKILLs

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 9,806 734 Updated Jun 16, 2026

test-time-training / e2e

Official JAX implementation of End-to-End Test-Time Training for Long Context

Python 621 47 Updated Feb 15, 2026