Jiacheng Miao jmiao24

Building AI to do research @Stanford

145 followers · 14 following

Stanford University
Palo Alto, CA
jiachengmiao.com
@Jiacheng_Miao

Stars

openai / parameter-golf

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 2,486 1,327 Updated Mar 20, 2026

Al-Murphy / alphagenome_FT_MPRA

Benchmarking approaches to fine-tune AlphaGenome on lentiMPRA data

Python 5 Updated Mar 19, 2026

mutable-state-inc / autoresearch-at-home

Forked from karpathy/autoresearch

AI agents running research on single-GPU nanochat training automatically

Python 431 23 Updated Mar 13, 2026

NousResearch / hermes-agent-self-evolution

⚒ Evolutionary self-improvement for Hermes Agent — optimize skills, prompts, and code using DSPy + GEPA

Python 238 17 Updated Mar 9, 2026

karpathy / autoresearch

AI agents running research on single-GPU nanochat training automatically

Python 45,627 6,341 Updated Mar 16, 2026

pablodelucca / pixel-agents

Pixel office.

TypeScript 5,059 716 Updated Mar 19, 2026

bytedance / deer-flow

An open-source SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills and subagents, it handles different levels of tasks that could take minute…

Python 32,048 3,881 Updated Mar 20, 2026

snarktank / ralph

Ralph is an autonomous AI agent loop that runs repeatedly until all PRD items are complete.

TypeScript 13,366 1,393 Updated Feb 2, 2026

huggingface / trl

Train transformer language models with reinforcement learning.

Python 17,733 2,571 Updated Mar 20, 2026

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 9,210 899 Updated Mar 20, 2026

allenai / open-instruct

AllenAI's post-training codebase

Python 3,641 515 Updated Mar 20, 2026

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 4,264 367 Updated Nov 13, 2025

janetmalzahn / llm-phacking

Replication archive for "Do Claude Code and Codex P-Hack? Sycophancy and Statistical Analysis in Large Language Models"

R 17 Updated Mar 3, 2026

alibaba / OpenSandbox

OpenSandbox is a general-purpose sandbox platform for AI applications, offering multi-language SDKs, unified sandbox APIs, and Docker/Kubernetes runtimes for scenarios like Coding Agents, GUI Agent…

Python 8,882 673 Updated Mar 20, 2026