Zilin Wang zerlinwang

Reinforcement Learning. CS PhD@Oxford

52 followers · 105 following

Oxford
zilinwang.notion.site/me

Lists (1)

MARL

Starred repositories

nakamotoo / dsrl_pi0

Official implementation for pi0 steering via DSRL, Steering Your Diffusion Policy with Latent Space Reinforcement Learning (CoRL 2025)

Python 190 28 Updated Aug 5, 2025

ReinFlow / ReinFlow

[NeurIPS 2025] Flow x RL. "ReinFlow: Fine-tuning Flow Policy with Online Reinforcement Learning". Support VLAs e.g., pi0, pi0.5. Fully open-sourced.

Python 247 21 Updated Dec 23, 2025

irom-princeton / dppo

Official implementation of Diffusion Policy Policy Optimization, arxiv 2024

Python 749 93 Updated Feb 4, 2025

xiazhiyi99 / uniaims2ui

TypeScript 1 Updated Jan 14, 2026

ESHyperscale / HyperscaleES

Jax Codebase for Evolutionary Strategies at the Hyperscale

Python 218 18 Updated Dec 25, 2025

LeapLabTHU / Absolute-Zero-Reasoner

Official Repository of Absolute Zero Reasoner

Python 1,808 293 Updated Aug 24, 2025

AlexGoldie / discobench

Official implementation of "DiscoBench: An Open-Ended Benchmark For Algorithm Discovery"

Python 20 2 Updated Feb 4, 2026

AlexGoldie / alexgoldie.github.io

HTML 1 Updated Feb 2, 2026

networkx / networkx

Network Analysis in Python

Python 16,587 3,460 Updated Feb 2, 2026

shapely / shapely

Manipulation and analysis of geometric objects

Python 4,366 609 Updated Jan 30, 2026

Thinklab-SJTU / Bench2Drive

[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert

Python 1,788 119 Updated Feb 18, 2025

MasterXiong / Hyper-VLA

Code of paper "HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks"

Python 19 2 Updated Oct 8, 2025

romkatv / zsh-bin

Statically-linked, hermetic, relocatable Zsh

Shell 381 25 Updated Jul 27, 2023

rasbt / LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 84,606 12,795 Updated Jan 29, 2026

AlexGoldie / learn-rl-algorithms

Official implementation for "How Should We Meta-Learn Reinforcement Learning Algorithms?"

Python 23 1 Updated Sep 7, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,734 2,031 Updated Jan 13, 2026

phlippe / uvadlc_notebooks

Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023

Jupyter Notebook 3,084 667 Updated Oct 31, 2025

bsarkar321 / jaxrwkv

Python 11 1 Updated Aug 30, 2025

bsarkar321 / purejaxfsdp

Implementation of Fully Sharded Data Parallelism in Jax

Python 1 Updated Jun 12, 2025

fla-org / flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,347 374 Updated Feb 3, 2026

YunyiShen / ARM-FI

Active reward modeling with last layer Fisher Information (ICML'25)

Python 7 Updated Jul 9, 2025

lilucse / SparseNetwork4DRL

[ICML 2025 oral] Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning

Python 40 1 Updated Jun 5, 2025

PRIME-RL / SimpleVLA-RL

[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Python 1,357 82 Updated Jan 6, 2026

Letian-Wang / asaprl

RSS 2023: This repository contains code for the paper Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors.

Python 106 12 Updated May 10, 2023

Paper2Poster / Paper2Poster

[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers

Python 3,077 209 Updated Dec 21, 2025

eclipse-sumo / sumo

Eclipse SUMO is an open source, highly portable, microscopic and continuous traffic simulation package designed to handle large networks. It allows for intermodal simulation including pedestrians a…

C++ 3,866 1,694 Updated Feb 5, 2026

TsinghuaC3I / Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,316 129 Updated Nov 9, 2025

xiaomi-research / r1-aqa

🤗 R1-AQA Model: mispeech/r1-aqa

Python 315 28 Updated Mar 28, 2025

Vicky-0256 / DEPfold

Jupyter Notebook 6 Updated Apr 4, 2025

NVlabs / catk

Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models. CVPR Oral 2025.

Python 175 14 Updated Apr 4, 2025

Starred topics

Awesome Lists