An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 3,237 289 Updated Jun 16, 2026

This repository contains all necessary meta information, results and source files to reproduce the results in the publication Eric Müller-Budack, Kader Pustu-Iren, Ralph Ewerth: "Geolocation Estima…

157 38 Updated Jan 19, 2026

Simple RL training for reasoning

Python 3,866 287 Updated Dec 23, 2025

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,444 120 Updated Apr 17, 2026

GeoGuessr benchmark for language models

Python 60 4 Updated Jun 6, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 5,015 373 Updated Apr 6, 2026

Machine Learning and Computer Vision Engineer - Technical Interview Questions

4,702 762 Updated Jan 24, 2026

Continuous Thought Machines, because thought takes time and reasoning is a process.

Python 1,950 295 Updated Dec 29, 2025

[COLM'25] The official implementation of "LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception"

Python 12 2 Updated Aug 4, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,909 495 Updated Oct 27, 2025

Witness the aha moment of VLM with less than $3.

Python 4,060 283 Updated May 19, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 2,705 225 Updated Apr 14, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 22,003 4,084 Updated Jun 16, 2026

Understanding R1-Zero-Like Training: A Critical Perspective

Python 1,262 60 Updated Aug 27, 2025

Embodied Reasoning Question Answer (ERQA) Benchmark

Python 275 17 Updated Mar 12, 2025

Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023

Jupyter Notebook 3,160 682 Updated Jun 1, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 97,262 14,878 Updated Jun 2, 2026

[TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Python 149 1 Updated Oct 10, 2025

Eagle: Frontier Vision-Language Models with Data-Centric Strategies

Python 2,516 222 Updated Jun 15, 2026

A fork to add multimodal model training to open-r1

Python 1,567 72 Updated Feb 8, 2025

Collection of awesome parameter-efficient fine-tuning resources.

587 19 Updated Dec 10, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,895 370 Updated Dec 17, 2025

List of papers on Self-Correction of LLMs.

81 3 Updated May 19, 2026

PushWorld: A benchmark for manipulation planning with tools and movable obstacles

Python 94 16 Updated May 5, 2026

A collection of PDDL generators, some of which have been used to generate benchmarks for the International Planning Competition (IPC).

C 156 33 Updated Jan 3, 2026

Official release of the benchmark in paper "VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs"

Python 20 2 Updated Aug 1, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,214 8,837 Updated Jun 16, 2026

A bibliography and survey of the papers surrounding o1

TeX 1,213 51 Updated Nov 16, 2024

Efficient LLM inference on Slurm clusters.

Python 102 14 Updated Jun 16, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,052 18,121 Updated Jun 16, 2026