QZH-777
  • Tsinghua University


Starred repositories

OpenClaw-RL: Train any agent simply by talking

Python 4,419 439 Updated Mar 28, 2026

This repository hosts a collection of datasets for training and evaluating CUA / GUI agents.

111 8 Updated Jul 27, 2025

An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone

Python 24,649 3,885 Updated Mar 6, 2026
Dockerfile 38 12 Updated Mar 26, 2026

All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

Python 3,816 308 Updated Mar 28, 2026

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Python 4,306 443 Updated Feb 1, 2026

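**DeepL with no API key and no service to start**: double-click to use, free and unlimited (**new: DeepL word lookup**). A bobplugin developed by reverse-engineering the JavaScript encryption algorithm of the web version, so as long as the official site's algorithm does not change, it can in theory be used without limit. (Major update!!! As thanks to long-time users, it is now optimized so that free translation keeps working even after frequent requests!!) **apiKey is not required, no account password required**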

JavaScript 613 41 Updated Aug 30, 2024

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,411 128 Updated Nov 9, 2025

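《代码随想录》 LeetCode problem-solving guide: a recommended order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video breakdowns of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages. No more feeling lost when learning algorithms! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀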

Shell 60,919 12,330 Updated Mar 11, 2026

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 85,078 9,855 Updated Mar 29, 2026
Python 24 20 Updated Oct 12, 2025

Latest Advances on System-2 Reasoning

Python 1,341 76 Updated Jun 8, 2025

Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models

1,050 41 Updated Mar 15, 2026

Official Repo for Open-Reasoner-Zero

Python 2,088 119 Updated Jun 2, 2025

This is a suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.

Jupyter Notebook 456 82 Updated Feb 13, 2024

Reproduce R1 Zero on Logic Puzzle

Python 2,443 163 Updated Mar 20, 2025

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 9,270 909 Updated Mar 30, 2026

Simple RL training for reasoning

Python 3,844 289 Updated Dec 23, 2025
3 Updated Jan 24, 2025

Machine-generated text detection in the wild (ACL 2024)

Python 224 13 Updated Mar 6, 2025

Solutions of Reinforcement Learning, An Introduction

Jupyter Notebook 2,398 511 Updated Jul 10, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…

Python 13,425 1,310 Updated Mar 30, 2026

Scalable RL solution for advanced reasoning of language models

Python 1,835 107 Updated Mar 18, 2025
Python 553 65 Updated Jan 2, 2025
JavaScript 86 9 Updated Dec 11, 2025

Building Open LLM Web Agents with Self-Evolving Online Curriculum RL

Python 515 35 Updated Jun 6, 2025

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

77,569 8,972 Updated Feb 5, 2026

Towards Large Multimodal Models as Visual Foundation Agents

Python 259 10 Updated Apr 24, 2025
Python 311 22 Updated Aug 18, 2025

An LLM-based Web Navigating Agent (KDD'24)

Python 934 84 Updated Sep 27, 2024