dqwang122
  • CMU
  • Pennsylvania, USA

Python 95 7 Updated Apr 6, 2026

Democratizing Reinforcement Learning for LLMs

Python 5,618 578 Updated Jun 15, 2026

Open-source, community-driven agent harness

Rust 38,389 3,303 Updated Jun 15, 2026

A.S.E (AICGSecEval) is a repository-level AI-generated code security evaluation benchmark developed by Tencent Wukong Code Security Team.

Python 643 108 Updated May 25, 2026

TeleMem is a high-performance drop-in replacement for Mem0, featuring semantic deduplication, long-term dialogue memory, and multimodal video reasoning.

Python 466 32 Updated Jun 12, 2026

Inference and training library for high-quality TTS models.

Python 5,578 590 Updated Dec 10, 2024

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 1,585 253 Updated Apr 24, 2026

Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving

Python 339 55 Updated Dec 18, 2025

A curated collection of prompts for using Claude more effectively.

5,239 579 Updated Feb 28, 2026

(ACL 2025 Main) Code for MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.01935

Python 265 38 Updated Oct 27, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 1,261 59 Updated Aug 27, 2025

Official Repo for Open-Reasoner-Zero

Python 2,097 120 Updated Jun 2, 2025

This is the official implementation of Multi-Agent PPO (MAPPO).

Python 2,024 378 Updated Jul 18, 2024

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,307 520 Updated Jun 15, 2026

[ICASSP 2026] Agent4Debate is a dynamic multi-agent framework that leverages LLMs to achieve human-level performance in competitive debate by dynamically coordinating specialized agents to mitigate…

Python 38 6 Updated Jan 19, 2026

Scalable toolkit for efficient model reinforcement

Python 1,733 423 Updated Jun 15, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,984 4,079 Updated Jun 15, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

8,004 287 Updated May 15, 2025

Recipes to scale inference-time compute of open models

Python 1,133 132 Updated May 26, 2026

A Python-based processor, built on "long-form-factuality", for easily fact-checking anything.

Python 20 3 Updated Apr 1, 2024

A bibliography and survey of the papers surrounding o1

TeX 1,213 51 Updated Nov 16, 2024
Python 972 110 Updated Jan 23, 2025

Open source audio annotation tool for humans

TypeScript 1,139 143 Updated Feb 3, 2026

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

Python 2,830 599 Updated Jun 15, 2026

🙌 OpenHands: AI-Driven Development

Python 77,198 9,809 Updated Jun 15, 2026

A framework for few-shot evaluation of language models.

Python 12,969 3,340 Updated Jun 2, 2026

Awesome-LLM-Prompt-Optimization: a curated list of advanced prompt optimization and tuning methods in Large Language Models

411 22 Updated Jun 15, 2026

A collection of recent papers on building autonomous agents. Two topics are included: RL-based and LLM-based agents.

748 59 Updated Apr 23, 2026

Reference implementation for DPO (Direct Preference Optimization)

Python 2,887 235 Updated Aug 11, 2024

A library for advanced large language model reasoning

Python 2,342 203 Updated Jun 10, 2025