HeegyuKim
  • Seoul, Korea


Agents' Last Exam

Python 701 29 Updated Jun 20, 2026

An in-the-wild benchmark for AI agents in the OpenClaw Environment.

Python 445 44 Updated May 19, 2026

🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agent, ACL'24 Best Resource Paper.

Python 444 69 Updated Feb 17, 2026

[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

Python 556 101 Updated Sep 6, 2024

Agentic RL Training at Scale

Python 1,494 314 Updated Jun 20, 2026

Harbor is a framework for running agent evaluations and creating and using RL environments.

Python 2,588 1,179 Updated Jun 20, 2026

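A collection of skills for Koreans: SRT, KTX, KakaoTalk, Hancom (Hangul and Computer), weather, fine dust, laws and regulations, stock information, the Annals of the Joseon Dynasty, KBO, K League, LCK, patent search, Toss Securities, spell checking, used car prices, Coupang, Naver Blog, Daiso, Olive Young, parcel tracking, and more.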

JavaScript 5,749 648 Updated Jun 19, 2026

A skill that polishes text so it does not read as if it were written by AI.

Python 3,138 314 Updated Jun 9, 2026

SkillsBench evaluates how well skills work and how effective agents are at using them.

PDDL 1,376 318 Updated Jun 20, 2026

Real-time global intelligence dashboard. AI-powered news aggregation, geopolitical monitoring, and infrastructure tracking in a unified situational awareness interface

TypeScript 57,583 9,155 Updated Jun 20, 2026

Vero: An Open RL Recipe for General Visual Reasoning

Python 125 11 Updated Jun 19, 2026

Claw-Eval is a harness for evaluating LLMs as agents. All tasks are verified by humans.

Python 676 59 Updated May 17, 2026

Browser automation CLI for AI agents

Rust 36,531 2,318 Updated Jun 16, 2026

Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

Rust 749 54 Updated Jun 19, 2026

AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI

TypeScript 64,184 7,812 Updated Jun 19, 2026

[NeurIPS 2025] The official implementation of "KL Penalty Control via Perturbation for Direct Preference Optimization"

Python 6 1 Updated Nov 26, 2025

Zero Bubble Pipeline Parallelism

Python 460 33 Updated May 7, 2025

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Python 786 79 Updated Jun 10, 2026

TBD

Python 62 1 Updated Mar 13, 2026

Chrome & Firefox extension to chat with webpages using local LLMs

JavaScript 130 15 Updated Dec 20, 2024

Convert Word documents to beautiful Markdown. Via command line or in your browser.

TypeScript 219 28 Updated May 12, 2026

Training library for Megatron-based models with bidirectional Hugging Face conversion capability

Python 734 370 Updated Jun 20, 2026

An open-source RAG-based tool for chatting with your documents.

Python 25,476 2,124 Updated Jun 9, 2026

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,447 120 Updated Apr 17, 2026

Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

Python 1,293 372 Updated Jun 15, 2026

SRTgo: K-Train (KTX, SRT) Reservation Assistant

Python 272 142 Updated Sep 24, 2025

Verifiers for LLM Reinforcement Learning

Python 80 12 Updated Apr 15, 2025

Automatic, unsupervised collection of web agent training data via exploration.

Python 29 4 Updated Oct 8, 2025
Python 26 5 Updated Mar 4, 2026