τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Python 1,035 260 Updated Apr 16, 2026

[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 202 21 Updated Apr 7, 2026

(ICML'25 Outstanding) CollabLLM: From Passive Responders to Active Collaborators

Jupyter Notebook 291 33 Updated Sep 25, 2025

Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Python 266 12 Updated May 5, 2025

Training Proactive and Personalized LLM Agents

Python 107 10 Updated Jan 20, 2026

Source code for the collaborative reasoner research project at Meta FAIR.

Python 113 13 Updated Mar 26, 2026

[ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs

Python 45 1 Updated Mar 27, 2026

Post-training with Tinker

Python 3,092 383 Updated Apr 16, 2026

Korea Investment & Securities Open API Github

Python 1,262 678 Updated Mar 18, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,667 1,438 Updated Feb 27, 2026

Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…

Python 3,514 279 Updated Apr 14, 2026

Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

Python 2,398 282 Updated Oct 5, 2025

[COLM 2025] Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale

Python 132 7 Updated Mar 19, 2026
Python 440 47 Updated Oct 12, 2025

A research prototype of a human-centered web agent

Python 9,774 973 Updated Apr 15, 2026

NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.

Python 6,672 1,124 Updated Apr 15, 2026

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 4,489 390 Updated Nov 13, 2025

Official implementation of Sparsified State-Space Models are Efficient Highway Networks (TMLR 2025).

2 Updated Mar 6, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,739 3,671 Updated Apr 16, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 76,959 15,705 Updated Apr 16, 2026

Official Repo for Open-Reasoner-Zero

Python 2,091 119 Updated Jun 2, 2025

A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.

Python 37 4 Updated Aug 27, 2025

Fully open reproduction of DeepSeek-R1

Python 25,991 2,414 Updated Apr 2, 2026

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,654 2,159 Updated Apr 13, 2026

Democratizing Reinforcement Learning for LLMs

Python 5,437 542 Updated Apr 16, 2026

Pretraining and inference code for a large-scale depth-recurrent language model

Python 872 78 Updated Dec 29, 2025

s1: Simple test-time scaling

Python 6,643 762 Updated Jun 25, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 13,053 1,582 Updated Feb 27, 2026