τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Python 1,097 279 Updated Apr 30, 2026

[ICML 2026] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 205 21 Updated Apr 30, 2026

(ICML'25 Outstanding) CollabLLM: From Passive Responders to Active Collaborators

Jupyter Notebook 295 33 Updated Sep 25, 2025

Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Python 266 12 Updated May 5, 2025

Training Proactive and Personalized LLM Agents

Python 110 10 Updated Jan 20, 2026

Source code for the collaborative reasoner research project at Meta FAIR.

Python 113 13 Updated Mar 26, 2026

[ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs

Python 45 1 Updated Mar 27, 2026

Post-training with Tinker

Python 3,189 404 Updated Apr 30, 2026

Korea Investment & Securities Open API Github

Python 1,320 702 Updated Mar 18, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,778 1,447 Updated Feb 27, 2026

Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…

Python 3,528 281 Updated Apr 28, 2026

Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

Python 2,411 285 Updated Oct 5, 2025

[COLM 2025] Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale

Python 136 8 Updated Mar 19, 2026
Python 442 47 Updated Oct 12, 2025

A research prototype of a human-centered web agent

Python 9,802 976 Updated Apr 15, 2026

NVIDIA Isaac GR00T N1.7 - A Foundation Model for Generalist Robots.

Python 6,905 1,167 Updated Apr 26, 2026

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 4,612 411 Updated Nov 13, 2025

Official implementation of Sparsified State-Space Models are Efficient Highway Networks (TMLR 2025).

3 Updated Mar 6, 2025

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,034 3,782 Updated Apr 30, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 78,704 16,302 Updated Apr 30, 2026

Official Repo for Open-Reasoner-Zero

Python 2,093 119 Updated Jun 2, 2025

A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.

Python 37 4 Updated Aug 27, 2025

Fully open reproduction of DeepSeek-R1

Python 26,012 2,421 Updated Apr 2, 2026

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,707 2,163 Updated Apr 13, 2026

Democratizing Reinforcement Learning for LLMs

Python 5,463 547 Updated Apr 30, 2026

Pretraining and inference code for a large-scale depth-recurrent language model

Python 879 80 Updated Dec 29, 2025

s1: Simple test-time scaling

Python 6,650 761 Updated Jun 25, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 13,078 1,588 Updated Feb 27, 2026