
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment

Python 888 224 Updated Mar 24, 2026

[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 192 20 Updated Jan 12, 2026

(ICML'25 Outstanding) CollabLLM: From Passive Responders to Active Collaborators

Jupyter Notebook 287 33 Updated Sep 25, 2025

Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Python 264 11 Updated May 5, 2025

Training Proactive and Personalized LLM Agents

Python 104 9 Updated Jan 20, 2026

Source code for the collaborative reasoner research project at Meta FAIR.

Python 112 13 Updated Apr 17, 2025

[ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs

Python 44 Updated Mar 23, 2026

Post-training with Tinker

Python 2,972 359 Updated Mar 24, 2026

Korea Investment & Securities Open API Github

Python 1,139 618 Updated Mar 18, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,533 1,431 Updated Feb 27, 2026

Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…

Python 3,512 280 Updated Mar 12, 2026

Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

Python 2,355 275 Updated Oct 5, 2025

[COLM 2025] Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale

Python 126 7 Updated Mar 19, 2026
Python 433 47 Updated Oct 12, 2025

A research prototype of a human-centered web agent

Python 9,749 977 Updated Mar 21, 2026

NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.

Jupyter Notebook 6,497 1,084 Updated Mar 16, 2026

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 4,288 369 Updated Nov 13, 2025

Official implementation of Sparsified State-Space Models are Efficient Highway Networks (TMLR 2025).

2 Updated Mar 6, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 20,162 3,489 Updated Mar 24, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 74,147 14,707 Updated Mar 24, 2026

Official Repo for Open-Reasoner-Zero

Python 2,088 119 Updated Jun 2, 2025

A hackable, simple, and research-friendly GRPO training framework with high-speed weight synchronization in a multinode environment.

Python 37 4 Updated Aug 27, 2025

Fully open reproduction of DeepSeek-R1

Python 25,962 2,416 Updated Nov 24, 2025

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,567 2,147 Updated Sep 12, 2025

Democratizing Reinforcement Learning for LLMs

Python 5,276 523 Updated Mar 24, 2026

Pretraining and inference code for a large-scale depth-recurrent language model

Python 867 77 Updated Dec 29, 2025

s1: Simple test-time scaling

Python 6,650 765 Updated Jun 25, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 12,975 1,583 Updated Feb 27, 2026