
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment

Python 553 121 Updated Dec 18, 2025

[Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Python 161 16 Updated Nov 14, 2025

(ICML'25 Outstanding) CollabLLM: From Passive Responders to Active Collaborators

Jupyter Notebook 267 28 Updated Sep 25, 2025

Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Python 255 11 Updated May 5, 2025
Python 92 9 Updated Nov 6, 2025

Source code for the collaborative reasoner research project at Meta FAIR.

Python 111 13 Updated Apr 17, 2025

ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs

Python 20 Updated Oct 16, 2025

Post-training with Tinker

Python 2,583 249 Updated Dec 20, 2025

Korea Investment & Securities Open API Github

Python 935 475 Updated Nov 13, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,659 1,355 Updated Dec 17, 2025

Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing real-time visual data and consolidating it into structured mem…

Python 3,554 331 Updated Dec 20, 2025

Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs

Python 2,098 246 Updated Oct 5, 2025

[COLM 2025] Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale

Python 95 6 Updated Dec 7, 2025
Python 402 42 Updated Oct 12, 2025

A research prototype of a human-centered web agent

Python 9,229 939 Updated Dec 18, 2025

NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.

Jupyter Notebook 5,648 887 Updated Dec 18, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,674 309 Updated Nov 13, 2025

Official implementation of Sparsified State-Space Models are Efficient Highway Networks (TMLR 2025).

2 Updated Mar 6, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,639 2,856 Updated Dec 20, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 65,813 12,083 Updated Dec 20, 2025

Official Repo for Open-Reasoner-Zero

Python 2,083 119 Updated Jun 2, 2025

A hackable, simple, and research-friendly GRPO training framework with high-speed weight synchronization in a multinode environment.

Python 35 4 Updated Aug 27, 2025

Fully open reproduction of DeepSeek-R1

Python 25,740 2,405 Updated Nov 24, 2025

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 24,034 2,060 Updated Sep 12, 2025

Democratizing Reinforcement Learning for LLMs

Python 4,878 467 Updated Dec 18, 2025

Pretraining and inference code for a large-scale depth-recurrent language model

Python 856 73 Updated Oct 16, 2025

s1: Simple test-time scaling

Python 6,615 764 Updated Jun 25, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 12,498 1,530 Updated Apr 24, 2025