Tsinghua University
Beijing, China
https://hbx-hbx.github.io/
@hbx_hbx
Stars
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
[ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
The official code repository for the paper "CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents"
The official repository for the dataset FactualBench, which is introduced in paper "Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization".
My learning notes for ML SYS.
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving a 3x+ generation speedup on reasoning tasks
A Survey of Reinforcement Learning for Large Reasoning Models
Scalable RL solution for advanced reasoning of language models
The repository for the paper "EscapeBench: Pushing Language Models to Think Outside the Box"
A large-scale, fine-grained, diverse preference dataset (and models).
A bibliography and survey of the papers surrounding o1
✨✨Latest Advances on Multimodal Large Language Models
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Code for the paper "The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning"
Repo for the paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents"
The paper list of the 86-page SCIS cover paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Can large language models provide useful feedback on research papers? A large-scale empirical analysis.
Chrome Extensions Samples
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
"You Call This Junk an Operating System Source Code?": reading the core Linux 0.11 code like a novel
What? You dare trust AI to watch your back? I bet you don't, so come learn the safest language of the AI era (Python can't compete!). With comprehensive, in-depth explanations, vivid and apt examples, and silky-smooth content, this may be the most carefully crafted Chinese-language Rust tutorial / Book
A programmer's guide to cooking at home (Simplified Chinese only).