Lean 3 Updated Jun 15, 2026

SSRL: Self-Search Reinforcement Learning

Python 209 13 Updated Aug 20, 2025

Towards a Unified View of Large Language Model Post-Training

Python 210 11 Updated Sep 8, 2025

The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.

Python 442 15 Updated Jul 11, 2025

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python 680 43 Updated May 30, 2026

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 1,085 83 Updated Apr 15, 2026

[ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Python 279 13 Updated Apr 18, 2026

The official code repository for the paper "CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents"

Python 31 Updated Jun 14, 2026

Awesome Open-ended AI

447 45 Updated May 17, 2026

The official repository for the dataset FactualBench, which is introduced in paper "Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization".

Python 3 Updated Dec 30, 2025

My learning notes for ML SYS.

Python 6,540 445 Updated Jun 18, 2026

MiniCPM5-1B: A SOTA 1B on-device LLM, small yet powerful.

Jupyter Notebook 9,469 621 Updated Jun 18, 2026

A Survey of Reinforcement Learning for Large Reasoning Models

TeX 2,466 131 Updated Nov 9, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,863 112 Updated Mar 18, 2025

This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box

Python 18 1 Updated Dec 19, 2024

A large-scale, fine-grained, diverse preference dataset (and models).

Python 368 17 Updated Dec 29, 2023

A bibliography and survey of the papers surrounding o1

TeX 1,213 51 Updated Nov 16, 2024

✨✨Latest Advances on Multimodal Large Language Models

17,898 1,128 Updated Jun 18, 2026

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,896 370 Updated Dec 17, 2025

Code for the paper "The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning"

Python 5 Updated Apr 8, 2025
Python 1,343 54 Updated Nov 21, 2024

Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents"

Python 66 9 Updated Feb 20, 2024

The paper list of the 86-page SCIS cover paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

8,151 494 Updated Sep 12, 2025

LaTeX Thesis Template for Tsinghua University

TeX 5,391 1,163 Updated Jun 17, 2026

Can large language models provide useful feedback on research papers? A large-scale empirical analysis.

Python 533 53 Updated Jan 11, 2024

Linux kernel source code analysis

1,640 357 Updated Sep 5, 2023

Chrome Extensions Samples

JavaScript 17,608 9,015 Updated Jun 16, 2026

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

Python 209 27 Updated Apr 10, 2023