- Carnegie Mellon University
- Pittsburgh
- https://jiarui-liu.github.io
- @Jiarui_Liu_
Stars
A framework bridging cognitive science and LLM reasoning research to diagnose and improve how large language models reason, based on analysis of 192K model traces and 54 human think-aloud traces.
This repository contains a regularly updated paper list for LLM reasoning in latent space.
Training Large Language Models to Reason in a Continuous Latent Space
A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blog posts, etc.
800,000 step-level correctness labels on LLM solutions to MATH problems
📖 This is a repository for organizing papers, code, and other resources related to Latent Reasoning.
Code for the paper: "Learning to Reason without External Rewards"
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
Environments for LLM Reinforcement Learning
LogicBench is a natural language question-answering dataset consisting of 25 different reasoning patterns spanning propositional, first-order, and non-monotonic logics.
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Official Repo for Open-Reasoner-Zero
✨✨Latest Advances on Multimodal Large Language Models
A comprehensive evaluation dataset encompassing multi-step logical reasoning with various inference rules and depths
Search-R1: an efficient, scalable RL training framework, based on veRL, for LLMs that interleave reasoning with search-engine calls
Repository for the paper "Seemingly Plausible Distractors in Multi-Hop Reasoning: Are Large Language Models Attentive Readers?"
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kim, J., Evans, J., & Schein, A. (2025). Linear Representations of Political Perspective Emerge in Large Language Models. ICLR.
A collection of awesome prompt and instruction datasets for training chat LLMs such as ChatGPT; it gathers a wide variety of instruction datasets used to train ChatLLM models.