- Johns Hopkins University
- Baltimore, MD
- https://zichengxu.github.io/
- in/zicheng-xu
Stars
Decoding Tree Sketching migrated to vLLM backend - retroactive branching with PagedAttention and automatic prefix caching
RouterArena: An open framework for evaluating LLM routers with standardized datasets, metrics, an automated framework, and a live leaderboard.
[ICML 2026] Decoding Tree Sketching (DTS): a training-free, model-agnostic, plug-in framework for LLM parallel reasoning.
[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient
[COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models"
Summer 2026 software engineering, data science, AI, quant, product management, and hardware internship postings. Updated daily by Simplify and Pitt CSC.