
I am a Ph.D. student in TANKLAB at Tianjin University, advised by Wenyu Qu and Yitao Hu. My research interests include Machine Learning Systems, LLM inference serving, and Distributed Systems. I received my B.S. degree in computer science from Northwest A&F University.

GitHub: zhixin612

Email: zhao612@tju.edu.cn


📑 Publications

  1. PAT: Accelerating LLM Decoding via Prefix-Aware Attention with Resource Efficient Multi-Tile Kernel
    Jinjun Yi†, Zhixin Zhao†, Yitao Hu*, Ke Yan, Weiwei Sun, Hao Wang, Laiping Zhao, Yuhao Zhang, Wenxin Li, Keqiu Li.
    ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2026.

  2. SLOpt: Serving Real-Time Inference Pipeline with Strict Latency Constraint
    Zhixin Zhao, Yitao Hu*, Guotao Yang, Ziqi Gong, Chen Shen, Laiping Zhao, Wenxin Li, Xiulong Liu, and Wenyu Qu.
    IEEE Transactions on Computers (TC), 2025.

  3. Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting
    Zhixin Zhao, Yitao Hu*, Ziqi Gong, Guotao Yang, Wenxin Li, Xiulong Liu, Keqiu Li, and Hao Wang.
    IEEE International Conference on Computer Communications (INFOCOM), 2025.

  4. TightLLM: Maximizing Throughput for LLM Inference via Adaptive Offloading Policy
    Yitao Hu, Xiulong Liu*, Guotao Yang, Linxuan Li, Kai Zeng, Zhixin Zhao, Sheng Chen, Laiping Zhao, Wenxin Li, and Keqiu Li.
    IEEE Transactions on Computers (TC), 2025.

  5. SuperSpec: Enhanced Verification and Sampling for End-to-End LLM Speculative Decoding
    Chen Shen, Rui Guo, Yang Cheng, Yang Lin, Zhixin Zhao, Yitao Hu*, Sheng Chen, Xiulong Liu, and Keqiu Li.
    IEEE International Conference on High Performance Computing and Communications (HPCC), 2025.

  6. SmartCache: Two-Dimensional KV-Cache Similarity for Efficient Long-Context LLM Decoding
    Chen Shen, Hao Chen, Kaining Hui, Zhixin Zhao, Yang Cheng, Yitao Hu*, Sheng Chen, Xiulong Liu, and Keqiu Li.
    IEEE International Conference on High Performance Computing and Communications (HPCC), 2025.

  7. High-throughput Sampling, Communicating and Training for Reinforcement Learning Systems
    Laiping Zhao, Xinan Dai, Zhixin Zhao, Yusong Xin, Yitao Hu*, Jun Qian, Jun Yao, and Keqiu Li.
    IEEE/ACM International Symposium on Quality of Service (IWQoS), 2023.


👨‍🎓 Academic Services

  • 2025: ASPLOS - Artifact Evaluation Reviewer
  • 2023: ICA3PP - Reviewer

⭐ Main Awards

  • 2021: The 2021 ICPC Shaanxi National Invitational: Silver Medal
  • 2020: The 2020 ICPC Asia-East Continent Final: Bronze Medal
  • 2020: The 45th ICPC Asia Regional Contest Shanghai Site: Silver Medal

🎓 Honors

  • 2024: Academic Scholarships, Tianjin University
  • 2023: Distinguished Academic Scholarship, Tianjin University
  • 2022: Outstanding Graduate, Northwest A&F University
  • 2021: Presidential Scholarship, Northwest A&F University
  • 2020: National Encouragement Scholarship, Northwest A&F University
  • 2019: National Encouragement Scholarship, Northwest A&F University

🏃‍♂️ Hobbies

photography 📸 · ping-pong 🏓 · badminton 🏸 · ...
