chiakicage
🦀
rusting
  • Zhejiang University

11 stars written in Jupyter Notebook

A guidance language for controlling large language models.

Jupyter Notebook 21,254 1,145 Updated Feb 4, 2026

An open-source, low-code machine learning library in Python

Jupyter Notebook 9,685 1,856 Updated Apr 21, 2025

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,699 193 Updated Jun 25, 2024

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 945 46 Updated Oct 29, 2025

Disaggregated serving system for Large Language Models (LLMs).

Jupyter Notebook 772 85 Updated Apr 6, 2025

FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN)

Jupyter Notebook 467 62 Updated Jun 20, 2024
Jupyter Notebook 189 30 Updated Jun 16, 2024
Jupyter Notebook 130 14 Updated Nov 11, 2024

Compare different hardware platforms via the Roofline Model for LLM inference tasks.

Jupyter Notebook 120 5 Updated Mar 13, 2024

[ICLR 2025] DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference

Jupyter Notebook 48 2 Updated Jun 17, 2025

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an LLM (with low latency overhead!)

Jupyter Notebook 46 8 Updated Jun 1, 2024