    Repositories list

    • tinyserve-vllm

      Public
      [ACM MM 2025 Oral] TinyServe: Query-Aware Page Allocation Optimization
      Python
      21003 Updated Dec 8, 2025
    • HSGM

      Public
      [ICPADS 2025 Oral, *SEM 2025 Oral] HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics
      Python
      0700 Updated Nov 23, 2025
    • [FPGA'26 Highlight] CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving
      C++
      1800 Updated Nov 23, 2025
    • SPI_VecDB

      Public
      Distributed Parallel Multi-Resolution Vector Search
      Go
      0800 Updated Nov 9, 2025
    • CogLoad

      Public
      Cognitive Load Traces
      Python
      0100 Updated Nov 3, 2025
    • NeuroSpec

      Public
      Grammar- and Resource-Aligned Certifiable Speculative Decoding
      Python
      0000 Updated Oct 31, 2025
    • CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference
      Python
      0800 Updated Oct 30, 2025
    • PiKV

      Public
      PiKV: KV Cache Management System for MoE [Efficient ML System]
      Python
      7400 Updated Oct 26, 2025
    • GraphSnapShot: Caching Local Structure for Fast Graph Learning [Efficient ML System]
      Python
      5200 Updated Sep 22, 2025
    • FastCache

      Public
      FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]
      Python
      295600 Updated Sep 22, 2025
    • SemToken

      Public
      [IWCS 2025 Oral] SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling
      Python
      0400 Updated Sep 21, 2025
    • QTM

      Public
      Python
      3000 Updated Sep 21, 2025