Xi'an Jiaotong University
- Xi'an
Stars
Minimal and readable coding agent harness implementation in Python to explain the core components of coding agents.
[NeurIPS 2025🔥:] EVODiff is an inference-time refinement method for diffusion models that improves sampling efficiency and generative fidelity by systematically reducing conditional entropy, withou…
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
This repository is the official implementation of "Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE" [ACL 2026 Main Accepted]
An open-source implementation for training LLaVA-NeXT.
"DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer" [NeurIPS 2025 Accepted]
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
This repository is the official implementation of "KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters"
📰 Must-read papers and blogs on Speculative Decoding ⚡️
OpenMMLab Foundational Library for Training Deep Learning Models
This repository is the official implementation of "Partial Channel Network: Compute Fewer, Perform Better". [AAAI 2026 Accepted]
A NumPy and PyTorch implementation of CKA similarity with CUDA support
haiduo / transformers (forked from huggingface/transformers)
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
This repository is the official implementation of "Nearly Lossless Adaptive Bit Switching". [AAAI 2026 Accepted]
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
[ICCV2023] Dataset Quantization
Implementation of symmetric SNE and t-SNE in NumPy and Python
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).