haiduo
💭 Studying
  • Xi'an Jiaotong University
  • Xi'an

Organizations

@xjtuiair-cag


Minimal and readable coding agent harness implementation in Python to explain the core components of coding agents.

Python 852 160 Updated Apr 7, 2026

SpecVLM: Fast Speculative Decoding in Vision-Language Models

7 Updated Oct 5, 2025
Python 1 Updated Dec 9, 2025
Python 51 3 Updated Jul 2, 2025

[NeurIPS 2025🔥:] EVODiff is an inference-time refinement method for diffusion models that improves sampling efficiency and generative fidelity by systematically reducing conditional entropy, withou…

Python 31 Updated Feb 2, 2026

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

706 25 Updated Apr 15, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 833 228 Updated Apr 2, 2026

青稞Talk

205 2 Updated Jan 21, 2026

The true story of the heartache and darkness behind the development of the Noah Pangu large model.

11,408 1,320 Updated Jul 9, 2025

This repository is the official implementation of "Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE" [ACL 2026 Main Accepted]

Python 38 4 Updated Oct 5, 2025

An open-source implementation for training LLaVA-NeXT.

Python 436 23 Updated Oct 23, 2024

"DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer" [NeurIPS 2025 Accepted]

Python 18 1 Updated May 22, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,801 2,766 Updated Aug 12, 2024

This repository is the official implementation of "KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters"

Python 8 Updated Apr 1, 2025

📰 Must-read papers and blogs on Speculative Decoding ⚡️

1,217 76 Updated May 11, 2026

OpenMMLab Foundational Library for Training Deep Learning Models

Python 1,473 447 Updated Dec 23, 2025

This repository is the official implementation of "Partial Channel Network: Compute Fewer, Perform Better". [AAAI 2026 Accepted]

Python 38 2 Updated Feb 11, 2025

A NumPy and PyTorch implementation of CKA similarity with CUDA support

Jupyter Notebook 96 13 Updated May 13, 2021

ILSVRC2012_devkit_t12.tar.gz

6 1 Updated Jun 28, 2020

A code reproduction of VITA.

Python 2 Updated Sep 19, 2022

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 1 Updated Nov 27, 2024

This repository is the official implementation of "Nearly Lossless Adaptive Bit Switching". [AAAI 2026 Accepted]

Python 9 Updated Feb 11, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,369 416 Updated Jan 17, 2026

[ICCV2023] Dataset Quantization

Python 262 19 Updated Jan 6, 2024

Implementation of Symmetric SNE and t-SNE in NumPy and Python

Python 75 19 Updated Jul 28, 2021

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 2,347 277 Updated Feb 20, 2026
Python 34 5 Updated Sep 21, 2019

EQ-Net [ICCV 2023]

Python 32 4 Updated Aug 15, 2023