LuletterSoul
🎯 Focusing

Starred repositories

NVFP4 Flash-Attention 4 on Blackwell

Python 13 1 Updated Jun 13, 2026

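Sharing AI Infra knowledge and code exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, large-model fundamentals 🧠, AI software and hardware 🔧, and more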

Jupyter Notebook 2,578 233 Updated May 30, 2026

AI agents running research on single-GPU nanochat training automatically

Python 86,570 12,539 Updated Mar 26, 2026

Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"

Python 1,040 131 Updated May 30, 2026

TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration

Python 1,555 181 Updated Mar 27, 2026

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of…

Python 71,120 9,634 Updated Jun 13, 2026

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 993 1,116 Updated Jul 4, 2024

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,910 217 Updated May 26, 2026
Python 26 1 Updated Jan 20, 2026

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 3,392 545 Updated Jun 12, 2026
Python 49 4 Updated May 9, 2026

Strong and Open Vision Language Assistant for Mobile Devices

Python 1,359 88 Updated Apr 15, 2024

My learning notes for ML SYS.

Python 6,518 442 Updated Jun 8, 2026

[IEEE TCSVT'26] 🂡 AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation

150 5 Updated Apr 6, 2026

This repository contains low-bit quantization papers from top conferences, 2020 to 2025.

171 5 Updated Apr 29, 2026

An official implementation of "Scheduling Weight Transitions for Quantization-Aware Training" (ICCV 2025) in PyTorch.

Python 60 13 Updated Nov 17, 2025

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 125,267 14,002 Updated Jun 13, 2026

[Information Fusion 2025] A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective

612 36 Updated Jun 8, 2026

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Python 14,353 1,859 Updated Jul 3, 2024

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 386 12 Updated Oct 5, 2025

Efficient vision foundation models for high-resolution generation and perception.

Python 3,321 249 Updated Sep 5, 2025

EfficientSAM3 compresses SAM3 into lightweight, edge-friendly models via progressive knowledge distillation for fast promptable concept segmentation and tracking.

Jupyter Notebook 604 47 Updated Jun 12, 2026

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 10,543 1,584 Updated May 23, 2026
Python 115 20 Updated Feb 26, 2026

A curated list of foundation models for vision and language tasks

1,164 60 Updated Apr 20, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 888 252 Updated Jun 13, 2026

PyTorch implementation of RAPQ, IJCAI 2022

Python 23 3 Updated Jul 19, 2023

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 11,247 1,149 Updated May 29, 2026

🎉CUDA notes / collection of frequently asked interview questions / C++ notes; personal notes, updated sporadically: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Cuda 6 Updated Feb 26, 2026