Stars
A general 2-8 bit quantization toolbox supporting GPTQ/AWQ/HQQ/VPTQ, with easy export to ONNX/ONNX Runtime.
Project page for Neural Shell Texture Splatting (ICCV 2025)
Fast and memory-efficient exact kmeans
Official repository for SIGGRAPH Asia 2025 Conference paper Light-SQ.
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
[ICCV 2025] Official repository of the paper "BlinkTrack: Feature Tracking over 80 FPS via Events and Images". This repository contains the implementation of the BlinkTrack method and the MultiTrac…
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Official inference framework for 1-bit LLMs
A lightweight point-based visualization tool used for inspecting Gaussian data, designing camera motion, and exporting setups for external Gaussian renderers.
Just wanna see what type and how many GPUs/TPUs are used in CVPR 2025 oral papers. Fun vibe coding with LLMs.
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Blender addon for remote debugging Blender with VS Code (and Visual Studio)
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
The provided code is a Python script that uses the CuPy library to perform optimized GPU operations, specifically matrix multiplication. The script includes a custom CUDA kernel that is optimized f…
Fast CUDA matrix multiplication from scratch
[TVCG 2025] SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality
Code for "Multi-View Neural 3D Reconstruction of Micro-/Nanostructures with Atomic Force Microscopy"
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
[SIGGRAPH 2024] Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning