Starred repositories
程序员延寿指南 | A programmer's guide to live longer
Accelerating MoE with IO and Tile-aware Optimizations
[SIGMOD2026] Reveal Hidden Pitfalls and Navigate Next Generation of Vector Similarity Search with Task-Centric Benchmarks
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
SigNoz is an open-source observability platform native to OpenTelemetry with logs, traces and metrics in a single application. An open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open sour…
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
Adamas: Hadamard Sparse Attention for Efficient Long-context Inference
DS SERVE: The Largest Open Vector Store over Pretraining Data; A Framework for Efficient and Scalable Neural Retrieval
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
Ada-ef — Adaptive efSearch for HNSW-based vector search
Train a model on the huchenfeng dataset
The ultimate training toolkit for finetuning diffusion models
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
A framework for efficient model inference with omni-modality models
Helpful kernel tutorials and examples for tile-based GPU programming
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
A comprehensive guide for beginners in the field of data management and artificial intelligence.
Code for the paper “Four Over Six: More Accurate NVFP4 Quantization with Adaptive Block Scaling”
Official inference repo for FLUX.2 models
Classic papers and resources on recommendation