Stars
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization
A list of works on video generation towards world model
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Productive, portable, and performant GPU programming in Python.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
UCLA-VAST / Callipepla
Forked from linghaosong/Callipepla
Large-scale sparse Conjugate Gradient (CG) solvers on High Bandwidth Memory (HBM) FPGAs
Implementation of the conjugate gradient method using C and NVIDIA CUDA
Design preconditioners with a CNN to accelerate the conjugate gradient method.
GPU-accelerated linear solvers based on the conjugate gradient (CG) method, supporting NVIDIA and AMD GPUs with GPU-aware MPI, NCCL, RCCL or NVSHMEM
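AIInfra (AI infrastructure) covers the AI system stack, from low-level hardware such as chips up to the upper-layer software stack, in support of training and inference for large AI models.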
[OSDI 2025] DecDEC: A Systems Approach to Advancing Low‑Bit LLM Quantization
[VLDB 25] Maximum Inner Product is Query-Scaled Nearest Neighbor
A vector indexing library to bring fast, fresh and filtered search to your database
Navigating Spreading-out Graph For Approximate Nearest Neighbor Search
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
zhaijiaqi / MediaCrawler
Forked from NanmiCoder/MediaCrawler
Xiaohongshu note and comment crawler, Douyin video and comment crawler, Kuaishou video and comment crawler, Bilibili video and comment crawler, Weibo post and comment crawler, Baidu Tieba post and comment-reply crawler, Zhihu Q&A article and comment crawler