Stars
Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps
A tool for bandwidth measurements on NVIDIA GPUs.
NVIDIA Linux open GPU with P2P support
Patches to enable PCIe resizable BARs in the Linux NVIDIA kernel driver
🚀 AI-powered Hacker News Chinese translation with smart categorization and real-time summaries. Powered by Doubao 1.6
ademeure / DeeperGEMM
Forked from deepseek-ai/DeepGEMM
DeeperGEMM: crazy optimized version
[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
CalebDu / ABQ-LLM
Forked from bytedance/ABQ-LLM
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
Machinery is an asynchronous task queue/job queue based on distributed message passing.
FlashMLA: Efficient Multi-head Latent Attention Kernels
Simple, reliable, and efficient distributed task queue in Go
Efficient Triton Kernels for LLM Training
FlagGems is an operator library for large language models implemented in the Triton Language.
An innovative superfamily of fonts for code
Display PDFs in your React app as easily as if they were images.
#1 Locally hosted web application that allows you to perform various operations on PDF files
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
unimpaired.vim: Pairs of handy bracket mappings
Awesome LLM compression research papers and tools.
A web extension to save, manage and restore sessions, windows and tabs.
[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.