Starred repositories
The fastest BM25 scoring engine: 2,300x faster than BM25S. 28K QPS on 8.8M docs. 5 BM25 variants (Robertson, Lucene, ATIRE, BM25L, BM25+). Memory-mapped persistence, BMW pruning, streaming indexing…
Sparton: Fast and Memory-Efficient Triton Kernel for Learned Sparse Retrieval
Async-friendly WebTransport implementation in Rust
Opencli-rs is a blazing fast, memory-safe command-line tool that fetches information from any website with a single command. Covers Twitter/X, Reddit, YouTube, HackerNews, Bilibili, Zhihu, Xiaohongshu, …
KV cache store for distributed LLM inference
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
🍡 50x faster tokenization for every HuggingFace model
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
FastAPI-compatible Python framework with Zig HTTP core; 7x faster, free-threading native
Per-collection OCR leaderboards using VLM-as-judge
Get clean data from tricky documents, powered by vision-language models ⚡
A high-quality PDF to Markdown tool based on large language model visual recognition
Sparse Embedding Compression for Scalable Retrieval in Recommender Systems
An efficient implementation of the NSA (Native Sparse Attention) kernel
Bayesian probability transforms for BM25 retrieval scores
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
Accelerating MoE with IO and Tile-aware Optimizations
A high-performance and light-weight router for vLLM large scale deployment
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
Code for paper: [ICLR 2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
[NeurIPS 2025] Official Implementation of ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding.