- Seattle, WA
Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Rapid fuzzy string matching in Python using various string metrics
Access a database of word frequencies, in various natural languages.
🚀 Efficient implementations for emerging model architectures
High performance and CommonMark compliant HTML to Markdown converter. Maintained by the Kreuzberg team. Kreuzberg is a fast, polyglot document intelligence engine with a Rust core. It extracts stru…
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
High-performance In-browser LLM Inference Engine
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Next-generation Punkt sentence boundary detection with zero dependencies
OCR & Document Extraction using vision models
OLMost every training recipe you need to perform data interventions with the OLMo family of models.
A pipeline for performing OCR on historical newspapers
LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator.
A computer algebra system written in pure Python
Toolkit for linearizing PDFs for LLM datasets/training
📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools lik…
Synthetic data curation for post-training and structured data extraction
Code at the speed of thought – Zed is a high-performance, multiplayer code editor from the creators of Atom and Tree-sitter.
SGLang is a high-performance serving framework for large language models and multimodal models.
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.