Drop-in prompt compression for production LLM apps. Cut your token bill 40-60% without changing your code. Python SDK, LLMLingua-2, MIT.
Updated Jun 8, 2026 - Python
JavaScript/TypeScript implementation of LLMLingua-2 (Experimental)
Python command-line tool for interacting with AI models through the OpenRouter API/Cloudflare AI Gateway, or a local self-hosted Ollama instance. Optionally supports Microsoft LLMLingua prompt token compression.
TokenPack packs long documents, codebases, PDFs, and folders into compact, evidence-dense LLM context using local embeddings, evidence scoring, and budget-aware selection.
A Claude Code skill that shrinks massive prompts and files using LLMLingua to save tokens.
LLMLingua-2 prompt compression hook for Claude Code — cut token usage by ~55%
Instant text, video & audio summaries on iOS Flutter client · Python (llmlingua) & Node.js (Whisper) services · MongoDB · RevenueCat IAP
Hybrid prompt compression toolkit for LLM workflows
Self-hosted HTTP sidecar for LLM context compression. Reduce token costs 3–5× before calling any AI API — powered by LLMLingua-2 and MarkItDown. No proxy, no API keys, no GPU required.