Build better LLM apps — faster, smarter, production-ready.
A curated list of 100+ libraries and frameworks for AI engineers building with Large Language Models. This toolkit includes battle-tested tools, frameworks, templates, and reference implementations for developing, deploying, and optimizing LLM-powered systems.
Tool | Description | Language | License |
---|---|---|---|
Pinecone | Managed vector database for production AI applications | API/SDK | Commercial |
Weaviate | Open-source vector database with GraphQL API | Go | BSD-3 |
Qdrant | Vector similarity search engine with extended filtering | Rust | Apache-2.0 |
Chroma | Open-source embedding database for LLM apps | Python | Apache-2.0 |
Milvus | Cloud-native vector database for scalable similarity search | Go/C++ | Apache-2.0 |
FAISS | Library for efficient similarity search and clustering | C++/Python | MIT |
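
The tools above all answer the same basic question: given a query vector, which stored vectors are most similar? As a rough illustration only, the sketch below does a brute-force cosine-similarity search in plain Python. It does not use the API of any listed tool; the function names (`cosine_similarity`, `top_k`) and the toy three-dimensional "embeddings" are assumptions made up for this example. Production systems typically replace the linear scan with approximate nearest-neighbor indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical values, for illustration only).
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], index, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

A linear scan like this is fine for a few thousand vectors; the libraries in the table become worthwhile when the collection is large or needs filtering, persistence, or horizontal scaling.
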
Tool | Description | Language | License |
---|---|---|---|
LangChain | Framework for developing LLM applications | Python/JS | MIT |
LlamaIndex | Data framework for LLM applications | Python | MIT |
Haystack | End-to-end NLP framework for production | Python | Apache-2.0 |
DSPy | Framework for algorithmically optimizing LM prompts | Python | MIT |
Semantic Kernel | SDK for integrating AI into conventional programming languages | C#/Python/Java | MIT |
Langflow | Visual no-code platform for building and deploying LLM workflows | Python/TypeScript | MIT |
Flowise | Drag-and-drop UI for creating LLM chains and agents | TypeScript | MIT |
Promptflow | Workflow orchestration for LLM pipelines, evaluation, and deployment | Python | MIT |
Tool | Description | Language | License |
---|---|---|---|
Docling | AI-powered toolkit converting PDF, DOCX, PPTX, HTML, images into structured JSON/Markdown with layout, OCR, table, and code recognition | Python | MIT |
pdfplumber | Drill through PDFs at a character level, extract text & tables, and visually debug extraction | Python | MIT |
PyMuPDF (fitz) | Lightweight, high-performance PDF parser for text/image extraction and manipulation | Python / C | AGPL-3.0 |
PDF.js | Browser-based PDF renderer with text extraction capabilities | JavaScript | Apache-2.0 |
Camelot | Extracts structured tabular data from PDFs into DataFrames and CSVs | Python | MIT |
Llama Parse | Structured parsing of PDFs and documents optimized for LLMs | Python | Apache-2.0 |
MegaParse | Universal parser for PDFs, HTML, and semi-structured documents | Python | Apache-2.0 |
ExtractThinker | Intelligent document extraction framework with schema mapping | Python | MIT |
PyMuPDF4LLM | Wrapper around PyMuPDF for LLM-ready text, tables, and image extraction | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
RAGFlow | Open-source RAG engine based on deep document understanding | Python | Apache-2.0 |
Verba | Retrieval Augmented Generation (RAG) chatbot | Python | BSD-3 |
PrivateGPT | Interact with documents using local LLMs | Python | Apache-2.0 |
AnythingLLM | All-in-one AI application for any LLM | JavaScript | MIT |
Quivr | Your GenAI second brain | Python/TypeScript | Apache-2.0 |
Jina | Cloud-native neural search framework for multimodal RAG | Python | Apache-2.0 |
txtai | All-in-one embeddings database for semantic search and workflows | Python | Apache-2.0 |
FastGraph RAG | Graph-based RAG framework for structured retrieval | Python | MIT |
Chonkie | Chunking utility for efficient document processing in RAG | Python | - |
SQLite-Vec | Vector search extension for SQLite, useful in lightweight RAG setups | C/Python | MIT |
FlashRAG | Low-latency RAG research toolkit with modular design and benchmarks | Python | - |
Llmware | Lightweight framework for building RAG-based apps | Python | Apache-2.0 |
Vectara | Managed RAG platform with APIs for retrieval and generation | Python/Go | Commercial |
GPTCache | Semantic cache for LLM responses to accelerate RAG pipelines | Python | Apache-2.0 |
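
Most RAG pipelines begin by splitting documents into overlapping chunks before embedding them. The sketch below shows one simple way to do that with fixed-size word windows. It is a minimal illustration, not the method used by any tool in the table; the function name `chunk_words` and its parameters `chunk_size` and `overlap` are hypothetical.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_words("a b c d e f g h", chunk_size=5, overlap=2)
print(chunks)
```

The overlap keeps sentences that straddle a boundary retrievable from either neighboring chunk, at the cost of storing some text twice. Dedicated chunking libraries usually split on tokens, sentences, or semantic boundaries rather than raw word counts.
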
Tool | Description | Language | License |
---|---|---|---|
Ragas | Evaluation framework for RAG pipelines | Python | Apache-2.0 |
LangSmith | Platform for debugging, testing, and monitoring LLM applications | API/SDK | Commercial |
Phoenix | ML observability for LLM, vision, language, and tabular models | Python | Apache-2.0 |
DeepEval | LLM evaluation framework for unit testing LLM outputs | Python | Apache-2.0 |
TruLens | Evaluation and tracking for LLM experiments | Python | MIT |
Inspect | Framework for large language model evaluations | Python | Apache-2.0 |
UpTrain | Open-source tool to evaluate and improve LLM applications | Python | Apache-2.0 |
Weave | Experiment tracking, debugging, and logging for LLM workflows | Python | Apache-2.0 |
Giskard | Open-source testing framework for ML/LLM applications | Python | Apache-2.0 |
Lighteval | Lightweight and fast evaluation framework from Hugging Face | Python | Apache-2.0 |
LangTest | NLP/LLM test suite for robustness, bias, and quality | Python | Apache-2.0 |
PromptBench | Benchmarking framework for evaluating prompts | Python | MIT |
EvalPlus | Advanced evaluation framework for code generation models | Python | Apache-2.0 |
FastChat | Framework for chat-based LLM benchmarking and evaluation | Python | Apache-2.0 |
judges | Human + AI judging framework for LLM evaluation | Python | Apache-2.0 |
Evals | OpenAI's framework for creating and running LLM evaluations | Python | MIT |
AgentEvals | Evaluation framework for autonomous AI agents | Python | Apache-2.0 |
UQLM | Unified framework for evaluating quality of LLMs | Python | Apache-2.0 |
LLMBox | Toolkit for evaluation + training of LLMs | Python | Apache-2.0 |
Opik | DevOps platform for evaluation, monitoring, and observability | Python | Apache-2.0 |
PydanticAI Evals | Built-in evaluation utilities for PydanticAI agents | Python | MIT |
LLM Transparency Tool | Framework for probing and evaluating LLM transparency | Python | Apache-2.0 |
AnnotateAI | Annotation and evaluation framework for LLM datasets | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
Hugging Face Hub | Client library for Hugging Face Hub | Python | Apache-2.0 |
MLflow | Platform for ML lifecycle management | Python | Apache-2.0 |
Weights & Biases | Developer tools for ML | Python | MIT |
DVC | Data version control for ML projects | Python | Apache-2.0 |
Comet ML | Experiment tracking and visualization for ML/LLM workflows | Python | MIT |
ClearML | End-to-end MLOps platform with LLM support | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
Firecrawl | AI-powered web crawler that extracts and structures content for LLM pipelines | TypeScript | MIT |
Scrapy | Fast, high-level web crawling & scraping framework | Python | BSD-3 |
Playwright | Web automation & scraping with headless browsers | TypeScript/Python/Java/.NET | Apache-2.0 |
BeautifulSoup | Easy HTML/XML parsing for quick scraping tasks | Python | MIT |
Selenium | Browser automation framework (supports scraping) | Multiple | Apache-2.0 |
Apify SDK | Web scraping & automation platform SDK | Python/JavaScript | Apache-2.0 |
Newspaper3k | News & article extraction library | Python | MIT |
Data Prep Kit | Toolkit for cleaning, transforming, and preparing datasets for LLMs | Python | Apache-2.0 |
ScrapeGraphAI | Use LLMs to extract structured data from websites and documents | Python | MIT |
Crawlee | Web scraping and crawling framework for large-scale data collection | TypeScript | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
Promptify | Prompt engineering toolkit for NLP/LLM tasks | Python | Apache-2.0 |
PromptSource | Toolkit for creating, sharing, and managing prompts | Python | Apache-2.0 |
Promptimizer | Toolkit for optimizing prompts via evaluation | Python | MIT |
Py-Priompt | Library for prioritizing and optimizing LLM prompts | Python | MIT |
Selective Context | Context selection and compression for efficient prompting | Python | MIT |
LLMLingua | Prompt compression via token selection and ranking | Python | MIT |
betterprompt | Prompt experimentation & optimization framework | Python | Apache-2.0 |
PCToolkit | Toolkit for prompt compression and efficiency | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
Instructor | Structured LLM outputs with Pydantic schema validation | Python | MIT |
XGrammar | Grammar-based constrained generation for LLMs | Python | Apache-2.0 |
Outlines | Controlled generation with regex, CFGs, and schemas | Python | MIT |
Guidance | Programmatic control of LLM outputs with constraints | Python | MIT |
LMQL | Query language for structured interaction with LLMs | Python | Apache-2.0 |
Jsonformer | Efficient constrained decoding for valid JSON outputs | Python | MIT |
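
The libraries above either constrain decoding so the model can only emit valid output, or validate the output afterwards and retry on failure. The sketch below illustrates only the second idea, using the standard library. It is not the API of any listed tool: the function `parse_structured`, the `schema` mapping of field names to Python types, and the sample strings are all assumptions for illustration.

```python
import json


def parse_structured(raw, schema):
    """Parse a JSON object and check that each schema field has the expected type.

    `schema` maps field names to Python types, e.g. {"name": str, "age": int}.
    Raises ValueError on any problem so the caller can re-prompt the model.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in schema.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data


schema = {"name": str, "age": int}

# A well-formed response (hypothetical model output).
person = parse_structured('{"name": "Ada", "age": 36}', schema)
print(person["name"], person["age"])

# A malformed response is rejected, which is where a retry would be triggered.
try:
    parse_structured('{"name": "Ada"}', schema)
except ValueError as err:
    print("invalid output:", err)
```

Post-hoc validation is simple to bolt onto any provider, while constrained decoding avoids wasted generations but needs access to the model's token probabilities.
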
Framework | Description | Language | License |
---|---|---|---|
AutoGen | Multi-agent conversation framework | Python | CC-BY-4.0 |
CrewAI | Framework for orchestrating role-playing autonomous AI agents | Python | MIT |
LangGraph | Build resilient language agents as graphs | Python | MIT |
AgentOps | Python SDK for AI agent monitoring, LLM cost tracking, benchmarking | Python | MIT |
Swarm | Educational framework for exploring ergonomic, lightweight multi-agent orchestration | Python | MIT |
Agency Swarm | An open-source agent framework designed to automate your workflows | Python | MIT |
Multi-Agent Systems | Research into multi-agent systems and applications | Python | MIT |
Auto-GPT | Autonomous AI agent for task execution using GPT models | Python | MIT |
BabyAGI | Task-driven autonomous agent inspired by AGI | Python | MIT |
SuperAGI | Infrastructure for building and managing autonomous agents | Python | MIT |
Phidata | Build AI agents with memory, tools, and knowledge | Python | MIT |
MemGPT | Self-improving agents with infinite context via memory management | Python | MIT |
Griptape | Framework for building AI agents with structured pipelines and memory | Python | Apache-2.0 |
mem0 | AI memory framework for storing & retrieving agent context across sessions | Python | MIT |
Memoripy | Lightweight persistent memory library for LLMs and agents | Python | MIT |
Memobase | Database-like persistent memory for conversational agents | Python | MIT |
Letta (MemGPT) | Long-term memory management for LLM agents | Python | MIT |
Agno | Framework for building AI agents with RAG, workflows, and memory | Python | Apache-2.0 |
Agents SDK | SDK from Vercel for building agentic workflows and applications | TypeScript | Apache-2.0 |
Smolagents | Lightweight agent framework from Hugging Face | Python | Apache-2.0 |
Pydantic AI | Agent framework built on Pydantic for structured reasoning | Python | MIT |
CAMEL | Multi-agent framework enabling role-play and collaboration | Python | Apache-2.0 |
BeeAI | LLM agent framework for AI-driven workflows and automation | Python | Apache-2.0 |
gradio-tools | Integrate external tools into agents via Gradio apps | Python | Apache-2.0 |
Composio | Tool orchestration framework to connect 100+ APIs for agents | Python | Apache-2.0 |
Atomic Agents | Modular agent framework with tool usage and reasoning | Python | Apache-2.0 |
Memary | Memory-augmented agent framework for persistent context | Python | MIT |
Browser Use | Framework for browser automation with AI agents | Python | Apache-2.0 |
OpenWebAgent | Agents for interacting with and extracting from the web | Python | Apache-2.0 |
Lagent | Lightweight agent framework from InternLM | Python | Apache-2.0 |
LazyLLM | Agent framework for lazy evaluation and efficient execution | Python | Apache-2.0 |
Swarms | Enterprise agent orchestration framework (distinct from Agency Swarm) | Python | MIT |
ChatArena | Multi-agent simulation platform for research and evaluation | Python | Apache-2.0 |
AgentStack | Agent orchestration framework (different from Agency Swarm) | Python | Apache-2.0 |
Archgw | Agent runtime for structured workflows and graph execution | Python | Apache-2.0 |
Flow | Low-code agent workflow framework for LLMs | Python | Apache-2.0 |
Langroid | Framework for building multi-agent conversational systems | Python | Apache-2.0 |
Agentarium | Platform for creating multi-agent environments | Python | Apache-2.0 |
Upsonic | Agent framework focused on context management and tool use | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
PyTorch Lightning | High-level PyTorch interface for LLMs | Python | Apache-2.0 |
unsloth | Fine-tune LLMs faster with less memory | Python | Apache-2.0 |
Axolotl | Post-training pipeline for AI models | Python | Apache-2.0 |
LLaMA-Factory | Easy & efficient LLM fine-tuning | Python | Apache-2.0 |
PEFT | Parameter-Efficient Fine-Tuning library | Python | Apache-2.0 |
DeepSpeed | Distributed training & inference optimization | Python | MIT |
TRL | Train transformer LMs with reinforcement learning | Python | Apache-2.0 |
Transformers | Pretrained models for text, vision, and audio tasks | Python | Apache-2.0 |
LitGPT | Train and fine-tune LLMs lightning fast | Python | Apache-2.0 |
Mergoo | Merge multiple LLM experts efficiently | Python | Apache-2.0 |
Ludwig | Low-code framework for custom LLMs | Python | Apache-2.0 |
txtinstruct | Framework for training instruction-tuned models | Python | Apache-2.0 |
xTuring | Fast fine-tuning of open-source LLMs | Python | Apache-2.0 |
RL4LMs | RL library to fine-tune LMs to human preferences | Python | Apache-2.0 |
torchtune | PyTorch-native library for fine-tuning LLMs | Python | BSD-3 |
Accelerate | Library to easily train on multiple GPUs/TPUs with mixed precision | Python | Apache-2.0 |
bitsandbytes | 8-bit optimizers and quantization for efficient LLM training | Python | MIT |
Lamini | Python SDK for building and fine-tuning LLMs with Lamini API | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
LLM Compressor | Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment | Python | Apache-2.0 |
LightLLM | Lightweight Python-based LLM inference and serving framework with easy scalability and high performance | Python | Apache-2.0 |
vLLM | High-throughput and memory-efficient inference and serving engine for LLMs | Python | Apache-2.0 |
torchchat | Run PyTorch LLMs locally on servers, desktop, and mobile | Python | MIT |
TensorRT-LLM | NVIDIA library for optimizing LLM inference with TensorRT | C++/Python | Apache-2.0 |
WebLLM | High-performance in-browser LLM inference engine | TypeScript/Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
JailbreakEval | Automated evaluators for assessing jailbreak attempts | Python | MIT |
EasyJailbreak | Easy-to-use Python framework to generate adversarial jailbreak prompts | Python | Apache-2.0 |
Guardrails | Add guardrails to large language models | Python | MIT |
LLM Guard | Security toolkit for LLM interactions | Python | Apache-2.0 |
AuditNLG | Reduce risks in generative AI systems for language | Python | MIT |
NeMo Guardrails | Toolkit for adding programmable guardrails to LLM conversational systems | Python | Apache-2.0 |
Garak | LLM vulnerability scanner | Python | MIT |
DeepTeam | LLM red teaming framework | Python | Apache-2.0 |
MarkLLM | Watermarking toolkit for LLM outputs | Python | Apache-2.0 |
LLMSanitize | Security toolkit for sanitizing LLM inputs/outputs | Python | MIT |
Tool | Description | Language | License |
---|---|---|---|
Reflex | Build full-stack web apps powered by LLMs with Python-only workflows and reactive UIs. | Python | Apache-2.0 |
Gradio | Create quick, interactive UIs for LLM demos and prototypes. | Python | Apache-2.0 |
Streamlit | Build and share AI/ML apps fast with Python scripts and interactive widgets. | Python | Apache-2.0 |
Taipy | End-to-end Python framework for building production-ready AI apps with dashboards and pipelines. | Python | Apache-2.0 |
AI SDK UI | Vercel’s AI SDK for building chat & generative UIs | TypeScript | Apache-2.0 |
Simpleaichat | Minimal Python interface for prototyping conversational LLMs | Python | MIT |
Chainlit | Framework for building and debugging LLM apps with a rich UI | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
Ollama | Get up and running with large language models locally | Go | MIT |
LM Studio | Desktop app for running local LLMs | - | Commercial |
GPT4All | Open-source chatbot ecosystem | C++ | MIT |
LocalAI | Self-hosted OpenAI-compatible API | Go | MIT |
LiteLLM | Lightweight OpenAI-compatible gateway for multiple LLM providers | Python | MIT |
AI Gateway | Gateway for managing LLM requests, caching, and routing | Python | Apache-2.0 |
Langcorn | Serve LangChain applications via FastAPI with production-ready endpoints | Python | MIT |
LitServe | High-speed GPU inference server with autoscaling and batch support | Python | Apache-2.0 |
Tool | Description | Language | License |
---|---|---|---|
DataDreamer | Framework for creating synthetic datasets to train & evaluate LLMs | Python | Apache-2.0 |
fabricator | Data generation toolkit for crafting synthetic training data | Python | MIT |
Promptwright | Toolkit for prompt engineering, evaluation, and dataset curation | Python | Apache-2.0 |
EasyInstruct | Instruction data generation framework for large-scale LLM training | Python | Apache-2.0 |
Text Machina | Dataset generation framework for robust AI training | Python | Apache-2.0 |
Platform | Description | Pricing | Features |
---|---|---|---|
Clarifai | Lightning-fast compute for AI models & agents | Free tier + Pay-as-you-go | Pre-trained models, Custom model deployment on dedicated compute, Model training, Workflow automation |
Modal | Serverless platform for AI/ML workloads | Pay-per-use | Serverless GPU, Auto-scaling |
Replicate | Run open-source models with a cloud API | Pay-per-use | Pre-built models, Custom training |
Together AI | Cloud platform for open-source models | Various | Open models, Fine-tuning |
Anyscale | Ray-based platform for AI applications | Enterprise | Distributed training, Serving |
RouteLLM | Dynamic router for selecting best LLMs based on cost & performance | Open-source | Cost optimization, Multi-LLM routing |
We welcome contributions! This toolkit grows stronger with community input.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-tool`)
- Add your contribution (new tool, template, or tutorial)
- Submit a pull request
- Quality over quantity - Focus on tools and resources that provide real value
- Production-ready - Include tools that work in real-world scenarios
- Well-documented - Provide clear descriptions and usage examples
- Up-to-date - Ensure tools are actively maintained
Get weekly AI engineering insights, tool reviews, exclusive demos, and AI projects delivered to your inbox:
📧 Subscribe to AI Engineering Newsletter →
Join 100,000+ engineers building better LLM applications
Built with ❤️ for the AI Engineering community
Star ⭐ this repo if you find it helpful!