A curated collection of resources for prompt engineering, optimization, and automatic prompt generation across text, image, video, and multimodal AI systems.
Prompt optimization is the systematic process of improving prompts to achieve better AI model performance, consistency, and safety. This collection covers everything from manual techniques and best practices to cutting-edge automated optimization frameworks, evaluation tools, and research papers.
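As a rough illustration of what "systematic" means here, the sketch below scores a few candidate prompt templates against a small labeled set and keeps the best one. Everything in it is an assumption made for illustration: `toy_model` is a stand-in for a real LLM call, and `EXAMPLES`, `CANDIDATES`, `score`, and `best_prompt` are hypothetical names, not part of any tool listed in this collection.

```python
import operator
from typing import Callable, Sequence, Tuple

# Hypothetical labeled examples: (question, expected answer).
EXAMPLES = [("2 + 3", "5"), ("10 - 4", "6"), ("7 * 6", "42")]

# Hypothetical candidate templates; the second uses a zero-shot
# chain-of-thought trigger phrase.
CANDIDATES = [
    "Answer the question.\nQ: {question}\nA:",
    "Q: {question}\nA: Let's think step by step.",
]

_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM call (an assumption, not a real API).

    It only answers correctly when the prompt asks for step-by-step work,
    so the two candidates above produce different scores.
    """
    question = prompt.split("Q: ", 1)[1].split("\n", 1)[0]
    if "step by step" not in prompt.lower():
        return "unknown"
    left, op, right = question.split()
    return str(_OPS[op](int(left), int(right)))


def score(template: str,
          model: Callable[[str], str],
          examples: Sequence[Tuple[str, str]]) -> float:
    """Fraction of examples the model answers exactly right with this template."""
    hits = sum(model(template.format(question=q)) == a for q, a in examples)
    return hits / len(examples)


def best_prompt(candidates: Sequence[str],
                model: Callable[[str], str],
                examples: Sequence[Tuple[str, str]]) -> Tuple[str, float]:
    """Return the highest-scoring template together with its score."""
    scored = [(score(c, model, examples), c) for c in candidates]
    top_score, top_template = max(scored, key=lambda pair: pair[0])
    return top_template, top_score


if __name__ == "__main__":
    template, accuracy = best_prompt(CANDIDATES, toy_model, EXAMPLES)
    print(f"best accuracy: {accuracy:.2f}")
    print(template)
```

Automated frameworks differ mainly in how they propose new candidates (LLM-generated, evolutionary, reinforcement learning, tree search); the score-and-select step stays conceptually similar to this loop.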
- Manual Prompt Engineering
- Automatic Prompt Optimization
- Multimodal Prompting
- Tools and Frameworks
- Research Papers
- Datasets and Benchmarks
- Educational Resources
- Contributing
- Prompt Engineering Guide - Comprehensive guide covering all major techniques including chain-of-thought, few-shot learning, and advanced applications with research backing
- Learn Prompting - Most comprehensive free guide cited by Google, Microsoft, and Wikipedia with 3M+ learners across 13 languages
- OpenAI Prompt Engineering Best Practices - Official OpenAI documentation covering fundamental techniques and model-specific guidance
- Lakera's Ultimate Guide (2025) - Latest techniques including adversarial prompting and security considerations with hands-on examples
- Microsoft Azure OpenAI Prompting - Technical guide for production implementation with system vs user messages and construction patterns
- Chain-of-Thought Prompting - Foundational paper by Google Research showing that prompting with intermediate reasoning steps improves performance on arithmetic, commonsense, and symbolic reasoning tasks
- Zero-Shot Chain-of-Thought - Adding "Let's think step by step" enables zero-shot reasoning without examples
- Plan-and-Solve Prompting - Advanced technique addressing calculation errors by dividing tasks into subtasks before execution
- IBM Prompt Engineering Guide - Enterprise-focused guide with business applications and GitHub repository with practical examples
- Google Cloud Vertex AI Strategies - Production considerations, avoiding pitfalls, and safety guidelines
- Gandalf (Lakera) - Interactive red teaming game for learning adversarial prompting and security testing techniques
- OpenAI Prompt Optimization Cookbook - Hands-on examples with before/after comparisons using optimization tools
- DSPy - Stanford's declarative framework for programming language models with automatic optimization (20k+ stars)
- Automatic Prompt Engineer (APE) - Treats instructions as programs optimized through LLM-proposed candidates with up to 25% improvement on benchmarks
- EvoPrompt - Microsoft's evolutionary algorithm approach connecting LLMs with genetic optimization for discrete prompt optimization
- RLPrompt - Reinforcement learning framework formulating prompt optimization as an RL problem with policy networks
- GeneticPromptLab - Python library using genetic algorithms with Sentence Transformers and k-means clustering
- PromptAgent - Monte Carlo Tree Search approach for expert-level prompt optimization using strategic planning
- Promptomatix (Salesforce) - AI-driven framework with DSPy integration and real-time human feedback
- GEPA (Genetic-Pareto) Research - Outperforms GRPO by 10% on average while using up to 35x fewer rollouts
- AutoPrompt - Intent-based prompt calibration framework for production moderation and classification tasks
- Comprehensive Text-to-Image Guide - Cross-platform strategies for DALL-E, Midjourney, and Stable Diffusion with structured templates
- Style Transfer Methodology - Separating style and content prompts for consistent aesthetic control
- NegOpt Research - Automated negative prompt optimization achieving 25% improvement in Inception Score
- PromptHero - World's largest searchable database with millions of prompts across major image generation models
- Awesome Prompting on Vision-Language Model - Comprehensive survey of prompting methods for CLIP, BLIP, LLaVA, and GPT-4V
- LLaVA - Visual instruction tuning framework with two-stage training and GPT-4 generated data
- ViP-LLaVA - Framework for understanding visual prompts with rectangles, ellipses, points, and arrows
- OpenAI Sora Official Guide - Authoritative source for video generation with complex scenes and 60-second capability
- Sora Cinematic Framework - Complete template for professional cinematic video generation with storyboard techniques
- Gemini Multimodal Audio - Production-ready techniques for audio analysis and processing
- Awesome Multimodal Chain-of-Thought - Survey of prompt-based, plan-based, and learning-based multimodal reasoning approaches
- DDCoT Prompting Method - Duty-distinct chain-of-thought for advanced multimodal reasoning with negative-space prompting
- promptfoo - Developer-friendly testing tool with red teaming, CI/CD integration, and multi-provider support (5k+ stars)
- Microsoft PromptBench - Unified evaluation framework supporting 30+ models with adversarial testing and dynamic evaluation
- DeepEval - Pytest-like framework with G-Eval, hallucination detection, and safety vulnerability scanning
- PromptTools (Hegel AI) - Open-source platform for prompt testing with local playground and Jupyter integration
- VS Code Prompt Runner - Turns VS Code into a prompt IDE with multi-provider support and agent workflows
- Microsoft Prompty Extension - VS Code extension for single-file prompt assets with built-in execution and debugging
- LangChain Visualizer - Real-time visualization for LangChain workflows with cost tracking and interactive trace inspection
- LangChain - Comprehensive framework with prompt templates, chain system, and agent frameworks (90k+ stars)
- Promptim (LangChain Labs) - Experimental optimization library with automated loops and human feedback integration
- Language Models are Few-Shot Learners (GPT-3) - Introduced the in-context learning paradigm, establishing the zero-shot, one-shot, and few-shot prompting concepts
- Chain-of-Thought Prompting Elicits Reasoning - Demonstrated that intermediate reasoning steps dramatically improve complex reasoning task performance
- Zero-Shot Reasoners - Showed "Let's think step by step" enables reasoning without examples across arithmetic and logic tasks
- Survey of Automatic Prompt Engineering (2025) - First comprehensive survey organizing methods across discrete, continuous, and hybrid optimization spaces
- Systematic Survey of APO Techniques (2025) - 21-author comprehensive categorization with unifying 5-part framework across paradigms
- Prompt Engineering a Prompt Engineer (PE2) - Meta-prompting with detailed descriptions and reasoning templates, showing a 6.3% improvement over "Let's think step by step" on MultiArith
- The Prompt Report (2024) - Most comprehensive survey with taxonomy of 58 LLM techniques and 40 multimodal techniques
- Benchmarking LLM Uncertainty - Introduced benchmark evaluating uncertainty metrics for prompt optimization with Answer, Correctness, Aleatoric, Epistemic measures
- Visual Prompting in MLLMs Survey - First comprehensive survey on visual prompting methods examining alignment between visual encoders and LLMs
- Transferability of Visual Prompts - Proposed Transferable Visual Prompting enabling cross-MLLM prompt transfer with consistency alignment
- MathCoder2 - Novel method for mathematical reasoning using model-translated code with comprehensive MathCode-Pile dataset
- PromptBench (Microsoft) - Unified framework supporting GLUE, MMLU, BigBench Hard, GSM8K with adversarial and dynamic evaluation
- HELM (Stanford) - Holistic evaluation across 16 scenarios and 7 metrics (accuracy, calibration, robustness, fairness, bias, toxicity, efficiency)
- BigBench Hard - 23 challenging BIG-Bench tasks on which earlier language models did not outperform the average human rater; commonly used to evaluate chain-of-thought prompting
- P3 (Public Pool of Prompts) - BigScience collection of 2,000+ prompts across 270+ datasets with Apache 2.0 license
- Awesome ChatGPT Prompts - Community-maintained repository of curated prompts for various applications
- Stable Diffusion Prompts - 80,000+ prompts for text-to-image generation extracted from Lexica.art
- GSM8K - 8.5K high-quality grade school math word problems (7.5K training, 1K test) with 2-8 step solutions
- MMLU (Massive Multitask Language Understanding) - 15,908 questions across 57 subjects from elementary to professional level
- MultiArith - Arithmetic word problems requiring multi-step reasoning for chain-of-thought evaluation
- BOLD Dataset - 23,679 prompts across 5 demographic domains for bias evaluation in open-ended language generation
- HolisticBias - 460,000 sentence prompts built from nearly 600 descriptor terms across 13 demographic axes
- TruthfulQA - Evaluation framework for truthfulness and factual accuracy with adversarial question design
- VQAv2 - Visual question answering with image-text pairs for vision-language model evaluation
- MMBench - 3,000 single-choice questions across 20 vision skills with bilingual support
- MathVista - Visual mathematical reasoning evaluation combining multiple math datasets with visual elements
- Learn Prompting - Free Comprehensive Guide - Most comprehensive resource with 3M+ learners, 13 languages, interactive content rated by difficulty
- MIT Sloan Effective Prompts - Academic perspective on fundamentals with clear context provision and specificity examples
- ChatGPT Prompt Engineering for Developers - DeepLearning.AI course taught by Isa Fulford (OpenAI) and Andrew Ng (DeepLearning.AI)
- DAIR.AI Prompt Engineering Guide - Academic-grade resource with 1-hour lectures, notebooks, and latest research integration (40k+ stars)
- NirDiamant Tutorials - 22 hands-on Jupyter notebooks covering beginner to advanced techniques with practical implementations (6k+ stars)
- IBM Prompt Engineering Course - Enterprise-focused training with business applications and hands-on GitHub repository
- Awesome Prompt Engineering - Hand-curated collection focusing on GPT, ChatGPT, and PaLM with tools, datasets, and research papers (5k+ stars)
- The Big Prompt Library - System prompts and custom instructions with multi-provider support and security focus
- F/Awesome ChatGPT Prompts - Curated collection of ready-to-use ChatGPT prompts for creative and professional applications
We welcome contributions! Here's how you can help improve this collection:
- Adding Resources: Submit PRs with new tools, papers, or datasets that fit our focus on open-source and research resources
- Improving Descriptions: Help make descriptions more accurate and helpful for both beginners and experts
- Categorization: Suggest better organization or new categories as the field evolves
- Quality Control: Report broken links, outdated information, or resources that no longer meet quality standards
- Focus on open-source tools and freely accessible resources
- Include brief but informative descriptions (1-2 sentences)
- Provide context on target audience (beginner/intermediate/advanced)
- Verify links work and resources are actively maintained
- Follow the existing format and categorization structure
- Educational Value: Resources should teach techniques or provide practical value
- Accessibility: Prefer free and open-source over commercial/proprietary tools
- Recency: Favor resources updated within the last 2 years unless foundational
- Documentation: Prefer well-documented tools with clear usage instructions
Maintained by the community • Licensed under CC0 • Star this repo if it helps you!