GEPA for Elixir

An Elixir implementation of GEPA (Genetic-Pareto), a framework for optimizing text-based system components using LLM-based reflection and Pareto-efficient evolutionary search.

Installation

Add gepa_ex to your list of dependencies in mix.exs:

def deps do
  [
    {:gepa_ex, "~> 0.1.2"}
  ]
end

About GEPA

GEPA optimizes arbitrary systems composed of text components—like AI prompts, code snippets, or textual specs—against any evaluation metric. It employs LLMs to reflect on system behavior, using feedback from execution traces to drive targeted improvements.
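
A candidate is simply a map of named text components, and the metric can be any function that scores an output. A minimal illustration (the component names and the exact-match metric here are arbitrary examples, not part of the library API):

# A candidate: named text components under optimization
candidate = %{
  "instruction" => "You are a helpful assistant.",
  "format" => "Answer in one short sentence."
}

# Any scoring function can act as the metric, e.g. exact match
metric = fn expected, actual ->
  if String.trim(actual) == String.trim(expected), do: 1.0, else: 0.0
end

metric.("4", " 4 ")  # => 1.0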

This is an Elixir port of the Python GEPA library, designed to leverage:

  • 🚀 BEAM concurrency for 5-10x evaluation speedup (coming in Phase 4)
  • 🛡️ OTP supervision for fault-tolerant external service integration
  • 🔄 Functional programming for clean, testable code
  • 📊 Telemetry event schema for lifecycle, iteration, proposal, and evaluation metrics
  • 🤖 Production LLMs: OpenAI GPT-4o-mini and Google Gemini Flash Lite (gemini-flash-lite-latest)

Production Ready

Core Features

Optimization System:

  • GEPA.optimize/1 - Public API (working!)
  • GEPA.Engine - Full optimization loop with stop conditions
  • GEPA.Proposer.Reflective - Mutation strategy
  • ✅ LLM-based instruction proposal via reflection_llm and custom templates
  • GEPA.State - State management with automatic Pareto updates (96.5% coverage)
  • GEPA.Utils.Pareto - Multi-objective optimization (93.5% coverage, property-verified)
  • GEPA.Result - Result analysis (100% coverage)
  • GEPA.Adapters.Basic - Q&A adapter (92.1% coverage)
  • ✅ Stop conditions with budget control
  • ✅ State persistence (save/load)
  • ✅ Telemetry event emitters for runs, iterations, proposals, and evaluation batches (handler sketch after this list)
  • ✅ End-to-end integration tested
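
The telemetry emitters publish standard :telemetry events, so any handler can subscribe. A minimal logging sketch; the event name [:gepa, :iteration, :stop] and the measurement/metadata keys are assumed for illustration and should be checked against the documented event schema:

:telemetry.attach(
  "gepa-iteration-logger",
  [:gepa, :iteration, :stop],  # assumed event name; see the telemetry docs
  fn _event, measurements, metadata, _config ->
    IO.puts("iteration #{inspect(metadata[:iteration])} took #{inspect(measurements[:duration])}")
  end,
  nil
)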

Phase 1 Additions - NEW! 🎉

Production LLM Integration:

  • GEPA.LLM - Unified LLM behavior
  • GEPA.LLM.ReqLLM - Production implementation via ReqLLM
    • OpenAI support (GPT-4o-mini default)
    • Google Gemini support (gemini-flash-lite-latest)
    • Error handling, retries, timeouts
    • Configurable via environment or runtime (sketch after this list)
  • GEPA.LLM.Mock - Testing implementation with flexible responses
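
Constructing the two implementations side by side. The provider: and model: options appear in the examples later in this README; API keys are read from OPENAI_API_KEY / GEMINI_API_KEY as described there:

# Production client (options shown in the Quick Start below)
llm = GEPA.LLM.ReqLLM.new(provider: :openai, model: "gpt-4o-mini")

# Mock client with deterministic responses, for tests
mock_llm = GEPA.LLM.Mock.new()

adapter = GEPA.Adapters.Basic.new(llm: llm)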

Advanced Batch Sampling:

  • GEPA.Strategies.BatchSampler.EpochShuffled - Epoch-based training with shuffling
  • ✅ Reproducible with seed control
  • ✅ Better training dynamics than simple sampling
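
A sketch of wiring the sampler into a run. The seed: option name is an assumption based on the seed-control bullet above, and batch_sampler: is an assumed option key by analogy with the candidate_selector: option shown later:

# Hypothetical option names; check the module docs
sampler = GEPA.Strategies.BatchSampler.EpochShuffled.new(seed: 42)

{:ok, result} =
  GEPA.optimize(
    seed_candidate: %{"instruction" => "..."},
    trainset: trainset,
    valset: valset,
    adapter: adapter,
    max_metric_calls: 50,
    batch_sampler: sampler  # assumed key
  )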

Working Examples:

  • ✅ 4 .exs script examples (quick start, math, custom adapter, persistence)
  • ✅ 3 Livebook notebooks (interactive learning)
  • ✅ Comprehensive examples/README.md guide
  • ✅ Livebook guide with visualizations

Phase 2 Additions - NEW! 🎉

Merge Proposer:

  • GEPA.Proposer.Merge - Genealogy-based candidate merging
  • GEPA.Utils - Pareto dominator detection (93.3% coverage)
  • GEPA.Proposer.MergeUtils - Ancestry tracking (92.3% coverage)
  • ✅ Engine integration with merge scheduling
  • ✅ 44 comprehensive tests (34 unit + 10 properties)

Incremental Evaluation:

  • GEPA.Strategies.EvaluationPolicy.Incremental - Progressive validation
  • ✅ Configurable sample sizes and thresholds
  • ✅ Reduces computation on large validation sets
  • ✅ 12 tests
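
A hedged construction sketch; the sample_size: and threshold: option names are assumptions based on the "configurable sample sizes and thresholds" bullet above:

policy =
  GEPA.Strategies.EvaluationPolicy.Incremental.new(
    sample_size: 20,  # assumed option name: examples evaluated per step
    threshold: 0.8    # assumed option name: score needed to widen the sample
  )

# then pass the policy to GEPA.optimize/1 via its evaluation-policy option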

Advanced Stop Conditions:

  • GEPA.StopCondition.Timeout - Time-based stopping
  • GEPA.StopCondition.NoImprovement - Early stopping
  • ✅ Flexible time units and patience settings
  • ✅ 9 tests
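
Construction sketch for both conditions; the minutes: and patience: option names are assumptions based on the "flexible time units and patience settings" bullet above:

# Hypothetical option names; check each module's docs
timeout = GEPA.StopCondition.Timeout.new(minutes: 30)
no_improvement = GEPA.StopCondition.NoImprovement.new(patience: 5)

# then pass to GEPA.optimize/1 via its stop-condition option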

Test Quality:

  • 201 tests (185 unit + 16 properties + 1 doctest)
  • 100% passing ✅
  • 75.4% coverage (excellent!)
  • Property tests with 1,600+ runs
  • Zero Dialyzer errors
  • TDD methodology throughout

What's Next?

✅ Phase 1: Production Viability - COMPLETE!

  • ✅ Real LLM integration (OpenAI, Gemini)
  • ✅ Quick start examples (4 scripts + 3 livebooks)
  • ✅ EpochShuffledBatchSampler

✅ Phase 2: Core Completeness - COMPLETE!

  • ✅ Merge proposer (genealogy-based recombination)
  • ✅ IncrementalEvaluationPolicy (progressive validation)
  • ✅ Additional stop conditions (Timeout, NoImprovement)
  • ✅ Engine integration for merge proposer

Phase 3: Production Hardening - in progress

  • ✅ Telemetry event schema and helpers
  • 🎨 Progress tracking (planned)
  • 🛡️ Robust error handling (planned)

Phase 4: Ecosystem Expansion - 12-14 weeks

  • 🔌 Additional adapters (Generic, RAG)
  • 🚀 Performance optimization (parallel evaluation)
  • 🌟 Community infrastructure

Quick Start

With Mock LLM (No API Key Required)

# Define training data
trainset = [
  %{input: "What is 2+2?", answer: "4"},
  %{input: "What is 3+3?", answer: "6"}
]

valset = [%{input: "What is 5+5?", answer: "10"}]

# Create adapter with mock LLM (for testing)
adapter = GEPA.Adapters.Basic.new(llm: GEPA.LLM.Mock.new())

# Run optimization
{:ok, result} = GEPA.optimize(
  seed_candidate: %{"instruction" => "You are a helpful assistant."},
  trainset: trainset,
  valset: valset,
  adapter: adapter,
  max_metric_calls: 50
)

# Access results
best_program = GEPA.Result.best_candidate(result)
best_score = GEPA.Result.best_score(result)

IO.puts("Best score: #{best_score}")
IO.puts("Iterations: #{result.i}")

With Production LLMs (NEW!)

# OpenAI (GPT-4o-mini) - Requires OPENAI_API_KEY
llm = GEPA.LLM.ReqLLM.new(provider: :openai)
adapter = GEPA.Adapters.Basic.new(llm: llm)

# Or Gemini (`gemini-flash-lite-latest`) - Requires GEMINI_API_KEY
llm = GEPA.LLM.ReqLLM.new(provider: :gemini)
adapter = GEPA.Adapters.Basic.new(llm: llm)

# Then run optimization as above
{:ok, result} = GEPA.optimize(
  seed_candidate: %{"instruction" => "..."},
  trainset: trainset,
  valset: valset,
  adapter: adapter,
  max_metric_calls: 50
)

See examples/README.md for complete working examples!

Candidate Selection Strategies (NEW)

GEPA includes multiple candidate selectors to balance exploration vs. exploitation:

  • GEPA.Strategies.CandidateSelector.Pareto (default): frequency-weighted sampling from Pareto front
  • GEPA.Strategies.CandidateSelector.CurrentBest: always pick the best-scoring program
  • GEPA.Strategies.CandidateSelector.EpsilonGreedy: configurable exploration with optional epsilon decay

Stateful selectors (like epsilon-greedy) are carried forward automatically so decay persists across iterations.

To enable epsilon-greedy with decay:

selector =
  GEPA.Strategies.CandidateSelector.EpsilonGreedy.new(
    epsilon: 0.3,
    epsilon_decay: 0.95,
    epsilon_min: 0.05
  )

{:ok, result} =
  GEPA.optimize(
    seed_candidate: %{"instruction" => "..."},
    trainset: trainset,
    valset: valset,
    adapter: adapter,
    max_metric_calls: 50,
    candidate_selector: selector
  )

LLM-Based Instruction Proposal (NEW!)

Use an LLM to propose improved component instructions based on reflective feedback. You can also provide a custom proposal template.

reflection_llm = GEPA.LLM.ReqLLM.new(provider: :openai, model: "gpt-4o-mini")

custom_template = """
Improve {component_name}:
Current: {current_instruction}
Feedback: {reflective_dataset}
New instruction:
"""

{:ok, result} = GEPA.optimize(
  seed_candidate: %{"instruction" => "You are a concise math tutor."},
  trainset: trainset,
  valset: valset,
  adapter: adapter,
  max_metric_calls: 50,
  reflection_llm: reflection_llm,
  proposal_template: custom_template
)

When reflection_llm is not provided, GEPA falls back to a simple testing-only improvement marker ("[Optimized]").

Interactive Livebooks (NEW!)

For interactive learning and experimentation:

# Install Livebook
mix escript.install hex livebook

# Open a livebook
livebook server livebooks/01_quick_start.livemd

Available Livebooks:

  • 01_quick_start.livemd - Interactive introduction
  • 02_advanced_optimization.livemd - Parameter tuning and visualization
  • 03_custom_adapter.livemd - Build adapters interactively

See livebooks/README.md for details!

With State Persistence

{:ok, result} = GEPA.optimize(
  seed_candidate: seed,
  trainset: trainset,
  valset: valset,
  adapter: GEPA.Adapters.Basic.new(),
  max_metric_calls: 100,
  run_dir: "./my_optimization"  # State saved here, can resume
)

Development

# Get dependencies
mix deps.get

# Run tests
mix test

# Run with coverage
mix test --cover

# Run specific tests
mix test test/gepa/utils/pareto_test.exs

# Format code
mix format

# Type checking
mix dialyzer

Architecture

Based on behaviour-driven design with a functional core:

GEPA.optimize/1
  ↓
GEPA.Engine ← Behaviors → User Implementations
  ├─→ Adapter (evaluate, reflect, propose)
  ├─→ Proposer (reflective, merge)
  ├─→ Strategies (selection, sampling, evaluation)
  └─→ StopCondition (budget, time, threshold)
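
A skeletal custom adapter, as an orientation sketch only: the behaviour module name GEPA.Adapter and the evaluate/3 callback shape are assumptions drawn from the diagram above, not a verified spec. See livebooks/03_custom_adapter.livemd for the real walkthrough.

defmodule MyApp.ExactMatchAdapter do
  # Hypothetical skeleton; the behaviour name and callback
  # signature below are assumptions, not the library's spec.
  @behaviour GEPA.Adapter

  @impl true
  def evaluate(batch, candidate, _opts) do
    # Score each example against the candidate's instruction text.
    instruction = Map.fetch!(candidate, "instruction")

    scores =
      Enum.map(batch, fn %{input: input, answer: answer} ->
        output = run_system(instruction, input)  # your system under optimization
        if output == answer, do: 1.0, else: 0.0
      end)

    {:ok, scores}
  end

  defp run_system(_instruction, _input), do: "stub output"
end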

Documentation

Technical Documentation

Changelog

v0.1.2 (2025-11-29)

  • Epsilon-greedy candidate selector with decay/reset and stateful selector support in engine/proposer
  • Telemetry event schema and LLM-backed instruction proposal with custom templates
  • Reflective proposer consumes instruction proposals with fallback marker when no LLM is provided
  • Docs for completing the port and telemetry-first experiment tracking

v0.1.1 (2025-11-29)

  • Documentation cleanup and release tagging

v0.1.0 (2025-10-29)

  • Initial release with Phase 1 & 2 complete
  • Production LLM integration (OpenAI GPT-4o-mini, Google Gemini Flash Lite)
  • Core optimization engine with reflective and merge proposers
  • Incremental evaluation and advanced stop conditions
  • 218 tests passing with 75.4% coverage

Related Projects

License

MIT License
