Automatically learns repository-specific skills for coding agents using evolutionary search.
Given a GitHub repository, gskill produces a `.claude/skills/{repo}/SKILL.md` file containing optimized instructions that dramatically improve an agent's resolve rate on that repo's issues. It implements the pipeline described in the GEPA blog post, which demonstrated improvements from 24% → 93% resolve rate on some repositories.
- Loads verifiable software engineering tasks from SWE-smith for the target repository
- Generates an initial skill via static analysis of the repo (README, config files) and GPT-5.2
- Uses GEPA's `optimize_anything` to iteratively refine the skill through evolutionary search
- Each candidate skill is evaluated by running mini-SWE-agent on training tasks inside Docker and checking whether the FAIL_TO_PASS tests pass
- Writes the best-scoring skill to disk
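The loop described above can be sketched as follows. This is a minimal toy illustration, not gskill's actual implementation: `evaluate` and `mutate` are hypothetical stand-ins for the real components (Docker-based test runs and GEPA's reflective mutation), and the search here is a simple greedy hill-climb rather than GEPA's full candidate-pool strategy.

```python
def optimize_skill(seed_skill, train_tasks, evaluate, mutate, max_evals=150):
    """Greedy evolutionary search: keep the best-scoring skill candidate.

    `evaluate(skill, tasks)` returns the fraction of tasks whose
    FAIL_TO_PASS tests pass; `mutate(skill)` proposes a revised skill.
    Both are hypothetical stand-ins for gskill's real components.
    """
    best = seed_skill
    best_score = evaluate(seed_skill, train_tasks)
    evals = 1
    while evals < max_evals:
        candidate = mutate(best)           # propose a revised skill
        score = evaluate(candidate, train_tasks)
        evals += 1
        if score > best_score:             # keep only improvements
            best, best_score = candidate, score
    return best, best_score
```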
- Python 3.13+
- uv
- Docker (for running SWE-smith task environments)
- `OPENAI_API_KEY` set in your environment (for initial skill generation and GEPA reflection)
- `GSKILL_AGENT_MODEL` (optional): LiteLLM model string for mini-SWE-agent (default: `openai/gpt-5.2`)
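A startup check for the required key might look like this (a hypothetical helper, not part of gskill's actual code):

```python
import os

def check_environment():
    """Fail fast if the required API key is missing (hypothetical helper)."""
    if not os.environ.get("OPENAI_API_KEY"):
        raise SystemExit(
            "OPENAI_API_KEY is not set; it is required for "
            "initial skill generation and GEPA reflection."
        )
```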
```
git clone https://github.com/your-org/gskill
cd gskill
uv sync
```

```
uv run python main.py run https://github.com/pallets/jinja
```

This will:
- Load SWE-smith tasks for `pallets/jinja`
- Generate an initial skill
- Run up to 150 mini evaluations to optimize the skill
- Write the result to `.claude/skills/jinja/SKILL.md`
`run` only works for repositories that have task instances in SWE-bench/SWE-smith. If a GitHub repository exists but is not covered by that dataset, gskill will fail with an unsupported-repo message.
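That support check could be sketched like this (a toy illustration: the real lookup queries the SWE-smith dataset, and `UnsupportedRepoError` is a hypothetical name):

```python
class UnsupportedRepoError(ValueError):
    """Raised when a repo has no SWE-smith task instances (hypothetical name)."""

def require_supported(repo: str, supported: set[str]) -> None:
    """Raise if the repo is not covered by the SWE-bench/SWE-smith dataset."""
    if repo not in supported:
        raise UnsupportedRepoError(
            f"{repo} has no task instances in SWE-bench/SWE-smith; "
            "gskill cannot optimize a skill for it."
        )
```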
```
# Custom evaluation budget (more evals = better skill, slower run)
uv run python main.py run https://github.com/pallets/jinja --max-evals 300

# Custom output directory
uv run python main.py run https://github.com/pallets/jinja --output-dir ~/skills

# Skip static analysis, start from an empty seed
uv run python main.py run https://github.com/pallets/jinja --no-initial-skill

# Use a different model for the coding agent
uv run python main.py run https://github.com/pallets/jinja --agent-model openai/gpt-5-mini

# Use a local model (e.g. qwen2.5-coder running on localhost:11434)
OPENAI_BASE_URL=http://localhost:11434/v1 \
  uv run python main.py run https://github.com/pallets/jinja --agent-model openai/gpt-oss-120b
```

You can also set the agent model via the `GSKILL_AGENT_MODEL` environment variable instead of passing `--agent-model` every time.
```
# Show the first 10 SWE-smith tasks for a repo
uv run python main.py tasks pallets/jinja

# Show more
uv run python main.py tasks pallets/jinja --limit 25
```

```
# List the first 50 supported repos
uv run python main.py repos

# Filter supported repos by substring
uv run python main.py repos --filter fast
```

```
uv run python main.py --help
uv run python main.py run --help
uv run python main.py tasks --help
```

The optimized skill is written to:

```
.claude/skills/{repo}/SKILL.md
```
To use it with Claude Code, add the skill path to your project's `.claude/settings.json` or reference it from your `CLAUDE.md`.
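Deriving that output path from a repo URL might look like this (a hypothetical helper matching the layout above, using only the final path segment of the URL as the repo name):

```python
from pathlib import Path

def skill_path(repo_url: str, output_dir: str = ".") -> Path:
    """Map https://github.com/{owner}/{repo} to {output_dir}/.claude/skills/{repo}/SKILL.md."""
    repo = repo_url.rstrip("/").rsplit("/", 1)[-1]
    return Path(output_dir) / ".claude" / "skills" / repo / "SKILL.md"
```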
A `Taskfile.yml` provides shortcuts for common operations (requires Task):
```
task sync              # uv sync
task lint              # ruff check
task format            # ruff format
task test              # pytest
task run -- owner/repo # gskill run (pass args via CLI_ARGS)
task tasks             # gskill tasks (pass args via CLI_ARGS)
```

```
gskill/
├── main.py          # CLI entry point (typer)
├── src/
│   ├── pipeline.py  # Top-level orchestration
│   ├── tasks.py     # SWE-smith dataset loading & splitting
│   ├── evaluator.py # mini runner + pass/fail evaluation
│   └── skill.py     # Initial skill generation (gpt-5.2) + file I/O
├── Taskfile.yml     # Task runner shortcuts
└── pyproject.toml
```