epappas/autojepa

AutoJEPA

License: MIT | Python 3.10+ | Phase-3 KPI | ADRs

Autonomous design-space search over Joint-Embedding Predictive Architecture (JEPA) pretraining recipes. A clean fork of autoresearch-rl, purpose-built for self-supervised pretraining. Prime deployment target: Basilica GPU cloud.

prepare.py  -->  [data + probe-eval]  -->  train.py  -->  [probe_auroc]  -->  keep/discard  -->  repeat
 (frozen)                                  (mutable)                            |
                                                ^                               |
                                                |    LLM proposes next          |
                                                +-------- params or diff -------+
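The loop in the diagram can be read as a ratchet: a candidate is kept only when it beats the best score so far. The sketch below is a minimal illustration under stated assumptions, not AutoJEPA's implementation. `run_campaign`, `toy_propose`, `toy_evaluate`, and `stall_limit` are hypothetical names, and the toy score stands in for probe_auroc.

```python
import random
from typing import Callable


def run_campaign(
    propose: Callable[[dict, str], dict],
    evaluate: Callable[[dict], float],
    baseline: dict,
    iterations: int = 20,
    stall_limit: int = 3,  # hypothetical: discards in a row before switching to diffs
) -> tuple[dict, float]:
    """Keep a candidate only if it beats the best score seen so far."""
    best, best_score = baseline, evaluate(baseline)
    stalled = 0
    for _ in range(iterations):
        # Hybrid policy: parameter proposals first, diffs once progress stalls.
        mode = "diff" if stalled >= stall_limit else "params"
        candidate = propose(best, mode)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score, stalled = candidate, score, 0  # keep
        else:
            stalled += 1  # discard
    return best, best_score


# Toy stand-ins: in a real campaign the LLM proposes and train.py is scored.
rng = random.Random(0)


def toy_propose(best: dict, mode: str) -> dict:
    step = 0.2 if mode == "diff" else 0.05  # diffs take larger jumps in this toy
    return {"lr": best["lr"] + rng.uniform(-step, step)}


def toy_evaluate(candidate: dict) -> float:
    return 1.0 - (candidate["lr"] - 0.3) ** 2  # peaks at lr = 0.3


best, best_score = run_campaign(toy_propose, toy_evaluate, {"lr": 0.9})
```

Because discarded candidates never replace `best`, the returned score can never fall below the baseline score.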

Identity

AutoJEPA inherits the autoresearch pattern — frozen prepare.py + mutable train.py, AST-validated LLM-proposed diffs, hybrid (param → diff on stall) policy — and replaces RL/SFT-shaped defaults with JEPA-shaped defaults:

  • Probe-based downstream evaluation as the campaign objective (probe_auroc, not training loss: a collapsed representation can drive the JEPA loss down without learning anything useful).
  • RankMe / LiDAR / latent-variance / effective-rank as hard fail gates against representation collapse.
  • VICReg-aware loss defaults (C-JEPA).
  • Composable mask primitives as first-class building blocks the LLM combines.
  • Forecaster recalibrated for SSL learning curves (long plateau where only probe score moves).
  • Multi-seed scoring because JEPA outcomes are seed-sensitive.
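To make the collapse gates concrete, here is a minimal sketch of a RankMe-style check using NumPy. RankMe is the exponential of the entropy of the normalized singular values of the embedding matrix. The function names and the `min_rank` threshold are assumptions for illustration. The gates AutoJEPA actually applies live in its own code and may differ.

```python
import numpy as np


def rankme(embeddings: np.ndarray, eps: float = 1e-7) -> float:
    """Effective rank: exp of the entropy of the normalized singular values."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    p = singular_values / (singular_values.sum() + eps) + eps
    return float(np.exp(-(p * np.log(p)).sum()))


def passes_collapse_gate(embeddings: np.ndarray, min_rank: float) -> bool:
    """Hypothetical hard gate: fail the iteration if the effective rank is too low."""
    return rankme(embeddings) >= min_rank


rng = np.random.default_rng(0)
# Healthy case: 512 samples spread across all 32 embedding dimensions.
healthy = rng.standard_normal((512, 32))
# Collapsed case: every embedding is a scalar multiple of one direction (rank 1).
collapsed = np.outer(rng.standard_normal(512), rng.standard_normal(32))

healthy_ok = passes_collapse_gate(healthy, min_rank=8.0)
collapsed_ok = passes_collapse_gate(collapsed, min_rank=8.0)
```

The score is bounded above by the embedding dimension and falls toward 1 as the representation collapses onto a single direction, which is why a lower bound works as a fail gate.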

Quickstart

uv sync --extra dev --extra jepa
uv run autojepa run examples/ijepa-cifar10/config.yaml

Common workflows are wrapped in the Makefile:

make help        # list targets
make check       # lint + typecheck + full tests
make test-fast   # tests excluding slow integration suite

Basilica-first

AutoJEPA targets GPU pretraining; Basilica is the prime deployment target and basilica-sdk is a default dependency. Local command and http targets remain available (inherited from autoresearch-rl) but campaign configs default to target: basilica.

target:
  type: basilica
  image: pytorch:2.4.1-cuda12.4
  gpu_count: 1
  gpu_models: [A100, H100]
  memory: 32Gi

The two scripts

Every campaign has two scripts connected by the filesystem, never by imports:

prepare.py (frozen) — runs once via prepare_cmd. Produces data shards, defines the probe-eval pipeline and collapse-detection callbacks. The LLM cannot modify this file. Trust boundary: evaluation integrity is guaranteed by freezing it.

train.py (mutable) — runs each iteration. Reads prepared data, trains the JEPA model (Φc context encoder + Φt EMA target encoder + Ψ predictor), prints metrics to stdout via emit_progress. The LLM proposes diffs in llm_diff or hybrid mode.
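As a rough illustration of how the EMA target encoder Φt relates to the context encoder Φc, the sketch below applies an exponential moving average to a dictionary of weights. In typical JEPA setups, gradients update only the context encoder and predictor, while the target encoder follows by EMA. This is a sketch under assumptions, not code from train.py: `ema_update`, the weight layout, and the default momentum are illustrative.

```python
import numpy as np


def ema_update(target: dict, context: dict, momentum: float = 0.996) -> dict:
    """Return new target weights: momentum * target + (1 - momentum) * context."""
    return {
        name: momentum * target[name] + (1.0 - momentum) * context[name]
        for name in target
    }


# Toy weights: the target starts at zero and drifts toward the context encoder.
context_weights = {"proj": np.ones((2, 2))}
target_weights = {"proj": np.zeros((2, 2))}

step1 = ema_update(target_weights, context_weights, momentum=0.9)
step2 = ema_update(step1, context_weights, momentum=0.9)
```

A momentum close to 1 makes the target change slowly, which keeps the prediction targets stable between iterations.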

Roadmap

See TODO.md for the live phased plan, CHANGELOG.md for releases, and docs/research/ for the cited research corpus. Architecture writeup: gist 2567a53.

The Phase-2 falsifier (CIFAR I-JEPA) was the kill criterion for the framework approach. The Phase-3 falsifier (trace-jepa) was the kill criterion for the application (JEPA-for-LLM-agent-traces). Both were crossed:

  • Phase 2: framework approach validated. LLM-authored diffs were kept by the ratchet on Basilica.
  • Phase 3: probe_auroc = 0.7516 at 5% FPR (kept iter, weights persisted via Git LFS — see v13 evidence in artifacts/trace-jepa/)

Architecture Decisions

This project tracks every load-bearing decision in docs/adr/ — 34 ADRs as of v0.2.0. Start with ADR-001 (why a fork) and ADR-004 (probe AUROC as the campaign objective); skim the index for the full set.

Lineage

FunSearch → AI Scientist v1/v2 → ADAS → AIDE → AlphaEvolve → karpathy/autoresearch → autoresearch-rl → AutoJEPA.

Sibling upstream: ../autoresearch-rl (added as git remote upstream for cherry-pick reference only — capped at ~1h/wk).

Contributing

Pull requests welcome. See CONTRIBUTING.md for the per-area pre-merge checklist; the CLAUDE.md hard rule applies: do not call a feature done without a realistic-config end-to-end run on the same day you wrote it.

Citation

If AutoJEPA helps your research, please cite the architecture writeup:

@misc{pappas2026autojepa,
  title  = {AutoJEPA: Autonomous Design-Space Search over JEPA Pretraining Recipes},
  author = {Pappas, Evangelos},
  year   = {2026},
  howpublished = {\url{https://github.com/epappas/autojepa}},
}

License

MIT. See LICENSE.
