PC-060 — Automated submission pipeline: ATLAS-controlled HF middleman (Phase 3) #102

@itigges22

Roadmap entry — Phase 3. Surfaced 2026-05-01 during PC-056 architectural conversation. Project lead's preferred long-term contribution architecture.

Why

PC-059 (manual PR review) is the cheap stepping-stone. PC-060 is the end-state contribution flow, which lowers the barrier to a single CLI invocation while keeping the safety guarantees:

  • Contributors don't need HF accounts or tokens
  • Single source of truth for what's accepted (CI gate)
  • HF token is held by ATLAS infrastructure only — no user-supplied secret leak vectors
  • Auto-merge on green CI makes contribution zero-touch on the maintainer's end

Architecture

user CLI: atlas lens publish gemma-4b
   |
   v
[ATLAS-controlled upload endpoint]
   |
   v
[GHA validation pipeline]
   - safetensors-only constraint (rejects pickle)
   - tensor shape + dim sanity vs declared model
   - load + score against held-out validation set
     (good/bad pairs — verifies Lens isn't poisoned)
   - license + model-card scrape
   |
   v (on pass)
[ATLAS HF org account uploads artifacts]
   |
   v
[Registry merge bot opens + auto-merges PR with HF link]
   |
   v
Downstream: atlas model list shows new entry on next pull
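
The first two validation steps (safetensors-only, shape sanity) can be approximated without loading any weights, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header listing each tensor's dtype and shape. The following is a minimal sketch under stated assumptions, not the pipeline's implementation: the function names, the `MAX_HEADER_BYTES` cap, and the `expected` shape map are all hypothetical, and a real gate would likely use the `safetensors` library itself.

```python
import json
import struct

# Assumed cap on header size; rejects blobs whose first 8 bytes decode to an absurd length.
MAX_HEADER_BYTES = 100 * 1024 * 1024


def read_safetensors_header(blob: bytes) -> dict:
    """Parse the JSON header of a safetensors payload; raise ValueError if malformed."""
    if len(blob) < 8:
        raise ValueError("too short to be safetensors")
    (header_len,) = struct.unpack("<Q", blob[:8])
    if header_len > MAX_HEADER_BYTES or 8 + header_len > len(blob):
        raise ValueError("implausible header length")
    try:
        header = json.loads(blob[8 : 8 + header_len].decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("header is not valid JSON") from exc
    if not isinstance(header, dict):
        raise ValueError("header must be a JSON object")
    return header


def check_shapes(header: dict, expected: dict) -> list:
    """Compare declared tensor shapes with expected ones; an empty list means no problems."""
    problems = []
    # "__metadata__" is the optional free-form metadata key, not a tensor.
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    for name, shape in expected.items():
        if name not in tensors:
            problems.append(f"missing tensor: {name}")
        elif list(tensors[name].get("shape", [])) != list(shape):
            problems.append(f"{name}: shape mismatch")
    for name in sorted(tensors.keys() - expected.keys()):
        problems.append(f"unexpected tensor: {name}")
    return problems
```

A pickle-based checkpoint fails `read_safetensors_header` because its leading bytes do not decode to a plausible header length. This only checks the container format and declared shapes; it says nothing about whether the weights themselves are benign.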

What this requires

  • Upload endpoint (cheap object storage + auth)
  • GHA runners with GPU access (for Lens validation step)
  • ATLAS HF org account + token in CI secrets
  • Held-out validation set (designed alongside PC-058 / PC-059)
  • Registry merge bot (PR auto-merge once CI greens)
  • Revocation flow (pull registry entry + delete from HF if discovered bad post-merge)
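
The revocation flow in the last bullet could be sketched as below. This is an illustration only: the registry is modelled as an in-memory dict, the `hf_repo` field name is hypothetical, and `delete_remote` is an injected callable standing in for whatever actually deletes the hosted artifacts. The entry is pulled from the registry first so that downstream clients stop resolving it before the artifacts disappear.

```python
from typing import Callable, Dict


def revoke(
    registry: Dict[str, dict],
    name: str,
    delete_remote: Callable[[str], None],
) -> dict:
    """Remove `name` from the registry, then delete its hosted artifacts."""
    # Raises KeyError if the entry was never published.
    entry = registry.pop(name)
    # "hf_repo" is an assumed field holding the hosted repository identifier.
    delete_remote(entry["hf_repo"])
    return entry
```

Injecting `delete_remote` keeps the ordering logic testable without network access or a token.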

Trust gate

Static security checks are limited: they cannot detect poisoned weights from the binary alone. The validation set is the actual trust gate: a submission must produce sensible scores on known good/bad pairs before merge. The validation suite should be designed alongside the submission pipeline so that the upload path does not ship before the gate exists.
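
One way to make "sensible scores on known good/bad pairs" concrete is a pairwise accuracy threshold. The sketch below assumes that a higher score means "better", that the submitted Lens is exposed as a `score` callable, and that 0.9 is an acceptable pass mark; none of these is specified in this issue, and the real criteria would come out of the validation suite design.

```python
from typing import Callable, Sequence, Tuple


def passes_pair_gate(
    score: Callable[[str], float],
    pairs: Sequence[Tuple[str, str]],
    min_accuracy: float = 0.9,  # assumed threshold, not a project decision
) -> Tuple[bool, float]:
    """Return (passed, accuracy), where accuracy is the share of pairs ranked good > bad."""
    if not pairs:
        raise ValueError("validation set is empty")
    wins = sum(1 for good, bad in pairs if score(good) > score(bad))
    accuracy = wins / len(pairs)
    return accuracy >= min_accuracy, accuracy
```

Returning the accuracy alongside the verdict lets CI report how far a rejected submission was from the threshold.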

Why Phase 3

  • Real infrastructure cost (storage, GPU runners, secret management)
  • The build is justified only once PC-059 produces enough submission volume for the manual-review friction to be felt
  • Validation suite design itself is a project (likely PC-060.1)

Dependencies

  • PC-059 must land first (volume + validation criteria from manual review experience)
  • PC-058 validation set design (the "good/bad pairs" gate inputs)
