Tags: KevinRabun/judges

v3.129.9

release: v3.129.9 — OpenSSF CVE L2 evaluation pipeline, 30 real-world CVE cases prepared

v3.129.8

release: v3.129.8 — focused security+correctness judge set for external benchmarks, cap candidates at 5

v3.129.7

release: v3.129.7 — adaptive content-based judge selection for external benchmarks

v3.129.6

release: v3.129.6 — cap candidates at top 8 by severity for honest Martian scoring

v3.129.5

release: v3.129.5 — Martian semantic judge + export-to-martian script

v3.129.4

release: v3.129.4 — code review mode prompt directive for external benchmarks

v3.129.3

release: v3.129.3 — broad judge selection for external benchmarks

v3.129.2

release: v3.129.2 — benchmark output isolation per suite

v3.129.1

release: v3.129.1 — Martian benchmark L2 accuracy, full diff context, codify-amendments auto-clear

v3.129.0

release: v3.129.0 — external benchmark registry, OpenSSF CVE + Martian Code Review adapters, LLM eval pipeline