Tags: goyaljai/chief-of-staff
v2.0 Phase 1 closed: Foundation (T1 + T2 + C1.5)

Shipped:
- T1: Voyage rerank-2 cross-encoder for retrieval quality boost
- T2: Databricks gte-large-en (1024-dim) embeddings replace HuggingFace MiniLM
- C1.5: eval suite expanded 16 → 40 tasks (28 fast + 12 slow incl. 8 hard)
- Round-5 audit: 4 critical bugs in T1/T2 found and fixed (test_rag_pipeline.py)
- Round-6 audit: 4 more critical bugs (R6-1 silent corruption, R6-2 KeyError, R6-3 rerank floor, R6-4 atomic dual-write), all fixed and tested

Baseline run for v2.0 Phase 1 in flight at tag time (40-task eval against the new T1+T2 stack). Result file lands in eval/results/ when complete.

Tests: 11 unit suites, 13 RAG cases, all green.

Phase 1 of v2.0 is now structurally complete. Phase 2 (B1 mid-stream interrupt) is next.

v1.0: closing snapshot

Final state of the P0 sprint. All 4 phases (1-2-3 + Phase 4/5/6 quick wins) shipped. 29 audit fixes across 4 rounds. 10 unit suites + 60+ assertions all green. Eval baseline: 16/16 PASS frozen.

main is frozen at v1.0. All future work happens on develop only. v2.0 will ship from develop after Phases 1-6 of the closing scope.