cimo-labs

Popular repositories

  1. cje Public

    Causal Judge Evaluation: calibrate LLM-as-judge scores against oracle labels with valid uncertainty.

    Python 43 4

  2. kvault Public

    Agent-first knowledge vault framework for extracting structured knowledge from unstructured data

    Python 4 1

  3. cje-arena-experiments Public

    Python 2

  4. healthbench-judge-audit Public

    Reproducibility artifact for HealthBench judge calibration audit using CJE

    Python

Repositories

Showing 4 of 4 repositories

People

This organization has no public members.
