Tags: Tracer-Cloud/opensre
feat(bench): structured-outputs predictor + overfit controls (#2794)
* full experiment package (revert + new variant + overfit controls)
* added overfit into bench framework
* fix(bench): address greptile review on structured-outputs PR
* fixed A/A variant issue
* fixed float division
* the same experiment but for N=100

fix(bench): experiments: false-healthy guard with plumbing and bridge contract tests, floor 0 tools; refactoring (#2770)
* fix(bench): false-healthy guard + plumbing + bridge contract tests
* added B2-fire stats column
* fix(bench): gate corpus-required false-healthy tests + B2 validation config
* revert false health as it did not work
* config for openai comparison
* bench openai config for floor 0
* added config for experiment with floor 0 of tools
* fix(bench): add min_tool_calls config field + CLI override
* chore(bench): move configs/ into cloudopsbench/configs/
* move configs/ + cloudopsbench AWS docs into cloudopsbench/

fix(bench): dispatcher singleton + vocab snap + object_a1 + llm_alone control arm (#2759)
* report consistency-selected A@1, fixed dispatcher bug, and object_a1, object_a3 not emitted in per-case metrics, added floor sweep and pure llm, added MTTI metric, fixed MODEL_CONTEXT_WINDOWS, handled oversized prompt, running experiment, set default min tool calls to 5 (was 8)

chore(deps): bump huggingface-hub from 1.15.0 to 1.17.0 (#2727)
Bumps [huggingface-hub](https://github.com/huggingface/huggingface_hub) from 1.15.0 to 1.17.0.
- [Release notes](https://github.com/huggingface/huggingface_hub/releases)
- [Commits](huggingface/huggingface_hub@v1.15.0...v1.17.0)
---
updated-dependencies:
- dependency-name: huggingface-hub
  dependency-version: 1.17.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>