
CI / Telemetry: Investigate how to add logic in the CI to track the quality of synthetic data generation #516

@courtneypacheco

Description

Background / Context

We currently do not capture or assess the quality of our synthetic data generation (SDG). As a result, we do not know whether the quality of the generated output has improved, regressed, or stayed the same over time, and we currently have no signal indicating that a PR's proposed changes could impact quality.

Desired Outcomes

Going forward, we would like to increase our confidence in making more sweeping changes to SDG without manually testing and inspecting the outputs.

This issue is fairly open-ended, but at a bare minimum, we need to:

  • Capture model quality
  • Find a place to store that quality information
  • Report the quality in an easily-digestible format ("easily-digestible" is subjective though, so best to verify with the maintainers)

Capturing model quality will require some investigation on the assignee's part, and will likely involve direct collaboration with the SDG maintainers. The same applies to reporting the quality: specifically, when, how, and where should it be reported?
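The capture/store/report loop described above could be sketched in CI roughly as follows. This is only an illustrative sketch: the metrics here (uniqueness ratio, mean sample length) and the artifact filename are hypothetical placeholders, and the actual quality definition would need to be agreed upon with the SDG maintainers.

```python
import json


def score_samples(samples):
    """Compute placeholder quality metrics for a batch of generated samples.

    The uniqueness ratio and mean sample length are illustrative stand-ins;
    a real SDG quality score would be defined with the SDG maintainers.
    """
    unique_ratio = len(set(samples)) / len(samples)
    mean_length = sum(len(s) for s in samples) / len(samples)
    return {"unique_ratio": unique_ratio, "mean_length": mean_length}


def write_report(metrics, path="sdg_quality.json"):
    # Write the metrics as JSON so the CI job can upload the file as a
    # build artifact, letting later jobs or dashboards compare runs over time.
    with open(path, "w") as f:
        json.dump(metrics, f, indent=2)


if __name__ == "__main__":
    # Hypothetical batch of generated samples, standing in for real SDG output.
    samples = ["alpha", "beta", "beta", "gamma"]
    metrics = score_samples(samples)
    write_report(metrics)
    print(metrics)
```

A CI step would run this after generation and archive `sdg_quality.json`, so a reviewer (or a later comparison job) can see whether a PR moved the metrics.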

Metadata



Labels

    CI/CD (Affects CI/CD configuration), enhancement (New feature or request), stale
