Toloka

Data labeling platform for ML

Pinned

  1. beemo (Public)

    Benchmark for fine-grained machine-generated text detection. It contains 6.5k texts written by humans, generated by ten open-source instruction-finetuned LLMs, and edited by expert annotators.

    7 stars, 1 fork

  2. u-math (Public)

    Official evaluation code for the U-MATH and μ-MATH benchmarks. These datasets are designed to test the mathematical reasoning and meta-evaluation capabilities of LLMs on university-level problems.

    Python, 9 stars, 3 forks

  3. crowd-kit (Public)

    Control the quality of your labeled data with the Python tools you already know.

    Python, 233 stars, 19 forks

Repositories

Showing 10 of 29 repositories