data-centric-ai

Here are 87 public repositories matching this topic...

Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

  • Updated Dec 17, 2025
  • Python

This project focuses on the data side of MLOps: building a simple, reliable pipeline around the NYC Green Taxi dataset. It covers data ingestion, validation, and versioning, with automation through FastAPI, Docker, and GitHub Actions. The goal is to learn how to make data workflows cleaner, more reproducible, and easier to extend toward full ML pipelines.

  • Updated Oct 22, 2025
  • Jupyter Notebook
