# data-generation

Here are 144 public repositories matching this topic...

datamimic

🧠 Model-driven synthetic test data for CI/CD and analytics - deterministic, privacy-preserving, and domain-aware. Includes Python APIs, XML pipelines, and MCP/IDE integration to orchestrate realistic datasets for finance, healthcare, and other regulated environments.

  • Updated Nov 9, 2025
  • Python

🚀 AI-powered synthetic data generator that creates educational flowcharts and diagrams using LangGraph workflows. Features FastAPI integration, OpenAI LLM processing, and automated Mermaid diagram generation with iterative quality improvement through reflection patterns.

  • Updated Nov 2, 2025
  • Python

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated or synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.

  • Updated Nov 6, 2025
  • Python
