Skip to content
#

data-quality

Here are 28 public repositories matching this topic...

This GitHub repository provides a comprehensive set of tools and algorithms for detecting fraud anomalies in various data sources. Fraudulent activities can have severe consequences, impacting businesses and individuals alike. With this repository, we aim to empower researchers with effective techniques to identify and prevent fraudulent behavior.

  • Updated Aug 16, 2023
  • HTML
DataTrustEngineering

Data Trust Engineering (DTE) is a vendor-neutral, engineering-first approach to building trusted, Data, Analytics and AI-ready data systems. This repo hosts the Manifesto, Patterns, and the Trust Dashboard MVP.

  • Updated Oct 1, 2025
  • HTML

Comprehensive data governance pipeline for SSH honeypot logs—covering data profiling, cleansing, quality assurance, encryption, classification, and GDPR/CCPA/HIPAA compliance. Built with Pandas, Pandera, YData Profiling, and cryptography, with simulated Caesar cipher attacks to demonstrate practical data-security techniques.

  • Updated Jun 23, 2025
  • HTML

Agentic Data Engineering Platform is an open-source, production-ready ETL solution that combines the Medallion Architecture with AI-powered agents that autonomously profile, clean, and optimize your data—so you can focus on insights, not infrastructure.

  • Updated Nov 16, 2025
  • HTML

Improve this page

Add a description, image, and links to the data-quality topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the data-quality topic, visit your repo's landing page and select "manage topics."

Learn more