Data file examples and user guides for VerityPy and VerityDotNet libraries
Agentic Data Engineering Platform is an open-source, production-ready ETL solution that combines the Medallion Architecture with AI-powered agents that autonomously profile, clean, and optimize your data—so you can focus on insights, not infrastructure.
Detecting errors and anomalies in structured data using automation
End-to-End Data Engineering Pipeline for E-commerce Analytics.
FIMUS imputes numerical and categorical missing values by using a data set's existing patterns, including co-appearances of attribute values, correlations among attributes, and similarity of values within an attribute.
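For illustration only, a minimal pattern-based imputation sketch in Python (pandas); it is a simplified stand-in for the co-appearance idea, not the FIMUS algorithm itself, and the data and column names are hypothetical:

import pandas as pd

# Toy data with one missing categorical value (columns are hypothetical).
df = pd.DataFrame({
    "city":    ["Paris", "Paris", "Lyon", "Paris"],
    "product": ["A", "A", "B", None],
})

# Fill a missing "product" with the value that most often co-appears
# with the record's "city" value (a crude echo of pattern-based imputation).
def fill_by_coappearance(row):
    if pd.isna(row["product"]):
        peers = df.loc[df["city"] == row["city"], "product"].dropna()
        return peers.mode().iloc[0] if not peers.empty else None
    return row["product"]

df["product"] = df.apply(fill_by_coappearance, axis=1)
print(df)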
A complete data mining project proposal and methodological blueprint for predicting the link between hypertension and psychopathologies in geriatric care, following the CRISP-DM framework.
Data Migration Quality Framework - A robust ETL pipeline with advanced anomaly detection for ensuring data quality during migrations
Results of the data analysis from the Stuttgarter Zeitung's 2018 Feinstaub (particulate matter) hackathon
LEILA - Data quality library
A comprehensive repository housing a collection of insightful blog posts, in-depth documentation, and resources exploring various facets of data engineering, from ETL processes and database management to orchestration tools, data quality, monitoring, and deployment strategies.
TellMeQuality is a tool for measuring Data Quality according to ISO/IEC 25024.
This GitHub repository provides a comprehensive set of tools and algorithms for detecting fraud anomalies in various data sources. Fraudulent activities can have severe consequences, impacting businesses and individuals alike. With this repository, we aim to empower researchers with effective techniques to identify and prevent fraudulent behavior.
Aims to describe age- and gender-unbiased COVID-19 subphenotypes of severity patterns through a two-stage clustering approach using patient phenotypes and demographic features. Additional source and temporal variability assessments are included as part of the data quality analyses.
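As a loose illustration of the two-stage idea only (an assumption-laden sketch, not the study's actual pipeline; the feature matrices and cluster counts are made up), clustering first on clinical phenotypes and then sub-clustering each group on demographics could look like:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 200 patients, 5 phenotype features, 2 demographic features.
rng = np.random.default_rng(0)
phenotypes = rng.normal(size=(200, 5))
demographics = rng.normal(size=(200, 2))   # e.g. age and encoded gender

# Stage 1: cluster on clinical phenotypes only.
stage1 = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(phenotypes))

# Stage 2: sub-cluster each phenotype group on demographics to check
# whether the severity groupings are driven by age or gender.
for c in np.unique(stage1):
    sub = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        StandardScaler().fit_transform(demographics[stage1 == c]))
    print(f"phenotype cluster {c}: demographic sub-group sizes", np.bincount(sub))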
Data Trust Engineering (DTE) is a vendor-neutral, engineering-first approach to building trusted, analytics- and AI-ready data systems. This repo hosts the Manifesto, Patterns, and the Trust Dashboard MVP.
Comprehensive data governance pipeline for SSH honeypot logs—covering data profiling, cleansing, quality assurance, encryption, classification, and GDPR/CCPA/HIPAA compliance. Built with Pandas, Pandera, YData Profiling, and cryptography, with simulated Caesar cipher attacks to demonstrate practical data-security techniques.
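A minimal sketch of the Pandera validation step such a pipeline might use, assuming a hypothetical log layout (the column names, regex, and checks are illustrative, not the repo's actual schema):

import pandas as pd
import pandera as pa

# Hypothetical honeypot-log schema; columns and checks are illustrative only.
schema = pa.DataFrameSchema({
    "src_ip":   pa.Column(str, pa.Check.str_matches(r"^\d{1,3}(\.\d{1,3}){3}$")),
    "username": pa.Column(str, nullable=True),
    "attempts": pa.Column(int, pa.Check.ge(0)),
})

logs = pd.DataFrame({
    "src_ip":   ["203.0.113.7", "198.51.100.2"],
    "username": ["root", "admin"],
    "attempts": [3, 17],
})

validated = schema.validate(logs)   # raises SchemaError on violations
print(validated)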
A web application for displaying automation test reports.
Just what you expect from your electricity grid data