🔍 Clean and manage data effortlessly with DataSentry, an open-source platform for real-time and batch processing, featuring smart analytics and compliance tools.
Updated Apr 19, 2026 - Java
🔧 Showcase fail-fast Kafka schema validation in Spring Boot with this demo, preventing broken deployments through contract enforcement and Avro support.
The Metadata Platform for your Data and AI Stack
HiveMQ Edge is an MQTT gateway that enables interoperability between OT devices and IT systems. It translates diverse protocols into MQTT for streamlined communication and helps organize data into a unified namespace, making it easier to manage and stream data across your infrastructure.
Egeria core
qData is an open-source, all-in-one data middle platform that supports core capabilities including data infrastructure, data governance, data development, monitoring & alerting, data services, and data visualization.
Collect, aggregate, and visualize a data ecosystem's metadata
AI-powered metadata management platform with data lineage & governance.
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
A data contract governance system for managing and validating data contracts across distributed services.
Collect, aggregate, and visualize a data ecosystem's metadata
🛡️ An AI-powered, end-to-end data governance agent platform. Supports real-time API interception and batch cleaning of existing database data for sensitive information (PII) and spam. Features configurable policies, dual-engine detection (rules + LLM), end-to-end auditing, and safe write-back with rollback.
Kafka Schema Evolution & Contract Enforcement demo with Avro, Schema Registry and Spring Kafka.
Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow
A multi-language, multi-protocol *open specification* for describing your data protection rules and personally identifiable information as part of your schema
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
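Data-Export exports on-chain data to storage backends suited to big data processing, such as MySQL and ES, making it possible to run complex queries on blockchain data and to analyze, visualize, and process it.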
Classify Confluence pages using existing SKOS and RDFS controlled vocabularies, improve search, build better tables of contents, capture structured data alongside other content and integrate Confluence into knowledge graphs using SPARQL.
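Data-Stash is a data warehouse component built on FISCO-BCOS. It parses a node's binlog to generate a full backup of that node's state, allowing the node to separate hot and cold data and to prune data.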
DataSphere is the first open-source cloud-native data observability platform that helps you trace the whole data infrastructure in your warehouses, lakes and databases.