Abixen Platform is a microservices-based software platform for building enterprise applications. It delivers functionality by creating individual microservices and integrating them through the provided CMS.

Java 687 215 Updated Jun 21, 2022

GoogleCloudDataproc / spark-bigquery-connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

Java 412 219 Updated Dec 16, 2025

raystack / firehose

Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

Java 341 63 Updated Sep 12, 2024

Cascading / cascading

Forked from cwensel/cascading

All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature rich API for defining and executing complex and fault tolerant data processing workflows on vario…

Java 332 110 Updated Nov 29, 2018

ananthdurai / schemata

Schema modelling framework for decentralised domain-driven ownership of data.

Java 259 17 Updated Dec 5, 2023

GoogleCloudPlatform / cloud-bigtable-examples

Examples of how to use Cloud Bigtable, both with GCE MapReduce and with standalone applications.

Java 232 225 Updated Dec 2, 2025

GoogleCloudPlatform / bigquery-data-lineage

Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.

Java 148 40 Updated Jun 3, 2024

googleapis / java-bigquery

Java 123 129 Updated Dec 19, 2025

GoogleCloudPlatform / bigquery-antipattern-recognition

Utility to identify and rewrite common anti-patterns in BigQuery SQL syntax

Java 113 31 Updated Aug 26, 2025

cata-network / cata_database

CATA.Search. Blockchain database, cata metadata query

Java 105 Updated Aug 19, 2021

GoogleCloudPlatform / dlp-dataflow-deidentification

Multi-cloud data tokenization solution using Dataflow and Cloud DLP

Java 95 49 Updated Aug 13, 2024

ExpediaGroup / circus-train

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.

Java 91 16 Updated Mar 5, 2024

googleapis / java-bigquerystorage

Java 74 88 Updated Dec 19, 2025

StreakYC / mache

Java App Engine -> BigQuery log export framework

Java 70 11 Updated Nov 14, 2023

GoogleCloudPlatform / bigquery-etl-dataflow-sample

Java 66 45 Updated Aug 16, 2024

borjavb / bq-lineage-tool

BigQuery Column Lineage parser

Java 65 6 Updated Aug 25, 2024

GoogleCloudPlatform / bq-pii-classifier

Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level.

Java 61 18 Updated Dec 17, 2025

squashql / squashql

Official repository of SquashQL, the SQL query engine for multi-dimensional and hierarchical analysis that empowers your SQL database

Java 60 9 Updated Nov 17, 2025

yu-iskw / bigquery-to-datastore

Export a whole BigQuery table to Google Datastore with Apache Beam/Google Dataflow

Java 58 17 Updated Oct 12, 2020

data-integrations / google-cloud

A collection of Google Cloud Platform (GCP) plugins

Java 49 86 Updated Oct 28, 2025

GoogleCloudPlatform / zetasql-toolkit

The ZetaSQL Toolkit is a library that helps users use the ZetaSQL Java API to perform SQL analysis for multiple query engines, including BigQuery and Cloud Spanner.

Java 41 8 Updated Oct 28, 2025
