The aim of this repository is twofold: to provide tools for heavy data crunching (deep statistical analyses, machine learning methods refactored into DBT Jinja SQL, etc.) and to collect big data best practices with DBT (cleaners that can be triggered to keep your datasets unpolluted, metadata crawlers for BigQuery, etc.).
Currently focused on GCP work with BigQuery. Support for the mission and PRs are also welcome.
Maybe one day this can be turned into an installable DBT package.
Data processing macros will be developed using dummy CSVs as DBT seeds, then run against massive columns. Processed row counts and computing times will be added to the documentation.
Currently working with Python 3.11.9. DBT/SQL libraries are listed in requirements.txt.
- Macros for data processing will be tested using CSVs as seeds to create both the input and the expected output.
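As an illustration of this seed-based pattern, a singular test could diff the macro's output against an expected seed (the file path, seed names, and the `clean_text` macro below are hypothetical, chosen only to show the shape of such a test):

```sql
-- tests/test_clean_text.sql (hypothetical): applies the macro under test
-- to the input seed and diffs the result against the expected seed.
-- The test passes when this query returns zero rows.
select {{ clean_text('raw_value') }} as cleaned
from {{ ref('clean_text_input') }}
except distinct
select cleaned
from {{ ref('clean_text_expected') }}
```

`except distinct` is BigQuery syntax, which matches the repository's current GCP focus.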
Description:
Helps keep the BigQuery environment clean and organized. Automatically removes redundant objects in BigQuery (tables that are no longer needed, old versions of renamed tables that still exist, etc.)
Path:
macros/utils/bq_cleaner.sql
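To give an idea of how such a cleaner can work, here is an illustrative sketch (this is NOT the actual contents of `bq_cleaner.sql`; the macro name, signature, and orphan-detection logic are all assumptions) that drops tables in a dataset that no longer correspond to any model in the DBT graph:

```sql
-- Illustrative sketch only, not the repository's actual implementation.
-- Drops tables in a BigQuery dataset that do not match any model name
-- in the current DBT graph (macro name and logic are assumptions).
{% macro drop_orphan_tables(dataset) %}
    {% if execute %}
        {% set model_names = graph.nodes.values() | map(attribute='name') | list %}
        {% set tables_query %}
            select table_name
            from `{{ target.project }}.{{ dataset }}.INFORMATION_SCHEMA.TABLES`
        {% endset %}
        {% for row in run_query(tables_query).rows %}
            {% if row['table_name'] not in model_names %}
                {# table exists in BigQuery but not in the project graph #}
                {% do run_query("drop table if exists `" ~ target.project ~ "." ~ dataset ~ "." ~ row['table_name'] ~ "`") %}
            {% endif %}
        {% endfor %}
    {% endif %}
{% endmacro %}
```

A macro like this would typically be invoked as an operation, e.g. `dbt run-operation drop_orphan_tables --args '{dataset: my_dataset}'`.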
...
...
⚒️ In progress
String Occurrence Count
📋 TODO
If there is a specific functionality that you would like to see covered with DBT, contact me.
Support and PRs are also welcome.
TF-IDF
Min-max Scaler
Z-score Scaler
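As a sketch of what these scaler macros might look like once implemented (the macro name and signature below are assumptions, not a committed interface), a z-score scaler standardizes a column to mean 0 and standard deviation 1:

```sql
-- Hypothetical z-score scaler: (x - mean) / stddev over the whole relation.
-- safe_divide (BigQuery) returns NULL instead of erroring when stddev is 0.
{% macro z_score_scale(column) %}
    safe_divide(
        {{ column }} - avg({{ column }}) over (),
        stddev({{ column }}) over ()
    )
{% endmacro %}
```

It would be used inline in a model, e.g. `select {{ z_score_scale('price') }} as price_z from {{ ref('prices') }}`.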