Stars
All the links, books, and creators you need to follow to stay up to date with AI!
An extremely fast Python package and project manager, written in Rust.
Examples from Holden's intro to PySpark workshop. This is an intro level workshop focused on using Spark with Python.
This is a public repository to go over all the LLM-driven data engineering concepts.
All Algorithms implemented in Python
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
Code for DE101 book at https://de101.startdataengineering.com/
Databricks Data Engineer Associate Certification Lab: End-to-end hands-on project covering Auto Loader, Medallion Architecture, SCD Type 2, Unity Catalog governance, and Databricks Jobs orchestrati…
This repository helps teach people how to correctly define and create cumulative tables!
Data Modeling Guide: ClickHouse ETL with NOAA Weather Data Example
This repository goes over how to handle massive variety in data engineering
42 Lisboa | This is a summary of all the work I did during 42 Piscine
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
LLM Zoomcamp - a free online course about real-life applications of LLMs. In 10 weeks you will learn how to build an AI system that answers questions about your knowledge base.
This repo has all the resources you need to become an amazing analytics engineer!
A collection of projects showcasing RAG, agents, workflows, and other AI use cases
Collection of Snowflake Notebook demos, tutorials, and examples
This repo contains a collection of Streamlit in Snowflake demos, tutorials, and examples
A lightweight data processing framework built on DuckDB and 3FS.