Starred repositories
Sentiment Analysis of Tweets in real time
A time-series database for high-performance real-time analytics packaged as a Postgres extension
100+ RAG interview questions with answers.
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
The paper list of "Memory in the Age of AI Agents: A Survey"
This repository contains the notebooks and presentations we use for our Databricks Tech Talks
A repository of data on coronavirus cases and deaths in the U.S.
A free-to-use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
MetricFlow allows you to define, build, and maintain metrics in code.
High-performance automatic differentiation of LLVM and MLIR.
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Knowledge sharing: material about data lakes, data warehouses, and data lakehouses
A highly efficient daemon for streaming data from Kafka into Delta Lake
Lakehouse (Delta Lake, Apache Iceberg & Apache HUDI)
API Framework heavily relying on the power of DuckDB and DuckDB extensions. Ready to build performant and cost-efficient APIs on top of BigQuery or Snowflake for AI Agents and Data Apps
The leader in Customer Data Infrastructure
A fully incremental model that transforms raw web event data generated by the Snowplow JavaScript tracker into a series of derived tables of varying levels of aggregation.
Compare tables within or across databases
dbt package for modeling raw data exported by Google Analytics 4. BigQuery support only.
A collection of Python agent samples built with the Google Agent Development Kit (ADK), demonstrating integrations with services like BigQuery and Vertex AI Search.
Batch data ingestion into Amazon OpenSearch Service using AWS Glue
Databricks framework to validate Data Quality of pySpark DataFrames and Tables