Stars
Polars extension for general data science use cases
A reactive notebook for Python: run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
A curated list of Polars talks, tools, examples & articles. Contributions welcome!
Words of the same length with related meanings.
Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
🦆 A curated list of awesome DuckDB resources
Scikit-learn compatible decision trees beyond those offered in scikit-learn
A non-validating SQL parser module for Python
Learning the practice of Monte Carlo simulations in data science (Research Module in Econometrics and Statistics; Master's/PhD)
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here.
A list of publicly available datasets with real-time data maintained by the team at bytewax.io
Free MLOps course from DataTalks.Club
Website sources for Applied Machine Learning for Tabular Data
Data quality assessment and metadata reporting for data frames and database tables
This is a repo with links to everything you'd ever want to learn about data engineering
A lightweight version of R Markdown (without using Pandoc or knitr)
An introduction to network analysis and applied graph theory using Python and NetworkX
An R package for working with NCAA Basketball Play-by-Play Data
Get started with dbt in less than 1 minute from `git clone` to `dbt docs serve` for free!