Stars
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
Preswald is a WASM packager for Python-based interactive data apps: bundle full complex data workflows, particularly visualizations, into single files, runnable completely in-browser, using Pyodide…
A non-validating SQL parser module for Python
Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
A collection of research papers on decision, classification and regression trees with implementations.
👩‍🏫 Advanced NLP with spaCy: A free online course
nannyml: post-deployment data science in Python
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
PMLB: A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms.
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
Words of the same length with related meanings.
Datasets derived from US census data
Python-based GBDT implementation on GPU. Efficient multioutput (multiclass/multilabel/multitask) training
Demo Project for Open Source MDS
An agent orchestration framework for economic agents
Render reproducible examples of Python code for sharing.
A few public recipes for things I've wanted to do and have solved in dagster
Python package for Recentered Influence Function (RIF) regression