Stars
This is a repo with links to everything you'd ever want to learn about data engineering
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026.
A helpful 5-page machine learning cheatsheet to assist with exam reviews, interview prep, and anything in-between.
Minimal and clean examples of machine learning algorithm implementations
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
A collection of machine learning examples and tutorials.
WIP: Roadmap to becoming a machine learning engineer in 2020
Roadmap to becoming a data engineer in 2021
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Python and Lua tutorials
Grid studio is a web-based application for data science with full integration of open source data science frameworks and languages.
TensorFlow's Visualization Toolkit
Machine Learning and Python for Beginners. This repo will contain all the materials for a 6-7 week course I shall teach to non-ML experts at Harris Manchester College, University of Oxford.
A better notebook for Scala (and more)
This is the repo for the Data Engineering - Cloud Computing and Manage AI Services course delivered at the Central European University (ceu.edu)
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani, 2013): Python code
A simple and efficient tool to parallelize Pandas operations on all available CPUs
An implementation of some of the tools used by the winner of the box plots competition using scikit-learn.
Companion webpage to the book "Mathematics For Machine Learning"
Checks for the Datadog Agent that Stripe finds useful.
A data visualization curriculum of interactive notebooks.
Notebooks and other files from the workshops.