Stars
Run MapReduce jobs on Hadoop or Amazon Web Services
PlanOut is a library and interpreter for designing online experiments.
Python wrapper for Google Maps JavaScript API V3 and Google Earth API.
Ranking and ordinal regression algorithms in Python
Pandas integration with sklearn
Tutorial on scikit-learn and IPython for parallel machine learning
Official content for Harvard CS109
Effective June 1, 2021: Phabricator is no longer actively maintained.
The official home of the Presto distributed SQL query engine for big data
Discover interests from check-in data in social networks
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis