AI course Notebooks and Exercises
Updated Jun 16, 2025 - Jupyter Notebook
Apache Hadoop development environment integrated with Jupyter Notebook using Docker
Local playground for Spark and Jupyter notebooks, plus Iceberg support
Exercise in using the Hadoop Streaming API to compute word counts for Wikipedia articles.
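For readers unfamiliar with the pattern, here is a minimal sketch of a Hadoop Streaming style word count in Python. It is not taken from the repository above: the function names `map_lines` and `reduce_lines` and the tokenizing regex are assumptions made for illustration. Hadoop Streaming passes records through stdin/stdout as tab-separated key/value lines and sorts the mapper output by key before the reducer sees it, which is what the reducer below relies on.

```python
import re
from itertools import groupby

# Assumed tokenizer: lowercase alphanumeric runs (apostrophes allowed).
WORD_RE = re.compile(r"[a-z0-9']+")


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line per word found in the input."""
    for line in lines:
        for word in WORD_RE.findall(line.lower()):
            yield f"{word}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts for each word.

    The input must already be sorted by key, as it is between the
    map and reduce stages of a streaming job.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# In a real streaming job, each stage would live in its own script that
# iterates over sys.stdin and prints every yielded line to stdout.
```

Locally, the shuffle step can be imitated with `sorted(...)`, for example `list(reduce_lines(sorted(map_lines(["Hadoop and Spark"]))))`.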
Hadoop beginner exercise in analyzing European football teams' statistics over the last 20 years. The goal is to determine which team had the highest win percentage.
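The aggregation behind such an exercise can be sketched in plain Python. The record layout here is an assumption, not the repository's actual data format: each record is a hypothetical `(team, result)` pair, where `result` is `"W"`, `"D"`, or `"L"`. In a MapReduce job the per-team counting would happen in the reducer; this sketch only shows the arithmetic.

```python
from collections import defaultdict


def win_percentages(records):
    """Return {team: win percentage} from (team, result) pairs.

    `result` is assumed to be "W", "D", or "L" (hypothetical encoding).
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for team, result in records:
        games[team] += 1
        if result == "W":
            wins[team] += 1
    return {team: 100.0 * wins[team] / games[team] for team in games}


def best_team(records):
    """Return the (team, win percentage) pair with the highest percentage."""
    percentages = win_percentages(records)
    return max(percentages.items(), key=lambda kv: kv[1])
```

Ties are resolved by whichever team `max` encounters first; a real analysis might also require a minimum number of games played.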
Recently updated with 50 new notebooks! Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
A Spark cluster configuration with the Apache Toree notebook
This is the final project I completed for the Big Data Expert Program at U-TAD in September 2017. It uses the following technologies: Apache Spark v2.2.0, Python v2.7.3, Jupyter Notebook (PySpark), HDFS, Hive, Cloudera Impala, Cloudera HUE and Tableau.
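Chongqing University, Fall 2024, Big Data Architecture and Technology course. This repository optimizes and extends the original open-source project provided by the university, with up-to-date tool versions, a simplified environment setup process, and full support for more convenient and efficient development tools such as Jupyter Notebook.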
Hadoop environment with HDFS, Spark, Hue, Jupyter Notebooks, etc., all set up with docker-compose
📓 [Active] Portfolio of data science projects. Using: Python, PyTorch, Spark, TensorFlow, Scikit, Keras. Includes classification, regression, time series, NLP, and deep learning, among others.