Teanlouise/shared-world-data

sw_data

Overview | Develop | Deploy | Data

This section covers the Big Data aspects of the shared-world project. The workflow includes:

sw_data_workflow

Getting Started

(1) Set up Dataproc cluster

sw_data_setup_1

(2) Enable BigQuery API

sw_data_setup_2

(3) Dataproc Cluster for notebooks

  • Create cluster on Dataproc
  • Enable component gateway
  • Add path to bucket for outputs
  • Add the Anaconda, Jupyter and Zeppelin notebook components
  • Once the cluster is created, go to the Web Interface tab and select a notebook
  • Once the notebook is open, select the desired kernel

sw_data_setup_3
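
The console steps in (3) can also be approximated from the command line. The following is a minimal sketch, not the project's own setup script: it only assembles a `gcloud dataproc clusters create` command that enables the component gateway, sets a bucket, and requests the Anaconda, Jupyter and Zeppelin optional components. The function name, cluster name, region and bucket name are hypothetical placeholders, and it assumes that the "path to bucket for outputs" corresponds to the `--bucket` flag. The Anaconda component may not be available on every Dataproc image version.

```python
import shlex


def build_create_cluster_command(
    cluster_name,
    region,
    bucket,
    components=("ANACONDA", "JUPYTER", "ZEPPELIN"),
):
    """Return the gcloud argument list for a notebook-ready Dataproc cluster.

    All argument values are supplied by the caller; nothing here is taken
    from the shared-world project itself.
    """
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        # Exposes the notebook web interfaces through the component gateway.
        "--enable-component-gateway",
        # Assumption: the bucket used for outputs is passed as the cluster bucket.
        f"--bucket={bucket}",
        # Optional components providing the notebooks and their kernels.
        f"--optional-components={','.join(components)}",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    command = build_create_cluster_command(
        "sw-data-notebooks", "us-central1", "sw-data-outputs"
    )
    # Print the command so it can be reviewed before running it in a shell.
    print(shlex.join(command))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere. To actually create the cluster, the printed line could be pasted into a shell where the Cloud SDK is installed and authenticated.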