Fetch and analyze your Fitbit health data. The project automatically pulls data from the Fitbit API, backs it up to AWS S3 as gzipped JSON, and processes it into Parquet format for efficient analysis with Jupyter notebooks. All automated via a cron job that runs once a day. It's your personal health data pipeline! 🚀
- [Fitbit Web API reference](https://dev.fitbit.com/build/reference/web-api/)
- [Fitbit help and support](https://support.google.com/fitbit/#topic=14236398)
- Data Collection - The `fitbit2s3.py` script runs daily via cron, fetching:
  - Heart rate (daily summary and intraday)
  - Sleep data (levels, stages, and summaries)
  - Steps (daily and intraday)
  - Activities and GPS data
  - Daily summary metrics
- Cloud Backup - Data is compressed (gzip) and uploaded to AWS S3 (see the fetch-and-upload sketch after this list)
- Local Processing - Use `data_tools/update_fitbit_data.sh` to download and convert JSON to Parquet format
- Analysis - Open Jupyter notebooks for exploratory data analysis and visualization
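For orientation, here is a minimal sketch of the fetch-and-backup step. It is not the project's `fitbit2s3.py`: the access token handling, bucket name, and S3 key layout are illustrative assumptions, but the endpoint is the standard Fitbit Web API intraday heart rate resource and the gzip-then-upload pattern matches the workflow above.

```python
# Minimal sketch (not the real fitbit2s3.py): fetch one day of Fitbit data and
# back it up to S3 as gzipped JSON. Access token, bucket name, and key layout
# are placeholders.
import gzip
import json
from datetime import date

import boto3
import requests

ACCESS_TOKEN = "..."             # obtained via the Fitbit OAuth2 flow
S3_BUCKET = "my-fitbit-backup"   # hypothetical bucket name
today = date.today().isoformat()

# Intraday heart rate for today at 1-minute resolution
resp = requests.get(
    f"https://api.fitbit.com/1/user/-/activities/heart/date/{today}/1d/1min.json",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
payload = resp.json()

# Compress the JSON payload and upload it to S3
s3 = boto3.client("s3")
s3.put_object(
    Bucket=S3_BUCKET,
    Key=f"fitbit/heartrate_intraday/{today}.json.gz",
    Body=gzip.compress(json.dumps(payload).encode("utf-8")),
    ContentType="application/json",
    ContentEncoding="gzip",
)
```

The actual script covers all of the data types listed above; this sketch only shows the shape of a single request and upload.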
Just activate the venv and run JupyterLab:

source cw_venv/bin/activate
jupyter lab

To check your JupyterLab version, or to upgrade it:

jupyter lab --version
pip install --upgrade jupyterlab

To set up the environment from scratch:

python3 -m venv cw_venv
source cw_venv/bin/activate
pip install -r requirements.txt

Make sure you've registered an app on the Fitbit developer portal and have:
- `CLIENT_ID` - Your Fitbit app client ID
- `CLIENT_SECRET` - Your Fitbit app client secret
- `TOKEN_FILE_PATH` - Path to store access/refresh tokens
- `FITBIT_LANGUAGE` - Language setting (default: en_US)
- `DEVICENAME` - Your Fitbit device name (e.g., PixelWatch3)
Configure AWS credentials for S3 backup:
- `AWS_ACCESS_KEY_ID` - Your AWS access key
- `AWS_SECRET_ACCESS_KEY` - Your AWS secret key
- `S3_BUCKET_NAME` - S3 bucket for data storage
- `AWS_REGION` - AWS region (if needed)
- `FITBIT_LOG_FILE_PATH` - Path for log file output
All credentials should be stored securely in a .env file (not committed to git).
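As a sketch of how these settings fit together, the snippet below loads the `.env` values and refreshes the Fitbit OAuth2 token pair stored at `TOKEN_FILE_PATH`. The use of `python-dotenv` and a JSON-formatted token file are assumptions for illustration; the refresh request itself follows Fitbit's documented OAuth2 flow.

```python
# Hedged sketch: load credentials from .env and refresh the Fitbit tokens.
# python-dotenv and a JSON-formatted token file are assumptions, not facts
# about this repository.
import json
import os

import requests
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment

token_path = os.environ["TOKEN_FILE_PATH"]
with open(token_path) as fh:
    tokens = json.load(fh)

# Fitbit's standard OAuth2 refresh flow: Basic auth with client id/secret,
# grant_type=refresh_token in the form body.
resp = requests.post(
    "https://api.fitbit.com/oauth2/token",
    auth=(os.environ["CLIENT_ID"], os.environ["CLIENT_SECRET"]),
    data={"grant_type": "refresh_token", "refresh_token": tokens["refresh_token"]},
    timeout=30,
)
resp.raise_for_status()

# Persist the rotated access/refresh tokens for the next cron run
with open(token_path, "w") as fh:
    json.dump(resp.json(), fh)
```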
On the server (dobox), set proper permissions:
chmod 755 run_fitbit2s3.sh
chown root:root run_fitbit2s3.sh

The script includes email notifications on failures and comprehensive error logging.
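The actual error handling lives in `run_fitbit2s3.sh` and `fitbit2s3.py`; the snippet below only illustrates the general pattern of file logging via `FITBIT_LOG_FILE_PATH` plus an email alert on failure, with placeholder SMTP host and addresses.

```python
# Illustrative only: file logging plus an email alert on failure.
# The real error handling is implemented in run_fitbit2s3.sh / fitbit2s3.py.
import logging
import os
import smtplib
from email.message import EmailMessage

logging.basicConfig(
    filename=os.environ.get("FITBIT_LOG_FILE_PATH", "fitbit2s3.log"),
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def notify_failure(error: Exception) -> None:
    """Send a short email when the nightly job fails (addresses are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = "fitbit2s3 failed"
    msg["From"] = "alerts@example.com"
    msg["To"] = "me@example.com"
    msg.set_content(f"The Fitbit backup job failed: {error}")
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)
```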
cromWell/
├── fitbit2s3.py # Main script to fetch Fitbit data and upload to S3
├── run_fitbit2s3.sh # Cron job wrapper script with error handling
├── requirements.txt # Python dependencies
├── data/ # Local Parquet data storage
│ ├── daily_summaries.parquet
│ ├── gps.parquet
│ ├── sleep_levels.parquet
│ ├── heartrate_intraday/ # Intraday heart rate data
│ └── steps_intraday/ # Intraday steps data
├── data_tools/ # Data management utilities
│ ├── split_parquet.py # Split large Parquet files
│ ├── sync_from_s3.py # Sync data from S3 to local
│ ├── update_parquet_lowmem.py # Memory-efficient incremental updates
│ └── update_fitbit_data.sh # Update script
├── notebooks/ # Jupyter notebooks for analysis
│ ├── SLEEP_ANALYSIS.ipynb # Sleep data analysis
│ ├── Activities_Refine.ipynb
│ ├── Sleep_Redux.ipynb
│ └── functions/ # Helper functions for notebooks
│ ├── load_data.py # Data loading utilities
│ └── sleep/ # Sleep analysis helpers
│ └── sleep_helpers.py
└── docs/ # Documentation files
├── fitbit-tokens.txt
└── sleep-values.txt
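The notebooks work directly from the Parquet files under `data/`. A minimal loading example, with file paths taken from the layout above (the notebooks' own helpers in `notebooks/functions/load_data.py` may do more):

```python
# Load the local Parquet data into pandas for analysis in a notebook.
# File and directory names match the data/ layout above.
import pandas as pd

daily = pd.read_parquet("data/daily_summaries.parquet")
sleep = pd.read_parquet("data/sleep_levels.parquet")

# Intraday data is a directory of Parquet files; pandas (via pyarrow) reads
# the whole directory as one dataset.
heartrate = pd.read_parquet("data/heartrate_intraday/")

daily.describe()
```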
- Automated Data Collection: Daily cron job fetches data from Fitbit API
- Cloud Backup: Gzipped JSON files stored in AWS S3
- Efficient Storage: Parquet format for fast querying and analysis
- Memory-Efficient Updates: Incremental data updates without loading entire datasets
- Comprehensive Analysis: Jupyter notebooks for sleep, activity, and health metrics
- Error Handling: Email notifications on job failures
The data_tools/ directory contains utilities for managing your Fitbit data.
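As an illustration of what a memory-efficient incremental update can look like (the actual logic in `update_parquet_lowmem.py` may differ, and the column and file names below are hypothetical): read only a timestamp column from the existing data to find where to resume, then write just the new rows as an additional file in the intraday directory.

```python
# Hypothetical sketch of a memory-efficient incremental update; the real logic
# lives in data_tools/update_parquet_lowmem.py and may differ. Column and file
# names here are illustrative.
import pandas as pd
import pyarrow.parquet as pq

existing_file = "data/steps_intraday/steps_2025.parquet"  # hypothetical file

# Read only the timestamp column to find the last row already stored,
# instead of loading the whole dataset into memory.
last_seen = (
    pq.read_table(existing_file, columns=["datetime"])["datetime"].to_pandas().max()
)

# new_rows would come from freshly downloaded JSON; a tiny stand-in here.
new_rows = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["2025-01-02 00:00", "2025-01-02 00:01"]),
        "steps": [12, 30],
    }
)

# Keep only rows newer than what is stored and write them as a separate file
# in the same directory, so readers of the directory see the combined data.
fresh = new_rows[new_rows["datetime"] > last_seen]
fresh.to_parquet("data/steps_intraday/steps_increment.parquet", index=False)
```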
🌍 followCrom: followcrom.com 🌐
📫 followCrom: get in touch 👋