A production-ready machine learning service for forecasting Air Quality Index (AQI) values using gradient boosting regression.
- Real-time Data: Fetches live data from the Zephra API
- ML Model: Gradient Boosting Regressor with robust error handling
- Forecasting: Single-hour and 24-hour AQI predictions
- REST API: FastAPI with CORS support and health checks
- Docker Ready: Containerized for easy deployment
- Render Deploy: One-click deployment to the Render platform
```
zephra-ml/
├── data/               # (optional) store local datasets if needed
├── models/
│   └── gbr_model.pkl   # trained model
├── src/
│   ├── data_loader.py  # fetch from /api/dashboard
│   ├── train.py        # training + save model
│   └── forecast.py     # 24h recursive forecast
├── api/
│   └── app.py          # FastAPI/Flask app
├── requirements.txt    # dependencies
├── Dockerfile          # for deployment
├── run.sh              # deployment script
└── README.md           # usage instructions
```
- Python 3.9+
- pip
- Clone or download the project
- Install dependencies:

```bash
pip install -r requirements.txt
```
For Windows:

```bash
.\run.bat
```

For Linux/Mac:

```bash
./run.sh
```

This will automatically:
- Install all dependencies
- Train the model with fresh data from the Zephra API
- Start the API server on http://localhost:8080
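Once the server is up, its JSON responses can be checked from Python. The sketch below assumes the response shape shown later in this README (a `next_hour_AQI` field) and the default port 8080; the `/predict` path is hypothetical, so substitute the actual route defined in `api/app.py`.

```python
import json
from urllib.request import urlopen

def parse_next_hour_aqi(payload: str) -> float:
    """Extract the next-hour AQI value from the service's JSON response."""
    return float(json.loads(payload)["next_hour_AQI"])

def fetch_next_hour_aqi(base_url: str = "http://localhost:8080") -> float:
    # "/predict" is a hypothetical path; use the actual route from api/app.py.
    with urlopen(f"{base_url}/predict") as resp:
        return parse_next_hour_aqi(resp.read().decode())

if __name__ == "__main__":
    print(parse_next_hour_aqi('{"next_hour_AQI": 85.3}'))  # → 85.3
```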
Test the API:

```bash
python test_api.py
```

First, train the model with the latest data:

```bash
python src/train.py
```

This will:
- Fetch data from the Zephra API
- Train a Gradient Boosting Regressor
- Save the model to `models/gbr_model.pkl`
- Display the training RMSE
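The training step can be sketched as below. This is a simplified stand-in for `src/train.py`: the synthetic DataFrame replaces the live Zephra data, but the hyperparameters match those listed in the Model Details section.

```python
import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Zephra dashboard data (columns are hypothetical).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "pm25": rng.uniform(5, 150, 500),
    "pm10": rng.uniform(10, 200, 500),
    "no2": rng.uniform(5, 80, 500),
})
df["AQI"] = 0.6 * df["pm25"] + 0.3 * df["pm10"] + rng.normal(0, 5, 500)

X, y = df.drop(columns=["AQI"]), df["AQI"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters from the Model Details section.
model = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=5, subsample=0.8
)
model.fit(X_train, y_train)

rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
print(f"Test RMSE: {rmse:.2f}")

joblib.dump(model, "gbr_model.pkl")  # train.py saves to models/gbr_model.pkl
```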
Run the API server directly:

```bash
uvicorn api.app:app --host 0.0.0.0 --port 8080 --reload
```

Or build and run with Docker:

```bash
docker build -t zephra-ml .
docker run -p 8080:8080 zephra-ml
```

Alternatively, use the provided script:

```bash
./run.sh
```

Returns API information and the service version.
Returns AQI prediction for the next hour.
Response:

```json
{
  "next_hour_AQI": 85.3
}
```

Returns a 24-hour AQI forecast.
Response:

```json
{
  "forecast_24h": [85.3, 87.1, 89.2, ...]
}
```

- Algorithm: Gradient Boosting Regressor
- Parameters:
- n_estimators: 200
- learning_rate: 0.05
- max_depth: 5
- subsample: 0.8
- Features: All columns except AQI, timestamp, and location
- Target: AQI values
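The feature/target split described above can be sketched with pandas. The pollutant and weather column names here are hypothetical; only `AQI`, `timestamp`, and `location` come from the description above.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["2024-01-01T00:00", "2024-01-01T01:00"],
    "location": ["city-a", "city-a"],
    "pm25": [42.0, 55.0],      # hypothetical pollutant column
    "humidity": [60.0, 58.0],  # hypothetical weather column
    "AQI": [85.0, 97.0],
})

# Features: every column except AQI, timestamp, and location.
X = df.drop(columns=["AQI", "timestamp", "location"])
y = df["AQI"]  # target

print(list(X.columns))  # → ['pm25', 'humidity']
```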
- Push code to your repository
- Connect to Render/Heroku
- Use the `run.sh` script as your start command
- Set environment variables if needed
```bash
docker build -t zephra-ml .
docker run -p 8080:8080 zephra-ml
```

The model's performance is evaluated using RMSE (Root Mean Square Error) on the test set. Training metrics are displayed during model training.
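RMSE is the square root of the mean squared difference between predicted and actual AQI values. A worked example:

```python
import math

actual = [80.0, 90.0, 100.0]
predicted = [82.0, 88.0, 103.0]

# RMSE = sqrt(mean((actual - predicted)^2))
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
rmse = math.sqrt(mse)
print(round(rmse, 3))  # → 2.38
```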
- Data Source: https://zephra.onrender.com/api/dashboard
- Model Storage: `models/gbr_model.pkl`
- API Port: 8080
- Host: 0.0.0.0 (for Docker compatibility)
- The model assumes the last feature in the dataset is the lag AQI value for recursive forecasting
- Adjust the column names in `data_loader.py` and the training scripts to match your actual API response structure
- The forecast uses recursive prediction, where each prediction becomes input for the next step
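The recursive loop from the notes above can be sketched as follows, with a stub callable standing in for the trained model. It relies on the stated assumption that the last feature is the lag AQI value; `src/forecast.py` may differ in detail.

```python
from typing import Callable, List

def forecast_24h(model_predict: Callable[[List[float]], float],
                 features: List[float]) -> List[float]:
    """Recursively predict 24 hourly AQI values.

    `features` is the latest feature row; its last element is assumed
    to be the lag AQI value. Each prediction is fed back into that slot.
    """
    row = list(features)
    preds = []
    for _ in range(24):
        aqi = model_predict(row)
        preds.append(aqi)
        row[-1] = aqi  # feed the prediction back as the next lag AQI
    return preds

# Stub model: drifts 1% above the lag AQI each hour (illustration only).
forecast = forecast_24h(lambda row: row[-1] * 1.01, [42.0, 55.0, 85.0])
print(len(forecast), round(forecast[0], 2))  # → 24 85.85
```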