CloudPulse is an intelligent server resource forecasting and auto-scaling system that leverages machine learning to predict future CPU, memory, and network load patterns. It automatically scales container clusters (Docker/Kubernetes) based on predictive analytics, ensuring optimal resource utilization and preventing performance bottlenecks before they occur.
CloudPulse solves the critical challenge of proactive resource management in modern cloud environments by:
- 🔮 Predictive Scaling: Uses LSTM neural networks and Prophet time series models to forecast resource needs 5-15 minutes ahead
- 📊 Real-time Monitoring: Continuously collects system metrics (CPU, memory, network) from distributed agents
- 🚨 Intelligent Alerting: ML-driven anomaly detection and trend-based alerting
- 🔄 Auto-scaling: Dynamic scaling of Docker containers and Kubernetes pods based on predictions
- 📈 Interactive Dashboard: Modern web interface with real-time charts and control panels
- 🤖 Multi-Model ML: Combines LSTM and Prophet models for robust forecasting
- 🔧 Multi-platform: Seamlessly works with Docker Compose and Kubernetes
- Prevents Downtime: Scales resources before bottlenecks occur
- Cost Optimization: Avoids over-provisioning while maintaining performance
- Intelligent Automation: Reduces manual intervention in scaling decisions
- Real-time Insights: Provides comprehensive visibility into system behavior
CloudPulse operates through a multi-layered architecture:

- Go Agents deployed on each node collect real-time system metrics
  - Metrics include CPU usage, memory consumption, network I/O, and load averages
  - Data is collected every few seconds and sent to the central dashboard
- Python ML Service processes the collected metrics using two complementary models:
  - LSTM Neural Network: captures complex temporal patterns and dependencies
  - Prophet Model: handles seasonality and trend analysis
  - Models generate forecasts 5-15 minutes into the future
  - Anomaly detection identifies unusual patterns in real time
- Node.js Dashboard acts as the central controller
  - Combines ML predictions with configurable scaling rules
  - Makes scaling decisions based on predicted resource needs
  - Supports both reactive and proactive scaling strategies
- Deployment targets:
  - Docker Compose for development and small-scale deployments
  - Kubernetes for production environments with advanced orchestration
  - Automatic scaling of containers/pods based on predictions
  - Integration with existing monitoring and alerting systems
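As a sketch of the proactive strategy described above, a scaling decision could compare a forecast value against a threshold and return a target replica count. This is illustrative Python; the function name, parameters, and scale-down heuristic are assumptions, not CloudPulse's actual API.

```python
# Illustrative sketch of a proactive scaling decision (assumed logic,
# not CloudPulse's real implementation).

def decide_scaling(forecast_cpu, threshold=80.0, current_replicas=2,
                   step=1, max_replicas=10):
    """Return the target replica count for a predicted CPU load (percent)."""
    if forecast_cpu >= threshold:
        # Predicted bottleneck: scale up before it occurs.
        return min(current_replicas + step, max_replicas)
    if forecast_cpu < threshold * 0.5 and current_replicas > 1:
        # Sustained headroom: scale down to reduce cost.
        return current_replicas - step
    return current_replicas
```

The same shape works for memory or network metrics; only the threshold and forecast source change.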
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Go Agent     │    │    Go Agent     │    │    Go Agent     │
│    (Node 1)     │    │    (Node 2)     │    │    (Node N)     │
└─────────┬───────┘    └─────────┬───────┘    └─────────┬───────┘
          │                      │                      │
          └──────────────────────┼──────────────────────┘
                                 │
                   ┌─────────────┴─────────────┐
                   │     Node.js Dashboard     │
                   │   (Central Controller)    │
                   └─────────────┬─────────────┘
                                 │
                   ┌─────────────┴─────────────┐
                   │     Python ML Service     │
                   │  (LSTM + Prophet Models)  │
                   └─────────────┬─────────────┘
                                 │
                   ┌─────────────┴─────────────┐
                   │     Docker/Kubernetes     │
                   │      (Auto-scaling)       │
                   └───────────────────────────┘
```
- Go: High-performance agent for metric collection
- Python: Machine learning models (LSTM, Prophet)
- Node.js: Web dashboard and API
- Docker/Kubernetes: Container orchestration and scaling
- Prometheus/Grafana: Monitoring and visualization
```
CloudPulse/
├── agent/                          # Go agent for metric collection
│   ├── main.go                     # Main agent application
│   ├── Dockerfile                  # Agent container image
│   └── README.md                   # Agent documentation
├── ml-models/                      # Python ML forecasting models
│   ├── lstm_forecaster.py          # LSTM model implementation
│   ├── prophet_forecaster.py       # Prophet model implementation
│   ├── ml_service.py               # ML service orchestrator
│   ├── requirements.txt            # Python dependencies
│   ├── Dockerfile                  # ML service container
│   └── README.md                   # ML models documentation
├── dashboard/                      # Node.js dashboard and API
│   ├── server.js                   # Main dashboard server
│   ├── services/                   # Service modules
│   │   ├── AlertManager.js         # Alert management
│   │   ├── DataCollector.js        # Metrics collection
│   │   ├── ForecastService.js      # Forecast integration
│   │   └── ScalingService.js       # Auto-scaling
│   ├── public/                     # Web interface
│   │   └── index.html              # Dashboard UI
│   ├── package.json                # Node.js dependencies
│   ├── Dockerfile                  # Dashboard container
│   └── README.md                   # Dashboard documentation
├── docker/                         # Docker configurations
│   ├── docker-compose.yml          # Complete stack
│   └── README.md                   # Docker documentation
├── k8s/                            # Kubernetes manifests
│   ├── cloudpulse-deployment.yaml  # Core services
│   ├── cloudpulse-autoscaling.yaml # Auto-scaling
│   ├── cloudpulse-monitoring.yaml  # Monitoring stack
│   └── README.md                   # K8s documentation
└── README.md                       # This file
```
```bash
# Clone the repository
git clone https://github.com/srdarkseer/CloudPulse.git
cd CloudPulse

# Start all services
docker-compose -f docker/docker-compose.yml up -d

# Access the dashboard
open http://localhost:3000
```

```bash
# Deploy to Kubernetes
kubectl apply -f k8s/cloudpulse-deployment.yaml
kubectl apply -f k8s/cloudpulse-autoscaling.yaml

# Access the dashboard
kubectl get svc -n cloudpulse
```

The CloudPulse dashboard provides:
- Real-time Metrics: Live CPU, memory, and network usage
- Forecast Visualization: ML model predictions and trends
- Alert Management: Configurable alerts with acknowledgment
- Scaling Control: Manual and automated scaling operations
- Node Status: Health monitoring of all nodes
| Service | URL | Description |
|---|---|---|
| Dashboard | http://localhost:3000 | Main web interface |
| Agent API | http://localhost:8080 | Metrics collection |
| ML Service | http://localhost:5001 | Forecasting API |
| Redis | localhost:6379 | Caching layer |
- Agent Health: `GET http://localhost:8080/health`
- Agent Metrics: `GET http://localhost:8080/metrics`
- ML Service Health: `GET http://localhost:5001/health`
- ML Forecasts: `GET http://localhost:5001/api/forecasts`
- Dashboard Health: `GET http://localhost:3000/health`
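A small helper can summarize the results of polling these endpoints. This is an illustrative sketch: the URLs come from the list above, but the helper itself is hypothetical, and fetching is left to the caller (e.g. `urllib.request`).

```python
# Hypothetical health-check summary for the endpoints listed above.
# The caller fetches each URL and passes in the HTTP status codes.

HEALTH_ENDPOINTS = {
    "agent": "http://localhost:8080/health",
    "ml-service": "http://localhost:5001/health",
    "dashboard": "http://localhost:3000/health",
}

def unhealthy(statuses):
    """Return service names whose health check did not return HTTP 200."""
    return sorted(name for name, code in statuses.items() if code != 200)
```

In practice each URL in `HEALTH_ENDPOINTS` would be fetched and the resulting status codes passed in as `statuses`.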
Access the dashboard at http://localhost:3000 (Docker) or via LoadBalancer (Kubernetes).
Agent:
- `NODE_ID`: Unique node identifier
- `PORT`: Agent port (default: 8080)

ML Service:
- `AGENT_ENDPOINTS`: Comma-separated agent URLs
- `MODEL_SAVE_PATH`: Path to save models
- `FORECAST_INTERVAL_MINUTES`: Forecast update interval

Dashboard:
- `AGENT_ENDPOINTS`: Agent endpoints
- `ML_SERVICE_URL`: ML service URL
- `CORS_ORIGIN`: CORS configuration
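For illustration, the ML service's variables above could be parsed like this. The variable names come from this README; the function and the default values are assumptions.

```python
import os

# Hedged sketch: read the ML service's environment variables.
# Variable names are from the README; defaults are assumed.

def load_ml_config(env=os.environ):
    endpoints = env.get("AGENT_ENDPOINTS", "http://localhost:8080")
    return {
        "agent_endpoints": [e.strip() for e in endpoints.split(",") if e.strip()],
        "model_save_path": env.get("MODEL_SAVE_PATH", "./models"),
        "forecast_interval_minutes": int(env.get("FORECAST_INTERVAL_MINUTES", "5")),
    }
```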
Configure auto-scaling rules:

```yaml
apiVersion: cloudpulse.io/v1
kind: ScalingRule
metadata:
  name: cpu-scale-rule
spec:
  nodeId: "node-1"
  metric: "cpu_usage"
  threshold: 80
  action: "scale_up"
  replicas: 2
  cooldown: 300
```

- Health Checks: All services include health endpoints
- Metrics: Prometheus-compatible metrics
- Logs: Structured logging with Winston
- Alerts: Configurable alerting rules
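The ScalingRule above could be evaluated roughly as follows. The field names (`metric`, `threshold`, `action`, `cooldown`) mirror the manifest; the evaluation logic and function name are assumptions.

```python
import time

# Illustrative evaluator for a ScalingRule: fire only when the threshold
# is crossed and the cooldown window since the last action has elapsed.

def should_fire(rule, metric_value, last_fired, now=None):
    """True when the rule's threshold is crossed and cooldown has elapsed."""
    now = time.time() if now is None else now
    if now - last_fired < rule["cooldown"]:
        return False  # still cooling down from the previous action
    if rule["action"] == "scale_up":
        return metric_value >= rule["threshold"]
    return metric_value <= rule["threshold"]

rule = {"metric": "cpu_usage", "threshold": 80, "action": "scale_up", "cooldown": 300}
```

The cooldown prevents oscillation: repeated threshold crossings within 300 seconds trigger only one scaling action.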
- Prometheus: Metrics collection and alerting
- Grafana: Visualization and dashboards
- AlertManager: Alert routing and notifications
Agent API:
- `GET /metrics` - Current system metrics
- `GET /health` - Health check

Dashboard API:
- `GET /api/metrics` - Aggregated metrics
- `GET /api/forecasts` - ML predictions
- `GET /api/alerts` - Active alerts
- `POST /api/scale` - Manual scaling

ML Service API:
- `GET /api/forecasts` - Get predictions
- `POST /api/forecast` - Generate forecasts
- `GET /api/anomalies` - Detected anomalies
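To illustrate these endpoints, here is a hedged sketch of building a manual-scale request body and reading a forecast response. The endpoint paths come from the list above, but the JSON field names (`nodeId`, `replicas`, `forecasts`, `predicted_cpu`) are assumptions, not a documented schema.

```python
import json

# Hypothetical payload shapes for the dashboard API (assumed field names).

def scale_request(node_id, replicas):
    """Body for POST /api/scale to trigger a manual scaling action."""
    return json.dumps({"nodeId": node_id, "replicas": replicas}).encode()

def latest_forecast(response_body):
    """Pick the most recent predicted value out of a GET /api/forecasts body."""
    forecasts = json.loads(response_body)["forecasts"]
    return forecasts[-1]["predicted_cpu"]
```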
```bash
# Test individual components
cd agent && go test
cd ml-models && python -m pytest
cd dashboard && npm test

# Integration tests
docker-compose -f docker/docker-compose.yml up -d
# Run tests against running services
```

```bash
# Deploy with Docker Compose
docker-compose -f docker/docker-compose.yml up -d
```

```bash
# Deploy to Kubernetes
kubectl apply -f k8s/cloudpulse-deployment.yaml
kubectl apply -f k8s/cloudpulse-autoscaling.yaml
kubectl apply -f k8s/cloudpulse-monitoring.yaml

# Scale ML service
kubectl scale deployment cloudpulse-ml --replicas=3 -n cloudpulse

# Scale dashboard
kubectl scale deployment cloudpulse-dashboard --replicas=2 -n cloudpulse
```

- Agent not collecting metrics
  - Check host network configuration
  - Verify volume mounts for /proc and /sys
  - Check node tolerations
- ML service failing to start
  - Verify persistent volume claims
  - Check resource limits
  - Review model training logs
- Dashboard not accessible
  - Check LoadBalancer service status
  - Verify CORS configuration
  - Check WebSocket connectivity
```bash
# Check pod status
kubectl get pods -n cloudpulse

# View logs
kubectl logs -f deployment/cloudpulse-ml -n cloudpulse

# Check services
kubectl get svc -n cloudpulse

# Check HPA status
kubectl get hpa -n cloudpulse
```

- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
```bash
# Install dependencies
cd agent && go mod download
cd ml-models && pip install -r requirements.txt
cd dashboard && npm install

# Run locally
cd agent && go run main.go
cd ml-models && python ml_service.py
cd dashboard && npm run dev
```

- Agent Documentation
- ML Models Documentation
- Dashboard Documentation
- Docker Documentation
- Kubernetes Documentation
- Documentation: Project Wiki

MIT License - see LICENSE file for details.
- Prometheus for metrics collection
- Grafana for visualization
- Prophet for time series forecasting
- TensorFlow for deep learning models
Sushant R. Dangal