Universal AI training monitor with real-time graphs like Process Explorer - monitor any AI/ML training framework with smooth, professional visualizations.
AI Training Monitor is a native desktop application that visualizes AI/ML training metrics in real time. Modeled on Process Explorer but built specifically for tracking training progress, it offers smooth graphs, automatic pattern detection, and a plugin architecture designed to support any training framework.
## Features
- Universal Plugin Architecture - Works with any training framework through extensible parsers
- Real-time Smooth Graphs - 60 FPS visualization using PyQtGraph (not terminal-based)
- Intelligent Analysis - Automatic detection of overfitting, plateaus, and divergence
- Professional Native UI - Desktop application with Process Explorer-style interface
- Multi-Metric Tracking - Monitor loss, learning rate, speed, and memory usage simultaneously
- Framework Support - Ostris AI Toolkit today; Kohya_ss, HuggingFace Trainer, and PyTorch Lightning are planned
- Export Options - Save graphs as images, export data as CSV
- Cross-Platform - Works on Windows, Linux, and macOS
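
To make the "extensible parsers" idea concrete, here is a minimal sketch of what a parser plugin could look like. The names `MetricParser`, `register`, `PARSERS`, and `ExampleParser`, as well as the `key=value` log format, are assumptions made for illustration and are not taken from this project's code.

```python
import re
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type

# Hypothetical registry mapping a framework name to its parser class.
PARSERS: Dict[str, Type["MetricParser"]] = {}


def register(name: str):
    """Class decorator that adds a parser to the registry under `name`."""
    def wrap(cls):
        PARSERS[name] = cls
        return cls
    return wrap


class MetricParser(ABC):
    """Hypothetical base class: turn one log line into a dict of metrics."""

    @abstractmethod
    def parse_line(self, line: str) -> Optional[Dict[str, float]]:
        """Return the metrics found in `line`, or None if it has none."""


@register("example")
class ExampleParser(MetricParser):
    # Assumed log format, for illustration only: "step=120 loss=0.0412 lr=1e-4"
    PATTERN = re.compile(r"(\w+)=([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")

    def parse_line(self, line: str) -> Optional[Dict[str, float]]:
        pairs = self.PATTERN.findall(line)
        # An empty dict means no metrics were found, so report None instead.
        return {key: float(value) for key, value in pairs} or None


parser = PARSERS["example"]()
print(parser.parse_line("step=120 loss=0.0412 lr=1e-4"))
```

With a registry like this, supporting another framework would mean adding one more decorated class, and the monitor could look parsers up by the name passed on the command line.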
## Installation
```bash
# Clone the repository
git clone https://github.com/yourusername/ai-training-monitor.git
cd ai-training-monitor

# Install dependencies
pip install -r requirements.txt

# Or install in development mode
pip install -e .
```
## Usage
```bash
# Auto-detect framework and monitor
python -m ai_training_monitor /path/to/training/output

# Specify framework explicitly
python -m ai_training_monitor --framework ostris /path/to/output/character

# Monitor specific log file
python -m ai_training_monitor --log /path/to/log.txt --framework kohya
```
## Development
### Prerequisites
- List prerequisites here
### Setup
```bash
# Setup instructions here
```
## Working Features
- Individual Graphs View: Three separate graphs for Loss, Learning Rate, and Speed with proper scaling
- Threshold Overlays: Dynamic colored zones and threshold lines that adapt to your hardware
- Interactive Control Panel: Adjust thresholds, toggle monitoring features, pause/resume
- Pattern Detection: Automatic detection of overfitting, plateaus, and divergence
- Parser Support: Fully functional Ostris AI Toolkit parser
- Export Functions: Save data as CSV for further analysis
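
As a rough illustration of the pattern detection listed above, the sketch below compares the mean loss of the most recent window of steps against the window before it. The function name `detect_patterns`, the thresholds, and the rules themselves are assumptions chosen for this example, not the monitor's actual algorithm, and they assume non-negative loss values.

```python
import math
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def _window_means(values: Sequence[float], window: int) -> Tuple[float, float]:
    """Mean of the previous window and mean of the most recent window."""
    return mean(values[-2 * window:-window]), mean(values[-window:])


def detect_patterns(
    train_loss: Sequence[float],
    val_loss: Optional[Sequence[float]] = None,
    window: int = 50,
    plateau_tol: float = 0.01,       # assumed: under 1% change counts as a plateau
    divergence_factor: float = 2.0,  # assumed: loss doubling counts as divergence
) -> List[str]:
    """Return the patterns found, out of 'divergence', 'plateau', 'overfitting'."""
    findings: List[str] = []
    if len(train_loss) < 2 * window:
        return findings  # not enough history to compare two windows

    # NaN or infinite loss is treated as divergence outright.
    if any(not math.isfinite(x) for x in train_loss[-window:]):
        return ["divergence"]

    prev, recent = _window_means(train_loss, window)
    if recent > divergence_factor * prev:
        findings.append("divergence")
    elif abs(prev - recent) <= plateau_tol * abs(prev):
        findings.append("plateau")

    # Overfitting: training loss keeps falling while validation loss rises.
    if val_loss is not None and len(val_loss) >= 2 * window:
        val_prev, val_recent = _window_means(val_loss, window)
        if recent < prev and val_recent > val_prev:
            findings.append("overfitting")
    return findings


flat = [1.0] * 100
print(detect_patterns(flat))
```

Window-based comparisons are cheap enough to run on every update, at the cost of reacting only after a full window of steps has passed.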
## Known Issues
- OmniGraph Axis Scaling: The unified "Omni" view doesn't properly update Y-axis numeric values when switching primary metrics (shows 0-1 instead of actual values)
- Time Verification: "Space invader blip" visualization not yet implemented
- Historical Data Loading: Toggle for loading past vs current-only data not yet available
- Limited Parser Support: Currently only Ostris format is fully implemented
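
For the OmniGraph axis issue, one possible direction is sketched below. It assumes the Omni view draws every metric normalized to the 0-1 range, which the list above suggests but does not confirm. The helpers `normalize` and `tick_labels` are hypothetical names: the first maps values into 0-1, the second produces labels in the primary metric's real units for the same 0-1 positions.

```python
from typing import List, Sequence, Tuple


def normalize(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map values from the range [lo, hi] onto [0, 1] for a shared plot."""
    span = hi - lo
    if span == 0:
        return [0.5 for _ in values]  # flat series: draw it mid-height
    return [(v - lo) / span for v in values]


def tick_labels(lo: float, hi: float, count: int = 5) -> List[Tuple[float, str]]:
    """Return (normalized position, label in real units) pairs."""
    if count < 2:
        raise ValueError("count must be at least 2")
    ticks = []
    for i in range(count):
        pos = i / (count - 1)
        real_value = lo + pos * (hi - lo)
        ticks.append((pos, f"{real_value:.4g}"))
    return ticks


# Example: a loss curve ranging from 0 to 2 becomes the primary metric.
print(tick_labels(0.0, 2.0, count=3))
```

PyQtGraph's `AxisItem.setTicks` accepts a list of tick levels, each a list of `(value, string)` pairs, so the result could be passed as `[tick_labels(lo, hi)]` whenever the primary metric changes.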
## Roadmap
- Fix OmniGraph axis scaling issue
- Add historical data loading with toggle
- Implement time verification visualization
- Add support for Kohya_ss, HuggingFace Trainer, PyTorch Lightning
- Add data export and session management
- Improve real-time performance for very long training runs
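
For the long-run performance item, one common technique is to reduce the number of points drawn while keeping spikes visible. The sketch below keeps the minimum and maximum of each bucket of samples. The function name `downsample_min_max` and the choice of technique are assumptions for illustration, not a description of planned work.

```python
from typing import List, Sequence, Tuple


def downsample_min_max(values: Sequence[float], max_points: int) -> List[Tuple[int, float]]:
    """Reduce a series to at most `max_points` (index, value) pairs.

    The series is split into buckets and only the minimum and maximum of
    each bucket are kept, so short spikes survive the reduction.
    """
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    n = len(values)
    if n <= max_points:
        return list(enumerate(values))  # already small enough

    buckets = max_points // 2  # each bucket contributes up to two points
    out: List[Tuple[int, float]] = []
    for b in range(buckets):
        start = b * n // buckets
        end = (b + 1) * n // buckets
        indices = range(start, end)
        lo = min(indices, key=values.__getitem__)
        hi = max(indices, key=values.__getitem__)
        # Emit in index order; a flat bucket yields a single point.
        for i in sorted({lo, hi}):
            out.append((i, values[i]))
    return out


losses = [0.0] * 1000
losses[500] = 9.0  # a single spike that plain striding could skip
reduced = downsample_min_max(losses, max_points=100)
print(len(reduced), (500, 9.0) in reduced)
```

PyQtGraph also ships built-in options in this area, such as `setDownsampling` with `mode='peak'` and `setClipToView`, which may be worth trying before custom code.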
## Contributing
Contributions are welcome! Please read our Contributing Guide for details on how to contribute.
## License
This project is licensed under the terms specified in the LICENSE file.
## Acknowledgments
- Ostris AI Toolkit - Primary training framework that inspired this monitor's initial parser implementation
- PyQt6 - Python bindings for Qt6, providing the native GUI framework
- PyQtGraph - High-performance real-time graphing library built on PyQt