# Flux2 LoRA Training Toolkit (WIP - Development Repository)
A comprehensive, production-grade toolkit for training high-quality LoRA (Low-Rank Adaptation) models for Flux2-dev, featuring real-time monitoring, automatic quality assessment, and an accessible web interface.
Note: This is currently a development repository. Installation requires setting up from source. See the Installation section below for detailed setup instructions.
## Features

- Enterprise-grade training with real-time monitoring and validation
- Automatic quality assessment using CLIP-based metrics
- Intuitive web interface for both technical and creative users
- Comprehensive evaluation tools for checkpoint comparison and selection
- Flexible CLI with presets and extensive configuration options
- Support for multiple LoRA types (character, style, concept)
- Production-ready with safetensors and robust error handling
## Table of Contents

- Quick Start
- Installation
- Web Interface
- Command Line Interface
- Configuration
- Dataset Preparation
- Training Guide
- Evaluation & Testing
- Troubleshooting
- Advanced Usage
- Contributing
## Quick Start

### Prerequisites

- Python 3.14+
- CUDA-compatible GPU (NVIDIA A100 or H100 recommended for LoRA training)
- 16GB+ system RAM
- 100GB+ free disk space
- FLUX2-dev model (NOT FLUX1 - requires different components)
### Option 1: Web Interface

```bash
# Clone the repository
git clone https://github.com/your-repo/flux2-lora-training-toolkit.git
cd flux2-lora-training-toolkit
# Create virtual environment (uses Python 3.14)
python -m venv venv314
source venv314/bin/activate # On Windows: venv314\Scripts\activate
# Install the toolkit
pip install -e ".[dev]"
# Launch the web interface
python app.py
# Open http://localhost:7860 in your browser
```

### Option 2: Command Line

```bash
# Clone and setup (same as above)
git clone https://github.com/your-repo/flux2-lora-training-toolkit.git
cd flux2-lora-training-toolkit
python -m venv venv314
source venv314/bin/activate # On Windows: venv314\Scripts\activate
pip install -e ".[dev]"
# Train with a preset
python cli.py train --preset character --dataset /path/to/dataset --output ./output
# Evaluate your trained model
python cli.py eval test-prompts --checkpoint ./output/best_checkpoint.safetensors
```

## Installation

### System Requirements

- GPU: CUDA-compatible GPU (NVIDIA A100 or H100 recommended for LoRA training)
- RAM: 16GB+ system RAM
- Storage: 100GB+ free disk space
- OS: Linux, macOS, or Windows with WSL2
- Python: 3.10+
### Installing from Source

This is a development repository. You'll need to install it from source.
1. Clone the repository:

   ```bash
   git clone https://github.com/your-repo/flux2-lora-training-toolkit.git
   cd flux2-lora-training-toolkit
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install the toolkit and dependencies:

   ```bash
   # Install with development dependencies (recommended)
   pip install -e ".[dev]"

   # Or install with just core dependencies
   pip install -e .
   ```

4. Verify installation:

   ```bash
   python cli.py system info
   ```
### Optional Dependencies

For enhanced functionality, you can install additional packages:

```bash
# For hyperparameter optimization
pip install optuna
# For memory-efficient attention (may conflict with Flash Attention)
pip install xformers
# For Flash Attention 2 (requires CUDA)
pip install flash-attn
```

**CUDA Issues**: Make sure you have CUDA installed and PyTorch with CUDA support:

```bash
# Check PyTorch CUDA version
python -c "import torch; print(torch.version.cuda)"Virtual Environment Issues: Always activate your virtual environment before running commands:
source venv/bin/activate # Linux/macOS
# or
venv\Scripts\activate  # Windows
```

## Web Interface

The web interface provides an intuitive way to train and evaluate LoRA models without command-line knowledge.
After installation, launch the web interface:

```bash
# Make sure your virtual environment is activated
source venv/bin/activate # On Windows: venv\Scripts\activate
# Launch the web interface
python app.py
```

Navigate to http://localhost:7860 in your browser.

### Training a LoRA
- Upload Dataset: Click "Upload Dataset" and select a ZIP file containing your training images and captions
- Choose Preset: Select from Character, Style, or Concept presets optimized for different use cases
- Configure Training: Adjust advanced settings or use defaults
- Start Training: Click "Start Training" and monitor progress in real-time
### Evaluating Checkpoints

- Load Checkpoint: Upload or specify path to your trained LoRA checkpoint
- Test Prompts: Enter prompts to test your model's capabilities
- Quality Assessment: Run automatic quality metrics and overfitting detection
- Compare Checkpoints: Upload multiple checkpoints for side-by-side comparison
### Dataset Analysis

- Analyze Dataset: Upload or specify dataset path for comprehensive analysis
- Validate Structure: Check for common issues and get actionable recommendations
- Browse Images: Review your dataset with caption display and navigation
## Command Line Interface

The CLI provides full control over training and evaluation with extensive options.

### Training Commands

```bash
# Train with character preset
python cli.py train --preset character --dataset ./my_dataset --output ./output
# Train with custom config
python cli.py train --config my_config.yaml --dataset ./data --output ./results
# Override specific settings
python cli.py train --preset style --steps 2000 --lr 5e-5 --batch-size 4
```

### Evaluation Commands

```bash
# Test a checkpoint with prompts
python cli.py eval test --checkpoint ./output/checkpoint-1000.safetensors --prompt "A portrait of a person"
# Run comprehensive prompt testing
python cli.py eval test-prompts --checkpoint ./output/best.safetensors --concept "my_character"
# Compare multiple checkpoints
python cli.py eval compare checkpoint1.safetensors checkpoint2.safetensors --prompt "Test prompt"
# Assess quality metrics
python cli.py eval assess-quality --checkpoint ./checkpoint.safetensors --training-data ./dataset
# Select best checkpoint from multiple
python cli.py eval select-best checkpoint1.safetensors checkpoint2.safetensors checkpoint3.safetensors
```

### Dataset Commands

```bash
# Analyze dataset
python cli.py data analyze --dataset ./my_dataset --output analysis.json
# Validate dataset structure
python cli.py data validate --dataset ./data --fix
```

### System Commands

```bash
# Show system information
python cli.py system info
# Display GPU details
python cli.py system gpu
# Get optimization recommendations
python cli.py system optimize --config my_config.yaml
# List available presets
python cli.py system presets
```

## Configuration

### Presets

The toolkit includes optimized presets for different LoRA types:
- Character: Optimized for training character-specific LoRAs (rank=128, higher learning rate)
- Style: Optimized for artistic style transfer (rank=64, balanced settings)
- Concept: Optimized for object/concept training (rank=32, conservative settings)
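These presets can also be loaded programmatically via the `config_manager` API shown in the Python API section below. A minimal sketch; the shape of the returned config is assumed to mirror the YAML reference that follows:

```python
from flux2_lora.utils.config_manager import config_manager

# Load the character preset as a starting point; inspect or tweak
# fields (rank, alpha, learning rate) before handing it to the trainer.
config = config_manager.get_preset_config("character")
print(config)
```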
### Custom Configuration

Create a YAML configuration file:

```yaml
model:
  base_model: "/path/to/black-forest-labs/FLUX.2-dev"
  dtype: "bfloat16"
  device: "cuda:0"

lora:
  rank: 128
  alpha: 128
  dropout: 0.1

training:
  learning_rate: 5e-5
  batch_size: 4
  max_steps: 1000
  gradient_accumulation_steps: 1

data:
  resolution: 1024
  caption_format: "txt"
  cache_images: true

output:
  output_dir: "./output"
  checkpoint_every_n_steps: 100
```

## Dataset Preparation

### Directory Structure

```
my_dataset/
├── image_001.jpg
├── image_001.txt    # Caption file
├── image_002.png
├── image_002.txt
└── ...
```
### Caption Formats

The toolkit supports multiple caption formats:

- `.txt` files: Standard text files with captions
- `.caption` files: Alternative caption extension
- EXIF metadata: Captions embedded in image metadata
- JSON sidecar files: Structured caption data
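To illustrate the sidecar conventions, here is a minimal caption lookup sketch. The priority order and the `caption` key in the JSON schema are assumptions for illustration, not confirmed toolkit behavior:

```python
import json
from pathlib import Path

def load_caption(image_path: str) -> str | None:
    """Find a caption for an image via sidecar files (assumed priority order)."""
    stem = Path(image_path).with_suffix("")
    # Plain-text sidecars: image_001.txt or image_001.caption
    for ext in (".txt", ".caption"):
        sidecar = stem.with_suffix(ext)
        if sidecar.exists():
            return sidecar.read_text(encoding="utf-8").strip()
    # JSON sidecar: {"caption": "..."} is an assumed schema
    json_sidecar = stem.with_suffix(".json")
    if json_sidecar.exists():
        return json.loads(json_sidecar.read_text(encoding="utf-8")).get("caption")
    return None

print(load_caption("my_dataset/image_001.jpg"))
```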
### Best Practices

- Image Quality: Use high-resolution images (1024x1024 minimum)
- Consistency: Maintain consistent style, lighting, and composition
- Diverse Poses: Include multiple angles and expressions for characters
- Captions: Write detailed, descriptive captions
- Quantity: Start with 10-50 high-quality images
- Naming: Use sequential naming (image_001.jpg, image_002.jpg, etc.)
### Example Dataset

```
character_dataset/
├── character_front.jpg
├── character_front.txt     # "A portrait of a character with distinctive features"
├── character_side.jpg
├── character_side.txt      # "Side profile of the character"
├── character_action.jpg
├── character_action.txt    # "The character in an action pose"
└── ...
```
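`python cli.py data validate` is the supported way to check a dataset, but a quick standalone sanity check for missing `.txt` captions might look like this (a sketch; extend `IMAGE_EXTS` to whatever formats you use):

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def find_unpaired(dataset_dir: str) -> list[Path]:
    """Return images that have no matching .txt caption next to them."""
    return [
        image
        for image in Path(dataset_dir).iterdir()
        if image.suffix.lower() in IMAGE_EXTS
        and not image.with_suffix(".txt").exists()
    ]

print(find_unpaired("./character_dataset"))
```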
## Training Guide

- Prepare Dataset: Organize images and captions as described above
- Choose Preset: Select appropriate preset based on your use case
- Configure Training:
  - Set output directory
  - Adjust training steps (1000-5000 typically)
  - Configure batch size based on GPU memory
- Monitor Training:
  - Watch loss decrease over time
  - Review validation samples
  - Monitor GPU memory usage
- Evaluate Results: Test trained checkpoints with various prompts
### Training Metrics

- Loss: Training objective (lower is better)
- CLIP Score: How well generated images match prompts (0-1, higher better)
- Diversity Score: Variety in generated images (higher better)
- Overfitting Detection: Similarity to training images (lower risk better)
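For intuition on the CLIP score, here is a minimal sketch of prompt-image similarity using the `transformers` CLIP implementation. It shows the general idea; it is not necessarily the exact metric the toolkit computes:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image: Image.Image, prompt: str) -> float:
    """Cosine similarity between image and prompt in CLIP embedding space."""
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # Normalize both embeddings, then take their dot product (cosine similarity)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum(dim=-1).item())

print(clip_score(Image.open("sample.png"), "A portrait of a person"))
```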
### Performance Considerations

- GPU Memory: Reduce batch size if encountering OOM errors
- Training Steps: More steps generally improve quality but increase training time
- Learning Rate: Start with preset defaults, adjust based on convergence
- Resolution: Higher resolution requires more memory but improves detail
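As a concrete illustration of these tradeoffs, a memory-constrained run might trade per-step batch size for gradient accumulation (keys follow the YAML reference above; the specific values are illustrative, not tuned recommendations):

```yaml
training:
  batch_size: 1                    # smallest per-step memory footprint
  gradient_accumulation_steps: 8   # effective batch size = 1 x 8 = 8
data:
  resolution: 768                  # below 1024 to save memory, at some cost in detail
```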
## Evaluation & Testing

### Prompt Testing

Test your trained LoRA with various prompt patterns:

```bash
# Basic usage
python cli.py eval test-prompts --checkpoint my_lora.safetensors --trigger-word "my_character"
# Custom concept
python cli.py eval test-prompts --concept "cyberpunk city" --trigger-word "cyberpunk"
```

### Quality Assessment

Get detailed quality metrics for your checkpoints:

```bash
python cli.py eval assess-quality \
--checkpoint my_lora.safetensors \
--training-data ./dataset \
--output quality_report.json
```

### Checkpoint Comparison

Compare multiple checkpoints side-by-side:

```bash
python cli.py eval compare \
checkpoint_500.safetensors \
checkpoint_1000.safetensors \
checkpoint_1500.safetensors \
--output comparison_results.html
```

## Troubleshooting

### Common Issues

**Error: `CUDA out of memory`**
Solutions:
- Reduce batch size: `--batch-size 2`
- Enable gradient checkpointing (automatic)
- Reduce resolution if using high-res images
- Close other GPU-intensive applications
**Loss not decreasing significantly**
Solutions:
- Increase training steps: `--steps 2000`
- Adjust learning rate: `--lr 1e-4`
- Check dataset quality and consistency
- Try different preset or custom configuration
**Generated images don't match expectations**
Solutions:
- Ensure trigger word is used correctly in prompts
- Check dataset quality and variety
- Increase training steps or adjust LoRA rank
- Review caption quality and specificity
**`ModuleNotFoundError: No module named 'diffusers'`**
Solutions:
- Install missing dependencies: `pip install -e ".[dev]"`
- Check Python version (3.10+ required)
- Ensure virtual environment is activated
### Debugging Steps

- Check system compatibility: `python cli.py system info`
- Validate configuration: Use the `--dry-run` flag
- Check dataset: `python cli.py data validate --dataset ./my_dataset`
- Review logs: Check output directory for detailed logs
### Hardware Optimization

For H100 GPUs:

```bash
python cli.py system optimize --config my_config.yaml
```

This will provide specific recommendations for your hardware.
## Advanced Usage

### Hyperparameter Optimization

Automatically find the best training settings for your specific dataset using Bayesian optimization.
- 10-30% Quality Improvement: Optimized settings produce better LoRA models
- Faster Training: Better parameters converge faster and more reliably
- Memory Efficiency: Optimized batch sizes and accumulation maximize GPU utilization
- Dataset-Specific: Each dataset may have different optimal hyperparameters
```bash
# Basic optimization (50 trials, ~10-20 hours)
python cli.py train optimize --dataset ./my_dataset
# Quick optimization for testing (20 trials, ~4-8 hours)
python cli.py train optimize --dataset ./data --trials 20 --max-steps 300
# Custom output directory
python cli.py train optimize --dataset ./data --output ./my_optimization
```

The optimization searches over the following parameters:

- LoRA Rank (4-128): Model capacity - higher values learn more complex patterns
- LoRA Alpha (4-128): LoRA strength scaling factor
- Learning Rate (1e-6 to 1e-2): Training convergence speed
- Batch Size (1, 2, 4, 8, 16): Images processed simultaneously
- Gradient Accumulation (1, 2, 4, 8): Effective batch size for memory management (effective batch size = batch size x accumulation steps)
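Since the optimizer builds on Optuna (see Optional Dependencies), the search space above might map onto Optuna's API roughly as follows. This is an illustrative sketch, not the toolkit's internal code; `run_short_training` is a hypothetical stand-in for a short training run that returns a quality score:

```python
import optuna

def run_short_training(params: dict) -> float:
    """Hypothetical stand-in: train briefly with `params`, return a quality score."""
    return 0.0  # replace with a real short training run plus evaluation

def objective(trial: optuna.Trial) -> float:
    # Sample one candidate configuration from the search space described above
    params = {
        "rank": trial.suggest_int("rank", 4, 128),
        "alpha": trial.suggest_int("alpha", 4, 128),
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-2, log=True),
        "batch_size": trial.suggest_categorical("batch_size", [1, 2, 4, 8, 16]),
        "gradient_accumulation_steps": trial.suggest_categorical(
            "gradient_accumulation_steps", [1, 2, 4, 8]
        ),
    }
    return run_short_training(params)

study = optuna.create_study(direction="maximize")  # higher quality score is better
study.optimize(objective, n_trials=20)
print(study.best_params)
```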
After optimization completes, you'll get:

- `best_config.yaml`: Ready-to-use configuration for production training
- `optimization_results.json`: Complete optimization summary with best parameters
- `trials_data.json`: Detailed results from each optimization trial
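The optimized configuration can then be fed straight back into the normal training command; for example (the path assumes you passed `--output ./my_optimization` above and that `best_config.yaml` lands in that directory):

```bash
python cli.py train --config ./my_optimization/best_config.yaml --dataset ./my_dataset --output ./output
```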
Use the Optimization tab in the web interface for:
- Interactive parameter range selection
- Real-time progress monitoring
- Visual optimization history
- Easy configuration download
**Optimization tips:**

- Start Small: Use 20-30 trials for initial optimization
- Good Dataset: Optimization works best with representative, high-quality data
- Monitor Progress: Check that quality scores improve over trials
- Production Training: Use optimized settings for your final LoRA training
### Python API

```python
from flux2_lora.core.trainer import LoRATrainer
from flux2_lora.utils.config_manager import config_manager

# Load configuration
config = config_manager.get_preset_config("character")

# Initialize trainer (`model` is your already-loaded FLUX.2-dev model)
trainer = LoRATrainer(model=model, config=config, output_dir="./output")

# Custom training loop (`batch` comes from your dataloader; see the next example)
for step in range(1000):
    loss = trainer.train_step(batch)
    if step % 100 == 0:
        trainer.save_checkpoint(f"checkpoint-{step}")
```

### Pipeline Integration

The toolkit can be integrated into existing ML pipelines:
```python
from flux2_lora import LoRADataset, create_dataloader

# Load your custom dataset
dataset = LoRADataset(data_dir="./my_data", resolution=1024)
dataloader = create_dataloader(dataset, batch_size=4)

# Use with your training framework
for batch in dataloader:
    # Your training logic here
    pass
```

### Custom Evaluation

```python
from flux2_lora.evaluation import QualityAssessor

assessor = QualityAssessor()

# Add custom metrics (`custom_function` is any callable you supply that returns a score)
results = assessor.assess_checkpoint_quality(
    checkpoint_path="my_lora.safetensors",
    test_prompts=["custom prompt"],
    custom_metrics={"my_metric": custom_function},
)
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.