fix: provide explicit scheduler params for LinearWarmupCosineAnnealingLR in train.py by kev-hanwen-yang · Pull Request #16 · lucas-maes/le-wm

kev-hanwen-yang · 2026-03-31T08:45:14Z

Hi! A small fix to the scheduler params.

Summary

The LinearWarmupCosineAnnealingLR scheduler config in train.py only specifies {"type": "LinearWarmupCosineAnnealingLR"} without the required warmup_steps and max_steps arguments.

This causes training to crash immediately for all tasks (not just PushT) with:

TypeError: LinearWarmupCosineAnnealingLR.__init__() missing 2 required positional arguments: 'warmup_steps' and 'max_steps'

Root Cause

stable_pretraining's create_scheduler has a fallback that attempts to auto-infer these parameters from module.trainer.estimated_stepping_batches.

However, at the time configure_optimizers is called by Lightning, the trainer has not yet estimated stepping batches (the dataloader iterator hasn't been created), so estimated_stepping_batches returns None.

The auto-inference code filters out None values, which means max_steps is silently dropped, and the scheduler constructor fails with the missing argument error.
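The failure mode described above can be reproduced in isolation. The sketch below is a simplified stand-in, not stable_pretraining's actual code: `FakeScheduler`, `infer_params`, and `build_scheduler` are hypothetical names that only mimic the behaviour described here (two required arguments, an inferred value of None, and a filter that drops None entries).

```python
class FakeScheduler:
    """Hypothetical stand-in for a scheduler with two required arguments."""

    def __init__(self, optimizer, warmup_steps, max_steps):
        self.optimizer = optimizer
        self.warmup_steps = warmup_steps
        self.max_steps = max_steps


def infer_params(estimated_steps):
    """Mimic the auto-inference: derive params, then drop any that are None."""
    warmup = None if estimated_steps is None else estimated_steps // 100
    inferred = {"warmup_steps": warmup, "max_steps": estimated_steps}
    return {k: v for k, v in inferred.items() if v is not None}


def build_scheduler(config, estimated_steps):
    """Pop "type", fall back to inference when no params remain, then construct."""
    params = dict(config)
    params.pop("type")
    if not params:
        params = infer_params(estimated_steps)
    return FakeScheduler("optimizer-placeholder", **params)


config = {"type": "LinearWarmupCosineAnnealingLR"}

# Step count not available yet: both params are dropped and construction fails.
try:
    build_scheduler(config, estimated_steps=None)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Step count available: inference fills both params and construction succeeds.
ok = build_scheduler(config, estimated_steps=1000)
print(ok.warmup_steps, ok.max_steps)
```

The point of the sketch is that the None filter turns "value not known yet" into "argument not passed", so the error surfaces in the constructor rather than at the place where the step count was unavailable.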

Full traceback:

File ".../stable_pretraining/module.py", line 634, in configure_optimizers
    scheduler = create_scheduler(opt, sched_config, module=self)
File ".../stable_pretraining/optim/lr_scheduler.py", line 193, in create_scheduler
    return fn(optimizer, **params)
TypeError: LinearWarmupCosineAnnealingLR.__init__() missing 2 required positional arguments: 'warmup_steps' and 'max_steps'

Changes

In train.py, compute max_steps and warmup_steps explicitly from the dataloader length and max_epochs, then pass them directly to the scheduler config:

max_steps = len(train_dataloader) * max_epochs (total training steps)
warmup_steps = 1% of max_steps (linear warmup phase)
Also passes warmup_start_lr=0.0 and eta_min=0.0 for completeness
Changes scheduler interval from "epoch" to "step" to match the step-level scheduling

This removes the dependency on the auto-inference path entirely, making it robust across stable_pretraining versions.
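As a rough illustration of the computation listed above, the sketch below builds the scheduler config from a dataloader length and an epoch count. The helper name `scheduler_config`, the example numbers, and the `max(1, ...)` guard are assumptions for illustration and may differ from the actual diff; the config keys are the ones named in this PR.

```python
def scheduler_config(steps_per_epoch, max_epochs):
    """Build an explicit scheduler config (hypothetical helper, not from train.py).

    steps_per_epoch stands in for len(train_dataloader).
    """
    max_steps = steps_per_epoch * max_epochs   # total training steps
    warmup_steps = max(1, max_steps // 100)    # 1% of max_steps, at least 1 (assumed guard)
    return {
        "type": "LinearWarmupCosineAnnealingLR",
        "warmup_steps": warmup_steps,
        "max_steps": max_steps,
        "warmup_start_lr": 0.0,
        "eta_min": 0.0,
    }


# Illustrative numbers only: 250 batches per epoch, 100 epochs.
cfg = scheduler_config(steps_per_epoch=250, max_epochs=100)
print(cfg["max_steps"], cfg["warmup_steps"])
```

Because max_steps counts batches, this only matches the number of scheduler updates when the scheduler interval is "step" and each batch triggers one optimizer step; with gradient accumulation the product would presumably need to be divided accordingly.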

Testing

Verified that python train.py data=pusht launches training successfully and progresses through epochs (tested on macOS with MPS backend, stable_pretraining==0.1.6, stable_worldmodel==0.0.6)
The scheduler initializes without error and learning rate warmup + cosine annealing behaves as expected
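For anyone who wants to sanity-check the learning rate curve against logged values, here is a minimal reference sketch of a linear warmup followed by cosine annealing. It uses the common textbook form of the schedule and is an assumption about the shape, not a copy of stable_pretraining's implementation, so small differences at the boundaries are possible. `warmup_cosine_lr` is a hypothetical helper name.

```python
import math


def warmup_cosine_lr(step, base_lr, warmup_steps, max_steps,
                     warmup_start_lr=0.0, eta_min=0.0):
    """Reference LR at a given step: linear warmup, then cosine decay to eta_min."""
    if step < warmup_steps:
        # Linear ramp from warmup_start_lr up to base_lr.
        return warmup_start_lr + (base_lr - warmup_start_lr) * step / warmup_steps
    # Fraction of the annealing phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    progress = min(progress, 1.0)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * progress))


# Illustrative values only.
base_lr, warmup_steps, max_steps = 1e-3, 250, 25000
for step in (0, 125, 250, 12625, 25000):
    print(step, warmup_cosine_lr(step, base_lr, warmup_steps, max_steps))
```

With these values the curve starts at 0, reaches base_lr at the end of warmup, passes through half of base_lr midway through annealing, and ends at eta_min.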

The scheduler config only specified {"type": "LinearWarmupCosineAnnealingLR"} without warmup_steps or max_steps. The auto-inference in stable_pretraining relies on trainer.estimated_stepping_batches, which is not yet available when configure_optimizers runs, causing a TypeError.

Compute max_steps and warmup_steps explicitly from the dataloader length and max_epochs, and change the scheduler interval from "epoch" to "step" to match the step-level scheduling.

Made-with: Cursor

lucas-maes · 2026-03-31T17:10:49Z

Hi! Thank you for the PR! I think the problem might come from your side, as we don't see any issue with the code. Do you have the latest version of stable-pretraining?

kev-hanwen-yang · 2026-03-31T20:56:36Z

Thanks for looking into this! I went ahead and verified from a completely clean setup:

# Fresh clone of the original repo
git clone https://github.com/lucas-maes/le-wm.git
cd le-wm

# Fresh venv + install
uv venv --python=3.10
source .venv/bin/activate
uv pip install "stable-worldmodel[train,env]"

# Confirm latest versions
python -c "import importlib.metadata; print(importlib.metadata.version('stable-pretraining'))"
# → 0.1.6 (latest on PyPI)

# Run training
python train.py data=pusht

This crashes with:
TypeError: LinearWarmupCosineAnnealingLR.__init__() missing 2 required positional arguments: 'warmup_steps' and 'max_steps'

Full traceback:

File ".../stable_pretraining/module.py", line 634, in configure_optimizers
    scheduler = create_scheduler(opt, sched_config, module=self)
File ".../stable_pretraining/optim/lr_scheduler.py", line 193, in create_scheduler
    return fn(optimizer, **params)
TypeError: LinearWarmupCosineAnnealingLR.__init__() missing 2 required positional arguments: 'warmup_steps' and 'max_steps'

The issue is that stable_pretraining's create_scheduler enters the dict path and pops "type", leaving params = {}, then falls into the auto-inference fallback. The _build_default_params factory accesses trainer.estimated_stepping_batches, but at configure_optimizers time this either returns None or raises. The try/except at lines 187-190 silently swallows the error and leaves params = {}, so the constructor fails.

This is reproducible on a clean clone with stable-pretraining==0.1.6. Could you let me know which version/setup you're testing with? Happy to adjust the fix if needed.
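To make version comparisons easier on both sides, a small script along these lines could be pasted into the environment being tested. It is only a convenience sketch: `installed_version` is a hypothetical helper, and the package list is limited to the two distributions named in this thread.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


# Distribution names as they appear in this thread.
packages = ("stable-pretraining", "stable-worldmodel")
report = {name: installed_version(name) for name in packages}
report["python"] = platform.python_version()

for name, version in report.items():
    print(f"{name}: {version}")
```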

ValerianRey mentioned this pull request on Apr 21, 2026: Provide lockfile / requirements.txt #48 (Closed)

lucas-maes force-pushed the main branch from e8a2763 to 8edfeb3 on May 22, 2026 21:33

kev-hanwen-yang wants to merge 1 commit into lucas-maes:main from kev-hanwen-yang:fix/scheduler-params
