When training with FSDP, using the FLUX dev 10B model as both the target and the discriminator, we ran into an out-of-memory (OOM) error. Does LADD not support distilling a model this large?