Description
Why the three mismatches occur (observations and consequences)
The checkpoint contains:

```
y_embedder.embedding_table.weight shape = (1001, 1152)
final_layer.linear.weight shape = (32, 1152)
final_layer.linear.bias shape = (32,)
```
Loading it into the current model raises:

```
RuntimeError: Error(s) in loading state_dict for SiT:
	size mismatch for y_embedder.embedding_table.weight: copying a param with shape torch.Size([1001, 1152]) from checkpoint, the shape in current model is torch.Size([33, 1152]).
	size mismatch for final_layer.linear.weight: copying a param with shape torch.Size([32, 1152]) from checkpoint, the shape in current model is torch.Size([16, 1152]).
	size mismatch for final_layer.linear.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([16]).
```
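The shapes are consistent with a checkpoint trained on a 1000-class setup (1001 = 1000 classes plus one null row for label dropout / classifier-free guidance) with doubled output channels (e.g. a `learn_sigma`-style head), while the current model appears to be built with 32 classes and a single-channel head. A minimal sketch of the shape arithmetic, assuming the usual DiT/SiT conventions of `patch_size=2` and `in_channels=4` (the helper names below are hypothetical, not part of the repo):

```python
def y_embedder_rows(num_classes: int, use_null_token: bool = True) -> int:
    # One embedding row per class, plus one extra "null" row when
    # label dropout / classifier-free guidance is enabled.
    return num_classes + (1 if use_null_token else 0)

def final_linear_out(patch_size: int, in_channels: int, learn_sigma: bool) -> int:
    # The final linear layer predicts patch_size**2 * out_channels values
    # per token; learn_sigma doubles out_channels to also predict variance.
    out_channels = in_channels * (2 if learn_sigma else 1)
    return patch_size ** 2 * out_channels

# What the checkpoint implies:
print(y_embedder_rows(1000))          # 1001
print(final_linear_out(2, 4, True))   # 32

# What the current model's config implies:
print(y_embedder_rows(32))            # 33
print(final_linear_out(2, 4, False))  # 16
```

If this reading is right, constructing the model with the checkpoint's original settings (1000 classes, variance-predicting head) should make all three tensors line up.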