Open
Description
feature
Using --deepspeed zero3.json together with --pretrain_mm_mlp_adapter is not currently supported.
Because ZeRO-3 has already sharded the weights, the load_state_dict call in the function initialize_vision_modules no longer works.
Command:
--pretrain_mm_mlp_adapter
Log:
the size from the checkpoint is torch.tensors[4096, 4096], which does not match torch.tensors[0]
Perhaps you could add code like the following to the function initialize_vision_modules:
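with deepspeed.zero.GatheredParameters(
        list(self.mm_projector.parameters()), modifier_rank=0):
    if dist.get_rank() == 0:
        # the existing load_state_dict call goes here; state_dict is a placeholder name
        self.mm_projector.load_state_dict(state_dict)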
This works for me, but please verify it further.
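For reference, here is a slightly fuller sketch of the same idea as a standalone helper. The function name `load_adapter_weights` is hypothetical and not part of the repository. The check for a `ds_id` attribute is an assumption about how DeepSpeed marks ZeRO-3 partitioned parameters. Every rank must call the helper, because gathering is a collective operation. Only rank 0 copies the checkpoint values, and they are broadcast when the context exits.

```python
import torch
import torch.distributed as dist
from torch import nn


def load_adapter_weights(module: nn.Module, state_dict: dict) -> None:
    """Load state_dict into module, gathering ZeRO-3 shards first if needed.

    Hypothetical helper for illustration; not part of the repository.
    """
    params = list(module.parameters())

    # Assumption: DeepSpeed ZeRO-3 tags each partitioned parameter with `ds_id`.
    # A partitioned parameter is an empty local placeholder, which is why a
    # plain load_state_dict reports a size mismatch against shape [0].
    is_partitioned = any(hasattr(p, "ds_id") for p in params)
    if not is_partitioned:
        module.load_state_dict(state_dict)
        return

    # Imported lazily so the non-ZeRO path works without DeepSpeed installed.
    import deepspeed

    # Temporarily gather the full parameters on all ranks. On exit, the values
    # written by rank 0 are broadcast and the parameters are re-partitioned.
    with deepspeed.zero.GatheredParameters(params, modifier_rank=0):
        if dist.get_rank() == 0:
            module.load_state_dict(state_dict)
```

Without ZeRO-3 the helper behaves like a plain `load_state_dict` call, so the same code path could serve both configurations.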