Commit d7411b7

Fix glm4v batch size (#1223)
1 parent 489a859 commit d7411b7

3 files changed: 5 additions, 6 deletions
README.md

Lines changed: 1 addition & 1 deletion
@@ -548,7 +548,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
 | DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
 | MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
 | CogVLM<br>CogVLM2<br>CogAgent<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
-| Llava | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
+| Llava1.5<br>Llava1.6 | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
 | Llava-Next | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 8B-110B | chat model |
 | mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |
 | InternVL | [InternVL](https://github.com/OpenGVLab/InternVL) | Chinese<br>English | 2B-25.5B<br>including quantized version | chat model |

README_CN.md

Lines changed: 1 addition & 1 deletion
@@ -545,7 +545,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 | DeepSeek-VL | [幻方系列视觉模型](https://github.com/deepseek-ai) | 中文<br>英文 | 1.3B-7B | chat模型 |
 | MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBmB MiniCPM视觉模型](https://github.com/OpenBMB/MiniCPM) | 中文<br>英文 | 3B-9B | chat模型 |
 | CogVLM<br>CogVLM2<br>CogAgent<br>GLM4V | [智谱ChatGLM视觉问答和Agent模型](https://github.com/THUDM/) | 中文<br>英文 | 9B-19B | chat模型 |
-| Llava | [Llava系列模型](https://github.com/haotian-liu/LLaVA) | 英文 | 7B-34B | chat模型 |
+| Llava1.5<br>Llava1.6 | [Llava系列模型](https://github.com/haotian-liu/LLaVA) | 英文 | 7B-34B | chat模型 |
 | Llava-Next | [Llava-Next系列模型](https://github.com/LLaVA-VL/LLaVA-NeXT) | 中文<br>英文 | 8B-110B | chat模型 |
 | mPLUG-Owl | [mPLUG-Owl系列模型](https://github.com/X-PLUG/mPLUG-Owl) | 英文 | 11B | chat模型 |
 | InternVL | [InternVL](https://github.com/OpenGVLab/InternVL) | 中文<br>英文 | 2B-25.5B<br>包含量化版本 | chat模型 |

swift/llm/utils/template.py

Lines changed: 3 additions & 4 deletions
@@ -395,10 +395,7 @@ def _simplify_context_list(self, context_list: List[Context], loss_scale_list: L
             res.append(''.join(temp))
             res_loss_scale.append(0.0)
 
-        if is_multi_modal:
-            return Template.split_special_tokens(res, res_loss_scale)
-        else:
-            return res, res_loss_scale
+        return res, res_loss_scale
 
     @staticmethod
     def split_special_tokens(context_list: List[Context],
@@ -978,6 +975,8 @@ def encode(self, example: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any
 
     def data_collator(self, batch: List[Dict[str, Any]], padding_to: Optional[int] = None) -> Dict[str, Any]:
         res = super().data_collator(batch, padding_to)
+        pad_len = res['labels'].shape[1] - res['input_ids'].shape[1]
+        res['attention_mask'] = F.pad(res['attention_mask'], (pad_len, 0), 'constant', 1)
         images = [b['images'] for b in batch if 'images' in b]
         if images:
             res['images'] = torch.concat(images)
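
The two lines added to `data_collator` widen the attention mask so that it has as many columns as `labels`, filling the new columns on the left with ones. The sketch below reproduces that operation on plain Python lists so it runs without PyTorch. The helper name `left_pad_mask` and the toy widths are hypothetical and are not part of the commit; the real code operates on tensors via `F.pad(mask, (pad_len, 0), 'constant', 1)`.

```python
from typing import List


def left_pad_mask(mask: List[List[int]], pad_len: int, value: int = 1) -> List[List[int]]:
    """Prepend `pad_len` copies of `value` to every row of a 2-D mask.

    Hypothetical list-based stand-in for
    F.pad(mask, (pad_len, 0), 'constant', value), which pads the left
    side of the last dimension.
    """
    if pad_len < 0:
        raise ValueError("pad_len must be non-negative")
    return [[value] * pad_len + row for row in mask]


# Toy batch of two samples; the widths are illustrative assumptions.
input_ids_width = 4
labels_width = 6
attention_mask = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],  # second sample is right-padded with zeros
]

# Same arithmetic as the diff: labels width minus input_ids width.
pad_len = labels_width - input_ids_width
padded = left_pad_mask(attention_mask, pad_len)
print(padded)
```

After the call, every row of `padded` has `labels_width` entries, and the existing mask values keep their relative order on the right.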
