Description
System Info
- transformers version: 4.52.3
- Platform: macOS-15.4.1-arm64-arm-64bit-Mach-O
- Python version: 3.13.0
- Huggingface_hub version: 0.32.0
- Safetensors version: 0.5.3
- Accelerate version: not installed
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (GPU?): 2.7.0 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?:
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
- Install pytorch, pillow, and transformers==4.52.3 using pip.
- Execute the following script:
import torch
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-224")
batch_features = processor(
    text="<image> What's in this image?",
    images=torch.zeros(3, 224, 224),
    suffix="Nothing",
    return_tensors="pt",
)
This yields an AttributeError with transformers==4.52.3:
  File "/private/tmp/venv/lib/python3.13/site-packages/transformers/models/paligemma/processing_paligemma.py", line 313, in __call__
    labels = inputs["input_ids"].masked_fill(inputs["token_type_ids"] == 0, -100)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'masked_fill'
Expected behavior
The batch_features should be created without error.
There seems to be a recent bug in the __call__ method of many processors, including PaliGemmaProcessor.
This is likely caused by lines 301 to 307 of transformers/src/transformers/models/paligemma/processing_paligemma.py (at 31f8a0f), which were changed in commit 32eca71.
I believe the intention was to call .get() instead of .pop() on text_kwargs on line 301. Calling .pop() modifies text_kwargs in place, so the popped entry is no longer forwarded to the tokenizer, and the tokenizer returns inputs["input_ids"] as a list instead of a PyTorch tensor. The masked_fill call below then fails because a list has no such method.
transformers/src/transformers/models/paligemma/processing_paligemma.py, lines 312 to 313 in 31f8a0f
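To illustrate the suspected mechanism without installing transformers, here is a minimal sketch in plain Python. It assumes the popped key is return_tensors (the issue does not quote the line itself), and fake_tokenizer and FakeTensor are hypothetical stand-ins, not part of any library.

```python
from typing import Any, Dict


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; only records the wrapped ids."""

    def __init__(self, ids):
        self.ids = ids


def fake_tokenizer(text: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical tokenizer: returns a FakeTensor only if return_tensors='pt'."""
    ids = [ord(ch) for ch in text]
    if kwargs.get("return_tensors") == "pt":
        return {"input_ids": FakeTensor(ids)}
    return {"input_ids": ids}  # plain list when return_tensors is missing


def call_with_pop(text_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # .pop() removes the key, so the tokenizer never sees return_tensors.
    text_kwargs.pop("return_tensors", None)
    return fake_tokenizer("hi", **text_kwargs)


def call_with_get(text_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # .get() only reads the key, so it is still forwarded to the tokenizer.
    text_kwargs.get("return_tensors", None)
    return fake_tokenizer("hi", **text_kwargs)


popped = call_with_pop({"return_tensors": "pt"})
kept = call_with_get({"return_tensors": "pt"})
print(type(popped["input_ids"]).__name__)
print(type(kept["input_ids"]).__name__)
```

With .pop() the stand-in tokenizer falls back to a plain list, which matches the 'list' object in the traceback; with .get() the tensor-like object is preserved.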