
Processors do not pass on return_tensors to tokenizers properly. #38341

Closed
@shuheng-liu

Description


System Info

  • transformers version: 4.52.3
  • Platform: macOS-15.4.1-arm64-arm-64bit-Mach-O
  • Python version: 3.13.0
  • Huggingface_hub version: 0.32.0
  • Safetensors version: 0.5.3
  • Accelerate version: not installed
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.7.0 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:

Who can help?

@ArthurZucker @zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Install PyTorch, Pillow, and transformers==4.52.3 using pip.
  2. Execute the following script:
import torch
from transformers import AutoProcessor
processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-224")
batch_features = processor(
    text="<image> What's in this image?",
    images=torch.zeros(3, 224, 224),
    suffix="Nothing",
    return_tensors="pt"
)

This raises an AttributeError with transformers==4.52.3:

  File "/private/tmp/venv/lib/python3.13/site-packages/transformers/models/paligemma/processing_paligemma.py", line 313, in __call__
    labels = inputs["input_ids"].masked_fill(inputs["token_type_ids"] == 0, -100)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'masked_fill'

Expected behavior

The batch_features should be created without error.

There seems to be a recent bug in the __call__ method of several processors, including, e.g., PaliGemmaProcessor.
It is likely caused by

return_tensors = output_kwargs["text_kwargs"].pop("return_tensors", None)
inputs = self.tokenizer(
    input_strings,
    text_pair=suffix,
    return_token_type_ids=return_token_type_ids,
    **output_kwargs["text_kwargs"],
)

which was changed in commit 32eca71

I believe the intention was to call .get() instead of .pop() on text_kwargs on line 301. Calling .pop() modifies text_kwargs in place, so the subsequent self.tokenizer(...) call no longer receives return_tensors and therefore returns inputs["input_ids"] as a plain Python list instead of a PyTorch tensor. The masked_fill call below then fails, since lists have no such method.

if return_token_type_ids:
    labels = inputs["input_ids"].masked_fill(inputs["token_type_ids"] == 0, -100)
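The in-place mutation can be shown with a minimal, self-contained sketch. The dict contents below are hypothetical stand-ins for output_kwargs["text_kwargs"]; only the difference between .pop() and .get() matters here:

```python
# Hypothetical stand-in for output_kwargs["text_kwargs"] inside the processor.
text_kwargs = {"padding": True, "return_tensors": "pt"}

# What the processor currently does: .pop() returns the value AND removes the key.
return_tensors = text_kwargs.pop("return_tensors", None)
print(return_tensors)              # "pt"
print("return_tensors" in text_kwargs)  # False

# Any later call of the form self.tokenizer(..., **text_kwargs) therefore
# never sees return_tensors, and the tokenizer falls back to returning
# plain Python lists instead of torch tensors.

# The suggested fix: .get() reads the value without mutating the dict.
text_kwargs = {"padding": True, "return_tensors": "pt"}
return_tensors = text_kwargs.get("return_tensors", None)
print(return_tensors)              # "pt"
print("return_tensors" in text_kwargs)  # True
```

With .get(), the tokenizer still receives return_tensors="pt" and produces tensors, so the masked_fill call succeeds.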
