Investigate TRTLLM runtime repetitive issue #5254

Open
@Fridah-nv

Description

For the models listed below, the trtllm runtime generates outputs that are repetitive and less coherent than demollm's outputs:

EleutherAI/pythia-6.9b
HuggingFaceTB/SmolVLM2-2.2B-Instruct
allenai/OLMo-2-1124-7B-SFT
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
meta-llama/CodeLlama-7b-Python-hf
meta-llama/Llama-3.2-1B-Instruct
mistralai/Mistral-Large-Instruct-2407
mistralai/Mistral-Nemo-Instruct-2407
nvidia/Llama-3.1-Minitron-4B-Width-Base

Let's take a closer look to see whether the trtllm runtime is misconfigured.
This analysis is based on the June 1st dashboard run. See the 6/1/2025 tab of the coverage Google Sheet for the specific outputs.
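
To make "repetitive" comparable across the two runtimes, one option is to score each output by the fraction of its word n-grams that repeat an earlier n-gram. The sketch below is only an illustration: the function name `repeated_ngram_fraction`, the whitespace tokenization, and the default `n=4` are assumptions, not part of the dashboard or of either runtime.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means every n-gram is unique; values near 1.0 mean the text is
    mostly the same n-grams over and over. Whitespace tokenization is an
    assumption made for simplicity.
    """
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Text shorter than n words has no n-grams to compare.
        return 0.0
    counts = Counter(ngrams)
    # Each n-gram seen c times contributes (c - 1) repeats.
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(ngrams)


if __name__ == "__main__":
    # Placeholder strings for illustration only, not real model outputs.
    looping = "a b c d a b c d a b c d"
    varied = "the quick brown fox jumps over the lazy dog"
    print(repeated_ngram_fraction(looping))
    print(repeated_ngram_fraction(varied))
```

Running this on the trtllm and demollm outputs for the same prompt and model gives a per-model number to compare, which can help separate a real runtime difference from a model that simply loops under both runtimes.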
