Description
For the models listed below, the trtllm runtime generates outputs that are repetitive and less coherent than demollm's outputs:
EleutherAI/pythia-6.9b
HuggingFaceTB/SmolVLM2-2.2B-Instruct
allenai/OLMo-2-1124-7B-SFT
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
meta-llama/CodeLlama-7b-Python-hf
meta-llama/Llama-3.2-1B-Instruct
mistralai/Mistral-Large-Instruct-2407
mistralai/Mistral-Nemo-Instruct-2407
nvidia/Llama-3.1-Minitron-4B-Width-Base
Let's take a closer look to see whether the trtllm runtime is misconfigured for these models.
This analysis is based on the June 1st dashboard run. See the 6/1/2025 tab of the coverage Google Sheet for the specific outputs.
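To put a rough number on "repetitive" when comparing the two runtimes, something like the sketch below could help. It is not part of the dashboard. The function name `repeated_ngram_fraction` is hypothetical, it assumes simple whitespace tokenization rather than the model's tokenizer, and the sample strings are made up for illustration, not taken from the sheet.

```python
def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means every n-gram is unique; values near 1.0 mean the text loops.
    Assumption: whitespace tokenization is good enough for a rough comparison.
    """
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        # Text shorter than n tokens: nothing to repeat.
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


if __name__ == "__main__":
    # Made-up sample outputs, only to show how the score behaves.
    looping = "the answer is the answer is the answer is the answer is"
    varied = "the quick brown fox jumps over the lazy dog today"
    print(f"looping: {repeated_ngram_fraction(looping):.2f}")
    print(f"varied:  {repeated_ngram_fraction(varied):.2f}")
```

Running this per model over the trtllm and demollm outputs for the same prompt would show whether the gap is consistent across the list above or concentrated in a few models.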