📚 A curated list of awesome LLM inference papers with code: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.
Topics: mla, vllm, llm-inference, awesome-llm, flash-attention, tensorrt-llm, paged-attention, deepseek, flash-attention-3, deepseek-v3, minimax-01, deepseek-r1, flash-mla, qwen3
Updated Jun 20, 2025 · Python