Moole Vulnerability Database


Medium

CVE-2025-62426

vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 up to, but not including, 0.11.1, the /v1/chat/completions and /tokenize endpoints accept a chat_template_kwargs request parameter and use it before it has been properly validated against the chat template. A crafted chat_template_kwargs value can block the API server's request processing for long periods of time, delaying all other requests. The issue is patched in version 0.11.1.
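The general pattern behind this class of issue (request-supplied keyword arguments reaching template rendering before they are checked) can be illustrated with a minimal sketch. This is not vLLM's code and not its patch. The function name `validate_chat_template_kwargs`, the allowlist contents, and the size limit are all hypothetical, chosen only to show validation happening before use.

```python
from typing import Any, Dict, Mapping, Optional

# Hypothetical allowlist: the keys a deployment's chat template is known to read.
# A real server would derive this from the template itself or from configuration.
ALLOWED_TEMPLATE_KWARGS = frozenset({"enable_thinking", "add_generation_prompt"})

# Hypothetical cap on how many kwargs a single request may supply.
MAX_TEMPLATE_KWARGS = 16


def validate_chat_template_kwargs(
    kwargs: Optional[Mapping[str, Any]],
) -> Dict[str, Any]:
    """Return a checked copy of request kwargs, or raise ValueError.

    The point of the sketch: reject unexpected keys *before* the kwargs
    are forwarded to any template-rendering or tokenization call.
    """
    if kwargs is None:
        return {}
    if len(kwargs) > MAX_TEMPLATE_KWARGS:
        raise ValueError("too many chat_template_kwargs entries")
    unknown = sorted(set(kwargs) - ALLOWED_TEMPLATE_KWARGS)
    if unknown:
        raise ValueError(f"unsupported chat_template_kwargs keys: {unknown}")
    return dict(kwargs)
```

A request handler following this sketch would call `validate_chat_template_kwargs(request_kwargs)` first and return a client error on `ValueError`, so that rejected input never reaches the rendering path.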

Vector: NETWORK

Published: Nov 21, 2025, 02:15

Published By: security-advisories@github.com

Base Score

Score: 6.5

Affected Versions

Package (Ecosystem)    Introduced    Fixed     Limit
vllm (PyPI)            0.5.5         0.11.1    N/A
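The affected range above (introduced in 0.5.5, fixed in 0.11.1) can be checked against an installed version with a short script. This is a rough sketch using only the standard library. The helper names `release_tuple` and `is_affected` are made up for illustration, and the parser looks only at the leading numeric release segment, so a pre-release such as an rc build of 0.11.1 would be treated as 0.11.1. For precise comparisons, a full version parser such as `packaging.version.Version` is a better fit.

```python
import re
from importlib import metadata

# Range taken from the Affected Versions table: [0.5.5, 0.11.1)
AFFECTED_FROM = (0, 5, 5)
FIXED_IN = (0, 11, 1)


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release segment, e.g. '0.6.3.post1' -> (0, 6, 3)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad short versions such as '0.6' to three components.
    return parts + (0,) * (3 - len(parts))


def is_affected(version: str) -> bool:
    """True if the version falls inside the affected range for CVE-2025-62426."""
    return AFFECTED_FROM <= release_tuple(version) < FIXED_IN


if __name__ == "__main__":
    try:
        installed = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        print("vllm is not installed in this environment")
    else:
        status = "AFFECTED" if is_affected(installed) else "not affected"
        print(f"vllm {installed}: {status}")
```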

References

https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/chat_utils.py#L1602-L1610
https://github.com/vllm-project/vllm/blob/2a6dc67eb520ddb9c4138d8b35ed6fe6226997fb/vllm/entrypoints/openai/serving_engine.py#L809-L814
https://github.com/vllm-project/vllm/commit/3ada34f9cb4d1af763fdfa3b481862a93eb6bd2b
https://github.com/vllm-project/vllm/pull/27205
https://github.com/vllm-project/vllm/security/advisories/GHSA-69j4-grxj-j64p

Weakness Type

CWE-770

CVSS Metrics

Base Score

6.5

Vector String

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Base Severity: Medium

Version

3.1

Attack Vector (AV)

NETWORK
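The vector string and the base score of 6.5 listed above can be cross-checked by decoding the vector and applying the CVSS v3.1 base score equations. The following is a sketch rather than a reference implementation: the function names are illustrative, it handles base metrics only (no temporal or environmental metrics), and it does little input validation.

```python
import math

# CVSS v3.1 base metric weights.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def parse_vector(vector: str) -> dict:
    """Split 'CVSS:3.1/AV:N/...' into a metric -> value mapping."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    return dict(pair.split(":", 1) for pair in pairs)


def roundup(value: float) -> float:
    """Round up to one decimal place, using the integer-based v3.1 definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    m = parse_vector(vector)
    scope_changed = m["S"] == "C"

    # Impact sub-score.
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability sub-score.
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[m["S"]][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope_changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

For this entry only the availability metric is High, so the impact term comes entirely from A:H, and the network-reachable, low-complexity, low-privilege exploitability term brings the total to the Medium score shown.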