- Security researchers found three flaws in Nvidia Triton Inference Server
- When used together, they can grant remote code execution capabilities
- A patch has been released, so users should update immediately
Nvidia Triton Inference Server carried three vulnerabilities that, when combined, could lead to remote code execution (RCE) and other risks, security experts from Wiz have warned.
Triton is a free, open-source tool for Windows and Linux that helps companies run AI models efficiently on servers, whether in the cloud, on-site, or at the edge.
It supports many popular AI frameworks and speeds up tasks by handling multiple models at once and grouping similar requests together.
Patching the flaws
Wiz found three flaws in the Python backend:
CVE-2025-23319 (an out-of-bounds write bug with an 8.1/10 severity score), CVE-2025-23320 (a shared memory limit exceeding vulnerability with a 7.5/10 severity score), and CVE-2025-23334 (an out-of-bounds read vulnerability with a 5.9/10 severity score).
“When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE),” Wiz said in its security advisory.
The risk is real, the researchers stressed, noting that companies stand to lose more than just sensitive data:
“This poses a critical risk to organizations using Triton for AI/ML, as a successful attack could lead to the theft of valuable AI models, exposure of sensitive data, manipulating the AI model’s responses, and a foothold for attackers to move deeper into a network,” the researchers added.
Nvidia said it addressed the issues in version 25.07, and users are “strongly recommended” to update to the latest version as soon as possible.
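For teams auditing their deployments, a check against the fixed release could look something like the sketch below. It is only an illustration: it assumes release strings follow a two-digit year.month pattern such as "25.07" or "25.06-py3", and the names `PATCHED_RELEASE`, `parse_release`, and `is_patched` are hypothetical helpers, not part of Triton or any Nvidia tool.

```python
import re
from typing import Tuple

# From the advisory: the issues are addressed in version 25.07.
PATCHED_RELEASE = (25, 7)


def parse_release(tag: str) -> Tuple[int, int]:
    """Extract (year, month) from a release string such as '25.07' or '25.06-py3'.

    Assumption: the string starts with a two-digit year, a dot, and a
    two-digit month. Anything else is rejected rather than guessed at.
    """
    match = re.match(r"^(\d{2})\.(\d{2})", tag)
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def is_patched(tag: str) -> bool:
    """Return True if the release is 25.07 or newer."""
    # Tuples compare element by element, so (25, 6) < (25, 7) < (26, 1).
    return parse_release(tag) >= PATCHED_RELEASE


if __name__ == "__main__":
    for tag in ["24.12-py3", "25.06-py3", "25.07-py3", "25.08"]:
        status = "patched" if is_patched(tag) else "needs update"
        print(f"{tag}: {status}")
```

A check like this only flags the release number. It does not confirm that a given server is actually reachable or exploitable, so it is best treated as a first pass before updating.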
At press time, there were no reports of anyone abusing these flaws in the wild. However, many cybercriminals wait until a vulnerability is disclosed and then target organizations that are slow to patch and leave their endpoints exposed for longer.
Via The Hacker News