Scientists develop a groundbreaking technique to reverse-engineer AI models with stunning precision, exposing vulnerabilities in commercial devices and sparking a call for new safeguards.
TPUXtract recovers the model hyperparameters from the Edge TPU in the Coral Dev board. The figure shows the back (a), front (b), and SoM (c) of the Coral Dev board. Source: https://coral.ai/products/dev-board/ Research: TPUXtract: An Exhaustive Hyperparameter Extraction Framework
Researchers have demonstrated the ability to steal an artificial intelligence (AI) model without hacking into the device where it was running. The technique relies on an online template-building approach, which the researchers present as a significant advance over prior methods, and it works even when the thief has no prior knowledge of the software or architecture supporting the AI.
"AI models are valuable, we don't want people to steal them," says Aydin Aysu, co-author of a paper on the work and an associate professor of electrical and computer engineering at North Carolina State University. "Building a model is expensive and requires significant computing sources. But just as importantly, when a model is leaked, or stolen, the model also becomes more vulnerable to attacks – because third parties can study the model and identify any weaknesses."
"As we note in the paper, model stealing attacks on AI and machine learning devices undermine intellectual property rights, compromise the competitive advantage of the model's developers, and can expose sensitive data embedded in the model's behavior," says Ashley Kurian, first author of the paper and a Ph.D. student at NC State.
In this work, the researchers stole the hyperparameters of an AI model running on a Google Edge Tensor Processing Unit (TPU).
"In practical terms, that means we were able to determine the architecture and specific characteristics – known as layer details – we would need to make a copy of the AI model," says Kurian.
"Because we stole the architecture and layer details, we were able to recreate the high-level features of the AI," Aysu says. "We then used that information to recreate the functional AI model, or a very close surrogate of that model."
The researchers describe this as the first comprehensive attack to extract hyperparameters from both sequential and non-sequential models, including the branching (non-linear) structures introduced by add and concatenate layers.
The researchers used the Google Edge TPU for this demonstration because it is a commercially available chip widely used to run AI models on edge devices, meaning devices used by end users in the field, as opposed to AI systems used for database applications.
"This technique could be used to steal AI models running on many different devices," Kurian says. "As long as the attacker knows the device they want to steal from, can access the device while it is running an AI model, and has access to another device with the same specifications, this technique should work."
The technique used in this demonstration relies on monitoring electromagnetic signals. Specifically, the researchers placed an electromagnetic probe on top of a TPU chip. The probe provides real-time data on the changes in the TPU's electromagnetic field during AI processing.
"The electromagnetic data from the sensor essentially gives us a 'signature' of the AI processing behavior," Kurian says. "That's the easy part."
To determine the AI model's architecture and layer details, the researchers compare the electromagnetic signature of the model to a database of other AI model signatures made on an identical device – meaning another Google Edge TPU, in this case.
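As a rough illustration of what comparing signatures can look like, the sketch below scores a measured trace against a small database and returns the closest entry. The article does not say which similarity measure the researchers use; Pearson correlation, the function names, and the synthetic traces here are all assumptions made for illustration.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two equal-length 1-D traces."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_match(trace, database):
    """Return (label, score) of the database entry most correlated with trace."""
    scores = {label: pearson(trace, ref) for label, ref in database.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]

# Synthetic stand-ins for EM signatures (assumption: real traces come from a probe).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
database = {
    "model_a": np.sin(2 * np.pi * 5 * t),
    "model_b": np.sin(2 * np.pi * 9 * t),
    "model_c": np.sign(np.sin(2 * np.pi * 3 * t)),
}
# A noisy measurement of "model_b" plays the role of the target device's trace.
measured = database["model_b"] + 0.3 * rng.standard_normal(t.size)

label, score = best_match(measured, database)
print(label, round(score, 3))
```

A match of this kind only works when the target model is already in the database, which is the limitation the next paragraphs address.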
How can the researchers "steal" an AI model for which they don't already have a signature? That's where things get tricky.
The researchers have a technique for estimating the number of layers in the targeted AI model. Layers are a series of sequential operations that the AI model performs, with the result of each operation informing the next operation. Most AI models have 50 to 242 layers.
"Rather than trying to recreate a model's entire electromagnetic signature, which would be computationally overwhelming, we break it down by layer," Kurian says. "We already have a collection of 5,000 first-layer signatures from other AI models. So, we compare the stolen first-layer signature to the first-layer signatures in our database to see which one matches most closely."
"Once we've reverse-engineered the first layer, that informs which 5,000 signatures we select to compare with the second layer," Kurian says. "And this process continues until we've reverse-engineered all of the layers and have effectively made a copy of the AI model."
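The layer-by-layer search Kurian describes can be sketched as a greedy loop: for each layer, generate candidate configurations given the layers already recovered, compare each candidate's template to the measured segment, and keep the closest. Everything in this toy version is hypothetical: `simulate_trace` stands in for profiling a candidate on an identical device, the candidate space is nine configurations rather than thousands, and Euclidean distance is an assumed metric.

```python
import itertools
import math

KERNELS = (1, 3, 5)      # hypothetical kernel sizes to try
FILTERS = (16, 32, 64)   # hypothetical filter counts to try

def simulate_trace(prefix, layer, n=64):
    """Hypothetical stand-in for measuring a candidate layer on a clone device.

    The trace depends on the layer's own hyperparameters and on the channel
    count inherited from the previously recovered layer.
    """
    in_ch = prefix[-1][1] if prefix else 3
    k, f = layer
    return [math.sin(0.1 * k * i) * f + 0.01 * in_ch * i for i in range(n)]

def distance(a, b):
    """Euclidean distance between two equal-length traces."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def extract(measured_segments):
    """Recover one (kernel, filters) pair per measured layer segment."""
    recovered = []
    for segment in measured_segments:
        candidates = itertools.product(KERNELS, FILTERS)
        # Templates are built on the fly, conditioned on the layers found so far.
        best = min(
            candidates,
            key=lambda c: distance(simulate_trace(recovered, c), segment),
        )
        recovered.append(best)
    return recovered

# A made-up "victim" model and its per-layer trace segments.
victim = [(3, 16), (5, 64), (1, 32)]
segments = [simulate_trace(victim[:i], layer) for i, layer in enumerate(victim)]

print(extract(segments))  # [(3, 16), (5, 64), (1, 32)]
```

Searching one layer at a time keeps the work proportional to the number of layers times the candidates per layer, instead of growing with every possible combination of layers.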
The researchers demonstrated that this technique could recreate a stolen AI model with 99.91% accuracy.
"Now that we've defined and demonstrated this vulnerability, the next step is to develop and implement countermeasures to protect against it," says Aysu.
The researchers disclosed the vulnerability to Google as part of an ethical disclosure process. They also propose several countermeasures, including adding dummy operations to obscure EM traces and altering the execution order of layers during inference to disrupt template matching.