Making AI updates greener: discover how RESQUE helps researchers cut energy use and reduce carbon emissions while keeping models dynamic and efficient.
Research: RESQUE: Quantifying Estimator to Task and Distribution Shift for Sustainable Model Reusability
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
The process of updating deep learning/AI models when they face new tasks or must accommodate changes in data can be costly in terms of computational resources and energy consumption. Researchers have developed a novel method for predicting those costs, allowing users to make informed decisions about when to update AI models to improve AI sustainability. Their work is posted to the arXiv preprint* server.
"There have been studies that focused on making deep learning model training more efficient," says Jung-Eun Kim, corresponding author of a paper on the work and an assistant professor of computer science at North Carolina State University. "However, over a model's life cycle, it will likely need to be updated many times. One reason is that, as our work here shows, retraining an existing model is much more cost-effective than training a new model from scratch.
"If we want to address sustainability issues related to deep learning AI, we must look at computational and energy costs across a model's entire life cycle – including the costs associated with updates. If you cannot predict what the costs will be ahead of time, it is impossible to engage in the type of planning that makes sustainability efforts possible. That makes our work here particularly valuable."
Training a deep learning model is a computationally intensive process, and users want to go as long as possible without having to update the AI. However, two types of shifts can happen that make these updates inevitable. First, the task that the AI is performing may need to be modified. For example, if a model was initially tasked with only classifying digits and traffic symbols, you may need to modify the task to identify vehicles and humans as well. This is called a task shift, where the model’s original representation space must adapt to accommodate new class boundaries.
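As a hypothetical illustration (not the paper's method), a task shift of this kind can be pictured at the level of a classifier's output layer: weight rows for the new classes are appended while the weights already learned for the original classes are reused, which is part of why updating an existing model is cheaper than training from scratch. All names and dimensions below are illustrative.

```python
import numpy as np

# Hypothetical sketch: a task shift adds new classes, so the
# classifier's output layer must grow. The trained weight rows
# for the original classes are kept, and only small fresh rows
# for the new classes need to be learned during the update.
rng = np.random.default_rng(1)
feature_dim, old_classes, new_classes = 128, 10, 2

W_old = rng.normal(size=(old_classes, feature_dim))         # already-trained weights
W_new = rng.normal(size=(new_classes, feature_dim)) * 0.01  # fresh rows for new classes

W = np.vstack([W_old, W_new])  # extended output layer
print(W.shape)  # (12, 128)
```

The update then fine-tunes `W` on the expanded task rather than re-learning all 12 classes from random initialization.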
Second, the data users provide to the model may change. For example, you may need to use a new kind of data, or perhaps the data you are working with is being coded differently. Either way, the AI needs to be updated to accommodate the change. This is called a distribution shift, which RESQUE quantifies by measuring the angle between normalized embedding vectors of the original and shifted datasets.
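A minimal sketch of this idea, assuming the shift is summarized by the angle between the normalized mean embedding vectors of the two datasets (the function name and the use of dataset means here are illustrative, not the paper's exact formulation):

```python
import numpy as np

def representation_shift(orig_embeddings, shifted_embeddings):
    """Angle (radians) between the normalized mean embedding
    vectors of the original and shifted datasets."""
    mu_a = orig_embeddings.mean(axis=0)
    mu_b = shifted_embeddings.mean(axis=0)
    mu_a = mu_a / np.linalg.norm(mu_a)
    mu_b = mu_b / np.linalg.norm(mu_b)
    cos = np.clip(mu_a @ mu_b, -1.0, 1.0)  # guard against rounding error
    return np.arccos(cos)

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 64))   # original data embeddings
b = rng.normal(0.5, 1.0, size=(500, 64))   # shifted data embeddings

print(round(representation_shift(a, a), 3))  # 0.0 for identical data
print(representation_shift(a, b))            # larger angle under shift
```

The larger the angle, the further the new data sits from the model's original representation, and the more retraining effort the update is expected to require.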
"Regardless of what is driving the need for an update, it is extremely useful for AI practitioners to have a realistic estimate of the computational demand that will be required for the update," Kim says. "This can help them make informed decisions about when to conduct the update, as well as how much computational demand they will need to budget for the update."
The researchers developed a new technique called the REpresentation Shift Quantifying Estimator (RESQUE) to forecast computational and energy costs.
Essentially, RESQUE allows users to compare the dataset on which a deep learning model was initially trained to the new dataset that will be used to update the model. The tool provides two specific quantifiers: RESQUEdist for distributional shifts and RESQUEtask for task shifts, enabling users to plan for distinct update scenarios. This comparison estimates the computational and energy costs associated with conducting the update.
Those costs are captured in a single index value, which the researchers validated against five concrete metrics: epochs, parameter change, gradient norm, carbon, and energy. Epochs, parameter change, and gradient norm are all ways of measuring the computational effort necessary to retrain the model.
"However, to provide insight regarding what this means in a broader sustainability context, we also tell users how much energy, in kilowatt hours, will be needed to retrain the model," Kim says. "And we predict how much carbon, in kilograms, will be released into the atmosphere in order to provide that energy."
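The step from an energy estimate to a carbon estimate is simple arithmetic: predicted energy use multiplied by the carbon intensity of the local electricity grid. A sketch, where the intensity value of 0.4 kg CO2 per kWh is an assumed example for illustration, not a figure from the paper:

```python
# Hypothetical illustration of converting a predicted retraining
# energy cost into a carbon estimate. Grid carbon intensity varies
# widely by region; 0.4 kg CO2/kWh is an assumed example value.
def carbon_estimate_kg(energy_kwh: float,
                       grid_intensity_kg_per_kwh: float = 0.4) -> float:
    """Carbon emitted (kg CO2) for a given energy budget (kWh)."""
    return energy_kwh * grid_intensity_kg_per_kwh

print(carbon_estimate_kg(12.5))  # 12.5 kWh at 0.4 kg/kWh -> 5.0 kg CO2
```

In practice the intensity figure would come from the region where the retraining actually runs, which is why the same update can have very different carbon footprints in different locations.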
To validate RESQUE's performance, the researchers conducted extensive experiments involving multiple datasets (such as CIFAR10 and SVHN), varying noise levels, and target tasks. These tests evaluated models like convolutional neural networks and vision transformers.
"We found that the RESQUE predictions aligned very closely with the real-world costs of conducting deep learning model updates," Kim says. "Also, as I noted earlier, all of our experimental findings tell us that training a new model from scratch demands far more computational power and energy than retraining an existing model."
In the short term, RESQUE is a useful methodology for anyone who needs to update a deep learning model.
"RESQUE can be used to help users budget computational resources for updates, allow them to predict how long the update will take, and so on," Kim says.
"In the bigger picture, this work offers a deeper understanding of the costs associated with deep learning models across their entire life cycle, which can help us make informed decisions related to the sustainability of the models and how they are used. It also contributes to the growing field of Green AI, focusing on the environmental impacts of AI development. Because if we want AI to be viable and useful, these models must be not only dynamic but sustainable."
The paper, "RESQUE: Quantifying Estimator to Task and Distribution Shift for Sustainable Model Reusability," will be presented at the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), which will be held Feb. 25-Mar. 4 in Philadelphia, Pa. The paper's first author is Vishwesh Sangarya, a graduate student at NC State who collaborated with Kim to develop the RESQUE framework.
Journal reference:
- Sangarya, V., & Kim, J.-E. (2024). RESQUE: Quantifying Estimator to Task and Distribution Shift for Sustainable Model Reusability. arXiv. https://arxiv.org/abs/2412.15511