
A variety of common deep learning models benefit from Zeus’ ability to tune GPU power limits and the training batch size. When both parameters were tuned, the software achieved up to 75% energy reduction.
Image credit: SymbioticLab, University of Michigan
Deep learning models that power giants like TikTok and Amazon, as well as tools like ChatGPT, could save energy without new hardware or infrastructure.
A new way to optimize the training of deep learning, the rapidly evolving technology powering modern artificial intelligence, could slash AI's energy demands.
Developed at the University of Michigan, the open-source optimization framework studies deep learning models during training, pinpointing the best tradeoff between energy consumption and training speed.
“At extreme scales, training the GPT-3 model just once consumes 1,287 MWh, which is enough to supply an average U.S. household for 120 years,” said Mosharaf Chowdhury, an associate professor of electrical engineering and computer science.
With Zeus, the new energy optimization framework developed by Chowdhury and his team, figures like this could be reduced by up to 75% without any new hardware, and with only minor impacts on the time it takes to train a model. Zeus was presented at the 2023 USENIX Symposium on Networked Systems Design and Implementation (NSDI) in Boston.

Mainstream uses for hefty deep learning models have exploded over the past three years, ranging from image-generation models and expressive chatbots to the recommender systems powering TikTok and Amazon. With cloud computing already out-emitting commercial aviation, the increased climate burden from artificial intelligence is a significant concern.
“Existing work primarily focuses on optimizing deep learning training for faster completion, often without considering the impact on energy efficiency,” said Jae-Won Chung, a doctoral student in computer science and engineering and co-first author of the study. “We discovered that the energy we’re pouring into GPUs is giving diminishing returns, which allows us to reduce energy consumption significantly, with relatively little slowdown.”
Deep learning is a family of techniques that use multilayered artificial neural networks, also known as deep neural networks (DNNs), to tackle a range of common machine learning tasks. The models themselves are extremely complex, learning from some of the most massive data sets ever used in machine learning. Because of this, they benefit greatly from the multitasking capabilities of graphics processing units (GPUs), which burn through 70% of the power that goes into training one of these models.
Zeus uses two software knobs to reduce energy consumption. One is the GPU power limit, which lowers a GPU's power draw, slowing the model's training until the setting is adjusted again. The other is the deep learning model's batch size parameter, which controls how many samples from the training data the model works through before updating its internal representation of the relationships it finds in the data. Larger batch sizes reduce training time, but at the cost of increased energy consumption.
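For readers who want to see what those two knobs look like in code, here is a minimal sketch using NVIDIA's NVML bindings (pynvml) for the power limit and a standard PyTorch DataLoader for the batch size. The specific values are illustrative, not settings Zeus would choose, and changing a GPU's power limit typically requires administrator privileges.

```python
# Minimal sketch of the two knobs Zeus tunes: the GPU power limit
# (via pynvml) and the training batch size (via PyTorch). The values
# here (150 W, batch size 64) are illustrative only.
import pynvml
import torch
from torch.utils.data import DataLoader, TensorDataset

# Knob 1: GPU power limit. NVML expects milliwatts, and setting the
# limit usually requires root/administrator privileges.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
pynvml.nvmlDeviceSetPowerManagementLimit(handle, 150_000)  # cap GPU 0 at 150 W

# Knob 2: batch size. Larger batches finish an epoch in fewer steps
# but draw more power per step.
dataset = TensorDataset(torch.randn(10_000, 32), torch.randint(0, 2, (10_000,)))
loader = DataLoader(dataset, batch_size=64, shuffle=True)

for features, labels in loader:
    ...  # one optimizer step per batch; Zeus would meter time and energy here

pynvml.nvmlShutdown()
```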
Zeus is able to tune each of these settings in real time, seeking the optimal tradeoff point at which energy usage is minimized with as little impact on training time as possible. In examples, the team was able to visually demonstrate this tradeoff point by showing every possible combination of these two parameters. While that level of thoroughness won’t happen in practice with a particular training job, Zeus will take advantage of the repetitive nature of machine learning to come very close.
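To make the tradeoff concrete, the hypothetical sweep below scores every (power limit, batch size) pair with a cost that blends energy and time, mirroring the exhaustive visualization described above. The `measure_job` stand-in, the weight `ETA`, and the toy cost numbers are all assumptions for illustration; Zeus itself avoids this full sweep by learning across recurring training jobs.

```python
# Hypothetical exhaustive sweep over the two knobs. Zeus does NOT do
# this in practice; it approximates the same optimum across recurring
# jobs. `measure_job` is a fabricated stand-in for a real training run.
from itertools import product

POWER_LIMITS_W = [100, 150, 200, 250, 300]
BATCH_SIZES = [32, 64, 128, 256]
ETA = 0.5  # 0 = care only about time, 1 = care only about energy


def measure_job(power_limit_w: int, batch_size: int) -> tuple[float, float]:
    """Toy stand-in for one full training run to a target accuracy.

    Returns (energy_joules, time_seconds). A real run would set the
    power limit, build the DataLoader, train, and meter the GPU.
    """
    # Fabricated behavior: more power -> faster; bigger batches ->
    # somewhat faster but slightly more energy per sample.
    time_s = 3600.0 * (300.0 / power_limit_w) * (64.0 / batch_size) ** 0.25
    energy_j = power_limit_w * time_s * (batch_size / 64.0) ** 0.1
    return energy_j, time_s


def cost(energy_j: float, time_s: float, max_power_w: float = 300.0) -> float:
    # Weighted blend of energy and time; scaling time by the maximum
    # power limit keeps both terms in joules.
    return ETA * energy_j + (1.0 - ETA) * max_power_w * time_s


best = min(product(POWER_LIMITS_W, BATCH_SIZES),
           key=lambda knobs: cost(*measure_job(*knobs)))
print(f"Best (power limit W, batch size): {best}")
```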
“Fortunately, companies train the same DNN over and over again on newer data, as often as every hour. We can learn about how the DNN behaves by observing across those recurrences,” said Jie You, a recent doctoral graduate in computer science and engineering and co-lead author of the study.
Zeus is the first framework designed to plug into existing workflows for a variety of machine learning tasks and GPUs, reducing energy consumption without requiring any changes to a system’s hardware or datacenter infrastructure.
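As a sense of how little an existing workflow has to change, the open-source Zeus package (https://github.com/ml-energy/zeus) documents an energy monitor that wraps an ordinary training loop. The sketch below follows the project's published examples as best they can be summarized here; treat the exact class and attribute names as assumptions and check the documentation for your installed version.

```python
# Sketch of metering a training window with the open-source Zeus
# package. Names follow the project's published examples; verify
# against your installed version.
from zeus.monitor import ZeusMonitor

monitor = ZeusMonitor(gpu_indices=[0])  # meter GPU 0

monitor.begin_window("training")
# ... existing training loop runs here, unchanged ...
measurement = monitor.end_window("training")

print(f"Energy consumed: {measurement.total_energy} J")
print(f"Elapsed time:    {measurement.time} s")
```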
In addition, the team has developed complementary software that they layer on top of Zeus to reduce the carbon footprint further. This software, called Chase, privileges speed when low-carbon energy is available, and chooses efficiency at the expense of speed during peak times, which are more likely to require ramping up carbon-intensive energy generation such as coal. Chase took second place at last year's CarbonHack hackathon and is to be presented May 4 at a workshop of the International Conference on Learning Representations (ICLR).
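The switching behavior Chase implements can be pictured as a simple policy over grid carbon intensity: run fast while electricity is clean, trade speed for efficiency when it is not. The sketch below is a hypothetical illustration of that idea; the threshold, knob values, and function names are assumptions, not Chase's actual code.

```python
# Hypothetical sketch of Chase-style carbon-aware switching: favor
# speed when grid electricity is low-carbon, favor energy efficiency
# when carbon intensity is high. All names and numbers are illustrative.

CARBON_THRESHOLD_G_PER_KWH = 200  # assumed cutoff between "clean" and "dirty"

HIGH_PERF_POWER_LIMIT_W = 300   # run fast while energy is low-carbon
EFFICIENT_POWER_LIMIT_W = 150   # save energy during carbon-intensive peaks


def choose_power_limit(carbon_intensity_g_per_kwh: float) -> int:
    """Pick a GPU power limit from the current grid carbon intensity."""
    if carbon_intensity_g_per_kwh < CARBON_THRESHOLD_G_PER_KWH:
        return HIGH_PERF_POWER_LIMIT_W
    return EFFICIENT_POWER_LIMIT_W


# Example: a grid reporting 350 gCO2/kWh (a fossil-heavy peak period)
print(choose_power_limit(350.0))  # -> 150
```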
“It is not always possible to readily migrate DNN training jobs to other locations due to large dataset sizes or data regulations,” said Zhenning Yang, a master’s student in computer science and engineering. “Deferring training jobs to greener time frames may not be an option either, since DNNs must be trained with the most up-to-date data and quickly deployed to production to achieve the highest accuracy.
“Our aim is to design and implement solutions that do not conflict with these realistic constraints, while still reducing the carbon footprint of DNN training.”
Original Article: Optimization could cut the carbon footprint of AI training by up to 75%