A variety of common deep learning models benefit from Zeus’ ability to tune GPU power limits and the training batch size. When both parameters were tuned, the software achieved up to 75% energy reduction.
Image credit: SymbioticLab, University of Michigan
Deep learning models that power giants like TikTok and Amazon, as well as tools like ChatGPT, could save energy without new hardware or infrastructure.
A new way to optimize the training of deep learning models, a rapidly evolving tool for powering artificial intelligence, could slash AI’s energy demands.
Developed at the University of Michigan, the open-source optimization framework studies deep learning models during training, pinpointing the best tradeoff between energy consumption and the speed of the training.
“At extreme scales, training the GPT-3 model just once consumes 1,287 MWh, which is enough to supply an average U.S. household for 120 years,” said Mosharaf Chowdhury, an associate professor of electrical engineering and computer science.
With Zeus, the new energy optimization framework developed by Chowdhury and his team, figures like this could be reduced by up to 75% without any new hardware—and with only minor impacts on the time it takes to train a model. It was presented at the 2023 USENIX Symposium on Networked Systems Design and Implementation (NSDI), in Boston.
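The arithmetic behind these figures can be checked in a few lines. This is only a back-of-the-envelope sketch: the per-household number is simply what the quote implies, and applying the full 75% best case to the GPT-3 figure is an assumption made for illustration, not a result reported for that model.

```python
# Figures quoted in the article.
TRAINING_ENERGY_MWH = 1287      # one GPT-3 training run
HOUSEHOLD_YEARS = 120           # years of supply for an average U.S. household
BEST_CASE_REDUCTION = 0.75      # "up to 75%", an upper bound

# Annual household consumption implied by the quote, in kWh.
implied_kwh_per_household_year = TRAINING_ENERGY_MWH * 1000 / HOUSEHOLD_YEARS

# Energy left if the best-case reduction applied to this run (an assumption).
remaining_mwh = TRAINING_ENERGY_MWH * (1 - BEST_CASE_REDUCTION)

if __name__ == "__main__":
    print(implied_kwh_per_household_year, remaining_mwh)
```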
Mainstream uses for hefty deep learning models have exploded over the past three years, ranging from image-generation models and expressive chatbots to the recommender systems powering TikTok and Amazon. With cloud computing already out-emitting commercial aviation, the increased climate burden from artificial intelligence is a significant concern.
“Existing work primarily focuses on optimizing deep learning training for faster completion, often without considering the impact on energy efficiency,” said Jae-Won Chung, a doctoral student in computer science and engineering and co-first author of the study. “We discovered that the energy we’re pouring into GPUs is giving diminishing returns, which allows us to reduce energy consumption significantly, with relatively little slowdown.”
Deep learning is a family of techniques that use multilayered artificial neural networks, also known as deep neural networks (DNNs), to tackle a range of common machine learning tasks. The models themselves are extremely complex, learning from some of the most massive data sets ever used in machine learning. Because of this, they benefit greatly from the multitasking capabilities of graphics processing units (GPUs), which burn through 70% of the power that goes into training one of these models.
Zeus uses two software knobs to reduce energy consumption. One is the GPU power limit: lowering it reduces a GPU’s power use but slows the model’s training until the setting is adjusted again. The other is the deep learning model’s batch size parameter, which controls how many samples from the training data the model works through before updating the way it represents the relationships it finds in the data. Larger batch sizes reduce training time, but at the cost of increased energy consumption.
Zeus is able to tune each of these settings in real time, seeking the optimal tradeoff point at which energy usage is minimized with as little impact on training time as possible. In examples, the team was able to visually demonstrate this tradeoff point by showing every possible combination of these two parameters. While that level of thoroughness won’t happen in practice with a particular training job, Zeus will take advantage of the repetitive nature of machine learning to come very close.
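The exhaustive sweep described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the candidate batch sizes and power limits, the made-up `toy_measure` function standing in for real training runs, and the weighted cost with its `eta` knob. It shows the idea of scoring every combination of the two settings, not how Zeus itself is implemented.

```python
from itertools import product

# All values below are invented for illustration, not measurements from Zeus.
BATCH_SIZES = [32, 64, 128, 256]
POWER_LIMITS_W = [100, 150, 200, 250]
MAX_POWER_W = max(POWER_LIMITS_W)


def toy_measure(batch_size, power_limit_w):
    """Hypothetical stand-in for one training run; returns (seconds, joules)."""
    # Fewer, larger batches mean less total work in this toy model.
    work = 1000.0 * (1 + 64 / batch_size)
    # Diminishing returns: speed grows only with the square root of the limit.
    speed = power_limit_w ** 0.5
    seconds = work / speed
    # Larger batches keep the GPU busier, so average draw nears the limit.
    average_draw_w = power_limit_w * (0.5 + 0.5 * batch_size / 256)
    return seconds, seconds * average_draw_w


def cost(batch_size, power_limit_w, eta):
    """Weighted cost in joules: eta=1 counts only energy, eta=0 only time."""
    seconds, joules = toy_measure(batch_size, power_limit_w)
    return eta * joules + (1 - eta) * MAX_POWER_W * seconds


def best_config(eta):
    """Exhaustively score every (batch size, power limit) pair."""
    return min(product(BATCH_SIZES, POWER_LIMITS_W),
               key=lambda cfg: cost(cfg[0], cfg[1], eta))


if __name__ == "__main__":
    for eta in (0.0, 0.5, 1.0):
        print(eta, best_config(eta))
```

Sliding `eta` from 0 to 1 moves the chosen setting from the fastest configuration toward the most frugal one. A real job cannot afford to run every pair, which is why the sweep is only practical as a demonstration.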
“Fortunately, companies train the same DNN over and over again on newer data, as often as every hour. We can learn about how the DNN behaves by observing across those recurrences,” said Jie You, a recent doctoral graduate in computer science and engineering and co-lead author of the study.
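One simple way to picture learning across recurrences is a tuner that tries each candidate setting once and then keeps reusing whichever has the lowest average observed cost. The sketch below is a deliberately simplified, deterministic illustration: the `RecurrenceTuner` class, the `simulate` helper, and the cost numbers are all hypothetical, and the exploration strategy Zeus actually uses is not described in this article.

```python
from collections import defaultdict


class RecurrenceTuner:
    """Try every configuration once, then reuse the cheapest seen so far."""

    def __init__(self, configs):
        self.configs = list(configs)
        self.totals = defaultdict(float)  # summed cost per configuration
        self.counts = defaultdict(int)    # number of runs per configuration

    def choose(self):
        # Explore: return the first configuration that has never been run.
        for config in self.configs:
            if self.counts[config] == 0:
                return config
        # Exploit: return the configuration with the lowest mean cost.
        return min(self.configs,
                   key=lambda c: self.totals[c] / self.counts[c])

    def record(self, config, observed_cost):
        self.totals[config] += observed_cost
        self.counts[config] += 1


def simulate(tuner, true_costs, recurrences):
    """Replay recurring training jobs against fixed, made-up costs."""
    history = []
    for _ in range(recurrences):
        config = tuner.choose()
        tuner.record(config, true_costs[config])
        history.append(config)
    return history


if __name__ == "__main__":
    # Hypothetical (batch size, power limit in watts) -> cost of one job.
    costs = {(64, 150): 9.0, (128, 150): 7.5, (256, 200): 8.2}
    print(simulate(RecurrenceTuner(costs), costs, recurrences=8))
```

Real measurements are noisy, so a practical tuner would keep exploring occasionally instead of committing after a single pass.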
Zeus is the first framework designed to plug into existing workflows for a variety of machine learning tasks and GPUs, reducing energy consumption without requiring any changes to a system’s hardware or datacenter infrastructure.
In addition, the team has developed complementary software that they layer on top of Zeus to reduce the carbon footprint further. This software, called Chase, prioritizes speed when low-carbon energy is available and chooses efficiency at the expense of speed during peak times, which are more likely to require ramping up carbon-intensive energy generation such as coal. Chase took second place at last year’s CarbonHack hackathon and is to be presented May 4 at the International Conference on Learning Representations Workshop.
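The policy described for Chase can be illustrated with a simple threshold rule. This is a minimal sketch under stated assumptions, not Chase's actual logic: the function names, the 300 g/kWh threshold, the two power limits, and the sample forecast are all hypothetical.

```python
def choose_power_limit(carbon_g_per_kwh,
                       threshold_g_per_kwh=300.0,
                       fast_limit_w=250,
                       efficient_limit_w=150):
    """Favor speed on clean power and efficiency on carbon-intensive power."""
    if carbon_g_per_kwh < threshold_g_per_kwh:
        return fast_limit_w
    return efficient_limit_w


def emissions_g(energy_kwh, carbon_g_per_kwh):
    """Grams of CO2 attributed to the energy used at a given grid intensity."""
    return energy_kwh * carbon_g_per_kwh


if __name__ == "__main__":
    # Hypothetical hourly grid carbon intensity, in g CO2 per kWh.
    forecast = [120.0, 180.0, 420.0, 520.0, 260.0]
    print([choose_power_limit(c) for c in forecast])
```

Because the job keeps running at every hour and only its power limit changes, a rule like this avoids deferring or migrating training, the two options the researchers describe as often impractical.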
“It is not always possible to readily migrate DNN training jobs to other locations due to large dataset sizes or data regulations,” said Zhenning Yang, a master’s student in computer science and engineering. “Deferring training jobs to greener time frames may not be an option either, since DNNs must be trained with the most up-to-date data and quickly deployed to production to achieve the highest accuracy.
“Our aim is to design and implement solutions that do not conflict with these realistic constraints, while still reducing the carbon footprint of DNN training.”
Original Article: Optimization could cut the carbon footprint of AI training by up to 75%