The Carbon Footprint of AI and Deep Learning

Prateek Majumder Last Updated : 01 Mar, 2022

5 min read

This article was published as a part of the Data Science Blogathon.

There are immense computational costs of Deep Learning and AI. Artificial intelligence algorithms, which power some of technology’s most cutting-edge applications, such as producing logical stretches of text or creating visuals from descriptions, may need massive amounts of computational power to train. This, in turn, necessitates a vast quantity of power, prompting many to fear that the carbon footprint of these increasingly popular ultra-large A.I. systems would render them environmentally unsustainable.

AI and Energy

The artificial intelligence sector is sometimes likened to the oil industry: data, like oil, can be a tremendously profitable commodity if collected and processed. Researchers from the University of Massachusetts, Amherst, conducted a life cycle analysis for training several typical big AI models in a recent publication. They discovered that the procedure may produce almost 626,000 pounds of CO2 equivalent.

( Source: MIT Technology Review )

( Image: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/ )

Modern AI models consume a tremendous amount of energy, and their energy demands are rapidly increasing. The computing resources required to create a best-in-class AI model have doubled every 3 to 4 months in the deep learning era.

From: Rob Toews for Forbes Jun 17, 2020,11:54am EDT ( Link Below ).

( Source: Forbes)

Today, AI has a significant carbon footprint, and if current market trends continue, it will soon be much worse. Artificial intelligence might become an adversary in the battle against climate change in the years ahead unless we are ready to examine and revise today’s AI development programme.

Understanding the Carbon Footprint

Artificial intelligence is becoming a more vital part of the research, health, and even our everyday life. Deep learning is used in chatbots, digital assistants, and streaming service movie and music recommendations. Deep learning is a process in which computer models are trained to spot patterns in data.

( Image: https://www.pexels.com/photo/female-software-engineer-coding-on-computer-3861972/ )

The GPT-3 model has a staggering 175 billion parameters. To put this amount in context, its previous model GPT-2 had just 1.5 billion parameters when it was introduced, which was considered cutting-edge at the time. GPT-2 took a few dozen petaflop-days to train, which was already a large amount of computing input, but GPT-3 took thousands.

The datasets utilised to train these algorithms are becoming increasingly large. After being trained on a dataset of 3 billion words, the BERT model achieved best-in-class NLP performance in 2018. Based on a training set of 32 billion words, XLNet surpassed BERT. GPT-2 was trained on a dataset of 40 billion words not long after that. GPT-3 was trained using a weighted dataset of around 500 billion words, which dwarfed all prior attempts.

Such exponential increase in training data leads to the rising carbon footprint of AI and Deep Learning. For each piece of data they are fed during training, neural networks perform a lengthy sequence of mathematical operations (both forward and reverse propagation), adjusting their parameters in sophisticated ways. Larger datasets necessitate increased computation and energy requirements.

The process of deploying AI models to take action in real-world settings, known as inference, requires considerably more energy than training. Indeed, Nvidia believes that inference, rather than training, accounts for 80% to 90% of the cost of a neural network.

Research regarding The Carbon Footprint of AI and Deep Learning

Given the everyday consequences of climate change, the consensus is growing on the necessity for AI research ethics to include a focus on limiting and offsetting the study’s carbon imprint. Along with time, accuracy, and other factors, researchers should include the cost of energy in research paper results. In a recent study report published by MIT researchers, the process of deep learning outsizing the environmental effect was further underlined. Researchers conducted a life cycle study for training many typical big AI models in the article “Energy and Policy Considerations for Deep Learning in NLP.

Transformer, ELMo, BERT, and GPT-2 are four models in the field that have been responsible for the most significant improvements in performance. They trained for up to a day on a single GPU to determine its power usage. They then calculated the total energy spent during the whole training procedure using the number of training hours stated in the model’s original papers. Based on the average energy mix in the United States, which roughly matches the energy mix utilised by Amazon’s AWS, the largest cloud services provider, that amount was translated into pounds of carbon dioxide equivalent.

( Image: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/ )

The researchers discovered that the environmental costs of training increased in direct proportion to model size. It grew rapidly when more tuning steps were applied to improve the model’s ultimate accuracy. Neural architecture search, in particular, has large associated costs for little performance advantage. Neural architecture search is a tuning procedure that attempts to improve a model by progressively altering the design of a neural network via extensive trial and error. The researchers also stated that these data should only be used as a starting point. In practice, AI researchers either create a new model from start or adapt an existing model to a new data set, both of which need many additional rounds of training and tweaking.

Reducing the Carbon Footprint

There are various ways in which the carbon footprint can be reduced. Let us have a look.

Make use of computationally efficient machine learning algorithms. This is possible because quantization technologies can bring CPUs up to speed with GPUs and TPUs. One possibility is to use resource-constrained devices, such as IoT devices, for edge computing.
Don’t train a model from scratch if it is not necessary. Even though the majority of researchers don’t train from scratch, training a heavy model from scratch can burn up a lot of CPU/GPU power. Hence, using a pre-trained model can reduce the carbon footprint significantly as less electricity is necessary. Azure offers various sophisticated pre-trained models for vision, speech, language & search. These are already trained on massive datasets using many compute hours training the model, the end-user only has to transfer the knowledge of these models to their own dataset. This way only a few compute hours and power are spent. There is a side note to this, model inferences have been indicated to be a bigger consumer of energy than the training itself based on research done by Nvidia in 2019.
When necessary, use Automated ML. Whereas machine learning was formerly primarily a laboratory activity, an increasing number of businesses are beginning to employ technology outside of the laboratory. This implies that determining which model is the most efficient and quickly eliminating the less efficient models not only saves money on the computational side but also minimizes the carbon footprint.
The use of federated learning (FL)methods can be advantageous in the area of carbon footprint. For example, a study conducted by experts at the University of Cambridge, FL has an advantage because of the cooling requirements of data centers. Though GPUs and TPUs are becoming more efficient in terms of processing power given per unit of energy spent, the requirement for a powerful and energy-consuming cooling system persists, so the FL can always benefit from hardware developments.
Deep learning and particular optical neural networks take a lot of energy. Second, the model inference is a significant energy user. Nvidia estimates that model inference costs 80-90 percent of the model cost. One solution is to use customized processors that increase the speed and efficiency of training and testing neural networks. As a result, having the proper hardware is critical.

Conclusion

AI appears to be destined for a dual function. On the one hand, technology has the potential to help mitigate the consequences of the climate problem, such as through smart grid design, low-emission infrastructure development, and climate change modelling. AI, on the other hand, is a big carbon emitter.

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.

Prateek Majumder

Prateek is a dynamic professional with a strong foundation in Artificial Intelligence and Data Science, currently pursuing his PGP at Jio Institute. He holds a Bachelor's degree in Electrical Engineering and has hands-on experience as a System Engineer at TCS Digital, where he excelled in API management and data integration. Prateek also has a background in product marketing and analytics from his time with start-ups like AppleX and Milkie Way, Inc., where he was involved in growth campaigns and technical blog management. Recognized for his structured thinking and problem-solving abilities, he has received accolades like the Dr. Sudarshan Chakraborty Award for Best Student Performance. Fluent in multiple languages and passionate about technology, Prateek continues to expand his expertise in the rapidly evolving AI and tech landscape.

Beginner Deep Learning

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

Reading list

Introduction to Deep Learning

Feed Forward Networks

Gradient Descent

Loss Function

Activation Functions

Introduction to Neural networks

Forward and Backward Propagation

Optimizers

Learning Rate Schedulers

NN on Structured Data

Improving the Deep Learning Model

Deep Learning Model Optimization

Unsupervised Deep Learning

AutoDL

Model Deployment

Introduction to PyTorch

The Carbon Footprint of AI and Deep Learning

AI and Energy

Understanding the Carbon Footprint

Research regarding The Carbon Footprint of AI and Deep Learning

Reducing the Carbon Footprint

Conclusion

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC

G_ENABLED_IDPS

test_cookie

Webengage (2)

_we_us

WebKlipperAuth

LinkedIn (16)

ln_or

JSESSIONID

li_rm

AnalyticsSyncHistory

lms_analytics

liap

visit

li_at

s_plt

lang

s_tp

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

s_pltp

s_tslv

li_theme

li_theme_set

Google (11)

_gcl_au

SID

SAPISID