This document contains rough calculations of the power consumption and CO2e attributable to AI. If you're an ML/AI researcher or a heavy user of AI-related applications, please read on.

Common emitters (for comparison)

| Emitter | CO2e (kg) | Ref. |
|---|---|---|
| Lifetime car emissions | 57,152 | [1] |
| London->NY flight (per person) | 179 | [3] |
| London->NY flight (overall) | 59,600 | [3] |

| UK collective emissions (per annum) | CO2e (million tonnes) | Ref. |
|---|---|---|
| Electricity production | 24 | [4] |
| Food & catering | 22.4 | [4] |
| Recreation & leisure (travel, socialising, entertainment, culture) | 31.6 | [4] |
| Aviation fuel emissions | 11 | [4] |

Global emissions from aviation (per annum): ~1 billion tonnes [15]

Training cost of AI models

| Model | # params | kWh-PUE | CO2e (kg) | Ref. |
|---|---|---|---|---|
| ELMo | 93.6M | 275 | 119 | [1] |
| BERT-Base (GPU) | 108M | 1,507 | 652 | [1] |
| GPT-2 | 1.5B | - | - | |
| GPT-3 | 175B | 1,287,000 (1,287 MWh) | - | [16] |
| GPT-4 | Unknown (order of trillions?) | - | - | |
| OLMo 7B | 7B | 239,000 | 69,780 | [2] |
| Llama 2 7B | 7B | ~115,015 (184K GPU-hrs) | 31,220 (100% offset claimed) | [5] |
| Llama 2 70B | 70B | ~1,060,800 (1.7M GPU-hrs) | 291,420 | [5] |
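
The GPU-hour entries above convert to energy and emissions with a simple rule of thumb: energy = GPU-hours × per-GPU power × PUE, and CO2e = energy × grid carbon intensity. Below is a minimal sketch of that conversion; the 400 W per-GPU draw, PUE of 1.4, and 0.95 lb/kWh intensity are assumed round numbers rather than the values used in [5], so it reproduces the Llama 2 70B row only to within tens of percent.

```python
# Rough GPU-hours -> energy -> CO2e converter. The power draw, PUE, and
# grid carbon intensity below are assumptions, not the papers' values.
LB_PER_KG = 2.20462

def training_footprint(gpu_hours, gpu_power_kw=0.4, pue=1.4, lb_per_kwh=0.95):
    """Return (energy in kWh, CO2e in kg) for a training run."""
    energy_kwh = gpu_hours * gpu_power_kw * pue
    co2e_kg = energy_kwh * lb_per_kwh / LB_PER_KG
    return energy_kwh, co2e_kg

# Llama 2 70B: ~1.7M A100 GPU-hours [5] (400 W per GPU assumed)
energy_kwh, co2e_kg = training_footprint(1.7e6)
print(f"~{energy_kwh/1e3:,.0f} MWh, ~{co2e_kg/1e3:,.0f} t CO2e")
# -> ~952 MWh, ~410 t CO2e: the same ballpark as the table row above
```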

AI deployment

The next step after training is deploying the models for inference on new inputs. Inference is considered cheaper, but with millions of queries a day, the energy cost of inference can far exceed that of training. For instance, according to [16], [17]:

Research firm SemiAnalysis suggested that OpenAI required 3,617 of NVIDIA’s HGX A100 servers, with a total of 28,936 graphics processing units (GPUs), to support ChatGPT, implying an energy demand of 564 MWh per day
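
That figure is easy to reproduce from the quoted server count. A minimal sanity check, assuming each HGX A100 server draws roughly the 6.5 kW rated for a DGX A100 system [18] and runs around the clock:

```python
# Sanity check of the quoted 564 MWh/day: 3,617 servers running 24/7,
# assuming ~6.5 kW per server (the DGX A100 system rating [18]).
servers = 3_617
kw_per_server = 6.5
mwh_per_day = servers * kw_per_server * 24 / 1_000
print(f"~{mwh_per_day:,.0f} MWh/day")  # -> ~564 MWh/day
```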

What is the energy cost of AI applications?
This is tricky to calculate: there are far too many consumers to keep track of. We will attempt a back-of-the-envelope calculation using NVIDIA's sales figures. The two most popular Nvidia GPU offerings are the A100, released in May 2020, and the H100, released in March 2022. [7] reports that NVIDIA sold nearly 500,000 H100 GPUs in a single quarter (Q3). Overall, Nvidia is estimated to have sold 1.5 million H100 units in 2023 and is expected to sell 3.5 million in 2024 [9]. I could not find an estimate for A100 unit sales, but if the H100 clocked 1.5 million units in a single year, it is reasonable to assume that at least 2 million A100s were sold over its lifetime.

The Power Usage Effectiveness (PUE) (i.e., the multiplier capturing the extra power required to keep the systems cool) of big corporations tends to be very good; Microsoft Azure's, for instance, is 1.2 [13]. But we will use a PUE of 1.4, a slight improvement over the global average used in [1] five years ago. If we assume that the people buying the GPUs use them for at least 5 hours a day, then we have the following estimate for the power consumption per year:

= (PUE = 1.4) × (5 hr/day × 365 days) × (2 million A100 × 800 W + 1.5 million H100 × 1,300 W)
= 9,070,250 MWh

Here I estimated per-GPU power from server ratings: a DGX A100 server with 8 GPUs draws 6.5 kW [18] and a DGX H100 server with 8 GPUs draws 10.2 kW [19], giving roughly 6.5 kW/8 ≈ 0.8 kW per A100 and 10.2 kW/8 ≈ 1.3 kW per H100. That comes to about 9 TWh at an average operation of 5 hr per day (~20% utilisation), and about 43.5 TWh if operating at full capacity. [16] makes a similar calculation and estimates a power consumption of 85.4-134.0 TWh. 9-43 TWh just from 2021-23 sales (from Nvidia alone) is massive, yet it is only a fraction of the overall datacenter consumption of 205 TWh in 2022 [20].
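
For reproducibility, here is the same fleet estimate as a short script; the unit counts, per-GPU power, PUE, and 5 hr/day duty cycle are the assumptions stated above.

```python
# Fleet-level energy estimate from GPU sales, using the assumptions above.
PUE = 1.4
HOURS_PER_YEAR = 5 * 365           # assumed 5 hr of use per day

fleet = {
    # name: (estimated units sold, per-GPU power in kW)
    "A100": (2.0e6, 0.8),          # 6.5 kW DGX A100 / 8 GPUs [18]
    "H100": (1.5e6, 1.3),          # 10.2 kW DGX H100 / 8 GPUs [19]
}

kwh = PUE * HOURS_PER_YEAR * sum(n * kw for n, kw in fleet.values())
print(f"~{kwh/1e9:.1f} TWh/yr at 5 hr/day")       # -> ~9.1 TWh
print(f"~{kwh/1e9 * 24 / 5:.1f} TWh/yr at 24/7")  # -> ~43.5 TWh
```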

What is the corresponding CO2 estimate?
According to [10], 55% of Nvidia's revenue comes from the US, 20% from Taiwan, and 25% from other countries. The average CO2 emission per kWh of electricity produced in the USA is 0.82 lb/kWh [11] (an improvement over the 0.95 reported in [1]), and that of Taiwan is roughly 0.5 kg CO2/kWh ≈ 1.1 lb/kWh [12]. I will use 0.95 lb/kWh as a global average to cover the "other" countries. Overall, the estimated CO2 emissions due to power consumption alone are:

= 9,070,250 × 1,000 kWh × (0.55 × 0.82 + 0.2 × 1.1 + 0.25 × 0.95) lb/kWh
= 3.7M tonnes of CO2 per year from existing A100s and H100s alone.
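
The same emissions step as a script, weighting per-region carbon intensity by the revenue split:

```python
# CO2e from the ~9.07 TWh estimate, weighted by Nvidia's revenue split [10]
# with per-region carbon intensities in lb CO2 per kWh [11], [12].
energy_kwh = 9_070_250e3
mix = {"US": (0.55, 0.82), "Taiwan": (0.20, 1.10), "Other": (0.25, 0.95)}

lb_co2 = energy_kwh * sum(share * lb for share, lb in mix.values())
tonnes = lb_co2 / 2_204.62             # lb -> metric tonnes
print(f"~{tonnes/1e6:.1f}M t CO2/yr")  # -> ~3.7M tonnes
```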

Estimating CO2 emissions per kWh is not as straightforward as it seems, because many big corporations claim sustainable energy production; see, e.g., [13], [14]. However, with rapidly increasing demand for energy, I will be surprised if they can meet the need using renewable sources alone. For the sake of argument, I used the national average values above, since big corporations are not the only consumers.

This very rough under-estimate gives CO2 emissions comparable to the UK's collective emissions from aviation. Note that I did not account for many related contributors, including the following.

  1. Nvidia is not the only manufacturer of AI accelerators; TPUs, AMD chips, and other in-house products are ignored.
  2. The A100 and H100 are not Nvidia's only active products.
  3. Emissions from manufacturing the chips are not accounted for, and these can be even higher than the deployment emissions [6].

Cause for concern?

  1. I am worried the situation may only get worse with time. It is now unsurprising that some are appealing for breakthroughs in energy generation to meet the growing need.
  2. More efficient algorithms/hardware need not improve the situation, because they can increase demand by reducing the financial cost (cf. Jevons paradox) [16].
  3. Existing infrastructure may not support generating energy from renewable-only sources to reach net-zero in the near future.

Take-aways

  • Training of large models, although often portrayed as the most expensive part, is relatively cheap. Training a 70B LLM has an overall emission equivalent to about five London->NY flights. Comparing costs: five flight-loads of tickets on British Airways, with an average capacity of 333, cost about 800K USD (477 USD per ticket × 333 × 5). On the other hand, the Azure compute cost for an equivalent CO2 emission from 1.7M GPU-hrs is more than 6.5M USD [21] (850,000 hrs of an NC48ads A100 v4 instance with 2×A100s). Emitting the same amount of CO2 through compute instead of aviation is thus roughly eight times more expensive (see the sketch after the take-aways). This much higher price of compute will keep such intense training efforts in check for some time.
  • Per-day deployment energy costs of AI systems can rival the lifetime training cost of the model itself. The estimated 564 MWh per-day energy cost of ChatGPT is almost half the overall 1,287 MWh training cost of GPT-3. We estimated 3.7-17.8M tonnes of CO2 per annum, which is an under-estimate and yet immense. One could argue that this is still a small fraction of the ~1B tonnes of global emissions from aviation, which themselves contribute only 2.5-3% of overall emissions; in that regard, AI energy costs/emissions are not the most critical. Moreover, we could curtail the emissions further by deploying AI models on established sustainable cloud services.
  • Nevertheless, the emissions from AI are significant and are expected to get worse with ever-increasing sales of Nvidia products and the omnipresent integration of LLMs. I hope the financial cost of running these models will soon deter their omnipresence. [16] puts it nicely.

Therefore, it would be advisable for developers not only to focus on optimizing AI, but also to critically consider the necessity of using AI in the first place, as it is unlikely that all applications will benefit from AI or that the benefits will always outweigh the costs.
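
To make the first take-away concrete, here is the aviation-vs-compute price comparison as a sketch. The ~7.65 USD/hr instance rate is an assumption backed out from the quoted ~6.5M USD total, not a published Azure price; see [21] for current rates.

```python
# Price of emitting ~291 t CO2e (Llama 2 70B training) via aviation vs.
# Azure compute, using the figures quoted in the take-aways.
flights_usd = 477 * 333 * 5   # five London->NY flight-loads of tickets
rate_usd_per_hr = 7.65        # assumed NC48ads A100 v4 rate (see [21])
compute_usd = 850_000 * rate_usd_per_hr

print(f"aviation ~${flights_usd/1e6:.1f}M vs compute ~${compute_usd/1e6:.1f}M")
print(f"compute costs ~{compute_usd/flights_usd:.0f}x more")  # -> ~8x
```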

References

  1. Strubell, Emma, Ananya Ganesh, and Andrew McCallum. "Energy and policy considerations for deep learning in NLP." arXiv preprint arXiv:1906.02243 (2019). https://arxiv.org/abs/1906.02243
  2. Groeneveld, Dirk, et al. "OLMo: Accelerating the Science of Language Models." arXiv preprint arXiv:2402.00838 (2024). https://arxiv.org/abs/2402.00838 The reported energy costs are an under-estimate since they do not account for debugging, hyperparameter tuning, and downtime-related costs.
  3. https://www.carbonindependent.org/22.html The reported emissions do not account for radiative forcing; accounting for it would make them roughly 2.7 times higher.
  4. https://www.carbonindependent.org/files/ctc603.pdf
  5. Touvron, Hugo, et al. "Llama 2: Open foundation and fine-tuned chat models." arXiv preprint arXiv:2307.09288 (2023). https://arxiv.org/abs/2307.09288
  6. Gupta, Udit, et al. "Chasing carbon: The elusive environmental footprint of computing." 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 2021. https://arxiv.org/pdf/2011.02839.pdf
  7. https://www.tomshardware.com/tech-industry/nvidia-ai-and-hpc-gpu-sales-reportedly-approached-half-a-million-units-in-q3-thanks-to-meta-facebook
  8. https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet
  9. https://www.moomoo.com/community/feed/the-h100-could-generate-at-least-50-billion-in-revenue-111691287625733
  10. https://en.macromicro.me/charts/81141/nvda-revenue-region-us-tw-cn
  11. https://www.epa.gov/system/files/documents/2024-01/egrid2022_summary_tables.pdf
  12. https://energypedia.info/wiki/Energy_Transition_in_Taiwan
  13. https://azure.microsoft.com/en-us/blog/how-microsoft-measures-datacenter-water-and-energy-use-to-improve-azure-cloud-sustainability/
  14. https://sustainability.fb.com/
  15. https://ourworldindata.org/co2-emissions-from-aviation
  16. de Vries, Alex. "The growing energy footprint of artificial intelligence." Joule 7.10 (2023): 2191-2194.
  17. Patel, Dylan, and Afzal Ahmad. "The Inference Cost Of Search Disruption–Large Language Model Cost Analysis." (2023).
  18. https://resources.nvidia.com/en-us-dgx-systems/dgx-ai
  19. https://resources.nvidia.com/en-us-dgx-systems/ai-enterprise-dgx
  20. Patterson, David, et al. "The carbon footprint of machine learning training will plateau, then shrink." Computer 55.7 (2022): 18-28.
  21. https://azure.microsoft.com/en-gb/pricing/details/virtual-machines/linux/