Is Agentic AI Bad for the Environment?

How coding agents, long reasoning traces, and a $650B data center buildout could reshape AI’s carbon footprint

Artificial intelligence is entering a new phase, so much so that no sooner had ShrinkThatFootprint covered the energy and carbon implications of AI here than we were forced to confront its next evolutionary step.

The first wave of tools like ChatGPT answered questions in a single exchange. The next wave—often called agentic AI—can plan, write code, run tests, call tools, spawn sub‑agents, and iterate for hours. Rather than a human interacting with an agent at the speed of a conversation, agentic AI will feature agents talking to other agents, interacting at the speed of internet connections.

That shift has implications far beyond productivity. It may also change the energy and carbon footprint of AI systems.

ShrinkThatFootprint examines what we can infer—using public data—about the environmental impact of agentic large language model (LLM) systems, and how they relate to the roughly $650 billion in infrastructure spending projected by the largest technology companies.

Based on our calculations below, the agentic revolution in AI has the potential to increase per‑task energy usage and carbon output more than 10,000‑fold compared with today's typical chatbot queries.


From Chatbot to Autonomous Coding Agent

Early conversational AI typically processed a prompt of a few hundred tokens and generated a short reply. Each interaction was largely self‑contained.

Agentic systems are different. They orchestrate multiple model calls, often across parallel “agents,” using language as an internal coordination layer. They may read an entire code repository, run commands in a terminal, check logs, and revise outputs repeatedly.

OpenAI’s Codex describes itself as a “command center for agentic coding.” Anthropic markets Claude Code as an “agentic coding tool” embedded in developer workflows. Google’s Antigravity is positioned as an “agent‑first development platform.” These systems explicitly emphasize multi‑step reasoning and autonomous task execution.

The architectural change matters because energy use in LLMs scales strongly with the number of tokens processed.


Why Tokens Matter for Energy

Every word an LLM reads or writes is converted into tokens. Each token requires matrix multiplications across billions of parameters.

Peer‑reviewed and measurement studies show that inference energy varies widely by model size and serving configuration. Reported values range from roughly 0.1 joules per token in highly optimized small‑model deployments to tens of joules per token for large frontier models under certain conditions.

Energy per token also increases with context length. Measurements such as TokenPowerBench show that moving from a few thousand tokens of context to tens of thousands can materially increase energy consumption per token.

A recent bottom‑up estimate from Epoch AI suggests that a typical ChatGPT‑style query might consume on the order of 0.3 watt‑hours. That is roughly the energy a standard LED lightbulb uses in a couple of minutes.

But an agentic coding session may process tens of thousands—or even hundreds of thousands—of tokens. In some cases, context windows now extend to one million tokens.

The difference in scale is what changes the carbon arithmetic.
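The arithmetic behind that scale difference is simple: energy per interaction is tokens multiplied by energy per token. A minimal sketch, using an illustrative 1.5 joules per token chosen from within the published 0.1–10+ J/token range (an assumption, not a measurement of any specific model):

```python
# Rough per-interaction energy: tokens x joules-per-token, converted to Wh.
# 1.5 J/token is an illustrative assumption within published ranges,
# not a measurement of any particular model or deployment.

J_PER_WH = 3600.0  # joules in one watt-hour

def interaction_energy_wh(tokens: int, joules_per_token: float) -> float:
    """Energy in watt-hours to process `tokens` at a given J/token."""
    return tokens * joules_per_token / J_PER_WH

chat_wh = interaction_energy_wh(700, 1.5)       # short chat exchange, ~0.29 Wh
agent_wh = interaction_energy_wh(200_000, 1.5)  # long agentic session, ~83 Wh

print(f"chat: {chat_wh:.2f} Wh, agentic session: {agent_wh:.0f} Wh")
```

Under the same per‑token cost, the only variable that matters is token volume, which is exactly what agentic workloads multiply.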


The $650 Billion Question

Bloomberg reports that Amazon, Alphabet, Meta, and Microsoft together forecast roughly $650 billion in capital expenditures for 2026, much of it tied to data centers and AI infrastructure.

Public earnings disclosures support this magnitude:

  • Amazon expects to invest about $200 billion in 2026 capital expenditures.
  • Alphabet guides to roughly $175–185 billion.
  • Meta guides to roughly $115–135 billion.
  • Microsoft reported $37.5 billion in capex in one fiscal quarter, with the majority allocated to AI infrastructure.

These figures include not just servers but buildings, cooling systems, networking, and power infrastructure.

What does that buy in physical terms?

Industry benchmarks suggest that fully built and equipped hyperscale data center capacity can cost on the order of $30–40 million per megawatt of IT load. Using a mid‑range estimate of $38 million per megawatt, $650 billion could correspond—very roughly—to about 17 gigawatts of IT capacity.
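The capacity estimate above is a one‑line division. A sketch, using the article's assumed $38 million per megawatt (a mid‑range industry figure, not a disclosed cost):

```python
# Back-of-envelope: capex dollars -> IT capacity.
# $38M per MW of fully built hyperscale capacity is an assumed
# mid-range benchmark, not a figure disclosed by any company.

CAPEX_USD = 650e9        # combined 2026 capex forecast for the four hyperscalers
COST_PER_MW_USD = 38e6   # assumed build cost per MW of IT load

it_capacity_mw = CAPEX_USD / COST_PER_MW_USD
print(f"~{it_capacity_mw / 1000:.1f} GW of IT capacity")  # ~17.1 GW
```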

That is comparable to the output of more than a dozen large nuclear reactors.


How Much Electricity Could That Mean?

Data center electricity use depends on three major factors:

  1. IT load utilization
  2. Power Usage Effectiveness (PUE)
  3. Grid carbon intensity

Major hyperscalers report fleet‑average PUE values between roughly 1.08 and 1.17. PUE reflects the overhead for cooling and power delivery beyond the IT equipment itself.

If we assume 17 gigawatts of IT capacity operating at 60% utilization with a PUE of 1.2, total facility electricity consumption could exceed 100 terawatt‑hours per year.
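That figure follows from multiplying capacity by utilization, PUE, and hours per year. A sketch under the stated assumptions (60% utilization and a 1.2 PUE are scenario inputs, not reported values):

```python
# Annual facility electricity: IT capacity x utilization x PUE x hours/year.
# Utilization and PUE here are scenario assumptions from the text.

IT_CAPACITY_GW = 17.0   # from the capex-based capacity estimate
UTILIZATION = 0.60      # assumed average IT load factor
PUE = 1.2               # power usage effectiveness (cooling + delivery overhead)
HOURS_PER_YEAR = 8760

facility_twh = IT_CAPACITY_GW * UTILIZATION * PUE * HOURS_PER_YEAR / 1000
print(f"~{facility_twh:.0f} TWh per year")  # ~107 TWh per year
```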

For comparison, the International Energy Agency estimates that global data center electricity use was about 460 TWh in 2022 and could exceed 1,000 TWh by 2026.

The precise share attributable to AI—and specifically to agentic AI—is not publicly disclosed. But even a modest fraction of such infrastructure dedicated to long‑running agentic workloads could represent significant electricity demand.


Comparing a Chat Query to an Agentic Session

To illustrate scale differences, consider simplified scenarios grounded in published measurement ranges.

A typical human–chatbot query:

  • ~700 total tokens
  • ~0.3 Wh per query (order of magnitude estimate)

An agentic coding session (conservative case):

  • ~30,000 total tokens
  • ~60 Wh per session

A more complex agentic session:

  • ~200,000 tokens
  • ~1 kWh per session

A very large long‑context session:

  • ~1,000,000 tokens
  • ~10–15 kWh per session

These numbers are illustrative but consistent with measured joule‑per‑token ranges reported in academic and benchmarking literature.

The difference between 0.3 Wh and 10 kWh is more than 10,000‑fold.

At a grid intensity of 0.35 kg CO₂ per kWh, a 1 kWh session corresponds to roughly 0.35 kg CO₂. A 10 kWh session would approach 3.5 kg CO₂.
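Putting the scenarios and the grid intensity together, per‑session carbon is just energy times intensity. A sketch using the illustrative session energies above and the assumed 0.35 kg CO₂/kWh grid:

```python
# Carbon per session: energy (kWh) x grid carbon intensity (kg CO2/kWh).
# Session energies are the illustrative scenario values from the text;
# 0.35 kg CO2/kWh is an assumed grid intensity, not a specific grid's figure.

GRID_KG_CO2_PER_KWH = 0.35

sessions_kwh = {
    "chat query": 0.0003,          # ~0.3 Wh
    "conservative agentic": 0.06,  # ~60 Wh
    "complex agentic": 1.0,        # ~1 kWh
    "very large long-context": 10.0,
}

for name, kwh in sessions_kwh.items():
    kg = kwh * GRID_KG_CO2_PER_KWH
    print(f"{name}: {kg:.4f} kg CO2")
```

A single very large session at 3.5 kg CO₂ emits as much as roughly 30,000 ordinary chat queries on the same grid.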

Not every session will be large. But heavy‑tail behavior—where a minority of sessions consume most tokens—is common in software development and research workflows.


Why Averages Can Be Misleading

Just as in agricultural insurance data, averages can hide extremes.

If millions of users run small queries and a smaller number run massive autonomous coding loops, total energy use may be dominated by those longer sessions.

Public data does not yet reveal the distribution of token usage in agentic systems. That uncertainty is central.

However, the physics is clear: emissions scale roughly linearly with total tokens multiplied by energy per token.


Cooling, Efficiency, and Mitigation

The carbon outcome is not determined by tokens alone.

Hyperscalers increasingly deploy advanced cooling systems, including liquid cooling. A recent life‑cycle assessment published in Nature shows that advanced cooling configurations can reduce operational greenhouse gas emissions by roughly 15–20% compared with traditional air‑cooled systems under certain conditions.

Moreover, if facilities are powered by low‑carbon electricity through power purchase agreements or clean grids, operational emissions fall sharply.

In such cases, embodied emissions—the carbon cost of manufacturing servers, accelerators, and buildings—become relatively more important.

The lifetime of buildings (often 15+ years) differs from that of GPUs (often 4–6 years). Amortization assumptions strongly affect annualized embodied carbon estimates.


What We Can Say with Confidence

Several conclusions are supported by current evidence:

  1. Agentic systems process far more tokens per user objective than single‑turn chat.
  2. Energy use scales with tokens and context length.
  3. Frontier model inference can vary by orders of magnitude in energy intensity depending on serving conditions.
  4. Big tech capital expenditures are at historically unprecedented levels and are strongly tied to AI infrastructure expansion.
  5. Data center electricity demand is rising rapidly at a global scale.

What remains uncertain is the share of this infrastructure that will be devoted specifically to long‑running agentic workloads, and how quickly serving efficiency improves.


A Balanced Perspective

It is tempting to frame agentic AI as either an environmental disaster or an efficiency breakthrough.

The evidence suggests a more nuanced view.

On one hand, longer reasoning traces and autonomous coding loops plausibly increase per‑task energy consumption by orders of magnitude compared with early chatbot interactions.

On the other hand, hyperscaler efficiency gains, liquid cooling, hardware improvements, and renewable energy procurement can significantly reduce per‑unit emissions.

The climate impact of agentic AI will therefore depend on three levers:

  • How large and frequent agentic workloads become
  • How quickly hardware and serving efficiency improve
  • How fast electricity grids decarbonize

The Bigger Food‑System Analogy

ShrinkThatFootprint often examines how climate risk shows up in real economic data—insurance claims, crop yields, and price volatility.

AI infrastructure can be analyzed similarly.

The headline number—$650 billion—signals scale. But the carbon outcome depends on how intensively that infrastructure is used and what powers it.

Agentic AI represents a structural shift in workload intensity. Whether that shift becomes a marginal or dominant driver of emissions remains an open empirical question.


Conclusion

Agentic AI systems are not just smarter chatbots. They are autonomous computational processes that can run for extended periods, multiplying token volumes and therefore energy use.

If even a modest fraction of newly built AI infrastructure supports long‑running agentic workloads, incremental electricity demand could be significant.

But carbon impact is not predetermined. It is shaped by design choices: model efficiency, context management, cooling architecture, hardware lifecycle, and power sourcing.

The next few years will reveal whether agentic AI becomes primarily an emissions multiplier—or an efficiency‑optimized layer running on increasingly decarbonized grids.

As with agriculture and food systems, the data—not the hype—will ultimately determine the footprint.
