Calculate LLM Emissions

<span class="api-method api-method-get">GET</span> /v1/digital/llm/emissions

<div class="admonition admonition-warning"><span class="admonition-icon">⚠️</span><div class="admonition-content"><p><strong>Beta Endpoint:</strong> This endpoint is in beta. Estimates are based on published research and may change as better data becomes available.</p> </div></div>

Estimates CO₂e emissions from LLM inference based on provider, model, token counts, and regional grid intensity.


Request

Required Parameters

None. All parameters have defaults, but both token counts default to 0, so for a meaningful result supply at least provider, model, tokens_input, and tokens_output.

Model Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| provider | string | openai | Provider: openai, anthropic, google, meta, mistral, cohere, deepseek, xai, azure_openai, aws_bedrock, gcp_vertex, self_hosted |
| model | string | gpt-5.2 | Model identifier (e.g., claude-opus-4.5, gemini-3-pro, llama-4-maverick) |
| tokens_input | integer | 0 | Number of input (prompt) tokens |
| tokens_output | integer | 0 | Number of output (completion) tokens |
| tokens_cached | integer | 0 | Number of cached input tokens (90% energy reduction applied) |
| requests | integer | 1 | Number of requests to calculate for (max 10,000,000) |

Reasoning Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| reasoning_effort | string | auto | Override thinking ratio: none (1×), low (2×), medium (4×), high (6×), xhigh (10×) |

If omitted, the API automatically detects reasoning models (o3, DeepSeek R1, etc.) and applies their default thinking ratio.
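The effort levels map to a multiplier on output tokens. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it only mirrors the documented ratios rather than the API's internal auto-detection logic.

```python
# Thinking ratios as documented for the reasoning_effort parameter.
THINKING_RATIO = {"none": 1, "low": 2, "medium": 4, "high": 6, "xhigh": 10}


def effective_output_tokens(tokens_output: int, reasoning_effort: str) -> int:
    """Output tokens multiplied by the thinking ratio for the given effort."""
    return tokens_output * THINKING_RATIO[reasoning_effort]


print(effective_output_tokens(1000, "high"))  # 6000
```

With reasoning_effort=none the ratio is 1, so the effective output equals the output tokens supplied.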

Location Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| region | string | provider default | ISO country code (e.g., FR, GB) or US state code (e.g., CA, TX) for grid intensity |

If no region is provided, grid intensity defaults to the provider's primary operating country (e.g., US for OpenAI, France for Mistral, China for DeepSeek).

Output Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| equivalents | boolean | false | Include real-world equivalents (Google searches, smartphone charges, etc.) |
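Since every parameter is optional, a client only needs to send the ones it wants to override. The sketch below builds the query string with Python's standard library; `build_url` is a hypothetical helper, and sending booleans as lowercase `true`/`false` is an assumption based on the `equivalents=true` curl example.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.emissions.dev/v1/digital/llm/emissions"


def build_url(**params) -> str:
    """Return the GET URL, dropping parameters left as None so API defaults apply."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        # Assumption: booleans are sent lowercase, as in equivalents=true.
        query[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}?{urlencode(query)}"


url = build_url(
    provider="mistral",
    model="mistral-next-xl",
    tokens_input=1000,
    tokens_output=500,
    region="FR",
    equivalents=True,
    reasoning_effort=None,  # omitted from the query, so the API default is used
)
print(url)
```

The request itself still needs the `Authorization: Bearer <key>` header, for example via `urllib.request.Request(url, headers={...})`.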

Response

Full Response Example

{
  "data": {
    "type": "llm_emission",
    "id": "6863669d-f8ac-47e0-8f8b-4d8ea1767315",
    "attributes": {
      "emissions": {
        "co2e": 0.0557,
        "co2e_unit": "g",
        "co2e_calculation_method": "ipcc_ar6_gwp100",
        "breakdown": {
          "operational_co2e": 0.0057,
          "embodied_co2e": 0.05,
          "unit": "g"
        },
        "energy_usage": {
          "total_wh": 0.1357,
          "per_request_wh": 0.1357,
          "breakdown": {
            "base_energy_wh": 0.1,
            "input_tokens_wh": 0.005,
            "output_tokens_wh": 0.0125,
            "reasoning_overhead_wh": 0,
            "pue_overhead_wh": 0.0136
          }
        },
        "ghg_protocol_scopes": {
          "scope_2": {
            "location_based": 0.0057,
            "method_note": "Based on regional grid intensity for the provider/region. Covers electricity consumed during inference."
          },
          "scope_3_category_1": 0.05
        },
        "source_trail": [
          {
            "data_category": "energy_model",
            "name": "LLM Energy Consumption — Medium tier",
            "source": "Epoch AI / Google / SemiAnalysis",
            "source_dataset": "Composite: How Much Energy Does ChatGPT Use (2025), Gemini Environmental Report (2025), AI Datacenter Energy (2024)",
            "year": "2025",
            "region": "GLOBAL"
          },
          {
            "data_category": "grid_intensity",
            "name": "FR — 42 gCO₂e/kWh",
            "source": "Ember / Electricity Maps",
            "source_dataset": "Cloud provider regional grid intensity",
            "year": "2025",
            "region": "FR"
          },
          {
            "data_category": "embodied_carbon",
            "name": "Hardware amortization — Medium tier GPU cluster",
            "source": "Cloud Carbon Footprint",
            "source_dataset": "CCF methodology — GPU embodied emissions",
            "year": "2024",
            "region": "GLOBAL"
          }
        ]
      },
      "inference": {
        "provider": "mistral",
        "model": "mistral-next-xl",
        "tier": "medium",
        "is_known_model": false,
        "is_reasoning_model": false,
        "thinking_ratio": 1,
        "tokens": {
          "input": 1000,
          "output": 500,
          "cached": 0,
          "reasoning_estimated": 0,
          "effective_output": 500
        },
        "requests": 1
      },
      "grid": {
        "carbon_intensity": 42,
        "carbon_intensity_unit": "gCO2e/kWh",
        "region": "FR"
      },
      "notices": [
        {
          "message": "Unknown model 'mistral-next-xl'. Estimated as 'medium' tier.",
          "code": "unknown_model",
          "severity": "warning"
        }
      ]
    }
  },
  "meta": {
    "methodology": "Energy-per-token estimation with hardware amortization",
    "emission_factors_year": 2025,
    "standards_compliance": {
      "GHG_Protocol": "Scope 2 (purchased electricity) + Scope 3 Category 1 (embodied hardware)",
      "ISO_14040": "Lifecycle assessment aligned"
    },
    "version": "1.0.5",
    "calculated_at": "2026-02-12T10:15:40Z"
  }
}

Response Fields

emissions

| Field | Type | Description |
| --- | --- | --- |
| co2e | number | Total CO₂e in grams |
| co2e_unit | string | Always g for this endpoint |
| co2e_calculation_method | string | Always ipcc_ar6_gwp100 |
| breakdown.operational_co2e | number | Electricity emissions from inference (g) |
| breakdown.embodied_co2e | number | Amortized hardware manufacturing emissions (g) |
<div class="admonition admonition-info"><span class="admonition-icon">ℹ️</span><div class="admonition-content"><p><strong>Unit Note:</strong> This endpoint reports in <strong>grams</strong> (not kg) because single LLM requests produce sub-gram emissions. All other emissions.dev endpoints report in kg. Check <code>co2e_unit</code> when parsing responses.</p> </div></div>
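Because this endpoint reports grams while other endpoints report kilograms, it can help to normalize on read. A minimal sketch, assuming the `emissions` object has already been parsed from JSON into a dict; `co2e_in_kg` is a hypothetical helper, not part of any client library.

```python
def co2e_in_kg(emissions: dict) -> float:
    """Convert an `emissions` object to kilograms based on its co2e_unit."""
    kg_per_unit = {"g": 1e-3, "kg": 1.0}
    unit = emissions["co2e_unit"]
    if unit not in kg_per_unit:
        # Fail loudly rather than silently mixing units.
        raise ValueError(f"unexpected co2e_unit: {unit!r}")
    return emissions["co2e"] * kg_per_unit[unit]


sample = {"co2e": 0.0557, "co2e_unit": "g"}  # values from the response example
print(co2e_in_kg(sample))
```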

energy_usage

| Field | Type | Description |
| --- | --- | --- |
| total_wh | number | Total energy consumed (watt-hours) |
| per_request_wh | number | Energy per individual request (Wh) |
| breakdown.base_energy_wh | number | Fixed overhead per request |
| breakdown.input_tokens_wh | number | Energy for processing input tokens |
| breakdown.output_tokens_wh | number | Energy for generating output tokens (includes reasoning) |
| breakdown.reasoning_overhead_wh | number | Additional energy from the thinking tokens of reasoning models |
| breakdown.pue_overhead_wh | number | Power Usage Effectiveness overhead (cooling, networking) |

ghg_protocol_scopes

| Field | Type | Description |
| --- | --- | --- |
| scope_2.location_based | number | Electricity emissions from inference (g), based on regional grid |
| scope_3_category_1 | number | Embodied hardware emissions (g), amortized GPU manufacturing |

inference

| Field | Type | Description |
| --- | --- | --- |
| provider | string | Provider used |
| model | string | Model used |
| tier | string | Energy tier: frontier, large, medium, small |
| is_known_model | boolean | true if model is in our database |
| is_reasoning_model | boolean | true if thinking token overhead applied |
| thinking_ratio | number | Multiplier on output tokens (1 = no reasoning) |
| tokens.input | integer | Input tokens provided |
| tokens.output | integer | Output tokens provided |
| tokens.cached | integer | Cached tokens (90% energy reduction) |
| tokens.reasoning_estimated | integer | Estimated internal thinking tokens |
| tokens.effective_output | integer | Output × thinking ratio |
| requests | integer | Number of requests calculated |

grid

| Field | Type | Description |
| --- | --- | --- |
| carbon_intensity | number | Grid intensity used (gCO₂e/kWh) |
| carbon_intensity_unit | string | Always gCO2e/kWh |
| region | string | Region code used for grid lookup |

notices

| Code | Severity | When |
| --- | --- | --- |
| unknown_model | warning | Model not in database; tier estimated by name |
| reasoning_model_detected | info | Reasoning model auto-detected; the notice suggests setting reasoning_effort explicitly |
| reasoning_effort_applied | info | Explicit reasoning effort override was used |
| cache_applied | info | Cached tokens reduced energy calculation |
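Notices are worth checking programmatically, since a warning such as unknown_model means the estimate rests on a guessed tier. The sketch below separates warnings from informational notices; `split_notices` is a hypothetical helper, and treating a missing `notices` key as an empty list is an assumption.

```python
def split_notices(attributes: dict) -> tuple[list[str], list[str]]:
    """Return (warnings, infos) as 'code: message' strings."""
    warnings, infos = [], []
    # Assumption: `notices` may be absent when there is nothing to report.
    for notice in attributes.get("notices", []):
        target = warnings if notice.get("severity") == "warning" else infos
        target.append(f"{notice['code']}: {notice['message']}")
    return warnings, infos


attributes = {
    "notices": [
        {
            "message": "Unknown model 'mistral-next-xl'. Estimated as 'medium' tier.",
            "code": "unknown_model",
            "severity": "warning",
        }
    ]
}
warnings, infos = split_notices(attributes)
print(warnings)
print(infos)
```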

Examples

Single GPT-5.2 request

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=gpt-5.2&\
tokens_input=5000&\
tokens_output=2000" \
  -H "Authorization: Bearer em_live_xxxx"

→ ~1.2 g CO₂e (frontier tier, US grid)

Claude Opus with prompt caching

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=anthropic&\
model=claude-opus-4.5&\
tokens_input=100000&\
tokens_output=4000&\
tokens_cached=90000" \
  -H "Authorization: Bearer em_live_xxxx"

→ The 90,000 cached tokens cut input-token energy by roughly 80%. A cache_applied notice confirms that the cache was applied.
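The saving can be checked by hand. The sketch below assumes that `tokens_cached` counts tokens already included in `tokens_input` (the parameter table does not state this explicitly) and uses the documented rule that cached tokens cost 10% of normal input energy; the function name is hypothetical.

```python
def input_energy_saving(tokens_input: int, tokens_cached: int,
                        cached_cost: float = 0.10) -> float:
    """Fraction of input-token energy saved by prompt caching."""
    uncached = tokens_input - tokens_cached
    # Cached tokens count at 10% of a normal input token.
    energy_equivalent_tokens = uncached + tokens_cached * cached_cost
    return 1 - energy_equivalent_tokens / tokens_input


print(round(input_energy_saving(100_000, 90_000), 2))  # 0.81
```

Under that assumption, 10,000 uncached tokens plus 90,000 × 0.10 = 9,000 token-equivalents gives 19,000 against 100,000, a saving of about 81% of input energy.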

Reasoning model with explicit effort

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=o3&\
tokens_input=2000&\
tokens_output=1000&\
reasoning_effort=high" \
  -H "Authorization: Bearer em_live_xxxx"

→ 6× thinking overhead applied. 1,000 output tokens → 6,000 effective tokens.

Batch calculation — 100k API calls

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=gpt-4.1-mini&\
tokens_input=800&\
tokens_output=400&\
requests=100000" \
  -H "Authorization: Bearer em_live_xxxx"

→ Total emissions for your monthly API usage in a single call.

Self-hosted model on UK grid

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=self_hosted&\
model=llama-4-maverick&\
tokens_input=2000&\
tokens_output=1000&\
region=GB" \
  -H "Authorization: Bearer em_live_xxxx"

→ 1.5× efficiency penalty for self-hosted (higher PUE, less optimised hardware). UK grid at 217 gCO₂e/kWh.

DeepSeek R1 on China grid

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=deepseek&\
model=deepseek-r1&\
tokens_input=3000&\
tokens_output=1500" \
  -H "Authorization: Bearer em_live_xxxx"

→ Reasoning model auto-detected (6× thinking ratio). China grid at 544 gCO₂e/kWh increases operational emissions significantly.

With equivalents

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=google&\
model=gemini-3-pro&\
tokens_input=10000&\
tokens_output=5000&\
equivalents=true" \
  -H "Authorization: Bearer em_live_xxxx"

→ Includes comparisons: Google searches, smartphone charges, km driven, Netflix hours.


Provider Efficiency

Different providers have different infrastructure efficiency. We apply a multiplier based on published data and disclosed PUE values:

| Provider | Multiplier | Notes |
| --- | --- | --- |
| google / gcp_vertex | 0.90–0.95× | Strong renewable commitments, custom TPUs |
| anthropic | 0.95× | Efficient infrastructure partnerships |
| openai | 1.0× | Baseline |
| mistral / cohere | 1.05× | Smaller scale, European infrastructure |
| meta / xai / aws_bedrock | 1.1× | General cloud infrastructure |
| deepseek | 1.15× | China grid, less disclosed efficiency data |
| self_hosted | 1.5× | Higher PUE, less optimised hardware assumed |

Calculation Methodology

Energy Model

  1. Base energy — Fixed overhead per request (model loading, routing): 0.03–0.5 Wh depending on tier
  2. Input token energy — Per 1K tokens: 0.001–0.05 Wh depending on tier
  3. Output token energy — Per 1K tokens: 0.005–0.25 Wh depending on tier (output is 5× more energy-intensive than input)
  4. Cache efficiency — Cached tokens use 10% of normal input energy (KV-cache reuse)
  5. Reasoning overhead — Output tokens multiplied by thinking ratio for reasoning models
  6. PUE — 1.1× default Power Usage Effectiveness (cooling, networking overhead)
  7. Provider efficiency — Multiplier based on provider infrastructure
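The seven steps above can be sketched as a single function. This is an illustration, not the API's implementation: the medium-tier rates are inferred from the response example (0.1 Wh base, 0.005 Wh for 1,000 input tokens, 0.0125 Wh for 500 output tokens), all names are hypothetical, and `tokens_cached` is assumed to be a subset of `tokens_input`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRates:
    base_wh: float           # fixed overhead per request
    input_wh_per_1k: float   # energy per 1,000 input tokens
    output_wh_per_1k: float  # energy per 1,000 output tokens


# Inferred from the response example; not published constants.
MEDIUM = TierRates(base_wh=0.1, input_wh_per_1k=0.005, output_wh_per_1k=0.025)


def energy_per_request_wh(rates: TierRates, tokens_input: int, tokens_output: int,
                          tokens_cached: int = 0, thinking_ratio: float = 1.0,
                          pue: float = 1.1, provider_multiplier: float = 1.0) -> float:
    # Steps 2 and 4: cached tokens cost 10% of normal input energy.
    effective_input = (tokens_input - tokens_cached) + 0.10 * tokens_cached
    input_wh = effective_input / 1000 * rates.input_wh_per_1k
    # Steps 3 and 5: output tokens scaled by the thinking ratio.
    output_wh = tokens_output * thinking_ratio / 1000 * rates.output_wh_per_1k
    # Step 1 plus token energy, then steps 6 and 7 as multipliers.
    return (rates.base_wh + input_wh + output_wh) * pue * provider_multiplier


total = energy_per_request_wh(MEDIUM, 1000, 500, provider_multiplier=1.05)
print(round(total, 4))  # 0.1357
```

With the Mistral multiplier of 1.05× this matches `total_wh` in the response example: (0.1 + 0.005 + 0.0125) × 1.1 × 1.05 ≈ 0.1357 Wh.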

CO₂e Conversion

Energy (kWh) × Grid Intensity (gCO₂e/kWh) = Operational CO₂e (g)

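Embodied carbon is then added: 0.01–0.3 g per request, depending on tier. Because energy_usage is reported in Wh, divide by 1,000 to obtain kWh before applying the formula above.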
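As a worked check against the response example, the sketch below converts 0.1357 Wh on the French grid (42 gCO₂e/kWh) and adds the 0.05 g embodied figure reported there. The function names are hypothetical.

```python
def operational_co2e_g(energy_wh: float, grid_g_per_kwh: float) -> float:
    """Operational emissions in grams; energy is converted from Wh to kWh."""
    return energy_wh / 1000 * grid_g_per_kwh


def total_co2e_g(energy_wh: float, grid_g_per_kwh: float, embodied_g: float) -> float:
    """Operational plus embodied emissions, in grams."""
    return operational_co2e_g(energy_wh, grid_g_per_kwh) + embodied_g


print(round(operational_co2e_g(0.1357, 42), 4))   # 0.0057
print(round(total_co2e_g(0.1357, 42, 0.05), 4))   # 0.0557
```

The two results correspond to `scope_2.location_based` and the top-level `co2e` in the response example.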

Sources

  • Epoch AI — "How Much Energy Does ChatGPT Use?" (2025)
  • Google — Gemini Environmental Report (2025)
  • Samsi et al. — "From Words to Watts" (2023)
  • SemiAnalysis — "AI Datacenter Energy" (2024)
  • Cloud Carbon Footprint — GPU embodied emissions methodology