Calculate LLM Emissions

<span class="api-method api-method-get">GET</span> /v1/digital/llm/emissions

<div class="admonition admonition-warning"><span class="admonition-icon">⚠️</span><div class="admonition-content"><p><strong>Beta Endpoint.</strong> This endpoint is in beta. Estimates are based on published research and may change as better data becomes available.</p> </div></div>

Estimate CO₂e emissions from LLM inference based on provider, model, token counts, and regional grid intensity.

Request

Required Parameters

None — all parameters have defaults. However, for meaningful results you should provide at least provider, model, and token counts.

Model Parameters

Parameter	Type	Default	Description
`provider`	string	`openai`	Provider: `openai`, `anthropic`, `google`, `meta`, `mistral`, `cohere`, `deepseek`, `xai`, `azure_openai`, `aws_bedrock`, `gcp_vertex`, `self_hosted`
`model`	string	`gpt-5.2`	Model identifier (e.g., `claude-opus-4.5`, `gemini-3-pro`, `llama-4-maverick`)
`tokens_input`	integer	`0`	Number of input (prompt) tokens
`tokens_output`	integer	`0`	Number of output (completion) tokens
`tokens_cached`	integer	`0`	Number of cached input tokens (90% energy reduction applied)
`requests`	integer	`1`	Number of requests to calculate for (max 10,000,000)

Reasoning Parameters

Parameter	Type	Default	Description
`reasoning_effort`	string	auto	Override thinking ratio: `none` (1×), `low` (2×), `medium` (4×), `high` (6×), `xhigh` (10×)

If omitted, the API automatically detects reasoning models (o3, DeepSeek R1, etc.) and applies their default thinking ratio.
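As a rough illustration of how an explicit `reasoning_effort` scales output tokens, the sketch below applies the multipliers from the table above. The function name `effective_output_tokens` is hypothetical, and the sketch does not cover auto-detection, because the per-model default ratios are not listed on this page.

```python
# Multipliers copied from the `reasoning_effort` table above.
THINKING_RATIOS = {"none": 1, "low": 2, "medium": 4, "high": 6, "xhigh": 10}

def effective_output_tokens(tokens_output: int, reasoning_effort: str) -> int:
    """Output tokens scaled by the thinking ratio for an explicit effort level."""
    if reasoning_effort not in THINKING_RATIOS:
        raise ValueError(f"unknown reasoning_effort: {reasoning_effort!r}")
    return tokens_output * THINKING_RATIOS[reasoning_effort]

print(effective_output_tokens(1000, "high"))  # 6000
```

This should correspond to the `tokens.effective_output` field in the response (output × thinking ratio).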

Location Parameters

Parameter	Type	Default	Description
`region`	string	provider default	ISO country code (e.g., `FR`, `GB`) or US state code (e.g., `CA`, `TX`) for grid intensity

If no region is provided, grid intensity defaults to the provider's primary operating country (e.g., US for OpenAI, France for Mistral, China for DeepSeek).

Output Parameters

Parameter	Type	Default	Description
`equivalents`	boolean	`false`	Include real-world equivalents (Google searches, smartphone charges, etc.)
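The parameters above are all passed as query-string values on a GET request. A minimal Python sketch of assembling that URL with the standard library follows. The helper name `build_emissions_url` is hypothetical (it is not part of any official client), and the base URL is the one used in the curl examples on this page. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.emissions.dev/v1/digital/llm/emissions"

def build_emissions_url(**params):
    """Return the GET URL for the given query parameters (hypothetical helper).

    Parameters set to None are omitted so the API falls back to its defaults.
    Booleans are lowercased to match the `true`/`false` form used in the docs.
    """
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[key] = value
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL

url = build_emissions_url(
    provider="mistral",
    model="mistral-next-xl",
    tokens_input=1000,
    tokens_output=500,
    region="FR",
    equivalents=None,  # omitted, so the documented default (false) applies
)
print(url)
```

To send the request, pass the resulting URL to any HTTP client together with the `Authorization: Bearer` header shown in the curl examples.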

Response

Full Response Example

{
  "data": {
    "type": "llm_emission",
    "id": "6863669d-f8ac-47e0-8f8b-4d8ea1767315",
    "attributes": {
      "emissions": {
        "co2e": 0.0557,
        "co2e_unit": "g",
        "co2e_calculation_method": "ipcc_ar6_gwp100",
        "breakdown": {
          "operational_co2e": 0.0057,
          "embodied_co2e": 0.05,
          "unit": "g"
        },
        "energy_usage": {
          "total_wh": 0.1357,
          "per_request_wh": 0.1357,
          "breakdown": {
            "base_energy_wh": 0.1,
            "input_tokens_wh": 0.005,
            "output_tokens_wh": 0.0125,
            "reasoning_overhead_wh": 0,
            "pue_overhead_wh": 0.0136
          }
        },
        "ghg_protocol_scopes": {
          "scope_2": {
            "location_based": 0.0057,
            "method_note": "Based on regional grid intensity for the provider/region. Covers electricity consumed during inference."
          },
          "scope_3_category_1": 0.05
        },
        "source_trail": [
          {
            "data_category": "energy_model",
            "name": "LLM Energy Consumption — Medium tier",
            "source": "Epoch AI / Google / SemiAnalysis",
            "source_dataset": "Composite: How Much Energy Does ChatGPT Use (2025), Gemini Environmental Report (2025), AI Datacenter Energy (2024)",
            "year": "2025",
            "region": "GLOBAL"
          },
          {
            "data_category": "grid_intensity",
            "name": "FR — 42 gCO₂e/kWh",
            "source": "Ember / Electricity Maps",
            "source_dataset": "Cloud provider regional grid intensity",
            "year": "2025",
            "region": "FR"
          },
          {
            "data_category": "embodied_carbon",
            "name": "Hardware amortization — Medium tier GPU cluster",
            "source": "Cloud Carbon Footprint",
            "source_dataset": "CCF methodology — GPU embodied emissions",
            "year": "2024",
            "region": "GLOBAL"
          }
        ]
      },
      "inference": {
        "provider": "mistral",
        "model": "mistral-next-xl",
        "tier": "medium",
        "is_known_model": false,
        "is_reasoning_model": false,
        "thinking_ratio": 1,
        "tokens": {
          "input": 1000,
          "output": 500,
          "cached": 0,
          "reasoning_estimated": 0,
          "effective_output": 500
        },
        "requests": 1
      },
      "grid": {
        "carbon_intensity": 42,
        "carbon_intensity_unit": "gCO2e/kWh",
        "region": "FR"
      },
      "notices": [
        {
          "message": "Unknown model 'mistral-next-xl'. Estimated as 'medium' tier.",
          "code": "unknown_model",
          "severity": "warning"
        }
      ]
    }
  },
  "meta": {
    "methodology": "Energy-per-token estimation with hardware amortization",
    "emission_factors_year": 2025,
    "standards_compliance": {
      "GHG_Protocol": "Scope 2 (purchased electricity) + Scope 3 Category 1 (embodied hardware)",
      "ISO_14040": "Lifecycle assessment aligned"
    },
    "version": "1.0.5",
    "calculated_at": "2026-02-12T10:15:40Z"
  }
}

Response Fields

emissions

Field	Type	Description
`co2e`	number	Total CO₂e in grams
`co2e_unit`	string	Always `g` for this endpoint
`co2e_calculation_method`	string	Always `ipcc_ar6_gwp100`
`breakdown.operational_co2e`	number	Electricity emissions from inference (g)
`breakdown.embodied_co2e`	number	Amortized hardware manufacturing emissions (g)

<div class="admonition admonition-info"><span class="admonition-icon">ℹ️</span><div class="admonition-content"><p><strong>Unit Note.</strong> This endpoint reports in <strong>grams</strong> (not kg) because single LLM requests produce sub-gram emissions. All other emissions.dev endpoints report in kg. Check <code>co2e_unit</code> when parsing responses.</p> </div></div>
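Because this endpoint reports grams while other endpoints report kilograms, client code that aggregates results may want to normalize on `co2e_unit` rather than assume a unit. A minimal sketch, where `co2e_in_kg` is a hypothetical helper name and the input is the `emissions` object from the response:

```python
def co2e_in_kg(emissions: dict) -> float:
    """Convert an `emissions` object to kilograms, based on its `co2e_unit`."""
    unit = emissions["co2e_unit"]
    if unit == "g":
        return emissions["co2e"] / 1000
    if unit == "kg":
        return emissions["co2e"]
    # Fail loudly instead of silently mixing units.
    raise ValueError(f"unexpected co2e_unit: {unit!r}")

sample = {"co2e": 0.0557, "co2e_unit": "g"}  # values from the response example
print(co2e_in_kg(sample))
```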

energy_usage

Field	Type	Description
`total_wh`	number	Total energy consumed (Watt-hours)
`per_request_wh`	number	Energy per individual request (Wh)
`breakdown.base_energy_wh`	number	Fixed overhead per request
`breakdown.input_tokens_wh`	number	Energy for processing input tokens
`breakdown.output_tokens_wh`	number	Energy for generating output tokens (includes reasoning)
`breakdown.reasoning_overhead_wh`	number	Additional energy attributed to reasoning (thinking) tokens
`breakdown.pue_overhead_wh`	number	Power Usage Effectiveness overhead (cooling, networking)

ghg_protocol_scopes

Field	Type	Description
`scope_2.location_based`	number	Electricity emissions from inference (g) — based on regional grid
`scope_3_category_1`	number	Embodied hardware emissions (g) — amortized GPU manufacturing

inference

Field	Type	Description
`provider`	string	Provider used
`model`	string	Model used
`tier`	string	Energy tier: `frontier`, `large`, `medium`, `small`
`is_known_model`	boolean	`true` if model is in our database
`is_reasoning_model`	boolean	`true` if thinking-token overhead was applied
`thinking_ratio`	number	Multiplier on output tokens (1 = no reasoning)
`tokens.input`	integer	Input tokens provided
`tokens.output`	integer	Output tokens provided
`tokens.cached`	integer	Cached tokens (90% energy reduction)
`tokens.reasoning_estimated`	integer	Estimated internal thinking tokens
`tokens.effective_output`	integer	Output × thinking ratio
`requests`	integer	Number of requests calculated

grid

Field	Type	Description
`carbon_intensity`	number	Grid intensity used (gCO₂e/kWh)
`carbon_intensity_unit`	string	Always `gCO2e/kWh`
`region`	string	Region code used for grid lookup

notices

Code	Severity	When
`unknown_model`	warning	Model not in database — tier estimated by name
`reasoning_model_detected`	info	Reasoning model auto-detected; the notice suggests setting `reasoning_effort` explicitly
`reasoning_effort_applied`	info	Explicit reasoning effort override was used
`cache_applied`	info	Cached tokens reduced energy calculation
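One way a client might surface these notices is to filter them by severity after parsing the response. The sketch below assumes the `attributes` object has the shape shown in the response example. The helper name `notices_by_severity` is hypothetical.

```python
def notices_by_severity(attributes: dict, severity: str) -> list:
    """Return the notices in `attributes` that have the given severity."""
    return [n for n in attributes.get("notices", []) if n.get("severity") == severity]

# Notice copied from the response example.
attributes = {
    "notices": [
        {
            "message": "Unknown model 'mistral-next-xl'. Estimated as 'medium' tier.",
            "code": "unknown_model",
            "severity": "warning",
        }
    ]
}

for notice in notices_by_severity(attributes, "warning"):
    print(f"{notice['code']}: {notice['message']}")
```

Treating `unknown_model` as worth logging seems reasonable, since it means the tier was estimated from the model name rather than looked up.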

Examples

Single GPT-5.2 request

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=gpt-5.2&\
tokens_input=5000&\
tokens_output=2000" \
  -H "Authorization: Bearer em_live_xxxx"

→ ~1.2 g CO₂e (frontier tier, US grid)

Claude Opus with prompt caching

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=anthropic&\
model=claude-opus-4.5&\
tokens_input=100000&\
tokens_output=4000&\
tokens_cached=90000" \
  -H "Authorization: Bearer em_live_xxxx"

→ With 90,000 cached tokens counted at 10% of normal input energy, input energy drops by ~80%. A `cache_applied` notice confirms the reduction.
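The ~80% figure can be reproduced with a short calculation, assuming that `tokens_cached` counts a subset of `tokens_input` (this page does not state that explicitly, but the numbers only line up under that reading) and that cached tokens cost 10% of normal input energy. The function name is hypothetical.

```python
CACHED_ENERGY_FRACTION = 0.10  # cached tokens use 10% of normal input energy

def input_energy_saving(tokens_input: int, tokens_cached: int) -> float:
    """Fraction of input energy saved, assuming cached tokens are part of tokens_input."""
    uncached = tokens_input - tokens_cached
    # Express the request as the number of full-cost tokens it is equivalent to.
    equivalent_tokens = uncached + tokens_cached * CACHED_ENERGY_FRACTION
    return 1 - equivalent_tokens / tokens_input

print(round(input_energy_saving(100_000, 90_000), 2))  # 0.81
```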

Reasoning model with explicit effort

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=o3&\
tokens_input=2000&\
tokens_output=1000&\
reasoning_effort=high" \
  -H "Authorization: Bearer em_live_xxxx"

→ 6× thinking overhead applied. 1,000 output tokens → 6,000 effective tokens.

Batch calculation — 100k API calls

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=gpt-4.1-mini&\
tokens_input=800&\
tokens_output=400&\
requests=100000" \
  -H "Authorization: Bearer em_live_xxxx"

→ Total emissions for your monthly API usage in a single call.

Self-hosted model on UK grid

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=self_hosted&\
model=llama-4-maverick&\
tokens_input=2000&\
tokens_output=1000&\
region=GB" \
  -H "Authorization: Bearer em_live_xxxx"

→ 1.5× efficiency penalty for self-hosted (higher PUE, less optimised hardware). UK grid at 217 gCO₂e/kWh.

DeepSeek R1 on China grid

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=deepseek&\
model=deepseek-r1&\
tokens_input=3000&\
tokens_output=1500" \
  -H "Authorization: Bearer em_live_xxxx"

→ Reasoning model auto-detected (6× thinking ratio). China grid at 544 gCO₂e/kWh increases operational emissions significantly.

With equivalents

curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=google&\
model=gemini-3-pro&\
tokens_input=10000&\
tokens_output=5000&\
equivalents=true" \
  -H "Authorization: Bearer em_live_xxxx"

→ Includes comparisons: Google searches, smartphone charges, km driven, Netflix hours.

Provider Efficiency

Different providers have different infrastructure efficiency. We apply a multiplier based on published data and disclosed PUE values:

Provider	Multiplier	Notes
`google` / `gcp_vertex`	0.90–0.95×	Strong renewable commitments, custom TPUs
`anthropic`	0.95×	Efficient infrastructure partnerships
`openai`	1.0×	Baseline
`mistral` / `cohere`	1.05×	Smaller scale, European infrastructure
`meta` / `xai` / `aws_bedrock`	1.1×	General cloud infrastructure
`deepseek`	1.15×	China grid, less disclosed efficiency data
`self_hosted`	1.5×	Higher PUE, less optimised hardware assumed

Calculation Methodology

Energy Model

Base energy — Fixed overhead per request (model loading, routing): 0.03–0.5 Wh depending on tier
Input token energy — Per 1K tokens: 0.001–0.05 Wh depending on tier
Output token energy — Per 1K tokens: 0.005–0.25 Wh depending on tier (an output token costs about 5× the energy of an input token)
Cache efficiency — Cached tokens use 10% of normal input energy (KV-cache reuse)
Reasoning overhead — Output tokens multiplied by thinking ratio for reasoning models
PUE — 1.1× default Power Usage Effectiveness (cooling, networking overhead)
Provider efficiency — Multiplier based on provider infrastructure
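The components above can be combined into a rough per-request estimate. The sketch below assumes the rule (base + input + output) × provider multiplier × PUE. That rule is inferred from the response example, not stated on this page. The medium-tier rates used (0.1 Wh base, 0.005 Wh per 1K input tokens, 0.025 Wh per 1K output tokens) are back-calculated from that same example, and the function name is hypothetical.

```python
def estimate_energy_wh(
    tokens_input, tokens_output,
    base_wh, input_wh_per_1k, output_wh_per_1k,
    provider_multiplier=1.0, pue=1.1, thinking_ratio=1, tokens_cached=0,
):
    """Per-request energy in Wh under the assumed combination rule."""
    # Assumption: cached tokens are part of tokens_input and cost 10% of input energy.
    uncached = tokens_input - tokens_cached
    input_wh = (uncached + 0.10 * tokens_cached) / 1000 * input_wh_per_1k
    # Reasoning models: output tokens are multiplied by the thinking ratio.
    output_wh = tokens_output * thinking_ratio / 1000 * output_wh_per_1k
    return (base_wh + input_wh + output_wh) * provider_multiplier * pue

total = estimate_energy_wh(
    tokens_input=1000, tokens_output=500,
    base_wh=0.1, input_wh_per_1k=0.005, output_wh_per_1k=0.025,
    provider_multiplier=1.05,  # mistral, from the Provider Efficiency table
)
print(round(total, 4))  # 0.1357
```

Under these assumptions the result matches `total_wh` in the response example (1,000 input tokens, 500 output tokens, Mistral, medium tier).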

CO₂e Conversion

Energy (kWh) × Grid Intensity (gCO₂e/kWh) = Operational CO₂e (g), where Energy (kWh) = `total_wh` ÷ 1,000

Embodied carbon is then added: 0.01–0.3 g per request, depending on tier.
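As a worked check of the conversion, the sketch below reproduces the figures in the response example: 0.1357 Wh on the French grid at 42 gCO₂e/kWh, plus 0.05 g of embodied carbon. Energy is reported in Wh, so it is divided by 1,000 before multiplying by grid intensity. The function name is hypothetical.

```python
def operational_co2e_g(energy_wh: float, grid_g_per_kwh: float) -> float:
    """Operational emissions in grams; energy is converted from Wh to kWh first."""
    return energy_wh / 1000 * grid_g_per_kwh

energy_wh = 0.1357   # total_wh from the response example
grid = 42            # FR grid intensity in gCO2e/kWh
embodied_g = 0.05    # embodied_co2e from the response example

operational = operational_co2e_g(energy_wh, grid)
print(round(operational, 4))               # 0.0057
print(round(operational + embodied_g, 4))  # 0.0557
```

The two printed values line up with `breakdown.operational_co2e` and `co2e` in the response example.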

Sources

Epoch AI — "How Much Energy Does ChatGPT Use?" (2025)
Google — Gemini Environmental Report (2025)
Samsi et al. — "From Words to Watts" (2023)
SemiAnalysis — "AI Datacenter Energy" (2024)
Cloud Carbon Footprint — GPU embodied emissions methodology