Calculate LLM Emissions
GET /v1/digital/llm/emissions
Estimate CO2e emissions from LLM inference based on provider, model, token counts, and regional grid intensity.
Request
Required Parameters
None. All parameters have defaults, but token counts default to 0, so provide at least provider, model, and token counts for a meaningful result.
Model Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | string | openai | Provider: openai, anthropic, google, meta, mistral, cohere, deepseek, xai, azure_openai, aws_bedrock, gcp_vertex, self_hosted |
| model | string | gpt-5.2 | Model identifier (e.g., claude-opus-4.5, gemini-3-pro, llama-4-maverick) |
| tokens_input | integer | 0 | Number of input (prompt) tokens |
| tokens_output | integer | 0 | Number of output (completion) tokens |
| tokens_cached | integer | 0 | Number of cached input tokens (90% energy reduction applied) |
| requests | integer | 1 | Number of requests to calculate for (max 10,000,000) |
Reasoning Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| reasoning_effort | string | auto | Override thinking ratio: none (1×), low (2×), medium (4×), high (6×), xhigh (10×) |
If omitted, the API automatically detects reasoning models (o3, DeepSeek R1, etc.) and applies their default thinking ratio.
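As a rough illustration of how an explicit effort level scales output tokens, the sketch below hard-codes the multipliers listed for reasoning_effort. The names `THINKING_RATIO` and `effective_output_tokens` are hypothetical helpers for this example, not part of the API, and the `auto` default is deliberately left out because the per-model default ratios are not listed here.

```python
# Multipliers copied from the reasoning_effort parameter description.
THINKING_RATIO = {"none": 1, "low": 2, "medium": 4, "high": 6, "xhigh": 10}


def effective_output_tokens(tokens_output: int, reasoning_effort: str) -> int:
    """Return output tokens scaled by the thinking ratio for an explicit effort level."""
    if reasoning_effort not in THINKING_RATIO:
        raise ValueError(f"unknown reasoning_effort: {reasoning_effort!r}")
    return tokens_output * THINKING_RATIO[reasoning_effort]


print(effective_output_tokens(1000, "high"))  # 6000
```

This matches the o3 example further down, where 1,000 output tokens at high effort become 6,000 effective tokens.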
Location Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| region | string | provider default | ISO country code (e.g., FR, GB) or US state code (e.g., CA, TX) for grid intensity |
If no region is provided, grid intensity defaults to the provider's primary operating country (e.g., US for OpenAI, France for Mistral, China for DeepSeek).
Output Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| equivalents | boolean | false | Include real-world equivalents (Google searches, smartphone charges, etc.) |
Response
Full Response Example
{
  "data": {
    "type": "llm_emission",
    "id": "6863669d-f8ac-47e0-8f8b-4d8ea1767315",
    "attributes": {
      "emissions": {
        "co2e": 0.0557,
        "co2e_unit": "g",
        "co2e_calculation_method": "ipcc_ar6_gwp100",
        "breakdown": {
          "operational_co2e": 0.0057,
          "embodied_co2e": 0.05,
          "unit": "g"
        },
        "energy_usage": {
          "total_wh": 0.1357,
          "per_request_wh": 0.1357,
          "breakdown": {
            "base_energy_wh": 0.1,
            "input_tokens_wh": 0.005,
            "output_tokens_wh": 0.0125,
            "reasoning_overhead_wh": 0,
            "pue_overhead_wh": 0.0136
          }
        },
        "ghg_protocol_scopes": {
          "scope_2": {
            "location_based": 0.0057,
            "method_note": "Based on regional grid intensity for the provider/region. Covers electricity consumed during inference."
          },
          "scope_3_category_1": 0.05
        },
        "source_trail": [
          {
            "data_category": "energy_model",
            "name": "LLM Energy Consumption — Medium tier",
            "source": "Epoch AI / Google / SemiAnalysis",
            "source_dataset": "Composite: How Much Energy Does ChatGPT Use (2025), Gemini Environmental Report (2025), AI Datacenter Energy (2024)",
            "year": "2025",
            "region": "GLOBAL"
          },
          {
            "data_category": "grid_intensity",
            "name": "FR — 42 gCO₂e/kWh",
            "source": "Ember / Electricity Maps",
            "source_dataset": "Cloud provider regional grid intensity",
            "year": "2025",
            "region": "FR"
          },
          {
            "data_category": "embodied_carbon",
            "name": "Hardware amortization — Medium tier GPU cluster",
            "source": "Cloud Carbon Footprint",
            "source_dataset": "CCF methodology — GPU embodied emissions",
            "year": "2024",
            "region": "GLOBAL"
          }
        ]
      },
      "inference": {
        "provider": "mistral",
        "model": "mistral-next-xl",
        "tier": "medium",
        "is_known_model": false,
        "is_reasoning_model": false,
        "thinking_ratio": 1,
        "tokens": {
          "input": 1000,
          "output": 500,
          "cached": 0,
          "reasoning_estimated": 0,
          "effective_output": 500
        },
        "requests": 1
      },
      "grid": {
        "carbon_intensity": 42,
        "carbon_intensity_unit": "gCO2e/kWh",
        "region": "FR"
      },
      "notices": [
        {
          "message": "Unknown model 'mistral-next-xl'. Estimated as 'medium' tier.",
          "code": "unknown_model",
          "severity": "warning"
        }
      ]
    }
  },
  "meta": {
    "methodology": "Energy-per-token estimation with hardware amortization",
    "emission_factors_year": 2025,
    "standards_compliance": {
      "GHG_Protocol": "Scope 2 (purchased electricity) + Scope 3 Category 1 (embodied hardware)",
      "ISO_14040": "Lifecycle assessment aligned"
    },
    "version": "1.0.5",
    "calculated_at": "2026-02-12T10:15:40Z"
  }
}
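To show where the headline numbers sit in the nested body, here is a minimal sketch that reads a trimmed copy of the example response. The `summarize` helper and the trimmed payload are assumptions made for illustration; only the field paths come from the response itself.

```python
import json

# Trimmed copy of the example response; only the fields read below are kept.
raw = """
{
  "data": {
    "attributes": {
      "emissions": {
        "co2e": 0.0557,
        "co2e_unit": "g",
        "breakdown": {"operational_co2e": 0.0057, "embodied_co2e": 0.05},
        "energy_usage": {"total_wh": 0.1357}
      },
      "grid": {"carbon_intensity": 42, "region": "FR"}
    }
  }
}
"""


def summarize(response: dict) -> dict:
    """Pull the total, its breakdown sum, energy, and grid region from a response body."""
    attrs = response["data"]["attributes"]
    emissions = attrs["emissions"]
    breakdown = emissions["breakdown"]
    return {
        "co2e": emissions["co2e"],
        "unit": emissions["co2e_unit"],
        # Operational plus embodied should add up to the reported total.
        "breakdown_sum": round(breakdown["operational_co2e"] + breakdown["embodied_co2e"], 4),
        "total_wh": emissions["energy_usage"]["total_wh"],
        "region": attrs["grid"]["region"],
    }


summary = summarize(json.loads(raw))
print(summary)
```

Note that energy_usage is nested under emissions in the response, so it is read from `emissions["energy_usage"]` rather than from attributes directly.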
Response Fields
emissions
| Field | Type | Description |
|---|---|---|
| co2e | number | Total CO₂e in grams |
| co2e_unit | string | Always g for this endpoint |
| co2e_calculation_method | string | Always ipcc_ar6_gwp100 |
| breakdown.operational_co2e | number | Electricity emissions from inference (g) |
| breakdown.embodied_co2e | number | Amortized hardware manufacturing emissions (g) |
energy_usage
| Field | Type | Description |
|---|---|---|
| total_wh | number | Total energy consumed (Watt-hours) |
| per_request_wh | number | Energy per individual request (Wh) |
| breakdown.base_energy_wh | number | Fixed overhead per request |
| breakdown.input_tokens_wh | number | Energy for processing input tokens |
| breakdown.output_tokens_wh | number | Energy for generating output tokens (includes reasoning) |
| breakdown.reasoning_overhead_wh | number | Additional energy from reasoning thinking tokens |
| breakdown.pue_overhead_wh | number | Power Usage Effectiveness overhead (cooling, networking) |
ghg_protocol_scopes
| Field | Type | Description |
|---|---|---|
| scope_2.location_based | number | Electricity emissions from inference (g), based on regional grid |
| scope_3_category_1 | number | Embodied hardware emissions (g), amortized GPU manufacturing |
inference
| Field | Type | Description |
|---|---|---|
| provider | string | Provider used |
| model | string | Model used |
| tier | string | Energy tier: frontier, large, medium, small |
| is_known_model | boolean | true if model is in our database |
| is_reasoning_model | boolean | true if thinking token overhead applied |
| thinking_ratio | number | Multiplier on output tokens (1 = no reasoning) |
| tokens.input | integer | Input tokens provided |
| tokens.output | integer | Output tokens provided |
| tokens.cached | integer | Cached tokens (90% energy reduction) |
| tokens.reasoning_estimated | integer | Estimated internal thinking tokens |
| tokens.effective_output | integer | Output × thinking ratio |
| requests | integer | Number of requests calculated |
grid
| Field | Type | Description |
|---|---|---|
| carbon_intensity | number | Grid intensity used (gCO₂e/kWh) |
| carbon_intensity_unit | string | Always gCO2e/kWh |
| region | string | Region code used for grid lookup |
notices
| Code | Severity | When |
|---|---|---|
| unknown_model | warning | Model not in database; tier estimated from the model name |
| reasoning_model_detected | info | Reasoning model auto-detected; suggests setting reasoning_effort explicitly |
| reasoning_effort_applied | info | Explicit reasoning effort override was used |
| cache_applied | info | Cached tokens reduced the energy calculation |
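One way a client might surface these notices is to route them through standard logging by severity. The sketch below is an assumption about client-side handling, not something the API prescribes; `log_notices` and the severity-to-level mapping are hypothetical.

```python
import logging

logger = logging.getLogger("llm_emissions")

# Assumed mapping from the API's severity strings to logging levels.
SEVERITY_TO_LEVEL = {"info": logging.INFO, "warning": logging.WARNING}


def log_notices(notices: list[dict]) -> list[str]:
    """Log every notice and return the codes that carry warning severity."""
    warning_codes = []
    for notice in notices:
        severity = notice.get("severity")
        level = SEVERITY_TO_LEVEL.get(severity, logging.INFO)
        logger.log(level, "%s: %s", notice.get("code"), notice.get("message"))
        if severity == "warning":
            warning_codes.append(notice.get("code"))
    return warning_codes


# Notice taken from the full response example.
notices = [
    {
        "message": "Unknown model 'mistral-next-xl'. Estimated as 'medium' tier.",
        "code": "unknown_model",
        "severity": "warning",
    }
]
print(log_notices(notices))  # ['unknown_model']
```

Returning the warning codes lets a caller decide, for example, to flag results that relied on a tier estimate.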
Examples
Single GPT-5.2 request
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=gpt-5.2&\
tokens_input=5000&\
tokens_output=2000" \
-H "Authorization: Bearer em_live_xxxx"
→ ~1.2g CO₂e (frontier tier, US grid)
Claude Opus with prompt caching
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=anthropic&\
model=claude-opus-4.5&\
tokens_input=100000&\
tokens_output=4000&\
tokens_cached=90000" \
-H "Authorization: Bearer em_live_xxxx"
→ 90,000 cached tokens save ~80% of input energy. A cache_applied notice confirms the reduction.
Reasoning model with explicit effort
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=o3&\
tokens_input=2000&\
tokens_output=1000&\
reasoning_effort=high" \
-H "Authorization: Bearer em_live_xxxx"
→ 6× thinking overhead applied. 1,000 output tokens → 6,000 effective tokens.
Batch calculation — 100k API calls
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=openai&\
model=gpt-4.1-mini&\
tokens_input=800&\
tokens_output=400&\
requests=100000" \
-H "Authorization: Bearer em_live_xxxx"
→ Total emissions for your monthly API usage in a single call.
Self-hosted model on UK grid
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=self_hosted&\
model=llama-4-maverick&\
tokens_input=2000&\
tokens_output=1000&\
region=GB" \
-H "Authorization: Bearer em_live_xxxx"
→ 1.5× efficiency penalty for self-hosted (higher PUE, less optimised hardware). UK grid at 217 gCO₂e/kWh.
DeepSeek R1 on China grid
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=deepseek&\
model=deepseek-r1&\
tokens_input=3000&\
tokens_output=1500" \
-H "Authorization: Bearer em_live_xxxx"
→ Reasoning model auto-detected (6× thinking ratio). China grid at 544 gCO₂e/kWh increases operational emissions significantly.
With equivalents
curl "https://api.emissions.dev/v1/digital/llm/emissions?\
provider=google&\
model=gemini-3-pro&\
tokens_input=10000&\
tokens_output=5000&\
equivalents=true" \
-H "Authorization: Bearer em_live_xxxx"
→ Includes comparisons: Google searches, smartphone charges, km driven, Netflix hours.
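The same requests can be issued from Python with only the standard library. This is a minimal sketch: the endpoint URL and Bearer header are taken from the curl examples, while `build_url` and `get_emissions` are hypothetical helper names, and the lowercase `true`/`false` encoding of booleans is an assumption based on `equivalents=true` above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.emissions.dev/v1/digital/llm/emissions"


def build_url(**params) -> str:
    """Encode query parameters, writing booleans as lowercase true/false."""
    cleaned = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
        if value is not None
    }
    return f"{BASE_URL}?{urllib.parse.urlencode(cleaned)}"


def get_emissions(api_key: str, **params) -> dict:
    """Send the GET request and return the decoded JSON body (needs network access)."""
    request = urllib.request.Request(
        build_url(**params),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Build the URL for the self-hosted UK example without sending it.
url = build_url(provider="self_hosted", model="llama-4-maverick",
                tokens_input=2000, tokens_output=1000, region="GB")
print(url)
```

Calling `get_emissions("em_live_xxxx", provider="openai", model="o3", tokens_input=2000, tokens_output=1000, reasoning_effort="high")` would then mirror the reasoning example, given a valid key.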
Provider Efficiency
Different providers have different infrastructure efficiency. We apply a multiplier based on published data and disclosed PUE values:
| Provider | Multiplier | Notes |
|---|---|---|
| google / gcp_vertex | 0.90–0.95× | Strong renewable commitments, custom TPUs |
| anthropic | 0.95× | Efficient infrastructure partnerships |
| openai | 1.0× | Baseline |
| mistral / cohere | 1.05× | Smaller scale, European infrastructure |
| meta / xai / aws_bedrock | 1.1× | General cloud infrastructure |
| deepseek | 1.15× | China grid, less disclosed efficiency data |
| self_hosted | 1.5× | Higher PUE, less optimised hardware assumed |
Calculation Methodology
Energy Model
- Base energy — Fixed overhead per request (model loading, routing): 0.03–0.5 Wh depending on tier
- Input token energy — Per 1K tokens: 0.001–0.05 Wh depending on tier
- Output token energy — Per 1K tokens: 0.005–0.25 Wh depending on tier (output is 5× more energy-intensive than input)
- Cache efficiency — Cached tokens use 10% of normal input energy (KV-cache reuse)
- Reasoning overhead — Output tokens multiplied by thinking ratio for reasoning models
- PUE — 1.1× default Power Usage Effectiveness (cooling, networking overhead)
- Provider efficiency — Multiplier based on provider infrastructure
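The components above can be combined in a short sketch. The exact formula is not published here, so the order of operations (sum the per-request terms, then apply the provider multiplier and PUE) is an assumption, as are the medium-tier coefficients, which are inferred from the full response example (0.1 Wh base, 0.005 Wh for 1,000 input tokens, 0.0125 Wh for 500 output tokens). Treating cached tokens as a subset of tokens_input is also an assumption. Under these assumptions the sketch reproduces the example's total_wh of 0.1357.

```python
# Medium-tier coefficients inferred from the example response (assumption):
# 0.1 Wh base, 0.005 Wh per 1K input tokens, 0.025 Wh per 1K output tokens.
MEDIUM_TIER = {"base_wh": 0.1, "input_wh_per_1k": 0.005, "output_wh_per_1k": 0.025}


def estimate_energy_wh(
    tokens_input: int,
    tokens_output: int,
    tokens_cached: int = 0,
    thinking_ratio: float = 1.0,
    provider_multiplier: float = 1.0,
    pue: float = 1.1,
    requests: int = 1,
    tier: dict = MEDIUM_TIER,
) -> float:
    """Estimate total energy in Wh for a batch of identical requests."""
    # Assumption: cached tokens are part of tokens_input and cost 10% each.
    full_cost_input = tokens_input - tokens_cached
    input_wh = (full_cost_input + 0.1 * tokens_cached) / 1000 * tier["input_wh_per_1k"]
    # Reasoning models: output tokens are scaled by the thinking ratio.
    output_wh = tokens_output * thinking_ratio / 1000 * tier["output_wh_per_1k"]
    per_request_wh = (tier["base_wh"] + input_wh + output_wh) * provider_multiplier * pue
    return per_request_wh * requests


# Mistral (1.05x) with 1,000 input and 500 output tokens, as in the example response.
print(round(estimate_energy_wh(1000, 500, provider_multiplier=1.05), 4))  # 0.1357
```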
CO₂e Conversion
Energy (kWh) × Grid Intensity (gCO₂e/kWh) = Operational CO₂e (g)
Plus embodied carbon: 0.01–0.3g per request depending on tier.
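As a worked check of the conversion, the sketch below plugs in the figures from the full response example: 0.1357 Wh, a French grid at 42 gCO₂e/kWh, and 0.05 g embodied carbon for the medium tier. The function name `estimate_co2e_g` is hypothetical, and rounding to four decimals is a choice made here to match how the example displays values.

```python
def estimate_co2e_g(energy_wh: float, grid_intensity_g_per_kwh: float, embodied_g: float) -> dict:
    """Convert energy to operational CO2e and add embodied carbon (single request)."""
    # Wh -> kWh, then multiply by grid intensity in gCO2e/kWh.
    operational_g = energy_wh / 1000 * grid_intensity_g_per_kwh
    return {
        "operational_co2e": round(operational_g, 4),
        "embodied_co2e": embodied_g,
        "co2e": round(operational_g + embodied_g, 4),
    }


print(estimate_co2e_g(0.1357, 42, 0.05))
# {'operational_co2e': 0.0057, 'embodied_co2e': 0.05, 'co2e': 0.0557}
```

At this grid intensity the embodied share dominates the total, which is why low-carbon regions reduce operational emissions but leave the per-request floor unchanged.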
Sources
- Epoch AI — "How Much Energy Does ChatGPT Use?" (2025)
- Google — Gemini Environmental Report (2025)
- Samsi et al. — "From Words to Watts" (2023)
- SemiAnalysis — "AI Datacenter Energy" (2024)
- Cloud Carbon Footprint — GPU embodied emissions methodology