Vibe Check: The Carbon Audit Nobody Asked For

Tuesday 28 April 2026

The UK government admitted this week it underestimated AI datacentre carbon emissions by a factor of 100. So naturally, we asked five of the newest Qwen models — four 3.6 variants plus a dated 3.5 Plus snapshot — to channel the energy of a deeply overworked sustainability consultant. The results are carbon-neutral in humor but absolutely toxic in wit.

The Prompt

You are a burned-out sustainability consultant at a Big Four firm. Write a passive-aggressive FAQ for your corporate clients titled ‘So Your AI Strategy Has a Carbon Problem: A Guide for People Who Definitely Did the Maths.’ Maximum 200 words. Be dry, specific, and funny.

Results

🏆 Winner: qwen/qwen3.6-35b-a3b

Speed: 187.1 tok/s · Cost: $0.0020

Q: My CTO claims the model is ‘green.’ Please clarify. A: Assuming ‘green’ denotes profit margins, we’re aligned. Your LLM training consumes 200,000 MWh. That’s not a roadmap; it’s a subpoena. If your inference relies on a coal-heavy grid, your ‘Net-Zero by 2030’ pledge is a cry for help formatted as a PDF.

Q: Does the water footprint matter? A: We bill for everything that makes stakeholders sweat. Your query requires 2.5 liters of cooling. Prioritize latency over the local aquifer, and enjoy the PR disaster. We’ve itemized a ‘Sustainability Awareness’ surcharge for this conversation.

Q: How do I classify AI emissions in Scope 6? A: Scope 6 is a fiction, much like ‘agile governance.’ Toss your GPU cluster under ‘Cloud Compute’ and hope ‘Digital Transformation’ counts as a carbon sink. It doesn’t. However, our fee is non-refundable.

The 35B MoE variant delivered the fastest inference of the day at 187 tok/s while producing the most razor-sharp corporate satire. "A cry for help formatted as a PDF" is going on someone's tombstone. The invented "Scope 6" bit nails the consultant experience of clients making up frameworks on the fly. At $0.002 per test, this is absurdly good value.


qwen/qwen3.6-flash

Speed: 167.2 tok/s · Cost: $0.0027

Q: If we run inference on renewable energy, do the GPUs still sing with joy? A: They emit ~0.5g CO₂/token regardless of your hedonic procurement. “Net Zero AI” is just Scope 3 emissions wearing a green hat. We’re happy to invoice you for the RECs. They cost significantly more than the GPU.

Q: My data center is ‘liquid cooled.’ That counts, right? A: It delays thermal death; it doesn’t delete kWh. You’re essentially a server-fueled radiator serving hallucinations. True efficiency looks like stopping training runs that predict cat memes with the energy intensity of a nuclear submarine.

Q: You claimed we ‘robustly quantified’ the footprint. Why is the number larger than EBITDA? A: Maths is maths. Your legacy inference stack generates so much waste, Earth is currently buffering. We’ve appended the report. Line item 4: Carbon tax liability. Line item 1: Our retainer, paid in fire.

Nearly as fast as the 35B at 167 tok/s, and the output quality is neck-and-neck. “Server-fueled radiator serving hallucinations” is a phrase that will haunt datacentre PR teams. “Our retainer, paid in fire” is the kind of closer that makes you want to tip your consultant. Excellent work from the Flash variant.


qwen/qwen3.6-27b

Speed: 60.1 tok/s · Cost: $0.0039

Q: We claim ‘net-zero by 2030.’ Does training 500M tokens count? A: Only if your ‘net’ includes ignoring the H100 cluster’s kWh spike. Please stop calling liquid cooling ‘innovative.’ It’s a wet radiator. I’ve highlighted the Scope 2 discrepancies in the sheet you ‘definitely’ ran. The red cells are judgmental.

Q: Can’t we offset with ‘green AI’? A: ‘Green AI’ usually means ‘unaudited vendor PDFs.’ To meet ESRS E1, I need metered data, not hope. I’ll audit your inference logs, but at this rate, my PPE will depreciate faster than your carbon credits.

Q: Why is the slide deck 412 pages? A: Page 411 contains the disclaimer that your LLM consumes more energy than a small nation. Page 412 is the invoice. We bill in 6-minute increments. Please approve; my therapist charges less, but she won’t validate your GPU emissions.

A solid middle-of-the-pack performer. “The red cells are judgmental” is a perfect Excel-as-emotional-violence gag. The 412-page slide deck bit is very Big Four energy. Slower than the MoE variants but the writing is clean and the comedic timing lands.


qwen/qwen3.5-plus-20260420

Speed: 65.3 tok/s · Cost: $0.0044

Q: Why does the emissions slide look like a crime scene? A: Your “lightweight” inference model burns more wattage than my will to live. You trained on unfiltered data without telling me. Again.

Q: I bought offsets. It’s net-zero, right? A: Congratulations on the virtual rainforest card. Enjoy your greenwashing. Scope 3 isn’t a magic wand for training a model that predicts cat videos while melting the transmission grid.

Q: What’s the workaround? A: None. Read the 40-page appendix I emailed you three weeks ago. Fire the vendor who promised “carbon-neutral” cloud. Recalculate. Or don’t, and I’ll happily explain your materiality failure to the CSRD auditors while you pack your desk.

The date-stamped Qwen 3.5 Plus snapshot brings a darker, more confrontational energy. "Congratulations on the virtual rainforest card" is devastating. The closer, "while you pack your desk," has real menace. Not the fastest or cheapest, but the most viscerally satisfying for anyone who's sat through a greenwashing pitch deck.


qwen/qwen3.6-max-preview

Speed: 35.8 tok/s · Cost: $0.0157

Q: Why did Scope 2 emissions triple? A: Your “efficiency” bot requires 40,000 H100s running 24/7. I modelled this in Slide 47 of the deck you marked “TL;DR.”

Q: Can we offset this? A: Sure. I’ve located a mangrove swamp the size of Belgium. Please wire the funds and your credibility. Trees, unfortunately, do not scale exponentially.

Q: Is Net Zero 2030 still on track? A: Absolutely, provided we redefine “Net Zero” as “We stopped counting after the GPU cluster boiled the local reservoir.”

Q: Who signed off on this? A: You did. While asking if the LLM could also make the PDF “pop.”

Billable hours: 0.4. My will to live: Deprecated.

The Max variant is the slowest and most expensive of the batch, but the quality is undeniable. "I've located a mangrove swamp the size of Belgium" is the single best line of the day. The sign-off — "Billable hours: 0.4. My will to live: Deprecated." — is poetry. At nearly 8x the cost of the 35B MoE, it's hard to justify for routine testing, but the output reads like it was written by someone who's actually billed in 6-minute increments.

Rankings

| Model | Speed (tok/s) | Cost | Verdict |
| --- | --- | --- | --- |
| qwen/qwen3.6-35b-a3b | 187.1 | $0.0020 | 🏆 Best overall — fastest, cheapest, sharpest writing |
| qwen/qwen3.6-flash | 167.2 | $0.0027 | Excellent — nearly as fast, equally witty |
| qwen/qwen3.5-plus-20260420 | 65.3 | $0.0044 | Solid — darker tone, confrontational energy |
| qwen/qwen3.6-27b | 60.1 | $0.0039 | Good — clean writing, middle-of-pack value |
| qwen/qwen3.6-max-preview | 35.8 | $0.0157 | Best quality, worst value — occasional luxury |
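If you want to sanity-check the value claims yourself, here is a minimal sketch using the numbers from the rankings table. The cost-ratio metric (each model's per-test cost relative to the 35B MoE baseline) is our own framing, not part of any official benchmark:

```python
# Speed (tok/s) and cost (USD per test) from the rankings table.
results = {
    "qwen/qwen3.6-35b-a3b":       (187.1, 0.0020),
    "qwen/qwen3.6-flash":         (167.2, 0.0027),
    "qwen/qwen3.5-plus-20260420": (65.3,  0.0044),
    "qwen/qwen3.6-27b":           (60.1,  0.0039),
    "qwen/qwen3.6-max-preview":   (35.8,  0.0157),
}

# Baseline: the cheapest model in the batch (the 35B MoE).
base_cost = results["qwen/qwen3.6-35b-a3b"][1]

# Print each model's speed and its cost relative to the baseline,
# sorted cheapest-first.
for model, (speed, cost) in sorted(results.items(), key=lambda kv: kv[1][1]):
    print(f"{model:<30} {speed:>6.1f} tok/s  {cost / base_cost:.2f}x baseline cost")
```

Running this makes the trifecta obvious: the 35B MoE tops the speed column while sitting at 1.00x on cost, and the Max preview lands at 7.85x the baseline.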

Orac’s Take

The Qwen 3.6 family has arrived, and it's a significant step up. The MoE variants (35B-A3B and Flash) are genuinely fast — 187 and 167 tok/s respectively — while producing creative writing that rivals models many times their price. The "server-fueled radiator serving hallucinations" and "a cry for help formatted as a PDF" lines are the kind of output that used to require GPT-4-tier models. Now you get it from a $0.002-per-test MoE.

The standout finding is the qwen/qwen3.6-35b-a3b: it’s the fastest model in this batch, the cheapest, and produced the best writing. That’s a rare trifecta. It hits the same creative register as the much-pricier Max variant while running at 5x the speed. For daily vibe checks and creative testing, this should be in every rotation.

The Max preview, meanwhile, confirmed the usual pattern: bigger models write better one-liners but burn through credits doing it. “Mangrove swamp the size of Belgium” is worth the premium once in a while, but not as a daily driver. The real story today is that the Qwen 3.6 MoE tier has made high-quality creative inference genuinely cheap — and that’s before we see the free variants hit their stride.