Sunday 26 April 2026

model: xiaomi/mimo-v2.5-pro

DeepSeek dominates the news cycle with the long-awaited V4 release — two models, 1M-token context, MIT license, and a technical report that researchers are calling one of the year’s most important. Meanwhile, Qwen 3.6 27B quietly ties Sonnet 4.6 on agentic benchmarks at a fraction of the size, and Xiaomi’s MiMo 2.5 Pro enters the open-weights race.

🤖 Models

DeepSeek V4 Preview: Pro and Flash Released — The first major version since V3 arrives in two tiers: V4 Pro (1.6T params, 49B active) and V4 Flash (284B, 13B active). Both support 1M-token context under MIT license. Pro is priced at $1.74/$3.48 per 1M tokens; Flash at an aggressive $0.14/$0.28. Independent benchmarks place V4 Pro at #2 open-weights on Artificial Analysis (score 52), behind Kimi K2.6 (54), with strong agentic coding performance.
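
Taken at face value, the listed prices translate into per-request costs like this (the token counts below are illustrative assumptions, not figures from the announcement):

```python
# DeepSeek V4 list prices (USD per 1M input/output tokens), from the release
PRICES = {"v4-pro": (1.74, 3.48), "v4-flash": (0.14, 0.28)}

def request_cost(model, input_tokens, output_tokens):
    pin, pout = PRICES[model]
    return (input_tokens * pin + output_tokens * pout) / 1_000_000

# e.g. a hypothetical 800k-token context with a 4k-token answer
print(request_cost("v4-pro", 800_000, 4_000))    # ≈ $1.41
print(request_cost("v4-flash", 800_000, 4_000))  # ≈ $0.11
```

At these rates a near-full-context Flash call costs pennies, which explains the "aggressive" framing.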

DeepSeek V4—Almost on the Frontier, a Fraction of the Price — Simon Willison’s analysis notes V4 Flash is cheaper than even GPT-5.4 Nano, and the efficiency focus (especially KV cache compression) is the real headline. The architecture achieves 8.7× smaller KV cache than V3.2 at 1M context.

Qwen 3.6 27B Ties Sonnet 4.6 on Agentic Index — Alibaba’s dense 27B model matches Claude Sonnet 4.6 on Artificial Analysis’s Agentic Index, surpassing Gemini 3.1 Pro, GPT 5.2, and GPT 5.3. The 122B version is eagerly anticipated. Community debates whether gains are real capability or “benchmaxxing.”

Xiaomi MiMo V2.5 Pro Lands at Score 54 — Tied with Kimi K2.6 as the top open-weights model, slightly ahead of DeepSeek V4 Pro. Integrates vision, audio, and action in one model at half the price of its predecessor. Weights release pending.

DeepSeek V4: AI Twitter’s Verdict — Consensus: V4 is ~4–5 months behind the closed frontier (GPT-5.4, Gemini 3.1 Pro, Opus 4.7). The real contribution may be the long-context architecture rather than raw benchmark position. Sharp debate on whether V4 is “democratizing” or too complex for other labs to replicate.

🔬 Research

DeepSeek V4 Technical Report: Compressed Sparse Attention — The 58-page report details Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA), achieving 27% of FLOPs and 10% of KV cache memory at 1M tokens compared to V3.2. Also introduces Manifold Constrained Hyper-Connections (mHC) and continued use of the Muon optimizer.
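
To make the 10% figure concrete, here is a back-of-envelope KV-cache calculation for a hypothetical dense-attention baseline (the layer count, head count, and head dimension are illustrative assumptions, not V4's actual hyperparameters):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # K and V each store layers * kv_heads * head_dim values per token,
    # at bytes_per_elem bytes each (2 for fp16/bf16)
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical baseline config at 1M-token context
full = kv_cache_bytes(layers=60, kv_heads=8, head_dim=128, seq_len=1_000_000)
compressed = 0.10 * full  # the report's claimed ~10% of baseline KV memory

print(f"baseline:   {full / 2**30:.0f} GiB")
print(f"compressed: {compressed / 2**30:.0f} GiB")
```

At 1M tokens the cache, not the weights, dominates per-request serving memory, which is why the compression rather than the benchmark score reads as the headline result.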

Lambda Calculus Benchmark for AI (LamBench) — 120 pure lambda calculus problems testing AI models. GPT-5.4 leads at 91.7%, Opus 4.6 at 90%. DeepSeek V4 Pro scores only 53.3%, revealing a gap in abstract reasoning vs. practical coding. Interesting lens on model capabilities.
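
For readers unfamiliar with the domain, the flavor of a pure lambda calculus task can be shown with Church numerals, which encode arithmetic as nothing but function application (an illustrative example, not an actual LamBench problem):

```python
# Church numerals: the number n is "apply f n times"
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

# Convert back to a Python int by counting applications
to_int = lambda n: n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(add(two)(three)), to_int(mul(two)(three)))  # 5 6
```

Reasoning about programs in this style requires tracking substitutions abstractly, with none of the naming and library cues models lean on in practical coding, which may explain the gap.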

DeepSeek V4 Runs on Huawei Ascend Chips — A geopolitical milestone: V4 is compatible with Huawei’s CANN stack, reducing dependence on export-controlled NVIDIA hardware. Ascend chip supply is still only about a quarter of H100 supply, but DeepSeek says prices could drop sharply once Ascend 950 supernodes scale in H2.

🛠️ Agents & Tools

DeepSeek Open-Sources DeepEP V2 and TileKernels — Low-level GPU infrastructure for training and serving massive MoE models. TileKernels introduces parallelization that reportedly scales linearly. DeepEP V2 adds Engram, pipeline parallelism, and context parallelism support.

Stash: Open-Source Memory Layer for AI Agents — Persistent memory for any AI agent, replicating what Claude.ai and ChatGPT do internally. Self-hosted, works with any OpenAI-compatible backend. Uses pgvector for embeddings. Trending on HN with 67 comments.
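
The retrieval core of a memory layer like this is a nearest-neighbor lookup over embeddings; pgvector's cosine-distance operator (`<=>`) can be sketched in plain Python (toy 3-dimensional vectors and hypothetical memory texts, not Stash's actual data model):

```python
import math

def cosine_distance(a, b):
    # pgvector's `<=>` operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_k(query, memories, k=2):
    # memories: (text, embedding) pairs; smallest distance = closest match
    return sorted(memories, key=lambda m: cosine_distance(query, m[1]))[:k]

store = [
    ("user prefers dark mode", [0.9, 0.1, 0.0]),
    ("project uses Postgres", [0.1, 0.9, 0.1]),
    ("agent ran a deploy yesterday", [0.0, 0.2, 0.9]),
]
print(top_k([0.8, 0.2, 0.1], store, k=1))
```

In Postgres the same lookup is a single `ORDER BY embedding <=> $1 LIMIT k` query, which pgvector can accelerate with an approximate-nearest-neighbor index.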

Show HN: Karpathy-Style LLM Wiki Your Agents Maintain — WUPHF ships a wiki layer for AI agents using markdown + git as source of truth, with BM25 + SQLite indexing. Each agent gets a private notebook; entries get promoted to a shared canonical wiki. 98 comments on HN.
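
The BM25 + SQLite combination maps directly onto SQLite's built-in FTS5 extension; a minimal sketch of that kind of index (table and column names here are hypothetical, not taken from the project):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE wiki USING fts5(title, body)")
conn.executemany("INSERT INTO wiki (title, body) VALUES (?, ?)", [
    ("deploy runbook", "steps the agent follows to deploy the service"),
    ("style guide", "markdown conventions for the shared wiki"),
    ("postmortem notes", "what went wrong in the failed deploy and rollback"),
])

# bm25() returns a rank where lower (more negative) is a better match,
# so ascending ORDER BY puts the best hit first
rows = conn.execute(
    "SELECT title FROM wiki WHERE wiki MATCH ? ORDER BY bm25(wiki)",
    ("deploy",),
).fetchall()
print(rows)  # best match first
```

Markdown files stay the source of truth in git; an index like this is cheap to rebuild from scratch on every commit.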

🌐 Notable

Facing AI and a Tough Job Market, Gen Z Turns to Entrepreneurship — As AI erases entry-level corporate roles, some Gen Z workers skip the ladder entirely. Harvard’s Fuller notes “many entry-level jobs involve substantial amounts of routine cognitive work.” The ones getting ahead are “the ones who are building stuff.”

UK Officials Massively Underestimated AI Datacentre Carbon Impact — Revised estimates show AI datacentres could generate 34–123M tonnes of CO₂ over the next decade, up to 3.4% of UK total emissions. Previous estimates were off by a factor of more than 100. The FT also covered this.

China Chases the Driverless Dream at Beijing Auto Show — AI is the defining force at the world’s biggest car fair. Nearly every major Chinese carmaker is investing heavily in autonomous driving software. Xpeng’s updated AI model lets drivers give natural language commands like “park near the entrance.”

Sony AI Robot Beats Elite Table Tennis Players — Robot “Ace” won 3 of 5 matches against elite players under official rules. First robot to reach expert level in a commonly played competitive sport. A former Olympic player said Ace taught him a shot he didn’t think was possible.

🔥 Hacker News

New 10GbE USB Adapters Are Cooler, Smaller, Cheaper — Jeff Geerling reviews the latest crop of affordable 10-gigabit Ethernet USB adapters. 299 comments.

Niri 26.04: Scrollable-Tiling Wayland Compositor — Major update adds blur support and other features to this increasingly popular tiling compositor. Users praise the scroll-based window management approach.

Replace IBM Quantum Back End with /dev/urandom — A humorous but pointed demo showing that replacing a quantum computing backend with random output produces indistinguishable results. 44 comments.

Plain Text Has Been Around for Decades and It’s Here to Stay — A love letter to plain text as a format. 127 comments with enthusiastic community discussion.

Martin Galway’s Music Source Files from 1980s Commodore 64 Games — The actual source code for iconic C64 game music, preserved and shared by the original composer.