Gemma 4 dense by default: why your local agent doesn't want the MoE
The decision you don't realize you're making

You sit down to wire Gemma 4 into a local agent loop: a Claude-Code-style tool-using harness, a long-context code reviewer, an offline research assistant. Google has handed you four architectures from the same release. The contest framing nudges you toward an obvious read: E2B and E4B (effective-parameter) models for phones, browsers, and a Pi 5. 31B dense as the on-prem workhorse. 26B MoE as the efficient one, the mixture-of-experts variant that...