🧠 LLMs - dmndxld

🔧MLOps Blog

medium.com·

Paper: LLM Translation of Compiler Intermediate Representation

🔧MLOps

compilers.iecc.com·

Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent

💻GPU Computing Blog

dnhkng.github.io·

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

🐧Operating Systems

aermia.com··Hacker News

Melanie Mitchell: What We Get Wrong About AI

🏗️AI Infra

yalereview.org··Substack, Hacker News

Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good

🏗️AI Infra Blog

towardsai.net·

The Rise of Agentic AI: What Every Engineer Should Learn

🏗️AI Infra Blog

medium.com·

What Are Tokens in LLMs?

🔥PyTorch Blog

bearisland.dev··Hacker News

DiffusionGemma: The Developer Guide

💻GPU Computing Blog

developers.googleblog.com·

Should LLM Agents Decide in Social Simulations? Comparing Finite-State and LLM-Based Decision Policies

⚡Distributed Training Academic

arxiv.org·

Gemma Collins’ mum rushed to hospital as I’m A Celeb star says she’s ‘so worried she can’t sleep’

💻GPU Computing News

thesun.co.uk·

Google Gemma 4 12B nearly matches 26B benchmarks — and runs on your laptop

💻GPU Computing

thenewstack.io·

Making Local LLM Fast

🔥PyTorch

bogdan.nimblex.net··Hacker News

Creating ADK Agent using locally running Gemma 4

🐍Python Blog

medium.com·

Claude Fable 5 is Mythos for the masses

🏗️AI Infra Blog

techzine.eu·

KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.

🏗️AI Infra Code

github.com··Hacker News

LLM Research Papers: The 2026 List (January to May)

🔧MLOps News

magazine.sebastianraschka.com··Hacker News

What's in the Box? A Field Guide to AI Models

🔧MLOps Blog

iankduncan.com·

Google DeepMind releases Gemma 4 QAT, but Unsloth developer Daniel Han warns naive llama.cpp conversions suffer accuracy loss

🔥PyTorch News

digg.com·

AI The Truly Environmentally Friendly Way

Foundation Models: Apple Isn’t Building an AI Model. It’s Building an AI Platform.
