⚡ Model Efficiency - jimman · Scour

The economics of speculative decoding

⚡LLM Optimization Blog

fergusfinn.com··Hacker News

How to cut the cost of long AI agent threads (without making the agent dumber)

✍️Prompt Engineering Blog

viktor.com··Hacker News

A system programmer’s guide to LLM inference

⚡LLM Optimization Blog

blog.xiangpeng.systems··Hacker News

Machinic Psychopharmacology: Do LLMs Self-Medicate?

⚡LLM Optimization

lesswrong.com··Hacker News

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

⚡LLM Optimization Discussion

news.ycombinator.com··Hacker News

Tangram: Unlocking Non-Uniform KV Cache for Efficient Multi-turn LLM Serving

⚡LLM Optimization Academic

arxiv.org··Hacker News

Measuring Embedding Drift: Why Hybrid Search Saves Stale Models.

pub.towardsai.net

Tired of GitHub Trending being GitHub-only, so we made a multi-forge version (GitLab and Codeberg included)

🛠️Developer Tools

gitgem.org··Hacker News, r/opensource

Less-relevant results

Catlantean 3D - Making Graphics Like It's 1993

✍️Prompt Engineering

staniks.github.io··Lobsters, Hacker News, r/programming

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

⚡LLM Optimization News Blog

blog.google··Hacker News

The iPhone’s Last Stand

⚡LLM Optimization

stratechery.com··Hacker News

Efficient and Training-Free Single-Image Diffusion Models

⚡LLM Optimization

haojunqiu.github.io··Hacker News

NetX-lab/Frontier: Frontier: A Discrete-Event Simulator for Modern LLM Serving

⚡LLM Optimization Code

github.com··Hacker News

LLM Research Papers: The 2026 List (January to May)

🤖AI News

magazine.sebastianraschka.com··Hacker News

Apple rebuilt its on-device AI stack at WWDC 2026

🤖AI Blog

ziraph.com··Hacker News

NVIDIA and LG Group Build an AI Factory to Advance Physical AI, Mobility and AI Infrastructure

⚡LLM Optimization Blog

blogs.nvidia.com··Hacker News

Show HN: Magenta Real-Time Music Generation on iPhone, Without the GPU

🤖AI Code

github.com··Hacker News

OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision

⚡LLM Optimization

opencv.org··Hacker News

Linux 7.1-rc7: give rc7 a whirl and keep testing

lwn.net··Hacker News

Fine-tune FLUX.2 [Klein] with a LoRA under 60 minutes

🤖AI Blog

huggingface.co··Hacker News
