Model Efficiency

Inference Optimization, VRAM Calculation, Performance Tuning, Resource Management


The economics of speculative decoding

 LLM Optimization  Content type: Blog

How to cut the cost of long AI agent threads (without making the agent dumber)

 ✍️Prompt Engineering  Content type: Blog
viktor.com··Hacker News

A system programmer’s guide to LLM inference

 LLM Optimization  Content type: Blog

Machinic Psychopharmacology: Do LLMs Self-Medicate?

 LLM Optimization

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

 LLM Optimization  Content type: Discussion

Tangram: Unlocking Non-Uniform KV Cache for Efficient Multi-turn LLM Serving

 LLM Optimization  Content type: Academic
arxiv.org··Hacker News

Measuring Embedding Drift: Why Hybrid Search Saves Stale Models.

 🤖AI
pub.towardsai.net

Tired of GitHub Trending being GitHub-only, so we made a multi-forge version (GitLab and Codeberg included)

 🛠️Developer Tools

Less-relevant results

Catlantean 3D - Making Graphics Like It's 1993

 ✍️Prompt Engineering

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

 LLM Optimization  Content type: News  Content type: Blog
blog.google··Hacker News

The iPhone’s Last Stand

 LLM Optimization

Efficient and Training-Free Single-Image Diffusion Models

 LLM Optimization

NetX-lab/Frontier: Frontier: A Discrete-Event Simulator for Modern LLM Serving

 LLM Optimization  Content type: Code
github.com··Hacker News

LLM Research Papers: The 2026 List (January to May)

 🤖AI  Content type: News

Apple rebuilt its on-device AI stack at WWDC 2026

 🤖AI  Content type: Blog
ziraph.com··Hacker News

NVIDIA and LG Group Build an AI Factory to Advance Physical AI, Mobility and AI Infrastructure

 LLM Optimization  Content type: Blog

Show HN: Magenta Real-Time Music Generation on iPhone, Without the GPU

 🤖AI  Content type: Code
github.com··Hacker News

OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision

 LLM Optimization

Linux 7.1-rc7: give rc7 a whirl and keep testing

 🔓Hacking
lwn.net··Hacker News

Fine-tune FLUX.2 [Klein] with a LoRA under 60 minutes

 🤖AI  Content type: Blog
