KV Cache

GLM-5.2: Z.ai Ships 1M-Token Coding Model With Zero Benchmarks

 💻Software Engineering  Content type: Blog
wowhow.cloud · DEV · Covers: DEV Community

12B Gemma 4 QAT Deployment with NVIDIA L4, Cloud Run, MCP, and Antigravity CLI

 🧠LLM Inference  Content type: Blog
medium.com

Mlx-optiq: per-layer mixed-precision LLM quantization for Apple Silicon

 💬LLMs  Content type: Video  Content type: Discussion  Content type: Tutorial

PolyKV: Heterogeneous Retention and Allocation for KV Cache Compression

 🔢Vector DBs  Content type: Academic
arxiv.org·

Show HN: Quant Picker – which GGUF file fits your model and machine

 💬LLMs

Rebellions bets on memory-centric AI inference

 🧠LLM Inference
jonpeddie.com·

Inference cost at scale with napkin math (13 minute read)

 🧠LLM Inference  Content type: Blog

Native Inference Engine for macOS 14 or newer

 🧠LLM Inference  Content type: Code
github.com · Hacker News

Inside the LLM KV Cache: The Hidden System Behind Fast AI Inference

 🧠LLM Inference  Content type: Blog
fardinkai.medium.com·

I gave my gaming PC and phone the same local LLM tasks, and only one of them is still in my daily rotation

 🧠LLM Inference
xda-developers.com·

vLLM Transformers Backend: Bridging Hugging Face Compatibility and High-Performance Inference

 🧠LLM Inference  Content type: Blog
odsc.medium.com·

SMEPilot: Characterizing and Optimizing LLM Inference with Scalable Matrix Extensions

 🧠LLM Inference  Content type: Academic
arxiv.org·

Running Local LLMs With Ollama For Private Development

 🧠LLM Inference  Content type: Tutorial
nazarboyko.com · DEV

Google OpenRL Tames AI Model Tuning, Kubernetes-Style

 🔧MLOps

All sorts of famous Attention Layers

 🧠LLM Inference  Content type: Blog

Parallelize speculative decoding with P-EAGLE on Amazon SageMaker AI

 🧠LLM Inference  Content type: Blog
aws.amazon.com·

Please Use My Free Software

 🗄️Storage Engines  Content type: Blog
artlu.bearblog.dev·
