Continuous Batching

Speculators v0.5.0: DFlash support and online training

 🚀LLM Deployment
developers.redhat.com·

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

 Quantization  Content type: Blog
tilert.ai··Hacker News

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

 🚀LLM Deployment  Content type: Academic
arxiv.org·

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

 Quantization

OpenCV 5.0 Computer Vision Library Released with Rewritten DNN Engine

 🚀LLM Deployment
linuxiac.com·

FOD#155: Continual Learning in LLMs: Why AI Models Need Sleep

 🗂️RAG Systems
turingpost.com·

Latest technical articles & videos.

 🤖LLMs
certdepot.net·

The LLM Gateway Pattern: Why Every Kubernetes-Based AI App Needs One

 🤖LLMs
freecodecamp.org·

Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent

 💻Local AI  Content type: Blog
dnhkng.github.io·

defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

 Speculative Decoding  Content type: Code
github.com··Hacker News

LLM Research Papers: The 2026 List (January to May)

 🗣️NLP  Content type: News

Less-relevant results

Anatomy of a high-performance EP kernel

 🚀LLM Deployment  Content type: Blog
fergusfinn.com·

MLPerf and the rise of latency-aware LLM benchmarking

 🗣️NLP
edn.com·

Youssof Altoukhi (@Youssofal_)

 🚀LLM Deployment
xcancel.com··r/LocalLLaMA

How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies

 🏢LLM Adoption  Content type: Blog
blogs.nvidia.com·

The Edge LLM Offload Story

 🗣️NLP
semiengineering.com·

Where to Host Your Open-Source Model (Under 10B Parameters)

 🔓Open Source AI
digitalocean.com·

Integrate OpenShift AI and PG Airman MCP Server

 🔓Open Source AI

IntentKV: Cross-Turn Intent-Aware KV Cache Pruning for Agent Inference

 🚀LLM Deployment  Content type: Academic
arxiv.org·

Issue #390 - The ML Engineer 🤖

 🗣️NLP  Content type: News  Content type: Blog
