⚙️ LLMOps - joshwonghc · Scour

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

💻AI Engineering

venturebeat.com·r/LocalLLaMA

Your AI agent reads the fine print: building a RAG pipeline over EU regulations with Elasticsearch and OGX

💻AI Engineering Blog

The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

🤖AI Agents Academic

PagedAttention vs Traditional KV Cache: How vLLM Reinvented GPU Memory for LLM Inference

🌐Open Source AI Blog

Mi50 32GB / GFX906 - vLLM Qwen 3.5 Configuration for Qwen 3.5:9B AWQ-4bit

🌐Open Source AI

huggingface.co·r/LocalLLaMA

SmithDB

🤖AI Agents News

NULL BITMAP by Justin Jaffray via buttondown.com·Lobsters, Hacker News

Systematic research with LangChain's Deep Agents framework and Elasticsearch

🤖AI Agents Blog

Token4Token — pay-per-token inference on Gnosis + Swarm

🌐Open Source AI

t4t.eth.link·Hacker News

For whom the door-bell tolls

💻AI Engineering

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

🌐Open Source AI

vettedconsumer.com·Hacker News

Location: Lubbock, TX, USA Remote: Yes (Remote-friendly, US-based) Technologies:...

📚RAG Discussion

news.ycombinator.com·Hacker News

massimo92/spark: CLI tool for serving LLMs with vLLM on NVIDIA DGX Spark. One file, zero friction.

🌐Open Source AI Code

github.com·Hacker News

Kimi K2.7-Code cuts thinking tokens 30% — but practitioners say the benchmarks don't check out

💻AI Engineering

venturebeat.com

Build a local voice agent with Red Hat OpenShift AI

developers.redhat.com

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference

🌐Open Source AI Academic

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

💻AI Engineering

uccl-project.github.io·Hacker News

DiffusionGemma: The Developer Guide - Google Developers Blog

🌐Open Source AI Blog

developers.googleblog.com·r/LocalLLaMA·Cited by 1 article

How to Build an Agentic RAG with RubyLLM and Rails

💻AI Engineering Blog

panasiti.me·Hacker News

Youssof Altoukhi (@Youssofal_)

🌐Open Source AI

xcancel.com·r/LocalLLaMA

[AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo

🌐Open Source AI News
