🧠 LLMs - sbhargava92 · Scour

Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling

🗄️Vector Databases Academic

A system programmer’s guide to LLM inference

⚙️Systems Programming Blog

blog.xiangpeng.systems··Hacker News

LeLab Is Hugging Face’s New Browser-Based GUI for the LeRobot Ecosystem

🌐Open Source News

magenta/magenta-realtime: Magenta RealTime 2: An Open-Weights Live Music Model

💾Storage Engines Code

What's in the Box? A Field Guide to AI Models

🤖AI Agents Blog

iankduncan.com·

Build a Medical Report Analyzer on Dedicated Inference with Python

⚙️Systems Programming

digitalocean.com·

Microsoft just shared the frontier data engineering secrets

mail.bycloud.ai·

Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech

🤖AI Agents Blog

huggingface.co·

Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!

Nvidia Nemotron 3 Ultra

research.nvidia.com··Hacker News

Cohere open-sources a coding agent that runs on a single H100

venturebeat.com·

Location: Göttingen, Germany Remote: Yes (preferred; hybrid also fine) Willing t...

🤖AI Agents Discussion

news.ycombinator.com··Hacker News

Google Gemma 4 12B brings native multimodal AI to standard laptops

Less-relevant results

Google fills out the middle with the Gemma 4 12B

jonpeddie.com·

Running LLM Inference on Kubernetes: What It Actually Takes

🖥️OS Blog

fairwinds.com·

China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude (4 minute read)

🔗Networking News

LLM Research Papers: The 2026 List (January to May)

🤖AI Agents News

magazine.sebastianraschka.com··Hacker News

defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

🤖AI Agents Code

github.com··Hacker News

"North Mini Code"; open weights, 30B param, Canadian coding model

🤖AI Agents Blog

cohere.com··Hacker News

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

local-llm.utop.workers.dev··Hacker News
