🤖 AI - scour · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖LLMs Code

github.com··Hacker News

Siri AI at WWDC 2026

🖥️operating systems

simonwillison.net··Hacker News

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

⚙️AI Applications

zozo123.github.io··Hacker News

DiffusionGemma: 4x Faster Text Generation

⚡Performance Engineering News Blog

blog.google··Hacker News, r/LocalLLaMA, r/singularity

Introducing the Third Generation of Apple’s Foundation Models

machinelearning.apple.com··Hacker News, r/apple

Machinic Psychopharmacology: Do LLMs Self-Medicate?

lesswrong.com··Hacker News

CodegenBench: Can LLMs Write Efficient Code Across Architectures?

🏗️software engineering Academic

arxiv.org··Hacker News

Apple rebuilt its on-device AI stack at WWDC 2026

🖥️operating systems Blog

ziraph.com··Hacker News

OpenEnv is now owned by HF, Torch, Prime Intellect, Unsloth, Modal, Mercor, and more! Use it for training agents.

🔌MCP Blog

huggingface.co··Hacker News, r/LocalLLaMA

How we fight GPU scarcity without compromise

🤖LLMs Blog

equixly.com··Hacker News

DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200

⚙️AI Applications News

newsletter.semianalysis.com··Hacker News

Mbodi AI (YC P25) Is Hiring Founding Machine Learning Engineer (Robotics)

📊data engineering

ycombinator.com··Hacker News

Apple Outlines Major AI and Developer Tool Updates at 2026 Platforms State of the Union

🧰developer tools News

macrumors.com··Hacker News

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

vettedconsumer.com··Hacker News

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

venturebeat.com··Hacker News

Integrate on-device AI models into your app using Core AI - WWDC26 - Videos

developer.apple.com··Hacker News

Introducing Granite Libraries and Project Granite Switch

🏗️software engineering Blog

research.ibm.com··Hacker News

Token4Token — pay-per-token inference on Gnosis + Swarm

⚙️AI Applications

t4t.eth.link··Hacker News

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

🧰developer tools

deemwar-products.github.io··Hacker News

Tokenminning: Because Tokenmaxxing Is a Bad Idea

tokenminning.com··Hacker News
