💬 LLMs - sarah · Scour

🧠Agentic AI llama-dash.dev·

One go-to control plane for local inference

Discussed on Hacker News

📚LMS GitHub·

Native Inference Engine for macOS 14 or newer

Discussed on Hacker News

🤖AI Claude·

The full Claude Desktop experience on AWS, Google Cloud, and Microsoft Foundry

Discussed on Hacker News

🤖AI Baseten·

We built the fastest API for GLM-5.2 (280 TPS)

Covers GLM-5.2 (6 minute read)

Discussed on Hacker News

🧠Agentic AI akarouter.dev·

Flat per-call LLM API gateway (20x cheaper than Claude Max)

Discussed on Hacker News

📚LMS lector.dev·

Show HN: Evaluating Local LLMs as language translators for my app

Discussed on Hacker News

🤖AI fareedkhan-dev.github.io·

Train LLM from Scratch

Discussed on Hacker News

🔧Hardware ludion.ai·

WebGPU feature detection was not enough to run small LLMs on phones

Discussed on Hacker News

🤖AI whyopensource.ai·

A running list of reasons to move to open source

Covers 3 stories including Statement on the US government directive to suspend access to Fable 5 and Mythos 5

Discussed on Hacker News

🔧Hardware Hacker News·

Ask HN: What are some good/fast coding models for Apple Silicon?

Discussed on Hacker News

🤖AI portal.neuralwatt.com·

Neuralwatt: Energy-based pricing for AI inference. Efficient prompts cost less

Discussed on Hacker News

🔧Hardware graphsignal.com·

CUDA Profiler for Production Inference

Discussed on Hacker News

🤖AI hello-fri-end.github.io·

Integer Quantization: Deep Dive

Discussed on Hacker News

🔧Hardware groq.com·

Groq Raises Another $650M

Covered by TechCrunch, SiliconANGLE

Discussed on Hacker News

🔧Hardware brightray.ai·

Built Uber aggregator that tracks top AI researchers and leaders

Discussed on Hacker News

🔧Hardware julianrdcosta.substack.com·

Any Sufficiently Large Lookup Table Must Be Conscious

Discussed on Substack

🔧Hardware rocm.blogs.amd.com·

Unlocking Extreme AMD Instinct Inference with Software-Hardware Co-Optimization

Discussed on Hacker News

🚀Tech Trends tokenprices.io·

I Tracked LLM Pricing for 8 Weeks. Here's What the Data Shows

Discussed on Hacker News

🔧Hardware xcancel.com (Video)·

Fable 5 pushed Gemma 4 to 255 tok/s on WebGPU

Discussed on Hacker News

Less-relevant results

🧠Agentic AI moorcheh.ai·

Information-Theoretic Vector Search Is Having Its Moment

Covered by GitHub

Discussed on Hacker News
