🤖 artificial intelligence - MarkGao · Scour

On-device AI is a margin decision

⚙️AI Automation Blog

ziraph.com··Hacker News

A system programmer’s guide to LLM inference

🧠LLMs Blog

blog.xiangpeng.systems··Hacker News

HNSW vs LSH: How Elasticsearch hits 0.99 recall@10 at 15,000 QPS — and what it costs

🔬AI Research Tools Blog

Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent

✍️Prompt Engineering News

spectrum.ieee.org··Hacker News

China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude (4 minute read)

📰AI News News

decrypt.co··Hacker News

Magenta RealTime 2: Open and Local Live Music Models

🎨AI for Creators

magenta.withgoogle.com··Hacker News, Hacker News, r/LocalLLaMA

Loss Landscape Diagnosis for Gradient-Based Gray-Scott System Inversion: Disentangling the Roles of PINN Components

🤖AI Agents Academic

Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!

🎼Agent Orchestration

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🧠LLMs Code

github.com··Hacker News, r/LLM

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

⚙️AI Automation Blog

tilert.ai··Hacker News

TurboQuant in PostgreSQL

⚛️Quantum Computing Blog

blog.mayflower.de·

Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change

🧠LLMs News Blog

andreaborio.substack.com··Substack

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

androidauthority.com·

Alduin 4B, an uncensored Vision LLM just released.

huggingface.co··r/StableDiffusion

The Smallest Brain You Can Build: A Perceptron in Python

🧠LLMs Discussion

news.ycombinator.com··Hacker News

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

🧠LLMs News Blog

kaitchup.substack.com··r/LocalLLaMA

The week AI infrastructure crossed from a technology story to a financial one

🤖claude code News

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

🧠LLMs News Blog

developer.nvidia.com·

Machine Learning With Manya: The Great Toy Shop Whisper Game

📦AI Product Launches Blog

Generalizable self-supervised learning for imaging flow cytometry on multi-dataset leukocyte differential

⚙️AI Automation Academic
