🔥 Burn - nmarshall · Scour

I stopped using most of Rust’s advanced features for my ML library

🔥PyTorch Code

github.com··r/rust

Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change

💻Local LLMs News Blog

andreaborio.substack.com··Substack

TensorBench: Benchmarking Coding Agents on a Compiler-Based Tensor Framework

🔥PyTorch Academic

Is Doom a Tensor? [video]

🔥PyTorch Video

youtube.com··Hacker News

How we fight GPU scarcity without compromise

🤖AI Inference Blog

equixly.com··Hacker News

A system programmer’s guide to LLM inference

💻Local LLMs Blog

blog.xiangpeng.systems··Hacker News

lbj96347/nemotron-3.5-asr-ios: On-device, offline speech recognition for iPhone/iPad using NVIDIA's Nemotron-3.5-ASR Streaming 0.6B (multilingual) via CoreML. SwiftUI app with mic capture + audio file import, RNN-T decoding, and live benchmark metrics (latency, RTF, memory).

🎙️Whisper Code

github.com··Hacker News

Google reportedly orders at least three million chips from Intel to arrive in 2028, as TSMC struggles to keep up with the AI boom

🖥computers News

··Hacker News

Tensor Shapes in Pyrefly – Avik Chaudhuri – PyCon US 2026 Typing Summit [video]

🔥PyTorch Video

youtube.com··Hacker News

Apple rebuilt its on-device AI stack at WWDC 2026

💻Local LLMs Blog

ziraph.com··Hacker News

Tensor Algebraic Property Skeletons: Amplifying Property-Based Testing for AI Compilers

🔨Compilers Academic

The Edge LLM Offload Story

semiengineering.com·

harshuljain13/llm-inference-at-scale: A practitioner handbook for production LLM serving.

🤖AI Inference Code

github.com··Hacker News

Integrate on-device AI models into your app using Core AI - WWDC26 - Videos

🌟cool github projects

developer.apple.com··Hacker News

Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM

🏗️AI Infrastructure News

digg.com··Hacker News

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

local-llm.utop.workers.dev··Hacker News

Unpacking AI: The Hardware Behind AI

⚡Hardware Acceleration News

pathtostaff.com··Hacker News

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

💻Local LLMs News Blog

blog.google··Hacker News

Higgs Audio v3 TTS 4B. Built for voice chat. Support 100 languages and inline control.

🗣️Speech Synthesis

huggingface.co··Hacker News, r/LocalLLaMA

Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB

🤖AI agents Blog

ziraph.com··Hacker News

