🔲 ML Hardware
GPU, TPU, inference hardware, AI accelerators, CUDA
Efficient, VRAM-Constrained xLM Inference on Clients
⚡ Performance Engineering · arxiv.org · 6d

GPU Power Prediction Tool for AI Workloads (MIT, IBM)
⚡ Performance Engineering · semiengineering.com · 1d

Part III: The evolution to AI GPUs
🤖 AI Research · jonpeddie.com · 10h

Google's New TPU Generation is Specifically Designed for Agents and SOTA Model Training
🤖 AI Research · infoq.com · 13h

Supercharging LLM inference on Google TPUs: Achieving 3X speedups with diffusion-style speculative decoding
🤖 LLM · developers.googleblog.com · 2d · Hacker News, r/LocalLLaMA

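As background for the headline above: in vanilla speculative decoding, a small draft model proposes a block of tokens and the large target model verifies them together, keeping the longest agreeing prefix. The sketch below shows only that greedy baseline, not the diffusion-style variant the post describes; `draft_next` and `target_argmax` are hypothetical stand-ins.

```python
# Toy sketch of vanilla greedy speculative decoding. `draft_next` and
# `target_argmax` stand in for the small draft model and the large target
# model; a real implementation verifies all k drafted positions in one
# batched forward pass rather than the per-position loop shown here.
def speculative_step(prefix, draft_next, target_argmax, k=4):
    # 1) The draft model cheaply proposes k tokens autoregressively.
    proposal = list(prefix)
    for _ in range(k):
        proposal.append(draft_next(proposal))
    drafted = proposal[len(prefix):]

    # 2) The target model checks each drafted position; keep the longest
    #    prefix where its greedy choice agrees with the draft.
    accepted = []
    for i, tok in enumerate(drafted):
        target_tok = target_argmax(proposal[: len(prefix) + i])
        accepted.append(target_tok)
        if target_tok != tok:  # first disagreement: keep target's token, stop
            break
    return prefix + accepted
```
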
Distributing model weights to your AI cluster: a faster pre-flight on AKS and Slurm
☁️ Cloud Computing · techcommunity.microsoft.com · 4h

Performance of CUDA Python in AI
🤖 AI Research · medium.com · 5d

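"CUDA Python" can refer to several stacks; a minimal sketch in the Numba flavor (an assumption here, since the post may instead cover NVIDIA's cuda-python bindings) shows the basic pattern of compiling a Python function into a GPU kernel.

```python
# Minimal Numba CUDA kernel: vector addition. Requires a CUDA-capable GPU.
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)        # global thread index across the whole grid
    if i < out.size:        # guard threads past the end of the array
        out[i] = a[i] + b[i]

n = 1 << 20
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads = 256
blocks = (n + threads - 1) // threads
vector_add[blocks, threads](a, b, out)  # Numba copies arrays to/from device
assert np.allclose(out, a + b)
```
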
Your old GPU can still run big LLMs – you just need the right tweaks
🧠 LLMs · xda-developers.com · 13h

Fitting LLMs on Self-Hosted GPUs
⚡ Performance Engineering · anup.io · 2d

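The usual starting point for fitting an LLM on a self-hosted GPU is a back-of-envelope VRAM budget: weights plus KV cache plus runtime overhead. The sketch below is a rough rule of thumb with illustrative numbers, not figures from the post.

```python
# Rough VRAM estimate for serving an LLM. All numbers are illustrative
# assumptions; real usage depends on the runtime, attention layout, etc.
def vram_gib(params_b, bytes_per_param, n_layers, n_kv_heads, head_dim,
             seq_len, batch, kv_bytes=2, overhead=1.2):
    weights = params_b * 1e9 * bytes_per_param
    # KV cache: 2 tensors (K and V) per layer, per token, per sequence.
    kv = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * kv_bytes
    return (weights + kv) * overhead / 2**30

# e.g. a 7B model in 4-bit (~0.5 bytes/param) with an 8K context:
print(f"{vram_gib(7, 0.5, 32, 8, 128, 8192, 1):.1f} GiB")  # ≈ 5.1 GiB
```
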
OpenAI is teaming up with other companies to improve supercomputer networking for AI training.
🤖 AI Research · theverge.com · 3h

deck.gl is a GPU-powered framework for visual exploratory data analysis of large datasets.
📊 Data Science · deck.gl · 9h

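For a sense of the deck.gl programming model, here is a minimal sketch via pydeck, the framework's Python binding (the post itself covers the JavaScript framework; the data below is made up).

```python
# Render a GPU-accelerated scatterplot with pydeck (deck.gl's Python binding).
import pydeck as pdk

# Hypothetical data: point records with [longitude, latitude] positions.
points = [{"position": [-122.45, 37.78]}, {"position": [-122.40, 37.76]}]

layer = pdk.Layer(
    "ScatterplotLayer",        # one of deck.gl's core layer types
    data=points,
    get_position="position",
    get_radius=100,            # meters
    get_fill_color=[255, 0, 0],
)

view = pdk.ViewState(longitude=-122.43, latitude=37.77, zoom=11)
pdk.Deck(layers=[layer], initial_view_state=view).to_html("scatter.html")
```
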
OpenCV's DNN Library for Optimal Model Performance on CPU, CUDA, and New Architectures
👁️ Computer Vision · armdevices.net · 1d

Google Announces TPU v8t Sunfish and TPU v8i Zebrafish
🏗️ System Design · storagereview.com · 5d

e3ntity/e3rl: Fast and simple implementation of RL algorithms, designed to run fully on GPU.
🤖 AI Research · github.com · 12h · Hacker News

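"Fully on GPU" RL usually means the environment itself is batched tensor math, so rollouts and policy updates never leave the device. The toy PyTorch REINFORCE loop below illustrates that pattern; it is an assumed sketch of the idea, not code from the e3rl repository.

```python
# Toy end-to-end-on-GPU RL loop: a batched "environment" implemented as
# tensor ops, so no host/device round-trips occur inside the loop.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
n_envs, obs_dim, act_dim = 1024, 8, 2

policy = torch.nn.Linear(obs_dim, act_dim).to(device)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

obs = torch.randn(n_envs, obs_dim, device=device)
for step in range(100):
    logits = policy(obs)
    actions = torch.distributions.Categorical(logits=logits).sample()
    # Toy batched environment: reward and next state as pure tensor math.
    reward = (actions == 0).float() - 0.5
    obs = obs + 0.01 * torch.randn_like(obs)
    # REINFORCE-style update, entirely on device.
    logp = torch.log_softmax(logits, dim=-1).gather(1, actions[:, None]).squeeze(1)
    loss = -(logp * reward).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```
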
Boosting multimodal inference performance by >10% with a single Python dictionary
⚡ Performance Engineering · modal.com · 2d · Hacker News

Difference between revisions of "Jetson/L4T/Power"
🐹 Go · elinux.org · 19h

A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility...
🧠 LLMs · lemmy.ml · 5d

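To make "low-bit quantization" concrete, here is a generic round-to-nearest int4 scheme with per-row scales. This is the textbook baseline for illustration only, not the SOTA algorithm the post describes.

```python
# Generic symmetric int4 weight quantization with a per-row scale.
import numpy as np

def quantize_int4(w):
    # Map each row's largest magnitude onto the positive int4 limit (7).
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 16).astype(np.float32)
q, s = quantize_int4(w)
print(f"mean abs error: {np.abs(w - dequantize(q, s)).mean():.4f}")
```
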
SMG: The Case for Disaggregating CPU from GPU in LLM Serving
⚡ Performance Engineering · pytorch.org · 1d · Hacker News

Why we’re at a decisive turning point for resolving data fragmentation [Q&A]
🏗️ System Design · betanews.com · 14h

AI galaxy hunters could be adding to the global GPU crunch
🤖 AI Research · techxplore.com · 1d