Scour
🖥 GPUs
GPU Pricing, Serverless GPU Hosting, Cloud AI Model Deployment
Scoured 186,616 posts in 60.5 ms
I've been running some of the biggest open-weight LLMs for free on Nvidia's cloud
🏗️ LLM Infrastructure · xda-developers.com · 18h
NVIDIA and Google Cloud Expand AI Hypercomputer Platform at Next 2026
📊 Model Serving Economics · storagereview.com · 6d
Designing multitenant GPU infrastructure: Isolation across virtualization and Kubernetes platforms
📦 Container Runtimes · redhat.com · 1d
Powering AI Factories with NVIDIA Enterprise Reference Architectures
🏗️ LLM Infrastructure · developer.nvidia.com · 1d
Sources: AI startups are struggling to access Nvidia GPUs as Microsoft and other cloud providers divert supply to internal teams and large customers like OpenAI...
🤖 AI · techmeme.com · 6d
Is your compute strategy ready for AI workloads in the cloud?
🏗️ LLM Infrastructure · techtarget.com · 2d
FOMO is why enterprises pay for GPUs they don't use — and why prices keep climbing
📊 Model Serving Economics · venturebeat.com · 1d
Show HN: Utilyze, an open source GPU monitoring tool more accurate than nvtop
📊 Model Serving Economics · systalyze.com · 3d · Hacker News
Efficient, VRAM-Constrained xLM Inference on Clients
🏗️ LLM Infrastructure · arxiv.org · 1d
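For VRAM-constrained client inference like the arXiv item above, the budget is typically model weights plus the KV cache, and the cache is the part that grows with context length. A back-of-envelope sizing sketch (the model dimensions below are illustrative assumptions, not figures from the paper):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV-cache size: K and V each store
    layers * kv_heads * head_dim values per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative 7B-class config (assumed, for scale only):
# 32 layers, 8 KV heads (GQA), head_dim 128, 8192-token context, fp16
gib = kv_cache_bytes(32, 8, 128, 8192, 2) / 2**30
print(f"{gib:.2f} GiB")  # 1.00 GiB
```

At fp16 with grouped-query attention this lands around 1 GiB for an 8K context, which is why quantized caches and shorter contexts matter so much on client GPUs.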
NVIDIA deploys GPT-5.5-powered Codex to 10,000 employees, with engineers calling results 'mind-blowing'
🤖 AI · tweaktown.com · 6d
Nvidia is no longer just selling the shovels. Nemotron 3 Nano Omni is the company’s most aggressive move into AI models.
🆕 New AI · thenextweb.com · 2d · r/LLM
nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16
🏗️ LLM Infrastructure · huggingface.co · 2d · r/LocalLLaMA
Fine-Tuned LLMs on Serverless Architecture
🏗️ LLM Infrastructure · digitalocean.com · 3d
PMZFX/intel-arc-pro-b70-benchmarks: Benchmark results and performance data for the Intel Arc Pro B70 GPU (Xe2/Battlemage) - LLM inference, video generation, dual-GPU scaling.
🤖 AI · github.com · 6d · Hacker News
Can Google Win the AI Hardware Race Through TPUs?
📊 Model Serving Economics · google-ai-race.pagey.site · 5d · Hacker News
MauroCE/m3serve: Optimised BAAI/bge-m3 serving with dense + sparse + ColBERT embeddings, async dynamic batching and pipeline GPU inference
📦 Batch Embeddings · github.com · 3d · r/SideProject
A Matrix-Free Galerkin Multigrid Solver and Failure-Mode Screen for Single-GPU 3D SIMP Linear Systems
📊 Model Serving Economics · arxiv.org · 1d
AMD GPUs are finally getting the one feature that has become Nvidia's new USP
⚡ Hardware Acceleration · xda-developers.com · 7h
How we built the most performant DeepSeek V3.2, MiniMax-M2.5 and Qwen 3.5 397B on DigitalOcean NVIDIA HGX™ B300 GPU Droplets
🏗️ LLM Infrastructure · digitalocean.com · 2d
Meta signs multibillion-dollar deal for Amazon Graviton5 chips as AI compute demand outstrips $135B capex budget
🏗️ LLM Infrastructure · thenextweb.com · 5d