🐿️ Scour
📊 Model Serving Economics

GPU Costs, Inference Pricing, Batch Optimization, Resource Efficiency

16 Changes to AI in the Enterprise: 2025 Edition | Andreessen Horowitz
a16z.com·8h
🆕New AI
Jan Nano + Deepseek R1: Combining Remote Reasoning with Local Models using MCP
huggingface.co·8h·
Discuss: r/LocalLLaMA
📋MCP
Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models
arxiv.org·13h
🧠LLM Inference
Toward Environmentally Equitable AI
cacm.acm.org·1h
🖥GPUs
Introducing Active CPU pricing for Fluid compute
vercel.com·4h
🖥GPUs
Who Would Win: A State-of-the-Art Foundation Model or a Neural Net?
pub.towardsai.net·1h
🔢BitNet
What does 10x-ing effective compute get you?
lesswrong.com·22h
🏆LLM Benchmarking
Meta's V-JEPA 2 Aims to Redefine AI's Spatial Reasoning Without Video Data
gazeon.site·45m·
Discuss: Hacker News
🆕New AI
AI benchmarking tools evaluate real world performance
infoworld.com·12h
🏆LLM Benchmarking
AMD researchers reduce graphics card VRAM capacity of 3D-rendered trees from 38GB to just 52 KB with work graphs and mesh nodes - shifting CPU work to the GPU y...
tomshardware.com·6h
🖥GPUs
AMD Instinct MI60 (32gb VRAM) "llama bench" results for 10 models - Qwen3 30B A3B Q4_0 resulted in: pp512 - 1,165 t/s | tg128 68 t/s - Overall very pleased and ...
preview.redd.it·19h·
Discuss: r/LocalLLaMA
🖥GPUs
Your Data Engine Is the Moat - Here’s How to Own It.
labelstud.io·22h·
Discuss: Hacker News
🆕New AI
Nvidia Can Reach $6 Trillion Market Cap on AI Growth, Loop Says
bloomberg.com·4h
🖥GPUs
6 Key Security Risks in LLMs: A Platform Engineer’s Guide
thenewstack.io·22h
🕳LLM Vulnerabilities
Greedy Is Good. Less Greedy May Be Better
gojiberries.io·16h·
Discuss: Hacker News
🏆LLM Benchmarking
Slashing CI Costs at Uber
uber.com·7h·
Discuss: Hacker News
🛠️Build Optimization
Llama.cpp vs API - Gemma 3 Context Window Performance
reddit.com·13h·
Discuss: r/LocalLLaMA
💾Prompt Caching
Getting an LLM to set its own temperature
amanvir.com·17h·
Discuss: Hacker News
🕳LLM Vulnerabilities
The dynamics and geometry of choice in the premotor cortex
nature.com·17h
📊Embeddings
Some Thoughts On The Future “Doudna” NERSC-10 Supercomputer
nextplatform.com·12h·
Discuss: Hacker News
🖥GPUs