🧠 Inference Serving

Request Batching, Model Loading, Throughput Optimization, Latency Management

Jan Nano + Deepseek R1: Combining Remote Reasoning with Local Models using MCP
huggingface.co · 14h
Discuss: r/LocalLLaMA
📋MCP
Slashing CI Costs at Uber
uber.com · 13h
Discuss: Hacker News
🛠️Build Optimization
Alleviating User-Sensitive bias with Fair Generative Sequential Recommendation Model
arxiv.org · 19h
🎛️Feed Filtering
How to Streamline Complex LLM Workflows Using NVIDIA NeMo-Skills
developer.nvidia.com · 23h
🏆LLM Benchmarking
Who Would Win: A State-of-the-Art Foundation Model or a Neural Net?
pub.towardsai.net · 7h
🔢BitNet
Build a Personalized AI Assistant with Postgres
supabase.com · 16h
💾Prompt Caching
Introducing Active CPU pricing for Fluid compute
vercel.com · 10h
🖥GPUs
Trace Distributed Map states for AWS Step Functions with Datadog
datadoghq.com · 23h
Distributed systems
MUVERA: Making multi-vector retrieval as fast as single-vector search
research.google · 9h
🎯Qdrant
What Inflection AI Learned Porting Its LLM Inference Stack from NVIDIA to Intel Gaudi
thenewstack.io · 4h
⚡Hardware Acceleration
16 Changes to AI in the Enterprise: 2025 Edition | Andreessen Horowitz
a16z.com · 14h
📊Model Serving Economics
Introducing ByteDance's Revolutionary Seedance 1.0 AI Video Generation Model
sheaupei.com · 8h
🔃Feed Algorithms
AI Training Load Fluctuations at Gigawatt-scale – Risk of Power Grid Blackout?
semianalysis.com · 4h
Discuss: Hacker News
🖥GPUs
Toward Environmentally Equitable AI
cacm.acm.org · 7h
🖥GPUs
Show HN: Anagnorisis, local data-management with trainable recommendation engine
github.com · 3m
Discuss: Hacker News
🎯Qdrant
Introducing Northguard and Xinfra: Scalable log storage at Lin...
linkedin.com · 23h
📂LiteFS
DRIFT: Data Reduction via Informative Feature Transformation- Generalization Begins Before Deep Learning starts
arxiv.org · 19h
📊Vector Databases
My AI Workflow for Understanding Any Codebase
steipete.me · 12h
🪄Prompt Engineering
Show HN: Requests-Based Google Maps Scraper
apify.com · 2h
Discuss: Hacker News
📑Inverted Indexes
Black-Box Test Code Fault Localization Driven by Large Language Models and Execution Estimation
arxiv.org · 19h
🕯️Candle