🏗️ LLM Infrastructure - emschwartz · Scour

🏗️ LLM Infrastructure

Model Serving, Inference Optimization, GPU Clusters, Production Deployment

Reducing Cold Start Latency for LLM Inference with NVIDIA Run:ai Model Streamer

developer.nvidia.com·19h·

Discuss: Hacker News

📊Model Serving Economics

LLM Enhancement with Domain Expert Mental Model to Reduce LLM Hallucination with Causal Prompt Engineering

arxiv.org·15h

🪄Prompt Engineering

Hardware and model recommendations for on-prem LLM deployment

reddit.com·7h·

Discuss: r/LocalLLaMA

📊Model Serving Economics

The First vLLM Meetup in Korea

blog.vllm.ai·19h

🏆LLM Benchmarking

No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

lesswrong.com·4h

🏆LLM Benchmarking

What will AI look like by 2030 if current trends hold?

threadreaderapp.com·2h

Chapter 1: LLM Fundamentals

cline.ghost.io·4h

🏆LLM Benchmarking

The Case for Compact AI – Communications of the ACM

dl.acm.org·11h·

Discuss: Hacker News

🏆LLM Benchmarking

Is Recursion in LLMs a Path to Efficiency and Quality?

pub.towardsai.net·19h

🧠LLM Inference

How to build AI scaling laws for efficient LLM training and budget maximization

news.mit.edu·4h

🏆LLM Benchmarking

AQUA: Attention via QUery mAgnitudes for Memory and Compute Efficient Inference in LLMs

arxiv.org·15h

🧠LLM Inference

Learnings From 2025 AI For Life Science Conference (AI engineer view)

eamag.me·19h

Inference will win ultimately

i.redd.it·4h·

Discuss: r/LocalLLaMA

🧠LLM Inference

How Coding Agents Actually Work: Inside Opencode

cefboud.com·18h·

Discuss: r/programming

🔧Developer Tools

[URGENT] Which is a reliable and affordable GPU cluster for hosting custom LLMs for business

reddit.com·9h·

Discuss: r/LocalLLaMA

Chip Industry Technical Paper Roundup: Sept 16

semiengineering.com·12h

Model Kombat by HackerRank

producthunt.com·15h

🏆LLM Benchmarking

Clarifying Model Transparency: Interpretability versus Explainability in Deep Learning with MNIST and IMDB Examples

arxiv.org·15h

🔍AI Interpretability

MAIstro – multi-agent framework for medical imaging workflows

github.com·4h·

Discuss: Hacker News

🔓Open Source Software

Automating Data Documentation with AI: How 7-Eleven Bridged the Metadata Gap

databricks.com·19h

👨‍💻AI Coding
