Reducing Cold Start Latency for LLM Inference with NVIDIA Run:ai Model Streamer
developer.nvidia.com·19h·
Discuss: Hacker News
📊Model Serving Economics
LLM Enhancement with Domain Expert Mental Model to Reduce LLM Hallucination with Causal Prompt Engineering
arxiv.org·15h
🪄Prompt Engineering
Hardware and model recommendations for on-prem LLM deployment
reddit.com·7h·
Discuss: r/LocalLLaMA
📊Model Serving Economics
The First vLLM Meetup in Korea
blog.vllm.ai·19h
🏆LLM Benchmarking
No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes
lesswrong.com·4h
🏆LLM Benchmarking
What will AI look like by 2030 if current trends hold?
threadreaderapp.com·2h
🆕New AI
Chapter 1: LLM Fundamentals
cline.ghost.io·4h
🏆LLM Benchmarking
The Case for Compact AI – Communications of the ACM
dl.acm.org·11h·
Discuss: Hacker News
🏆LLM Benchmarking
Is Recursion in LLMs a Path to Efficiency and Quality?
pub.towardsai.net·19h
🧠LLM Inference
How to build AI scaling laws for efficient LLM training and budget maximization
news.mit.edu·4h
🏆LLM Benchmarking
AQUA: Attention via QUery mAgnitudes for Memory and Compute Efficient Inference in LLMs
arxiv.org·15h
🧠LLM Inference
Learnings From 2025 AI For Life Science Conference (AI engineer view)
eamag.me·19h
🆕New AI
Inference will win ultimately
i.redd.it·4h·
Discuss: r/LocalLLaMA
🧠LLM Inference
How Coding Agents Actually Work: Inside Opencode
cefboud.com·18h·
Discuss: r/programming
🔧Developer Tools
[URGENT] Which is a reliable and affordable GPU cluster for hosting custom LLMs for business
reddit.com·9h·
Discuss: r/LocalLLaMA
🖥GPUs
Chip Industry Technical Paper Roundup: Sept 16
semiengineering.com·12h
💻Chips
Model Kombat by HackerRank
producthunt.com·15h
🏆LLM Benchmarking
MAIstro – multi-agent framework for medical imaging workflows
github.com·4h·
Discuss: Hacker News
🔓Open Source Software
Automating Data Documentation with AI: How 7-Eleven Bridged the Metadata Gap
databricks.com·19h
👨‍💻AI Coding