💬 LLMs
large language models, GPT, foundation models, inference
Running LLM Inference on Kubernetes: What It Actually Takes
☁️ Cloud Infrastructure · Content type: Blog · fairwinds.com · 6 days ago
massimo92/spark: CLI tool for serving LLMs with vLLM on NVIDIA DGX Spark. One file, zero friction.
🖥️ Hypervisors · Content type: Code · github.com · 3 hours ago · Hacker News
The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model
🤖 AI/ML · Content type: Academic · arxiv.org · 2 days ago
Why Your LLM Gets Dumber With More Context
🤖 AI/ML · siliconopera.com · 8 hours ago
Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA
🚀 MLOps · everylocalai.com · 1 day ago · DEV
Intelligent inference scheduling with llm-d on Red Hat AI
☁️ Cloud Infrastructure · developers.redhat.com · 23 hours ago
Introducing LLM as a Judge: Scaling search relevance evaluation with AI
☁️ Cloud Infrastructure · Content type: Blog · opensearch.org · 1 hour ago
Claude vs GPT-4: Which AI API Is Better for Developers? (2026)
🧠 Agentic AI · kalyna.pro · 6 days ago · DEV
Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering
🚀 MLOps · Content type: Academic · biorxiv.org · 2 days ago
Prompt Caching Explained: The AI Concept That Can Save Millions of Tokens
🔍 AI Observability · Content type: Blog · sweta-nit.medium.com · 14 hours ago
Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%
📡 Observability · zozo123.github.io · 1 day ago · Hacker News
Introducing the Third Generation of Apple’s Foundation Models
🤖 AI/ML · machinelearning.apple.com · 3 days ago · Hacker News, r/apple
CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?
🚀 MLOps · uccl-project.github.io · 16 hours ago · Hacker News
My Notes on the Progression from Context to Prompt to Harness engineering in making GPT LLMs Useful: (TUESDAY) MAMLMs
🧠 Agentic AI · Content type: News, Blog · braddelong.substack.com · 2 days ago · Substack
New comment by alroma90 in "Ask HN: Who wants to be hired? (June 2026)"
🧠 Agentic AI · Content type: Discussion · news.ycombinator.com · 13 hours ago · Hacker News
iOS 27 Security: What WWDC 2026’s AI Features Mean for Mobile App Risk
🤖 AI/ML · Content type: Blog · nowsecure.com · 2 hours ago
DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200
🤖 AI/ML · Content type: News · newsletter.semianalysis.com · 2 days ago · Hacker News
Improved performance and model support with GGUF
🤖 AI/ML · Content type: Blog · ollama.com · 6 days ago
Show HN: In-browser real LLM token counter and cost estimation
🚀 MLOps · holaclaw.ai · 7 hours ago · Hacker News
WWDC 2026: Foundation Models (& Anarlog)
🏗️ Platform Engineering · skushagra.com · 2 days ago