Scour
programming, security, AI, llms, science, finance
Scoured 1834 posts in 9.7 ms
harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.
☕ Espresso · Content type: Code · github.com · 4 days ago · Hacker News
LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents
🔷 Go, typescript · Content type: Blog · towardsai.net · 2 days ago
147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens
🔷 Go, typescript · Content type: Blog · adambien.blog · 15 hours ago
LLM Routing: From Strategy Selection to Production Architecture
🥓 Charcuterie · Content type: Blog · blog.n8n.io · 3 hours ago
Report: GKE Inference Gateway delivers up to 92% faster AI responses
☕ Espresso · Content type: Blog · cloud.google.com · 1 day ago · Hacker News
RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference
🔷 Go, typescript · Content type: Academic · arxiv.org · 14 hours ago
Philosophy
🔷 Go, typescript · Content type: Reference · docs.langchain.com · 4 days ago
Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%
🔷 Go, typescript · zozo123.github.io · 7 hours ago · Hacker News
DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200
☕ Espresso · Content type: News · newsletter.semianalysis.com · 1 day ago · Hacker News
The Inference Alpha: Maximizing Frontier Models on AMD
☕ Espresso · Content type: Blog · digitalocean.com · 3 hours ago
a place for friends of OpenJDK
🔷 Go, typescript · foojay.io · 23 hours ago
AI inference: what it is and why it matters for product managers
🔷 Go, typescript · marcabraham.com · 1 day ago
Making Local LLM Go Brrr
☕ Espresso · seanpedersen.github.io · 6 days ago
Infrastructure Options for Scalable AI Inference
☕ Espresso · Content type: Blog · mirantis.com · 19 hours ago
NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI
☕ Espresso · Content type: Blog · blogs.nvidia.com · 2 hours ago
A system programmer’s guide to LLM inference
☕ Coffee Roasting · Content type: Blog · blog.xiangpeng.systems · 2 days ago · Hacker News
DiffusionGemma: 4x Faster Text Generation
☕ Espresso · Content type: News, Blog · blog.google · 2 hours ago · Hacker News, r/LocalLLaMA, r/singularity
Using Scikit-LLM with Open-Source LLMs
🔷 Go, typescript · machinelearningmastery.com · 6 days ago
DiffusionGemma: The Developer Guide- Google Developers Blog
☕ Espresso · Content type: Blog · developers.googleblog.com · 18 hours ago · r/LocalLLaMA
Token4Token — pay-per-token inference on Gnosis + Swarm
🔷 Go, typescript · t4t.eth.link · 1 day ago · Hacker News