AI Engineering
AI engineer, ML pipelines, model deployment, inference
RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference
🤖 ai
Content type: Academic
arxiv.org · 14 hours ago
Scale Robot Reinforcement Learning with NVIDIA Isaac Lab on Amazon SageMaker AI
⚙️ MLOps
Content type: Blog
aws.amazon.com · 22 hours ago
New comment by HorizonFlowLive in "Ask HN: Who wants to be hired? (June 2026)"
🧠 LLMs
Content type: Discussion
news.ycombinator.com · 1 day ago · Hacker News
huawei-csl/KVarN: KVarN is a native vLLM KV-cache quantization backend for your agents: 3-5x more context, throughput above FP16, and FP16-level accuracy. Calibration-free, one flag.
🧠 LLM Inference
Content type: Code
github.com · 6 days ago · Hacker News
Token4Token — pay-per-token inference on Gnosis + Swarm
🧠 LLMs
t4t.eth.link · 1 day ago · Hacker News
How to Run Gemma 4 12B Locally - The Best AI For Consumer Laptops
🧠 LLM Inference
Content type: Video
youtube.com · 6 days ago
PagedAttention vs Traditional KV Cache: How vLLM Reinvented GPU Memory for LLM Inference
🧠 LLM Inference
Content type: Blog
medium.com · 2 days ago
DiffusionGemma: 4x Faster Text Generation
🤖 ai
Content type: News, Blog
blog.google · 2 hours ago · Hacker News, r/LocalLLaMA, r/singularity
The PM’s Playbook for Shipping AI Features That Actually Work in Production
💬 NLP
Content type: Blog
oreilly.com · 18 hours ago
Article Series: Securing the AI Stack: From Model to Production
⚙️ MLOps
Content type: News
infoq.com · 5 days ago
New comment by Revanthkodati in "Ask HN: Who wants to be hired? (June 2026)"
🤖 ai
drive.google.com · 2 days ago · Hacker News
🇳🇱 Go/Golang job: Senior Backend Engineer (Go) | Studio AI at Creative Fabrica (Amsterdam, Netherlands)
🤖 AI
golangprojects.com · 3 hours ago
Latest technical articles & videos.
🤖 ai
certdepot.net · 4 days ago
Breaking the Ice: Analyzing Cold Start Latency in vLLM
🧠 LLM Inference
Content type: Academic
arxiv.org · 2 days ago
mirkolenz/llmhop: Tiny, stateless Go router that dispatches OpenAI-compatible requests to single-model vLLM and sglang backends with zero external dependencies
🧠 LLMs
Content type: Code
github.com · 5 days ago · Hacker News
Bring your own evaluation framework to EvalHub
⚙️ MLOps
developers.redhat.com · 1 day ago
Modern BSA/AML compliance on Databricks
🕵️ Fraud Detection
Content type: Blog
databricks.com · 17 hours ago
Running LLM Inference on Kubernetes: What It Actually Takes
🧠 LLM Inference
Content type: Blog
fairwinds.com · 5 days ago
Azure OpenAI Architecture: The Decisions That Actually Matter (Part 2)
⚙️ MLOps
techcommunity.microsoft.com · 2 days ago
AI Governance Tools: How To Achieve Compliance and Visibility
⚖️ AI Ethics
Content type: Blog
blog.n8n.io · 3 hours ago