Scour
AI Engineering
LLM, RAG, AI systems, prompt engineering, inference
Scoured 626 posts in 10.4 ms
DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200
🧠 Machine Learning · Content type: News · newsletter.semianalysis.com · 1 day ago · Hacker News
If Claude Fable stops helping you, you'll never know
🧠 Machine Learning · Content type: Blog · jonready.com · 18 hours ago · Lobsters, Hacker News
ICYMI: Inside the Microsoft Agent Framework: How we designed a layered SDK
🔍 RAG · Content type: Blog · devblogs.microsoft.com · 22 hours ago
I built a free extension that adds shared folders + search across ChatGPT, Claude and Gemini
🔍 RAG · foldery.app · 2 days ago · r/chrome_extensions
Running LLM Inference on Kubernetes: What It Actually Takes
📊 Observability · Content type: Blog · fairwinds.com · 5 days ago
What I learned building an AI chatbot for websites and docs
✍️ Prompt Engineering · chattybox.ai · 12 hours ago · DEV, r/SideProject
The hidden bottleneck in LLM inference and the impact on MLPerf benchmarking
🧠 LLMs · edn.com · 6 days ago
TA-RAG: Tone-Aware Retrieval-Augmented Generation for Peer-Support Health Communication
🔍 RAG · Content type: Academic · arxiv.org · 2 days ago
Best practices for building a modern app with vector search
🔍 RAG · Content type: Blog · elastic.co · 2 days ago
Hybrid Search for RAG: Fix Retrieval Accuracy in AI
🔍 RAG · Content type: Blog · pingcap.com · 5 days ago
PagedAttention vs Traditional KV Cache: How vLLM Reinvented GPU Memory for LLM Inference
🧠 LLMs · Content type: Blog · medium.com · 1 day ago
The Death of the Four Golden Signals: Designing Telemetry for Non-Deterministic Infrastructure
🏗️ Backend Architecture · devops.com · 5 days ago
The AI Curse (Vis the Lisp Curse)
🔍 RAG · Content type: Blog · blog.djhaskin.com · 15 hours ago · Hacker News
How to Defend Against Prompt Injection in Production
✍️ Prompt Engineering · Content type: Reference · leanpub.com · 1 day ago · DEV
New comment by bedelloperator in "Ask HN: Who wants to be hired? (June 2026)"
🔍 RAG · Content type: Discussion · news.ycombinator.com · 20 hours ago · Hacker News
RAGAS Belongs at Design Time
🔍 RAG · Content type: Blog · rephrase-it.com · 3 days ago
Show HN: Incremental RAG ingestion, only changed chunks get re-embedded
🔍 RAG · Content type: Code · github.com · 2 days ago · Hacker News
Introducing GitLab Orbit
💻 Software Engineering · Content type: Blog · about.gitlab.com · 7 hours ago · Hacker News
Full Observability for Pinecone: Introducing an Open-Source Monitoring Stack for SaaS and BYOC
🔍 RAG · Content type: Blog · pinecone.io · 1 day ago
Powering the Inference Era: Inside the DigitalOcean Data & Learning Layer
🔍 RAG · Content type: Blog · digitalocean.com · 6 days ago