⚙️ LLMOps
LLM operations, model deployment, ML lifecycle, LLMOps
Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit
💻 AI Engineering · venturebeat.com · 1 day ago · r/LocalLLaMA

Your AI agent reads the fine print: building a RAG pipeline over EU regulations with Elasticsearch and OGX
💻 AI Engineering · Content type: Blog · elastic.co · 3 days ago

The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements
🤖 AI Agents · Content type: Academic · arxiv.org · 23 hours ago

PagedAttention vs Traditional KV Cache: How vLLM Reinvented GPU Memory for LLM Inference
🌐 Open Source AI · Content type: Blog · medium.com · 4 days ago

Mi50 32GB / GFX906 - vLLM Qwen 3.5 Configuration for Qwen 3.5:9B AWQ-4bit
🌐 Open Source AI · huggingface.co · 1 day ago · r/LocalLLaMA

SmithDB
🤖 AI Agents · Content type: News · NULL BITMAP by Justin Jaffray via buttondown.com · 4 days ago · Lobsters, Hacker News

Systematic research with LangChain's Deep Agents framework and Elasticsearch
🤖 AI Agents · Content type: Blog · elastic.co · 1 day ago

Token4Token — pay-per-token inference on Gnosis + Swarm
🌐 Open Source AI · t4t.eth.link · 3 days ago · Hacker News

For whom the door-bell tolls
💻 AI Engineering · ceph.io · 2 days ago

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)
🌐 Open Source AI · vettedconsumer.com · 6 days ago · Hacker News

Location: Lubbock, TX, USA Remote: Yes (Remote-friendly, US-based) Technologies:...
📚 RAG · Content type: Discussion · news.ycombinator.com · 2 days ago · Hacker News

massimo92/spark: CLI tool for serving LLMs with vLLM on NVIDIA DGX Spark. One file, zero friction.
🌐 Open Source AI · Content type: Code · github.com · 1 day ago · Hacker News

Kimi K2.7-Code cuts thinking tokens 30% — but practitioners say the benchmarks don't check out
💻 AI Engineering · venturebeat.com · 5 hours ago

Build a local voice agent with Red Hat OpenShift AI
🤖 AI Agents · developers.redhat.com · 5 days ago

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference
🌐 Open Source AI · Content type: Academic · arxiv.org · 2 days ago

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?
💻 AI Engineering · uccl-project.github.io · 1 day ago · Hacker News

DiffusionGemma: The Developer Guide- Google Developers Blog
🌐 Open Source AI · Content type: Blog · developers.googleblog.com · 3 days ago · r/LocalLLaMA · Cited by 1 article

How to Build an Agentic RAG with RubyLLM and Rails
💻 AI Engineering · Content type: Blog · panasiti.me · 2 days ago · Hacker News

Youssof Altoukhi (@Youssofal_)
🌐 Open Source AI · xcancel.com · 5 days ago · r/LocalLLaMA

[AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo
🌐 Open Source AI · Content type: News · latent.space · 2 days ago