⚙️ MLOps
Keywords: model serving, inference, ML pipelines, model monitoring
Scoured 149,949 posts in 11.3 ms
Dockerizing ML Models: A Data Engineer's Guide to Model Serving
🧠 LLMs · medium.com · 4d

Why Most ML Models Die After the Notebook (And How to Fix It)
🧠 LLMs · photokheecher.medium.com · 19h

AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
🤖 AI Engineering · arxiv.org · 1d

Building Scalable AI Workflows with Vertex AI Pipelines
🤖 AI Engineering · medium.com · 2h

MLOps in 2026: What Is It and Why Should You Care?
🤖 AI Engineering · flexiana.com · 23h

Inference Arena – new benchmark of local inference and training
📊 Benchmarking · kvark.github.io · 4d · Hacker News

Benchmarking LLMs with Marimo Pair
🧠 LLMs · ericmjl.github.io · 14h · Hacker News

The case for Model-as-a-Service over self-managed inference
🧠 LLMs · news.ycombinator.com · 3d · Hacker News

Model Packaging Tools Every MLOps Engineer Should Know
🧠 LLMs · freecodecamp.org · 3d

benchmarking inference of popular models on consumer hardware
📊 Benchmarking · inferena.tech · 5d · Hacker News

I Built a Production MLOps Platform from Scratch: Kubeflow, Kafka, Terraform, and Live on GCP
☸️ Kubernetes · medium.com · 6d

Overcoming inference challenges
🤖 AI Engineering · redhat.com · 3d

LLM inference engine from scratch in C++
🧠 LLMs · anirudhsathiya.com · 4d · Hacker News

Show HN: Pre-training, fine-tuning, and evals platform
🤖 AI Engineering · oumi.ai · 6d · Hacker News

vLLM introduces memory optimizations for long-context inference
🧠 LLMs · github.com · 5d · Hacker News

Automate Your Data + ML Pipelines With Apache Airflow
🤖 AI Engineering · gitanjalisoni.medium.com · 5d

Fast Heterogeneous Serving: Scalable Mixed-Scale LLM Allocation for SLO-Constrained Inference
🧠 LLMs · arxiv.org · 8h

Awesome Open Source AI
🔬 AI Research · awesomeosai.com · 5d · r/SideProject

Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC
🧠 LLMs · arxiv.org · 8h

ai-infos/vllm-gfx906-mobydick: A high-throughput and memory-efficient inference and serving engine for LLMs - Optimized for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
🧠 LLMs · github.com · 4d · r/LocalLLaMA