Scour
🛡️ Reliability
fault tolerance, resilience, SRE, uptime
Ops I did it again: The SRE Extension is out!
🔬
eBPF
Content type:
Blog
medium.com
·
2d
2 days ago
Komodor Brings Autonomous AI to SRE With Reliability-First Cloud Optimization
📦
Containerization
cloudnativenow.com
·
14h
14 hours ago
The Death of the Four Golden Signals: Designing Telemetry for Non-Deterministic Infrastructure
📈
Scalability
devops.com
·
5d
5 days ago
Maintain observability during cloud outages with Datadog Disaster Recovery
📈
Scalability
Content type:
Blog
datadoghq.com
·
2d
2 days ago
Observability overload is drowning engineers
🔬
eBPF
thenewstack.io
·
13h
13 hours ago
melancholictheory/wellcake: A Kubernetes operator for Valkey — Standalone / Replication / Sentinel / Cluster, operator-driven failover, proactive zero-downtime rolling restarts, Atomic Slot Migration, S3 backups, multi-region replication.
📦
Containerization
Content type:
Code
github.com
·
3d
3 days ago
·
r/devops
Practice like you play: How Amazon scales resilience to new heights (ARC316)
🏗️
Systems Design
Content type:
Blog
blog.domb.net
·
1d
1 day ago
The single-cloud trap: why UK businesses’ multi-cloud strategy risks leaving them exposed
📈
Scalability
techradar.com
·
17h
17 hours ago
Azure Availability Zone Mapping and VM Resilience Analysis Guidance using SRE.AZURE.COM Agent
📈
Scalability
techcommunity.microsoft.com
·
2d
2 days ago
Our DNS servers use GeoDNS to direct connections to the lowest latency servers and implement automatic failover via health checks and 5 minute expiry for the...
⚖️
Load Balancing
grapheneos.social
·
4d
4 days ago
New comment by RomainB_ in "Ask HN: Who wants to be hired? (June 2026)"
📦
Containerization
Content type:
Discussion
news.ycombinator.com
·
17h
17 hours ago
·
Hacker News
Elastic brings AI-driven incident investigation to Kubernetes and observability tools
📦
Containerization
helpnetsecurity.com
·
1d
1 day ago
SwarmSense-DNN: A Trustworthy and Decentralized Neural Framework for Proactive Anomaly Defense in Consumer IoT
🤝
Consensus Protocols
Content type:
Academic
arxiv.org
·
3h
3 hours ago
The Server That Does Nothing Is Often the Most Critical
📈
Scalability
siliconopera.com
·
21h
21 hours ago
SRE Weekly Issue #520
⏱️
Performance
sreweekly.com
·
3d
3 days ago
Explore OpenSearch 3.7
🔬
eBPF
Content type:
Blog
opensearch.org
·
1d
1 day ago
The Hidden HA Gaps Costing You Uptime | Trey Isaac, SIOS Technology
📈
Scalability
Content type:
Video
youtube.com
·
2d
2 days ago
The Four Knobs of AI Agent Reliability: A DevOps View
🔧
Microservices
Content type:
Blog
talent500.com
·
15h
15 hours ago
SQL Server Always On Availability Groups and Database Master Keys: A Hidden Failover Pitfall
📈
Scalability
Content type:
Blog
dbi-services.com
·
1d
1 day ago
Scale. Speed. Trust: Three Imperatives for the AI Era
🌐
Distributed Systems
Content type:
Blog
blogs.cisco.com
·
14h
14 hours ago