Performance Profiling

We went multi-region then undid it

 🌐Distributed systems  Content type: Blog
useautumn.com · Hacker News

tensorhq/suture-stream-repair: Ultra-low-latency reverse proxy that repairs truncated & malformed JSON in LLM streaming responses (OpenAI, Anthropic, Vertex AI, Bedrock) — fixes JSONDecodeError / serde_json EOF on truncated tool calls.

 🧩Microservices  Content type: Code
github.com · Hacker News

AI Voice Agent Architecture: How Real-Time Conversational Systems Work

 🌐Distributed systems

Integrate on-device AI models into your app using Core AI - WWDC26 - Videos

 💻Programming languages

Magenta RealTime 2: Open and Local Live Music Models

 📮Message Queues

memory OS for AI agents (ranks, compresses and evolves agents memory)

 🛡️Odin
thrindex.com · Hacker News

Why AI code optimization needs production-grounded benchmarks

 📊Systems Monitoring  Content type: Blog
datadoghq.com · Hacker News

How Agoda Scaled Its Feature Store 50X with ScyllaDB

 💾Storage Engines
hackernoon.com

ashp15205/guardian-runtime: A zero-latency, local-first runtime firewall for LLMs. Intercept every prompt and response locally to stop data leaks and runaway token costs.

 📊Systems Monitoring  Content type: Code
github.com · Hacker News

How We Ditched Postgres for ClickHouse to Process 12 Billion Caches Per Day

 🗃️Database Internals  Content type: Blog
momentic.ai · Hacker News

Premature Optimization is Fun Sometimes

 SIMD Optimization

Looking Inside Chromium’s On-Device AI Stack

 📦Data Serialization  Content type: Blog
island.io · Hacker News

The Return of Rigorous Full-System Timing Simulation

 SIMD Optimization
sigarch.org · Hacker News

iSCSI vs. NVMe/TCP: The ultimate storage showdown for Red Hat OpenShift Virtualization

 💾Storage Engines

lbj96347/nemotron-3.5-asr-ios: On-device, offline speech recognition for iPhone/iPad using NVIDIA's Nemotron-3.5-ASR Streaming 0.6B (multilingual) via CoreML. SwiftUI app with mic capture + audio file import, RNN-T decoding, and live benchmark metrics (latency, RTF, memory).

 🛡️Odin  Content type: Code
github.com · Hacker News

Nex-N2-mini: A 35B Model Built for Autonomous Agents

 🛡️Odin
hackernoon.com

Show HN: A Highly Available Distributed Router for Global Realtime AI

 🌐Distributed systems  Content type: Blog
cerebrium.ai · Hacker News

NexusOS v2.0 – A zero-dependency pipeline streaming server chaos to Parquet

 🔥DataFusion

"North Mini Code"; open weights, 30B param, Canadian coding model

 💻Programming languages  Content type: Blog
cohere.com · Hacker News

Your Lambda isn't leaking memory — your metrics are lying to you

 🧠Memory Management  Content type: Blog
