⚙️ MLOps - hop1.ng.1357 · Scour

MCAP: Deployment-Time Layer Profiling for Memory-Constrained LLM Inference 📱Edge AI Optimization

Flow generation through natural language: An agentic modeling approach (11 minute read) 🪄Prompt Engineering

shopify.engineering·1d

The Data Layer Tax for Robot Learning 🧠Machine Learning

rerun.io·13h·Hacker News

LLM Quantization ✨LLMs

huggingface.co·2h·Hacker News

google-deepmind/proeval: Proactive failure discovery and efficient performance estimation for GenAI evaluation. 📱Edge AI Optimization

Lessons from Building an OTel Normalizer for GenAI (Part 1) 🪝eBPF

groundcover.com·21h·Hacker News

Monitoring LLM behavior: Drift, retries, and refusal patterns 🛡️AI Safety

venturebeat.com·5d·Hacker News

AI Infrastructure Architect · Builder · Author 🇨🇳Chinese AI

markferraz.com·7h·Hacker News

Darwinian Specialization in AI 📱Edge AI Optimization

tomtunguz.com·2d

GoogleCloudPlatform/activation-model-scanner: Verify language model safety before deployment by analyzing activation patterns 💉Prompt Injection

github.com·22h·Hacker News

AutoPyVerifier: Learning Compact Executable Verifiers for Large Language Model Outputs ✅Formal Verification

DigitalOcean Dedicated Inference: A Technical Deep Dive 📱Edge AI Optimization

digitalocean.com·5d

Scaling Pain of Coding Agent Serving: Lessons from Debugging GLM-5 at Scale 🔧Agent Tooling

z.ai·1d·Lobsters, Hacker News

Adaptive and Fine-grained Module-wise Expert Pruning for Efficient LoRA-MoE Fine-Tuning 🤖LLM

Building a High-Scale Real-Time Recommendation Engine with Feature Stores and Redis Observability ⚡Edge AI

hackernoon.com·3d

Three Cobblers, One Zhuge Liang: Making Cheaper Models Work Together 🪄Prompt Engineering

markhuang.ai·1d·Hacker News

What agentic AI borrowed from microservices (and made worse) 🔧Agent Tooling

temporal.io·1d·Hacker News

Fixing What LLMs Get Wrong (22 minute read) 🪄Prompt Engineering

thebigdataguy.substack.com·4d·Substack

umbecanessa/neural-ledger-system: An inference architecture that makes LLMs stateful. Patent pending (US 64/050,345). 🪄Prompt Engineering

github.com·2d·Hacker News

AI Observability for Large Language Model Systems: A Multi-Layer Analysis of Monitoring Approaches from Confidence Calibration to Infrastructure Tracing 🛡️AI Safety
