⚙️ MLOps
model serving, inference, ML pipelines, model monitoring
Scoured 187,614 posts in 18.5 ms
Caltech’s PrismML shrinks AI models to fit your phone without losing their mind
🤖 AI Engineering · startupfortune.com · 3d
Build Strands Agents with SageMaker AI models and MLflow
🤖 AI Engineering · aws.amazon.com · 4d
Darwinian Specialization in AI
🔬 AI Research · tomtunguz.com · 3d
Best Practices for Inference on Edge AI MCUs
🤖 AI Engineering · embedded.com · 2d
Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents
🔎 AI Interpretability · machinelearning.apple.com · 1d
AmSach/kvquant: Drop-in KV cache compressor for local LLM inference — run 70B models on 8GB RAM
🧠 LLMs · github.com · 1d · via DEV
How we built the most performant DeepSeek V3.2, MiniMax-M2.5 and Qwen 3.5 397B on DigitalOcean NVIDIA HGX™ B300 GPU Droplets
📊 Benchmarking · digitalocean.com · 4d
Belief-Guided Inference Control for Large Language Model Services via Verifiable Observations
🧠 LLMs · arxiv.org · 1d
MauroCE/m3serve: Optimised BAAI/bge-m3 serving with dense + sparse + ColBERT embeddings, async dynamic batching and pipeline GPU inference
📊 Benchmarking · github.com · 4d · via r/SideProject
Think it, Run it: Autonomous ML Pipeline Generation via Self-Healing Multi-Agent AI
🤖 AI Engineering · arxiv.org · 1d
AI Observability for Large Language Model Systems: A Multi-Layer Analysis of Monitoring Approaches from Confidence Calibration to Infrastructure Tracing
✅ Formal Verification · arxiv.org · 2d
Introducing DigitalOcean AI-Native Cloud for Production AI Workloads
🤖 AI Engineering · digitalocean.com · 3d
Adaptive and Fine-Grained Module-Wise Expert Pruning for Efficient LoRA-MoE Fine-Tuning
🧠 LLMs · arxiv.org · 2d
Compliance versus Sensibility: On the Reasoning Controllability in Large Language Models
🧠 LLMs · arxiv.org · 1d
Diagnosing Capability Gaps in Fine-Tuning Data
🤖 AI Engineering · arxiv.org · 1d
Strait: Perceiving Priority and Interference in ML Inference Serving
🧠 LLMs · arxiv.org · 1d
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora
🧠 LLMs · arxiv.org · 3d
When Your LLM Reaches End-of-Life: A Framework for Confident Model Migration in Production Systems
🧠 LLMs · arxiv.org · 1d
Efficient, VRAM-Constrained xLM Inference on Clients
📊 Benchmarking · arxiv.org · 2d
Incompressible Knowledge Probes: Estimating Black-Box LLM Parameter Counts via Factual Capacity
🧠 LLMs · arxiv.org · 3d · via Hacker News
« Page 1 · Page 3 »