🚀 Performance - hugonoss · Scour

Inside Claude Code: An Architecture Deep Dive 🖤CLI Tools

zainhas.github.io·6d·Hacker News

Benchmarking How Workflow Execution Scales on Postgres ⚖optimizing for consensus

dbos.dev·4d·Hacker News

Optimizing Effective Training Time for Meta’s Internal Recommendation/Ranking Workloads 🎓Masterclass

pytorch.org·6d·Hacker News

At Machine Speed 🦙Ollama

matthiasott.com·4d

Watch language models think. 🦙Ollama

openinterp.org·5d·Hacker News

Monitoring LLM behavior: Drift, retries, and refusal patterns 🦙Ollama

venturebeat.com·4d·Hacker News

shreyansh26/Speculative-Decoding: Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch 🦙Ollama

github.com·2d·r/LLM, r/LocalLLaMA

2026 State of Kubernetes Optimization Report ⚖optimizing for consensus

cast.ai·6d·Hacker News

Flash Attention 2 in CuteDSL: A Naive Kernel, Three Optimizations, and Where I Got Stuck ⬛Ditherpunk

kyrieblunders.bearblog.dev·4d·Hacker News

Which one is more important: more parameters or more computation? (2021) 🦙Ollama

parl.ai·4d·Hacker News

Open-weight 27B hits 38% on Terminal-Bench 2.0 (Opus 4.1 hit 38% in Aug 2025) 🕹️PICO-8

antigma.ai·5d·Hacker News

[AINews] Tasteful Tokenmaxxing 🦙Ollama

·6d·Hacker News

Show HN: We Fixed Code Throughput. Understanding Is Now the Bottleneck 📘how to use AI

blog.jaystuart.dev·6d·Hacker News

From $200 to $30: Five Layers of LLM Cost Optimization 🦙Ollama

blog.dwornikowski.com·4d·Hacker News

itayinbarr/little-coder: A coding agent optimized to smaller LLMs 🦙Ollama

github.com·1d·Hacker News

2026-04-23 INTEL XEON GOLD 6548N (16) @ 2.800GHz, 64G RAM · LesnyRumcajs grpc_bench 💫slick production values

github.com·6d·Hacker News, r/java, r/rust

From 800ms to ~25ms: harness-driven optimization of a CUDA matmul kernel 🦙Ollama

github.com·5d·Hacker News

Show HN: CSP Benchmarks – Go vs. core.async (Clojure) vs. libgoc (C) 🕹️PICO-8

github.com·4d·Hacker News

PMZFX/intel-arc-pro-b70-benchmarks: Benchmark results and performance data for the Intel Arc Pro B70 GPU (Xe2/Battlemage) - LLM inference, video generation, dual-GPU scaling. 💫slick production values

github.com·5d·Hacker News

dorukardahan/benchmark-gap: Reproducible benchmark study on coding-agent context design and GLM-5 family performance 🔓Open source software

github.com·3d·Hacker News
