⚡ Model Efficiency - jimman · Scour

Structured Context Engineering for File-Native Agentic Systems

simonwillison.net·1d

✍️Prompt Engineering

KV Cache Optimization — Why Inference Memory Explodes and How to Fix It

dev.to·4d·

Discuss: DEV

⚡LLM Optimization

Memory and Learning layer be built in-house or bought externally?

medium.com·12h·

Discuss: Hacker News

⚡LLM Optimization

Building Neuro‑OS Desktop: A Lightweight Python Desktop Environment with Adaptive Optimization

dev.to·3d·

Discuss: DEV

⚡LLM Optimization

Catching Critical Defects In TSVs And Stacked Chips

semiengineering.com·20h

🔍AI Interpretability

Quantized Tensor Train Compression For Turbulent Flow Simulation: O(log N) Scaling with Reynolds-Independent Bond Dimension

zenodo.org·1d·

Discuss: Hacker News

⚡LLM Optimization

Harmonia: Algorithm-Hardware Co-Design for Memory- and Compute-Efficient BFP-based LLM Inference

arxiv.org·5d

⚡LLM Optimization

Optimized LLM Inference Engines

rishirajacharya.com·6d

⚡LLM Optimization

Expectation and Copysets

buttondown.com·1d·

Discuss: Hacker News

⚡LLM Optimization

From Sequential to Parallel: Reformulating Dynamic Programming as GPU Kernels for Large-Scale Stochastic Combinatorial Optimization

arxiv.org·4d

⚡LLM Optimization

Tiny Titan or Overpromised Miniature? The Framework Desktop Reviewed

kirkstechtips.com·14h·

Discuss: Hacker News

✍️Prompt Engineering

VeritasAdmin/audit-grade-ai-workstation: Design rationale for a dual-GPU workstation supporting reproducible AI safety evaluation

github.com·16h·

Discuss: Hacker News

🔍AI Interpretability

Rust Memory Management: The Playroom Analogy

adacore.com·14h·

Discuss: Hacker News

Container Timing: measuring web components performance

blogs.igalia.com·11h·

Discuss: Hacker News

🛠️Developer Tools

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition

research.google·11h·

Discuss: Hacker News

⚡LLM Optimization

Geometrically Allocated Ads in AI Conversations

june.kim·1d·

Discuss: Hacker News

🔍AI Interpretability

Show HN: Insurance AI Benchmark – 510 scenarios from production

huggingface.co·1d·

Discuss: Hacker News

🔍AI Interpretability

Getting Started with Sapphire Edge+ and AMD Embedded+

hackster.io·19h·

Discuss: Hacker News

🔍AI Interpretability

Show HN: ContinualCode – a coding agent that updates its weights from feedback

sdan.github.io·1d·

Discuss: Hacker News

✍️Prompt Engineering

Khronos at 25: Shaping Visual Computing with Open Standards

khronos.org·14h·

Discuss: Hacker News

⚡LLM Optimization
