⚙ Context engineering - laurynas

🤖agents Discussion

news.ycombinator.com··Hacker News

RohiRIK/OpenLtm: Long-Term Memory plugin for Claude Code — semantic search, context injection, session learning

🗃️databases Code

github.com··Hacker News, r/SideProject, r/mcp, r/vibecoding

Hindsight Is the Fastest-Growing Open-Source AI Memory Project Ever

🎯Reranking Blog

hindsight.vectorize.io··Hacker News

a shared format for agent memory

🔍Semantic Search

universalmemoryprotocol.io··Hacker News

Less-relevant results

How we fight GPU scarcity without compromise

🧪Property-based Testing Blog

equixly.com··Hacker News

Designing Memory for a Minimal Rust Coding Agent, Without a Vector Store

🔍Semantic Search

xavierforge.dev··Hacker News

A system programmer’s guide to LLM inference

✨UI generation Blog

blog.xiangpeng.systems··Hacker News

Search sound libraries with natural language, on-device AI

🎯Reranking

curlo.ahrisy.com··Hacker News

Version Control and Agent Audit Platform

🤖agents

cognatoai.com··Hacker News

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

🧪Property-based Testing Academic

arxiv.org·

The economics of speculative decoding

🔍AI Interpretability Blog

fergusfinn.com··Hacker News

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

🌊Stream Processing

local-llm.utop.workers.dev··Hacker News

KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.

🎮Deterministic Simulation Code

github.com··Hacker News

Get Started with Meko: Agent Memory with Built-In Discernment

🤖agents Blog

yugabyte.com··Hacker News

Show HN: Ext-Infer

🔍Semantic Search

infer.displace.tech··Hacker News

STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control

🎯Reranking Academic

arxiv.org·

Show HN: The agent that builds and operates its own SaaS tools

🤖agents

craftbot.live··Hacker News

Machinic Psychopharmacology: Do LLMs Self-Medicate?

🎮Reinforcement Learning

lesswrong.com··Hacker News

RedKnot: Efficient Long-Context LLM Serving with Head-Aware KV Reuse and SegPagedAttention

Memory OS for AI agents (ranks, compresses, and evolves agent memory)

Coding Agent Memory Benchmarks
