OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
arxiv.org·18h
🧠LLM Inference
Assuring Agent Safety Evaluations By Analysing Transcripts
lesswrong.com·12h
🏆LLM Benchmarking
How View Caching in Rails Works (2020)
honeybadger.io·8h·
Discuss: Hacker News
โณLazy Loading
Benchmarking LLM Inference on RTX 4090 / RTX 5090 / RTX PRO 6000 #2
reddit.com·4h·
Discuss: r/LocalLLaMA
๐Ÿ—๏ธLLM Infrastructure
N8n vs. Windmill vs. Temporal
blog.arcbjorn.com·22h·
Discuss: Hacker News
🚀Async Optimization
SLip - An aspiring Common Lisp environment in the browser.
lisperator.net·9h·
Discuss: r/programming
🌿Leptos
GoMem is a high-performance memory allocator library for Go
github.com·19h
🧠Memory Allocators
How different AI engines generate and cite answers
searchengineland.com·10h
📊Feed Optimization
QUIC! Jump to User Space!
hackaday.com·6h
⚡QUIC Protocol
How we built a structured Streamlit Application Framework in Snowflake
about.gitlab.com·22h
🔧Developer tools
Show HN: Logiq – A single bot that manages your Discord server end-to-end
github.com·22h·
Discuss: Hacker News
๐Ÿ๏ธIslands Architecture
MECE – The AI Principle You'll Never Stop Using After Reading This
pub.towardsai.net·11h
🔍AI Interpretability
Supercharge your Enterprise BI: How to approach your migration to AI/BI
databricks.com·1h
🏗️Infrastructure Economics
Multi-Core By Default
rfleury.com·20h
🧵Concurrency
VLLM Predicted Outputs
cascadetech.ai·1h·
Discuss: Hacker News
๐Ÿ—๏ธLLM Infrastructure
🎲 Full-Stack React.js Chat with AI SDK
robinwieruch.de·15h
🦕Deno
Introducing modrpc, a modular RPC framework
reddit.com·8h·
Discuss: r/rust
📋MCP
MultiPar 1.3.3.5 Beta / 1.3.2.9
majorgeeks.com·14h
📄File Formats
My old Infocom transcripts
blog.zarfhome.com·23h
📟Terminals