InferenceMAX – open-source Inference Frequent Benchmarking
github.com·3h·
Discuss: Hacker News
🏗️LLM Infrastructure
YouTube gets ~5% CTR lift on Shorts by replacing embedding tables with Semantic IDs
shaped.ai·23h
📊Feed Optimization
PostGIS Performance: Indexing and EXPLAIN
crunchydata.com·9h
🔍Query Optimization
GoMem is a high-performance memory allocator library for Go
github.com·21h
🧠Memory Allocators
OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
arxiv.org·19h
🧠LLM Inference
Benchmarking LLM Inference on RTX 4090 / RTX 5090 / RTX PRO 6000 #2
reddit.com·5h·
Discuss: r/LocalLLaMA
🏗️LLM Infrastructure
Explicit Lossless Vertex Expanders!
gilkalai.wordpress.com·13h
🔬RaBitQ
Iterated Development and Study of Schemers (IDSS)
lesswrong.com·9h
🆕New AI
You don't avoid the chaos. You filter it.
threadreaderapp.com·6h
🧹Spam Filters
Neural Networks from Scratch in Python: Simpler Than You Think
hamza.se·2h·
Discuss: Hacker News
📊Vector Databases
Building Long-Term Memory for AI Agents: The Complete Guide to Making Your AI Remember…
pub.towardsai.net·9h
🆕New AI
OpenAI's newly launched Sora 2 makes AI's environmental impact impossible to ignore
techxplore.com·12h
🆕New AI
Evaluating Gemini 2.5 Deep Think's math capabilities
epoch.ai·9h·
Discuss: Hacker News
🏆LLM Benchmarking
Understanding conflict resolution and avoidance in PostgreSQL: a complete guide
pgedge.com·3h·
Discuss: r/programming
🔄Eventual Consistency
Scaling Time-Series Data for AI Models
singlestore.com·8h
🎛️Feed Filtering
From Text to Token: How Tokenization Pipelines Work
paradedb.com·23h
🔤Tokenization
Can AI Co-Design Distributed Systems? Scaling from 1 GPU to 1k
harvard-edge.github.io·1h·
Discuss: Hacker News
🌐Distributed systems
Physics-informed AI excels at large-scale discovery of new materials
phys.org·8h
🏆LLM Benchmarking
How different AI engines generate and cite answers
searchengineland.com·11h
📊Feed Optimization