Model Quantization, Inference Optimization, GGUF Format, Privacy-preserving AI

OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
arxiv.org·22h
Cache Theory
LLMs Are Transpilers (2025-10-10)
alloc.dev·1d·
Discuss: Hacker News
🔄Language Evolution
A small number of samples can poison LLMs of any size
dev.to·1d·
Discuss: DEV
🎵Audio ML
RND1: Simple, Scalable AR-to-Diffusion Conversion
radicalnumerics.ai·1d·
Discuss: Hacker News
🔍Vector Forensics
Three Solutions to Nondeterminism in AI
blog.hellas.ai·2d·
Discuss: Hacker News
🎯Performance Proofs
Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks'
gilesthomas.com·1h·
Discuss: Hacker News
🎧Learned Audio
LoRA Explained: Faster, More Efficient Fine-Tuning with Docker
docker.com·1d
🌀Brotli Internals
YouTube gets ~5% CTR lift on Shorts by replacing embedding tables with Semantic IDs
shaped.ai·1d
📊Feed Optimization
vLLM Predicted Outputs
cascadetech.ai·5h·
Discuss: Hacker News
🎙️Whisper
GNN Predictions: Hidden Bugs and the Verification Nightmare by Arvind Sundararajan
dev.to·4h·
Discuss: DEV
⚙️Proof Engineering
A gentle introduction to Generative AI: Historical perspective
medium.com·1h·
Discuss: Hacker News
🧠Learned Codecs
Hardware Vulnerability Allows Attackers to Hack AI Training Data – NC State News
news.ncsu.edu·5h·
Discuss: Hacker News
🔐RISC-V Cryptography
SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference
arxiv.org·22h·
Discuss: r/LLM
🧠Machine Learning
LLM Optimization Notes: Memory, Compute and Inference Techniques
gaurigupta19.github.io·4d·
Discuss: Hacker News
🧮Compute Optimization
Neuro-Symbolic AI
en.wikipedia.org·12h·
Discuss: Hacker News
🔲Cellular Automata
Evaluating Gemini 2.5 Deep Think's math capabilities
epoch.ai·12h·
Discuss: Hacker News
🎯Performance Proofs
Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management
arxiv.org·1d
🧮Prolog Parsing
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning
arxiviq.substack.com·1d·
Discuss: Substack
Incremental Computation
Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
arxiv.org·3d
🧮Kolmogorov Complexity
The Hidden Oracle Inside Your AI: Unveiling Data Density with Latent Space Magic by Arvind Sundararajan
dev.to·2d·
Discuss: DEV
🧠Machine Learning