Model Quantization, Inference Optimization, GGUF Format, Privacy-preserving AI

Is GRPO Broken?
neelsomaniblog.com·5h
🛡️Byzantine Consensus
Why LLMs cannot reach GenAI, but why it looked like they could
haversine.substack.com·8h
🏗️AI Infrastructure
[R] DeepSeek 3.2's sparse attention mechanism
reddit.com·1d
🏗️AI Infrastructure
Distribution Preference Optimization: A Fine-grained Perspective for LLM Unlearning
arxiv.org·4d
🏠Self-hosted AI
Revisiting Mixout: An Overlooked Path to Robust Finetuning
arxiv.org·2d
📱Edge AI
Decentralized Intelligence: Empowering Autonomous Systems with Localized Learning by Arvind Sundararajan
dev.to·1d
🤝Federated Learning
Get RICH or Die Scaling: Profitably Trading Inference Compute for Robustness
arxiv.org·2d
📱Edge AI
ECLipsE-Gen-Local: Efficient Compositional Local Lipschitz Estimates for Deep Neural Networks
arxiv.org·3d
📱Edge AI
Can Speech LLMs Think while Listening?
arxiv.org·1d
🎙️Whisper
Unraveling LCRE-Mediated Chromatin Loops: A Predictive Model for Gene Expression Fine-Tuning in Desert Genomes
dev.to·7h
🧬Computational Biology
From Documents to Dialogue: A step-by-step RAG Journey
dev.to·16h
🎙️Whisper
Quantifying the Accuracy-Interpretability Trade-Off in Concept-Based Sidechannel Models
arxiv.org·3d
📱Edge AI
From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses
arxiv.org·1d
🏠Self-hosted AI
CaRT: Teaching LLM Agents to Know When They Know Enough
arxiv.org·1d
🏗️AI Infrastructure
h1: Bootstrapping LLMs to Reason over Longer Horizons via Reinforcement Learning
arxiv.org·2d
🏗️AI Infrastructure
Latency vs. Accuracy for LLM Apps — How to Choose and How a Memory Layer Lets You Win Both
dev.to·3d
🏗️AI Infrastructure
Graph-based LLM over Semi-Structured Population Data for Dynamic Policy Response
arxiv.org·3d
🏗️AI Infrastructure
H1B-KV: Hybrid One-Bit Caches for Memory-Efficient Large Language Model Inference
arxiv.org·3d
🏗️AI Infrastructure
Less Is More: Recursive Reasoning with Tiny Networks
github.com·2d
📱Edge AI