⚡ Quantization - buckman · Scour

Modern LZ Compression Part 2: FSE and Arithmetic Coding 📊Shannon Entropy

glinscott.github.io·2d·Hacker News

How to Explain AI to a Friend Who Doesn’t Follow Tech 🤖GenAI

hongkiat.com·5d

Solving Hidden Number Problems Without Lattices 🔐Post-Quantum Crypto

leetarxiv.substack.com·2h·Substack, r/programming

10GB VRAM Local LLM: The Complete Setup Guide (2026) 🟩Nvidia

sitepoint.com·4d

BranchyNet: Teaching Neural Networks When to Stop Thinking 🤖AI Inference

Local SLM as a compression layer for cloud API calls 💻Local LLMs

news.ycombinator.com·2d·Hacker News

Maximal Brain Damage: Sign-Bit Flips in Neural Networks 🧠Neuromorphic Computing

mkimhi.github.io·5d

Statistical Structure and the Failure of Pointing: A System-Class Law for Compression-Based Generative Systems 🤖LLM Inference

philsci-archive.pitt.edu·19h

shreyansh26/Speculative-Decoding: Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch ⚡Inference

github.com·18h·r/LLM, r/LocalLLaMA

Jamie Simon and Daniel Kunin, UC Berkeley: There Will Be a Scientific Theory of Deep Learning (Podcast, April 24, 2026) 🧠Deep Learning

I Tried to Run VGG19 on a CPU… It Failed. So I Fixed It. 🤖LLM Inference

github.com·5d·DEV

Metal Lossy Compression Format 📦Parquet

ludicon.com·2d·Lobsters, Hacker News

encfuncs (3) Linux Manual Page 💻Terminal Emulators

systutorials.com·4d

RNN to Transformer NMT: PyTorch Migration with 2.8x BLEU Gain 🔥PyTorch

tildalice.io·2d

Uncertainty-aware neural networks for autonomous real-time correction in bioprinting ✨Generative AI

sciencedirect.com·6d

sc_ReplSymmSCMatrix (3) Linux Manual Page 📐Linear Algebra

systutorials.com·15h

Max-and-Omnis/Nemotron-3-Super-64B-A12B-Math-REAP-GGUF 💻Local LLMs

huggingface.co·3d·r/LocalLLaMA

Fast Attention for Short Sequences 🚀Performance

blog.qwertyforce.dev·1d·Hacker News

To run deepseek v4 flash how much max vram we need? 175 gb or 320gb? 💰Compute Costs

lushbinary.com·3d·r/LocalLLaMA

Training a Transformer to Compose One Step Per Layer (and Proving It) 🤖Large Language Models

lesswrong.com·12h
