⚡ Quantization - buckman · Scour

Learning in Low Precision 🔥PyTorch

rachitsingh.com·6d

Qwen3.6-27B-INT4 clocking 100 tps with 256k context length on 1x RTX 5090 via vllm 0.19 ⚡Inference

huggingface.co·1d·r/LocalLLaMA

The other paper that killed deep learning theory 📊ML Research

lesswrong.com·6h

Sam Reghenzi Homepage — Running Gemma 4 31B on an Apple Silicon Mac with Ollama 💻Terminal Emulators

sammyrulez.github.io·7h·Hacker News

Boosting algorithm framework for ensemble neural networks based on coordinate descent 📈Optimization

sciencedirect.com·1d

Comptime tensor 🔢NumPy

MobileNet vs EfficientNet-Lite: Pi 4 First Model 38ms Gap 🚀Performance

tildalice.io·2d

Types and Neural Networks 💻Local LLMs

brunogavranovic.com·6d·Hacker News

Additive baselines furnish no evidence for epistasis learning by MULTI-evolve 🔗Network Effects

biorxiv.org·2d

Language Modeling Without Neural Networks 🤖Large Language Models

nathan.rs·6d·Hacker News

Rapid cognitive improvement in Alzheimer's disease following perispinal etanercept administration 🔗Neuroplasticity

link.springer.com·1d

VU#518910: Ollama GGUF Quantization Remote Memory Leak 🕳LLM Vulnerabilities

kb.cert.org·5d

Ternary Bonsai: Top Intelligence at 1.58 Bits 🧠Neuromorphic Computing

news.ycombinator.com·6d·Hacker News

ParaRNN: Large-Scale Nonlinear RNNs, Trainable in Parallel 🧠LLMs

machinelearning.apple.com·4d·Hacker News

Neural Networks Explained In Plain English 🤖ML

blog.algomaster.io·6d

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model ⚡Inference

simonwillison.net·4d

Show HN: WaveletLM – wavelet-based, attention-free model with O(n log n) scaling ⚡Inference

github.com·20h·Hacker News

TurboQuant: A First-Principles Walkthrough 🔢NumPy

arkaung.github.io·12h·Lobsters, Hacker News

Generalization performance of narrow shallow neural networks in the teacher–student setting 🤖LLM Inference

iopscience.iop.org·5d

Research Log: Monet/PEER sparse experts 📊ML Research

lesswrong.com·4d
