An underqualified reading list about the transformer architecture
fvictorio.github.io·2d·
Discuss: Hacker News
🧩Attention Kernels
Our newest model: Chandra (OCR)
datalab.to·1h·
Discuss: Hacker News
🏎️TensorRT
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
arxiv.org·4d
🧩Attention Kernels
To grow, we must forget… but now AI remembers everything
doc.cc·19h
🧩Attention Kernels
After distractions, rotating brain waves may help thought circle back to the task
medicalxpress.com·1d
Flash Attention
A unified threshold-constrained optimization framework for consistent and interpretable cross-machine condition monitoring
sciencedirect.com·13h
⏱️Benchmarking
Semantic search with embeddings in PHP: a hands-on guide using Neuron AI and Ollama
ollama.com·12h·
Discuss: DEV
🛠Ml-eng
Emergent introspective awareness in large language models
transformer-circuits.pub·2d·
Discuss: Hacker News
Flash Attention
Brumby-14B-Base: The Strongest Attention-Free Base Model
manifestai.com·3d·
Discuss: Hacker News
🏎️TensorRT
Sparse Adaptive Attention “MoE”: How I Solved OpenAI’s $650B Problem With a £700 GPU
medium.com·4d·
Flash Attention
Breaking the Curse of Dimensionality: A Game-Changer for L
dev.to·1d·
Discuss: DEV
🧩Attention Kernels
🧠 Soft Architecture (Part B): Emotional Timers and the Code of Care (Part 5 of the SaijinOS series)
dev.to·21h·
Discuss: DEV
🤖AI Coding Tools
Metis-SPECS: Decoupling Multimodal Learning via Self-distilled Preference-based Cold Start
arxiv.org·2d
🏎️TensorRT
Clarity From Chaos: AI Super-Resolution Redefined
dev.to·1d·
Discuss: DEV
Flash Attention
Decoding non-invasive brain activity with novel deep-learning approaches
arxiv.org·3d
🧩Attention Kernels
Deep Learning — 7 : Optimize your Neural Networks through Dropouts & Regularization.
pub.towardsai.net·2d
📊Gradient Accumulation
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning
paperium.net·1d·
Discuss: DEV
🏎️TensorRT
Specialized structure of neural population codes in parietal cortex outputs
nature.com·1d
🧩Attention Kernels
Hybrid Neuro-Symbolic Reasoning for Adaptive Robotics Control in Dynamic Environments
dev.to·2h·
Discuss: DEV
ONNX Runtime