Attention Mechanisms, Large Language Models, BERT, Encoder-Decoder Architecture

Everything About Transformers
krupadave.com·4d
📡Information Theory
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
paperium.net·23h·
Discuss: DEV
🎲Bayesian Cognition
Gated DeltaNet (Linear Attention variant in Qwen3-Next and Kimi Linear)
sebastianraschka.com·7h·
Discuss: r/LLM
🤖AI
An underqualified reading list about the transformer architecture
fvictorio.github.io·3d·
Discuss: Hacker News
🎨Computational Creativity
Un-Attributability: Computing Novelty From Retrieval & Semantic Similarity
arxiv.org·5h
📝NLP
Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·1d·
Discuss: Substack
🤖AI
Spiking Neural Networks: The Future of Brain-Inspired Computing
arxiv.org·5h
🎨Computational Creativity
Unlock Autonomy: Next-Gen LLMs Learn to Decode Themselves by Arvind Sundararajan
dev.to·17h·
Discuss: DEV
💬Philosophy of Language
Don’t Just Normalize, Batch Normalize! A Guide to Stable Neural Networks
pub.towardsai.net·3h
🤖AI
A Minimal Route to Transformer Attention
neelsomaniblog.com·4d·
Discuss: Hacker News
🧮Information theory
[D] Best (free) courses on neural networks
reddit.com·1d
📝NLP
Identifying the Periodicity of Information in Natural Language
arxiv.org·5h
📝NLP
The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers?
arxiv.org·3d
🧮Information theory
Breaking the Curse of Dimensionality: A Game-Changer for L…
dev.to·2d·
Discuss: DEV
📝NLP
A Beginner’s Guide to Getting Started with add_messages Reducer in LangGraph
langcasts.com·3d·
Discuss: DEV
📝NLP
ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models
dev.to·1d·
Discuss: DEV
🤖AI
Start Speaking AI: Easy Explanations for 15 Common Terms
future.forem.com·2d·
Discuss: DEV
🤖AI
The Cargo Cult in the Machine: Why LLMs Are the Ultimate Imitators
steviee.medium.com·21h·
Discuss: Hacker News
💬Philosophy of Language
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
dev.to·23h·
Discuss: DEV
🤖AI