Beyond Standard LLMs
🎲 Bayesian Cognition
Transformer-Based Decoding in Concatenated Coding Schemes Under Synchronization Errors
arxiv.org · 15h
🧮 Information theory
Everything About Transformers
krupadave.com · 5d
📡 Information Theory
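A quick orientation for that primer: the core operation in a transformer is scaled dot-product attention, softmax(QKᵀ / √d)·V. A minimal NumPy sketch of the mechanism (illustrative only; the shapes and names are my own, not code from the linked post):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays. Returns (seq_len, d)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise similarity, scaled
    scores -= scores.max(axis=-1, keepdims=True)    # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted sum of values

# Toy usage: 4 tokens, 8-dim representations, self-attention
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```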
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization
🎲 Bayesian Cognition
Spatial Secrets: Unleashing Language Models with Unexpected Masking by Arvind Sundararajan
🎲 Bayesian Cognition
Accumulating Context Changes the Beliefs of Language Models
arxiv.org · 15h
🎲 Bayesian Cognition
Attention ISN'T all you need?! New Qwen3 variant Brumby-14B-Base leverages Power Retention technique
venturebeat.com · 43m
🔧 Workflow Automation
Hybrid channel attention network for auditory attention detection
nature.com · 1d
🎲 Bayesian Cognition
Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
arxiv.org · 15h
🧮 Information theory
Post-training methods for language models
developers.redhat.com · 13h
📝 NLP
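For context on that survey: the simplest post-training recipe, supervised fine-tuning (SFT), minimizes next-token cross-entropy on the response tokens while masking out the prompt. A toy sketch of that masked loss (hypothetical logits and names, not the article's code):

```python
import numpy as np

def sft_loss(logits, targets, loss_mask):
    """logits: (T, vocab); targets: (T,) next-token ids;
    loss_mask: (T,) 1.0 on response tokens, 0.0 on prompt tokens."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    token_nll = -log_probs[np.arange(len(targets)), targets]
    # Average the negative log-likelihood over response positions only
    return (token_nll * loss_mask).sum() / loss_mask.sum()

# Toy example: 5 positions, vocab of 10; the first 2 tokens are the prompt
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10))
targets = rng.integers(0, 10, size=5)
mask = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
print(sft_loss(logits, targets, mask))
```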
AI Summarization Optimization
📝 NLP
Open Source Context-Aware PII Classifier
🤖 AI
What Are Auto-regressive Models? A Deep Dive and Typical Use Cases
blog.pangeanic.com · 1d
🤖 AI
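The defining property of an auto-regressive model is the chain-rule factorization p(x_1, …, x_T) = ∏_t p(x_t | x_{<t}): each generated token is fed back in as input for the next step. A minimal greedy-decoding loop around a stand-in model (everything here is a hypothetical toy, not the linked post's code):

```python
import numpy as np

def model(tokens):
    """Stand-in for a trained LM: returns next-token logits over a
    10-symbol vocabulary. Deterministic toy, conditioned on the prefix."""
    rng = np.random.default_rng(sum(tokens))
    return rng.normal(size=10)

def generate(prompt, max_new_tokens=8, eos_id=9):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = model(tokens)            # condition on everything so far
        next_id = int(np.argmax(logits))  # greedy; sampling is also common
        tokens.append(next_id)            # feed the output back in
        if next_id == eos_id:
            break
    return tokens

print(generate([1, 2, 3]))
```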
An underqualified reading list about the transformer architecture
🎨 Computational Creativity
ParaScopes: What do Language Models Activations Encode About Future Text?
arxiv.org · 15h
🧮 Information theory
Beyond Broca: The Two Routes to Speaking
psychologytoday.com · 19h
🧠 Psycholinguistics
Optimizing Native Sparse Attention with Latent Attention and Local Global Alternating Strategies
arxiv.org · 15h
🤖 AI