🔄 Transformers - matan · Scour

SG-UniBuc-NLP at SemEval-2026 Task 6: Multi-Head RoBERTa with Chunking for Long-Context Evasion Detection 💬Natural Language Processing

Salca: A Sparsity-Aware Hardware Accelerator for Efficient Long-Context Attention Decoding 💬Natural Language Processing

Dissociating Decodability and Causal Use in Bracket-Sequence Transformers 🧮Information theory

Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models 💬Natural Language Processing

A Dual-Task Paradigm to Investigate Sentence Comprehension Strategies in Language Models 🧠Psycholinguistics

Automating Categorization of Scientific Texts with In-Context Learning and Prompt-Chaining in Large Language Models 💬Natural Language Processing

Multiple Additive Neural Networks for Structured and Unstructured Data 🌊Diffusion models

Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models 🔗Link Aggregation

Transformer Approximations from ReLUs 🤖AI

A New Semisupervised Technique for Polarity Analysis using Masked Language Models 💬Natural Language Processing

Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU 💬Natural Language Processing

Identifying the Achilles' Heel: An Iterative Method for Dynamically Uncovering Factual Errors in Large Language Models 💬Natural Language Processing

Large language models eroding science understanding: an experimental study 💬Natural Language Processing

Large Language Models Decide Early and Explain Later 🎲Bayesian Cognition

COPUS: Co-adaptive Parallelism and Batch Size Selection in Large Language Model Training 🔬AI Research

AdaComp: Extractive Context Compression with Adaptive Predictor for Retrieval-Augmented Large Language Models 💬Natural Language Processing

Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework 🌊Diffusion models

Stochastic Scaling Limits and Synchronization by Noise in Deep Transformer Models 🌊Diffusion models

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols 🎲Bayesian Cognition

Information Extraction from Electricity Invoices with General-Purpose Large Language Models 💬Natural Language Processing
