🤖 language models - comwena · Scour

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation 💬LLM

How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum 💬LLM

PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations 💬LLM

AutoPyVerifier: Learning Compact Executable Verifiers for Large Language Model Outputs 💬LLM

ADE: Adaptive Dictionary Embeddings -- Scaling Multi-Anchor Representations to Large Language Models 💬LLM

When 2D Tasks Meet 1D Serialization: On Serialization Friction in Structured Tasks 💬LLM

Three Models of RLHF Annotation: Extension, Evidence, and Authority 💬LLM

DPN-LE: Dual Personality Neuron Localization and Editing for Large Language Models 💬LLM

Evaluating Large Language Models on Computer Science University Exams in Data Structures 💬LLM

HealthBench Professional: Evaluating Large Language Models on Real Clinician Chats 💬LLM

Structural Generalization on SLOG without Hand-Written Rules 💬LLM

Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application 💬LLM

Domain-Adapted Small Language Models for Reliable Clinical Triage 💬LLM

Text-Utilization for Encoder-dominated Speech Recognition Models 💬LLM

Mixture of Heterogeneous Grouped Experts for Language Modeling 💬LLM

Large Language Models for Multilingual Code Intelligence: A Survey 💬LLM

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora 💬LLM

Extracting books from production language models 💬LLM

On the Trainability of Masked Diffusion Language Models via Blockwise Locality 💬LLM

From Similarity to Structure: Training-free LLM Context Compression with Hybrid Graph Priors 💬LLM
