🧠 LLMs - randomasshole · Scour

epscylonb/1386.ai.rocm: A lightweight transformer language model built from scratch in PyTorch, trained on a single consumer GPU with a full pipeline for data processing, pretraining, and instruction tuning. 🤖AI

github.com·2d·Hacker News

Shorthand for Thought: Compressing LLM Reasoning via Entropy-Guided Supertokens 🤖AI

Information Extraction from Electricity Invoices with General-Purpose Large Language Models 🤖AI

allocz/slm: zero-dependency TUI LLM chat 🎮Reinforcement Learning

github.com·1d·Hacker News, r/golang

Structural Generalization on SLOG without Hand-Written Rules 🤖AI

Factorized Latent Reasoning for LLM-based Recommendation 🎮Reinforcement Learning

Language Anchoring: A Systematic Method for LLM Multilingual Adaptation 🎮Reinforcement Learning

github.com·4d·Hacker News

What Kind of Language is Easy to Language-Model Under Curriculum Learning? 🎮Reinforcement Learning

Select to Think: Unlocking SLM Potential with Local Sufficiency 🎮Reinforcement Learning

AsishKumarDalal/memoryllm: using differentiable neural computer architecture with GPT2 to provide memory 🤖AI

github.com·5d·DEV

CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs 🎮Reinforcement Learning

itayinbarr/little-coder: A coding agent optimized to smaller LLMs 🎮Reinforcement Learning

github.com·3d·Hacker News

Delineating Knowledge Boundaries for Honest Large Vision-Language Models 🤖AI

Decoupling Knowledge and Task Subspaces for Composable Parametric Retrieval Augmented Generation 🤖AI

TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models 🎮Reinforcement Learning

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora 🤖AI

MoRFI: Monotonic Sparse Autoencoder Feature Identification 🤖AI

LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation 🤖AI

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control 🎮Reinforcement Learning

When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models 🤖AI
