💬 LLMs - chris1 · Scour

RNN to Transformer NMT: PyTorch Migration with 2.8x BLEU Gain 🧠Neural Networks

tildalice.io·6d

Learning to Orchestrate Agents in Natural Language with the Conductor 🤖AI Agents

openreview.net·3d·Hacker News

Select to Think: Unlocking SLM Potential with Local Sufficiency 📐ML Theory

epscylonb/1386.ai.rocm: A lightweight transformer language model built from scratch in PyTorch, trained on a single consumer GPU with a full pipeline for data processing, pretraining, and instruction tuning. 🤖Machine Learning

github.com·2d·Hacker News

Associative-State Universal Transformers: Sparse Retrieval Meets Structured Recurrence 📐ML Theory

Shorthand for Thought: Compressing LLM Reasoning via Entropy-Guided Supertokens 🔍RAG

MoRFI: Monotonic Sparse Autoencoder Feature Identification 🤖Machine Learning

Information Extraction from Electricity Invoices with General-Purpose Large Language Models 🔍RAG

Language Anchoring: A Systematic Method for LLM Multilingual Adaptation 🤖AI Agents

github.com·4d·Hacker News

TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models 🤖AI Agents

LLM-Flax: Generalizable Robotic Task Planning via Neuro-Symbolic Approaches with Large Language Models 🤖AI Agents

AsishKumarDalal/memoryllm: using differentiable neural computer architecture with GPT2 to provide memory 🤖Machine Learning

github.com·5d·DEV

Structural Generalization on SLOG without Hand-Written Rules 📐ML Theory

Adaptive and Fine-grained Module-wise Expert Pruning for Efficient LoRA-MoE Fine-Tuning 🤖Machine Learning

CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs 📐ML Theory

Showoff Saturday: Using LLMs + Zod to create a deterministic parsing engine for educational content. 🔍RAG

github.com·6d·r/webdev

Human-in-the-Loop Benchmarking of Heterogeneous LLMs for Automated Competency Assessment in Secondary Level Mathematics 📐ML Theory

Ceci n'est pas une explication: Evaluating Explanation Failures as Explainability Pitfalls in Language Learning Systems 📐ML Theory

Evaluating Large Language Models on Computer Science University Exams in Data Structures 📐ML Theory

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control 🎮Reinforcement Learning
