🤖 Transformers
transformer model, attention mechanism, BERT, GPT architecture
Scoured 186,638 posts in 21.9 ms
Algorithm that gets ‘under the hood’ of AI models could effectively steer their responses
🤖 AI · nature.com · 1d
Bruce on AI Engineering
🕵️ AI Agents · heyuan110.com · 7h
Zero-Cost Transparent Semiotic Awareness for Frozen Language Models (SRT-Adapter)
🧠 LLM Training · sublius.substack.com · 4d · Substack
The Sequence AI of the Week #851: DeepSeek-V4 and the Architecture of Million-Token Intelligence
🤖 AI · substackcdn.com · 1d · Substack
Soul Player C64 – a 2-layer decoder-only transformer LLM
🧠 LLM Training · blog.adafruit.com · 2d
Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis
🧠 LLM Training · proceedings.neurips.cc · 5d
cauchy221/Alignment-Whack-a-Mole-Code: The official code repo of Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models
🔍 RAG · github.com · 22h · Hacker News
Enhanced fracture network permeability prediction using attention mechanism and Kolmogorov-Arnold Networks with SHAP interpretability analysis
🤖 AI · sciencedirect.com · 4h
LLM Quantization
🧠 LLM Training · huggingface.co · 1h · Hacker News
Zffnn - Comptime Neural Network Inference Engine
🤖 AI · ziggit.dev · 1d
Associative-State Universal Transformers: Sparse Retrieval Meets Structured Recurrence
🔍 RAG · arxiv.org · 21h
Presentation: Agents, Architecture, & Amnesia: Becoming AI-Native Without Losing Our Minds
🕵️ AI Agents · infoq.com · 1d
DeepSeek open-sources V4 large language model series
🧠 LLM Training · siliconangle.com · 6d
Two Heads Are Better Than One: Async Knowledge Injection for Speech AI with Tandem Architecture
🧠 LLM Training · pub.sakana.ai · 1d · Hacker News
Back to BERT in 2026: ModernGENA as a Strong, Efficient Baseline for DNA Foundation Models
🧬 Genomics · biorxiv.org · 5d
Computation in Superposition: Two Handcrafted Models
⚛️ Quantum Computing · lesswrong.com · 1d
A First-Principles Theory of Slow Thinking and Active Perception
🧩 Cognitive Science · global-sci.com · 3d
In new Anthropic Fellows research, we discuss “introspection adapters”: a tool that allows language models to self-report behaviors they've learned during train...
🧠 LLM Training · twitter.macworks.dev · 1d
Using Bag-of-Words With PyCharm
🔍 RAG · blog.jetbrains.com · 1d
Training language models to be warm can reduce accuracy and increase sycophancy
🧠 LLM Training · nature.com · 1d