🧠 LLMs - kelvinyu1117 · Scour

Opus 4.8 Thinking keeps deteriorating on Hard Prompts English in LMArena (again)

arena.ai··r/singularity

Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent

🤖AI News

spectrum.ieee.org··Hacker News

Token4Token — pay-per-token inference on Gnosis + Swarm

t4t.eth.link··Hacker News

LLM Inference Engineering Room — Part 3: The Orchestration Layer

🤖Inference Blog

vimal-dwarampudi.medium.com·

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell

🤖Inference News Blog

developer.nvidia.com·

Anthropic’s AI fearmongering isn’t what it appears to be

🤖AI Blog

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🤖Inference Blog

adambien.blog·

How LLMs work | Practical Leaders

practical-leaders.com··Hacker News

I built an open-source persistent memory layer for AI coding agents

🦀Rust Code

github.com··r/GithubCopilot

Build a Medical Report Analyzer on Dedicated Inference with Python

digitalocean.com·

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

🤖Inference Academic

Deep Learning Weekly: Issue 458

deeplearningweekly.com·

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

aermia.com··Hacker News

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

vettedconsumer.com··Hacker News

BacteReason: A Reasoning Model for Antimicrobial Resistance Prediction

🏗️MLSys Academic

How we fight GPU scarcity without compromise

🤖Inference Blog

equixly.com··Hacker News

Best explanations of how LLMs work

🏗️MLSys Blog

vorushin.github.io··Hacker News

AI Model for Ancient Papyri.

languagehat.com·

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

🏗️MLSys Academic

Speculators v0.5.0: DFlash support and online training

developers.redhat.com·
