Researchers say they trained a foundation model from scratch for about $1,500

Covers sapientinc/HRM-Text: HRM-Text is a 1B text generation model based on the HRM architecture, strengthened by task completion and latent space reasoning. Covered by TechTalks. Discussed on Hacker News.

Training a foundation LLM from scratch costs millions and requires internet-scale data, which is why most enterprises don't bother. Sapient thinks it has a cheaper path. To overcome this brute-force scaling dogma, researchers at Sapient developed HRM-Text. HRM decouples computation into slow-evolving strategic and fast-evolving execution layers. Instead of brute-force autoregressive prediction on raw text, HRM-Text trains exclusively on instruction-response pairs. This is close to real-world enterpris...
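To make the "slow-evolving strategic and fast-evolving execution layers" idea concrete, here is a toy sketch of a two-timescale recurrence. It is an illustration only, not Sapient's implementation: the function names, the `tanh` update rule, the fixed weights, and the `period` parameter are all assumptions made for this example.

```python
import math


def step(state, drive, weight):
    """Toy recurrent update (assumed form): tanh(weight * state + drive)."""
    return [math.tanh(weight * s + d) for s, d in zip(state, drive)]


def two_timescale_run(inputs, period=4, dim=3):
    """Run a fast state every step and a slow state every `period` steps.

    Hypothetical illustration of decoupled timescales; the real HRM
    update rules are not described in the summary above.
    """
    slow = [0.0] * dim   # "strategic" state, updated rarely
    fast = [0.0] * dim   # "execution" state, updated every step
    slow_updates = 0
    for t, x in enumerate(inputs):
        # The fast state sees the current input plus the current slow state.
        drive = [xi + si for xi, si in zip(x, slow)]
        fast = step(fast, drive, 0.5)
        # The slow state only absorbs the fast state once per period.
        if (t + 1) % period == 0:
            slow = step(slow, fast, 0.9)
            slow_updates += 1
    return slow, fast, slow_updates


if __name__ == "__main__":
    seq = [[0.1, 0.2, 0.3] for _ in range(8)]
    slow, fast, n = two_timescale_run(seq, period=4)
    print("slow updates:", n)
    print("slow state:", slow)
    print("fast state:", fast)
```

With eight input steps and `period=4`, the fast state is updated eight times while the slow state is updated only twice, which is the decoupling the summary describes.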

Covered in 1 article

TechTalks · Escaping the chain-of-thought trap: What is next for LLM reasoning

Discussed on Substack