🧠 LLM Inference - emschwartz · Scour

BREAKING🚨: Stanford University just launched a FREE AI tool for researchers!

threadreaderapp.com·1d

Running LLMs in-browser via WebGPU, Transformers.js, and Chrome's Prompt API—no Ollama, no server

noaibills.app·3d·

Discuss: r/LocalLLaMA, r/SideProject, r/selfhosted

Mastering Unstructured data: The Blueprint For Efficient Solution

pub.towardsai.net·2d

🔤Tokenization

Planning Work for Our Single-Threaded Brains

linkedin.com·2d

Hardware Acceleration

jellyfin.org·3d

⚡Hardware Acceleration

NVIDIA VibeTensor: AI Just Built Its Own Deep Learning Engine… And It Actually Works (AI Revolution

youtube.com·2d

userface.ai·2d

How StrongDM’s AI team build serious software without even looking at the code

simonw.substack.com·2d·

Discuss: Substack

🏗️LLM Infrastructure

6 AI Agents, One Company

voxyz.space·1d

Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory

news.ycombinator.com·3d·

Discuss: Hacker News

Time Series Reasoning via Process-Verifiable Thinking Data Synthesis and Scheduling for Tailored LLM Reasoning

arxiv.org·23h

🏗️LLM Infrastructure

Adaptive Retrieval helps Reasoning in LLMs -- but mostly if it's not used

arxiv.org·23h

🔄LLM RAG Pipelines

Hallucinations in GPT5 – Can models say "I don't know" (June 2025)

jobswithgpt.com·3d·

Discuss: Hacker News

For real game-theoretic reasoning, we need best response in imperfect information games

weyxie.bearblog.dev·1d·

Discuss: Hacker News

🛡️AI Security

Beyond agentic coding

haskellforall.com·3d·

Discuss: Lobsters, Hacker News, r/programming

👨‍💻AI Coding

Heterogeneous Processing: A Strategy for Augmenting Moore's Law (2006)

linuxjournal.com·2d·

Discuss: Hacker News

🖥️Hardware Architecture

EBM vs. LLMs: Our Kona EBM a 96% vs. 2% Sudoku Benchmark

logicalintelligence.com·5d·

Discuss: Hacker News

🏆LLM Benchmarking

We recreated the Anthropic C compiler agent

vizops.ai·2d·

Discuss: Hacker News

⚙️Language Runtimes

Writing an LLM from scratch, part 32b -- Interventions: gradient clipping

gilesthomas.com·6d·

Discuss: Hacker News

🏆LLM Benchmarking

Generative Modeling via Drifting

lambertae.github.io·5d·

Discuss: Hacker News

📦Batch Embeddings
