Run small local LLMs in browser 3x faster

Discussed on Hacker News

The fastest WebGPU runtime on the web. Run models in the browser, add a cloud gateway, and build with one simple Sipp client.