🤖 AI new technology - Josie · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖AI Code

github.com · Hacker News, r/LLM

Orchestrate your LLM pipeline. Locally

llmforge.app · Hacker News

UniSVQ: 2-bit Unified Scalar-Vector Quantization

🤖AI Academic

Google's new open-weights model brings image-generation tricks to AI text generation

🤖AI News

theregister.com

Qwen 3.6 27B AutoRound GGUF, need your feedback

huggingface.co · r/LocalLLaMA

Intelligent inference scheduling with llm-d on Red Hat AI

developers.redhat.com

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🤖AI News Blog

blog.google · Hacker News

6. Air-Gapped Claude Code - The Claude Code SRE Handbook

har-ki.github.io · Hacker News

Why LLMs (still) lack taste

beyondtheprior.com · Hacker News

Two old GPUs I salvaged are doing more AI work than a brand new $2000 card, and I won't be upgrading anytime soon

xda-developers.com

What's in the Box? A Field Guide to AI Models

🤖AI Blog

iankduncan.com

local llm on laptop 780M GPU using llama + gemma 4 qat

🤖AI native Blog

alper.bearblog.dev

Ask HN: Any Local LLM can I run without GPU for Local Agentic workflow AI?

🤖AI native Discussion

news.ycombinator.com · Hacker News

Anthropic Reverses Course on Hidden AI Restrictions Following Developer Backlash

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

zozo123.github.io · Hacker News

If LLMs are all persona, whose persona are they?

persona.earthpilot.ai · Hacker News

Introducing the Third Generation of Apple’s Foundation Models

machinelearning.apple.com · Hacker News, r/apple

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent

🤖AI native Blog

bric.pe.kr · DEV

#070 - Anthropic walks back Fable 5's throttle, Claude Desktop hides a 1.8GB VM, HTML doubles signups

indiehacker.news

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

🤖AI Blog

adambien.blog
