💬 LLMs - omisols · Scour

New comment by alroma90 in "Ask HN: Who wants to be hired? (June 2026)"

🧠Agentic AI Discussion

news.ycombinator.com··Hacker News

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

venturebeat.com·

Apple's Foundation Models can now use third-party LLMs (Claude, Gemini) [video]

developer.apple.com··Hacker News

Token4Token — pay-per-token inference on Gnosis + Swarm

☁️Cloud Infrastructure

t4t.eth.link··Hacker News

Show HN: In-browser real LLM token counter and cost estimation

holaclaw.ai··Hacker News

LLM Cheat Sheet

🤖AI/ML Blog

drkpxl.bearblog.dev·

Why LLMs (still) lack taste

☁️Cloud Infrastructure

beyondtheprior.com··Hacker News

NexusOS v2.0 – A zero-dependency pipeline streaming server chaos to Parquet

huggingface.co··Hacker News

I Built a RAG System in 2025. The “RAG Is Dead” Posts Keep Telling Me to Delete It.

Google's new open-weights model brings image-generation tricks to AI text generation

🤖AI/ML News

theregister.com·

Gemma 4 QAT on 10GB Laptop: Local AI with 6.7GB VRAM

🖥️Hypervisors

everylocalai.com··DEV

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

🤖AI/ML Academic

What Are Tokens in LLMs?

🤖AI/ML Blog

bearisland.dev··Hacker News

Context windows in AI: why every token is a budget decision

🚀MLOps Blog

Google's DiffusionGemma generates 256 tokens in parallel and self-corrects as it goes

venturebeat.com·

6. Air-Gapped Claude Code - The Claude Code SRE Handbook

☁️Cloud Infrastructure

har-ki.github.io··Hacker News

fix(ollama): use provider thinking default in SDK session factory (#9… · openclaw/openclaw@4f3c2cd

🔬eBPF Code

Fixing a stuck Ollama runner and building a GPU watchdog

patrickmccanna.net··Hacker News

What's in the Box? A Field Guide to AI Models

🤖AI/ML Blog

iankduncan.com·

Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent

🤖AI/ML News

spectrum.ieee.org··Hacker News
