🧠 LLMs - saeedesmaili · Scour

Philosophy

🤖AI Agents Reference

docs.langchain.com·

Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite

🪟Context Windows Academic

ashp15205/guardian-runtime: A zero-latency, local-first runtime firewall for LLMs. Intercept every prompt and response locally to stop data leaks and runaway token costs.

🪟Context Windows Code

github.com··Hacker News

Apple WWDC On-Device AI Deep Dive - Google Docs

🤖Data science

gist.is··Hacker News

Why LLMs (still) lack taste

beyondtheprior.com··Hacker News

Introducing the Third Generation of Apple’s Foundation Models

🤖Machine Learning

machinelearning.apple.com··Hacker News, r/apple

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

uccl-project.github.io··Hacker News

How to Build an Agentic RAG with RubyLLM and Rails

🔍Information Retrieval Blog

panasiti.me··Hacker News

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

🧠LLM Inference

zozo123.github.io··Hacker News

Building Agents Without Harness-Engineering

rajitkhanna.com··Hacker News

Show HN: Audit any AI/data pairing with Veritrooper

🪟Context Windows

veritrooper.com··Hacker News

Auto complete tickets using Claude Code loop on telegram with linear MCP

🤖AI Agents Blog

niptao.com··Hacker News

How we fight GPU scarcity without compromise

🧠LLM Inference Blog

equixly.com··Hacker News

DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200

🧠LLM Inference News

newsletter.semianalysis.com··Hacker News

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

🎯Fine-tuning Academic

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

🪟Context Windows

aermia.com··Hacker News

DiffusionGemma 26B A4B results on my 5090

🧠LLM Inference

huggingface.co··r/LocalLLaMA

Show HN: Ext-Infer

🪟Context Windows

infer.displace.tech··Hacker News

NetX-lab/Frontier: Frontier: A Discrete-Event Simulator for Modern LLM Serving

🧠LLM Inference Code

github.com··Hacker News

Less-relevant results

The Missing Link Between Agents and Applications

🤖AI Agents Blog

langchain.com··Hacker News
