🧠 LLMs - kevincrane

🔍RAG Blog

khnsakhnm.medium.com·

Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering

🤖AI Engineering Academic

biorxiv.org·

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

🤖AI Engineering

uccl-project.github.io··Hacker News

LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents

🤖AI Engineering Blog

towardsai.net·

Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent

📐System Design News

spectrum.ieee.org··Hacker News

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🔍RAG Blog

adambien.blog·

Show HN: In-browser real LLM token counter and cost estimation

🖥️Backend Development

holaclaw.ai··Hacker News

A reporting checklist for large language models in behavioural science

🤝AI Agents Academic

nature.com·

Two old GPUs I salvaged are doing more AI work than a brand new $2000 card, and I won't be upgrading anytime soon

🛡️AI Safety

xda-developers.com·

harmansingh4163-ai/ESP-32-s3-Story-maker-LLM: 15M/42M-param Llama split across two ESP32-S3s over 3 wires — too big for either chip alone. INT4, flash mmap, bit-exact verified.

📐System Design Code

github.com··Hacker News

Prompt Caching Explained: The AI Concept That Can Save Millions of Tokens

🔌API Design Blog

sweta-nit.medium.com·

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

📐System Design

aermia.com··Hacker News

A Plea to the Labs: Let the Models Diagnose.

🛡️AI Safety Blog

tangent.bearblog.dev··Hacker News

Google's new open-weights model brings image-generation tricks to AI text generation

🤖AI Engineering News

theregister.com··Hacker News

Why LLMs (still) lack taste

📐System Design

beyondtheprior.com··Hacker News

Why Your LLM Gets Dumber With More Context

Report: GKE Inference Gateway delivers up to 92% faster AI responses

MTG Bench: Testing how well LLMs can play Magic

Orchestrate your LLM pipeline. Locally

Show HN: Ext-Infer

A Complete Beginner's Guide to Local LLM Inference
