🤖 LLMs - tompeart · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖AI Code

github.com··Hacker News, r/LLM

Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering

🤖AI Academic

Fine-tuning Multi-modal LLMs with ART: Art-based Reinforcement Training

🤖AI Academic

Intelligent inference scheduling with llm-d on Red Hat AI

developers.redhat.com·

Why LLMs (still) lack taste

beyondtheprior.com··Hacker News

Why Your LLM Gets Dumber With More Context

siliconopera.com·

Claude vs GPT-4: Which AI API Is Better for Developers? (2026)

💻Software Engineering

kalyna.pro··DEV

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

everylocalai.com··DEV

Timing Trick Cuts Energy Used in LLM Training by Up to 14 Percent

🏃Running News

spectrum.ieee.org··Hacker News

What Ollama Reveals About Local AI, Agents, and Open Models

🤖AI Blog

odsc.medium.com·

MCP Architecture Explained for Beginners: Why AI Needs a Structured Communication System

🤖AI Blog

The smartest ChatGPT users are putting local AI in front of it — here's why

Fixing a stuck Ollama runner and building a GPU watchdog

⚙System programming

patrickmccanna.net··Hacker News

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

uccl-project.github.io··Hacker News

Built and launched a research-reading and highlighting tool with Claude over a few months. Here are the things AI was surprisingly good (and bad) at.

highlyt.app··r/ClaudeAI

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

phoronix.com··r/artificial

Running Ollama on a 15W CPU sounded ridiculous until I got it working with decent results

⚙System programming

xda-developers.com·

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent

🇬🇧London Tech Blog

bric.pe.kr··DEV

Improved performance and model support with GGUF

🤖AI Blog

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

zozo123.github.io··Hacker News
