🤖 LLMs - lwshang · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖AI Code

github.com · Hacker News, r/LLM

Intelligent inference scheduling with llm-d on Red Hat AI

⚙️Systems Programming

developers.redhat.com

Cross-LLM Consistency in Inference: Evidence from Shared Interactions

🔧Compilers Academic

AI chatbots mimic fear, sadness and stress, then calm down after mindfulness exercise

medicalxpress.com

Fine-tuning Large Language Models (LLMs) using PEFT

🤖AI Blog

Report: GKE Inference Gateway delivers up to 92% faster AI responses

🤖AI Blog

cloud.google.com · Hacker News

How LLMs work | Practical Leaders

practical-leaders.com · Hacker News

How to Build Financial Services AI Agents with Claude

🤖AI Blog

odsc.medium.com

Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering

🔧Compilers Academic

Orchestrate your LLM pipeline. Locally

llmforge.app · Hacker News

How Effective Are LLM Trading Agents?

🤖AI News Blog

harbourfrontquant.substack.com · Substack

Why LLMs (still) lack taste

beyondtheprior.com · Hacker News

Law Professors Prefer AI over Peer Answers

🤖AI Academic

law.stanford.edu · Hacker News

How Large Language Models Are Creating New Security Challenges

🔧Compilers Blog

LLM Routing: From Strategy Selection to Production Architecture

🤖AI Blog

AI Evaluation: How to Test LLM Applications Properly

🤖AI Blog

High Bandwidth Flash | A New Memory for AI Data Centers and Edge Computing | Sandisk

ncnonline.net

RAG Pipeline Explained: From Query to Answer, Step by Step

🗄️Databases Blog

Stop Wasting GPU Budget: Autoscaling AI Inference on Kubernetes with KEDA

⚙️Systems Programming

cloudnativenow.com

The Inference Alpha: Maximizing Frontier Models on AMD

🤖AI Blog

digitalocean.com
