🧠 Large Language Models (LLMs) - pleto · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🔧Systems-level optimizations for LLM serving Code

github.com · Hacker News, r/LLM

How LLMs are Actually Trained

✨Model optimizations in LLMs News Blog

blog.algomaster.io

Making a Vintage LLM from Scratch

💬Prompt optimizations for LLM serving

crlf.link · Hacker News

Orchestrate your LLM pipeline. Locally

✨Model optimizations in LLMs

llmforge.app · Hacker News

Should LLM Agents Decide in Social Simulations? Comparing Finite-State and LLM-Based Decision Policies

🤖Agents using LLMs Academic

How ChatGPT Actually Works (Beginner Friendly)

🤖Agents using LLMs Blog

LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents

🔍Retrieval-augmented generation Blog

towardsai.net

Why Your LLM Gets Dumber With More Context

🔍Retrieval-augmented generation

siliconopera.com

LangChain vs LlamaIndex 2026: Response Time on 10 RAG Tasks

🔍Retrieval-augmented generation Blog Discussion

Two old GPUs I salvaged are doing more AI work than a brand new $2000 card, and I won't be upgrading anytime soon

✨Model optimizations in LLMs

xda-developers.com

Context windows in AI: why every token is a budget decision

🔍Retrieval-augmented generation Blog

Philosophy

🔍Retrieval-augmented generation Reference

docs.langchain.com

Prompt Caching Explained: The AI Concept That Can Save Millions of Tokens

🔍Retrieval-augmented generation Blog

sweta-nit.medium.com

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🔢Quantization of LLMs Blog

adambien.blog

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

🔍Retrieval-augmented generation

aermia.com · Hacker News

LLM Cheat Sheet

🔍Retrieval-augmented generation Blog

drkpxl.bearblog.dev

LLM Routing: From Strategy Selection to Production Architecture

📊AI Performance Profiling Blog

Show HN: In-browser real LLM token counter and cost estimation

💬Prompt optimizations for LLM serving

holaclaw.ai · Hacker News

Why LLMs (still) lack taste

💬Prompt optimizations for LLM serving

beyondtheprior.com · Hacker News

My Notes on the Progression from Context to Prompt to Harness engineering in making GPT LLMs Useful: (TUESDAY) MAMLMs

🔍Retrieval-augmented generation News Blog

braddelong.substack.com
