🧠 LLM - codenm.no2 · Scour

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

💬LLMs Code

github.com · Hacker News

What is Agentic RAG? Building Multi-Agent Agentic RAG Systems

🤖Large Language Models

pub.towardsai.net

Slack bot for the whole team, not per-seat

💬NLP Discussion

plugand.ai · Hacker News

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

🤖Large Language Models Academic

These open-source tools do what Claude charges for, and some do it better

xda-developers.com

Initial impressions of Claude Fable 5

🕸️WebAssembly

simonwillison.net · Hacker News

LLM Inference Handbook 2026

pub.towardsai.net

TA-RAG: Tone-Aware Retrieval-Augmented Generation for Peer-Support Health Communication

🤖Large Language Models Academic

SaqlainXoas/llm-system-patterns: A docs-first guide to LLM system design — hybrid search, embedding pipelines, reranking, and LLM-as-judge patterns.

🤖Large Language Models Code

github.com · r/LocalLLaMA, r/SideProject

Fine-Tuning vs. RAG vs. Prompting: the Definitive Decision Framework for 2026

pub.towardsai.net

Flaws in the LLM Automation Narrative

💬NLP Academic

LangChain Series #2: Models Explained — LLMs, Chat Models, and Embeddings with Practical…

🤖Large Language Models

pub.towardsai.net

A handy llama-server launcher with easy model and configuration customisation

💬NLP Code

github.com · r/LocalLLaMA

An LLM-Native Psychometric Instrument Does Not Predict LLM Behavior: Evidence Across 25 Models

💬NLP Academic

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

💬LLMs Academic

Optimizing Local LLM Inference on Constrained Hardware

🤖Large Language Models

pub.towardsai.net

AIchain Skill: A Prompt as a Reusable Object

🤖Large Language Models Code

github.com · DEV

Measuring Embedding Drift: Why Hybrid Search Saves Stale Models.

pub.towardsai.net

Rosetta Memory: Adaptive Memory for Cross-LLM Agents

🤖Agentic AI Academic

heterodoxin/graphkv: Graph-guided KV cache compression for memory-efficient LLM inference.

🤖LLM Inference Code

github.com · r/LocalLLaMA
