Large Language Models (LLMs)

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

 🔧Systems-level optimizations for LLM serving  Content type: Code
github.com··Hacker News, r/LLM

How LLMs are Actually Trained

 Model optimizations in LLMs  Content type: News  Content type: Blog
blog.algomaster.io·

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

 Model optimizations in LLMs  Content type: Academic
arxiv.org·

Making a Vintage LLM from Scratch

 💬Prompt optimizations for LLM serving
crlf.link··Hacker News

LangChain vs LlamaIndex 2026: Response Time on 10 RAG Tasks

 🔍Retrieval-augmented generation  Content type: Blog  Content type: Discussion
tildalice.io·

How ChatGPT Actually Works (Beginner Friendly)

 🤖Agents using LLMs  Content type: Blog
medium.com·

LangChain Explained: Understanding Models, Prompts, Chains, Memory, Indexes, and Agents

 🔍Retrieval-augmented generation  Content type: Blog
towardsai.net·

Orchestrate your LLM pipeline. Locally

 Model optimizations in LLMs
llmforge.app··Hacker News

Context windows in AI: why every token is a budget decision

 🔍Retrieval-augmented generation  Content type: Blog
redis.io·

Why Your LLM Gets Dumber With More Context

 🔍Retrieval-augmented generation
siliconopera.com·

Philosophy

 🔍Retrieval-augmented generation  Content type: Reference
docs.langchain.com·

lightmetal: GPU LLM Inference From a Single Java 25 JAR

 🔢Quantization of LLMs  Content type: Blog
adambien.blog·

Two old GPUs I salvaged are doing more AI work than a brand new $2000 card, and I won't be upgrading anytime soon

 Model optimizations in LLMs
xda-developers.com·

LLM Routing: From Strategy Selection to Production Architecture

 📊AI Performance Profiling  Content type: Blog
blog.n8n.io·

DiffusionGemma: Discrete diffusion in a large language model

 🔧Systems-level optimizations for LLM serving

Research Proposal: Decoupled RISC-LLM Architectures via Circadian Synaptic Consolidation

 🔍Retrieval-augmented generation
aermia.com··Hacker News

Prompt Caching Explained: The AI Concept That Can Save Millions of Tokens

 🔍Retrieval-augmented generation  Content type: Blog
sweta-nit.medium.com·

My Notes on the Progression from Context to Prompt to Harness engineering in making GPT LLMs Useful: (TUESDAY) MAMLMs

 🔍Retrieval-augmented generation  Content type: News  Content type: Blog

LLM Cheat Sheet

 🔍Retrieval-augmented generation  Content type: Blog
drkpxl.bearblog.dev·
