Large Language Models (LLMs)

Markov Chains: The Grandparents of LLMs

 Model optimizations in LLMs
dmanco.dev · Hacker News

Show HN: In-browser real LLM token counter and cost estimation

 💬Prompt optimizations for LLM serving
holaclaw.ai · Hacker News

Ask HN: Any Local LLM can I run without GPU for Local Agentic workflow AI?

 🤖Agents using LLMs  Content type: Discussion

NVIDIA A100 vs RTX 4090 for AI Workloads: The Cost Per FLOP Reality

 ⚙️AI Infrastructure Automation  Content type: Blog
fitservers.com

Generative AI in the Real World: Agentic Systems Fundamentals with Maarten Grootendorst

 🔍Retrieval-augmented generation  Content type: Audio
oreilly.com

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

 🔍Retrieval-augmented generation
venturebeat.com

Google open-sources speedy DiffusionGemma text diffusion model

 🔍Retrieval-augmented generation
siliconangle.com

LLM are universal simulators

 Model optimizations in LLMs

Google's new open-weights model brings image-generation tricks to AI text generation

 📊AI Performance Profiling  Content type: News
theregister.com

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

 🔢Quantization of LLMs  Content type: Blog
adambien.blog

local llm on laptop 780M GPU using llama + gemma 4 qat

 🔢Quantization of LLMs  Content type: Blog
alper.bearblog.dev

New comment by alroma90 in "Ask HN: Who wants to be hired? (June 2026)"

 🤖Agents using LLMs  Content type: Discussion

Don't let the LLM speak, just probe it (8 minute read)

 💬Prompt optimizations for LLM serving  Content type: Blog

How J.A.R.V.I.S. Became the Smartest Mind on Earth — What is an LLM?

 💬Prompt optimizations for LLM serving  Content type: Blog
medium.com

AI context windows: Why context quality beats context size

 🔍Retrieval-augmented generation  Content type: Blog
redis.io

[NEW MODEL] SupraLabs just released Supra1.5-50M Base (Experimental)!

 🔧Systems-level optimizations for LLM serving
huggingface.co · r/LocalLLaMA

langchain-ai/langchain langchain-core==1.4.6

 🔍Retrieval-augmented generation  Content type: Code
github.com

Report: GKE Inference Gateway delivers up to 92% faster AI responses

 🔧Systems-level optimizations for LLM serving  Content type: Blog

Tokenminning: Because Tokenmaxxing Is a Bad Idea

 💬Prompt optimizations for LLM serving

AI The Truly Environmentally Friendly Way

 ⚙️AI Infrastructure Automation
hackaday.com
