🧠 LLMs - lmilekic · Scour

Why Shrinking an AI Model Often Makes It More Useful

siliconopera.com

Breaking the Ice: Analyzing Cold Start Latency in vLLM

💉Prompt Injection Academic

Why LLMs hallucinate?

⚙️Prompt Engineering Blog

Less-relevant results

Token4Token — pay-per-token inference on Gnosis + Swarm

t4t.eth.link · Hacker News

markusheimerl/gpt: A generative pretrained transformer implementation

✨Generative AI Code

github.com · Hacker News

LLM Inference Engineering Room — Part 3: The Orchestration Layer

⚙️Prompt Engineering Blog

vimal-dwarampudi.medium.com

Phantom transitions in language model fine-tuning

⚙️Prompt Engineering Academic

How we fight GPU scarcity without compromise

⚙️Prompt Engineering Blog

equixly.com · Hacker News

A Regret Minimization Framework on Preference Learning in Large Language Models

🎮Reinforcement Learning Academic

You’re probably using Claude wrong – this $19.99 e-degree can fix that

⚙️Prompt Engineering

Why LLMs (still) lack taste

beyondtheprior.com · Hacker News

Dynamic Linear Attention

📚CS Research Academic

AIs like ChatGPT fall apart in classic 'Stroop' psychological test — and that could stand in the way of achieving artificial general intelligence

The Bill Arrives: How to Manage Agentic AI Costs at Scale

🤖AI Agents Blog

cockroachlabs.com

PagedAttention vs Traditional KV Cache: How vLLM Reinvented GPU Memory for LLM Inference

📊LLM Evaluation Blog

AI Product Builder & Full Stack Developer

⚙️Prompt Engineering

manasagrawal.online · r/sideprojects

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference

📚CS Research Academic

Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script

✨Generative AI Code

github.com · Hacker News

Claude AI Built Me a Free Tool in Minutes — Someone Else’s Version Makes $30K/Month

⚙️Prompt Engineering Blog

How LLMs work | Practical Leaders

📊LLM Evaluation

practical-leaders.com · Hacker News
