🧠 LLMs - zongyuzhang · Scour

SLUUG Talk: Demystifying Large Language Models on Linux

💡AI Reasoning Code

github.com··DEV

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

🔓Open-source Models

everylocalai.com··DEV

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

🎮RL Academic

What Ollama Reveals About Local AI, Agents, and Open Models

🕵️AI Agents Blog

odsc.medium.com·

Comprehensive evaluation of LLM capabilities for interpretation and analysis of genome-scale metabolic models in metabolic engineering

💡AI Reasoning Academic

MCP Architecture Explained for Beginners: Why AI Needs a Structured Communication System

🕵️AI Agents Blog

Claude vs GPT-4: Which AI API Is Better for Developers? (2026)

🎭Multimodal AI

kalyna.pro··DEV

Intelligent inference scheduling with llm-d on Red Hat AI

🔓Open-source Models

developers.redhat.com·

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent

⚡Quantization Blog

bric.pe.kr··DEV

Why Your LLM Gets Dumber With More Context

siliconopera.com·

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

🔓Open-source Models

phoronix.com··r/artificial

Improved performance and model support with GGUF

⚡Quantization Blog

Why LLMs (still) lack taste

beyondtheprior.com··Hacker News

New comment by alroma90 in "Ask HN: Who wants to be hired? (June 2026)"

🔧Tool Use Discussion

news.ycombinator.com··Hacker News

Fixing a stuck Ollama runner and building a GPU watchdog

🔓Open-source Models

patrickmccanna.net··Hacker News

Agentic AI vs Generative AI: Why one without the other hits a ceiling

🕵️AI Agents Blog

CommBench: Can LLMs Write Correct and Efficient GPU Communication Code?

⚡Quantization

uccl-project.github.io··Hacker News

A free diagnostic for the Claude Certified Architect exam

🔧Tool Use Discussion Tutorial

claudecertifiedarchitects.com··Hacker News

AI 101: From Prompt Engineering to Skill Engineering

🕵️AI Agents

turingpost.com·

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

🖥️Inference Compute

zozo123.github.io··Hacker News
