🧠 LLMs - saeedesmaili

TheArcForge/Hades: Unity-aware AI infrastructure for Claude Code — a knowledge graph + 88 MCP tools that let your AI agent know your project, not just grep its files.

🕸️Knowledge Graphs Code

github.com··Hacker News

SpikeDecoder: Realizing the GPT Architecture with Spiking Neural Networks

🤖Machine Learning Academic

arxiv.org·

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

🧠LLM Inference

vettedconsumer.com··Hacker News

Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM

💬Natural Language Processing News

digg.com··Hacker News

Anthropic and OpenAI both said context is the bottleneck for data agents. Here's what they didn't say.

🪟Context Windows Blog

clarilayer.com··Hacker News

When More Documents Hurt RAG: Mitigating Vector Search Dilution with Domain-Scoped, Model-Agnostic Retrieval

🔍Information Retrieval Academic

arxiv.org·

google/gemma-4-31B-it · fix: chat template — null handling, reasoning preservation, turn-tag balance, input validation

🧠LLM Inference

huggingface.co··r/LocalLLaMA

scottpurdy/llmbuffer: LLM conversation buffer with cache optimization and dynamic context.

🪟Context Windows Code

github.com··Hacker News

Sales Is the Customer Clock

🪟Context Windows

hari.computer··Hacker News

An interactive introduction to the terrific experience of rendering Arabic and its technical debt

🪟Context Windows Blog

lr0.org··Lobsters, Hacker News

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🧠LLM Inference News Blog

blog.google··Hacker News

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

🪟Context Windows Academic

arxiv.org·

Stack Overflow didn't just help AI learn to code

🤖LLM

zozo123.github.io··Hacker News

memory OS for AI agents (ranks, compresses and evolves agents memory)

🔍Information Retrieval

thrindex.com··Hacker News

The Wrong Epsilon to the Brain

🪟Context Windows

hari.computer··Hacker News

Jott2121/agent-gate: MCP server that adds a fail-closed quality gate and hash-chained receipt ledger to any AI agent workflow.

🐍Python Code

github.com··Hacker News

The Injection Paradox: Brand-Level Suppression in Safety-Trained LLM Recommendations via RAG Context Injection

🪟Context Windows Academic

arxiv.org·

Tokenminning: Because Tokenmaxxing Is a Bad Idea

🪟Context Windows

tokenminning.com··Hacker News


Show HN: RiskKernel, kill -9 an AI agent and resume it without paying twice

Tool to convert technical PDFs into RAG-ready chunks and Obsidian vaults
