🪟 Context Windows - saeedesmaili

🤖AI Agents Academic

arxiv.org·

Show HN: LLM memory without context bleed; 100% precision vs. <10% vector search

🔍Information Retrieval

tenureai.dev··Hacker News

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

🧠LLM Inference

local-llm.utop.workers.dev··Hacker News

Less-relevant results

Introducing GitLab Orbit

🧠LLMs Blog

about.gitlab.com··Hacker News

Prompt Injection in RAG Agentic Systems

🧠LLMs

ulad.net··Hacker News

Show HN: Bosun – a small model that keeps an agent's memory graph clean

🎯Fine-tuning

huggingface.co··Hacker News

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

🎮Reinforcement Learning

venturebeat.com··Hacker News

When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

🔍Information Retrieval Academic

arxiv.org·

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

🧠LLM Inference

deemwar-products.github.io··Hacker News

ashp15205/guardian-runtime: A zero-latency, local-first runtime firewall for LLMs. Intercept every prompt and response locally to stop data leaks and runaway token costs.

🧠LLMs Code

github.com··Hacker News

Show HN: Audit any AI/data pairing with Veritrooper

🧠LLMs

veritrooper.com··Hacker News

The AI Curse (Vis the Lisp Curse)

🧠LLMs Blog

blog.djhaskin.com··Hacker News

CompRank: Efficient LLM Reranking via Token-Level Compression and Decoding-Free Scoring

🔍Information Retrieval Academic

arxiv.org·

Tool to convert technical PDFs into RAG-ready chunks and Obsidian vaults

🪨Obsidian

pdf-knowledge-extractor.onrender.com··Hacker News

Is your agent extension actually working?

🤖Machine Learning Blog

developer.microsoft.com··Hacker News

Engineers building MCPs in regulated industries: what's been the hardest part?

🧠LLMs

deepsense.ai··Hacker News

hashwnath/KMCP: Open-source MCP server for your docs. Zero LLM at query time. docker compose up and go.

🏠Self-hosting Code

github.com··Hacker News

Sales Is the Customer Clock

🧠LLMs

hari.computer··Hacker News

memory OS for AI agents (ranks, compresses and evolves agents memory)

🔍Information Retrieval

thrindex.com··Hacker News

How LLMs Actually Work: A Friendly Map for Humans • oreoro

Benchmarking Large Language Models for Safety Data Extraction
