🤖 LLMs - jyunzhang · Scour

TA-RAG: Tone-Aware Retrieval-Augmented Generation for Peer-Support Health Communication

🔍RAG Academic

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

🦙Ollama Code

github.com · Hacker News

Evaluating RAG Reliability under Clean, Misleading, and Mixed Retrieval

🔍RAG Academic

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

📝NLP Academic

fix(gateway): fail closed for unknown model auth · openclaw/openclaw@85343ea

📝NLP Code

shoo99/paper-rag: A private, fully-local RAG over your own PDFs: BGE-M3 + embedded Qdrant + a local LLM via Ollama. ~150 lines, nothing leaves your machine.

🔍RAG Code

github.com · DEV

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖Machine Learning Code

github.com · Hacker News

SLUUG Talk: Demystifying Large Language Models on Linux

🤖Machine Learning Code

github.com · DEV

IA-RAG: Interval-Algebra-Driven Temporal Reasoning for Dynamic Knowledge Retrieval

🔍RAG Academic

Alvaro-Manzo/promptshift: Model-aware prompt adapter for Claude — translate any prompt to GPT, Gemini, Mistral, Llama and more

📝NLP Code

github.com · r/PromptEngineering

MolE-RAG: Molecular Structure-Enhanced Retrieval-Augmented Generation for Chemistry

🔍RAG Academic

A handy llama-server launcher with easy model and configuration customisation

📝NLP Code

github.com · r/LocalLLaMA

Long Live Fine-Tuning: Task-Specific Transformers Outperform Zero-Shot LLMs for Misinformation Response Classification on Reddit

🎭Anthropic Claude Academic

umair-tareen/philosopher-council: An eleven-philosopher LLM council - ask it questions or point it at AI-research trends. Claude-powered deliberation through the four classical branches of philosophy. Methodology, not metaphysics.

🎭Anthropic Claude Code

github.com · r/SideProject

Revisiting Vul-RAG: Reproducibility and Replicability of RAG-based Vulnerability Detection with Open-Weight Models

🔍RAG Academic

heterodoxin/graphkv: Graph-guided KV cache compression for memory-efficient LLM inference.

🤖Machine Learning Code

github.com · r/LocalLLaMA

Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script

🏠Self-hosting Code

github.com · Hacker News

Reducing Hallucinations in Complex Question Answering using Simple Graph-based Retrieval-Augmented Generation (long version)

🔍RAG Academic

Kodiqa-Solutions/Kodiqa-agent: 🧠 One agent. Every model. Zero limits. — Open-source AI coding agent that runs anywhere. 7 providers, 69 commands, local or cloud. Your terminal, your rules.

🔧Developer Tools Code

github.com · Hacker News

QCFuse: Query-Aware Cache Fusion via Compressed View for Efficient RAG Serving

🔍RAG Academic
