🤖 AI - a1k0n · Scour

I built an open-source persistent memory layer for AI coding agents

⚙️Zig Code

github.com··r/GithubCopilot

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

🎯Escape Analysis

deemwar-products.github.io··Hacker News

Energy-Efficient On-Device RAG on a Mobile NPU: System Design and Benchmark on Snapdragon X Elite

🔍RAG Academic

Alignment Defends LLMs from Property Inference Attacks

🎯Escape Analysis Academic

Benchmarking Large Language Models for Safety Data Extraction

🎯Escape Analysis Academic

The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring

🔍RAG Academic

vla.cpp: A Unified Inference Runtime for Vision-Language-Action Models

🤖Machine Learning Academic

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖Machine Learning Code

github.com··Hacker News, r/LLM

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

🔨Compiler Design Academic

fix(gateway): fail closed for unknown model auth · openclaw/openclaw@85343ea

⚙low-level programming Code

BUDDY: BUdget-Driven DYnamic Depth Routing for Adaptive Large Language Model Inference

👁️Attention Mechanisms Academic

LLM-as-a-Discriminator: When Synthetic Tables Still Look Real

🔍RAG Academic

mingusb/transformer-golf: The Fully Unrolled Transformer: An experimental repository for architecture simplification and compilation. [2026]

🤖Machine Learning Code

github.com··Hacker News

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

🔍RAG Academic

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

🤖Reinforcement Learning Academic

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

🌟Ray Tracing Code

github.com··Hacker News

Automatic Extraction of Structured Information from Brain MRI Reports Using an Open-Weight Large Language Model

🤖Transformers Academic

heterodoxin/graphkv: Graph-guided KV cache compression for memory-efficient LLM inference.

👁️Attention Mechanisms Code

github.com··r/LocalLLaMA

ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

👁️Attention Mechanisms Academic

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

👁️Attention Mechanisms Academic
