Semantic Segmentation, Context Windows, Document Boundaries, Retrieval Units

RAG Chunking Strategies That Actually Work (and Why Most Don’t)
dev.to·3h
🧪Archive Fuzzing
From Segments to Concepts: Interpretable Image Classification via Concept-Guided Segmentation
arxiv.org·8h
🧠Machine Learning
Eliminating the Precision–Latency Trade-Off in Large-Scale RAG
thenewstack.io·3d
🎯Retrieval Systems
An alternative to knowledge graphs for storing loosely structured content
fleetingswallow.com·1d
🕸️Knowledge Graphs
Algorithmic Archive Project: Use Cases (1/3)
blogs.bodleian.ox.ac.uk·2h
📊Citation Graphs
Detecting Semantic Clones of Unseen Functionality
arxiv.org·8h
🔗Binary Similarity
Teaching Models to Decide When to Retrieve: Adaptive RAG, Part 4
blog.reachsumit.com·1d
🧠Learned Indexing
Latency vs. Accuracy for LLM Apps — How to Choose and How a Memory Layer Lets You Win Both
dev.to·1h
Performance Mythology
LLM Optimization Notes: Memory, Compute and Inference Techniques
gaurigupta19.github.io·20h
💻Local LLMs
Automating construction safety inspections using a multi-modal vision-language RAG framework
arxiv.org·8h
🤖Advanced OCR
Tritium | Thoughts on the Word Spec in Rust
tritium.legal·1d
🦀Rust Macros
GPT-5-Codex is a better AI researcher than me
seangoedecke.com·12h
🧠Intelligence Compression
Paper2Video: Automatic Video Generation from Scientific Papers
arxiv.org·8h
📊Document Wavelets
Detecting Distillation Data from Reasoning Models
arxiv.org·8h
⚙️ABNF Mining
SliceMoE: Routing Embedding Slices Instead of Tokens for Fine-Grained and Balanced Transformer Scaling
arxiv.org·8h
🧮Kolmogorov Complexity
Claude Code uses WebFetch vs. WebSearch (observations, schemas, prompts)
mikhail.io·15h
🔍BitFunnel
ALHD: A Large-Scale and Multigenre Benchmark Dataset for Arabic LLM-Generated Text Detection
arxiv.org·8h
📝Text Parsing
Hyperdimensional Semantic Graph Fusion for Enhanced Knowledge Extraction & Reasoning
dev.to·18h
🧮Kolmogorov Complexity
Understanding Retrieval Augmentation for Long-Form Question Answering
arxiv.org·8h
🔍Information Retrieval
Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space
arxiv.org·8h
Information Bottleneck