🐿️ Scour
📝 Document Chunking

Semantic Segmentation, Context Preservation, Retrieval Optimization, Text Processing

From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents
arxiv.org·19h
📃Manuscript Tokenization
How AI Agents Gather Data
blog.dust.tt·2h·
Discuss: Hacker News
📥Feed Aggregation
RAG Blueprint
docs.vespa.ai·9h·
Discuss: Hacker News
🤖Archive Automation
Understanding Brotli PDF Compression
pdfa.org·9h
🌪️Brotli
How to Fix Your Context
dbreunig.com·8h·
Discuss: Hacker News
✨Effect Handlers
Flickr Foundation goes Dutch!
flickr.org·7h
🏛Digital humanities
MUVERA: Making multi-vector retrieval as fast as single-vector search
research.google·1d·
Discuss: Hacker News, r/LocalLLaMA, r/programming
🧮Vector Embeddings
How I Built a Smarter ZIP Engine with AI: My Day 9 & 10 Journey (Pagonic Project)
dev.to·3h·
Discuss: DEV
👁️Observatory Systems
7 AI Agent Frameworks for Machine Learning Workflows in 2025
machinelearningmastery.com·11h
⚡Proof Automation
Why Your Chunking Strategy Makes or Breaks Your AI System
medium.com·6d·
Discuss: Hacker News
📄Text Chunking
Kumo Surfaces Structured Data Patterns Generative AI Misses
thenewstack.io·1d
📊Graph Databases
Clustering News Articles for Topic Detection: A Technical Deep Dive
dev.to·4d·
Discuss: DEV
📚Document Clustering
davidchisnall/igk: I got Knuth'd: A compiler for documents
github.com·1d
📝Concrete Syntax
Hitchhiker’s Guide to RAG with ChatGPT API and LangChain
towardsdatascience.com·5h
📊Multi-vector RAG
An AI Agent That Interprets Papers So You Don’t Have To: Full Build Guide
hackernoon.com·12h
🔬Academic Search
Driving cost-efficiency and speed in claims data processing with Amazon Nova Micro and Amazon Nova Lite
aws.amazon.com·1d
🌊Stream Processing
Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR
arxiv.org·19h
🎙️Whisper
Using LLMs in CI/CD for semantic testing of web content
plo.ug·8h·
Discuss: Hacker News
⚡Proof Automation
Language Modeling by Language Models
arxiv.org·19h
🤖Grammar Induction
I'm a founder and I wrote an honest review of DocSend alternatives
peony.ink·6h·
Discuss: Hacker News
🏺ZIP Archaeology