📄 Semantic Chunking

Document Segmentation, Context Windows, Text Boundaries, Retrieval Units

How to Prove That An Email Was Received
metaspike.com · 3h
📄 Document Digitization
Show HN: Requests-Based Google Maps Scraper
apify.com · 2h · Discuss: Hacker News
🔍 BitFunnel
The modern text processing pipeline: Overview
newroadoldway.com · 2d · Discuss: Lobsters, r/programming
🔤 Unicode Normalization
Ask HN: Feedback on "QSS" – A Quantized Vector Search Engine in C
news.ycombinator.com · 2d · Discuss: Hacker News
🗂️ Vector Databases
Generalizing Vision-Language Models to Novel Domains: A Comprehensive Survey
arxiv.org · 1d
🤖 Advanced OCR
What are knowledge graphs and why is everyone talking about them?
dev.to · 1d · Discuss: DEV
🕸️ Graph Embeddings
QuranMorph: Morphologically Annotated Quranic Corpus
arxiv.org · 1d
📋 Document Grammar
AMF-MedIT: An Efficient Align-Modulation-Fusion Framework for Medical Image-Tabular Data
arxiv.org · 19h
🤖 Advanced OCR
Driving cost-efficiency and speed in claims data processing with Amazon Nova Micro and Amazon Nova Lite
aws.amazon.com · 6h
🌊 Stream Processing
CLGRPO: Reasoning Ability Enhancement for Small VLMs
arxiv.org · 1d
📐 Linear Logic
Kumo Surfaces Structured Data Patterns Generative AI Misses
thenewstack.io · 9h
📊 Graph Databases
Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages
arxiv.org · 1d
🤖 Manuscript AI
Recurrent Visual Feature Extraction and Stereo Attentions for CT Report Generation
arxiv.org · 19h
🤖 Advanced OCR
CLIP-GS: CLIP-Informed Gaussian Splatting for View-Consistent 3D Indoor Semantic Understanding
arxiv.org · 1d
📐 Projective Geometry
Semantic Outlier Removal with Embedding Models and LLMs
arxiv.org · 2d
🔍 Information Retrieval
Machine Learning Fundamentals: active learning project
dev.to · 7h · Discuss: DEV
🧠 Machine Learning
Launch HN: Reducto Studio (YC W24) – Build accurate document pipelines, fast
news.ycombinator.com · 2d · Discuss: Hacker News
🌀 Brotli Internals
CodeMorph: Mitigating Data Leakage in Large Language Model Assessment
arxiv.org · 1d
💻 Local LLMs
Episode-specific Fine-tuning for Metric-based Few-shot Learners with Optimization-based Training
arxiv.org · 1d
📊 Learned Metrics
Memory Safety in Web Rust System Zero Cost Secure(1750885516953300)
dev.to · 2h · Discuss: DEV
🦀 Rust Borrowing