๐Ÿฟ๏ธ ScourBrowse
LoginSign Up
You are offline. Trying to reconnect...
Copied to clipboard
Unable to share or copy to clipboard
📄 Semantic Chunking

Document Segmentation, Context Windows, Text Boundaries, Retrieval Units

Judge Alsup: Training AI On Copyrighted Works? Fair Use. Building Pirate Libraries? Not So Much
techdirt.com · 3h
⚖️ Emulation Ethics
The modern text processing pipeline: Overview
newroadoldway.com · 3d
Discuss: Lobsters, r/programming
🔤 Unicode Normalization
Show HN: Requests-Based Google Maps Scraper
apify.com · 23h
Discuss: Hacker News
🔍 BitFunnel
Launch HN: Reducto Studio (YC W24) – Build accurate document pipelines, fast
news.ycombinator.com · 3d
Discuss: Hacker News
🌀 Brotli Internals
Ask HN: Feedback on "QSS" – A Quantized Vector Search Engine in C
news.ycombinator.com · 2d
Discuss: Hacker News
🗂️ Vector Databases
SUTRA: Decoupling Concept & Language for Multilingual LLM Excellence
hackernoon.com · 1d
💻 Local LLMs
Enhancing VICReg: Random-Walk Pairing for Improved Generalization and Better Global Semantics Capturing
arxiv.org · 2d
📊 Learned Metrics
Machine Learning Fundamentals: accuracy with python
dev.to · 2d
Discuss: DEV
👁️ Observatory Systems
CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition
arxiv.org · 2d
🧮 Prolog Parsing
Entelligence vs CodeRabbit
dev.to · 6h
Discuss: DEV
🌳 Incremental Parsing
LLMs for Customized Marketing Content Generation and Evaluation at Scale
arxiv.org · 2d
📊 Feed Optimization
MiCo: Multiple Instance Learning with Context-Aware Clustering for Whole Slide Image Analysis
arxiv.org · 2d
📊 Learned Metrics
Powering Smarter AI with Precision: Image Data Annotation at AkbhcodeAI
dev.to · 2d
Discuss: DEV
🤖 Advanced OCR
CmFNet: Cross-modal Fusion Network for Weakly-supervised Segmentation of Medical Images
arxiv.org · 2d
🌀 Hyperbolic Geometry
OpenEvents V1: Large-Scale Benchmark Dataset for Multimodal Event Grounding
arxiv.org · 2d
📊 Learned Metrics
Context Consistency Learning via Sentence Removal for Semi-Supervised Video Paragraph Grounding
arxiv.org · 2d
📄 Text Chunking
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models
arxiv.org · 15h
🤖 Advanced OCR
Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages
arxiv.org · 2d
🤖 Manuscript AI
Agentic Search for Dummies
benanderson.work · 4d
Discuss: Hacker News
🔍 Information Retrieval
Why Robots Are Bad at Detecting Their Mistakes: Limitations of Miscommunication Detection in Human-Robot Dialogue
arxiv.org · 15h
🧠 Intelligence Compression