📄 Semantic Chunking

Document Segmentation, Context Windows, Text Boundaries, Retrieval Units

How to Prove That An Email Was Received
metaspike.com·14h
📄 Document Digitization
The modern text processing pipeline: Overview
newroadoldway.com·2d·
Discuss: Lobsters, r/programming
🔤 Unicode Normalization
Show HN: Requests-Based Google Maps Scraper
apify.com·14h·
Discuss: Hacker News
🔍 BitFunnel
Ask HN: Feedback on "QSS" – A Quantized Vector Search Engine in C
news.ycombinator.com·2d·
Discuss: Hacker News
🗂️ Vector Databases
Launch HN: Reducto Studio (YC W24) – Build accurate document pipelines, fast
news.ycombinator.com·2d·
Discuss: Hacker News
🌀 Brotli Internals
Referring Expression Instance Retrieval and A Strong End-to-End Baseline
arxiv.org·2d
🔍 Semantic Search
Data Curation Matters: Model Collapse and Spurious Shift Performance Prediction from Training on Uncurated Text Embeddings
arxiv.org·2d
🗂️ Vector Databases
CodeMorph: Mitigating Data Leakage in Large Language Model Assessment
arxiv.org·2d
💻 Local LLMs
SegChange-R1: Augmented Reasoning for Remote Sensing Change Detection via Large Language Models
arxiv.org·2d
📝 Concrete Syntax
Beyond the Link: Assessing LLMs' ability to Classify Political Content across Global Media
arxiv.org·2d
📄 Text Segmentation
How to sync Context across AI Assistants (ChatGPT, Claude, Perplexity...) in your browser
dev.to·1d·
Discuss: DEV
🖥️ Modern Terminals
Machine Learning Fundamentals: active learning with python
dev.to·17h·
Discuss: DEV
🧠 Machine Learning
Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering
arxiv.org·2d
🔍 Information Retrieval
Granular-Ball-Induced Multiple Kernel K-Means
arxiv.org·2d
🌀 Differential Geometry
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
arxiv.org·2d
📝 Concrete Syntax
Multimodal Political Bias Identification and Neutralization
arxiv.org·2d
🤖 Advanced OCR
Learning from Anatomy: Supervised Anatomical Pretraining (SAP) for Improved Metastatic Bone Disease Segmentation in Whole-Body MRI
arxiv.org·1d
🧠 Machine Learning
SUTRA: Decoupling Concept & Language for Multilingual LLM Excellence
hackernoon.com·18h
💻 Local LLMs
Episode-specific Fine-tuning for Metric-based Few-shot Learners with Optimization-based Training
arxiv.org·2d
📊 Learned Metrics
Enhancing VICReg: Random-Walk Pairing for Improved Generalization and Better Global Semantics Capturing
arxiv.org·2d
📊 Learned Metrics