Locality Sensitive Hashing, Jaccard Similarity, Duplicate Detection, Document Clustering

Efficient and accurate search in petabase-scale sequence repositories
nature.com · 2d · Discuss: Hacker News
🔄Burrows-Wheeler
Sorting encrypted data without decryption: a practical trick
dev.to · 8h · Discuss: DEV
Hash Functions
An enough week
blog.mitrichev.ch · 1d
📈Linear programming
Nearest Neighbor CCP-Based Molecular Sequence Analysis
arxiv.org · 19h
🔄Burrows-Wheeler
DupeGuru lets you quickly find and remove duplicate files from your drives
techspot.com · 1d
🔄Content Deduplication
YouTube gets ~5% CTR lift on Shorts by replacing embedding tables with Semantic IDs
shaped.ai · 23h
📊Feed Optimization
Homomorphism Problems in Graph Databases and Automatic Structures
arxiv.org · 19h
🔗Graph Isomorphism
[R] DeepSeek 3.2's sparse attention mechanism
reddit.com · 20h
🌀Brotli Internals
Explicit Lossless Vertex Expanders!
gilkalai.wordpress.com · 13h
💎Information Crystallography
Indexing, Hashing
dev.to · 1d · Discuss: DEV
🚀Query Optimization
Mind the Gap: Quantifying Vocabulary Mismatch in E-Commerce Site Search
searchhub.io · 1d · Discuss: Hacker News
📈Search Quality
RND1: Simple, Scalable AR-to-Diffusion Conversion
radicalnumerics.ai · 1d · Discuss: Hacker News
💻Local LLMs
Contrastive Weak-to-strong Generalization
arxiv.org · 19h
⧗Information Bottleneck
MetaGraph: Scalable annotated de Bruijn graphs for DNA indexing and alignment
github.com · 1d · Discuss: Hacker News
🔄Burrows-Wheeler
Show HN: Rebuilt Bible search app to run 100% client-side with Transformers.js
biblos.app · 2h · Discuss: Hacker News
📜Binary Philology
Fast-Convergent Proximity Graphs for Approximate Nearest Neighbor Search
arxiv.org · 2d
Range Queries
Automated Copyright Infringement Detection via Semantic Fingerprinting and Dynamic Thresholding
dev.to · 2d · Discuss: DEV
👁️Perceptual Hashing
An enough week
blog.mitrichev.ch · 1d
🧮Z3 Solver
Writing regex is pure joy. You can't convince me otherwise.
triangulatedexistence.mataroa.blog · 21h
✅Format Verification
Parameterized Complexity of s-Club Cluster Edge Deletion
arxiv.org · 1d
🧮Kolmogorov Complexity