Locality Sensitive Hashing, Jaccard Similarity, Duplicate Detection, Document Clustering

Welcome to LIL’s Data.gov Archive Search
lil.law.harvard.edu·14h
💾Data Preservation
Beyond Indexes: How Open Table Formats Optimize Query Performance
jack-vanlightly.com·2d
🚀Query Optimization
Understanding conflict resolution and avoidance in PostgreSQL: a complete guide
pgedge.com·14h
🛡️Byzantine Fault Tolerance
SLip - An aspiring Common Lisp environment in the browser.
lisperator.net·21h
🧠Lisp Dialects
Operationalizing Data Minimization for Privacy-Preserving LLM Prompting
arxiv.org·4d
💻Local LLMs
Activation Alchemist: Sculpting Stability with Functional Signatures
dev.to·14h
🔍Concolic Testing
OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
arxiv.org·1d
💻Local LLMs
From Documents to Dialogue: A step-by-step RAG Journey
dev.to·20h
📊Multi-vector RAG
Show HN: Using an LLM to sensibly sort a shopping receipt
treblig.org·1d
🔗Constraint Handling
The Rise of the Knowledge Sculptor: A New Archetype for Knowledge Work in the Age of Generative AI
arxiv.org·1d
🗺️Competency Maps
H1B-KV: Hybrid One-Bit Caches for Memory-Efficient Large Language Model Inference
arxiv.org·3d
💨Cache Optimization
TRepLiNa: Layer-wise CKA+REPINA Alignment Improves Low-Resource Machine Translation in Aya-23 8B
arxiv.org·2d
🎙️Whisper
Evaluating Gemini 2.5 Deep Think's math capabilities
epoch.ai·20h
🎯Performance Proofs
JEPAs: Unveiling the Hidden Density Oracle Within by Arvind Sundararajan
dev.to·2d
🧠Machine Learning
Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning
arxiv.org·1d
⚖️Feed Ranking
🚀 From Rejection to Reinvention: How I Built an AI That Finds My Jobs
dev.to·5h
🇨🇳Chinese Computing
Real-Time Adaptive Sparsity Optimization for Edge-Deployed AI Inference Accelerators
dev.to·1d
🌊Streaming Compression