Document Processing, Neural OCR, Multilingual Archives, Computational Philology

Welcome to LIL’s Data.gov Archive Search
lil.law.harvard.edu·22h
💾Data Preservation
Speech-to-Retrieval (S2R): A new approach to voice search
research.google·4d
🎙️Whisper
GPT-5 for AI-assisted discovery
johndcook.com·1d
🎯Performance Proofs
Opinion | The A.I. Prompt That Could End the World
future.forem.com·1d
🤖AI Curation
A new information-theory framework reveals when multi-agent AI systems truly work as a team
the-decoder.com·8h
🔲Cellular Automata
LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?
arxiv.org·1d
🔗Parser Combinators
Stress-Testing Model Specs Reveals Character Differences among Language Models
arxiv.org·1d
📋Document Grammar
Best Japanese to English Document Translation Software
dev.to·3d
🇯🇵Japanese Computing
Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data
arxiv.org·2d
🧠Learned Indexes
TRIM: Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning
arxiv.org·2d
🔨Compilers
Memory Retrieval and Consolidation in Large Language Models through Function Tokens
arxiv.org·1d
💻Programming languages
Automated Anomaly Detection in Account Takeover via Multi-Modal Graph Neural Network Fusion
dev.to·13h
🔍Vector Forensics
HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
arxiv.org·1d
Proof Automation
LogSTOP: Temporal Scores over Prediction Sequences for Matching and Retrieval
arxiv.org·2d
📊Learned Metrics
How human is the machine? Evidence from 66,000 Conversations with Large Language Models
arxiv.org·1d
🇸🇪Nordic Algorithms
Daily Artificial Intelligence Digest - Oct 10, 2025-Old
dev.to·1d
🤖AI Curation
Decoding Cultures: Why Your Video AI Isn't Truly Seeing the World by Arvind Sundararajan
dev.to·1d
🌍Cultural Algorithms
Enhanced Predictive Maintenance of Geothermal Heat Exchangers via Hybrid Bayesian Optimization and LSTM
dev.to·13h
💻Local LLMs