Transformer-Gather, Fuzzy-Reconsider: A Scalable Hybrid Framework for Entity Resolution
arxiv.org·8h
📋Document Grammar
Identity Types
bartoszmilewski.com·23h·
Discuss: Hacker News
🔤Type Theory
Token Models as Statistical Simulations: A Different Take
medium.com·1d·
Discuss: Hacker News
💻Local LLMs
Automated Contractual Dispute Resolution via Hybrid Symbolic-Probabilistic Reasoning for Ship Brokering
dev.to·11h·
Discuss: DEV
✓Automated Theorem Proving
Extending Automatic Machine Translation Evaluation to Book-Length Documents
arxiv.org·8h
⚙️Compression Benchmarking
SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Transcription
arxiv.org·8h
🎵Audio ML
LingoDB – Data Processing with Compiler Technology
lingo-db.com·2d·
Discuss: Hacker News
🔨Compilers
Taking a Look at Compression Algorithms
cefboud.com·1d·
📦Deflate
AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation
arxiv.org·8h
⚙️Compression Benchmarking
HICode: Hierarchical Inductive Coding with LLMs
arxiv.org·8h
⛏️Grammar Mining
LongCat-Flash-Thinking, LLM from Meituan (China's Equivalent of Uber Eats)
github.com·4h·
Discuss: Hacker News
🌊Streaming Algorithms
Adhesive category theory for graph rewriting in Rocq
arxiv.org·8h
🔀Category Theory
Variation in Verification: Understanding Verification Dynamics in Large Language Models
arxiv.org·8h
🧪Binary Fuzzing
Unlock Document Intelligence: Powering Decision-Making with AI-driven RAG
dev.to·1d·
Discuss: DEV
🤖Archive Automation
SiDiaC: Sinhala Diachronic Corpus
arxiv.org·8h
📜Binary Philology
NeuS-QA: Grounding Long-Form Video Understanding in Temporal Logic and Neuro-Symbolic Reasoning
arxiv.org·8h
🧠Learned Codecs
DocIQ: A Benchmark Dataset and Feature Fusion Network for Document Image Quality Assessment
arxiv.org·8h
🤖Advanced OCR
Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora
arxiv.org·8h
🎙️Whisper