📃 Manuscript Tokenization

Medieval Text Processing, Paleographic Parsing, Historical NLP, Character Segmentation

The Bitter Lesson is coming for Tokenization
lucalp.dev · 1d · Discuss: Lobsters, Hacker News, r/programming
🔗 Monadic Parsing
SUTRA: Decoupling Concept & Language for Multilingual LLM Excellence
hackernoon.com · 2h
💻 Local LLMs
Launch HN: Reducto Studio (YC W24) – Build accurate document pipelines, fast
news.ycombinator.com · 2d · Discuss: Hacker News
🌀 Brotli Internals
Best Ways to Translate Documents Online Using AI – Secure OCR, Layout Retention, and Top Tools Compared
dev.to · 3d · Discuss: DEV
🤖 AI Translation
PEGTL -- Parsing Expression Grammar Template Library
github.com · 19h · Discuss: Hacker News
🔗 Parser Combinators
QuranMorph: Morphologically Annotated Quranic Corpus
arxiv.org · 1d
📋 Document Grammar
The one-more-re-nightmare compiler (2021)
applied-langua.ge · 1d · Discuss: Lobsters, r/programming
RegEx Engines
10 FREE AI Tools That'll Save You 10+ Hours a Week
kdnuggets.com · 6h
🎙️ Whisper
PDF Retrieval Augmented Question Answering
arxiv.org · 1d
📊 Multi-vector RAG
ByteSpan: Information-Driven Subword Tokenisation
arxiv.org · 1d
💾 Binary Linguistics
Computational Approaches to Understanding Large Language Model Impact on Writing and Information Ecosystems
arxiv.org · 1d
📜 Digital Philology
StainPIDR: A Pathological Image Decoupling and Reconstruction Method for Stain Normalization Based on Color Vector Quantization and Structure Restaining
arxiv.org · 1d
📄 OCR
Named Entity Recognition using Bidirectional LSTM and Conditional Random Fields
dev.to · 3d · Discuss: DEV
🤖 Grammar Induction
Patterns for Compounding the Value of LLM interactions
spin.atomicobject.com · 6h · Discuss: Hacker News
🔗 Constraint Handling
Semantic-Aware Parsing for Security Logs
arxiv.org · 1d
Log Parsing
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
arxiv.org · 1d
Concrete Syntax
The Internal Inconsistency of Large Language Models
blog.kortlepel.com · 1d · Discuss: Hacker News
💻 Local LLMs
OpenAI's Codex Aims to Ease Programmers' Headaches
hackernoon.com · 13h
🤖 Archive Automation
Statistical Multicriteria Evaluation of LLM-Generated Text
arxiv.org · 1d
📋 Document Grammar
Mind the Gap: Assessing Wiktionary's Crowd-Sourced Linguistic Knowledge on Morphological Gaps in Two Related Languages
arxiv.org · 1d
🔤 Morphological Analysis