🐿️ Scour
📃 Manuscript Tokenization

Medieval Text Processing, Paleographic Parsing, Historical NLP, Character Segmentation

Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages
arxiv.org·1d
🤖Manuscript AI
davidchisnall/igk: I got Knuth'd: A compiler for documents
github.com·12h
📝Concrete Syntax
I Figured Out What the Voynich Manuscript Says; It's Something More Than Words
dmerullo.substack.com·3h·
Discuss: Substack
🏰Manuscript Networks
Why Your Next LLM Might Not Have A Tokenizer
towardsdatascience.com·22h
🤖Grammar Induction
Building an ML model to generate fonts
fontweaver.com·1d·
Discuss: Hacker News
🔠Terminal Fonts
The modern text processing pipeline: Overview
newroadoldway.com·1d·
Discuss: Lobsters, r/programming
🔤Unicode Normalization
Explaining software and computational methods
blog.khinsen.net·18h
📝Concrete Syntax
Unveiling Factors for Enhanced POS Tagging: A Study of Low-Resource Medieval Romance Languages
arxiv.org·1d
👁️Medieval OCR
BNFGen: A random text generator based on context-free grammars
baturin.org·33m·
Discuss: Hacker News
🌳Context free grammars
Kumo Surfaces Structured Data Patterns Generative AI Misses
thenewstack.io·4h
📊Graph Databases
Contextualizing SUTRA: Advancements in Multilingual & Efficient LLMs
hackernoon.com·2h
💻Local LLMs
Using Wavelets and Clustering to Predict Odd or Even Numbers: An Overengineered Approach with Pretty (But Confusing) Plots
dev.to·4h·
Discuss: DEV
🧠Machine Learning
Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated
arxiv.org·14h
🧮Kolmogorov Complexity
June 25, 2025 Flight Tracking Workshop (4 hour) [Americas / Europe-friendly time]
bellingcat.com·18h
🧮Prolog Parsing
Practical tips to optimize documentation for LLMs, AI agents, and chatbots
biel.ai·23h·
Discuss: Hacker News
🤖Archive Automation
Cactus Language • Syntax 12
inquiryintoinquiry.com·2h
📝Concrete Syntax
Capturing my handwriting in a searchable digital format – the long way round
colinramsay.co.uk·1d·
Discuss: Hacker News
📲Digitization
Text2Struct: A Machine Learning Pipeline for Mining Structured Data from Text
arxiv.org·1d
🔤Character Classification
Driving cost-efficiency and speed in claims data processing with Amazon Nova Micro and Amazon Nova Lite
aws.amazon.com·1h
🌊Stream Processing
Portable Network Graphics (PNG) Specification (Third Edition)
w3.org·21h·
Discuss: Hacker News
🕸️WebP Analysis