Merriam-Webster and Unstructured Data Processing
georgeho.org·9h·

I recently finished reading Word by Word: The Secret Life of Dictionaries by Kory Stamper, which was an unexpected page-turner. What intrigued me most was (perhaps unsurprisingly) Stamper’s description of how Merriam-Webster gets written, and how strikingly that process resembles many successful unstructured data projects in the wild. I want to use this blog post to ruminate on that resemblance.


It begins with the collection and curation of raw, unstructured data. Stamper describes a fascinating process called “reading and marking”, whereby editors are assigned reading of current magazines, periodicals, and blogs (almost anything written in English, it seems) and read and underline any …
