Last month I published Think Weirder: The Year’s Best Science Fiction Ideas, a 16-story anthology featuring Greg Egan, Isabel J. Kim, Ray Nayler, Caroline M. Yoachim, and twelve other wonderful authors. The book ended up being the #1 New Release in the Short Stories Anthologies category for a short time on Amazon, outselling many other newly released short story anthologies published by the big NYC publishers with large marketing departments.
I’m not a professional publisher. I have a full-time job and two small kids, so all of this work happened after my kids went to sleep. I had to use my time judiciously, which meant creating an efficient process. Fortunately I’m a programmer, and it turns out that programming skills translate surprisingly well to book publishing. This post is about how I built a complete publishing pipeline using Python, YAML files, and LaTeX — and why you might want to do something similar if you’re considering publishing a book. I know that by writing this I’ll have my choices questioned by professional designers, but hopefully the software concepts will be helpful.
My initial thought: can I really do ALL of this?
When I started this project, I had some worries. Professional publishers have entire departments of specialists. How could I possibly handle all of that myself?
The answer turned out to be: build tools that automate the repetitive parts, and use simple file formats that make everything transparent and debuggable.
Step 1: Tracking stories with plain text files
The first challenge was tracking hundreds of candidate stories from different magazines. I read 391 stories published in 2024 before selecting the final 16. That’s a lot of stories to keep organized.
I could have used a spreadsheet, but I went with plain YAML files instead. Here’s why this worked well for me:
- Git-friendly: Every decision I made was tracked in version control
- Human-readable: I could open any file in a text editor and understand what I was looking at
- Easy to build scripts around: I wrote several Python functions to do different kinds of metadata introspection that I’ll go through
The structure looks like this:
data/
  story-progress.yaml            # Central tracking file
  markets.yaml                   # Magazine metadata
  themes.yaml                    # Theme occurrence tracking
  subgenres.yaml                 # Subgenre tallies
  stories/
    clarkesworld-magazine/
      nelson_11_24.yaml          # Individual story files
      pak_06_24.yaml
    reactor-magazine/
      larson_breathing.yaml
    ...
Each story file is pure YAML containing the full story text plus metadata:
title: "Twenty-Four Hours"
author: H.H. Pak
market: clarkesworld-magazine
url: https://clarkesworldmagazine.com/pak_06_24/
word_count: 4540
year: 2024
slug: pak_06_24
summary: ...
Not all stories have public URLs available, but that’s OK because all of the fields are optional. The central story-progress.yaml tracks editorial state:
clarkesworld-magazine-nelson_11_24:
  title: "LuvHome™"
  author: Resa Nelson
  market: clarkesworld-magazine
  status: accepted  # or: not_started/relevant/rejected
  date_added: '2024-09-08T08:22:47.033192'
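Because the tracking file is just a mapping from story keys to small records, queries against it stay short. Here is a minimal sketch of one, assuming the file has already been parsed into a dict (for example with PyYAML's yaml.safe_load). The names stories_with_status and VALID_STATUSES are illustrative and not taken from se.py, the sample records are abbreviated, and treating a missing status as not_started is an assumption of this sketch.

```python
# Hypothetical helper; the real se.py internals are not shown in this post.
VALID_STATUSES = {"not_started", "relevant", "accepted", "rejected"}


def stories_with_status(progress, status):
    """Return (key, entry) pairs whose editorial status matches `status`.

    `progress` is the parsed content of story-progress.yaml: a dict that maps
    "<market>-<slug>" keys to per-story dicts.
    """
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return [
        (key, entry)
        for key, entry in sorted(progress.items())
        # Assumption: every field is optional, so a missing status is
        # treated as not_started.
        if entry.get("status", "not_started") == status
    ]


# Abbreviated sample records for illustration only.
progress = {
    "clarkesworld-magazine-nelson_11_24": {
        "title": "LuvHome™",
        "author": "Resa Nelson",
        "market": "clarkesworld-magazine",
        "status": "accepted",
    },
    "clarkesworld-magazine-pak_06_24": {
        "title": "Twenty-Four Hours",
        "author": "H.H. Pak",
        "market": "clarkesworld-magazine",
        # Status deliberately left out to exercise the default.
    },
}

for key, entry in stories_with_status(progress, "accepted"):
    print(key, "-", entry["title"])
```

Rejecting unknown status strings up front is a small guard against typos, which are easy to make in hand-edited YAML.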
Step 2: A simple command-line tool
I built a small Python CLI tool (se.py) to help me navigate all this data. Since I do all this work at night after my kids go to sleep, I wanted something fast that mirrored a lot of the other work I do on the command line. The tool is simple:
python se.py --help
usage: se.py [-h] {markets,stories,relevant,decide,accepted,compile} ...

Story Evaluator CLI

positional arguments:
  {markets,stories,relevant,decide,accepted,compile}
                        Available commands
    markets             List markets
    stories             Manage stories
    relevant            List URLs for stories marked as relevant
    decide              Make accept/reject decisions on relevant stories
    accepted            Manage accepted stories
    compile             Show anthology compilation statistics

optional arguments:
  -h, --help            show this help message and exit
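Help text in this shape is what Python's argparse produces for a parser with subcommands. The skeleton below is a sketch of how such a parser could be wired up, not the actual se.py source. The command names and descriptions are copied from the help output above, while build_parser, dest="command", and required=True are choices made for this example. The command handlers are left out.

```python
import argparse


def build_parser():
    """Build a parser with the same subcommand names as the help text above."""
    parser = argparse.ArgumentParser(
        prog="se.py", description="Story Evaluator CLI"
    )
    subparsers = parser.add_subparsers(
        dest="command", help="Available commands", required=True
    )
    commands = {
        "markets": "List markets",
        "stories": "Manage stories",
        "relevant": "List URLs for stories marked as relevant",
        "decide": "Make accept/reject decisions on relevant stories",
        "accepted": "Manage accepted stories",
        "compile": "Show anthology compilation statistics",
    }
    for name, help_text in commands.items():
        subparsers.add_parser(name, help=help_text)
    return parser


# Parse an explicit argument list instead of sys.argv so the example
# behaves the same wherever it is run.
args = build_parser().parse_args(["compile"])
print("selected command:", args.command)
```

Each subparser can later grow its own flags without touching the others, which keeps a small tool like this easy to extend.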
The compile command ended up being really useful — it gave me instant feedback on anthology size and composition:
ANTHOLOGY COMPILATION STATISTICS
============================================================
Total Stories: 16
Total Word Count: 115,093 words
Average Word Count: 7,193 words
Unique Authors: 16
Markets Represented: 4
STORIES BY MARKET:
analog-magazine: 2 stories (12.5%)
asimovs-magazine: 2 stories (12.5%)
clarkesworld-magazine: 10 stories (62.5%)
reactor-magazine: 2 stories (12.5%)
This was really helpful during the selection process. I could quickly check how far along I was toward my ~120k word goal, and make sure I hadn’t accidentally included multiple stories by the same author.
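The arithmetic behind a report like this is simple enough to sketch. The version below is an illustration under assumptions, not the real compile command: it takes a list of story dicts with author, market, and word_count keys, and the function names and sample data are made up for the example. It also flags repeat authors, which is the duplicate check mentioned above.

```python
from collections import Counter


def compile_stats(stories):
    """Summarize a list of accepted-story dicts."""
    total = len(stories)
    words = sum(s.get("word_count", 0) for s in stories)
    by_market = Counter(s.get("market", "unknown") for s in stories)
    by_author = Counter(s.get("author", "unknown") for s in stories)
    return {
        "total_stories": total,
        "total_words": words,
        "average_words": round(words / total) if total else 0,
        "unique_authors": len(by_author),
        "markets": dict(by_market),
        # Authors appearing more than once, to catch accidental duplicates.
        "repeat_authors": sorted(a for a, n in by_author.items() if n > 1),
    }


def format_stats(stats):
    """Render the summary in roughly the layout shown above."""
    lines = [
        f"Total Stories: {stats['total_stories']}",
        f"Total Word Count: {stats['total_words']:,} words",
        f"Average Word Count: {stats['average_words']:,} words",
        f"Unique Authors: {stats['unique_authors']}",
        f"Markets Represented: {len(stats['markets'])}",
        "STORIES BY MARKET:",
    ]
    for market, count in sorted(stats["markets"].items()):
        share = 100 * count / stats["total_stories"]
        noun = "story" if count == 1 else "stories"
        lines.append(f"{market}: {count} {noun} ({share:.1f}%)")
    return "\n".join(lines)


# Made-up sample data; not the real anthology contents.
sample = [
    {"author": "Author A", "market": "clarkesworld-magazine", "word_count": 4000},
    {"author": "Author B", "market": "clarkesworld-magazine", "word_count": 6000},
    {"author": "Author A", "market": "analog-magazine", "word_count": 5000},
]
stats = compile_stats(sample)
print(format_stats(stats))
print("Repeat authors:", stats["repeat_authors"])
```

The same formulas reproduce the real figures above: 115,093 words over 16 stories rounds to an average of 7,193, and 10 of 16 stories is 62.5%.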
Step 3: Typesetting the print book
This part surprised me the most. I initially thought I’d have to learn Adobe InDesign or pay someone to do the typesetting. But I decided to use LaTeX instead, since I had some academic experience with it and a friend who also publishes sent me some of his example files. The process worked out better than expected.
I used XeLaTeX with the memoir document class. Here’s what I liked about this approach:
- Reproducible: I can rebuild the entire book from source in a few seconds, and I can use the same templates next year
- Professional typography: LaTeX handles ligatures, kerning, and line breaking better than I could manually
- Custom fonts: I used Crimson Pro for body text and Rajdhani for titles
- Again, version control that I’m used to: The entire book is just text files in Git
The main parts of the master file for the book are really simple:
\documentclass[final,11pt,twoside]{memoir}
\usepackage{compelling}
\begin{document}
\begin{frontmatter}
\include{title}
\tableofcontents
\end{frontmatter}
\begin{mainmatter}
\include{introduction}
\include{death-and-the-gorgon}
\include{the-best-version-of-yourself}
% ... 14 more stories
\include{acknowledgements}
\end{mainmatter}
\end{document}
All the formatting rules live in compelling.sty, a custom style package. Here’s a link to the full, messy file. Some highlights:
% 6x9 inch trade paperback size
\setstocksize{9in}{6in}
\settrimmedsize{9in}{6in}{*}
% Margins
\setlrmarginsandblock{1.00in}{0.75in}{*}
\setulmarginsandblock{0.75in}{0.75in}{*}
% Typography nerding
\usepackage[final,protrusion=true,factor=1125,
stretch=70,shrink=70]{microtype}
% Custom fonts loaded from local files
\setromanfont[
Ligatures=TeX,
Path=./Crimson_Pro/static/,
UprightFont=CrimsonPro-Regular,
BoldFont=CrimsonPro-Bold,
ItalicFont=CrimsonPro-Italic,
BoldItalicFont=CrimsonPro-BoldItalic
]{Crimson Pro}
\setsansfont[
Path=./Rajdhani/,
UprightFont=Rajdhani-Bold,
BoldFont=Rajdhani-Bold,
ItalicFont=Rajdhani-Bold,
BoldItalicFont=Rajdhani-Bold
]{Rajdhani}
% Chinese font family for CJK characters
\newfontfamily\chinesefont{PingFang SC}
The microtype package does a lot of subtle work with character spacing and line breaking that makes the text look professionally typeset.
I wanted story titles in bold sans-serif with author names underneath in a lighter gray. Here’s how I set that up:
\renewcommand{\chapter}[2]{
\pagestyle{DefaultStyle}
\stdchapter*{
\sffamily
\LARGE
\textbf{\MakeUppercase{#1}}
\\
\large
\color{dark-gray}
{\MakeUppercase{#2}}
}
\addcontentsline{toc}{chapter}{
\protect\parbox[t]{\dimexpr\textwidth-3em}{
\sffamily#1
\\
\protect\small
\protect\color{gray}
\protect\textit{#2}
}
}
\def\leftmark{#1}
\def\rightmark{#2}
}
This redefines the chapter command to take two arguments, the title and the byline. It sets up the chapter formatting and the TOC formatting, and makes sure that the title and byline are printed in the headers on alternating pages.
Now every story file just says:
\chapter{Death and the Gorgon}{by Greg Egan}
[story content]
Most authors send me stories as HTML, PDF, or Word files, so I needed a way to convert them to LaTeX. I wrote a simple Python script to do this, which saved me a huge amount of manual formatting work.
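The conversion script itself isn't shown here, but the HTML case gives a feel for what is involved. The sketch below is a deliberately small illustration that uses only Python's standard library. It assumes simple HTML made of paragraphs with italic and bold spans, and every name in it (HtmlToLatex, html_to_latex, escape_latex) is invented for the example. PDF and Word input would need different tooling.

```python
import re
from html.parser import HTMLParser

# Characters that LaTeX treats specially, mapped to safe replacements.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

# Inline tags and the LaTeX command each one opens.
INLINE_COMMANDS = {
    "em": r"\emph{", "i": r"\emph{",
    "strong": r"\textbf{", "b": r"\textbf{",
}


def escape_latex(text):
    """Escape special characters one at a time, so nothing is escaped twice."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


class HtmlToLatex(HTMLParser):
    """Convert paragraphs and basic emphasis; other tags are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_COMMANDS:
            self.current.append(INLINE_COMMANDS[tag])

    def handle_endtag(self, tag):
        if tag in INLINE_COMMANDS:
            self.current.append("}")
        elif tag == "p":
            self.flush()

    def handle_data(self, data):
        # Collapse runs of whitespace, which HTML treats as a single space.
        self.current.append(escape_latex(re.sub(r"\s+", " ", data)))

    def flush(self):
        """Close the paragraph being built, skipping empty ones."""
        paragraph = "".join(self.current).strip()
        if paragraph:
            self.paragraphs.append(paragraph)
        self.current = []


def html_to_latex(html):
    """Return LaTeX paragraphs separated by blank lines."""
    parser = HtmlToLatex()
    parser.feed(html)
    parser.close()
    parser.flush()  # Keep any trailing text that was not inside a <p>.
    return "\n\n".join(parser.paragraphs)


print(html_to_latex("<p>Costs 5% &amp; more, she <em>said</em>.</p>\n<p>Next.</p>"))
```

Real manuscripts need more care than this (scene breaks, straight quotes, nested markup), so a sketch like this is a starting point to check against the typeset output rather than a finished converter.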
Step 4: Creating the ebook
Print was one thing, but I also needed an ebook. This turned out to be easier than I expected because I could reuse all the LaTeX source I’d already created.
I used Pandoc to convert from LaTeX to EPUB:
# Convert LaTeX to EPUB
pandoc 2025.tex -o Think_Weirder_2025.epub \
  --toc \
  --epub-cover-image=cover_optimized.jpg \
  --css=epub-style.css \
  --metadata title="Think Weirder" \
  --metadata author="Edited by Joe Stech"
Pandoc’s default table of contents only showed story titles. But I wanted author names too, like you see in print anthologies. EPUBs are just zipped collections of XHTML files, so I wrote a small post-processing script:
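import re

# extract_id_from_href is a helper defined elsewhere in the script (not
# shown here); it maps a link target to the key used in `authors`.

def modify_toc(nav_content, authors):
    """Add author bylines to TOC entries."""
    # Match plain links whose text contains no nested tags.
    pattern = r'<a href="([^"]+)">([^<]+)</a>'

    def add_author(match):
        href, title = match.group(1), match.group(2)
        chapter_id = extract_id_from_href(href)
        if chapter_id in authors:
            author = authors[chapter_id]
            return f'<a href="{href}">{title}<br />\n' \
                   f'<em>{author}</em></a>'
        # Leave links without a known author untouched.
        return match.group(0)

    return re.sub(pattern, add_author, nav_content)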
The script unzips the EPUB, finds the navigation file, adds author bylines, and rezips everything. Now the ebook table of contents matches the print version.
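The unzip and rezip step has one catch worth spelling out: the EPUB container format expects the mimetype file to be the first entry in the archive and to be stored uncompressed. Here is a sketch of that step using Python's zipfile module. It is an illustration rather than the script described above. The name rewrite_epub_nav is made up, and finding the navigation document by looking for a file name ending in nav.xhtml is an assumption about how the EPUB is laid out.

```python
import zipfile


def rewrite_epub_nav(src_path, dst_path, transform):
    """Copy an EPUB, passing the navigation document through `transform`.

    `transform` takes the nav XHTML as a string and returns the new string.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        # "mimetype" goes first and is stored without compression.
        dst.writestr("mimetype", src.read("mimetype"),
                     compress_type=zipfile.ZIP_STORED)
        for info in src.infolist():
            if info.filename == "mimetype":
                continue  # Already written above.
            data = src.read(info.filename)
            # Assumption: the nav document is the entry named nav.xhtml.
            if info.filename.endswith("nav.xhtml"):
                data = transform(data.decode("utf-8")).encode("utf-8")
            dst.writestr(info.filename, data)
```

With a function like this, the TOC change becomes a single call such as rewrite_epub_nav("Think_Weirder_2025.epub", "out.epub", lambda nav: modify_toc(nav, authors)), where out.epub is a placeholder output name. Writing to a new file instead of editing in place keeps the Pandoc output around for comparison.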
What I learned
The whole process took longer than I expected — many months of night work. The simple software I wrote really made it a feasible one-person project though, and motivates me to go through the whole process again next year.
Staying organized is crucial. When hundreds of stories are involved, it’s easy to forget details, so it was important to use se.py to save metadata in the moment, where it could be sliced and diced later.
Reproducible builds were a lifesaver. I made changes to the book layout right up until the week before publication. Because I could rebuild the entire book in seconds, and everything was backed up in git, I could experiment freely without worrying about breaking things.
Simple file formats made me comfortable. When something went wrong, I could always open a YAML file or look at the LaTeX source and understand what was happening. I never hit a point where the tools were a black box.
I didn’t need to understand everything up front. I learned LaTeX details as I went (arguably I still don’t really understand LaTeX). Same with Pandoc. I got something basic working first, then incrementally improved it.
Can you do this too?
If you’re thinking about publishing a book — whether it’s an anthology, a novel, or a collection of technical writing — I think this approach is worth considering. There’s something motivating about having a detailed understanding of every step in the production process. If you have questions feel free to reach out, I love talking about this hobby! You can email me at joe@thinkweirder.com.
And if you enjoy concept-driven science fiction that is heavy on novel ideas, check out Think Weirder!