Fergus's blog · Scour

Scaling Curation with LLM Comparisons

fergusfinn.com·3w

LLM powered data structures: A concurrent, lock-free binary search tree

fergusfinn.com·3w

Large-Scale Semantic Search Without Embeddings

fergusfinn.com·5w

Parallel Primitives for Multi-Agent Workflows

fergusfinn.com·5w

How fast can an LLM go?

fergusfinn.com·15w

Control Layer Benchmarking

fergusfinn.com·15w

The Doubleword Control Layer

fergusfinn.com·15w

LLM guided scheduling

fergusfinn.com·19w

Scheduling in inference engines

fergusfinn.com·19w

Using caching for fast speculative decoding

fergusfinn.com·19w

Paged attention

fergusfinn.com·19w