wilsonzlin/edgesearch: Serverless full-text search with Cloudflare Workers, WebAssembly, and Roaring Bitmaps (opens in new tab)

Edgesearch

Build a full text search API using Cloudflare Workers and WebAssembly.

Features

Demos

Check out the demo folder for live demos with source code.

How it works

Edgesearch builds a reverse index by mapping terms to a compressed bit set (using Roaring Bitmaps) of IDs of documents containing the term, and creates a custom worker script and data to upload to Cloudflare Workers.

Data

An array of term-documents pairs sorted by term is built, where term is a string and documents is a compressed bit set.

This array is then split into chunks of up to 25 MiB, as each Cloudflare Workers KV entry can hold a value up to 25 MiB in size.

To find the documents bit set associated with a term, a binary search is done to find the appropriate chunk, and then the pair within the chunk.

The same structure and process is used to store and retrieve document contents.

Packing multiple bit sets/documents reduces read/write costs and deploy times, and improves caching and execution speed due to fewer fetches.

Searching

Search terms have an associated mode. There are three modes that match documents in different ways:

ModeDocument match condition
RequireHas all terms with this mode.
ContainHas at least one term with this mode.
ExcludeHas none of the terms with this mode.

For example, a document with terms a, b, c, d, and e would match the query require (d, a) contain (g, b, f) exclude (h, i).

The results are generated by doing bitwise operations across multiple bit sets. The general computation could be summarised as:

result = (req_a & req_b & req_c & ...) & (con_a | con_b | con_c | ...) & ~(exc_a | exc_b | exc_c | ...)

Cloudflare

There are some nice advantages when only using Cloudflare Workers:

  • Faster than a VM or container with less cold starts, as code is run on a V8 Isolate.
  • Naturally distributed to the edge for very low latency.
  • Takes advantage of Cloudflare for SSL, caching, and distribution.
  • No need to worry about scaling, networking, or servers.

WebAssembly

The C implementation of Roaring Bitmaps is compiled to WebAssembly. A basic implementation of essential C standard library functionality is implemented to make compilation possible.

Usage

Get the CLI

LLVM 9 or higher is required to use the CLI for building the worker.

Loading more...

Keyboard Shortcuts

Navigation
Next / previous item
j/k
Open post
oorEnter
Preview post
v
Post Actions
Love post
a
Like post
l
Dislike post
d
Undo reaction
u
Save / unsave
s
Recommendations
Add interest / feed
Enter
Not interested
x
Go to
Home
gh
Interests
gi
Feeds
gf
Likes
gl
History
gy
Changelog
gc
Settings
gs
Browse
gb
Search
/
General
Show this help
?
Submit feedback
!
Close modal / unfocus
Esc

Press ? anytime to show this help