Eight months ago I left my job at Stripe to build a knowledge base. I had sat out the urge to do this for years by then. There have been many attempts at building a faster alternative, some of them mine. I don’t think any of them came close to getting it done.
What does it mean to build a better knowledge base? By the time I left my job, I had worked across many teams, built some knowledge systems, maintained some others, and read a whole lot of them. Stripe was the first company I worked at that had this figured out; they got themselves a team to build their own internal knowledge base, with search bridging the gap to other services.
As it turns out, all you need is speed and simplicity. Teams owned one or more spaces, and you didn’t have to remember where’s what, as you could find it with a quick search. Your team got assigned a task whenever a document went out of date.
The timing couldn’t be more right. Linear had already figured this out for product management. Many documentation tools gave up on building a good product and shifted their focus to competing with chat interfaces. Atlassian is sunsetting their Data Center offering, a once-in-a-lifetime event that lets you swoop in and sign up their largest customers. Regulations over data residency have been tightening in Europe, and it helps to be an Irish company.
How do you build a simpler product? Counterintuitively, you start by building a much more complex one. It’s easy to wrap a database in a theme and call it a week, but you’d hit the same scaling issues as the next tool.
This is not where your favourite language comes in. I started by wrapping a database in a theme, in a week. I used Go, and it’s a nice little language. But a couple of weeks in, I was writing more code for generating code than actual application code. There are many parts to a knowledge base, and no one has the time to write all the boilerplate. If you’re building a progressively complex system, you’d better have some automated way to know when services aren’t compatible anymore. By this point, I’d decided that users should be able to collaborate in real time, I wasn’t satisfied with search latency, and querying the database to check who can do what was getting tiresome. This is where your favourite language comes in.
const (
    CreateSpaceMethod = "POST"
    CreateSpacePath   = "/v1/spaces"
)

type CreateSpaceRequest struct {
    Name string `json:"name" binding:"required"`
}

type CreateSpaceResponse = query.Space

func (r *Router) CreateSpace(c *gin.Context, req CreateSpaceRequest) (int, CreateSpaceResponse, error) {
    // [...]
}
It didn’t take long to rewrite everything in Rust. Somehow I ended up with fewer lines of code, as I replaced my handwritten code generation with macro crates. What looked like a soup of text is now a more readable utoipa call. The ecosystem isn’t the largest out there, but it has some nifty crates that make your life so much easier.
#[utoipa::path(get, path = "/v1/spaces", responses((status = OK, body = Vec<model::Space>)))]
pub async fn list_spaces() -> Result<(StatusCode, Json<Vec<model::Space>>)> {
    // [...]
}
A tiny Zanzibar
With Rust in my toolbox, or rather, being my toolbox, there were so many bits I could now at least dream of building, and stay up nights making a reality. A while ago, Google wrote about their Zanzibar authorisation system. I was getting fed up with querying the database on each call to make sure your man can actually access the page they’re requesting. There were already some open source implementations inspired by Google’s architecture; however, they felt a bit overwhelming. It’s straightforward to launch a new Docker container, but you’ve then taken on the burden of maintaining and debugging a service that might not be as well documented or as well supported as you need.
The concepts behind Zanzibar are compelling: abstract your authorisation system away from your usual database queries and application code. I chose to build a tinier version of it. It’s not decentralised; it’s persisted to Postgres but loaded into memory on startup. It doesn’t have a complicated configuration language, since I already know my entities. The relations are defined in a CSV file right next to the code.
Scope, Role, Action, Object
space, owner|admin, read|create|update|list|delete, space
page, viewer, read|list, page|comment
Permissions are inheritable: being a member of a team with access to a certain space immediately gives you access to that same space. Services can invoke the authorisation system with a single macro call.
grant!(user_id, Editor, Page(page_id));
must!(user_id, Page(page_id), Update);
Without much optimisation, it takes nanoseconds to check whether a user has the right permissions to access a resource, and milliseconds to list every resource a user or a team has access to.
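In spirit, the whole thing reduces to a few maps held in memory. Here’s a minimal sketch of what the check might look like; the names and shapes are illustrative, not my actual code:

use std::collections::{HashMap, HashSet};

type Id = u64;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Role { Owner, Admin, Editor, Viewer }

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Action { Read, Create, Update, List, Delete }

// Illustrative in-memory store; the real one is loaded from Postgres on
// startup, and the role-to-action table is parsed from the CSV above.
struct Relations {
    // (subject, object) -> granted roles; a subject is a user or a team
    grants: HashMap<(Id, Id), HashSet<Role>>,
    // user -> teams; membership makes a team's grants apply to the user
    teams: HashMap<Id, Vec<Id>>,
    // role -> the actions it permits
    allows: HashMap<Role, HashSet<Action>>,
}

impl Relations {
    fn check(&self, user: Id, object: Id, action: Action) -> bool {
        // the user's own grants, plus those inherited through team membership
        let subjects = std::iter::once(user)
            .chain(self.teams.get(&user).into_iter().flatten().copied());
        subjects
            .flat_map(|s| self.grants.get(&(s, object)).into_iter().flatten())
            .any(|role| self.allows.get(role).is_some_and(|a| a.contains(&action)))
    }
}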
A more elastic search
A knowledge base is only as good as its search engine. I had been following the success of tantivy for a while, and its demos were convincing. There are easier ways to go about this, and your favourite tools all do it the same way, but they would’ve been too slow. Is your way truly any better than the other hundred if you’re making all the same choices as everyone else? My search engine was ready. Not long after, I added language detection, partly using whatlang, and multilingual tokenisation.
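To give a flavour of the foundation, here’s a minimal tantivy setup; the two-field schema is illustrative, and the real index carries a lot more structure, with per-language tokenisers picked using whatlang’s detect:

use tantivy::collector::TopDocs;
use tantivy::query::QueryParser;
use tantivy::schema::{Schema, STORED, TEXT};
use tantivy::{doc, Index};

fn main() -> tantivy::Result<()> {
    // Declare the indexed fields; STORED means the value comes back in results.
    let mut builder = Schema::builder();
    let title = builder.add_text_field("title", TEXT | STORED);
    let body = builder.add_text_field("body", TEXT);
    let index = Index::create_in_ram(builder.build());

    // Index a document and commit, making it visible to searchers.
    let mut writer: tantivy::IndexWriter = index.writer(50_000_000)?;
    writer.add_document(doc!(
        title => "Onboarding",
        body => "How we onboard new engineers"
    ))?;
    writer.commit()?;

    // Parse a user query against both fields and take the top ten hits.
    let searcher = index.reader()?.searcher();
    let query = QueryParser::for_index(&index, vec![title, body]).parse_query("onboard")?;
    let hits = searcher.search(&query, &TopDocs::with_limit(10))?;
    println!("{} hit(s)", hits.len());
    Ok(())
}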
I hadn’t expected much, but I could now watch results come in, with no latency I could notice, as I typed my queries. With authorisation also being within my control, I went a step further and integrated it with search: in the engine’s core, only the resources you can access are considered, and the two are kept seamlessly in sync.
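One way to express that scoping, as a sketch: wrap the user’s query so only permitted documents can match. The field and helper names are mine, and keeping the two systems in sync is the more involved part this leaves out:

use tantivy::query::{BooleanQuery, Occur, Query, TermSetQuery};
use tantivy::schema::Field;
use tantivy::Term;

// Restrict `user_query` to documents the caller may read; `allowed_ids`
// would come from the authorisation engine's millisecond list call.
fn scoped(user_query: Box<dyn Query>, id_field: Field, allowed_ids: &[u64]) -> BooleanQuery {
    let ids = allowed_ids.iter().map(|id| Term::from_field_u64(id_field, *id));
    BooleanQuery::new(vec![
        (Occur::Must, user_query),
        (Occur::Must, Box::new(TermSetQuery::new(ids))),
    ])
}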
It might’ve taken longer than it should’ve to build these two integrated systems, but reading up on other companies’ experiences, I think it saved me years of headaches.
Oxidising prose
I had hacked around with prosemirror before and it was the obvious choice this time. There are many collaboration plugins to use with it. It’s at this point that I started to think that my choice of Rust might finally come back to haunt me. The project itself comes with well-tested collaboration primitives, which I couldn’t use, because they were in JavaScript. I was aware of some alternatives, but they felt bloated and would get slower with larger documents and more users. It would’ve been foolish to rewrite such a massive project in Rust.
User edits are recorded as a list of steps, fed to prosemirror, and checked against the original document for conflicts; an updated document with a new version hash is then generated and persisted. To sidestep the issue, I applied the steps client-side and sent back the whole document along with the steps, which I forwarded to the other connected users. This worked fine: with everything happening in real time, and given a fast enough connection, no one would notice. But it meant that anyone could send anything and replace the whole document. It also introduced unnecessary latency, and potential lag whenever you hit a patch of slow internet. With steps, and therefore new documents, generated every few keystrokes, it wasn’t ideal. After building such great authorisation and search systems, this felt like a step down.
Is it foolish to rewrite prosemirror in Rust? It’s around this time that I realised someone had already done this work for Go. Ugh. I spent the rest of the week trying to set up quickjs or v8 to process steps on the backend. They worked, albeit with bugs, but I didn’t feel the added complexity was worth it. It was time to do the thing. I spent the next week porting prosemirror to Rust, along with all its tests and thousands of compatibility snapshots. I think I barely left the house that week.
#[test]
fn slice_can_cut_half_a_paragraph() {
    let original = doc!(p!("hello world"));
    let expected = doc!(p!("hello"));
    let result = original.slice(0, Some(6), None).unwrap();
    assert_eq!(result.content, *expected.content());
    assert_eq!(result.open_start, 0);
    assert_eq!(result.open_end, 1);
}
It’s hard to know how great the thing is until you do it. In this case, it now takes mere microseconds to apply document edits.
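Roughly, and with stand-in names rather than the crate’s actual API, the server-side flow now looks like this:

// Stand-ins for the ported prosemirror types; hypothetical, not the real crate.
struct Node;
struct Step;
impl Step {
    fn apply(&self, _doc: &Node) -> Result<Node, ()> { Ok(Node) }
}

enum EditError { Stale, InvalidStep }

// Apply a client's steps on the server, instead of trusting a whole document.
fn apply_edit(doc: Node, version: u64, base_version: u64, steps: &[Step]) -> Result<(Node, u64), EditError> {
    // A client editing an older version must rebase before its steps are accepted.
    if base_version != version {
        return Err(EditError::Stale);
    }
    let mut doc = doc;
    for step in steps {
        // Each step is validated against the schema; a malformed step is
        // rejected instead of replacing the document wholesale.
        doc = step.apply(&doc).map_err(|_| EditError::InvalidStep)?;
    }
    Ok((doc, version + 1))
}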
I hadn’t given it much thought at the time, but a while later I realised the potential applications of having done this. It’s much easier to extract text content from documents before feeding them to the search engine, easier to extract links or mentions, and easier to add tab completion. You can’t trust a large language model to edit a structured document, but if you can check and resolve conflicts in an instant, it doesn’t matter if half its suggestions break the structure. Adding tab completion or structured suggestions isn’t my top priority at the moment, but it’s now at least possible.
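Extraction, for instance, becomes a plain tree walk. Another sketch with stand-in types; the real node carries marks, attributes, and a schema reference:

// Hypothetical document node, simplified down to text and children.
struct Node {
    text: Option<String>,
    children: Vec<Node>,
}

// Collect a document's plain text, e.g. to feed the search indexer.
fn extract_text(node: &Node, out: &mut String) {
    if let Some(text) = &node.text {
        out.push_str(text);
    }
    for child in &node.children {
        extract_text(child, out);
        out.push(' '); // keep sibling nodes from gluing words together
    }
}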
Safe sockets
That little crab is rather trusty. I think that matters most when you’re tens of thousands of lines deep into your work. I’d rather be told when something’s broken than be left to discover it a few months later, at two in the morning. Thanks to utoipa, again, real time messages are synced to the frontend, so TypeScript can carry the trust forward.
fn create_docs() -> OpenApi {
OpenApiBuilder::new()
.info(Info::new("live", "1.0.0"))
.components(Some(
ComponentsBuilder::new()
.schema_from::<Command>()
.schema_from::<CommandReply>()
.schema_from::<MsgContent>()
.build(),
))
.build()
}
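The Command and its friends are plain Rust types deriving ToSchema, which is what schema_from picks up. Something in this spirit, though the variants here are illustrative rather than my actual protocol:

use serde::{Deserialize, Serialize};
use utoipa::ToSchema;

// An illustrative socket message; the real protocol has more variants.
#[derive(Serialize, Deserialize, ToSchema)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum Command {
    Subscribe { page_id: u64 },
    SubmitSteps { page_id: u64, base_version: u64, steps: Vec<serde_json::Value> },
}

A TypeScript codegen step over the generated OpenAPI document can then produce matching types, so a renamed field breaks the frontend build instead of production.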
Beyond prose
An editor is more than just text; it’s its own tiny app. It’s well known by now that react doesn’t mix well with prosemirror. You could hack your way around it, or you could once again choose to do something different. I happen to be quite fond of solid. Unfortunately, its ecosystem is still too small to build an entire user interface with, but it’s solid enough to integrate with prosemirror and handle its rendering cycle. With the indirection out of the way, there’s so much one can make. Nothing but time is holding me back from adding advanced diagrams, plots, macros, variables, canvases, and all the things no other editor has ever dared to dream of. A knowledge base could be about more than just prose; some threads, a few slides, even a spreadsheet.
Tools are about workflows. Linear had that figured out. And workflows are about structure. Documents should expire. They should be linted for dead links and mistakes. Your knowledge tools should sync with your task management ones. These are some of the low-hanging bits. We live in a time when we can push well past the boundaries of structure. A link might not be dead, but it could’ve lost any semantic relevance by this point. Recent code changes might’ve made your diagrams out of date. Language models won’t write your prose, but they can push workflows.
What’s next?

There’s a lot more work to be done. I’ve made a marketing page with some demos you can check out. You can join the waitlist. And, if you see yourself using this product, you could consider pre-ordering some seats and funding my work.
I’m aiming to launch within the next six months. Each seat will cost around €/$10. I’ve made some Stripe payment links; you’ll get double your pre-order as credits.
- For smaller teams, €/$100 and €/$500;
- For medium-sized teams, €/$1000 and €/$2500;
- And for much larger teams, and if you’d want me to prioritise working on certain features, consider a €/$5000 pre-order.
If you’ve got any questions, or ideas, you can reach me at imed under outcrop.app.
Thank you! Imed
28 Oct 2025