You Gotta Push If You Wanna Pull
morling.dev·14h
🗄️Database Internals
Preview
Report Post

Historically, data management systems have been built around the notion of pull queries: users query data which, for instance, is stored in tables in an RDBMS, Parquet files in a data lake, or a full-text index in Elasticsearch. When a user issues a query, the engine will produce the result set at that point in time by churning through the data set and finding all matching records (oftentimes sped up by utilizing indexes).

Generally, this approach of pulling data works well and it matches with how people think and operate. You have a question about your data set? Express it as a query, run that query, and the system will provide the answer. But there are some challenges with that, too:

Performance: Queries might be prohibitively expensive to process, taking too long to pro…

Similar Posts

Loading similar posts...