Bootstrap a Data Lakehouse in an Afternoon
towardsdatascience.com · 3d
🧊Iceberg Tables

doesn’t need to be that complicated. In this article, I’ll show you how to develop a basic, “starter” one that uses an Iceberg table on AWS S3 storage. Once the table is registered using AWS Glue, you’ll be able to query and mutate it from Amazon Athena, including using:

  • Time travel queries

  • Merging, updating and deleting data

  • Optimising and vacuuming your tables
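As a rough preview of what those operations look like, here is a minimal sketch that builds the corresponding Athena SQL statements as Python strings. The database, table, staging table, bucket and column names are all hypothetical placeholders, not the ones used later in the article; the statement shapes follow Athena's documented Iceberg syntax.

```python
# Hypothetical names: swap in your own database, table and bucket.
TABLE = "lakehouse_db.sales"
STAGING = "lakehouse_db.sales_staging"
LOCATION = "s3://my-example-bucket/lakehouse/sales/"


def athena_statements(table=TABLE, staging=STAGING, location=LOCATION):
    """Return example Athena statements for an Iceberg table, keyed by purpose."""
    return {
        # Athena creates an Iceberg table when table_type is set to ICEBERG.
        "create": (
            f"CREATE TABLE {table} (id bigint, amount double, updated_at timestamp) "
            f"LOCATION '{location}' "
            "TBLPROPERTIES ('table_type' = 'ICEBERG')"
        ),
        # Time travel: read the table as it was one day ago.
        "time_travel": (
            f"SELECT * FROM {table} "
            "FOR TIMESTAMP AS OF (current_timestamp - interval '1' day)"
        ),
        # Upsert rows from a staging table into the target table.
        "merge": (
            f"MERGE INTO {table} AS t USING {staging} AS s ON t.id = s.id "
            "WHEN MATCHED THEN UPDATE SET amount = s.amount "
            "WHEN NOT MATCHED THEN INSERT (id, amount, updated_at) "
            "VALUES (s.id, s.amount, s.updated_at)"
        ),
        "update": f"UPDATE {table} SET amount = 0 WHERE id = 1",
        "delete": f"DELETE FROM {table} WHERE id = 2",
        # Compact small data files into larger ones.
        "optimize": f"OPTIMIZE {table} REWRITE DATA USING BIN_PACK",
        # Expire old snapshots and remove orphaned files.
        "vacuum": f"VACUUM {table}",
    }


if __name__ == "__main__":
    for purpose, sql in athena_statements().items():
        print(f"-- {purpose}\n{sql};\n")
```

Each statement can be pasted into the Athena query editor once the placeholder names are replaced with real ones.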

I’ll also show you how to inspect the same tables locally from **DuckDB**, and we’ll see how to use Glue/Spark to insert more table data.
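For a sense of what local inspection might involve, the sketch below lists the DuckDB statements needed to load the `iceberg` and `httpfs` extensions and count rows through `iceberg_scan`. The S3 path is a hypothetical placeholder, and S3 credentials are assumed to be configured separately. Depending on how the table was written, `iceberg_scan` may need the path to a specific metadata JSON file rather than the table root.

```python
def duckdb_statements(table_path):
    """Return the DuckDB statements for a row count over an Iceberg table."""
    return [
        "INSTALL iceberg",
        "LOAD iceberg",
        "INSTALL httpfs",  # needed for s3:// paths
        "LOAD httpfs",
        f"SELECT count(*) FROM iceberg_scan('{table_path}')",
    ]


def run_locally(table_path):
    """Execute the statements with the third-party duckdb package.

    Assumes `pip install duckdb` and working S3 credentials.
    """
    import duckdb  # imported lazily so the builder above has no dependencies

    con = duckdb.connect()
    rows = None
    for stmt in duckdb_statements(table_path):
        rows = con.execute(stmt).fetchall()
    return rows  # result of the final SELECT


if __name__ == "__main__":
    # Hypothetical path: replace with your own table location.
    for stmt in duckdb_statements("s3://my-example-bucket/lakehouse/sales"):
        print(stmt + ";")
```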

Our example might be basic, but it’ll showcase the setup, the different tools and the processes you can put in place to build up a more extensive data store. All modern cloud providers have equivalents of the AWS services I’m discussing in this artic…
