I/O Observability for Uber’s Massive Petabyte-Scale Data Lake
uber.com·9h

Introduction

As Uber’s data infrastructure evolves toward a hybrid cloud architecture, understanding data access patterns across our platform is more critical than ever. Data I/O (input/output) observability plays a crucial role in the journey to CloudLake, Uber’s hybrid cloud architecture. As part of the CloudLake migration, Uber is expanding its compute and storage capacity in the cloud while gradually decommissioning on-prem capacity. This raises a new set of problems. First, the network link between service providers is a bottleneck. Second, we want to colocate workloads with their datasets for efficient execution, but this is challenging due to a lot of experimenta…
