Building Zone Failure Resilience in Apache Pinot™ at Uber
uber.com·3h

Introduction

Zone failure resilience (ZFR) is a critical property of modern distributed systems, especially for real-time analytics platforms like Apache Pinot™ that power many Tier-0 use cases at Uber. As part of our regional resilience initiative, Pinot must withstand zone failures without impacting queries or ingestion. This blog details how we achieved zone failure resilience in Pinot by leveraging its instance assignment capabilities and integrating with Uber’s in-house isolation group concept, which in turn accelerated our release processes.

Leveraging Pinot’s Instance Assignment for Cross-Zone Data Distribution

Initially, our Pinot clusters at Uber relied on two key strategies: tag-based instance assignment, which groups servers by tenant …
