Total Cost of Ownership
As data volumes keep growing, Total Cost of Ownership (TCO) has become a critical metric for evaluating data infrastructure. Within that infrastructure, the cost of databases, where you store and query the data, stands out as one of the biggest challenges.
Conventional database solutions often force a difficult trade-off: provision for peak load and suffer high waste during off-peak hours, or provision for average load and risk performance degradation or outages during traffic spikes.
This rigidity stems from the coupled architecture of storage and compute found in most Shared Nothing systems. Because every compute node maintains a local replica of the data, scaling in is risky and operationally expensive—data must be rebalanced to remaining nodes before a node can be safely removed.
Consequently, engineering teams typically resort to Static Provisioning: they estimate the maximum possible traffic (peak load), add a safety buffer, and provision infrastructure to that level 24/7.
As illustrated above, the Real Workload (solid green line) fluctuates significantly throughout the day. However, due to the difficulty of scaling, the Static Provisioning (red dashed line) must remain constant at the peak level. The entire area between the red line and the blue curve represents Wasted Capacity—expensive compute and storage resources that are paid for but sit idle for the majority of the day. Ideally, an infrastructure should follow the Elastic Provisioning curve (solid blue line), closely fitting the actual demand.
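To put a rough number on the wasted capacity, the sketch below compares a statically provisioned cluster with a made-up daily demand curve. The hourly node counts and the 25% safety buffer are illustrative assumptions, not measurements from any real system.

```python
import math

# Hypothetical hourly demand, in compute nodes, over one day (illustrative only).
hourly_demand = [4, 3, 3, 2, 2, 3, 5, 8, 12, 15, 16, 18,
                 20, 19, 17, 16, 15, 14, 16, 18, 14, 10, 7, 5]

SAFETY_BUFFER = 0.25  # assumed 25% headroom on top of the observed peak


def static_capacity(demand, buffer):
    """Nodes kept running 24/7: peak demand plus a safety buffer, rounded up."""
    return math.ceil(max(demand) * (1 + buffer))


def wasted_fraction(demand, capacity):
    """Share of provisioned node-hours that are paid for but sit idle."""
    provisioned = capacity * len(demand)
    used = sum(demand)
    return (provisioned - used) / provisioned


capacity = static_capacity(hourly_demand, SAFETY_BUFFER)
waste = wasted_fraction(hourly_demand, capacity)
print(f"static capacity: {capacity} nodes, idle share: {waste:.0%}")
```

With these assumed numbers, the cluster is sized at 25 nodes around the clock while demand averages about 11, so more than half of the provisioned node-hours go unused. An elastic system that tracks demand would shrink that gap toward zero.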
Tiered Storage: A Pioneer in Cost Efficiency
To address the prohibitive cost of storing petabytes of data on high-performance block storage (like EBS), many traditional database vendors introduced "Tiered Storage." This technique attempts to offer the best of both worlds by offloading older, less frequently accessed ("cold") data to cheaper object storage (S3), while keeping "hot" data on local high-performance disks.
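The hot/cold split can be pictured as a simple placement rule. The following is a minimal sketch, assuming an age-based policy with a 7-day window; the `Segment` type, the `choose_tier` function, and the cutoff are hypothetical and do not reflect any particular vendor's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

HOT_WINDOW = timedelta(days=7)  # assumed cutoff; real systems expose this as a setting


@dataclass
class Segment:
    """A hypothetical unit of stored data with its last access time."""
    name: str
    last_access: datetime


def choose_tier(segment, now, hot_window=HOT_WINDOW):
    """Keep recently accessed data on local disk; offload the rest to object storage."""
    if now - segment.last_access <= hot_window:
        return "local_disk"
    return "object_storage"


now = datetime(2024, 1, 31)
segments = [
    Segment("orders_2024_01_30", datetime(2024, 1, 30)),
    Segment("orders_2023_06_02", datetime(2023, 6, 2)),
]
placement = {s.name: choose_tier(s, now) for s in segments}
print(placement)
```

Note that in this model the local disk remains the primary home for active data, and object storage only receives what has aged out.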
While this was a pioneering attempt that reduced the cost of storing cold data, it only patched the legacy Shared Nothing architecture rather than fixing the real issue. These systems were originally built on the core assumption that data locality is essential. In the bare-metal world, reading from a local SSD was orders of magnitude faster than reading over the network. In the cloud, however, network bandwidth has grown dramatically, while the operational cost of managing stateful storage on compute nodes has become a major bottleneck.
Because Tiered Storage treats S3 as a cold archive rather than a primary access layer, it introduces significant limitations: