Has Merge Join's Time Gone?
I have long thought of the hash join/merge join duopoly as reflective of the fundamental truth that basically all data-processing related algorithms are based on either hashing or sorting. Yes, we also have nested loop join, but that's sort of just hash join in disguise. Aggregation? Your options are sorting and grouping, or hash bucketing. Deduplication? Same deal. Union? Intersection? Difference? Same thing. And to me at least, this dichotomy feels very orthogonal. The way that you hash a p...
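As a rough illustration of the hashing-versus-sorting split described above, here is a minimal Python sketch of the same equi-join done both ways. The function names `hash_join` and `merge_join`, and the choice of lists of `(key, value)` pairs as input, are assumptions made for this example and are not taken from the article.

```python
from collections import defaultdict


def hash_join(left, right):
    """Equi-join two lists of (key, value) pairs by hashing `left` on its key."""
    # Build phase: bucket every left row by key.
    table = defaultdict(list)
    for key, lval in left:
        table[key].append(lval)

    # Probe phase: look up each right row's key in the table.
    out = []
    for key, rval in right:
        for lval in table.get(key, []):
            out.append((key, lval, rval))
    return out


def merge_join(left, right):
    """Equi-join two lists of (key, value) pairs by sorting both on the key."""
    left = sorted(left, key=lambda kv: kv[0])
    right = sorted(right, key=lambda kv: kv[0])

    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for _, lval in left[i:i_end]:
                for _, rval in right[j:j_end]:
                    out.append((lk, lval, rval))
            i, j = i_end, j_end
    return out
```

Both functions return the same set of joined rows, though not necessarily in the same order: the hash version follows the order of `right`, while the merge version produces output sorted by key. Aggregation and deduplication can be sketched the same two ways, by bucketing rows in a dictionary or by sorting and scanning adjacent rows.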