Subset Sampling over Joins
arxiv.org · 1d


Abstract: Subset sampling (also known as Poisson sampling), where the decision to include any specific element in the sample is made independently of all others, is a fundamental primitive in data analytics, enabling efficient approximation by processing representative subsets rather than massive datasets. While sampling from explicit lists is well understood, modern applications, such as machine learning over relational data, often require sampling from a set defined implicitly by a relational join. In this paper, we study the problem of subset sampling over joins: drawing a random subset from the join results, where each join re…
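
To make the definition concrete, here is a minimal Python sketch of subset (Poisson) sampling applied to a join. It is a naive baseline that first materializes the join and then flips one independent coin per result; it is not the method of the paper, whose details are not shown in this preview. The names `subset_sample` and `naive_join`, the two-relation schema R(a, b) and S(b, c), and the constant inclusion probability of 0.5 are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Tuple


def subset_sample(items: list, prob: Callable[[object], float],
                  rng: random.Random) -> list:
    """Poisson sampling: keep each item independently with probability prob(item)."""
    # rng.random() is uniform on [0, 1), so the test succeeds with probability prob(x).
    return [x for x in items if rng.random() < prob(x)]


def naive_join(r: List[Tuple[int, str]],
               s: List[Tuple[str, int]]) -> List[Tuple[int, str, int]]:
    """Join R(a, b) with S(b, c) on b using a hash index; materializes every result."""
    index: Dict[str, List[int]] = {}
    for b, c in s:
        index.setdefault(b, []).append(c)
    return [(a, b, c) for a, b in r for c in index.get(b, [])]


# Hypothetical toy relations, chosen only for illustration.
R = [(1, "x"), (2, "x"), (3, "y")]
S = [("x", 10), ("x", 20), ("y", 30)]

rng = random.Random(0)          # seeded for reproducibility
joined = naive_join(R, S)       # explicit list of join results
sample = subset_sample(joined, lambda t: 0.5, rng)
```

The cost of this baseline is proportional to the full join size, which is the situation the abstract contrasts with sampling from a set that is only defined implicitly by the join.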
