Last week Anon pointed us to Meng 2022, which clarified some of my confusion about “non-probability” samples:
the phrase non-probability samples should be understood as a short hand for “samples without an identified design probability construct”.
Without (human) design probability, we can still have “divine probability”:
we typically conceptualize that the data at hand is a realization of a generative probabilistic mechanism given by nature or God.

But to bring us back down to earth, Meng introduces “device probability”:
By far, most probabilities used in statistical modeling are devices for expressing our belief, prior knowledge, assumptions, idealizations, compromises, or even desperation.
In math notation:
- Responders may be selected by human design probability P(R_i = 1 | X_i), where X_i can include, for example, stratification variables.
- Without fully controlled human design, responders follow the laws of nature, with divine probability pi_i = P(R_i = 1) that can differ arbitrarily across people, perhaps depending on the outcome of interest y_i.
- To estimate the divine probability, we might assume a device probability, e.g. a generalized linear model P(R_i = 1 | X_i) = g^-1(b*X_i) with link function g, which doesn’t include y_i.
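
To make the gap between divine and device probability concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and none of it comes from Meng 2022: the population (a single covariate x and an outcome y correlated with it), the divine response probability (a logistic function of both x and y), and the device model (a logistic regression on x alone, fit here by Newton's method). It compares the naive responder mean with inverse-probability-weighted means under the device and divine probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical population: covariate x, outcome y correlated with x.
x = rng.normal(size=n)
y = x + rng.normal(size=n)

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed "divine" probability: depends on x AND on the outcome y.
# In practice the analyst never observes this.
pi = expit(-1.0 + 0.5 * x + 0.5 * y)
r = rng.random(n) < pi  # response indicators R_i

def fit_logistic(X, r, iters=25):
    """Fit a logistic regression of r on X by Newton's method."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = expit(X @ b)
        grad = X.T @ (r - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        b += np.linalg.solve(hess, grad)
    return b

# "Device" probability: logistic model in x only, which omits y.
X = np.column_stack([np.ones(n), x])
b = fit_logistic(X, r.astype(float))
device = expit(X @ b)

def weighted_mean(y, r, p):
    """Inverse-probability-weighted mean of y over responders."""
    w = 1.0 / p[r]
    return np.sum(w * y[r]) / np.sum(w)

true_mean = y.mean()
naive = y[r].mean()
ipw_device = weighted_mean(y, r, device)
ipw_divine = weighted_mean(y, r, pi)

print(f"true mean         {true_mean: .3f}")
print(f"naive responders  {naive: .3f}")
print(f"IPW, device prob  {ipw_device: .3f}")
print(f"IPW, divine prob  {ipw_divine: .3f}")
```

In this setup, weighting by the device probability should remove the part of the selection bias that runs through x, but not the part that runs through y directly, so its estimate should land between the naive mean and the truth. Weighting by the divine probability should recover the population mean up to simulation noise, which is exactly the quantity we cannot compute with real data.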