A science journalist sent me an email:
Is this your sort of paper? It’s on sex ratios at birth.
Seems fascinating to me but I need the thoughts of a statistician. If you have time and if this is your kind of hating, I’d love to talk to you. It’s embargoed in Science Advances for Thursday.
This all happened a while ago, so the paper is no longer embargoed; hence the link above.
Here’s how I responded to the journalist:
I’m not sure about the details of this analysis but I’m skeptical. For one thing, references 8, 9, 10, 11, and 12, which they cite and which purportedly give evidence for systematic variation of sex ratios, have small enough sample sizes that their results are essentially pure noise (we wrote a paper about this a few years ago) so it is already a bad sign that the authors cite these papers uncritically. They do say, “none of these hypotheses have been confirmed in large epidemiological studies,” but that understates it, in that those papers should never have been taken seriously in the first place.
That said, there is some evidence from population studies that sex ratios vary a little bit. I can’t remember all the details, but I think that the probability of girl birth is about 0.5 percentage points higher for African-American parents than for white parents, and that the probability of girl birth is also slightly higher for older mothers.
I’ll just say a few things:
1. For some reason, people looove to study sex ratios and they’re always trying to see if they can predict if a baby will be a boy or a girl.
2. The process is essentially random, i.e., it’s unpredictable. The average probability of girl birth is approximately 48.8%. Even if it’s 49.3% for some mothers and 48.3% for others, it’s still almost entirely random.
3. It’s easy to find patterns in random data, as evidenced by references 8, 9, 10, 11, and 12 and also discussed in this classic 2011 paper by Simmons et al.
4. Differences in sex ratio are small enough that you need huge sample sizes to detect any signal amidst all the noise, and even small data problems will overwhelm any signal.
5. I flat-out don’t believe their claim that if you have three boys, there’s a 61% chance your next baby will be a boy. I could be wrong, but based on my experience with the literature, I just don’t believe it. I don’t know whether this is coming from selection bias or what.
The journalist responded:
I meant to say if this is your kind of thing. But I guess I must have been anticipating hating because that’s what I wrote.
If you remain interested in the statistical challenges of estimating variation in sex ratios, I recommend my 2009 article with David Weakliem and this 2001 post in Chance News by the late Laurie Snell.
Studying sex ratios is just a lot harder than you think: effects are tiny and variation is large.
From polling and clinical trials and psychology experiments and other things, we’re used to the idea that a few thousand people is a large sample. But a few thousand is not a large sample if you’re estimating tiny effects. Differences in Pr(girl) of 0.1 percentage point or even 0.5 percentage point are small compared to the effects that we study in medicine, politics, and psychology, and a sample size that’s sufficient to study effect sizes of 5 percentage points or 10 percentage points or more won’t be enough here.
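To put rough numbers on this, here is a minimal sketch of the textbook sample-size calculation for comparing two proportions. The setup is my own illustration, not anything from the paper under discussion: it assumes two equal-sized groups, a two-sided z-test at the 5% level, 80% power, and a baseline Pr(girl) of 48.8%. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

P_GIRL = 0.488  # average Pr(girl) quoted earlier in this post


def se_difference(n_per_group, p=P_GIRL):
    """Standard error of the difference in Pr(girl) between two equal-sized groups."""
    return sqrt(2 * p * (1 - p) / n_per_group)


def n_per_group_needed(diff, power=0.80, alpha=0.05, p=P_GIRL):
    """Approximate births per group needed for a two-sided z-test to detect `diff`.

    Uses the usual normal approximation; `power` and `alpha` are illustrative
    conventions, not values taken from any of the studies discussed here.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * p * (1 - p) * (z_alpha + z_power) ** 2 / diff ** 2)


if __name__ == "__main__":
    # Differences in Pr(girl), expressed as proportions (0.005 = 0.5 percentage points).
    for diff in (0.10, 0.05, 0.005, 0.001):
        n = n_per_group_needed(diff)
        print(f"difference of {diff * 100:.1f} points: about {n:,} births per group")

    se = se_difference(1500)
    print(f"SE with 1,500 births per group: {se * 100:.2f} percentage points")
```

Under these assumptions, a few hundred to a couple of thousand births per group are enough for differences of 5 or 10 percentage points, while a difference of 0.5 percentage points calls for well over a hundred thousand births per group, and 0.1 percentage point calls for millions.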
The trouble–and we see it over and over again–is that what gets reported are the statistically significant results, which by necessity will be large: with a standard error this big, only a large estimate can be more than two standard errors from zero. So people do these sex-ratio studies and they find some huge effects. The effects are “statistically significant” and “practically significant.”
What could possibly go wrong? The answer is that (a) it’s not hard to find statistically significant results from pure noise, (b) the signal here is small enough that it can be overwhelmed by biases in data collection, and (c) with such noisy studies, anything that’s statistically significant will automatically be practically significant–indeed, it will be a drastic overestimate of any real effect.
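Point (c) can be illustrated with a small simulation. This is only a sketch under assumptions I am making up for the purpose: a true difference in Pr(girl) of 0.3 percentage points between two groups, 1,500 births per group, and a normal approximation to the sampling distribution of the estimated difference. None of these numbers come from the paper being discussed, and the function name is hypothetical.

```python
import random
from math import sqrt


def simulate_significance_filter(true_diff=0.003, n_per_group=1500, p=0.488,
                                 n_studies=20000, seed=1):
    """Simulate many noisy studies and keep only the 'significant' ones.

    true_diff and n_per_group are illustrative assumptions. Each study's
    estimated difference is drawn from a normal approximation centered on
    the true difference.
    """
    rng = random.Random(seed)
    se = sqrt(2 * p * (1 - p) / n_per_group)
    estimates = [rng.gauss(true_diff, se) for _ in range(n_studies)]

    # The significance filter: keep estimates more than 1.96 SEs from zero.
    significant = [d for d in estimates if abs(d) > 1.96 * se]

    return {
        "se": se,
        "share_significant": len(significant) / n_studies,
        "mean_abs_significant": sum(abs(d) for d in significant) / len(significant),
        "share_wrong_sign": sum(d < 0 for d in significant) / len(significant),
    }


if __name__ == "__main__":
    true_diff = 0.003
    result = simulate_significance_filter(true_diff=true_diff)
    print(f"standard error: {result['se'] * 100:.2f} percentage points")
    print(f"share of studies reaching significance: {result['share_significant']:.3f}")
    print(f"average |estimate| among significant studies: "
          f"{result['mean_abs_significant'] * 100:.2f} percentage points")
    print(f"exaggeration factor: {result['mean_abs_significant'] / true_diff:.1f}")
    print(f"share of significant estimates with the wrong sign: "
          f"{result['share_wrong_sign']:.2f}")
```

In this setup the significance threshold sits at about 3.6 percentage points, so every estimate that passes the filter is at least ten times larger than the assumed true difference, and a sizable share of the significant estimates point in the wrong direction.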
The result is a series of apparently bulletproof statistical results implying large effects, leading to an entire subfield full of noise (as in references 8, 9, 10, 11, and 12 of the above-linked paper). It’s a vicious circle, a perpetual motion machine of apparent success. And the claimed huge effects get amplified in the news media (yeah, I’m looking at you, Freakonomics!), motivating more studies along these lines.
On the plus side, the journalist contacted me first, which injected a note of skepticism into the proceedings. On the minus side, there’s a selection bias by which the credulous media outlets (yeah, I’m looking at you, BBC!) promote the claim uncritically, while the savvier journalists know to lay off it. The net result is that the coverage that does come out is uniformly unskeptical, or nearly so.
To return to the example at hand: if the sex ratio of humans varied like the sex ratios of Seychelles warblers, then these studies would be just fine. The problem is the effect size. Population studies find tiny effect sizes–not zero, but less than 0.5 percentage points in most settings. Studies based on surveys find huge effect sizes, which is what we’d expect from noisy estimates.
I also don’t see any substantive theory behind the claims, other than that it seems very intuitive to people that (1) the sex of a baby should be predictable and (2) sex ratios should vary a lot.
I think the basis for that first intuition is gender essentialism: men and women are so clearly different that it just stands to reason that they should somehow be externally distinguishable (in the pre-ultrasound era) even in the womb, and that the sex of the baby should be influenced in various ways that gender essentialism might explain.
I think the basis for the second intuition is what Tversky and Kahneman called “the law of small numbers”: everybody knows someone with several siblings, all of the same sex. People don’t have a sense of the huge sample size that would be needed to learn anything useful from such data, even setting aside selection and recall issues. Also, gender essentialism: it just stands to reason that parents of boys should be more masculine, in some way, than parents of girls. This bit of intuition can be backed up by endless speculation of the sort that is called evolutionary psychology and which plays well in Freakonomicsland.
Regarding the particular article under discussion: I can’t say for sure they’ve got nothing there. I can just say that I don’t believe it, because their paper falls in a long line of research finding large and statistically significant effects from noisy sex-ratio data. Also it’s a bad sign that they cited several papers from the literature that are definitely full of crap.
I agree with the authors that variation in Pr(girl) is possible–indeed, the variation can’t be exactly zero, as we know that the probability varies by ethnicity, maternal age, and other conditions. I’ve just never seen any evidence that the natural variation would be large enough to be detectable from this sort of study, and I’ve seen lots of studies of this sort that present such claims with a great deal of unwarranted confidence.