From Alpert and Raiffa’s classic 1969 article, A progress report on the training of probability assessors:
In the aggregate, intervals from .25 to the .75 fractiles were too tight. Just as many true values should have fallen outside the interquartile ranges (the .25 to .75 range) as fell inside, but as a matter of fact twice as many fell outside as inside. But it’s not good enough for us to say, “Spread out your interquartile ranges” because there is a lot of variation from question to question and from individual to individual.
I read that article decades ago and it stuck in my mind. This idea of spreading out your ranges . . . it’s tough! The challenge is that we know from many empirical studies that your subjective probability intervals will be too narrow, but if you already know they’ll be too narrow, maybe you’ve already widened them, and you don’t want to make them too wide either.
So what to do?
This problem has bugged me for many years, but now I think I have a solution. OK, not a solution, exactly, but a way into the problem:
Instead of trying to start with the solution and then widening your intervals, start with the procedure that would create the too-narrow intervals and then widen.
So, what’s the procedure that creates a too-narrow 50% interval? It starts with a guess at the unknown quantity. Then make the upper and lower bounds. Don’t try too hard to think about 50% coverage, just get reasonable bounds. Then let’s suppose you’re like the average person, and your interval only has 33% coverage. Going from 33% coverage to 50% coverage, for a normal distribution, would take you from mean +/- 0.43 sd to mean +/- 0.67 sd, i.e., widening the interval by a factor of 0.67/0.43 = 1.56. So take your interval and multiply it by 1.5, and that’s what you can give as your 50% interval. If you want a 95% interval, multiply your 50% interval by 3. And if your unknown quantity is a positive quantity (as is typically the case), do this all on the log scale.
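To make the recipe concrete, here’s a minimal sketch in Python that reproduces those widening factors from normal quantiles (the function name widen_factor is mine, not anything standard):

```python
from scipy.stats import norm

def widen_factor(actual_coverage, target_coverage):
    # The half-width of a central normal interval, in sd units, is the
    # quantile at 0.5 + coverage/2. The widening factor is the ratio of
    # the target half-width to the actual half-width.
    z_actual = norm.ppf(0.5 + actual_coverage / 2)
    z_target = norm.ppf(0.5 + target_coverage / 2)
    return z_target / z_actual

print(widen_factor(0.33, 0.50))  # about 1.58; with rounded quantiles, 0.67/0.43 = 1.56
print(widen_factor(0.50, 0.95))  # about 2.9, i.e., roughly 3
```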
Let’s try an example, something I don’t know but I can easily look up . . . ummm, ok, what’s Tom Brady’s lifetime NFL passing yardage?
I’ll start with a guess. If he passes for 300 yards per game, 16 games a year (I guess they only count regular-season games in the stats, and I think the NFL season was around 16 games during Brady’s career), and 20 full seasons, that’s 300*320 = 96,000 yards. That sounds like a lot, also 300 yards per game would be a high average, so let me guess 50,000 yards. Also maybe he had some off seasons, so, 30,000?
What’s a reasonable range? We already have 96,000 as an upper bound, call it 100,000. As a lower bound, I dunno, 20,000? It’s hard to imagine it being less than that.
OK, on the log scale, the center of the interval is sqrt(20,000 * 100,000) ≈ 45,000. And the two endpoints of the new 50% interval will be 45,000 */ (45,000/20,000)^1.5, that is, 45,000 multiplied and divided by that factor, which gives (13,000, 150,000). So that’s what I’ll go with.
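In case the notation is still opaque, here’s that log-scale widening as a quick Python sketch (widen_log_scale is my name for it, and 1.5 is the factor from above):

```python
import math

def widen_log_scale(lower, upper, factor=1.5):
    # Keep the geometric center fixed and stretch the interval by
    # 'factor' on the log scale, i.e., raise the half-width ratio
    # center/lower to the power 'factor'.
    center = math.sqrt(lower * upper)
    stretch = (center / lower) ** factor
    return center / stretch, center * stretch

print(widen_log_scale(20_000, 100_000))  # roughly (13,000, 150,000)
```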
And now I’ll look it up . . . Tom Brady’s actual lifetime NFL passing yardage is 89,214. Ha! Already included in my original interval. In retrospect I didn’t need to widen it.
Well, that’s just one example. I still like my method. The key idea is to (implicitly) model the process by which we create our intuitive intervals, with their characteristic undercoverage (as discussed by Erev, Wallsten, and Budescu in their 1994 paper, Simultaneous over- and underconfidence: The role of error in judgment processes), and then correct for that.
Too bad Dave Krantz isn’t around, or I’d ask for his take on this.