Well, it seems to me that ....
If you assume Big Blue is perfect, you open Vault B without further thought.
If you know Big Blue has been perfect to date, but suspect that it, like the rest of us, has some non-zero rate of random failures (i.e., it has just been a very good guesser so far), then you can estimate the lower limit of Blue's mean time between failures (MTBF) based on the number of observations and the level of confidence you require.
With 1% confidence, you can say that its MTBF is at least 99.5 (= -1/LN(1-1%)) times the number of past perfect guesses, so you open Vault B knowing that it's highly likely it will guess correctly this time.
If you require 99% confidence (because you are DESPERATE for SOME money), then even if Blue has guessed correctly 100 times in the past, the low end of its MTBF is just 21.7 correct guesses (= 100 × -1/LN(1-99%)), and you open both.
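For anyone who wants to check the arithmetic, here is a rough sketch of that lower bound in Python. It assumes a constant failure rate (exponential model) and zero failures observed so far; the function name `mtbf_lower_bound` is just something I made up for illustration.

```python
import math


def mtbf_lower_bound(successes: int, confidence: float) -> float:
    """Lower bound on MTBF after `successes` guesses with zero failures.

    Assumption: constant failure rate, so the chance of seeing no failure
    in n guesses is exp(-n / MTBF). Setting that equal to (1 - confidence)
    and solving for MTBF gives n * (-1 / ln(1 - confidence)).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return successes * (-1.0 / math.log(1.0 - confidence))


if __name__ == "__main__":
    # Same 100 perfect guesses, two very different confidence requirements.
    for conf in (0.01, 0.99):
        bound = mtbf_lower_bound(100, conf)
        print(f"{conf:.0%} confidence: MTBF >= {bound:.1f} guesses")
```

With 100 perfect guesses this gives roughly 21.7 at 99% confidence, and the multiplier per past guess at 1% confidence comes out to about 99.5.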
EDIT: Your expectation is always higher if you open B.