# Too Good to Be True

In ancient Israel a court of 23 judges called the Sanhedrin would decide matters of importance such as death penalty cases. The Talmud prescribes a surprising rule for the court. If a majority votes for death, then death is imposed, except: “If the Sanhedrin unanimously find guilty, he is acquitted.” Why the peculiar rule?

In an excellent new paper, Too Good to Be True, Lachlan J. Gunn et al. show that *more evidence can reduce confidence*. The basic idea is simple. We expect that in most processes there will normally be some noise, so absence of noise suggests a kind of systemic failure. The police are familiar with one type of example. When the eyewitnesses to a crime all report *exactly* the same story, that reduces confidence that the story is true. Eyewitness stories that match too closely suggest not truth but a kind of systemic failure, namely that the witnesses have collaborated on telling a lie.

What Gunn et al. show is that the accumulation of consistent (non-noisy) evidence can reverse one’s confidence surprisingly quickly. Consider a police lineup, but now consider a more likely cause of systemic failure than witness conspiracy. Suppose that there is a small probability, say 1%, that the police arrange the lineup, either on purpose or by accident, so that the “suspect” is the only one who is close to matching the description of the criminal. Now consider what happens to our rational (Bayesian) probability that the suspect is guilty as the number of eyewitnesses saying “that’s the guy” increases. The first eyewitness to identify the suspect increases our confidence that the suspect is guilty, and our confidence increases when the second and third eyewitness corroborate, but when a fourth eyewitness points to the same man our rational confidence should actually *decrease*.

Even though the systemic failure rate is only 1%, that small probability starts to weigh more heavily the more consistent (less noisy) the evidence becomes. The red line in the graph at right shows (using a 1% systemic failure rate and realistic probabilities of eyewitness identification) that after three witnesses more evidence decreases our confidence, and when more than ten witnesses identify the same suspect we should be less certain of guilt than when one witness identifies the suspect! The yellow line shows how certainty increases when there is no possibility of systemic failure, which is what most people imagine is the case. Notice from the green line that even when the probability of systemic failure is tiny (0.01%) it begins to dominate the results quite early.
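The lineup calculation can be sketched in a few lines of Python. The numbers below (a 50% prior of guilt, a 1% chance the lineup is flawed, a 48% chance a witness identifies a guilty suspect in a fair lineup, and the other rates) are illustrative assumptions for the sake of the sketch, not the exact values used by Gunn et al.

```python
# Posterior probability that the suspect is guilty after n unanimous
# eyewitness identifications, allowing a small chance that the lineup
# itself was systemically flawed. All parameter values are illustrative.

def posterior(n, prior=0.5, f=0.01, hit=0.48, false_id=0.10, biased=0.95):
    """P(guilty | n witnesses all identify the suspect).

    f        -- probability the lineup is systemically flawed
    hit      -- P(witness IDs a guilty suspect | fair lineup)
    false_id -- P(witness IDs an innocent suspect | fair lineup)
    biased   -- P(witness IDs the suspect | flawed lineup), regardless of guilt
    """
    # Likelihood of n unanimous identifications under each hypothesis,
    # mixing over whether the lineup was fair or flawed.
    like_guilty   = (1 - f) * hit ** n      + f * biased ** n
    like_innocent = (1 - f) * false_id ** n + f * biased ** n
    num = prior * like_guilty
    return num / (num + (1 - prior) * like_innocent)

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, round(posterior(n), 3), round(posterior(n, f=0.0), 3))
```

With these assumed numbers, confidence peaks after only two or three unanimous witnesses and then falls back toward the prior: once unanimity becomes too improbable under honest noise, the flawed-lineup hypothesis dominates, and the identifications carry no information about guilt. Setting `f=0` reproduces the no-failure case, where confidence rises monotonically toward certainty.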

What matters is not that the probability of systemic failure is tiny but how it compares to the probability of consistency which, with any reasonable estimate of noise, is itself getting tinier and tinier as evidence accumulates. In another application, the authors show how even the minuscule probability of a stray cosmic ray flipping a bit in machine code can materially reduce our confidence in common cryptographic procedures.

In summary, the peculiar rule of the Talmud receives support from Bayesian analysis: too much consistency suggests systemic failure.