Stephen Stigler has a cool piece on a machine Francis Galton built in 1877 that calculated a posterior distribution from a prior and a likelihood function. Galton's originality continues to astound.
Here is Stigler:
The machine is reproduced in Figure 1 from the original publication. It depicts the fundamental calculation of Bayesian inference: the determination of a posterior distribution from a prior distribution and a likelihood function. Look carefully at the picture–notice it shows the upper portion as three-dimensional, with a glass front and a depth of about four inches. There are cardboard dividers to keep the beads from settling into a flat pattern, and the drawing exaggerates the smoothness of the heap from left to right, something like a normal curve. We could think of the top layer as showing the prior distribution p(θ) as a population of beads representing, say, potential values for θ, from low (left) to high (right)….
…the beads fall to the next lower level. On that second level, you can see what is intended to be a vertical screen, or wall, that is close to the glass front at both the left and the right, but recedes to the rear in the middle.
…The way the machine works its magic is that those beads to the front of the screen are retained as shown; those falling behind are rejected and discarded. (You might think of this stage as doing rejection sampling from the upper stage.)
…The final stage turns this into a standard histogram: The second support platform is removed by pulling to the right on its knob, and the beads fall to a slanted platform immediately below, rolling then to the lowest level, where the depth is again uniform–about one inch deep from the glass in front. This simply rescales the retained beads… the magic of the machine is that this lowest level is proportional to the posterior distribution!
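Stigler's aside about rejection sampling is easy to try out in a few lines. What follows is my own rough sketch, not anything from Stigler's piece or Galton's drawing: I assume a standard normal prior for θ, a single observation of 1.0 with normal noise, and a "screen" whose depth at each θ is the likelihood scaled to a maximum of 1. The function name and all parameters are made up for illustration. Each bead is a draw from the prior, and it is retained with probability equal to the screen depth at its position.

```python
import math
import random


def galton_rejection_sample(n_beads, observation, prior_sd=1.0, noise_sd=1.0, seed=0):
    """Simulate the machine: drop beads from the prior, keep those in front of the screen.

    All names and distributional choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n_beads):
        # Top level: a bead's left-to-right position is a draw from the prior p(theta).
        theta = rng.gauss(0.0, prior_sd)
        # Second level: the screen depth at theta is the likelihood, scaled so its peak is 1.
        screen_depth = math.exp(-0.5 * ((observation - theta) / noise_sd) ** 2)
        # A bead lands in front of the screen with probability equal to that depth.
        if rng.random() < screen_depth:
            kept.append(theta)
    return kept


# Bottom level: the retained beads approximate draws from the posterior.
kept = galton_rejection_sample(200_000, observation=1.0)
post_mean = sum(kept) / len(kept)
post_var = sum((t - post_mean) ** 2 for t in kept) / len(kept)
print(f"kept {len(kept)} beads, mean {post_mean:.3f}, variance {post_var:.3f}")
```

Under these assumptions the exact posterior is normal with mean 0.5 and variance 0.5, so the summary of the retained beads should land close to those values. A histogram of `kept` is the bottom level of the machine.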
Hat tip: The Endeavour.