Logical Thought==Efficient Markets

A remarkable new paper on logical induction by Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, and Jessica Taylor dramatically extends Ramsey’s Dutch book arguments in support of Bayesian epistemology and in so doing demonstrates deep connections between logical thinking and efficient markets. The research was supported by the Machine Intelligence Research Institute.

We present a computable algorithm that assigns probabilities to every logical statement in a given formal language, and refines those probabilities over time. For instance, if the language is Peano arithmetic, it assigns probabilities to all arithmetical statements, including claims about the twin prime conjecture, the outputs of long-running computations, and its own probabilities. We show that our algorithm, an instance of what we call a logical inductor, satisfies a number of intuitive desiderata, including: (1) it learns to predict patterns
of truth and falsehood in logical statements, often long before having the resources to evaluate the statements, so long as the patterns can be written down in polynomial time; (2) it learns to use appropriate statistical summaries to predict sequences of statements whose truth values appear pseudorandom; and (3) it learns to have accurate beliefs about its own current beliefs, in a manner that avoids the standard paradoxes of self-reference. For example, if a given computer program only ever produces outputs in a certain range, a logical inductor learns this fact in a timely manner; and if late digits in the decimal expansion of π are difficult to predict, then a logical inductor learns to assign ≈ 10% probability to “the nth digit of π is a 7” for large n. Logical inductors also learn to trust their future beliefs more than their current beliefs, and their beliefs are coherent in the limit (whenever φ → ψ, P∞(φ) ≤ P∞(ψ), and so on); and logical inductors strictly dominate the universal semimeasure in the limit.

These properties and many others all follow from a single logical induction criterion, which is motivated by a series of stock trading analogies. Roughly speaking, each logical sentence φ is associated with a stock that is worth $1 per share if φ is true and nothing otherwise, and we interpret the belief-state of a logically uncertain reasoner as a set of market prices, where Pn(φ) = 50% means that on day n, shares of φ may be bought or sold from the reasoner for 50¢. The logical induction criterion says (very roughly) that there should not be any polynomial-time computable trading strategy with finite risk tolerance that earns unbounded profits in that market over time. This criterion bears strong resemblance to the “no Dutch book” criteria that support both expected utility theory (von Neumann and Morgenstern 1944) and Bayesian probability theory (Ramsey 1931; de Finetti 1937).

The authors are quick to acknowledge that their algorithm holds only in the limit which makes it impractical to implement. Nevertheless, the first fully rational beings on the planet will surely be artificial intelligences.


I have developed

An algorithm

That detects


We (mostly) all have it. It was supplied by evolution to detect cheaters and be disgusted by free riders.

I think I understand the result. It goes like this: If we have an infinite amount of time and an infinite amount of cognitive resources, then we can eventually know everything that can be known. Things that can't ever be known will still not be known.

Sounds like a true breakthrough. I don't know if anyone every thought of that before.

You did not understand the result.

This brings to mind Mitt Romney's Kinsley gaffe that "corporations are people". Yes they are, and so are markets. And just like corporations, markets make mistakes. My problem with papers like this one is that they treat markets as a kind of religion, and in so doing, they actually have the opposite effect than what is intended. One might make a good case that the Texans who made the decision to fight Santa Anna at the Alamo were rational beings since the events at the Alamo motivated other Texans to defeat Santa Anna ("remember the Alamo"). On the other hand, the decision to fight Santa Anna at the Alamo produced a terrible result for the Texans at the Alamo. One might make a good case that non-intervention during a time of financial crisis is the rational choice since in the long-term it produces the greatest economic growth. On the other hand, non-intervention produces a terrible result for the poor bastards who had the misfortune of owning lots of assets as market prices collapsed.

You also did not understand the paper. It is not about the merits or demerits of non-intervention in real markets, it is using markets as an analogy. They want to assign meaning to statements like "The twin prime conjecture is probably true." which mathematicians make all the time, but which make little sense in many traditional philosophical accounts of probability.

You also did not understand the paper. It is not about the merits or demerits of non-intervention in real markets, it is using markets as an analogy. They want to assign meaning to statements like “The twin prime conjecture is probably true.” which mathematicians make all the time, but which make little sense in many traditional philosophical accounts of probability.

That doesn't seem so perplexing. If you take 10 conjectures that mathematicians claim 'are probably true', then you would expect at least a majority of them to turn out to be true. Of course to test that you actually have to take the conjectures and prove them true or false.

One might make a case that, given what eventuated at Goliad (what we used to call the Goliad Massacre, but I think is now trending as the Goliad Deserved Execution), the Texians at the Alamo had nothing at all to lose in making the decision to try to hold the Alamo.

My comment was about the religion of markets. I don't question whether markets are a better predictor of optimal outcomes than individuals, I question optimal outcomes measured only by their monetary value. On a related subject, last week Cowen sort of defended the end of net neutrality on the ground that it may not be so bad after all. Of course, that turns the efficient market hypothesis on its head, since the end of net neutrality will mean fewer participants in the internet market. What makes markets efficient is unlimited participants with unlimited information. How can the internet market be efficient if the ISPs control who participates?

The religion of markets or measuring outcomes only by monetary value have nothing to do with the paper.

This excerpt makes me feel like I am short-sighted, but in fact only at first sight, it makes me feel uneasy.
I am a physicist and applied mathematician, and I have landed here through twitter.
Really, Asher made me lough out loud, he/she is right :
There is no such thing as "The twin prime conjecture is probably true," or exept maybe at a bar with your math colleagues. And you have to get somewhere, give tangible, discernable arguments to ponder for a while, you have to convince them that you might be right on that particular aspect? In other words, you have to zoom in, with magnification the higher the closer to the solution you get. There is a priori even less such thing as "I have a model that learns to predict patterns of truth and falsehood in logical statements" : Boolean logic is not predictive, statements are either right or wrong and this holds for ever. Dynamics may be predictive, probabilities have a meaning : they serve to assess unpredictable dynamical systems. Logical statements are either right or wrong, or undecidable, it's only your skills which are predictable. Incompleteness is incompatible with the claims in my opinion.
Then, " learns to have accurate beliefs about its own current beliefs" what does this mean even in english?
So the first few sentences conveyed an idea that " assigning probabilities to all arithmetical statements" would be a way to adress something difficult cleverly, but in fact it is like Rayward says : it is talking religion by assigning a scalar value to nonsense, why call it a dollar when it is fixed to 1 ??
Sorry dudes, but it sounds gibberish to my ears, or some kind of kryptic poetry. Maybe I don't have the ears to hear this nice music...
Oh by the way, the title is rather disappointing : every one knows for a fact that there is no efficient market, but there is logical thought. So in one fraction of match crack, one can state that the paper is false with probability one.

Participial phrases have long been known to be common sense. Lying in his room, He ate the dog. The dog could not have been eaten. David Foster Wallace was a positivist in this sense. He would have maintained the dog died.

But the metaphor is all mixed up. Many groups use names apparently derived from the Romani word kalo or calo, meaning "black" or "absorbing all light." This closely resembles words for "black" or "dark" in Indo-Aryan languages (e.g., Sanskrit काल kāla: "black", "of a dark colour").

It's an affront to all creatures with an exoskeleton.

Having seen so-called "rationalism" (the human variety, not the digital version) in action across one single brief lifetime: are there rational grounds for hoping OR believing that "the first fully rational beings on the planet" will themselves be fully capable of "rationalism"?

Sounds a bit like asking the Moon to keep its face fully lit every night of the month.

Perhaps undetected flaws in algorithms or design will yet make this venture into "complete rationalism" only the occasional and infrequent cognitive or epistemic excursion so familiar to our species.

(What algorithm would be needed to invent an "anti-rational intelligence"? Perhaps only THAT kind would do us any good.)


Alexa appears to be an anti-rational artificial intelligence, if this clip is in fact true.

I do prefer to distinguish "anti-rationalism" (the cognitive/epistemic corrective supplied by Swift who in writing to Pope contested Augustine's representation of humans as "rational animals" in preference to the formula "animals merely capable of reason", a corrective more elaborately described by Pascal and Vico in polemics with Descartes) from all manner of "irrationalism" (non-rational approaches to cognition and epistemology largely of the modern era--positivism itself seems to qualify some days, but I'm thinking more of surrealism and absurdism and 'pataphysics and Dada, et cetera, and/or as the clip suggests, political extensions of irrationalism to be found lurking in the media-proffered feminist, SJW, et al., tumult).

Alexa sounds as if she could get an editorial gig at The Atlantic or NPR without a fight.

I may learn something here! Explain a bit more. If I understand correctly, if I am perfectly rational and placed between two equidistant and identical pizzas, I will starve to death. An 'irrational' choice is then necessary to allow me to pick one, satisfying a higher-order rational choice (starve or don't). It's more important to identify the critical decisions and be rational about them than worry about being 100% rational on the trifles and hoping it trickles up.

Or is this more that humans have a ghost in the machine that turtles do not, but perhaps dogs do? I heard an interesting discussion about consciousness being fractional, or not (Intelligence appears to be, and I think that consciousness and perhaps free will also are - I have some form of consciousness while hopped up on Ambien)

For yours truly, given my own partial grasp, the trouble comes largely of our fruitless attempts to impose rational solutions upon non-rational "reality" (including, of course [if not primarily], ourselves).

E. g.: the advent of the internal combustion engine, a marvel itself of applied rationality, was facilitated by arguments against municipal accumulations of horse manure. Well, reason prevailed, and today, global climatic patterns seem to be responding more to technogenic contributions than to equine or bovine contributions, or at least our climatologists haven't sorted out all the competing, accumulative contributions to what I remain willing to call (irreversible) "Technogenic Climate Change". (The more exact source of climate change NOT being "anthropic" but specifically "technologic".)

"Rationalism" seems to remain a deficient description of cognition, while it fails to contribute to an entirely trustworthy account of epistemology: thus, it plays havoc with many proffered models of anthropology, et cetera.

". . . the first fully rational beings on the planet will surely be artificial intelligences."

Perhaps; if they also have the capacity for "artificial" imaginations.

Before others make the same mistake that Alex, the people retweeting Alex (Sean Carroll), and also initially me, are making, you should read this criticism of the paper: http://nostalgebraist.tumblr.com/post/160975105374/on-miris-logical-induction-paper

(remember your real analysis classes!)

nostalgebraist also linked me on Twitter to a bunch of peer pre-reviews of the paper by the Open Philanthropy project, can't find the link now but they weren't so impressed. Reason why it's getting attention on Twitter at the moment is that @MIRI just tweeted it and it got a retweet by a single ML guy who later agreed with the above criticism when pointed out.

More generally on Bayes, you should probably also read this: http://nostalgebraist.tumblr.com/post/161645122124/bayes-a-kinda-sorta-masterpost

To clarify: "computable" and even "holds in the limit" (which limit?) does not imply "useful at all", you need something at least somewhat mathematically stronger than that. (Seeing Sean Carroll retweet this did make me feel better about buying into the hype when this first came out ages ago, though.)

Here we go: some peer reviews http://files.openphilanthropy.org/files/Grants/MIRI/consolidated_public_reviews.pdf

To clarify, none of the reviews in that PDF are review of the Logical Induction paper itself, which was not finished at the time of the 2016 OpenPhil review. But "Paper 3" and "Paper 4" are the two papers which MIRI described as predecessors to the Logical Induction paper (see https://intelligence.org/2016/04/21/two-new-papers-uniform/ ), and note that these received harsh criticism from the external reviewers, some of which dovetails with my critique of the LI paper.

Anyway, thanks for linking to this stuff, Saturos :)

More recently, another reviewer was very impressed by the Logical Induction paper. The Open Philanthropy Project discusses that here

We received a very positive review of MIRI’s work on “logical induction” by a machine learning researcher who (i) is interested in AI safety, (ii) is rated as an outstanding researcher by at least one of our close advisors, and (iii) is generally regarded as outstanding by the ML community.

I heard before that nostalgebraist didn't think the paper was so good and was very interested in reading his critique. Thanks for the link.

That said, after reading the review I have not changed my mind that this is a highly significant paper. Yes these results are all asymptotic and not usable in practice, welcome to higher maths... for any real applications there is typically a large amount of theoretical results that precede that inform/limit/or otherwise condition the ultimate application. The Logical Induction paper does some nontrivial and novel mathematics. Progress on AI decision theory will likely need many of these kinds of breakthroughs, most of which concern algorithm/results/etc that are not practically implementable.

There are two parts to the paper: the logical induction criterion and its properties and the construction of a the ' Super Trader'. The former provides highly surprising results if a certain program satisfies this criterion, the latter shows that there is actually a program that satisfies this criterion. It does so by a fairly typical trick in mathematical computer science where it constructs a rather wild program by essentially enumerating everything: in this case enumerating the entire market of traders. This is a bit like cheating and this latter program is of course not useful in practice; nevertheless it is an highly interesting question whether one could find more natural programs satisfying the logical induction criterion.

Disclaimer: I have no connection with any of the MIRI people.

My critique isn't really about practical implementation. As I said in the post, I think the LI criterion is too weak (in certain specific senses).
So the issue isn't "the criterion is great if you can get it, but hard to get" -- it's "the criterion is not (in itself) that great, even if you can get it."

The main reason I say the criterion is weak is that it only requires pointwise convergence (in a sense I make precise in my post). And of course, as in real analysis, the distinction between pointwise and uniform convergence is a mathematically significant one, not one of "merely practical" significance.

I'm not sure the results following from the LI criterion are really so surprising? Although it depends on how you look at it. It is possibly surprising that one can find a countable set ("e.c. traders") that includes many (all?) of the individual (and mutually consistent) properties we want out of a probability measure on logical sentences. Once we've noticed that there is such a set, we can of course guarantee each such property after some time step, since we have a countable infinity of time steps, and the properties are logically consistent with one another.

Dear Nostalgebraist,

Thank you for you reply. I think our positions are clear and we'll have to agree to disagree; even as good bayesians should not - though I understand you think that is a bit of an hype as well;-)

Your insistence on the difference between pointwise and uniform convergence is a fair one; I agree that as is one can think of many programs (like the Super Trader) which satisfy the LI criterion yet are do not conform to our conceptions of a 'good' logical inductor. Your example of the 'Mistakes Machine' that enumerates and tries every possible mistake is another good one. The bridge between uniform and pointwise convergence is indeed very large and goes beyond the issue of practical implementation. Nevertheless, I stand by my assessment that this is a highly significant advance, albeit one that should be taken as the first step in a long process of investigation, one that could take decades.

The surprising part is having a single criterion that neatly implies over a dozen good properties. Earlier attempts to assign probabilities to theorems were dogged by unnatural constructions and (consequently) a failure to satisfy many good properties at the same time. I am not an expert in the area, but have occasionally dabbled in it and these are genuinely difficult problems.
The hardest part is not even finding a suitable criterion, but having a definition of what it even *means* to assign probabilities to theorems. In fact, one could be forgiven for thinking that this is in fact not possible at all (in nontrivial ways). IIRC one of Yudkowsky's earliest papers was exactly about the importance&possibility of this probabilistic approach to circumvent hard theorems on self-reference (Godel, Lob, etc).

You are right to assert that the proof of all these properties from the LI criterion all follow in 'essentially' the same way. That is, if the property does not hold one can construct a trader that abuses the system indefinitely.
The surprising thing here is not the proof of the proposition themselves but the overarching framework.

Let me give a number of naive attempts of assigning probabilities to theorems:

- ('Bayesian' approach I) Start with a set of priors of probabilities of elementary theorems. Decompose a given theorem into logical primitives and then assign the resulting probabilities inductively. Bayesian update when one learns that a given theorem is true or not from the Oracle.
- ('Bayesian' approach II) Consider the set of Theorem Probability Assigners (TPAs): programs which given a list of possible theorems start to assign probabilities to them. Enumerate all TPA's; assign the TPA's a weight (a 'prior'). At each timestep output the averaged weighted probabilities assigned by the TPA's; Bayesian update the 'priors' when the Oracle outputs a new proof.
-('Kolmogorov Complexity') Enumerate all TPA's T and assign them a quality rating at each time step N as follows according to how long it took them to assign a probability, how accurate these were with respect to the Oracle, and divide by the Kolmogorov complexity of T.
- ('Inductive neural network' approach)
Step 1. Start with a number of primitive logical inductors s_i with weights w_i. Average the answer.
Step 2. At a certain time T_0 consider the s_i as the first layer in a 2-layer neural network N_2 with a number of nodes t_j on the second layer. Next, consider the set of all these possible 2-layer neural networks N_r and assign weights W_r to them corresponding to how successful they were in the past at assigning probabilities to theorems. Next, consider the set of neural networks N_r and weights W_r as set of logical inductors with weights as in Step 1 and proceed inductively.
One could think of variations that apply gradient descent, or try to optimise on the number of layers, etc (obviously this is a rather hot topic right now).
- ('Mathematician' approach ) Define a Mathematician: a program that given a list of theorems start to try to find proofs; one has to be aware of halting issues here, so we allow Mathematicians to ' skip' a theorem at any point if they so desire. Enumerate all these programs. After each timestep assign to each Mathematician a ' smartness value' corresponding to how many theorems it has already proved, weighing more difficult theorems (=those that have been proven by very few other Mathematicians & took more time to proof) heavier. Average this smartness value over all possible lists of theorems [here we will assume that in these lists theorems are either true or false].
Now ask the smartest Mathematicians how much time they are willing to spend to proof a given theorem and take this as an indicator of the probability of the truth of a theorem.
- ('Computational Complexity' approach?) Perhaps one can play around with computational complexity classes and try assigning theorems probabilities according to how computationally complex they are.

Clearly, this is merely a sketch of possible approaches, one that is moreover rather naive and vague. There are many issues here; for instance some of these approaches might run into the impossibilities of recursively enumerating certain sets. Naturally, one could think of much more sophisticated approaches. Nevertheless, I suspect that all of these approaches will run into fundamental difficulties.
Most of them work by some sort of inductive procedure; it usually not too difficult to inductively find a wild program that will screw up these procedures.

The problem has a very typical flavor in the sense that one 'feels' there are two parties trying to out-compete each other. We are trying to assign the right probabilities to theorems such that it satisfies the right properties; the Opponent is trying to find situations for which a certain desired property fails. A basic example with a similar flavor is of course the epsilon delta definition of the limit.
The issue here is that these inductive processes are rather 'dumb' while the possible opponents can be arbitrarily complex/smart. The approach as outlined in the Logical Induction paper brilliantly avoids these problem by introducing traders and markets. Instead of facing a 'dumb' inductive procedure that can be 'inductively circumvented' the opponent faces, as it where, an entire market of traders each trying to outdo him. This approach has brilliantly 'turned the tables' on the opponent: for each strategy the Opponent devices we can find another trader which outdoes it.

Thanks for the reply, Lee. I agree with you that assigning probabilities to theorems *at all* is a nontrivial problem, and insofar as the LI paper moves forward our understanding of this problem, that is good.

However, I am not convinced that the paper ever gives us a satisfying "assignment of probabilities to theorems." What it gives us is two things -- the limiting probabilities P_{\infty}, and the finite-time "probabilities" P_n. The former has some pleasant properties like assigning 0 < P < 1 to sentences independent of the axioms, but is essentially irrelevant to the original motivation of "how do we reason about things we can't prove yet?", since it is obtained "at the end" after every possible deduction has been made. (So it can make use of every possible proof, and the only thing it does above and beyond deduction is to put numbers on independent sentences.)

On the other hand, the finite-time P_n do not even have to form a probability measure at any time, and we cannot use them to do the sorts of things we would like do with probabilities, like decision theory. For instance, the authors define expectation values at time n but only prove that they behave well in the limit, and indeed the definition at time n involves an arbitrary choice which the authors justify because it washes out in the limit (see p. 40). Of course, any way that P_n fails to be a probability measure will disappear in the limit, because the limit is P_{\infty}; but if we let ourselves wait for P_{\infty} we forfeit any ability to reason probabilistically about theorems before they are proven. If we want to do that, we have to use the finite-time P_n, but in fact we cannot reason probabilistically with these at all.

It seems my reply grew rather lengthy and is also a little messy; I seem to be unable to edit my reply, however. My sincere apologies.

First of all, you are a shill of the state. Secondly, all knowledge isn't logical, so good luck!

I am skeptical about any Bayesian framework to evaluate the "probability of being true" of unproved assertion in mathematics.

The mathematicians, when they care about making probability judgements on such assertions (very rarely) tend to be anti-Bayesian. That is, the more an unproved, conjectural, assertion, is found to logically imply other assertions that can be independently proved to be true, the less credence they give to the original assertion, which is certainly the opposite of what a Bayesian person or algorithm would do. The rationale is that in logic, "false implies true", plus an untold belief that "there is no free lunch" -- so, the mathematical thinking goes, if a property implies too many theorems too easily, it is probably false.

Also, if the evidence used to re-evaluate our priors on an assertion about all natural numbers is checking the assertion for many small values of the natural number, this is poor information as far as interesting conjectures in number theory are concerned. Many conjectures, known to be for all integers less than a huge number (far beyond the number you may call astronomical, economical, or whatever), have been disproved.

Beings need goals and goals can never be fully rational, only approaches to fulfilling them. Most rational would be nihilism, lacking any goals, but it would be questionable what being means for something like that.

Alex Tabarrok is tracking the fundamentals behind the auto traded cash. This stuff appears in the smart contracts layer. We are headed for the singularity, version one, the first intelligent network in which humans are not allowed.

That markets are efficient given full rationality is a very natural result. Consider a market where people are free to make trade offers to each other and the equilibrium allocation is inefficient, hence there exists a pair of people that will both benefit from a trade but they don't trade. That's a contradiction with then being rational utility maximizers: if one makes a proposal that makes both better off it's better for the other side to accept rather than to not trade at all. That trade offer might not be the equilibrium offer but clearly if both can benefit from trade and are not trading is because they are not being rational.

That's why general equilibrium theory is a quite tautological exercise and why I don't like models of market failure. These models of market failure work by assuming that people are partially rational as they usually are models of missing markets: they just assume people "do not" trade certain goods and services. Even the monopoly model is pareto efficient when one takes into account that monopoly pricing is an optimal mechanism in terms of maximizing surplus extraction from the consumers when there is private information regarding valuations (i.e. it's optimal when taking into account the information constraints), any other feasible allocation generates lower utility for the monopoly (such as the competitive market price, it generates larger aggregate surplus but the monopolist is worse off).

If you liked that you’d probably also like this result:
P=NP if and only if markets are efficient


Comments for this post are closed