# The Millennium Market Problem

Ritwik Priya sent me an intriguing paper from Philip Maymin arguing that an efficient solution to NP-complete problems is reducible to efficient markets, and vice versa. In other words, the weak-form efficient market hypothesis holds true if and only if P = NP. ~~This result is not published in a peer-reviewed journal (as far as I can tell), but purports to be a remarkable discovery.~~ My bad, it looks like it’s published in *Algorithmic Finance*, which has Ken Arrow and Myron Scholes on its editorial board. I’m still surprised NBER doesn’t point me to any future citations. As Maymin himself notes, he seems to have married the definitive question in finance (are markets efficient?) with the holy grail of computer science (does P = NP?).

There are reasons to be skeptical, and a lot of questions to ask, but first I want to summarize the paper. I Googled around to see where the paper showed up, and, maybe not surprisingly, this MarginalRevolution post was a top hit. But Tyler Cowen seems unfairly dismissive when he says, “Points like this seem to be rediscovered every ten years or so; I am never sure what to make of them. What ever happened to Alain Lewis?”

I don’t know much about this Alain Lewis. I can see he has written papers like “On turing degrees of Walrasian models and a general impossibility result in the theory of decision making”. I can’t even pretend to understand the abstract. On the other hand, reading Maymin’s paper didn’t really change my mind about efficient markets, but it gives an intriguing example of market capabilities. Anyone with thirty minutes and a mild interest in computer science should read the whole paper, because I think it offers a very good heuristic for understanding the debate over the EMH itself, even if it does not resolve it.

Indeed, some commenters on the net (at MarginalRevolution and elsewhere) joke that this paper is just another guy hating on the EMH. They say this, of course, because they have an incredibly high subjective belief that P ≠ NP (more on this later). They cannot have read the paper, because this charge disregards the fact that the author is an avowed libertarian who cites Hayek favorably within it.

Before I give a brief explanation of Maymin’s proof, I will add that I am skeptical, as this result seems not to have been replicated (with regard to his empirical evidence) in any prominent journal, economic or mathematical. While one may cite the “natural conservativeness” of the former profession as an explanation, the proof is simply too cool not to receive more attention. My understanding of theoretical computer science is limited, but to the extent that I am a good judge, the paper makes sense on first read. (Strike one against comparing him to Alain Lewis, whose very titles make me shiver?) I do have some quibbles, which I note along the way.

**A Summary**

Maymin ventures to prove biconditionality between the weak form of the EMH and P = NP. He notes this would be an interesting result, as the majority of financial economists have a fair degree of belief that markets are efficient, contrasted with computer scientists who very much doubt that P = NP. (It is this relationship that I will critique, but more on that later.) The first part of the proof shows that efficient markets imply that P = NP.

The weak-form of the EMH asserts the following:

- If a certain pattern – such as Amazon always moving with Google at a seven-day lag – is observed, it will immediately disappear as the market incorporates this information into prices.
- Because markets are informationally efficient, said pattern will be found immediately.

Richard Thaler calls these, respectively, the “no free lunch” and “price is right” claims of the EMH. Maymin’s result suggests that for the latter to be true, there must exist polynomial-time algorithms for NP-complete problems (specifically, the Knapsack problem). We assume there are *n* past price changes, (1) for UP and (0) for DOWN. We take it that a given investor can hold a long, short, or neutral position at each price change; the total number of strategies is therefore 3^*n*. We note that *verifying* whether a given strategy earns a statistically significant profit requires only a linear pass through the *n* past price changes; the search problem is hence in NP. (That is, given some model, is there a 95% chance that your strategy beats a monkey throwing darts at the *WSJ* over coffee each morning?) Remember that this whole thought experiment is an exercise in finding some pattern of ups and downs, associated with some future ups and downs, which will always hold true and hence may be exploited.
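To make the asymmetry concrete, here is a minimal sketch (my own toy encoding, not the paper’s) of the easy half: checking a *fixed* strategy against *n* past price changes takes one linear pass, while searching all 3^*n* strategies is exponential.

```python
# Toy verification step: positions are in {-1, 0, +1} (short/neutral/long),
# price changes are +1 (UP) or -1 (DOWN). One O(n) pass scores a strategy;
# the hard part is searching the 3^n possible position sequences.

def strategy_profit(positions, changes):
    """Profit of a fixed position sequence against a sequence of price changes."""
    assert len(positions) == len(changes)
    return sum(p * c for p, c in zip(positions, changes))  # a single linear pass

changes   = [+1, -1, +1, +1, -1]   # UP, DOWN, UP, UP, DOWN
positions = [+1,  0, +1, +1, -1]   # long, out, long, long, short

print(strategy_profit(positions, changes))  # 4
```

A full significance test would compare this profit to the distribution under random positions, but the point stands: verification is cheap, search is not.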

Maymin notes that in practice, popular quantitative strategies are based on momentum, and hence on some fixed lookback window *t*. He notes the joint-hypothesis problem from Fama (1970): the EMH says we cannot, given some equilibrium model with a normal profit *K*, earn in excess of *K* for a period of time. He resolves quite well what I find to be an important debate among EMH skeptics; namely, how to reasonably search across the 3^*n* possible strategies. Some argue that we should stick strictly to economic theory, others submit blind data mining, and others still machine learning. Maymin notes that this is irrelevant to the question at hand, as the employed search strategy is endogenous to the determination of *K*.

Maymin agrees that determining whether a strategy iterated on *one* asset can earn supernormal profits takes polynomial time. However, he notes that under the assumptions that (a) investors do not have infinite leverage and (b) they operate under a budget constraint, answering the question “does there exist a *portfolio* of assets earning supernormal profits within our budget?” is akin to solving the Knapsack problem.

For those who do not know, the Knapsack problem – a canonical introduction to discrete optimization and intractable problems – asks one, given *n* items represented as {value, size}, to maximize the total value while keeping the total size under a constraint *C*. In the analogy, size is the price of an asset following a *t*-length time series, where *t* is the lookback window; value is the future return of the asset following the same time series; and the strategy on each asset is picked from a *t*-sized set *U* with {long, neutral, short} as possible options. Hence, in Maymin’s words, “the question of whether or not there exists a budget-conscious long-or-out strategy that generates statistically significant profit relative to a given model of market equilibrium is the same as the knapsack problem, which itself is NP-complete. Therefore, investors would be able to quickly compute the answer to this question if and only if P = NP.”
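For concreteness, a textbook 0/1 knapsack solver, with the analogy spelled out in comments (the numbers are invented for illustration, not drawn from the paper):

```python
# Each "item" is (value = expected future return of an asset-strategy pair,
# size = its price), and capacity is the investor's budget. This dynamic
# program runs in O(n * C) time -- pseudo-polynomial in the budget C, not
# polynomial in the input size; no truly polynomial algorithm is known
# unless P = NP.

def knapsack(items, capacity):
    """items: list of (value, size). Returns max total value within capacity."""
    best = [0] * (capacity + 1)
    for value, size in items:
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, size - 1, -1):
            best[c] = max(best[c], best[c - size] + value)
    return best[capacity]

assets = [(60, 10), (100, 20), (120, 30)]  # (future return, price), invented
print(knapsack(assets, 50))  # 220: take the assets priced 20 and 30
```

The pseudo-polynomial caveat matters: with prices expressed in, say, hundredths of a cent, *C* itself blows up exponentially in the number of digits.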

Maymin concludes that this algorithm is exponential in *t*, the size of the lookback window. He suggests that because *t* grows linearly with *n* (the total time series of all history), markets become intractable rapidly. I must quibble, theoretically if not empirically (where the evidence seems soundly in Maymin’s favor). Is there reason to assume that *t* ~ *n*? Is it not possible that, for asymptotically large *n*, *t* ~ log *n*? Indeed, for the market as a whole, if this were the case the problem would be linear in time. Empirically, however, linearity seems a fair assumption. I might add that time series analyses are restricted by the assumption of stationarity; in the future, the window over which stationarity may reasonably be assumed might grow more than linearly relative to today. This would work in Maymin’s favor.

I have not yet explained *why* this means markets are efficient if and only if P = NP. Let’s say there is a group of investors searching through the total strategy set *U*, which is 3^*n* in size, for a supernormally profitable strategy. Let’s say that, by some miracle, on one of my first guesses I happen to find one such strategy. If P = NP, theory suggests that most everyone else will also immediately find this strategy, and hence it will be “priced into the market”.

However, if P ≠ NP, it might take *years* for someone else to find the same strategy, allowing me to earn a supernormal profit for a long period of time. This would render even the weak form of the EMH false. What are the empirics in favor of this idea? Well, this is something that probably deserves further research and I’m not happy with what’s provided, but Maymin cites Jegadeesh and Titman (1993) as a plausible example. Jegadeesh and Titman are credited with developing an investment strategy based on market *momentum*. Their strategy was largely unknown in the experiment window (1965–1989) and therefore not priced into the market. Maymin’s result would suggest that this strategy becomes increasingly effective against the market as other participants contend with a linearly growing event set using an exponential-time algorithm. He offers backtested returns (his Table 1) as evidence.

I don’t see it as such. First, assuming stationarity across 1927 to 1989 is incredibly iffy. Second, backtesting a current strategy against *historical* trends tells us what? I am positive I can also find some strategy (not momentum-6) which shows just the opposite. *So what?* Rather, Maymin touches on the empirical evidence that *would* work in his favor. That is, NASDAQ was added to the data set in 1972, vastly increasing the number of data points. If some strategy earned supernormal profits, it would be exponentially harder to mine *after* the inclusion of the data. To the extent that the strategy remained broadly unknown, its performance against the market should increase relative to baseline after 1972. But he doesn’t cite this data.

On the one hand, I’m glad he offers the correct framework on which to make his prediction falsifiable. On the other, presenting the data from “Table 1” as support for his hypothesis seems somewhat sly. I read this part quite favorably on my first parse, but the dataset employed is obviously the wrong one for the hypothesis he attempts to prove.

**The Corollary**

Just as interestingly, and more convincingly, Maymin argues that an efficient market implies that P = NP. To do this, he assumes that markets allow participants to place order-cancels-order transactions. I can say that I want to buy the *Mona Lisa* if it falls to $10, or sell *David* if it hits $15, but as soon as market conditions are such that one order is fulfilled, the other is automatically cancelled. We must actually assume that such orders with *three* items may be placed. Computer science nerds will know where this is going. Maymin wants to program the market to efficiently solve 3-SAT, quite literally the mother of NP-complete problems. It is beyond the scope of this post to explain the dynamics of the problem; it is enough to know that an efficient solution to it would yield efficient solutions to many other intractable problems, including factoring large numbers and hence breaking into your bank account.

The logical form of the problem is as follows:

Let *y* = (a | b | !c) & (b | !d | a) & (z | m | a) & … & (a | b | !m), where all variables are boolean

Within the literature, this is known as “conjunctive normal form”. Each parenthetical phrase is a *clause*, consisting of a disjunction of three “literals”. Solving 3-SAT involves finding an assignment (true or false) for each variable such that the whole statement is true, or determining that no such assignment exists. All known algorithms for 3-SAT are, in the worst case, exponential in the size of the instance.
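A brute-force solver fixes ideas, and shows exactly where the exponential blow-up lives (the encoding below is my own toy convention, not anything from the paper):

```python
# Brute-force 3-SAT: try all 2^n truth assignments. A clause is a tuple of
# signed integers: +i means variable i, -i means its negation. A clause is
# satisfied if at least one of its literals evaluates true.

from itertools import product

def solve_3sat(clauses, n_vars):
    """Return a satisfying assignment as {var: bool}, or None if unsatisfiable."""
    for bits in product([False, True], repeat=n_vars):  # 2^n_vars candidates
        assign = {i + 1: bits[i] for i in range(n_vars)}
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assign
    return None

# (a | b | !c) & (!a | c | b), with a=1, b=2, c=3
clauses = [(1, 2, -3), (-1, 3, 2)]
print(solve_3sat(clauses, 3) is not None)  # True
```

The `product` loop is the whole story: each extra variable doubles the search space, which is why no market-sized instance can be ground through naively.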

We can think of each clause as an order-cancels-order (OCO) option consisting of three possible transactions. A literal can imply a purchase and a negated literal a sale, or vice versa. Now let us price each asset (literal) at the midpoint of the bid-ask spread. Each order therefore carries a supernormal expected profit, and it will be immediately arbitraged away if markets are efficient.

Once we place the set of OCOs, they should all be executed within an arbitrarily small time period, as each by itself is a contradiction of the “no free lunch” condition of efficient markets. In fact, *each* of the OCOs must be executed to maximize profits, and that is what proponents of the EMH suppose markets do. Maymin leaves implicit the assumption that the time it takes to clear his transaction on the open market may not be instantaneous, but falls within the weak EMH bounds of “as quickly as possible”. I would say this is a big ding to the theoretical equivalence of the two efficiencies (as he does not offer a mathematical or logical reason why one must be the other, but merely equates the terminology). I wish he had made this clearer. But I still think it’s an important result, because the EMH would be a toothless truth if “as quickly as possible” included the years it would take to solve a sufficiently large 3-SAT instance. Even without the theoretical equivalence, the structural similarities are striking.

Note that the example I provided above is rather easy, as there are almost as many variables as there are clauses. In reality, the question is a lot harder. Therefore, a market mechanism that “solved” such a set of OCOs in a polynomially-small time period would have done something remarkable. If I want to solve some given 3-SAT problem, I just pick a stock for each variable, encode negated literals as sell orders and positive literals as buy orders, and place the total transaction, which must then be completed in the profit-maximizing manner.
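The clause-to-order mapping can be sketched in a few lines. This is my own illustration of the construction described above; the tickers and the `(side, ticker)` representation are invented for the example:

```python
# Map one 3-SAT clause to an order-cancels-order bundle: one stock per
# variable, positive literal -> buy order, negated literal -> sell order.
# Executing any one leg "satisfies" the clause; the OCO semantics cancel
# the other legs, just as satisfying one literal satisfies the disjunction.

def clause_to_oco(clause, tickers):
    """clause: tuple of signed variable indices; tickers: {index: symbol}."""
    return [("BUY" if lit > 0 else "SELL", tickers[abs(lit)]) for lit in clause]

tickers = {1: "AAPL", 2: "GOOG", 3: "AMZN"}  # arbitrary stand-ins
# (a | b | !c) becomes: buy AAPL, or buy GOOG, or sell AMZN
print(clause_to_oco((1, 2, -3), tickers))
# [('BUY', 'AAPL'), ('BUY', 'GOOG'), ('SELL', 'AMZN')]
```

A whole 3-SAT instance is then just one such OCO per clause, submitted together; if the market clears the profit-maximizing subset “as quickly as possible”, it has solved the instance.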

I found this result far more compelling than the first; perhaps this reflects my propensity for computer science over finance.

**Discussion**

I’ve read quite a bit on how Walrasian equilibria are NP-hard, and the like. There seems to be a lot of literature relating economic *games* and *equilibria* to computational tractability. The question of the EMH is inherently different, and logical – not mathematical – in nature. The symmetry between the two fields here is one-to-one, even self-evident. So I disagree with Cowen’s quip that stuff like this comes up once every ten years. I can’t put my finger on it, but the previous literature suggesting such similarities had more to do with solving for some equilibrium that’s hard, rather than with processing the market itself.

Specifically, this is why I find the second result (reducing markets to 3SAT) to be mind-boggling.

Regardless, something in my gut rejects the logical biconditional between P = NP and the EMH. However, I think this result supports the idea that one should form his priors on both with a similar heuristic (which may yield a different result for either depending on the heuristic used).

For example, Maymin notes the contradiction between most finance professionals believing the EMH to be true and most computer scientists rejecting that P = NP. Let’s take the latter case. How exactly can one form a prior that P ≠ NP? Well, one can believe that when the problem is solved it will be in favor of P ≠ NP. But that’s just turtles all the way down. A better way of expressing such a prior would be: “I believe that if I asked a magical black box ‘does P = NP?’ the resulting answer would be in the negative”. You agree that’s a fair way of forming a subjective belief, right? This belief can be formed on any number of things, but for most computer scientists it seems to rest on the radical implications of a positive result (breaking RSA cryptosystems, etc.).

But to form such a prior, you must accept that in a Popperian world such black boxes can exist. However, the existence of any such magical truth-teller is *ipso facto* a stronger and more absurd proposition than P = NP. Therefore, this question is not in any normal sense falsifiable (other than via the standard turtles-all-the-way-down route, which speaks only to the *resolution* of the problem rather than its true implications).

I would argue that even if the biconditional between P = NP and the EMH does not hold, for whatever reason, the structural analogy does. That is to say, holding that the EMH is falsifiable would be akin to believing in such magical black boxes. It is better not to form any priors regarding efficient markets, *as a general rule*. Better to ask whether certain *forms* of markets are efficient.

The analogy to computer science holds even here. Though most NP-complete problems are in the *worst case* exponential in scale, heuristics and randomized approximations allow computer scientists to design remarkably efficient solutions for *most* cases, or the *average* case. This is scientifically falsifiable, and the correct question to ask. Similarly, we may talk about the informational efficiency of a *practical* market, within certain bounds, granted certain approximations. And, crucially, granted some margin of error and the risks thereof. What are the chances a randomized input to a backtracking Knapsack solver will fall into exponential time? What are the chances a market will fail to weed out inefficiencies given a level of technology? Indeed, Maymin suggests that it is such approximations and shortcuts that make a market perhaps inefficient, but a government even more so. He compares this result to Hayek’s famous economic calculation problem, which suggests something similar.
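The heuristics-versus-worst-case point can be made concrete with knapsack itself. Below is a standard greedy density heuristic (the fallback to the best single item gives the classic 1/2-approximation guarantee); the item numbers are invented for illustration:

```python
# Greedy knapsack heuristic: fill by value density, then compare against the
# best single item. Fast (O(n log n)) and guaranteed at least half the
# optimum -- a stand-in for the "most cases / average case" efficiency that
# markets might plausibly achieve without solving the worst case.

def greedy_knapsack(items, capacity):
    """items: list of (value, size). Returns a heuristic total value."""
    total, used = 0, 0
    for value, size in sorted(items, key=lambda vs: vs[0] / vs[1], reverse=True):
        if used + size <= capacity:
            total += value
            used += size
    best_single = max((v for v, s in items if s <= capacity), default=0)
    return max(total, best_single)

assets = [(60, 10), (100, 20), (120, 30)]  # (value, size), invented
print(greedy_knapsack(assets, 50))  # 160 -- the true optimum here is 220
```

The gap between 160 and the optimum of 220 is exactly the kind of bounded, quantifiable inefficiency the falsifiable question should be about.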

To me, this is an absolutely non-trivial comparison. After reading this paper, I genuinely believe that it is futile to guess whether P = NP (or, practically, whether, when the question is resolved, P *will equal* NP), or whether markets are efficient. However, according to the St. Louis Fed, this paper has never been cited. (I intend to change that, one day.)

Within appropriate bounds, Maymin’s paper illuminates not necessarily how we view efficient markets, but how we view the debate thereof.

This paper has the germ of something interesting, but its most striking feature, to me, is the idea that an idealized but nevertheless fundamentally empirical statement can be proven to be equivalent to a mathematical statement. This says a great deal about the lack of scientific, as opposed to mathematical, content in economics. Of course, perhaps some mathematical statement named the EMH might be equivalent to P = NP, but this is of little interest to economics.

The real connection between efficient markets and NP-hard problems is far more interesting. The notion that a dynamical system will equilibrate quickly is at the core of the idea of efficient markets. Now, a market consists of many units interacting with one another. Simple models of such systems do indeed equilibrate quickly. However, it is well known that once we move beyond these simple systems, and start to consider systems subject to complicated external forces, especially forces that affect the interactions between the units, the behaviour is not so simple. The approach to equilibrium can be very slow: the system ‘freezes’ on the timescale at which the individual units would normally change, sitting in a locally optimal but globally far-from-optimal state, with transitions between such states occurring rapidly. Or critical situations can be reached, with avalanches of change of many different sizes. Sornette and many others have built such models and demonstrated their relevance.

Given these complex types of behaviour in quite simple models, how the notion of efficient markets is of interest to anyone is unclear; casting doubt on it seems a waste of time, like casting doubt on the idea of a perpetual motion machine. The connection with NP-hard problems is then just that finding the equilibrium states of such models is provably NP-hard. This of course says nothing about P = NP, unless we assume that the model is an accurate representation of the real world – fine – and that the real world is polynomial – whatever that might mean – yet can nevertheless find such equilibrium states; this is the argument of Maymin’s paper.
