Investment and optimal error rates: moonshots and mundanes

The main point of this note is to suggest that an investor should have a higher tolerance for false positive classifications when selecting shares with right-skew and/or fat-tailed returns (potential “moonshots”). 

Terminological preamble

Statistical discussions of hypothesis testing commonly refer to “false positive” (Type I) and “false negative” (Type II) errors. The term “error” is value-neutral in statistical testing, but may be unhelpful in a portfolio selection context, because it might be misread as insinuating some analytical mistake on the part of the investor.  To avoid this blameworthy connotation, I will use the (slightly) more neutral term “disappointment.”   

I seek to avoid ex-ante errors of analysis, whilst recognising that there is an optimal rate of ex-post disappointments. (I can’t think of a succinct word pair for the distinction I am stressing here: on the one hand blameworthy ex-ante errors, and on the other hand blameless ex-post disappointments.  This seems an interesting linguistic lacuna.)

Portfolio selection as a classification problem

Portfolio selection can then be viewed as a classification problem, with “positives” being shares which are added to (or retained in) your portfolio, and “negatives” being shares which are rejected (or sold). Normally in classification problems you seek to minimise the error rate, defined as a weighted sum of false positives and false negatives.  The two weights – one for false positives, and one for false negatives – are set according to the cost of each type of disappointment.
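The weighted sum can be sketched in a few lines of Python. The function name, the counts and the weights below are all hypothetical, chosen only to illustrate how the weights change which classifier looks better; they are not taken from any real test.

```python
def weighted_error_cost(false_positives, false_negatives, fp_weight, fn_weight):
    """Weighted sum of the two kinds of disappointment."""
    return fp_weight * false_positives + fn_weight * false_negatives


# Hypothetical weights: a false negative is assumed to cost ten times
# as much as a false positive.
FP_WEIGHT, FN_WEIGHT = 1, 10

# A strict classifier (few false positives, many false negatives)
# versus a loose one (many false positives, few false negatives).
strict_cost = weighted_error_cost(5, 10, FP_WEIGHT, FN_WEIGHT)
loose_cost = weighted_error_cost(30, 2, FP_WEIGHT, FN_WEIGHT)

print(strict_cost, loose_cost)
```

With these assumed weights the loose classifier has the lower cost, even though it makes more errors in total.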

To give an example from medicine: if a disease is potentially fatal but has a reliable and safe treatment, then false negatives are costly, and false positives are benign.  Hence we should tolerate a higher rate of false positives (eg pap smear testing for cervical cancer in young women).   On the other hand, if a condition is often inconsequential and the treatment tends to be worse than the disease, converse payoffs would apply and we should tolerate a higher rate of false negatives (eg PSA antigen testing for prostate cancer in elderly men). 

In portfolio selection, false positives and false negatives cannot sensibly be identified with just two payoffs.  Instead each type of disappointment generates a return distribution. The nature of that distribution depends on what type of share it is: a moonshot or a mundane. The graphs below show the return distribution for an individual stock drawn at random from each class. 

Moonshots are shares with the potential for very high growth: shares which might “go to the moon”. Say Apple in 1997, or ASOS in 2003, or QXL in 2005 (of course examples are easy to identify ex-post!).  Potential moonshots are few in number. Most moonshots fizzle out (call these “duds”).  Actual moonshots are very rare (call these “hits”).

Mundanes are (unleveraged versions of) housebuilders, manufacturers, engineers – reliable businesses, but not plausible moonshots.  Mundanes are plentiful, and easy to recognise.  The nature of their operations makes long-term high growth unlikely.  Compared to potential moonshots, mundanes produce outcomes which are less extreme, and more evenly distributed between “hits” and “duds”.

Is portfolio selection more like pap smear testing (false negatives are more costly), or PSA antigen testing (false positives are more costly)?  The answer depends on the type of share: false negatives have much higher (opportunity) costs for moonshots than mundanes.  For moonshots, a false negative means we miss one of the very few Apple-like shares which could make a big difference to our portfolio return. For mundanes, false negatives are not much of a problem, because a single missed mundane won’t make much difference to our portfolio return, and there are always plenty more largely interchangeable mundanes we could include.   

This implies we should apply a heavier penalty to false negatives (and therefore necessarily also accept more false positives) when classifying moonshots than when classifying mundanes.

Toy example: portfolio selection from moonshots

Suppose that the entire class of moonshots comprises 100 shares, with these payoffs: 90 out of 100 go bust, and the other ten rise to 3x their cost. 

Assume we select 10 shares for our portfolio, and make equal investments in each of them.

If we select at random, we expect one hit among our ten shares, so the expected portfolio return is a loss of 70% (0.1 x 3 x 1 + 0.1 x 0 x 9 = 0.3 of the amount invested).

But if we can manage to select four true moonshots, the expected portfolio return becomes a positive 20% (0.1 x 3 x 4 + 0.1 x 0 x 6 = 1.2).  With five true moonshots, it’s 50%. With 6, 7, 8, 9 or 10, it’s 80%, 110%, 140%, 170% or 200% respectively.

So in this (extreme) example, we can have a false positive rate as high as 60% when selecting moonshots (six duds among our ten selections), and yet still generate a positive portfolio return.
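The moonshot arithmetic above can be checked with a short Python sketch. The function and variable names are my own labels; the payoffs (hits worth 3x cost, duds worth nothing, ten equal investments) are those of the toy example.

```python
def portfolio_return(n_hits, n_selected=10, hit_multiple=3.0, dud_multiple=0.0):
    """Return of an equally weighted portfolio, as a fraction of the amount invested."""
    n_duds = n_selected - n_hits
    end_value = (hit_multiple * n_hits + dud_multiple * n_duds) / n_selected
    return end_value - 1.0


# Random selection: 10 of the 100 shares are hits, so we expect 1 hit in 10 picks.
random_return_pct = round(portfolio_return(n_hits=1) * 100)

# Returns (in percent) for 4 to 10 hits among the ten selections.
returns_by_hits = {hits: round(portfolio_return(hits) * 100) for hits in range(4, 11)}

print(random_return_pct)
print(returns_by_hits)
```

Each extra hit adds 30 percentage points to the portfolio return, which is why even four hits in ten are enough to turn a profit.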

Would it be better to tighten our criteria, and so reduce the false positive rate below 60%?  That would reduce false positives (duds). But it would also increase false negatives (missed hits). Because hits are so rare, the combination of reducing false positives and increasing false negatives may produce a portfolio with a lower fraction of hits, and hence a lower expected return.

Toy example: portfolio selection from mundanes

Suppose the entire class of mundanes comprises 500 shares, with these payoffs: 250 give a 20% loss, and 250 give a 30% gain. (Note that there are 5 times as many mundanes as potential moonshots.)

As before, we select 10 shares for our portfolio, and make equal investments in each of them.

If we select at random, the expected portfolio return is 5%.

If we select mundanes with a 60% false positive rate – the same disappointment-tolerant strategy which produced a positive 20% return from the potential moonshots class above – then our expected return is nil (0.1 x 0.8 x 6 + 0.1 x 1.3 x 4 = 1.0).  In this case, a 60% false positive rate produces a worse result than random selection (nil versus 5%); the disappointment-tolerant selection strategy which worked for moonshots doesn’t work for mundanes.

To achieve a positive return from mundanes, we need to penalise false positives more heavily. Say we tighten our selection criteria to reduce the false positive rate to 40%. Then our expected return is 10% (0.1 x 0.8 x 4 + 0.1 x 1.3 x 6 = 1.10).  With false positive rates of 30%, 20%, 10%, and 0%, the expected returns are 15%, 20%, 25%, and 30%.
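The same check can be run for mundanes. Again this is only a sketch with names of my own choosing; the payoffs (hits worth 1.3x cost, duds worth 0.8x cost, ten equal investments) come from the toy example.

```python
def mundane_return(n_duds, n_selected=10, hit_multiple=1.3, dud_multiple=0.8):
    """Return of an equally weighted portfolio of mundanes, as a fraction of the amount invested."""
    n_hits = n_selected - n_duds
    end_value = (hit_multiple * n_hits + dud_multiple * n_duds) / n_selected
    return end_value - 1.0


# Returns (in percent) keyed by the number of duds among the ten selections,
# i.e. false positive rates of 60%, 50% (random selection), 40%, ... 0%.
returns_by_duds = {duds: round(mundane_return(duds) * 100) for duds in range(6, -1, -1)}

print(returns_by_duds)
```

Each dud replaced by a hit adds only 5 percentage points here, against 30 for moonshots, so the payoff to finding one more hit is much smaller.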

By tightening our selection criteria, we also probably increase the false negative rate – that is we reject some hits. But we don’t care much, because no single mundane makes a big difference to the portfolio, and there are always hundreds more mundane hits to find.

Other observations

In advocating raised tolerance for false positives when selecting potential moonshots, I am not saying that we should set out to make careless judgments.  We should strive to avoid ex-ante errors of analysis; but we also need to accept that even diligent judgement may lead to a high rate of ex-post disappointments, and we need to be comfortable with this pattern of outcomes.   

A problem with advocating higher tolerance for false positives for selecting potential moonshots is that shares are not labelled as belonging to one or other of these categories.  The categorisation is itself a matter of judgment. I have no solution to this.

How do we increase the false positive rate for moonshots?  The most obvious way is just to be (a little) more credulous when assessing potential moonshots, and conversely for mundanes.  To formalise this, one can use looser requirements for current financial metrics when assessing moonshots. 

Another way might be to apply an inclusive checklist for potential moonshots, and a disqualifying checklist for mundanes. 

By inclusive checklist I mean that the presence of certain positive features (say a management team with exceptional previous start-up success) guarantees inclusion in the portfolio largely irrespective of any other concerns. By disqualifying checklist I mean that the presence of certain negative features (say Debt > 3 x EBITDA, or large share sales by insiders) guarantees exclusion from the portfolio, irrespective of any other merits of the company.
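One way to picture the two checklists is as overrides on an ordinary verdict. The sketch below is purely illustrative: the feature labels, the function names and the idea of a `base_verdict` (the decision we would otherwise have reached) are assumptions of mine, and it simplifies "largely irrespective" to an absolute rule.

```python
# Hypothetical feature labels, based on the examples given in the text.
INCLUSIVE = {"management_with_exceptional_startup_record"}
DISQUALIFYING = {"debt_over_3x_ebitda", "large_insider_sales"}


def assess_moonshot(features, base_verdict):
    """Inclusive checklist: any listed positive feature forces inclusion."""
    return True if features & INCLUSIVE else base_verdict


def assess_mundane(features, base_verdict):
    """Disqualifying checklist: any listed negative feature forces exclusion."""
    return False if features & DISQUALIFYING else base_verdict


# A potential moonshot we would otherwise reject is included anyway.
moonshot_in = assess_moonshot({"management_with_exceptional_startup_record"}, base_verdict=False)

# A mundane we would otherwise accept is excluded anyway.
mundane_in = assess_mundane({"large_insider_sales"}, base_verdict=True)

print(moonshot_in, mundane_in)
```

The asymmetry is the point: the inclusive checklist can only add false positives, and the disqualifying checklist can only add false negatives.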

One extant manifestation of my suggested strategy “fatter-tailed and/or right-skew returns => be more tolerant of false positives” is the tech start-up sector.  For angel investors in tech start-ups, most investments are duds, but they hope to more than make up for this with a few big hits.  Peter Thiel (of PayPal / Facebook fame) suggests that to a first approximation, angel investors achieve a positive portfolio return only if their single best investment ends up being worth more than all the others combined.  

Update (21 October 2012): Paul Graham makes much the same point: angel investors in tech start-ups are Black Swan farming.
