Fractional shares and dodgy pie charts

When fractional shares are ranked by size and grouped into buckets, any ratio of successive bucket means above 0.5 is suspect

Listening to this interview with John Hempton reminded me of his forensic scepticism about the following pie chart, taken from an investor presentation about the pharma company Valeant.


John wrote (in 2014, when Valeant was still riding high) that the chart looked implausible:

“What we are saying is the top ten products average 1.8 percent of sales each. The next ten average 1.2 percent.

Now lets just - for the sake of argument suggest that the top product is 3 percent of sales. And the next four average 2.3 percent. Well then the first five are 12.2 percent of sales - and the next five can only be 5.8 percent of sales (or they average 1.16 percent of sales).

But oops, that is too much concentration - because we know the next ten average 1.2 percent of sales. In fact it is far too much concentration as product number 11 needs to be more than 1.2 percent of sales.

I have fiddled with these numbers and they imply a distribution of sales flatter than I have ever seen in any product or category. In order to make the numbers work the differences between product sales have to be trivial all through the first fifteen products.”
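The arithmetic in the quoted passage can be checked in a few lines. This is only a sketch of the quoted example: the 3% and 2.3% figures are the guesses made "for the sake of argument" in the quote, and the function name is made up for illustration.

```python
# Figures from the quoted passage, in percent of total sales.
TOP_TEN_MEAN = 1.8    # mean share of products 1-10
NEXT_TEN_MEAN = 1.2   # mean share of products 11-20


def remaining_five_mean(top_share, next_four_mean):
    """Mean share left over for products 6-10, given guesses for products 1-5."""
    top_ten_total = 10 * TOP_TEN_MEAN
    first_five_total = top_share + 4 * next_four_mean
    return (top_ten_total - first_five_total) / 5


# The guesses used in the quote: top product 3%, next four average 2.3%.
mean_6_to_10 = remaining_five_mean(top_share=3.0, next_four_mean=2.3)
print(f"products 6-10 average {mean_6_to_10:.2f}% of sales")

# With shares sorted in descending order, each of products 6-10 is at least
# as large as product 11, and product 11 is at least the mean of products
# 11-20. So the mean of products 6-10 cannot be below NEXT_TEN_MEAN.
consistent = mean_6_to_10 >= NEXT_TEN_MEAN
print("consistent with sorted shares:", consistent)
```

Trying other guesses for the top five products shows how little room there is: any guess that leaves products 6-10 averaging less than 1.2% contradicts the sort order.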

 This is an acute observation to make just skimming a presentation, and I’m not sure I would have got it. And what if the buckets (sectors on the pie chart) each contained 3 or 5 items instead of 10 – what difference would that make? Can we generalise this?

One way to think about it is to assume that revenue shares by product form a geometric progression with a constant scaling factor, say r. E.g. if the first product has a 5% share, the second has 5% x 0.8 = 4%, the third has 4% x 0.8 = 3.2%, etc. This won’t be exact in reality, but the sorting by size means it can’t be far off.

Then if the first term of the geometric progression – i.e. the share of the largest product – is x1, the usual formula for the sum of the first 10 terms is

$$S_{10} = x_1 \, \frac{1 - r^{10}}{1 - r}$$

and the mean over the first bucket is just this divided by ten.

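We want to find {mean over bucket of next 10 / mean over bucket of first 10}. Replacing 10 by n for generality, the sum of the next n terms is just r^n times the sum of the first n terms, so some trivial algebra shows that this ratio of successive bucket means is given by

$$\frac{\text{mean of next } n}{\text{mean of first } n} = \frac{x_1 \, r^n \, (1 - r^n)/(1 - r)}{x_1 \, (1 - r^n)/(1 - r)} = r^n$$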

We can display this quantity in a 2-way look-up table against the scaling factor, r, and the number of items in each bucket, n:

For the original Valeant example, n = 10, and the observed ratio of successive bucket means is 1.2% / 1.8% ≈ 0.66. So reading off the top number in the third column, this implies a scaling factor of around 0.96, which seems incredibly flat. Hence John’s intuition that the chart looked dodgy.
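Under the geometric assumption, the ratio of the next bucket's mean to the first bucket's mean works out to r^n, so the look-up can also be done in code. The sketch below is not the original table: the grid of scaling factors and bucket sizes is chosen purely for illustration, and the function names are hypothetical.

```python
def bucket_mean_ratio(r, n):
    """(mean of next n items) / (mean of first n items) for a geometric
    progression with scaling factor r: the closed form r**n."""
    return r ** n


def brute_force_ratio(r, n, x1=1.0):
    """The same ratio computed directly from 2n explicit shares, as a check."""
    shares = [x1 * r ** k for k in range(2 * n)]
    first_mean = sum(shares[:n]) / n
    next_mean = sum(shares[n:]) / n
    return next_mean / first_mean


def implied_scaling_factor(ratio, n):
    """Invert ratio = r**n to recover the scaling factor r."""
    return ratio ** (1.0 / n)


# Illustrative grid only (an assumption, not the grid of the original table).
SCALING_FACTORS = [0.95, 0.9, 0.8, 0.7]
BUCKET_SIZES = [3, 5, 10]

print("r".ljust(6) + "".join(f"n={n}".rjust(8) for n in BUCKET_SIZES))
for r in SCALING_FACTORS:
    cells = "".join(f"{bucket_mean_ratio(r, n):8.3f}" for n in BUCKET_SIZES)
    print(f"{r:<6}" + cells)

# The Valeant chart: buckets of 10, bucket means of 1.8% and 1.2%.
valeant_r = implied_scaling_factor(1.2 / 1.8, 10)
print(f"implied scaling factor for Valeant: {valeant_r:.3f}")
```

For the Valeant figures the implied scaling factor comes out at about 0.96, in line with the reading from the table.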

More generally, what’s “dodgy” depends on what the true scaling factor is. E.g. if the largest holdings in an investor’s portfolio are sorted by size, this might typically give a scaling factor of 0.8 or less; but not if the investor constantly rebalances the portfolio to equal weights (which some people say is a good idea, although the maths is subtle and a lot of published work on “volatility pumping” is flawed). For most quantities, my intuition is that a scaling factor as high as 0.9 (the third row of the table) would be rare. It’s also striking how quickly the values fall away for lower (more typical?) scaling factors. Overall, the following seems a reasonable rule of thumb:

When fractional shares (revenue by product, market shares by company, or similar quantities) are sorted by size and grouped into descending buckets, any ratio of successive bucket means above 0.5 (or perhaps even 0.3) is suspect.

Guy Thomas, Monday 26 August 2019 at 12:32 pm

Two parts of a whole: compound growth and Hemingway decline

This article by the investor Claire Barnes of Apollo Investment Management highlights an interesting property of compound growth in any quantity which forms part of a finite whole. In summary, the point is this. Suppose urban land area + rural land area = 1 (i.e. a finite land area), and assume urban land area grows at a constant percentage rate (in line with the ‘economic growth’ universally sought by governments). Then it follows that rural land area doesn’t just decline, it declines at an accelerating percentage rate.

Claire’s illustration copied below shows the case where rural land is initially 85% of the total, urban land is 15%, and urban land grows at 5% per annum. On these parameters, the rural land is extinguished after just under 39 years.

Note the alarming and insidious property of the second graph. For many years the percentage rate of decline in the rural part is very small, too small for most people to notice. Then suddenly the rate of decline speeds up and the remaining rural part disappears quickly, too quickly to do anything about it.
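The pattern is easy to reproduce. The following is a minimal sketch using the parameters of the illustration (urban share 15%, growing at 5% per annum, annual steps); the function names and the choice of years to print are assumptions made here, not part of Claire's article.

```python
def simulate_shares(u0=0.15, g=0.05, years=40):
    """Yearly (year, urban share, rural share), with urban compounding at g
    and both shares capped so that they stay within the finite whole."""
    rows = []
    urban = u0
    for year in range(years + 1):
        capped_urban = min(urban, 1.0)
        rows.append((year, capped_urban, 1.0 - capped_urban))
        urban *= 1 + g
    return rows


def annual_decline_rates(rows):
    """Percentage fall in the rural share from each year to the next."""
    rates = []
    for (_, _, rural_now), (next_year, _, rural_next) in zip(rows, rows[1:]):
        if rural_now > 0:
            rates.append((next_year, 100 * (rural_now - rural_next) / rural_now))
    return rates


rows = simulate_shares()
rates = dict(annual_decline_rates(rows))
for year in (1, 10, 20, 30, 35, 38):
    rural = rows[year][2]
    print(f"year {year:2d}: rural share {rural:6.1%}, "
          f"fell {rates[year]:5.1f}% over the year")
```

The annual fall in the rural share starts below 1% and stays in low single digits for a couple of decades, then runs to tens of percent a year in the final few years.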

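Algebraically, we can represent this as follows. Write u(t) for the urban part and r(t) for the rural part at time t (in years), so that

$$u(t) + r(t) = 1$$

Assuming that the urban part grows at a constant rate g per annum, we have

$$u(t) = u(0) \, (1+g)^t$$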

Note that continuously compounded percentage rates of growth can also be expressed as semi-logarithmic derivatives (this will be useful in a moment), e.g. for the urban part u(t)

$$\frac{d}{dt} \log u(t) = \frac{u'(t)}{u(t)} = \log(1+g)$$


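Then the continuously compounded percentage rate of decline of the rural part r(t) = 1 − u(t) is minus its semi-logarithmic derivative. That is

$$-\frac{d}{dt} \log r(t) = \frac{u'(t)}{1 - u(t)} = \log(1+g) \, \frac{u(t)}{r(t)}$$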

To interpret this, note that the urban part increases at a constant continuous rate log (1+g). The rural part decreases at an accelerating rate.  The rate is log (1+g) scaled by the current ratio of the two parts. 
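As a sanity check on this interpretation, the rate log (1+g) scaled by the ratio of urban to rural can be compared with a numerical derivative of the log of the rural part. This is a sketch under the same illustrative parameters (u(0) = 15%, g = 5%); the function names and the step size h are choices made here.

```python
import math


def urban(t, u0=0.15, g=0.05):
    """Urban part at time t, compounding at rate g per annum."""
    return u0 * (1 + g) ** t


def rural(t, u0=0.15, g=0.05):
    """Rural part at time t: whatever is left of the finite whole."""
    return 1.0 - urban(t, u0, g)


def decline_rate_formula(t, u0=0.15, g=0.05):
    """Continuous rate of decline of the rural part: log(1+g) * u(t) / r(t)."""
    return math.log(1 + g) * urban(t, u0, g) / rural(t, u0, g)


def decline_rate_numeric(t, u0=0.15, g=0.05, h=1e-6):
    """Minus the derivative of log r(t), by central difference."""
    upper = math.log(rural(t + h, u0, g))
    lower = math.log(rural(t - h, u0, g))
    return -(upper - lower) / (2 * h)


for t in (0, 10, 20, 30, 38):
    print(f"t={t:2d}: formula {decline_rate_formula(t):.4f}, "
          f"numeric {decline_rate_numeric(t):.4f}")
```

The two columns agree, and both rise slowly at first and then steeply as the rural part shrinks.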

For small g, log (1+g) ≈ g (e.g. log 1.05 = 0.0488), and so we can forget about logs and just scale the growth rate g. This makes intuitive sense – in some sense the “same” process is affecting the rural part, it just has to be scaled for the continuously varying relative sizes of the two parts.

The time until the rural part is exhausted, T, is found by setting u(T) = 1:

$$T = -\frac{\log u(0)}{\log(1+g)}$$

which for u(0) = 15% and g = 5% gives 38.9 years, consistent with the graph above. In words: the time to extinction of the rural part is inversely proportional to log (1+g), and so for small g it is approximately inversely proportional to the growth rate of the urban part.
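The time to extinction follows from solving u(0) (1+g)^T = 1 for T, and is simple to compute. A minimal sketch (the function name is made up here, and the alternative growth rates are illustrative only):

```python
import math


def years_to_extinction(u0, g):
    """Solve u0 * (1 + g)**T = 1 for T: the time at which the urban part
    fills the whole and the rural part is exhausted."""
    return -math.log(u0) / math.log(1 + g)


base = years_to_extinction(0.15, 0.05)
print(f"u(0) = 15%, g = 5%: {base:.1f} years")

# Halving or doubling g roughly doubles or halves the time, since
# log(1+g) is close to g when g is small.
for g in (0.025, 0.10):
    print(f"u(0) = 15%, g = {g:.1%}: {years_to_extinction(0.15, g):.1f} years")
```

The proportionality is only approximate: the time scales with 1 / log (1+g), not exactly with 1 / g.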

Pages 12-13 of this report from the World Bank give some data on current urban land and rates of growth, which suggest that for most Asian countries, the outlook is not quite as bad as suggested by the graph above. But there are probably ambiguities with the definition and measurement of “urban” and “rural”. And whatever the exact parameters, a precautionary principle seems sensible, because the overall pattern is quite general: when one part increases at a constant percentage rate, the other part doesn’t just decline, it declines at an accelerating percentage rate.

I agree that this does not seem to be widely appreciated. Perhaps it is appreciated by ecologists, but on a quick search I could not find obviously relevant commentary. Claire suggests a couple of reasons why it may be neglected.  First, individuals and governments tend to focus more on things which show compound growth, because that is where investment and career and taxation opportunities tend to be found. We tend to be less aware of complementary things which are declining. Second, evidence-based people focus on things which are quantified; but things which are declining tend not to be quantified, precisely because of the lack of positive opportunities. I agree with both these points.

The alarming property of the graph is the “gradually, then suddenly” pattern of the rural part’s decline. This is not really exponential (a constant percentage rate of decline), it’s more alarming than that: it’s an accelerating percentage rate of decline. Following the famous quote about bankruptcy, perhaps we can call it Hemingway decline. 

We can then summarise, loosely speaking, like this:

Where two parts form a finite whole, and one part increases at a constant percentage rate, the second part declines at an accelerating percentage rate. It declines gradually, then suddenly: Hemingway decline.

Guy Thomas, Thursday 08 August 2019 at 11:38 am