
Predict the unpredictable

Financial markets need a new set of equations to replace those that failed in the credit crunch. Power-law distributions might just do the trick
August 8, 2014

Welcome to a world where Warren Buffett does not exist. He could do in theory, but not in practice – the idea that one person could accumulate so much wealth is too preposterous to believe. Besides, in this world the South Sea Bubble is yet to burst; nor has it even inflated. That may happen, but probably not for another 23,000 years or so. Clearly, therefore, this world also awaits the credit crunch. That, too, may occur, but not for at least another 100,000 years. Plenty of time for a nice cup of tea before you pop along to queue outside your branch of Northern Rock.

This world is an odd sort of world. It’s a bit like Schrödinger’s cat – simultaneously it exists and does not exist. In what we would call ‘the real world’, clearly it does not exist. Unless history lies terribly – and it does sometimes – the South Sea Bubble did burst (in 1720, to be precise). Surely we haven’t forgotten those queues outside the Northern Rock (it’s coming up to the seventh anniversary) and Warren Buffett – though elderly, at 83 – is still going strong.

Yet in much of the finance industry – where the outlook for share prices is reckoned, where traded options are bought and sold and where the performance of fund managers is assessed – this world is the de facto reality. It’s the world that’s taught in business schools and the world that’s practised throughout the City. You could say it’s the ‘normal’ world because it’s the one where forecasts are made on the assumption of what’s called ‘the standard normal distribution’ of returns.

Normal distribution is the paradigm sourced from 18th-century statistics that took root in classical economics and thence into financial economics, where it is embedded in the most influential models. Its assumptions lie behind Harry Markowitz’s ground-breaking work to find optimal portfolios in the 1950s, the capital-asset pricing model of the 1960s and – most influential of all – the Black-Scholes options-pricing model of the 1970s. It says that if you want to forecast the value of a ‘variable’ – a share price, for instance – then the range of possible outcomes will be neatly and symmetrically arranged around their average figure. The most likely will cluster thick and fast near the average – or the ‘mean’ – and the least likely will fall away so rapidly – ‘exponentially’, in the jargon – that extreme outcomes will be very rare. So neat will be the cluster – so shaped like a bell when the range of outcomes is plotted on a graph – that the normal distribution would come to be known as the ‘bell curve’. And to find that curve all you need to calculate is the mean and the variation – or ‘variance’ – around it.

This pattern was discovered thanks to man’s obsession with his vices, in particular his addiction to gambling. Thus the question arose: what are the chances of rolling a dice three times and throwing ‘six’ each time? Answer: one in 216. To the non-statistical brain those odds seem implausibly long. We’ll let it pass and ask another question: what are the chances of rolling the dice six times and throwing six each time – okay, the odds will be much longer, but surely not that much? Don’t count on it. The odds lengthen to one in 46,656. Or, put another way: the average score from rolling a dice six times will be 21 yet the chance of scoring 36 has a probability of about 0.0000214. Now we begin to glimpse why, statistically speaking, it’s not possible to have Warren Buffett’s wealth – it’s just too far from the average to be real.
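The dice arithmetic is easy to check. The short sketch below assumes fair, independent rolls; the helper name all_sixes is ours, purely for illustration.

```python
from fractions import Fraction

def all_sixes(n: int) -> Fraction:
    """Chance of throwing a six on every one of n fair, independent rolls."""
    return Fraction(1, 6) ** n

three = all_sixes(3)   # one in 216
six = all_sixes(6)     # one in 46,656

# Average total of six dice: each die averages 3.5, and 6 * 3.5 = 21.
mean_total = 6 * Fraction(sum(range(1, 7)), 6)

print(three, six, mean_total)
print(f"{float(six):.7f}")   # the chance of scoring 36 with six dice, as a decimal
```

Scoring 36 with six dice requires six sixes, which is why its probability is the same as the one-in-46,656 figure.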

Still, which do you prefer to believe in, the reality of Warren Buffett or normal distribution? As a belief system, normal distribution has won hands down, helped by the seductive qualities of its beautifully symmetrical pattern, which is found everywhere. So much so that Abraham de Moivre, the 18th century mathematician who did much to formalise it into what’s called ‘the central limit theorem’, “ascribed it to the Almighty”.

Yet the real world is not always beautiful and even less often is it symmetrical. Everywhere there are asymmetries – especially in the financial markets. Take the following simple example: imagine you have a trading plan that will make £1,000 profit 99 times out of 100, but will lose £100,000 once in 100. It sounds possible, yet the probabilistic world of normal distribution will struggle to handle it because the aggregate value of all the possible outcomes does not sum to zero. No worries. The solution is simple – call the one in 100 chance of losing a packet an ‘outlier’ in normal-distribution speak, regard it as a freak, ignore it and deal happily on the strong likelihood of making small amounts of money often.
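To see what ignoring the ‘outlier’ does to the sums, here is a rough sketch that works out the probability-weighted result of the hypothetical trading plan, with and without the rare loss. The variable names are illustrative only.

```python
# The article's hypothetical trading plan, in pounds per trade.
outcomes = [
    (1_000, 0.99),     # small profit, 99 times in 100
    (-100_000, 0.01),  # ruinous loss, once in 100
]

# Probability-weighted average result of a single trade.
expected = sum(value * prob for value, prob in outcomes)

# The same sum with the rare loss waved away as an 'outlier'.
expected_without_outlier = sum(value * prob for value, prob in outcomes if value > 0)

print(f"expected profit per trade: {expected:.0f}")
print(f"ignoring the outlier:      {expected_without_outlier:.0f}")
```

On these numbers the plan loses about £10 a trade on average, yet it looks like a steady earner of roughly £990 a trade once the single large loss is left out.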

Do that, however, and you are behaving much as the traders who piled into the Mexican peso in the early 1990s, as the banks who gorged themselves on collateralised debt obligations in the early 2000s, as the bosses of Northern Rock who came up with a funding plan that relied almost wholly on the wholesale money markets. You are assuming that the future will be much like the past and that it will be based on the normal distribution of outcomes. True, your plan will work most of the time, but when it fails it will wreck you – and all because you did not read the small print on the label.

In fairness to the normal-distribution paradigm, it never said that it works for everything. Indeed, it stressed that mean-variance analysis only works well when the observations under analysis are truly independent of each other. That works for the throw of a dice. Clearly the result of one throw won’t influence the next one and so on. It also works pretty well for statistics about people. Measure the height and weight of the UK’s adult population and you will see a nice bell curve; no chance of freaks 12-feet high and weighing 200 stones to mess up its shape.

But for many things we want to measure, the observations won’t be independent. The result of one will affect the next and so on. It’s called feedback and it’s well-known in systems as diverse as climate change (the negative feedback of cloud formation slows global warming) or genetics (cell reproduction is controlled by the interaction of DNA segments). And it’s prevalent in systems powered by what people do.

So, for example, distributional theorems that run on feedback explain why JK Rowling is read so much more than George Eliot, why London is roughly twice the size of Birmingham, why the average FTSE 100 chief executive makes more money in three days than the average UK worker makes in a year and why there are highly-rated glamour stocks such as Apple (US: AAPL) or, in London’s market, microprocessor designer Arm (ARM).

The basic point is that where there is interaction and the rules of normal distribution fail, then average values and their dispersion around the mean also fail. That’s tricky to grasp because we are schooled in the notion that where there is a series of values – the price of stocks in a market, for example – then there must be an average value. And if there is an average value, then the dispersion around it must be measurable, too. Sure, those calculations can be made, but they won’t tell us anything. To see why, let’s bring Warren Buffett back into the story.

Imagine – highly unlikely though it is – that a group of 100 middle-class English people gathered together reveal their wealth and the average value is £1m. Next, bring in Warren Buffett – flown over specially, no doubt – and see what happens. At a rough estimate, Mr Buffett’s wealth stands at £39bn. That skews the average wealth to about £387m, probably 100 times more than the second wealthiest person in the group. Clearly the rules of normal distribution tell us nothing sensible here, yet so many financial models take similar instances and try to persuade us that they do.
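A minimal sketch of the Buffett effect follows. It assumes, for simplicity, that each of the 100 people holds exactly the £1m average (the article gives only the average, not the individual figures) and uses the rough £39bn estimate for Mr Buffett.

```python
from statistics import mean, median

# Wealth in £m. Assumption: everyone in the group holds exactly the average.
group = [1.0] * 100
with_buffett = group + [39_000.0]   # the article's rough estimate of £39bn

print("before:", mean(group), median(group))
print("after: ", round(mean(with_buffett), 1), median(with_buffett))
```

The mean jumps to roughly £387m once the group’s own £100m is counted in the total, while the median stays at £1m. One observation is enough to make the average say almost nothing about the typical member of the group.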

However, if normal distribution fails, so-called power-law distributions may do a better job. Power laws inhabit a zany world of tipping points and non-linear dynamics, of fractals and – crucially – of ‘fat tails’. But at least it’s a world where Warren Buffett’s wealth, the South Sea Bubble and the credit crunch are all perfectly explicable.

The vital point to grasp about power-law distributions concerns these fat tails. When a series of observations is put into a predictive model, the rate at which extremes decay is much slower using power-law formulae than with models based on normal distribution. In other words, power-law models predict lots of extreme events – or fat tails – which seems a better explanation of real life.

We deal with some of the mathematical issues of power-law distributions in the box, ‘The distribution trade’, which is optional reading. For the purposes of understanding the everyday world of investing and finance we only need to explain why extreme events should happen much more frequently than the maths of normal distribution indicates. To do that, let’s formalise under four headings what happens when feedback mixes with the world of happenstance and random events.

■ Non-linear dynamics This is what it says on the can – change, but not in a straight-line direction. Non-linear dynamics were first spotted in studies of the weather. Tiny changes in the starting conditions of a computer model could have a massive impact on the end result. This has been caricatured as ‘the butterfly effect’, where the result of the butterfly flapping its wings in the Amazon jungle spirals into a tornado thundering up North America’s Atlantic sea coast. Or, say, in business small differences in the set-up of two otherwise identical companies – a bit of extra cost here, a little more debt there – mean that one of them becomes a FTSE 100 success story while the other goes bust within a year. The problem is that at the outset it’s hugely difficult to predict which will be which.
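Sensitivity to starting conditions can be illustrated with a toy model. The sketch below uses the logistic map, a standard textbook example of non-linear dynamics; it is our choice of illustration, not a model of weather or of companies, and the starting values are arbitrary.

```python
def logistic_path(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate x -> r * x * (1 - x), a standard toy model of non-linear dynamics."""
    path = [x0]
    for _ in range(steps):
        path.append(r * path[-1] * (1.0 - path[-1]))
    return path

a = logistic_path(0.2, 60)
b = logistic_path(0.2 + 1e-9, 60)   # starting point nudged by one part in a billion

# Distance between the two paths at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]

print(f"gap after 5 steps:      {gaps[5]:.2e}")
print(f"largest gap by step 60: {max(gaps):.2f}")
```

The two paths are indistinguishable for the first few steps, then drift apart until they bear no relation to each other, which is the butterfly effect in miniature.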

■ Self-organised criticality Here the notion is that things and people automatically organise themselves into complex systems that exist on the edge of breakdown. The classic analogy is with sand piles, built up, grain by random grain, into a shape that seems to deny nature. Add a few more grains, however, and mini avalanches appear. Add the fatal final grain and the whole pile collapses.

Motorway traffic can organise itself like that, building up to such a critical speed and a smooth yet fragile flow that it takes just one car out of place to prompt a pile-up. Equally, it needs just one unforeseen event to send a speeding stock market into a tailspin. Witness the supersized response to an innocuous decision by Germany’s Bundesbank to raise interest rates on Friday 16 October 1987. When Wall Street opened the following Monday it suffered its worst-ever day, falling 22.6 per cent. The fragile edifice on which was built the US market’s 44 per cent rise in the year until then just collapsed.

■ Highly optimised tolerance This is a bit like self-organised criticality on steroids. It’s where we take an interaction that would arrange a near-critical state by itself, then we add the human factor that makes a power-law outcome more likely. The textbook example is nature’s propensity to create forests that once in a while will be blighted by fire. To that, you add the efforts of the forester to pack trees more tightly than nature would allow, but build wider fire breaks. That should provide adequate safety but it leads to the occasional forest fire far greater than in the natural environment.

It’s easy to see the parallel with financial markets and the illusory security provided by regulatory authorities and their regulations. Thus the presence of central banks, which should deter risk taking, actually encourages it because self-interested protagonists (for which read ‘bankers’) know that their bets will have an asymmetrical outcome. They keep the rewards if they guess correctly, but the central bank picks up the tab if they guess wrongly. What ensues is a fat-tailed distribution of financial crises that are both worse and more frequent than they would be without regulations.

■ Preferential attachment Benjamin Graham once likened stock selection to a popularity contest and JM Keynes went one better when he compared it with a beauty parade. Both were being more accurate than they realised because preferential attachment plays a part in financial-market mania. Indeed, the notion of preferential attachment lies behind many of the best known power laws. At its heart is the point that popularity is contagious – people are drawn to what other people are drawn to. Via Zipf’s law – and à propos a few paragraphs ago – this explains why London is roughly twice Birmingham’s size and why JK Rowling is read so much more than George Eliot.

More pertinently, it helps explain why some investment themes become so dominant (think of the craze for Japanese equities in the 1980s or for emerging markets just a couple of years ago) and why there are glamour stocks. We know all about Apple or ARM, but – ominously – some glamour stocks turn out to be all hype. Remember Polly Peck or British & Commonwealth in the 1980s or Independent Insurance in the 2000s. Scams of varying degrees were at the core of all three, yet in their ascendancy investors became deeply attached to them, almost certainly encouraging others to follow.
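The idea that popularity is contagious can be sketched with a simple simulation. The one below is a Simon-style toy model of readers choosing books; the function name, the 5 per cent chance of picking a brand-new title and the number of readers are all assumptions made for illustration, not figures from any real market.

```python
import random

def simulate_readers(n_readers: int, p_new: float, seed: int) -> list[int]:
    """Each new reader either picks a brand-new title (probability p_new) or
    copies the choice of a randomly chosen earlier reader, so titles gain
    readers in proportion to the readers they already have."""
    rng = random.Random(seed)
    choices = [0]   # choices[i] = title picked by reader i
    counts = [1]    # counts[t] = number of readers of title t
    for _ in range(n_readers - 1):
        if rng.random() < p_new:
            counts.append(1)
            choices.append(len(counts) - 1)
        else:
            title = choices[rng.randrange(len(choices))]
            counts[title] += 1
            choices.append(title)
    return sorted(counts, reverse=True)

sizes = simulate_readers(n_readers=50_000, p_new=0.05, seed=1)
median_size = sizes[len(sizes) // 2]

print("titles:", len(sizes))
print("top five readerships:", sizes[:5])
print("median readership:", median_size)
```

No title is better than any other in this model, yet a handful end up with a readership far above the median, simply because early popularity feeds on itself.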

Where does this leave investors? Hopefully accepting that the world of normal distribution is not so normal; that power-law distributions predict outcomes that come closer to matching reality. If so, they must be prepared for the shocks. We have already seen that regulators and their rules won’t help (and might make crises worse). So the only answer is more diversification.

Back in 2005 – by definition therefore before the credit crunch – in what became a prescient article for Fortune magazine, Benoit Mandelbrot, a mathematician famous for developing fractal geometry, and Nassim Nicholas Taleb, a trader turned author who specialised in warning about the dangers inherent in the slavish acceptance of conventional financial models, gave the following advice: “Diversify as broadly as you can – far more than the supposed experts tell you now. This isn’t just a matter of avoiding losses. Long-run market returns are dominated by a small number of investments, hence the risk of missing them must be mitigated by investing as broadly as possible. Passive indexing is far more effective than active selection, but you need to go well beyond an S&P 500 fund to do yourself much good. And wherever you put your money, understand that conventional measures of risk severely underestimate potential losses. For better or worse, your exposure is larger than you think.”

Box: The distribution trade

Distribution is important. Ask any retailer. It’s also important in the finance industry, and not just for selling shares in the likes of AA, Saga or whatever is a hot new issue. Another sort of distribution forms a key part of the intellectual infrastructure on which financial forecasts are based and – to the extent that the whole point of the finance industry is to anticipate the future – that makes distribution vital.

Clearly, in making a forecast it is useful and sometimes essential to know the bounds within which the price of, say, a share might be distributed. If I am thinking about buying shares in Tesco (TSCO) – current price 280p – then it helps to know what are the chances that the price might hit 320p or 240p within the next six months; equally it will help to know how its price might range within those extremes.

That intuitive necessity is mathematically formalised within many valuation models that the finance industry uses. These models are helpful and – in the case of the best known, such as the Black-Scholes options pricing model – are ubiquitous. But they struggle to answer the really important questions: what will be the most extreme price movements within a given period; at what rate will the extremes crop up?

The chief reason for this is that maths itself struggles to deal with uncertainty; a statement that should be self-evident (if maths did not struggle, there would be no uncertainty). In particular, maths has difficulty when quantitative factors give way to qualitative ones. Five factors may drive Tesco’s share price in the next six months, but the difficulty is knowing which are the most and least important ones. Then there are the additional complications that the factors might affect each other, that their relative status may change within the period and that – actually – there aren’t five factors, but ‘n’ number, a figure that isn’t known or knowable.

Mostly, financial valuation models sidestep these difficulties by saying that they don’t really matter since extreme events are so rare. That’s the message of a commanding piece of mathematics – the central limit theorem – and it has rarely been profitable to argue with it. That theorem and its application – the standard normal distribution of random variables – gave financial models the impression of both rigour and precision; meanwhile, the numbers that the models generated were useful, they helped the finance industry thrive – end of story.

Except that it has long been known that there are other ways of arranging the distribution of random variables. Best known is the so-called Pareto distribution (named after an Italian economist, Vilfredo Pareto), which is caricatured as the ‘80-20’. For example, 80 per cent of Tesco’s profits would come from 20 per cent of its stores, or 80 per cent of a nation’s income would be grabbed by 20 per cent of its workforce.
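The ‘80-20’ caricature corresponds to a particular fatness of tail. As a rough sketch, the snippet below uses the standard closed form for the share of the total held by the top slice of a Pareto distribution; the function name top_share and the comparison value of 2.0 for the tail index are ours, chosen for illustration.

```python
import math

def top_share(top_fraction: float, alpha: float) -> float:
    """Share of the total held by the richest `top_fraction` under a Pareto
    distribution with tail index alpha > 1 (standard closed form)."""
    return top_fraction ** (1.0 - 1.0 / alpha)

# The tail index for which the top 20 per cent hold 80 per cent of the total.
alpha_80_20 = math.log(5) / math.log(4)

print(round(alpha_80_20, 3))
print(round(top_share(0.20, alpha_80_20), 3))
print(round(top_share(0.20, 2.0), 3))   # a thinner tail is less lopsided
```

The lower the tail index, the fatter the tail and the more lopsided the split; an index a little above 1.1 gives the familiar 80-20.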

The important characteristic of a Pareto distribution – and this has a wide application – is that it is a ‘power law’ distribution which permits more extreme events than a standard normal one. That’s because the rate at which rare events fade away is slower using a power-law function embedded in a formula than a formula running on normal distribution. Power-law functions are not that subtle; they are a pale reflection of the complexities of real life, but they may do a better job than models using normal distribution.

Certainly, when adapting the maths of power laws to study share-price returns extreme events become much more familiar. Returns that are 100 standard deviations from their average – impossible under conditions of normal distribution – become possible. A 10 standard-deviation event – near impossible under normal distribution – will happen every week somewhere among the shares traded on a big liquid stock exchange like London’s.
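To give a feel for the gap between the two kinds of tail, here is an illustrative comparison. It rests on loud assumptions that are not in the article: a power-law tail with exponent 3 (a value often reported in empirical studies of stock returns) and a tail that is pinned to the normal curve at two standard deviations, so that both models agree about moderately unusual moves.

```python
import math

def normal_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal variable."""
    return math.erfc(k / math.sqrt(2.0))

def power_law_tail(k: float, alpha: float = 3.0, k_min: float = 2.0) -> float:
    """Illustrative tail P(|X| > k) proportional to k ** -alpha for k >= k_min,
    scaled to match the normal tail at k_min. Both alpha and k_min are assumptions."""
    return normal_tail(k_min) * (k / k_min) ** (-alpha)

for k in (2.0, 5.0, 10.0, 100.0):
    print(f"{k:>5.0f} sd   normal: {normal_tail(k):.2e}   power law: {power_law_tail(k):.2e}")
```

Under these assumptions a 10 standard-deviation move has a probability of a few in 10,000 on the power-law curve, against something of the order of 10 to the power of minus 23 on the normal curve. The exact figures depend entirely on the assumed exponent and matching point, so treat them as an order-of-magnitude illustration only.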

In addition, according to Xavier Gabaix, a finance professor at New York University’s Stern School of Business, “the 1929 and 1987 crashes do not appear to be outliers in the power-law distribution of daily returns.” In other words, stock-market crashes are really quite normal.

If that thought alone were disseminated throughout the finance industry, it might reduce the panic that spreads to all parts of the system when something really nasty happens, such as the failure of Lehman Brothers in September 2008. That would be hugely helpful. However, it probably would make the finance industry’s distribution trade more complicated still. Mathematics has already cooked up umpteen ways to plot probability distributions, many of which are designed to cope with extreme – or ‘fat tail’ – movements. Incorporating more of them into valuation models would require some really fearsome equations.