Not all data is created equal

Grrrr, Excel. It’s that time of year when the corporate planning cycle kicks in and proper work is kicked into touch as next year’s plans are written, spun, chopped, pulled apart and horse-traded over.

Apart from the obviously painful process, perhaps what’s even more worrying is that many Excel models are so poorly constructed. Not because the numbers don’t add up; they do (well, most of the time). No, the real threat is the lack of transparency on the quality of the assumptions that underpin the decisions being taken. Indeed, we’ve seen many decision models that give no indication of whether a number is a provable historical fact, an estimate from a trusted source or just a heroic guess that had to be used because there was nothing else to hand!

And given that in business most decisions are forward facing, which by definition means you can’t absolutely know what will happen, this is a bit of an issue. You could call it the curse of the Black Swan, made famous in Nassim Nicholas Taleb’s excellent critique of the financial services industry and treatise on unpredicted events.

Be it planning next year’s budget, launching a new product, defining a business strategy or even inventing a new exotic financial derivative, what is required is transparency: a clear indication of what’s a fact and what’s a guess. And it’s this principle of risk management that decision makers should adopt, i.e. it’s OK to take a risk as long as you understand what the risk is. It’s when that risk is opaque that things tend to go awry.

This is not to dismiss the value of guesswork, despite what the “garbage in, garbage out” protagonists say. Guesses are valuable. If, for instance, you are planning a business in a market that doesn’t properly exist yet, then garbage is all you have, so you’ve just got to deal with it.

And in a world of Twitter, blogs and Facebook running in parallel to traditional content providers, this challenge is amplified, especially as these diverse sources of information are aggregated, analysed and visualised in VAIMs. Consequently I think it’s important to distinguish what I call ‘hard’ and ‘soft’ data: hard data is factual (proven news, stock price histories, known dates of new regulation etc), whereas soft data is opinion, gossip, rumour, photos etc. It’s this ‘soft’, imperfect information that joins the dots. How many salespeople respond only to known historical facts? Very few; instead they’re also mining the rumour, gossip and opinion to help them fully understand the situation and react faster, with better market intelligence.

I’d suggest that human prediction and decision-making quite naturally merge these two elements, whereas traditional Business Intelligence tools tend to cope well with hard data only. It’s something that the emerging field of Predictive Analytics must explore: knitting together the analytics bit with the softer human decision-making.

Interestingly, the Met Office uses a data index to help structure its weather forecasting algorithms. This approach is extremely useful, and one that we have used regularly to develop an index of ‘quality indicators’ that can flag confidence and, when appropriate, even treat the range of data from hard to soft differently. For instance, you may want to widen the variance on a prediction based upon gossip more than on one based upon a published forecast. Or not, if you have an impeccable informal source: opinion sourced from Breaking Views or Lex, for example, could actually be of very high quality.

And of course these quality indicators can evolve over time as sources improve or degrade.
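
To make the idea concrete, here is a minimal sketch in Python of how a quality index might widen the range around a number and be regraded over time. Everything in it is an illustrative assumption: the five-point scale, the spread percentages, and the names Assumption, SPREAD_BY_QUALITY and regrade are made up for this example and are not the Met Office’s scheme or any particular tool’s.

```python
from dataclasses import dataclass

# Hypothetical five-point quality index: 1 = hard fact ... 5 = heroic guess.
# The spreads are illustrative assumptions, not calibrated values.
SPREAD_BY_QUALITY = {1: 0.00, 2: 0.05, 3: 0.10, 4: 0.25, 5: 0.50}


@dataclass
class Assumption:
    name: str
    value: float
    quality: int  # key into SPREAD_BY_QUALITY

    def value_range(self) -> tuple[float, float]:
        """Return (low, high) bounds, widened as the data gets softer."""
        delta = abs(self.value) * SPREAD_BY_QUALITY[self.quality]
        return (self.value - delta, self.value + delta)


def regrade(assumption: Assumption, source_was_right: bool) -> Assumption:
    """Move a number one step harder or softer as its source's record changes."""
    step = -1 if source_was_right else 1
    new_quality = min(5, max(1, assumption.quality + step))
    return Assumption(assumption.name, assumption.value, new_quality)


# A historical fact keeps a zero-width range; a rumour gets a wide one.
last_year_sales = Assumption("last year's sales", 1000.0, quality=1)
market_rumour = Assumption("rumoured market growth", 200.0, quality=4)

print(last_year_sales.value_range())
print(market_rumour.value_range())
print(regrade(market_rumour, source_was_right=True).value_range())
```

The same flag could just as easily live in an extra column next to each input cell in a spreadsheet; the point is only that the softness of a number travels with the number.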

This reminds me of an interesting book, Certain to Win by Chet Richards, that explores military strategy and highlights how context is constantly changing. The key approach must therefore be to Observe, Orientate, Decide and Act, the OODA loop that Richards draws from John Boyd’s work, and these OODA loops must be fast and frequent. Professor Don Sull at London Business School has called this ‘strategy agility’.

Someone, somewhere said “life’s a beta”. So as we sit developing scenarios and plans for next year (and, if we’re really brave, the year after), I wonder whether we can use timelines in the same way that GIS tools use xyz coordinates to pin together otherwise unrelated data, with ‘plan’, ‘actual’ and ‘forecast’ as three quite distinct timelines that are always in place. Perhaps making use of hard and soft data flagged with intuitive quality indicators, and assuming a healthy dose of garbage, will actually make those Excel models more informed.
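
As a rough illustration of the timeline idea, the sketch below pins three separate series together by period, much as a coordinate would. The record layout, the period labels, the figures and the pin_by_period helper are all hypothetical, and the quality flag is an assumed 1 (hard) to 5 (soft) scale.

```python
from collections import defaultdict

# Hypothetical records: (timeline, period, value, quality flag).
# Quality uses an assumed scale of 1 (hard fact) to 5 (heroic guess).
records = [
    ("plan", "Y1-Q1", 120.0, 3),
    ("forecast", "Y1-Q1", 110.0, 4),
    ("actual", "Y1-Q1", 105.0, 1),
    ("plan", "Y1-Q2", 130.0, 3),
    ("forecast", "Y1-Q2", 118.0, 4),
]


def pin_by_period(rows):
    """Group rows by period so each timeline sits side by side."""
    pinned = defaultdict(dict)
    for timeline, period, value, quality in rows:
        pinned[period][timeline] = (value, quality)
    return dict(pinned)


by_period = pin_by_period(records)

# Periods with no 'actual' yet still carry their plan and forecast.
for period in sorted(by_period):
    print(period, by_period[period])
```

Because the period is the only shared key, a new source of hard or soft data can be added as another timeline without reshaping the ones already there.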