Lies, damned lies, and statistics

You don’t have to look hard to be overwhelmed with statistics. They are quite literally everywhere and are created for an equally broad range of reasons, from:

  • Surveys of an organisation’s membership to gain their views, to help shape its future direction;
  • Polls of the national public to elicit their opinion on stories within the newspapers;
  • Commissioned analysis to understand the impact of policies on everyday life or trends within a given industry;
  • The results of novel research to extend the overall body of scientific knowledge.

This certainly isn’t an exhaustive list, but what these areas have in common is that the adage “lies, damned lies and statistics” is easily applied to them.  At a superficial level, we could apply this phrase to the difference between “fake news”, that is, a statistic which is deliberately false and has been created to deceive, and a statistic which is the result of careful analysis.

Being an optimist, I genuinely believe that there are more statistics which have been carefully calculated than those which are “fake”.  However, no-one is perfect, and we all make mistakes, such as generalising niche results to a wider population, presenting results in poor, hard-to-understand visualisations, or making simple errors such as not checking for bias or misinterpreting the results.  All of these errors are in the control of the analyst and should, in the main, be limited by processes such as internal peer review, which will pick up and minimise these errors before they reach publication.

Nevertheless, this still leaves (in my opinion) one of the greatest reasons that people do not regularly trust statistics.  This reason, most commonly out of the control of the analysts, is the step (or more often than not, a disconnect) between the calculation of a result and its publication either online or in the news.  There is a dichotomy in numbers – when people see an individual number, they often expect it to be both absolute and true.  There is no room for error, no allowance for uncertainty.  How can we possibly succinctly share the impact of all those factors which shape the final result, from the method chosen, through to the data included/excluded and the error bands on the final result?

For any organisation wanting to use statistics to raise awareness of its work: if you want to put out a number that is robust and that your intended audience can trust, then before you calculate a value “just because you can”, answer these three questions:

  1. Why is the statistic important to your organisation?
  2. Why is the statistic important for your organisation and your audience?
  3. How will the statistics help your audience engage in a conversation with you?

The answers to these questions will tell you if the statistic is worth calculating and, if so, how much time and effort you’re prepared to put into the calculation and presentation of the answer.
