
Just another statistic? Lockdown, Part 2
How statistics can change our lives!
It is clear from COVID-19 that statistics change our lives. Inspired by the BBC’s More or Less: Behind the Stats podcasts, we explain what makes the debate so ‘hot’: selection bias in COVID-19 statistics.
All statisticians – from scientific modellers to political pundits – use sampling techniques to inform theories, models or predictions.
They select subsets of data as a reflection of the whole picture. Selection bias can lead to erroneous results, and this applies to infections just as it does to elections.
Why might COVID testing results suffer from selection bias and fail to represent the population?
We are told daily by the media about the reported number of positive Coronavirus cases. We see daily and weekly numbers as averages and totals. But does this reflect the pandemic accurately?
Getting real infection rates across the population needs a lot of data to be accurately analysed. This is very hard as we cannot (yet) test the whole population.
Lately, data has been gathered when we have symptoms, or have been at risk through contact, and we apply for a COVID-19 test. At the beginning of the pandemic, testing was almost entirely confined to hospital admissions.
Because most of the data collected comes from these tests, the resulting sample is not necessarily representative of the population.
- Who puts themselves up for testing? Mostly people who ‘believe’ they have the symptoms, or who are contacted by ‘Track & Trace’, get tested. The positive rate of the illness therefore reflects people who might have symptoms, those who were exposed and possibly some others who just wanted to check (people taking precautionary tests, for example).
- Does everyone get tested? No. Some people have symptoms but do not want to be tested, or decide they cannot financially risk isolation, for example. They add to the people who do not have symptoms and do not get tested.
This type of sample (test subjects) is known as a self-selecting sample rather than a random sample.
Consequently, if 10,000 people get tested and 1,000 test positive, it does not mean that the population infection rate is 10%.
In short, you cannot extrapolate from testing results unless the sample was sufficiently randomised.
To identify the infection rate in the whole population, a random test across a large proportion of the public would be required.
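To make the difference between a self-selecting sample and a random sample concrete, here is a minimal simulation sketch. Every number in it is an assumption chosen for illustration only, not real COVID-19 data: a true infection rate of 2%, infected people coming forward for a test half the time, and healthy people coming forward 5% of the time. It also assumes, unrealistically, that the test itself is perfect.

```python
import random

def simulate(population_size=100_000, true_rate=0.02,
             p_test_if_infected=0.50, p_test_if_healthy=0.05, seed=1):
    """Compare a self-selected sample with a random sample of the same size.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)

    # True = infected, False = not infected (the test is assumed perfect).
    population = [rng.random() < true_rate for _ in range(population_size)]

    # Self-selection: infected people are far more likely to get tested.
    self_selected = [
        infected for infected in population
        if rng.random() < (p_test_if_infected if infected else p_test_if_healthy)
    ]

    # Random sample of the same size, drawn without regard to symptoms.
    random_sample = rng.sample(population, len(self_selected))

    self_selected_rate = sum(self_selected) / len(self_selected)
    random_rate = sum(random_sample) / len(random_sample)
    return true_rate, self_selected_rate, random_rate

if __name__ == "__main__":
    true_rate, self_selected_rate, random_rate = simulate()
    print(f"True infection rate:         {true_rate:.1%}")
    print(f"Self-selected positive rate: {self_selected_rate:.1%}")
    print(f"Random-sample positive rate: {random_rate:.1%}")
```

Under these assumptions, the self-selected positive rate comes out at roughly 17%, many times the true 2%, while the random sample lands close to the true rate.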
Is a prediction more accurate if compensated by a large sample?
Maybe. The fewer (self-selecting) people are tested, the more likely the rates are to be distorted upward.
If only one person in every 100 of a self-selecting sample gets a negative test result, then the infection rate could be reported as 99 in 100, or 99%!
There are potential errors in all testing techniques and their execution. We’ve heard about ‘False Positives’: people who test positive but don’t have the illness. There are also ‘False Negatives’: people who have the virus but don’t test positive.
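The effect of false positives and false negatives on the headline numbers can be sketched with some simple arithmetic. The figures used here are assumptions for illustration, not properties of any real COVID-19 test: 1% of those tested are truly infected, the test catches 80% of real infections (its ‘sensitivity’) and correctly clears 99% of healthy people (its ‘specificity’).

```python
def apparent_positive_rate(prevalence, sensitivity, specificity):
    """Share of tests that come back positive, rightly or wrongly."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives + false_positives

def positive_predictive_value(prevalence, sensitivity, specificity):
    """Chance that a person with a positive result is really infected."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

if __name__ == "__main__":
    # Illustrative assumptions only, not real test characteristics.
    prevalence, sensitivity, specificity = 0.01, 0.80, 0.99
    rate = apparent_positive_rate(prevalence, sensitivity, specificity)
    ppv = positive_predictive_value(prevalence, sensitivity, specificity)
    print(f"Reported positive rate: {rate:.2%}")
    print(f"Positives that are real infections: {ppv:.1%}")
```

With these assumed figures, about 1.8% of tests come back positive even though only 1% of those tested are infected, and fewer than half of the positive results are real infections.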
The problems facing ‘the science’ get bigger when test results are dropped, lost, contaminated and so on.
A big sample is better, but if it is still not random it still only provides a partial reflection of the truth!
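A rough sketch of why size alone does not cure a self-selecting sample: the simulation below draws a small and a large population under identical, purely assumed conditions (a true infection rate of 2%, infected people getting tested half the time, healthy people 5% of the time, and a perfect test). The function name and all parameters are hypothetical.

```python
import random

def self_selected_positive_rate(population_size, true_rate=0.02,
                                p_test_if_infected=0.50,
                                p_test_if_healthy=0.05, seed=1):
    """Positive rate among people who chose to be tested.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    tested = 0
    positives = 0
    for _ in range(population_size):
        infected = rng.random() < true_rate
        p_test = p_test_if_infected if infected else p_test_if_healthy
        if rng.random() < p_test:
            tested += 1
            positives += infected  # True counts as 1, False as 0
    return positives / tested

if __name__ == "__main__":
    for size in (5_000, 200_000):
        rate = self_selected_positive_rate(size)
        print(f"Population of {size:>7,}: positive rate {rate:.1%} (true rate 2.0%)")
```

The larger run gives a steadier figure, but it settles near 17% rather than the true 2%: more data reduces the random wobble, not the bias built into who gets tested.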
Even in the Liverpool initiative of mass walk-in testing, the sample is self-selecting. However, the data should be more representative, because some people will be just curious rather than symptomatic.
Our decision makers face a real challenge, and we should recognise that a simple plan of action just will not work. It is like a game where the other player may not do what you expect.
I strongly recommend the BBC’s More or Less: Behind the Stats podcasts, which examine what politicians (or scientists) say on many current affairs subjects. Using experts, the show scrutinises statistics and determines whether the claims are justifiable or not.
Even if whole population testing is carried out, it does not provide a solution to the virus.
Also, it appears that some people are more infectious than others, transmission is complex, and individuals respond differently to infection. There appear to be strong age-related and ‘underlying condition’ factors, as we have been told.
What can we do to avoid becoming another statistic?
It’s very clear that statistics can change our lives. We all have a role in the current statistical impact.
First, FOLLOW THE RULES, OBEY THE LAW!
What we all know is our vital role in keeping infection rates down and away from those more vulnerable:
Hands – Face – Space (social distancing) = reduced risk of transmission.
Happy ‘quiet’ Guy Fawkes night. Read all about Guy Fawkes on the BBC: a memorable day to start Lockdown Part 2.
We looked at ‘how to lie with statistics’ and storks and babies in Holland in our last blog. Our next blog is going to build on this theme of statistics and apply it to the insurance industry. “See” you soon!