Tuesday, April 7, 2020

Good and Bad Science

In this post, I'll look at two examples of COVID-19 related science. One study points out a very important aspect that is often overlooked; the other study is much more complicated, uses tons of data, but unfortunately uses assumptions that have no basis in reality - in fact, they are in direct disagreement with everything we know about COVID-19.

Let's start with a figure from the first study - the "good science":
The figure illustrates that computer models of the COVID-19 epidemic must consider both contact and non-contact transmissions.

Basically, the study points out that you can get COVID-19 even if you follow all guidelines and orders, and stay at least 6 feet away from others all the time. That can happen through "fomite" transmission, for example when touching a virus-laden surface when shopping; or through "aerosol" transmission, for example in shared office spaces, business meetings, or funeral services. For more details, please check my previous post; for an example, read what a 30-year-old, very fit man wrote about how he got infected, and how he experienced COVID-19.

The study is well written; easy to understand; outlines the problem clearly; shows how this can be considered in computer models; and gives example data of the effects. The study takes into account what we currently know or assume about COVID-19; it does not, however, make any specific claims about how many people would get infected or would die.

This is very different from the second study, which predicted 81,114 deaths, with a "95% uncertainty interval" of 38,242 to 162,106 deaths. Since the numbers are lower than the numbers from most other models, the study has attracted the attention of the White House and the media.  So, let's have a closer look at the publication. It is written quite differently, full of sentences like this one:
"Posterior uncertainty within each location was then obtained using a standard asymptotic approximation at that location".
That sounds very scientific, right? But if you ever took a class in science communication (I have), this may actually raise some red flags. What this sentence really means is:
"To get an idea how good the results for each location were, we compared the curves our models predicted to the actual death numbers."
Before I explain what the researchers did, let's look at the prediction graph from their website:

The numbers have changed a little since the study was first made public, but not much. If you have looked at other studies, one thing jumps out: a very rapid drop. At the end of May, daily deaths have dropped to almost 0, at least compared to the peak of around 3,000 deaths per day. This is much faster than other models have predicted.
So, what do they know that others do not? Perhaps it is that they have made very specific projections for every state in the US, trying to match current data? Nope, that's not it, others have done that, too. What stands out is that the authors have not used the standard "SEIR" or similar epidemiologic models, but instead have based their predictions exclusively on the observed deaths. Using death numbers avoids most problems that arise from limited testing, and this approach has been used by a number of groups in their models. But what is quite unique in this study is that predictions were based on observed deaths in Wuhan, China. The authors state:
"In Wuhan, strict social distancing was instituted on January 23, 2020"
Their idea is that any state that implements similar measures would see a similar development in death rates. They then state that 4 types of "social distancing" measures were taken in Wuhan:
  1. School closures
  2. "Closing non-essential services"; at other places referred to as "closures of non-essential services focused on bars and restaurants"
  3. Stay-at-home or shelter-in-place orders
  4. Major travel restrictions 
I will comment in a minute how amazingly ignorant and false these statements and conclusions are, but let's first see how they used this in their model. They write:
"A covariate of days with expected exponential growth in the cumulative death rate was created using information on the number of days after the death rate exceeded 0.31 per million to the day when 4 different social distancing measures were mandated by local and national government: school closures, non-essential business closures including bars and restaurants, stay-at-home recommendations, and travel restrictions including public transport closures. Days with 1 measure were counted as 0.67 equivalents, days with 2 measures as 0.334 equivalents and with 3 or 4 measures as 0."
That's a handful. Let me re-write it in English that can be understood:
As soon as states implemented at least 3 of the 4 social distancing measures above, the model used the observed drop in Wuhan for the predictions. States implementing just one or two measures would slow the exponential growth down by one third or two thirds.
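To make the weighting concrete, here is a minimal sketch of how such a covariate could be computed. This is my reading of the quoted sentence, not the authors' code: the function name and the example state are made up, and counting a day with no measures as one full day is my assumption, since the quote does not state that case.

```python
# Weights taken from the quoted sentence: 1 measure = 0.67, 2 measures = 0.334,
# 3 or 4 measures = 0. The weight of 1.0 for a day with no measures is an
# ASSUMPTION; the quote does not spell that case out.
DAY_WEIGHTS = {0: 1.0, 1: 0.67, 2: 0.334, 3: 0.0, 4: 0.0}


def growth_day_equivalents(measures_per_day):
    """Sum the weighted "days with expected exponential growth".

    measures_per_day lists how many of the 4 measures were in force on each
    day, starting on the day the death rate exceeded 0.31 per million.
    """
    return sum(DAY_WEIGHTS[n] for n in measures_per_day)


# Hypothetical state: 5 days without measures, 3 days with one measure,
# 4 days with two measures, then 10 days with three measures.
example_state = [0] * 5 + [1] * 3 + [2] * 4 + [3] * 10
print(growth_day_equivalents(example_state))
```

Note that once three measures are in place, additional days add nothing to the covariate, no matter how the measures are implemented or enforced.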
Another rather important assumption is stated as:
"For states that have not implemented 3 of 4 measures (school closures, closing non-essential services, shelter-in-place, and major travel restrictions), we have assumed that they will be implemented within 7 days" 
To summarize: the authors assume that within 7 days, all states would have measures in place that will be as effective as the measures taken in Wuhan. Any differences between the states in the details of how measures are implemented are completely ignored. But the measures will be sufficiently effective to completely stop the epidemic within about 2 months. This requires that additional new infections drop very rapidly, and are basically non-existent after a few weeks.

In my last post, I explained how minor differences in "stay-at-home" orders and other measures between states can explain the observed differences in the drop of new COVID-19 cases. States with very strict orders, penalties for those who ignore orders, and very limited exceptions have seen a pronounced drop in new daily cases; states with less strict measures have seen no drops, or smaller drops. But even the strictest measures in the US cannot compare to the strictness of measures taken in China.  Here are just a few of the important differences:
  • Wuhan and the province it was in were put under a complete lockdown; no residents were allowed to leave the city or province.
  • Stay-home orders were strictly enforced by security guards.
  • Extreme efforts were taken to track anyone who had contact with an infected person. Wuhan had 1800 teams of 5 or more dedicated to contact tracing.
  • Strict, supervised isolation and quarantine measures for infected persons.
  • Required use of facemasks in public, with relatively wide availability of masks. 
  • Strict traffic regulations and controls. 
  • Hotels remained open in most US states, with no or minimal travel restrictions.
In stark contrast, many states in the US have measures that are either voluntary or rarely, if ever, enforced. Business closures vary widely by state, with some states having very broad exceptions; construction-related businesses are often allowed to continue operating with minimal restrictions. Shopping is generally allowed, but many shops do not have any disinfectants available for customers, not even near registers where checking out typically requires the use of touch screens or keypads. In addition to permitted exceptions, some people choose to ignore existing guidelines or orders for a variety of reasons.

Every single one of the differences in measures between Wuhan and the US means more additional COVID-19 infections in the US. In terms of the reproduction rate R, even dropping the rate down to near 1.0, which would only stabilize the number of new infections per day, has proven elusive for some states. Even states like Oregon that implemented stricter measures than other states only see a gradual drop in new infections. This is reflected in the new case numbers: on April 7, two weeks after many states implemented COVID-19 measures and stay-at-home regulations, there are still 14 US states and territories where the number of new cases was at least 10% of the number of total cases, indicating rapid growth of the epidemic.
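To see why 10% new cases per day means rapid growth, here is a rough back-of-the-envelope sketch. It assumes that the total is counted before the day's new cases are added and that the daily growth fraction stays constant, which is a simplification; the function name and the example numbers are made up for illustration.

```python
import math


def doubling_time_days(new_cases, total_cases):
    """Days until the case total doubles at a constant daily growth fraction.

    ASSUMPTION: total_cases is the count before today's new cases are added,
    and the same fraction of new cases is added every following day.
    """
    daily_growth = new_cases / total_cases  # e.g. 0.10 for 10% per day
    return math.log(2) / math.log(1 + daily_growth)


# Hypothetical state with 1,000 total cases and 100 new cases today (10%).
print(round(doubling_time_days(100, 1000), 1))  # prints 7.3
```

In other words, a state that keeps adding 10% per day doubles its case count roughly every week.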

While there are clear indications that the social distancing measures in the US are working and slowing down the COVID-19 epidemic, the slowdown is not nearly as fast as the second study assumes. This makes it extremely likely that the total number of deaths from COVID-19 in the US will exceed the roughly 82,000 deaths the study predicts, possibly by a large margin. On average, the measures in place in the US are less strict than the measures in place in Italy; assuming that the effect will be higher is overly optimistic. Worse, it is potentially dangerous, since the low projections may encourage lifting or simply ignoring the regulations too early, thereby creating a second flare-up of the epidemic.
Note added May 4, 2020:
With almost 70,000 reported COVID-19 deaths in the US, it has become abundantly obvious that the IHME model is what I called it: bad science. The CDC, which had included the IHME model until late April as one of the models they look at, has dropped the IHME model in the May 1 update. 
One of the models still included by the CDC gives a succinct description of some of the problems that the IHME model has.

1 comment:

  1. Wow! Finally got around to reading this in full. Yeah, I've read papers like this. These guys are charlatans.

