Tuesday, April 21, 2020

Better COVID-19 predictions

Any model for COVID-19 cases should be based on actual data. Any extrapolation towards the future has to reflect what we know about current case trends, the coronavirus, and epidemiology.

I have developed such a model which I call the "Data Trend COVID-19 Model". Let's start with some results before I explain the model. Here is a projection of total "confirmed cases" and deaths in the US for the next two months:
The numbers shown are based on state-by-state projections, using the basic assumption that future decreases and increases in case numbers will stay on the same course as during the last week or two. That's easiest to see if we look at state level predictions:
The graph shows the 11 states which had the highest growth in new COVID-19 cases in the last week. The data on the left side are actual reported numbers; I used 3-day averages to smooth out the very erratic reporting by some states. The shaded area on the right is a projection of how many new cases each state will see if the current trend continues. More about that later; let's first look at the other states:
The second group of 10 states has a lower growth rate, but the number of new cases was still increasing during the last week in each of them.
In the third group of 10 states shown above, new daily cases remained roughly at the same level over the last week.
The final group of 11 states saw daily case numbers go down in the last 7 days, so the projection continues this trend. Some of the smaller states that had very few COVID-19 cases are not shown in the graphs above.

Overall, about half of the states showed growth in daily COVID-19 cases during the last week, and only about one quarter of the states showed a decline. New York, which had by far the highest number of COVID-19 cases among the states, showed a decline; New Jersey, the state with the second-highest number, was roughly flat.

Note that the y-axis in the graphs above uses a logarithmic scale. A straight line on the logarithmic scale indicates exponential growth when the line goes up, and exponential decline when the line goes down. The curve for almost every state shows a straight line going up, a curved transition phase, and then a new straight line that is much flatter than before. In about half of the states, this line goes up slowly; in about a quarter of the states, it goes down slowly.
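The post does not spell out the fitting procedure, so the following is only a minimal sketch of one way to implement the idea described above: smooth daily new cases with a 3-day average, fit a straight line to the logarithm of the last week of smoothed values, and extend that line into the future. The function names, the 7-day fitting window, and the trailing form of the average are illustrative assumptions, not the actual Data Trend COVID-19 Model.

```python
import math

def three_day_average(daily):
    """Trailing 3-day average; the first two days use the days available."""
    out = []
    for i in range(len(daily)):
        window = daily[max(0, i - 2): i + 1]
        out.append(sum(window) / len(window))
    return out

def fit_log_trend(values):
    """Least-squares line through log(values); returns (slope, intercept).

    All values must be positive, because the fit is done on their logarithm.
    """
    n = len(values)
    xs = list(range(n))
    ys = [math.log(v) for v in values]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def project(daily, days_ahead, fit_window=7):
    """Extend the log-linear trend of the last fit_window smoothed days."""
    smoothed = three_day_average(daily)[-fit_window:]
    slope, intercept = fit_log_trend(smoothed)
    last_x = len(smoothed) - 1
    return [math.exp(intercept + slope * (last_x + d))
            for d in range(1, days_ahead + 1)]
```

A positive slope corresponds to a line going up on the logarithmic scale (exponential growth), a negative slope to a line going down (exponential decline), and a slope near zero to the roughly flat states.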

Let's have a look at what this looks like with a linear scale for the y-axis:
One thing that jumps out is that New York (the yellow curve) had a relatively sharp drop in the last few days (before the gray area on the right). If the data in the next few days also show such a sharp drop, the extrapolated numbers for New York are too high; the line should be angled down more. We should get a better idea if this is the case in the next few days. There are two effects that come into play here: (a) the "weekend effect", and (b) distortions from not enough tests being available in New York during the height of the epidemic.

The "weekend effect" can be seen in the case numbers for many states, which tend to be lower on weekends and (sometimes) on Monday, and higher during the middle of the week. This probably reflects that some of the test labs run with reduced personnel on weekends.

The distortions that were introduced because not enough tests were available at the end of March and beginning of April in New York are illustrated in this graph:

In the graph, the number of cases estimated from reported deaths is scaled and time-shifted so that it should match the number of reported cases closely. However, this is not the case after 3/20: the number of new cases was growing faster than the test capacity, as is evident from the increasing fraction of positive tests (the green line). This led to extensive delays. The yellow curve shows an estimate of the number of tests that were delayed. This first caused a flattening of the reported "confirmed cases", which afterwards stayed higher for an extended period of time. Imagine the tip of an iceberg cut off, and glued onto the right side of the iceberg - that's what happened. This distortion means that we do not really see how fast cases in New York really dropped.
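The "scaled and time-shifted" estimate can be illustrated with a short sketch. The lag between case confirmation and death and the number of confirmed cases per death are not given in this post, so `lag_days` and `cases_per_death` below are hypothetical parameters that would have to be chosen so that the two curves match during a period when testing was not constrained.

```python
def cases_from_deaths(daily_deaths, lag_days, cases_per_death):
    """Shift deaths back in time by lag_days and scale them to case counts.

    Returns a list aligned with daily_deaths. The last lag_days entries are
    None because the deaths that would inform them are not reported yet.
    Both parameters are assumptions for illustration.
    """
    n = len(daily_deaths)
    estimate = []
    for day in range(n):
        source = day + lag_days  # deaths on this later day reflect cases on `day`
        if source < n:
            estimate.append(daily_deaths[source] * cases_per_death)
        else:
            estimate.append(None)
    return estimate
```

Days on which this estimate runs well above the reported "confirmed cases" would be candidates for the test backlog described above.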

Now let's get back to the first two figures on top of this post. The second figure shows steadily increasing daily case numbers in all 10 states for the next month. In reality, this is unlikely to happen: if case numbers keep rising, it is likely that governors will issue additional regulations to stop the growth of the local COVID-19 epidemic. To reflect this, I re-ran the model with growth in daily new cases limited to the first 21 days of the projection; after that, daily cases stay level. Here is the result:

The total number of deaths until 6/19/2020 here is about 160,000 - about 63,000 fewer than if the growth in daily cases continues. Here's a closer look at the numbers:
  • Assuming no additional restrictions in states with growing numbers of daily COVID-19 cases:
    • All states: 4.5 million confirmed cases, 223,901 deaths
    • Excluding NY and NJ: 3.7 million cases, 169,034 deaths
  • With additional restrictions that lead to steady daily cases after 21 days:
    • All states: 2.5 million cases, 160,642 deaths
    • Excluding NY and NJ: 1.7 million cases, 105,854 deaths
In both scenarios, the majority of deaths come from states other than New York and New Jersey. Note that the reported numbers are only for the period until 6/19/2020, about 2 months from now. A very large number of additional deaths would be likely after this period if the case rates do not start dropping much more rapidly very soon.
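The second scenario in the list above holds daily cases steady once 21 days of growth have passed. A minimal sketch of that rule, assuming a constant day-over-day growth factor per state; the function name and the example values are hypothetical, since the model code itself is not published here:

```python
def capped_projection(last_value, daily_growth, days_ahead, cap_days=21):
    """Project daily new cases with growth limited to the first cap_days.

    daily_growth is the day-over-day factor (1.02 means +2 percent per day).
    States with flat or declining cases (factor <= 1) are never capped.
    """
    projection = []
    value = last_value
    for day in range(1, days_ahead + 1):
        if daily_growth <= 1.0 or day <= cap_days:
            value *= daily_growth
        projection.append(value)
    return projection

# Hypothetical state: 1,000 new cases per day, growing 2 percent per day.
uncapped = capped_projection(1000.0, 1.02, 60, cap_days=60)
capped = capped_projection(1000.0, 1.02, 60)
extra_cases_avoided = sum(uncapped) - sum(capped)
```

In the reported totals, the difference between the two scenarios is 223,901 - 160,642 = 63,259 deaths, and states other than New York and New Jersey account for roughly 75 percent of deaths in the first scenario and 66 percent in the second.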

All of the data shown above are based on the assumption that current restrictions like "stay-at-home" orders and business closures remain in effect, and that the number of people following these orders does not change. This seems extremely unlikely, given the relentless push to ease the restrictions by the president and right-wing media. The governors of Georgia, South Carolina, and Tennessee all have already announced partial re-openings starting between April 20 and May 1. It is virtually certain that this will lead to a more rapid growth of new COVID-19 infections in these states.

Unfortunately, the negative effect of "re-openings" will be delayed and incremental. The observed delay between new infections and their inclusion in official "confirmed cases" is about 10 days to more than two weeks. Gradual adaptation by the population is likely to lengthen the time before the number of new cases rises even further. Most governors will be very reluctant to re-enact restrictions due to small increases in COVID-19 infections. If, for example, restrictions are lifted for 4 weeks, and the doubling time during this phase is 7 days (much slower than in past months), this would lead to a 16-fold rise of COVID-19 infections in the affected states. That's an increase of 1,500 percent in new daily cases. In comparison, the observed drop in new cases in the US during the last 2 weeks was closer to 10 percent per week, and most states did not show any noticeable drop at all.
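The arithmetic behind this example is easy to check. Four weeks at a 7-day doubling time is four doublings, so daily cases end at 16 times (1,600 percent of) their starting level, which is an increase of 1,500 percent. The sketch below also shows, for comparison, where four weeks of a 10 percent weekly decline would lead; the function name is only illustrative.

```python
def growth_factor(days, doubling_time_days):
    """Multiplicative change after `days` at a constant doubling time."""
    return 2 ** (days / doubling_time_days)

# 4 weeks of re-opening with cases doubling every 7 days
factor = growth_factor(28, 7)              # 16.0
percent_increase = (factor - 1) * 100      # 1500.0

# 4 weeks of the observed decline of about 10 percent per week
remaining_fraction = 0.9 ** 4              # about two thirds of the start level
```

In other words, four weeks of growth at this pace would undo far more than four weeks of the current decline could recover.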

The example of New York shows that drastic and enforced stay-at-home orders work, and can lead to a reduction in new transmissions. Many states have tried to get away with less stringent restrictions, for example by including a wide variety of businesses in the "essential" category, and by making adherence to restrictions voluntary; most of these states still show a growth in daily new COVID-19 infections.

The desire to go "back to normal" is understandable. However, the actual data we have about the COVID-19 epidemic clearly indicate that it is much too early to lift restrictions; if anything, most states need more stringent restrictions. Any hope to re-open without incurring many hundreds of thousands of COVID-19 deaths absolutely requires fast testing and efficient tracking of contacts. Currently, not a single state in the US has both testing and tracking in place.

Today saw a new record for reported COVID-19 deaths in the US; as I am writing this, Worldometers shows 2,715 deaths. The total official number of COVID-19 deaths in the US now exceeds 45,000; the vastly over-optimistic IHME model predictions of 60,000 deaths that the White House so loves will be proven wrong within the next 10 days. On the current trajectory, we would see between 160,000 and 220,000 COVID-19 deaths within the next two months, even without "re-opening". With "re-opening" happening too early, it is likely that several hundred thousand people will die of COVID-19 in the US. The data show this very clearly.
