Tuesday, March 24, 2020

Corona Virus Infections Are Much Higher Than Reported

Currently (3/16/2020), there are about 4,000 confirmed coronavirus cases in the US, but the real number of infections is probably a lot higher: between 50,000 and one million. For other countries with a similar approach to corona testing and contact tracing, the under-reporting is similar: roughly, just 1 or 2 out of every 100 infections are reflected in the "confirmed cases" number. This post explains how I came to this conclusion, in a way that lets you run the numbers yourself if you want to check them.

Only very limited corona virus testing is available in the US, and many suspected cases could not get tested, even if the treating doctor suggested a test. Some guidelines severely limited testing, for example to health care workers and elderly patients with symptoms, and excluded others even if they showed symptoms typical of COVID-19.

But when a friend posted an estimate by a Johns Hopkins professor that up to 500,000 Americans are infected, I quickly ran some numbers in my head, and concluded "no, that number is too high". Boy, was I wrong! Today, I spent a couple of hours setting up a spreadsheet for a more careful calculation, and discovered that the professor was probably right. My estimates show that there are probably about 200,000 to 300,000 infected people in the US right now (with a range from 50,000 to one million). That's smack in the middle of Prof. Makary's estimate.

To estimate the number of current infections, I started with the number of deaths from COVID-19 in the US. As of today, 77 corona virus deaths have been reported in the US, 53 of them during the last 7 days (the exact number seems to go up every time I reload the web pages reporting the statistics - it is now 86). From that, we can work backwards, using numbers from studies and the actual reported cases in the US.

Let's start with a simplified approach. We know that about 5-6 days elapse between infection and the first onset of symptoms. The average time between the onset of symptoms and death is about 18 days. That means someone who died today probably got infected about 23 or 24 days ago. This means that the number of deaths roughly reflects the number of infections back then - more than 3 weeks ago. So, how many people were infected back then? We cannot know the exact number, but a good estimate of the infection fatality rate (the "IFR", the fraction of infected people who die) is 0.67%. For this simplified approach, we can assume that all 53 people whose deaths were reported this week were infected 23 days ago, so if we divide 53 by 0.67%, we get our first estimate of infections: 7,910.
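For readers who prefer code to a spreadsheet, this first step can be sketched in a few lines of Python. The variable and function names are my own choices for illustration; the inputs are the numbers quoted above, with 5.5 days standing in for the "5-6 days" incubation period.

```python
# Back-calculate past infections from recent deaths (illustrative sketch).
INCUBATION_DAYS = 5.5      # infection to first symptoms ("5-6 days")
ONSET_TO_DEATH_DAYS = 18   # first symptoms to death, on average
IFR = 0.0067               # infection fatality rate of 0.67%


def infections_from_deaths(deaths, ifr=IFR):
    """Number of infections that would produce `deaths` at the given IFR."""
    return deaths / ifr


lag_days = INCUBATION_DAYS + ONSET_TO_DEATH_DAYS
past_infections = infections_from_deaths(53)

print(f"Lag between infection and death: about {lag_days} days")
print(f"Estimated infections back then: {past_infections:,.0f}")
```

Changing the IFR is the quickest way to see how sensitive this step is: halving it doubles the estimate.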

The number of 7,910 is the estimated number of infections 23 days ago - but how many infections do we have today? One thing we can do is to compare the confirmed number of cases 23 days back and today. On 2/22/2020, there were 29 confirmed infections in the US; today, there are 4,597 (as of 3/16/2020, 6:07 pm CST). That's a 158-fold increase in confirmed infections. If we multiply that by our estimate of 7,910 infections 23 days ago, we get about 1.25 million infections now.
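The same scaling, written out as a small Python sketch. The variable names are illustrative; the case counts and the 7,910 estimate are the ones given above.

```python
# Scale the past infection estimate by the growth in confirmed cases.
confirmed_then = 29       # confirmed US cases on 2/22/2020
confirmed_now = 4597      # confirmed US cases on 3/16/2020
past_infections = 7910    # estimate from deaths divided by IFR

growth_factor = confirmed_now / confirmed_then
infections_now = past_infections * growth_factor

print(f"Growth in confirmed cases: {growth_factor:.1f}-fold")
print(f"Estimated infections today: {infections_now:,.0f}")
```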

However, since the US had problems getting corona virus testing to work, the initial number of reported cases may be artificially low, which would inflate the estimate of the current number. So instead, we can plug in the typical increases in infections that were reported from places where testing was available. Most studies from such places show a doubling time between 4.5 and 7 days. If infections double every 5 days, then the number of infections after 23 days would be about 24-fold higher (2 raised to the power of 23/5). That would give us an estimate of about 190,000 infections today.
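Here is a minimal Python sketch of the doubling-time calculation, so that other doubling times can be tried easily. The function name is hypothetical; the 23-day lag, the 7,910 starting estimate, and the 4.5 to 7 day range come from the text.

```python
# Project infections forward using an assumed doubling time.
def growth_multiplier(days, doubling_time_days):
    """Fold-increase after `days` if infections double every `doubling_time_days`."""
    return 2 ** (days / doubling_time_days)


past_infections = 7910   # estimated infections 23 days ago
lag_days = 23

for doubling_time in (4.5, 5.0, 7.0):
    multiplier = growth_multiplier(lag_days, doubling_time)
    estimate = past_infections * multiplier
    print(f"Doubling every {doubling_time} days: "
          f"{multiplier:.1f}-fold, about {estimate:,.0f} infections today")
```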

There are a bunch of additional refinements that we can do. For example, some of the 53 deaths happened a few days ago, which reduces the multiplier we have to use. Another factor to consider is that the number of reported deaths is likely to be lower than the actual number of deaths, since some deaths may have been diagnosed incorrectly, or have occurred outside of the health care system. We also don't know the exact doubling time. If we use shorter doubling times that are closer to the observed growth in the US, we get a higher estimate; if we use longer doubling times, the estimate is lower. The range I observed for what I deemed reasonable numbers was between about 50,000 and 1 million. The "most likely" numbers resulted in an estimate of about 300,000 infections as of today (3/16/2020). This number is about 75-fold higher than the "official" number of confirmed COVID-19 cases.
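To get a feel for how much the parameters matter, here is a rough Python sketch of a sensitivity check. It is not a copy of my spreadsheet: the 20-day lag is an assumed allowance for deaths that happened earlier in the week, and the shortest doubling time is simply the one implied by the US confirmed case counts (29 to 4,597 in 23 days).

```python
import math

# Infections "back then": deaths in the last week divided by the IFR.
past_infections = 53 / 0.0067

# Doubling time implied by US confirmed cases growing from 29 to 4,597 in 23 days.
us_doubling_time = 23 / math.log2(4597 / 29)

doubling_times = [us_doubling_time, 4.5, 5.0, 7.0]   # days
lags = [20, 23]   # days; 20 is an assumption, not a measured value

estimates = {
    (lag, dt): past_infections * 2 ** (lag / dt)
    for lag in lags
    for dt in doubling_times
}

for (lag, dt), value in sorted(estimates.items()):
    print(f"lag {lag} days, doubling every {dt:.2f} days: {value:,.0f}")

low, high = min(estimates.values()), max(estimates.values())
print(f"Range: {low:,.0f} to {high:,.0f}")
```

With these particular inputs, the spread comes out at roughly 57,000 to 1.25 million, which is the same ballpark as the range quoted above. Allowing for under-reported deaths would push all of the numbers up.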

There are a number of simplifications in these calculations that would affect the outcome somewhat; however, I believe that their effects would be smaller than the changes that result from reasonable changes in the parameters, and that the overall "bottom line" is solid.

The bottom line

The reported number of "confirmed" COVID-19 cases in the US understates the actual number of corona virus infections by about 50- to 100-fold. The best estimate for the current number of infections in the US is about 300,000, with a range from 50,000 to 1,000,000.


P.S.: The day after I wrote this post, I found a study published in the journal Science which concluded that 86% of the infections in Wuhan remained undetected prior to January 23rd - only about one in seven infections was detected. If the same factor were applied to the reported case numbers in the US, there would have been about 28,000 infections on 3/16/2020. However, testing in the US has been unavailable for many patients, even with symptoms and a doctor's note, which means that the actual underreporting factor in the US is likely to be much larger than 7.
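The conversion from "86% undetected" to an underreporting factor can be checked with a short Python sketch. The variable names are illustrative, and the round figure of 4,000 confirmed US cases is taken from the start of this post.

```python
# Convert an undetected fraction into an underreporting factor.
undetected_fraction = 0.86               # share of infections that went undetected
detected_fraction = 1 - undetected_fraction
underreporting_factor = 1 / detected_fraction   # infections per detected case

confirmed_cases = 4000                   # approximate US confirmed cases, 3/16/2020
estimated_infections = confirmed_cases * underreporting_factor

print(f"Underreporting factor: about {underreporting_factor:.1f}")
print(f"Implied US infections: about {estimated_infections:,.0f}")
```

An 86% undetected share means roughly one in seven infections shows up in the confirmed numbers, which is far less under-reporting than the 50- to 100-fold estimated from the death counts.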

This post was originally posted on 3/16/2020 at boardsurfr.blogspot.com/2020/03/corona-virus-infections-are-much-higher
