He said: "No, Houston, it is small - we've got nothing to worry about", but could not understand why they did not believe him.
The basic idea is that COVID-19 patients whose symptoms are severe enough to require hospital treatment are much more likely to be tested than someone with mild symptoms. If we then take into account what percentage of patients end up in the hospital, we can estimate the number of infections. To do so, we also need to take into account that several days elapse between infection, the onset of symptoms, and admission to the hospital. Many studies have tried to determine each of these variables, so reasonably good estimates are available. These can then be fed into a computer model that calculates how many infections correspond to the observed number of hospitalizations. The authors used numbers from Santa Clara County, a region in California that was hit relatively hard by the COVID-19 epidemic.
With the most likely set of parameters, the study estimated a total of 6,500 infections for the almost 2 million people in the county on March 17. With a very optimistic set of parameters, the number is reduced to 1,400 cases; with a more pessimistic set, the number increases to 26,000.
A look at the COVID-19 tracking site shows that for all of California, there were 483 confirmed cases on 17 Mar 2020. The internet archive site shows that 155 of these cases were in Santa Clara County. So only 155 out of 6,500 cases were reflected in the "official" numbers. In other words, the tests captured roughly 1 out of every 41 infections.
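The ratio above is simple enough to check by hand, but a short sketch makes it easy to see how much it moves with the study's optimistic and pessimistic estimates. The function name `hidden_case_factor` and the variable names are made up for this illustration; the numbers are the ones quoted in the text. The central ratio comes out close to 42, in line with the factor of about 41 used in this post.

```python
def hidden_case_factor(estimated_infections: float, confirmed_cases: float) -> float:
    """Ratio of estimated infections to officially confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return estimated_infections / confirmed_cases


# Confirmed cases in Santa Clara County on 17 Mar 2020, as quoted in the text.
CONFIRMED_SANTA_CLARA = 155

# Infection estimates from the study for the three parameter sets.
scenarios = {"optimistic": 1_400, "most likely": 6_500, "pessimistic": 26_000}

scenario_factors = {
    name: hidden_case_factor(infections, CONFIRMED_SANTA_CLARA)
    for name, infections in scenarios.items()
}

for name, factor in scenario_factors.items():
    print(f"{name}: roughly 1 in {factor:.0f} infections was confirmed")
```

Even the optimistic parameter set leaves most infections outside the official count.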
This "hidden case" factor of 41 is similar to the factor of 35 reported before for a similar study that looked at reported COVID-19 deaths in the US. There is a 6-day time difference between those two numbers, during which the more widespread availability of tests reduced the "hidden case" factor for the US; on March 17, the factor from the death-based study was 92. A likely source of the discrepancy is that Santa Clara County had more test capacity available than was typical for the US; the county has reported the ability to do 100 tests per day. Even with this test capacity, which looks reasonable when compared to reported case numbers, the county lab limited testing to hospital patients and members of high-risk groups.
Where does the large "hidden case" factor come from?
Multiple issues contribute to it, but we can estimate at least one of them: the "infection-to-test delay factor". If every person were tested every single day, that factor would not exist (or, mathematically, be 1.0). But this cannot be done anywhere in the world, nor would it make sense. Instead, the minimum wait is until the first symptoms appear, which is about 5 days for COVID-19. During that time, an epidemic with a doubling time of 5 days has doubled in size. In other words, for an epidemic with a doubling time and a time to onset of symptoms of 5 days each:
- If every single infected person were tested on the first day of symptoms, the "infection-to-test delay factor" would be 2.0
- A more realistic estimate of the "infection-to-test delay factor" is 4.0
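The two values in the list follow from plain exponential growth: while a test result is delayed, the epidemic keeps doubling. Here is a minimal sketch, with a hypothetical helper `delay_factor`. Note that reading the factor of 4.0 as a total delay of 10 days (two doubling times) is an assumption made here for illustration; the post does not spell out that split.

```python
def delay_factor(delay_days: float, doubling_time_days: float) -> float:
    """Growth of the epidemic during the delay between infection and test.

    Assumes pure exponential growth with a constant doubling time.
    """
    if doubling_time_days <= 0:
        raise ValueError("doubling_time_days must be positive")
    return 2.0 ** (delay_days / doubling_time_days)


# Tested on the first day of symptoms: 5 days after infection.
factor_at_onset = delay_factor(delay_days=5, doubling_time_days=5)

# Assumption for illustration: 10 days in total between infection and a
# reported test result, which corresponds to the "more realistic" factor.
factor_with_reporting_delay = delay_factor(delay_days=10, doubling_time_days=5)

print(factor_at_onset, factor_with_reporting_delay)
```

Under the same assumptions, a slower epidemic with a longer doubling time would give a smaller delay factor for the same testing delay.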
What else contributes to the "hidden case" factor?
As in many other countries, access to tests in the US has been limited. Having symptoms alone was not a sufficient reason to qualify for a test; instead, factors like exposure to a confirmed COVID-19 case, travel history, or being a medical professional were required, often together with symptoms. Without such factors, COVID-19 testing was generally limited to patients with severe symptoms. A typical estimate is that 80% of cases have only mild symptoms (or no symptoms at all). If only the remaining 20% qualify for testing, this adds a "light symptoms factor" of 5.
There are other reasons why an infected person may not get tested, or not be included in the official case numbers. Some people prefer not to get tested, even with moderate symptoms; some may also have a cold or the flu which obscures the COVID-19 infection; some may not be able to get to a testing site, or to convince their doctor to give them the necessary note; testing may fail and return a false-negative result; and more. We'll group all those together into one factor, which we will call the "other reasons factor". Let's assume this factor is smaller than the others, and give it a value of 2.
We can view the inverse of each factor as a probability. The chance that any given infection is "old enough" is 1 / 4, or 25%. The chance that an infection will be severe enough to warrant testing is 1 in 5, or 20%. The chance that none of the "other reasons" keeps a case out of the official numbers is 1 in 2, or 50%. If we treat these as independent probabilities, we can simply multiply them to see what the chances are that an infected person will be tested and included in the official numbers:
- p(report) = 0.25 * 0.2 * 0.5 = 0.025 = 1/40
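The same multiplication can be written as a small sketch, which makes it easy to see how the result changes if one of the factors is revised. The dictionary keys and variable names are chosen for this illustration, and treating the three probabilities as independent is the simplifying assumption made in the text.

```python
from math import prod

# Factors as estimated in the text; each one reduces the share of
# infections that end up in the official numbers.
reduction_factors = {
    "infection-to-test delay": 4.0,
    "light symptoms": 5.0,
    "other reasons": 2.0,
}

# The inverse of each factor is read as a probability.
probabilities = {name: 1.0 / factor for name, factor in reduction_factors.items()}

# Assuming independence, the probabilities simply multiply.
p_report = prod(probabilities.values())
combined_hidden_factor = 1.0 / p_report

print(f"p(report) = {p_report:.3f}, hidden case factor = {combined_hidden_factor:.0f}")
```

For example, setting the "light symptoms" factor to 1.0 in this sketch (everyone with symptoms gets tested) would leave a combined factor of 8.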
Bottom line: A large "hidden case factor" can easily be explained
Yes, the official numbers of confirmed cases are indeed likely to understate the number of actual infections by a factor of 40. This can not only be explained, but would actually be expected from what we know about the testing and the disease:
- An "infection-to-test delay factor" of 4 is caused by the incubation time and by delays between the onset of symptoms and reporting.
- A "light symptoms factor" of 5 is the result of limiting testing to cases with more severe symptoms.
- Multiple other factors together are less important, and can be summarized in an "other reasons factor" of 2.0.
To improve this situation, testing guidelines must be relaxed. For example, the "light symptoms factor" of 5 could easily be reduced by testing everyone with light symptoms; the testing guidelines in several countries now allow for all suspected cases to be included.
Even if tests are made much more available, though, the "infection-to-test delay factor" remains as long as the epidemic is in an exponential growth phase. The only way to get a true number of current infections would be to test an entire population (or a representative subset) regardless of symptoms. One step that goes at least in this direction, but is easier to implement, is contact tracing with complete testing of all identified contacts.
Understanding the "hidden case factor" is very important for dealing with the pandemic. Humans have trouble grasping how quickly a rapidly growing epidemic spreads; the "hidden case factor" only compounds this problem. People tend to take a reported number at face value, and conclude that the danger is minimal if the number is small. But let's look at what 100 confirmed cases really means for an epidemic that doubles every 5 days if no effective interventions are taken:
- Now:
  - 100 confirmed cases
  - 4,000 actual infections
- 15 days later (three doublings):
  - 800 confirmed cases
  - 32,000 actual infections (of which 160 will die if the IFR is 0.5%)
- 2 months later:
  - about 16,000,000 infections without interventions
  - about 80,000 deaths within 3 months, even if no additional infections happen after 2 months
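The projection above can be reproduced with a few lines of code. This is only a sketch under the assumptions stated in the post: unchecked growth with a 5-day doubling time, a constant hidden case factor of 40, and an IFR of 0.5%. The function `project` and its parameter names are invented for this example. The figures of 800 confirmed cases and 32,000 infections correspond to three doublings (15 days); twelve doublings (60 days) give 16,384,000 infections and 81,920 deaths, which the list rounds to 16 million and 80,000.

```python
def project(confirmed_now: float, days: float, hidden_factor: float = 40.0,
            doubling_days: float = 5.0, ifr: float = 0.005) -> dict:
    """Project confirmed cases, infections, and eventual deaths after `days`.

    Assumes unchecked exponential growth and a hidden case factor that
    stays constant over time; both are simplifications.
    """
    growth = 2.0 ** (days / doubling_days)
    confirmed = confirmed_now * growth
    infections = confirmed * hidden_factor
    deaths = infections * ifr  # deaths eventually resulting from these infections
    return {"confirmed": confirmed, "infections": infections, "deaths": deaths}


for days in (0, 15, 60):
    p = project(100, days)
    print(f"day {days:>2}: {p['confirmed']:>9,.0f} confirmed, "
          f"{p['infections']:>12,.0f} infections, {p['deaths']:>7,.0f} eventual deaths")
```

Changing `doubling_days` shows how strongly the outcome depends on the growth rate: any intervention that lengthens the doubling time shrinks the two-month numbers dramatically.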
Did you like the iceberg picture at the top of the post? An iceberg is actually a pretty bad analogy here. For icebergs, about 10% is above the water, much more than the 2.5% of COVID-19 infections that show up in the official numbers. Icebergs also barely move, and they do not grow exponentially.