Thursday, September 10, 2020

400,000 COVID-19 Deaths in the US in 2020?

Recently, media headlines reported that an "important" computer model now predicts more than 400,000 COVID-19 deaths in the US by the end of the year. I have described in earlier posts why I am not a fan of this particular computer model, so I'll just ignore the model. However, I'll have a closer look at the question of whether we may indeed see 400,000 deaths linked to COVID-19 in the US before the end of the year.

At first glance, the number of 400,000 seems too high. So far, the official COVID-19 death numbers in the US are below 200,000, and many people think even this number is an overstatement. So let's look at a graph to get started:

The graph shows the weekly number of COVID-19 deaths, based on death certificates submitted to the National Center for Health Statistics, as a dotted line. It also shows the number of "excess deaths" as a black line. Here's a short section from the data file that I downloaded from the CDC web site to generate the graph that explains the "excess deaths":

All lines show data for the week that ended on 8/15/2020. In the first line, the numbers show data only for the death certificates that have already been reported to the NCHS. Some states and counties are very slow to submit their data; typically, it takes 6-8 weeks for 99% of all death certificates to be submitted. When the data file was generated on 9/9/2020, a total of 55,145 deaths had been reported for the week. From previous years, the CDC has a pretty good idea how many deaths would have typically occurred in this week: 51,639. For example, the number of deaths reported for the same week in 2019 was 51,128; the expected number is based on data from the last 4 or 5 years, which eliminates most week-to-week fluctuations.

So for the week of 8/15, the number of death certificates received so far by the NCHS was 3,506 higher than expected - in other words, there were 3,506 reported "excess deaths". But the CDC knows that the data are incomplete, and can estimate how many "late" death certificates will arrive in the coming months. This number is shown in the second line, where the Type is listed as "Predicted (weighted)". The numbers indicate that the CDC expects to receive almost 5,000 additional death certificates, which will result in more than 8,000 excess deaths for the week. When the same extrapolation is applied to all recent weeks, the calculated number of excess deaths for 2020 (up to 8/29/2020) increases to 252,307.

In the third line above, the number of death certificates that list COVID-19 as a cause of death has been subtracted in the excess deaths columns. This results in 2,027 excess deaths that are not counted as COVID-19 deaths for the week, and a total of 81,012 excess deaths for the year. So roughly, for every 2 reported COVID-19 deaths, there is one additional excess death that is not reported as COVID-19.
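The arithmetic behind these figures can be retraced in a few lines of Python. This is only a sketch using the numbers quoted above; the variable names are made up for illustration and are not column names from the CDC file, and it assumes that the difference between the two yearly totals is the number of deaths with COVID-19 on the death certificate.

```python
# Figures quoted in the text for the week ending 8/15/2020 (file dated 9/9/2020).
observed_deaths = 55_145   # death certificates received so far
expected_deaths = 51_639   # CDC's expected number for that week

reported_excess = observed_deaths - expected_deaths
print(reported_excess)  # 3506

# Year-to-date totals (through 8/29/2020) after the CDC's extrapolation.
excess_all_causes = 252_307      # all excess deaths
excess_excluding_covid = 81_012  # excess deaths not counted as COVID-19

# Assumption: the difference is the number of deaths listed as COVID-19.
covid_counted = excess_all_causes - excess_excluding_covid
ratio = excess_excluding_covid / covid_counted
print(covid_counted)    # 171295
print(round(ratio, 2))  # 0.47
```

A ratio of about 0.47 is where the "one additional excess death for every 2 reported COVID-19 deaths" rule of thumb comes from.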

In the graph above, the number of excess deaths very closely follows the number of reported COVID-19 deaths. This is also true when we look at the data for individual states and regions, for example New York City and Texas:

Again, we see that the number of excess deaths is very closely linked to the number of COVID-19 deaths, but always higher. In states that had very low COVID-19 cases and deaths, for example Hawaii and Alaska, the number of excess deaths also remained very low (and sometimes negative), regardless of whether or not the states issued stay-at-home orders.

This is strong evidence that the actual number of COVID-19 deaths is significantly higher than the reported number of COVID-19 deaths - in fact, about 45% higher. One likely reason for this under-counting is that many COVID-19 victims have died without ever being tested for COVID-19, and often outside of hospitals. False-negative tests for COVID-19, which can easily happen if the PCR tests are administered too early or too late in an infection, are another possible reason.

But regardless of the reasons, it is a proven fact that the number of deaths linked to COVID-19 in the US is significantly larger than the reported number of deaths. According to death certificate data, more than 250,000 people had died in addition to the "usual" number of expected deaths by the end of August. These are additional deaths, not people "who would have died anyway".

So, now back to the question of whether the US will reach 400,000 COVID-19 deaths before the end of the year. Rather than running complicated computer models, I'll just give you a "back of the envelope" calculation. As of today, the 7-day average of confirmed COVID-19 deaths is slightly above 700 per day, or about 5,000 a week. This is down from almost 1,200 deaths per day in early August. The drop closely mimics the drop in confirmed cases, which reached almost 70,000 per day in mid-July, but seemed stable in the low forty thousands before the Labor Day weekend.

If we assume that the number of new infections and the number of COVID-19 deaths remain stable at the current level, we would see 5,000 additional confirmed COVID-19 deaths per week. With 16 weeks remaining in the year, that's an additional 80,000 confirmed deaths. As I write this, Worldometers shows a total of 196,183 COVID-19 deaths for the US; thus, we'd have about 276,000 COVID-19 deaths by the end of the year. Since the COVID-19 linked excess mortality is 45% higher, this predicts that the US will see about 400,000 COVID-19-linked excess deaths before the end of the year.
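For anyone who wants to check the envelope, here is the same calculation as a short Python sketch. It uses only the numbers from the paragraph above; the variable names are made up for illustration, and the 1.45 factor simply encodes the assumption that excess deaths run 45% above reported deaths.

```python
weekly_confirmed_deaths = 5_000  # about 700 per day, rounded as in the text
weeks_left_in_2020 = 16
confirmed_so_far = 196_183       # Worldometers total quoted above
excess_factor = 1.45             # assumption: excess deaths 45% above reported

confirmed_by_year_end = confirmed_so_far + weekly_confirmed_deaths * weeks_left_in_2020
excess_by_year_end = confirmed_by_year_end * excess_factor

print(confirmed_by_year_end)      # 276183
print(round(excess_by_year_end))  # 400465
```

Changing `weekly_confirmed_deaths` shows how sensitive the year-end number is to the "remains stable" assumption.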

If you read the media articles that I mentioned at the beginning of this post, you may have noticed that I have not considered many factors that may increase COVID-19 transmissions and deaths before the end of the year.  Some of these include:

  • Secondary transmissions after school and college re-openings. 
  • Re-starting of sport events and other large gatherings.
  • Seasonality: it is quite possible that transmissions increase during fall and winter.

Initial evidence from college re-openings has shown clearly that many students will be infected when living on or near campus. This is to a large extent because they know that their personal risk of dying of COVID-19 is extremely low; however, infected students will cause secondary infections in college and school personnel, parents, and other community members, in general people likely to be older and at higher risk for severe or deadly COVID-19 symptoms.

Perhaps the most worrying factor is the potential that infections will rise during the colder seasons. We do not know if this will happen, but there are multiple potential reasons - from spending more time indoors, where transmissions are higher, to lower vitamin D levels from reduced sun exposure, which has been linked to more severe disease. But perhaps the most troubling indicator is that the list of countries with the highest number of COVID-19 cases now includes many countries from South America, which is just emerging from winter; some of these countries report extremely high case loads and COVID-19 death numbers despite strong government interventions to reduce transmissions.

If COVID-19 transmissions in the US rise again from the current levels, regardless of whether that is due to one of the factors listed above or to other reasons, then it is likely that the number of COVID-19 deaths will increase significantly, quite possibly exceeding half a million "excess" deaths.

We'll know for sure sometime early spring next year, when most death certificates for 2020 have been submitted to the CDC.

Thursday, August 27, 2020

Separate Realities

Being trained as a scientist, I have always believed that there is such a thing as truth and a single reality. But people in the USA seem to be living in completely different realities, as a recent poll shows. It asked Americans if they found the current number of deaths from COVID-19 in the US acceptable. The results vary dramatically by political orientation:

The majority of Republicans (57%) think that the current number of deaths is acceptable, but only 1 out of 10 Democrats thinks this way.

A key to understanding these differences is that "About two-thirds of Republicans (64%) think the number of US fatalities from coronavirus is actually lower than what is being reported".  In sharp contrast, only 12% of Democrats think that the number of COVID-19 deaths is lower. But in both parties, the percentage of people who believe that COVID-19 deaths are overstated closely mimics the percentage of people who deem the current fatalities acceptable. This raises the question:

Are the reported COVID-19 death numbers accurate, too low, or too high?

This is a reasonable question. There is no doubt that not all COVID-19 deaths are reported accurately. If someone dies of COVID-19 without ever being in a hospital and without a positive COVID-19 test, the death will often not be reported as a COVID-19 death, leading to under-reporting. But at other times, COVID-19 may be listed on the death certificate even if the death clearly was not caused by COVID-19, causing over-reporting of deaths. Many people who believe that actual deaths are lower than reported numbers will have stories of someone without COVID-19 symptoms who died in a motorcycle accident or similar. Some believe that hospitals cheat by listing COVID-19 as the cause of death so they get higher reimbursements.

There is no doubt that both under- and over-reporting of COVID-19 deaths happens. Theoretically, both could happen at a similar rate so over- and under-reporting cancel each other out, but this seems unlikely. So how can we get an idea how many more deaths are really caused by the coronavirus epidemic in the US?

The most straightforward way is to compare the number of people who died since the start of the epidemic to the number of people who died during the same time period in previous years. If COVID-19 has caused a large number of additional deaths, that should show up in the number of reported deaths. This approach, named "excess death analysis", is standard when trying to estimate the impact of epidemics; for example, it is generally used to estimate how many people die of influenza.

Before we get started, let's have a quick look at the number of COVID-19 deaths reported in the US as of today (August 27, 2020, 8:15 pm):

  • The CDC web site reports 178,998 COVID-19 deaths.
  • Worldometer.info reports 184,764 COVID-19 deaths.
  • Johns Hopkins University reports 180,527 COVID-19 deaths.

Exact numbers differ a bit depending on when and how data are collected, but we can say the reported number is close to 180,000.

Next, we can look at the data from the National Center for Health Statistics, to which all states send their death reports. The web page provides a download link for "National and State Estimates of Excess Deaths", so you can download a file in .csv format that you can import into Excel or OpenOffice.

The file contains state-by-state data for weekly reported and expected deaths since 2017, and totals for the entire US. There are three data sets in the table, and we'll look at each in turn.

The first data set we'll examine is the "unweighted, all causes" set. These are the numbers for the death reports that the CDC had received by the time the file was generated. For recent weeks, these numbers will be incomplete, since not all states and counties have reported their numbers yet; typically, it takes about 8 weeks until about 99% of the death certificates have been submitted. Therefore, this data set will give an underestimate of the number of excess deaths.

The latest data included are for the week that ended 8/15/2020. The "unweighted" (incomplete) data set reports an excess of 203,840 deaths for 2020. This number is significantly higher than the number of reported COVID-19 deaths. This is a clear indication that the reported COVID-19 deaths understate the actual number of deaths caused by the epidemic.

Since many recent death certificates are missing from the "unweighted" data set, we need to look at the other data sets. The next one we can look at is the "weighted, all causes" data set. For this set, the CDC has estimated how complete the submission of death certificates was for each week and jurisdiction, and adjusted the totals to account for missing reports. The "weighted" (predicted) data set reports an excess of 245,305 deaths for 2020.

The third data set in the file ("weighted, all causes excluding COVID-19") calculates excess deaths after subtracting reported COVID-19 deaths. This gives the number of excess deaths that are not classified as COVID-19 on the death certificate. This results in 82,049 excess deaths, in addition to 163,256 COVID-19 deaths. In other words, for every 2 reported COVID-19 deaths, there is one additional death that does not list COVID-19 as the cause of death.

One way of interpreting these results is that only about two thirds of COVID-19 linked deaths are reported. In other words, the actual death toll from the coronavirus epidemic in the US is about 50% higher than reported in the official death counts.

There are a few different ways to calculate these numbers, but they all end up with pretty similar results: actual excess deaths are about 40-50% higher than reported deaths.

The final exercise is to provide a "best estimate" of the death toll as of today. The CDC spreadsheet only contains data until 8/15/2020; since then, an additional 10,536 deaths have been reported. Using the under-reporting factor of 50.3% described above means we expect more than 5,000 additional excess deaths, for a total of 261,136 excess deaths in the US linked to the COVID-19 epidemic.
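The steps that lead to this best estimate can be written down as a short Python sketch. All inputs are the numbers quoted in this post; the variable names are made up for illustration and are not taken from the CDC file.

```python
weighted_excess = 245_305   # "weighted, all causes", through 8/15/2020
excess_non_covid = 82_049   # "weighted, all causes excluding COVID-19"

# COVID-19 deaths on death certificates, and the under-reporting factor.
covid_on_certificates = weighted_excess - excess_non_covid
under_reporting = excess_non_covid / covid_on_certificates
print(covid_on_certificates)            # 163256
print(round(under_reporting * 100, 1))  # 50.3

# Deaths reported since 8/15/2020, plus the expected unreported share.
newly_reported = 10_536
extra_unreported = round(newly_reported * under_reporting)
best_estimate = weighted_excess + newly_reported + extra_unreported
print(extra_unreported)  # 5295
print(best_estimate)     # 261136
```

The estimate assumes that the under-reporting factor seen through 8/15 also holds for the deaths reported since then.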

The analysis is based on data submitted by the states to the CDC, a government organization that has been under the control of the current administration for the last 3 1/2 years. The data are publicly available, and anyone can download them and do their own analysis. But even the incomplete "unweighted" data, which do not include many deaths from the most recent weeks, clearly show:

Actual deaths linked to COVID-19 are significantly higher than reported numbers



Saturday, August 8, 2020

Evaluating the US Response to COVID-19

 How well did the US response to the COVID-19 pandemic work?  This post examines both COVID-19 tests and deaths, and compares them to data from other countries - mostly European countries, since their culture and political system are similar to the US.

Let's start with a look at COVID-19 deaths. We'll use 7-day rolling averages to smooth out day-to-day variations, and use deaths per million population so that we can compare countries of different sizes.

During the initial phase, death rates at the peak  were lower in the US than in Italy and France, two of the hardest-hit European countries. But US death rates were about 3-fold higher than in Germany.

But if we extend the graph until August, the picture changes:

In the three European countries, death rates dropped dramatically. In contrast, the drop in the US was less pronounced. We can see the differences better if we use a log scale for the y axis:

The graph below  shows a comparison of how much COVID-19 deaths were reduced from the maximum to August 1:

 Whereas France, Germany, and Italy reduced daily COVID-19 deaths by between 55-fold and 135-fold,  the US only achieved a 2.4-fold drop. In other words, the US efforts to contain COVID-19 were between 22-fold and 55-fold less effective than the efforts in France, Germany, and Italy.

But what about COVID-19 testing?

There have been repeated claims that the US has done "the most COVID-19 testing in the world". However, these claims are both false and very misleading.

In terms of absolute numbers, the country that has performed the most COVID-19 tests is China, with 90.4 million tests compared to 60 million to 64 million tests in the US. But absolute numbers mean little, since countries differ vastly in population size and COVID-19 cases.

When population size is taken into account, the US drops down to the 19th spot on the Worldometers ranking:

But while population size is one factor to consider, the more important factor is the number of actual COVID-19 cases in a country: a country with more infections also needs to test more. If we look at the number of tests relative to the number of confirmed COVID-19 cases, the US does poorly:

It is illuminating to examine how the ratio of tests to cases developed over time in different countries:

Note that the y-scale is logarithmic. For the last few weeks, the tests/cases ratio for the US has been about 12. For the European countries shown, the ratio has been above 100 - typically 10 to 20 times higher than for the US.

If we look at the ratios in April in the graph above, and know how severe the COVID-19 pandemic was in the different countries, a clear relation becomes visible: the countries with the lowest amount of testing had the worst epidemics. Of the European countries shown, Greece had the lowest number of COVID-19 deaths in April (relative to population size), followed by Germany. Greece also had the highest ratio of tests per case, again followed by Germany. The countries with the worst epidemics in April were Italy, the UK, and the United States - all countries with substantially less COVID-19 testing.

The same pattern was present again and again during the COVID-19 pandemic: less testing meant more COVID-19 deaths, and often out-of-control growth. Currently, two countries with very rapidly growing COVID-19 case numbers illustrate this: Brazil and South Africa.

With adequate COVID-19 testing, people can see rapidly rising infections, and adjust their behavior, often even before mandatory public health measures are initiated. But without sufficient testing, the warning flags go up too late. The real scope of the epidemic remains hidden from view; infections keep multiplying rapidly; and the end result is a large death toll, as was seen in Italy, Spain, France, New York City, and many other places.

The bottom line

The efforts to contain COVID-19 in the US have been substantially less effective than in other countries, including several of the European countries that were hit hardest by the COVID-19 pandemic. As a result, the current number of new COVID-19 infections and deaths is an order of magnitude higher in the US.

Even though the US has performed the second-highest number of COVID-19 tests in the world, the high level of active infections means that the current COVID-19 testing capacities are insufficient. This is evident in the poor performance of the US when comparing the test-to-case ratio with other countries. Average wait times for test results of more than 4 days, and wait times of 10 days or more for 10% of tested individuals, also underscore the shortcomings of COVID-19 testing in the US, and severely limit the potential effectiveness of contact tracing and related measures to contain the spread of COVID-19.



Wednesday, July 22, 2020

Texas, 3 Weeks Later

I started posting about COVID-19 in Texas about 3 weeks ago. Back then, the number of daily COVID-19 cases had started to increase dramatically, but the number of deaths had not increased. Let's have another look, this time with 3 weeks newer data:

The graph shows the number of new COVID-19 cases and deaths over the previous seven days. The scale of the two separate y axes (left axis for cases in blue, right axis for deaths in red) is chosen so that the two curves have the same slopes, for the most recent data, as indicated by the parallel orange lines.

The graph clearly shows that Texas had a three week delay between confirmed cases and the resulting deaths. Multiple studies have shown that this is close to the typical time between the onset of COVID-19 symptoms and deaths from COVID-19. Of course, there is some variation from case to case, with some people dying within a couple of days after diagnosis, and others dying after a couple of months in the hospital; but three weeks is about the average.

The delay between the curves also depends on when exactly a test is done; how long it takes to get and report test results; and how long it takes to report COVID-19 deaths. If we see a three week delay, that means tests are done (on average) shortly after first symptoms appear, and that the delay to get and report test results is roughly similar to the delay in the reporting of deaths - a few days. During the height of the epidemic in New York, tests were scarce, so most people could get tested only after symptoms had become severe. Furthermore, testing was backlogged so that getting results often took a week or even longer. Together, this caused the delay between increases in cases and the corresponding deaths to be much shorter, only about one week.

Unfortunately, the longer delay between case reports and COVID-19 deaths that we see now in Texas and many other states has caused many to believe that COVID-19 has somehow become a lot less dangerous - maybe even because the virus has mutated. This is not the case! While some progress has been made in treating severe cases, for example by limiting and better managing intubations, and by using drugs like remdesivir and dexamethasone, the effect of these improvements is much smaller than the apparent effect of the "death delay" and better testing.

Most people looking at the curves above will immediately understand that the number of weekly deaths has not reached its maximum yet. Even if new COVID-19 cases were to remain at the current level or drop, the three week delay means that the number of deaths will continue to rise for another three weeks. During that time, the number of COVID-19 deaths per week in Texas will roughly double, to about 1,500 deaths per week, or 200 deaths per day. This number may climb even higher if hospitals are overloaded with the large number of COVID-19 patients, which seems to be happening.

We can use the knowledge about the delay between new cases and COVID-19 deaths to calculate "time-adjusted" case fatality rates for Texas. For example, Texas reported 69,491 new COVID-19 cases in the two weeks from 6/17 to 6/30/2020. With the average delay of 3 weeks, most corresponding deaths are expected to occur between 7/8 and 7/21. The number of reported COVID-19 deaths in Texas during this time was 1,421. This gives a fatality rate ("time-adjusted CFR") of 2%.
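As a quick check, the time-adjusted CFR can be computed in Python from the two numbers above. This is a minimal sketch; the variable names are made up for illustration.

```python
cases_jun17_to_jun30 = 69_491  # new Texas cases, 6/17 to 6/30/2020
deaths_jul8_to_jul21 = 1_421   # Texas deaths three weeks later, 7/8 to 7/21/2020

# Deaths are divided by the cases reported three weeks earlier,
# not by the cases reported during the same two weeks.
time_adjusted_cfr = deaths_jul8_to_jul21 / cases_jun17_to_jun30
print(f"{time_adjusted_cfr:.1%}")  # 2.0%
```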

This means that if Texas announces 10,000 new COVID-19 cases on a given day, this will result in about 200 COVID-19 deaths, which will be reported roughly 3 weeks later. Actual reported numbers vary a lot from day to day, so it's better to look at weekly numbers: the current rate of about 70,000 cases per week will translate to about 1,400 reported COVID-19 deaths per week in Texas in three weeks. If Texas kept the number of new transmissions constant, at about 10,000 reported cases per day, this would mean about 70,000 reported deaths from COVID-19 in Texas over the next year. As I explained in a recent post, it will take at least this long, and possibly more than two years, to reach "herd immunity" in Texas.

But if we look at the actual number of deaths in Texas, and compare them to the same weeks in 2019, an even grimmer picture emerges:

The table above shows the number of deaths from "natural causes" and the number of deaths from COVID-19 that Texas has reported to the CDC; the data are available for download on the CDC website. I have used the most recent data (updated today, 7/22/2020), and omitted the last two weeks of data because the reporting for the most recent weeks is very incomplete.

As the table shows, there were almost 6,000 more deaths from natural causes in Texas between the middle of March and the end of June than during the same period last year. Only about half of these "excess deaths" listed COVID-19 as a cause on the death certificate. In many US states, COVID-19 will only be listed if a COVID-19 test was done before death, and if it was positive. For anyone who did not have a COVID-19 test before death, or who had a false negative test result, it is very unlikely that COVID-19 would be listed as a cause of death.

For every officially reported COVID-19 death in Texas until the end of June, there was another "excess" death that probably also was linked to COVID-19, either directly (an unconfirmed case) or indirectly. If this trend were to continue, and the number of cases remained at 10,000 per day, then the expected number of additional deaths in Texas over the next year would be 140,000. Given that Texas reported 187,806 deaths from natural causes in 2019, this would mean that the death rate in Texas would almost double.
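The projection can be retraced with a rough Python sketch. It uses the rounded 2% time-adjusted CFR and the rounded yearly figure of 70,000 reported deaths from this post; the variable names are made up for illustration.

```python
cases_per_day = 10_000
time_adjusted_cfr = 0.02  # rounded value from the calculation above

# Reported COVID-19 deaths over one year at a constant case rate.
reported_per_year = cases_per_day * 365 * time_adjusted_cfr
print(round(reported_per_year))  # 73000, rounded down to 70,000 in the text

# One additional "excess" death for every reported COVID-19 death.
reported_rounded = 70_000
total_extra = reported_rounded * 2
print(total_extra)  # 140000

deaths_2019 = 187_806  # Texas deaths from natural causes in 2019
increase = (deaths_2019 + total_extra) / deaths_2019
print(round(increase, 2))  # 1.75
```

With these inputs, the yearly death count would be about 1.75 times the 2019 level.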

Saturday, July 18, 2020

Herd Immunity For Texas?

Let's just say it: the governors of several southern states have decided to follow a "herd immunity" approach in dealing with COVID-19. They limit interventions to the absolute minimum level required to avoid dramatic hospital overloading. This is certainly the case in Texas and Florida.
On a purely selfish basis, I can see positives in the approach Texas is taking. If they succeed in reaching herd immunity quickly, then we can go back to Texas next winter, without having to worry much about COVID-19. But can Texas get there - or rather, how long will it take to reach herd immunity? This post tries to answer this question - but let's start with a quick recap about what "herd immunity" is.

What is "herd immunity"?

When a new virus like the SARS-CoV-2 virus that causes COVID-19 emerges, it keeps infecting more and more people until one of several things happens:
  1. People change their behavior in a way that stops transmissions. This can be voluntary, or in response to government orders like "stay-at-home" directives.
  2. A drug is found that prevents the virus from infecting others.
  3. Just about everyone in a population has been infected and cannot be infected again because their immune system remembers the virus, and keeps it from causing disease upon re-infecting a person.
Vaccines can be seen as a combination of the three steps above: the vaccine is a drug (#2) that people need to take (#1), and it builds immunity to prevent infection (#3). But a COVID-19 vaccine will not be available anytime soon, so we'll ignore this for the time being.

At the point where enough people have been infected (or vaccinated) so that an epidemic dies out, the population has reached "herd immunity". Depending on how infectious a virus is, reaching herd immunity requires that anywhere between 50% and more than 90% of the population has been infected (or successfully vaccinated).

For COVID-19, it is generally assumed that herd immunity would be reached after somewhere between 60% and 70% of a population becomes immune due to prior infection or successful vaccination. Here is a simple way of explaining this number: we know that on average, each person infected with the corona virus will infect three others. But if two of these three others are immune to the virus, then only one person will be newly infected - the epidemic stabilizes. If the "immunity rate" is a bit higher than 2 out of 3, then each person will infect less than one new person, and the number of new infections will start to decrease.
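The "2 out of 3" reasoning above can be written as a one-line formula: if each infected person would infect r0 others in a fully susceptible population, the epidemic stabilizes once a fraction 1 - 1/r0 is immune. The Python sketch below assumes this simple textbook relationship, with everyone mixing equally; the function name is made up for illustration.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Immune fraction at which each case infects one other person on average."""
    return 1 - 1 / r0

print(round(herd_immunity_threshold(3.0), 3))  # 0.667
print(round(herd_immunity_threshold(2.5), 2))  # 0.6
```

With r0 = 3 this gives the two-thirds figure used in the rest of this post; a value of 2.5 gives the lower end of the 60% to 70% range.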

How about Texas?

To see how Texas might reach herd immunity, we need to look at some numbers:
  • Population size: about 29 to 29.9 million
  • Confirmed COVID-19 cases: 307,572 (as of 7/18/2020, 4:38 pm)
  • New cases per day: 9,748 (7-day average)
  • Total COVID-19 deaths: 3,735 (as of 7/18/2020, 4:38 pm)
If 2 out of 3 Texans need to have been infected for herd immunity, that comes to 20 million Texans. With 307,572 Texans already infected, that leaves about 19.6 million infections that have yet to happen. At the current rate of about 10,000 new cases per day, a very naive estimate would be that reaching herd immunity will take another 1,963 days - more than 5 years.

What about untested infections?

I called the first estimate above "very naive" because it assumes that every single infected person in Texas has been tested for COVID-19. We know with absolute certainty that this is not the case - some people do not have any symptoms, and others do not get tested for a variety of different reasons. For New York, antibody studies have shown that only about 1 out of 10 infected people were tested - so let's plug in these numbers to see what happens.
With a "test rate" of just 10%, the number of actual cases would be 3.07 million, and the number of people who get infected every day would be about 100,000. This means that herd immunity levels would be reached in 169 days - just about when we like to become winter birds in Texas!
Unfortunately, that's not good enough: after 169 days, we'd still have about 10,000 new confirmed infections per day in Texas, and this number would just start to slowly drop lower. And since we assumed that only 1 in 10 infections is tested, the actual number of daily infections would still be about 100,000. Another way of looking at the same numbers is that about 1 in 300 persons gets newly infected every day. With 2 people staying for a couple of months, the infection risk would be about 40% - too much for my taste. Bummer.
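The infection risk estimate can be sketched in Python. The 40% figure corresponds to simply adding up the daily risks; the code assumes that "a couple of months" means 60 days, and also shows what happens if the daily risks are compounded instead. The variable names are made up for illustration.

```python
daily_risk = 1 / 300  # from the text: 1 in 300 persons infected per day
people = 2
days = 60             # assumption: "a couple of months"

# Simple sum of the daily risks for both people.
linear_estimate = daily_risk * people * days

# Chance that at least one of the two gets infected when risks are compounded.
compounded = 1 - (1 - daily_risk) ** (people * days)

print(round(linear_estimate, 2))  # 0.4
print(round(compounded, 2))       # 0.33
```

Either way, the risk that at least one of two visitors gets infected comes out at roughly one in three or more.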

Realistic scenarios for Texas

The two scenarios above describe two extreme situations: what would happen if tests cover every infected person, and what would happen if only every tenth infected person gets tested. In all likelihood, the truth lies somewhere in between. There are a few things we can look at to get clues. One is the "test positive" rate - how many COVID-19 tests come back positive. In New York City during the height of the epidemic, the test positive rate was between 50% and 70%: tests were so scarce that (almost) the only people tested were those who were extremely likely to be infected - usually people with symptoms severe enough to require hospital admission.

In contrast, the overall test positive rate for Texas is about 10.6%, much lower than in NYC. This indicates that tests have been more widely available, and that a larger fraction of infections has been captured by testing. The test positivity rate has recently increased in Texas, but always stayed below 20%.

Another way to estimate the under-reporting factor is by looking at fatality rates. In New York during the height of the epidemic, the case fatality rate (CFR) exceeded 10%; for Texas, the raw CFR is around 1.3%. Even after doing the necessary adjustment for the delay between testing and death, the "time-adjusted" CFR for Texas is about 2% to 3%.

We can use the time-adjusted CFR together with the "infection fatality ratio" (IFR) to estimate the ratio of infections to reported cases.  As the upper limit for the IFR, we can use 1.45%, which is the ratio observed in New York City. Given that Texas has a younger population and that hospitals are not as overloaded as they were in NYC, this is a true upper limit, and probably too high. As a lower limit, I'll use 0.4%, which is most likely too low.

This gives us a "hidden case factor" or "under reporting factor" range of between 1.5 and 6 for Texas; I'll use 2 and 5, which pretty much outline the reasonable range.

If we assume an under reporting factor of 5, about 1.5 million Texans have been infected with COVID-19 already, and about 50,000 new infections happen every day. At this rate, reaching herd immunity will take 368 days - one year.

If we assume an under reporting factor of 2, reaching herd immunity would take 966 days - about two and a half years.
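The day counts for all four scenarios in this post (under reporting factors of 1, 2, 5, and 10) can be reproduced with a short Python sketch. It assumes a population of 29.9 million (the upper end of the range given above), a herd immunity threshold of 2 out of 3, and a constant 10,000 confirmed cases per day; the function and constant names are made up for illustration.

```python
import math

POPULATION = 29_900_000     # assumption: upper end of the range given above
THRESHOLD = 2 / 3           # fraction that needs to have been infected
CONFIRMED_CASES = 307_572   # as of 7/18/2020
CONFIRMED_PER_DAY = 10_000  # rounded 7-day average

def days_to_herd_immunity(under_reporting_factor: float) -> int:
    """Days until the threshold is reached at a constant infection rate."""
    target = POPULATION * THRESHOLD
    infected = CONFIRMED_CASES * under_reporting_factor
    per_day = CONFIRMED_PER_DAY * under_reporting_factor
    return math.ceil((target - infected) / per_day)

for factor in (1, 2, 5, 10):
    print(factor, days_to_herd_immunity(factor))
# 1 1963
# 2 966
# 5 368
# 10 169
```

The sketch keeps the daily infection rate constant, which is the same simplification made in the text; in reality the rate would change as more people become immune.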

Similarly, we can use the known COVID-19 fatality rate in Texas to extrapolate how many additional COVID-19 deaths would occur in Texas until herd immunity is reached. For an under reporting factor of 5, the total number of deaths at herd immunity levels of infection would be 73,539; for an under-reporting factor of 2, it is 186,998.  In both cases, the number of expected COVID-19 death per day would be about 200.

A Texas-sized sacrifice of human lives

Texas is a big state. It's a very populous state, with almost 30 million people living there. The approach to COVID-19 has placed heavy emphasis on the economy.

If Texas keeps following the current approach, while keeping the number of new daily confirmed COVID-19 cases stable at about 10,000 per day, Texas will reach herd immunity somewhere within the next two and a half years, and possibly within one year. During this time, about 200 Texans will die each day from COVID-19 (although excess mortality data indicate that the actual death toll will likely be significantly higher). During the entire time, many hospitals will be operating at or beyond ICU capacity, which could further increase the death toll.

Perhaps the most positive scenario for Texas includes a rapid development of an effective COVID-19 vaccine, and wide-spread use of this vaccine in Texas. In the most optimistic scenarios, a vaccine might become available in adequate quantities at the beginning of next year. Until then, about 40,000 Texans will have died from COVID-19 - unless the governor of Texas finally decides to (re-)impose stricter and more effective measures to reduce corona virus transmissions.

Wednesday, July 15, 2020

Women Good, Men Bad - Or Bad Science?


The picture above shows the presidents of two countries with outstanding COVID-19 responses - but outstanding in very different ways.

Taiwan's president Tsai Ing-wen led what can be characterized as the most successful response of all countries, keeping the total number of deaths in Taiwan limited to seven.

In contrast, Donald Trump's efforts have led to world-wide "leadership" of the USA in infection measures: the highest number of confirmed cases (3.6 million); the highest number of COVID-19 deaths (close to 140,000); and the highest number of new daily cases. The COVID-19 death rate per million inhabitants in the US is about 1,400 times higher than the death rate in Taiwan; only six countries in the world with a population of at least one million have a higher death rate, but the US is rapidly catching up.
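The factor of about 1,400 between the US and Taiwan can be checked with a short calculation. The death counts are the ones given above; the population figures (330 million and 23.8 million) are rounded assumptions that are not part of the original text.

```python
def deaths_per_million(deaths, population):
    """COVID-19 deaths normalized to one million inhabitants."""
    return deaths / population * 1_000_000

# Deaths from the text; populations are rounded assumptions
us_rate = deaths_per_million(140_000, 330_000_000)
taiwan_rate = deaths_per_million(7, 23_800_000)

print(f"US: {us_rate:.0f} per million")
print(f"Taiwan: {taiwan_rate:.2f} per million")
print(f"ratio: about {us_rate / taiwan_rate:,.0f}-fold")
```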

So - can we conclude that women are better at leading countries than men? That seems a bit hasty, looking at just two examples. But what if we add a few more pictures? Italy, Spain, and the UK had very high COVID-19 case numbers and deaths, and all were led by men. In contrast, New Zealand did very well, and it's led by a woman. Similar arguments can be made for Germany, Iceland, and a few other countries. And indeed, composite pictures showing the "good" female leaders and the "bad" male leaders have become popular on Facebook, and even made it into articles in relatively conservative journals.

These images are convincing. I am a German-American, and I desperately wish that the US would mount a COVID-19 response similar to Germany's. If a female scientist like Angela Merkel, rather than a misogynistic reality TV star, were leading the US, that would make me insanely happy.

But I have also looked at how different countries have handled COVID-19 way too much to just "buy" a simple explanation that "women are better". While Germany has done reasonably well when compared to Italy and France, it has not done great. Neighboring Austria, which shares the same language and a very similar culture, has done better: the COVID-19 death rate (normalized to population size) is about one fourth lower, even though Austria is closer to Italy, and home of at least one of the initial "hot spots" in Europe. Austria's chancellor is male.

But even Austria has done poorly compared to some other countries that are also led by men. Singapore, for example, did extremely well in the initial response to COVID-19, and still reports a 20-fold lower death rate than Germany (and an 80-fold lower death rate than the US). Australia's response was a bit delayed, but still timely and effective, with very similar results. Greece had an outstanding response in the European Union, with a death rate that is about 30-fold lower than Italy's. All these countries have male leaders.

A quick visit to Worldometers will also show that the country with the highest COVID-19 death rate in the world is Belgium, with 844 COVID-19 deaths per million (if we ignore San Marino, with a population of less than 34,000). The Prime Minister of Belgium is female. In all fairness, Belgium does a better job at reporting COVID-19 deaths in nursing homes than many neighboring countries, but even taking this into account, the COVID-19 death rate would still be similar to the rates of other heavily affected countries like France, Italy, Spain, and the UK.

Now we come to the "Bad Science" part of the post. I am referring to a preprint titled "Women in power: Female leadership and public health outcomes during the COVID-19 pandemic".  As a preprint, this paper has not yet gone through the rigorous "peer review" process, where experts in the field look critically at the science - and this shows. However, you need to look closely before the weaknesses become clear.

The first warning signs come from the author list. The 12 authors come from a variety of institutions in different countries. The affiliations listed include "Department of Political Science", "European Environment Agency", "Environmental Planning and Climate Protection Department", "Institutes of Earth Sciences and Sustainable Development Studies", "Crawford School of Public Policy", and "Natural Capitalism Solutions". But let's put that aside and look a bit at the science.

The authors looked at 35 countries with a mix of male and female leaders, and a number of COVID-19 related measures that appear reasonable. They defined a number of criteria for inclusion, like a democratic regime (except for China) and minimum income levels. But things get "interesting" when the authors classify the countries. They write:
"Of  the  35  countries  considered,  10  have  a  woman-led  government  (Belgium, Denmark, Estonia, Finland, Germany, Greece, Iceland, New Zealand, Norway and Taiwan)"
There are two mistakes in this list: Greece and Estonia. Both countries have a female president, but that office is mainly ceremonial.
Both of these countries have done well, but the government leaders are male (Germany, which is correctly included in the "female led" list, has a male president, who is also "mainly ceremonial").

This is a significant error, since it puts two countries with very good COVID-19 responses into the wrong category.

Another red flag is raised by the following statement in the preprint:
"We excluded countries with no lockdown or with only sub-national lockdowns in place to ensure consistency across countries."
But the list of countries used includes the USA, where a number of states never issued "stay at home" orders, as well as Sweden, which chose a "herd immunity" strategy and never issued a country-wide lockdown, and Taiwan, where a lockdown was never necessary. These are two countries that increase the numbers in the "male leader" category, and one country that decreases the numbers in the "female leader" category. I am not arguing that these countries should (or should not) be included - but a basic requirement for a scientific publication is that it describes what has been done. This is clearly not the case here.

Things become even more questionable when we look at two countries that the authors have excluded: Thailand and Sri Lanka. Both countries have done extremely well in containing COVID-19, and both have male leaders. The authors gave as justification that these two countries were "without a distinct peak in daily deaths over the study period". But that seems to be a highly questionable justification, especially if we compare the death curves for Taiwan (which was included in the "female" category) and Thailand (which was excluded from the "male" category):

Thailand does show a cluster of 1-4 daily deaths from the end of March to the middle of April.
Taiwan had even fewer deaths, with one day of two deaths.

Including Taiwan, but excluding Thailand, Sri Lanka, and Singapore would seem like an arbitrary decision, if not for the effect on the results: the exclusions make the "female-led" results look better. This very much looks like intentional data massaging. That's not science, that's politics.

The intentional bias becomes even more obvious when we look at sentences like this one from the preprint:
"Female-led countries reported 1,983 (± 2,724; 95% CI) deaths, while men-led countries 13,276 (± 9,848; 95% CI), by considering average values"
The primary reason for the difference? The "men-led" countries have larger populations! When taking the differences in population size into account, the differences mostly disappear. The authors write:
"When we normalize the data per population, we find that countries led by women had 1.6-times fewer deaths per capita than their male-dominated counterparts"
That's only a factor of 1.6. But to get this low factor, the authors had to manipulate the countries in the lists, as explained above! If we correctly place Greece and Estonia in the "men-led" categories, and add countries that have done extremely well in fighting COVID-19, but were excluded by the authors, the differences become a lot smaller: a factor of 1.27. This drops to 1.23 if we omit the US because there never was a nation-wide lockdown.

For illustration, let's look at a graph that shows the death rates:
Countries led by women are shown as red bars. Note that the x-axis is logarithmic, which is better suited for the huge observed differences in death rates. Sri Lanka and Thailand, the countries the authors chose to omit, are highlighted.

The differences between countries in the "women-led" category are more than 2,800-fold; in the "men-led" category, the differences are more than 1,200-fold.  The differences between the sexes, on average? Just 1.2-fold. Perhaps the sex of the leader does play a role - but if so, that's a thousand-fold less important than other differences in how countries responded to COVID-19.

It appears that the authors of the preprint distort and misrepresent data to fit their pre-conceived conclusion that "female leaders are better". This behavior is at least close to scientific misconduct. It may give the authors a few moments of fame when this is picked up by media, but the long-term effect on science can only be detrimental. A lot can be learned from countries that have kept COVID-19 transmissions low, like Taiwan, Sri Lanka, and New Zealand. Focusing on the sex of the country's leader only distracts from learning what really works - like face masks, which were very much at the center of Taiwan's efforts to minimize COVID-19 deaths.


Friday, July 3, 2020

Has COVID-19 Become A Lot Less Deadly?

Short answer

No. This graph shows the main reason why cases have increased recently, but deaths have not yet:
Figure 1:
Typical COVID-19 time periods for New York (April) and Florida (now)

Long answer

It's easy to come to the wrong conclusions when looking at graphs like this one:
Figure 2:
COVID-19 cases and deaths in Florida

Since the beginning of June, the number of confirmed COVID-19 cases in Florida has risen about tenfold, but the number of COVID-19 deaths has remained roughly the same. So it seems obvious that something else has happened - perhaps the higher case numbers are only due to testing? Or the virus has mutated and is less deadly now? Both answers seem logical, but they are wrong. Let me explain.

To start with, let's have a look at the cases and deaths curves from earlier in the pandemic. We'll start with New York:
Figure 3:
Cases and deaths in New York in March - April

We can see that the death curve followed the case curve with a delay of about 1 week. That's true for when cases and deaths were rising at the end of March, and it's also true for the peaks in early to mid April.

But things looked quite different in Germany:
Figure 4:
Cases and deaths in Germany in March-April

For Germany, the offset between cases and deaths was much longer - about two weeks instead of just one week. Let's also look at Spain:
Figure 5:
Cases and deaths in Spain

For Spain, the delay between cases and deaths was even shorter than for New York - only about 2-3 days.

We know with absolute certainty that the observed differences were not due to changes in the corona virus that causes COVID-19. Multiple virus isolates from New York, Germany, and Spain have been sequenced and compared to each other, and while there are small differences between almost all isolates, those are mostly "silent" mutations that have no biological consequence.

However, we do know what did cause the observed differences in time lags between confirmed cases and deaths: the availability of COVID-19 tests. Germany had sufficient tests available so that most people with COVID-19 symptoms or exposure to COVID-19 patients could get tested, and test results were generally reported within a couple of days. Therefore, the time difference of two weeks between test and deaths is close to the 16-20 days that typically pass from first symptoms to death for COVID-19.

The situation was very different in New York in March and early April. Test capacity was extremely limited, so that testing was mostly limited to patients with very severe symptoms, often patients that needed hospital care. At the same time, hospital capacity in New York City was fully used, which led to very strict criteria for hospital admissions. As a result, patients were tested much later after the initial infection: not when the first symptoms developed after about 5 days, but only after symptoms got a lot worse, which often took another week or longer.

In addition, test providers were severely backlogged, so that getting test results back often took up to two weeks. Together with the delayed ordering of tests, this reduced the typical time between test results and deaths to a week. In Spain, test availability early in the epidemic was even more restricted than in New York, which reduced the test-to-death time even further.

What about Florida?

Since April, COVID-19 testing capacity in the US has increased significantly. As a result, COVID-19 tests have often been available to anyone with symptoms, and even to people without symptoms who (for example) had been in contact with confirmed COVID-19 cases. This means that on average, anyone infected with COVID-19 can get tested about a week earlier than in New York in April. Furthermore, test results are usually available within a day or two. Together, these two factors extend the time between test results and deaths by almost 2 weeks, as shown in Figure 1 above. There are also indications that the reporting of confirmed COVID-19 deaths in Florida is slower than in New York, probably by several days.

Therefore, the expected delay between the rise in confirmed COVID-19 cases in Florida and the corresponding rise in COVID-19 deaths is more than three weeks. The rapid rise in cases started about three weeks ago, so the corresponding rise in deaths would be expected to start within the next week or so.
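The way these delays add up can be sketched with a small calculation. Only the 16-20 days from first symptoms to death come from the text (the midpoint of 18 days is used); all per-region values for testing delay, test turnaround, and death reporting delay are illustrative assumptions chosen to be roughly in line with the descriptions above, not measured data.

```python
def result_to_death_lag(onset_to_death, onset_to_test, test_turnaround,
                        death_reporting_delay=0):
    """Days between a reported positive test and the reported death."""
    return onset_to_death - (onset_to_test + test_turnaround) + death_reporting_delay

ONSET_TO_DEATH = 18  # midpoint of the 16-20 days from first symptoms to death

# Illustrative assumptions (days), not measured values
regions = {
    "Germany (March-April)": dict(onset_to_test=2, test_turnaround=2),
    "New York (April)": dict(onset_to_test=7, test_turnaround=4),
    "Florida (now)": dict(onset_to_test=0, test_turnaround=1.5,
                          death_reporting_delay=5),
}

for name, params in regions.items():
    lag = result_to_death_lag(ONSET_TO_DEATH, **params)
    print(f"{name}: about {lag:g} days")
```

With these assumptions, the lag comes out at about two weeks for Germany, one week for New York, and a little more than three weeks for Florida.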

If we look at the cases and deaths for Arizona, where the rise in infections started about a week or two earlier than in Florida, we can indeed see that deaths are starting to increase:

Figure 6:
COVID-19 cases and deaths in Arizona

The number of confirmed cases in Arizona started to rise at the beginning of June; about 3 weeks later, the number of COVID-19 deaths started to rise from about 20 to almost 40.

The effect of younger people being infected

Many news reports have detailed that the current COVID-19 infection wave in the south differs from the initial infection wave: a much larger percentage of young people is infected now. To some extent, this is likely to be a distortion linked to testing. A young person with COVID-19 is much less likely to have severe symptoms, require hospital care, or die from COVID-19 than an older person; this is a well-known fact that was already described for the initial epicenter in Wuhan. When testing was limited in the US, and therefore mostly restricted to patients with severe symptoms, the likelihood that a young infected person would get tested was significantly lower than it is now, with much more testing capacity available.

However, while this may distort the picture somewhat, it is nevertheless true that younger people are now driving the wave of infections. To some extent, this is due to younger people being less concerned about COVID-19, and therefore less likely to adhere to social distancing and face mask guidelines. But independently of that, younger people tend to have a much higher number of social interactions than older people, and are therefore more likely to be infected when restrictions are lifted.

Over time, however, younger people interact with older people - their parents, grandparents, coworkers, and others. As a result, the infection wave spreads to older population groups, albeit with a noticeable delay.  A study by Dr. Jeffrey Harris, an economics professor at MIT, found this to be the case in infection time lines for Florida. Here is a graph from this study:
Figure 7:
COVID-19 infections by age group in Florida (from Harris, 2020)

The figure shows that the growth in the older (60+) age group trails the growth in the younger (20-39) age groups by about 1-2 weeks, but then increases at about the same pace. The effect of the age distribution and timing on COVID-19 deaths amounts to an additional delay of 1-2 weeks between confirmed infections and deaths.

As a result of all the factors discussed above, the overall delay between the rise in confirmed COVID-19 cases in Florida and a corresponding rise in deaths is likely to be approximately one month.

But the CFR is down!

Another argument made by "partially informed" people that "proves" that the corona virus is getting less harmful is that the case fatality ratio (CFR) is going down. The case fatality ratio is easy to calculate: just divide the number of COVID-19 deaths by the number of confirmed cases. Do this for New York on May 1, and you get 7.6%. Do this for Florida on July 2, and you get 2.1%. Quod erat demonstrandum? Not so fast!

The biggest problem with the CFR is that it uses "cases". Increase testing, and the number of confirmed cases goes up. But the number of deaths does not change (or changes only minimally, assuming most severe cases still get tested). So do more testing, and your CFR goes down! That's exactly what we are seeing - Florida has done a lot more testing than New York. But testing has changed nothing about how deadly the virus is. More testing only warns us that we have a problem earlier, giving us more time to do something to reduce transmissions.
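A minimal sketch shows how testing alone moves the CFR. The outbreak below is hypothetical: 100,000 infections and an assumed IFR of 1% are made-up round numbers, and only the share of infections that gets confirmed by testing changes.

```python
def case_fatality_ratio(deaths, confirmed_cases):
    """CFR: deaths divided by confirmed cases."""
    return deaths / confirmed_cases

# Hypothetical outbreak: identical infections and deaths, different testing coverage
infections = 100_000
ifr = 0.01                  # assumed infection fatality ratio
deaths = infections * ifr   # the same number of deaths in every scenario

for detected_share in (0.10, 0.25, 0.50):
    confirmed = infections * detected_share
    cfr = case_fatality_ratio(deaths, confirmed)
    print(f"{detected_share:.0%} of infections confirmed -> CFR {cfr:.1%}")
```

The number of deaths never changes in this example, yet the CFR drops from 10% to 2% as testing finds more of the infections.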

The really relevant number is not the CFR, but the IFR: the infection fatality ratio. But to calculate that, we need to know the actual number of infections - something we usually do not know. There are multiple ways scientists can try to estimate the true number of infections, and all of them must take age distribution into account. For different countries and regions, the studies have returned numbers in the range of 0.4% to 1.4%; these numbers have not really changed much since the first thorough estimates based on Wuhan data in February and March. One of the higher infection fatality rates of 1.45% was estimated for New York City. For the age group from 25-44 years, the estimated IFR was 0.12%; for the oldest age group (75+), the infection fatality rate was 17%.

One likely reason why the IFR in New York City was relatively high was the overloading of hospitals and ICUs. Failure to understand the delays between the rise in reported COVID-19 cases and the corresponding deaths has already led to delayed actions in several affected states, and will likely cause similar hospital capacity problems in many areas in these states - and similar high fatality rates.

Herd immunity still means more than a million COVID-19 deaths in the US

Some individuals who looked at case curves and deaths curves and then wrongly concluded that COVID-19 had mutated to a less deadly form (which it has not) have also advocated to go for "herd immunity".  To reach this point where new COVID-19 transmissions would stop "naturally", at least 60-70% of the population would have to be infected: more than 200 million Americans.  With the fatality rate seen in New York, this would lead to almost 3 million COVID-19 deaths. Even with a fatality rate at the low end of the estimates, 0.5%, "herd immunity" would still mean more than a million deaths from COVID-19.
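The arithmetic behind these numbers is a single multiplication, sketched here with 200 million infections as a round lower bound and the two fatality rates mentioned above.

```python
def expected_deaths(infections, ifr):
    """Deaths expected if `infections` people are infected at a given IFR."""
    return infections * ifr

infections = 200_000_000  # round lower bound for 60-70% of the US population

for label, ifr in (("New York City estimate", 0.0145),
                   ("low-end estimate", 0.005)):
    print(f"{label} (IFR {ifr:.2%}): {expected_deaths(infections, ifr):,.0f} deaths")
```

Since more than 200 million people would have to be infected, both results are lower bounds.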

The vast majority of Americans still considers a million deaths absolutely unacceptable. But some people value their "freedom" to party and not wear face masks higher. Often, they hide their real opinions, instead downplaying how dangerous COVID-19 is. But the science is clear, and it is not "just another opinion". Don't be fooled.

Is the more infectious mutant D614G more deadly?

A couple of hours after writing this post, I found a couple of interesting publications that describe a mutant of the corona virus called "D614G". One study by a large group of researchers from Los Alamos, Duke, Harvard, WUSTL, and the UK looked at 28,576 sequences from corona virus isolates, and tracked the changes over time. The study found solid evidence that the original strain, D614, has been largely replaced by a mutant strain, D614G, in many different countries and continents. The likely reason for this observation is that this strain is more infectious, which is supported by the observation that the mutant virus appears to be present in higher concentration in the upper respiratory tract than the original D614 strain. Such a higher concentration of viral particles would explain a higher infectivity, and it could also cause more severe disease symptoms.

A second study had looked at how common the original and mutant virus strains were in different countries, and correlated this to the reported CFRs. The study concluded that the mutant D614G strain (called G614 in the study) was linked to higher fatality rates, and therefore more pathogenic - in other words, more deadly. However, as discussed above, the CFR depends on both fatalities and testing, and the larger changes in observed rates are linked to testing differences. The testing rates vary dramatically between the countries included in the study, so any conclusion about the mutant being more pathogenic is, at best, tentative.  The study did note that the isolates from New York had a higher percentage of the D614G strain; if this strain is indeed more pathogenic than other strains, then this could explain the observed higher infection fatality ratio, possibly in combination with, and in addition to, other factors like hospital overloading.

Additional studies will be needed to clarify whether or not the D614G strain is indeed more pathogenic than the original D614 strain. At this point in time, we only know that this mutant strain that has become dominant in most countries is more infectious, and can only speculate that it may be more deadly. Still, the scientific evidence we have today points towards more, not less, deadly corona virus variants.