Saturday, December 11, 2021

O Mi Cron

 It's been about 2 years since the COVID-19 pandemic started. It's been a year since vaccines have been approved. This COVID-19 thing should be over. So why, in hell, am I writing another post about COVID-19?

Unfortunately, it is not over. The US still has more than 1,000 COVID-19 deaths per day. Germany has reported record numbers of COVID-19 infections in the last couple of weeks. A new virus variant, Omicron, has emerged, which has replaced Delta in regions of Africa, and threatens to do the same in other parts of the world - partly because its many mutations allow Omicron to evade the immune response from vaccinations or previous COVID-19 infections.

In this post, I'll look at some of the things we have learned about Omicron, and from other variants. I'll also look at data from different countries and US states to examine how effective vaccinations are with respect to reducing COVID-19 infections, and with respect to reducing severe COVID-19 cases and deaths.

Finally, I will combine these insights with some basic facts about the human immune response to speculate what will happen in the next years.


The Omicron variant of the SARS-CoV-2 virus has about 50 mutations relative to the original Wuhan strain. What makes Omicron so unusual is that about 30 of these mutations are in the spike protein, which the virus uses to invade human cells; and that there are no closely related strains known.

The first description of the Omicron variant is less than 3 weeks old, so our knowledge about it is still limited. It looks like Omicron first started spreading in the Gauteng province in South Africa in October. In less than 2 months, it became the dominant strain in the Gauteng province, mostly replacing the previously dominant Delta variant. It has also spread to dozens of other countries and US states.

The fast spread of Omicron is an indication that Omicron may be even more infectious than Delta. Reports from the UK, Germany, and other countries indicate that vaccines and previous infections offer only limited protection from infection with Omicron, with protection of 75% or more only seen in persons who recently received a third (booster) vaccine shot, or who received two vaccinations and also had a previous COVID-19 diagnosis.

Currently, there are not enough data available to know how severe COVID-19 caused by Omicron will be. Limited anecdotal evidence suggests that Omicron cases may not be as severe as cases caused by other variants. However, many patients who tested positive for Omicron had been previously infected by other variants, or were vaccinated, which significantly reduces the chances of severe or deadly COVID-19.

COVID-19 Vaccine Effectiveness

A large number of scientific studies from different countries have shown that COVID-19 vaccination is effective at preventing COVID-19 infections, but even more effective at preventing severe COVID-19 cases, hospitalizations, and deaths. The RKI in Germany publishes one of the most comprehensive studies on a weekly basis. Germany tracks COVID-19 infections, vaccinations, hospitalizations, and deaths in a central register, which enables tracking the vaccination status for positive cases, hospitalizations, and deaths, and therefore calculating the effect of the vaccinations for each calendar week:

The region shaded in grey at the right side of the graph indicates data from the most recent weeks, which are incomplete due to reporting delays. The data clearly show that vaccinations prevent infections and hospitalizations in all age groups. With careful analysis, this effect can be quantified:

The graphs show that the protection from infection (top left graph) went down slightly over time, from about 85% to about 70%, and then started increasing again due to booster shots in the most recent weeks. The protection from severe outcomes like hospitalization (top right), ICU (bottom left), and death (bottom right) remained higher, generally between 80% and more than 90% for all age groups.

Another way to see the effect of vaccinations is by comparing countries, for example the UK and the USA. Let's start with the US (data from

The graphs for cases, hospitalizations, and ICU admissions all look quite similar, with the peak in the fall of 2021 just slightly lower than the peak in January 2021. However, the summer/fall death peak is somewhat reduced relative to the January peak.

For comparison, here are the graphs for the UK:

Looking at the case numbers in the summer/fall relative to last winter, the trend in the UK was comparable to the US, or even worse, since the case numbers remained high. However, the second peaks for hospitalizations, ICU, and deaths are all significantly reduced - about 4x to 9x lower than the peak numbers last winter. In other words: while the UK has experienced a similar peak in COVID-19 cases in the fall, the increase in severe COVID-19 cases and deaths was several-fold lower than in the US.

The reason for the observed difference can be seen in the vaccination statistics for the two countries:

  •  Not fully vaccinated (12 years and older)
  • Not fully vaccinated (65+):
    • US: 12.8%
    • UK: <4%

For many southern US states, which were hardest hit by the summer/fall 2021 wave, vaccination numbers are substantially lower than the US averages. For example, more than 20% of the 65+ population in Georgia and Alabama is not fully vaccinated, and the non-vaccinated rate in the 12+ population is above 40% in Alabama (45.3%), Mississippi (44%), Louisiana (41.5%), Tennessee (41.5%), Georgia (41.4%), and Arkansas (41.2%).

The UK data show that vaccinations can cause a substantial reduction in hospitalizations and deaths from COVID-19 if a high percentage of the population (and a very high percentage of the most vulnerable population) is vaccinated.

Variants, Vaccines, and the Immune System

Over the past two years, we have seen a new virus variant replace the previous variants roughly every six months: the original Wuhan strain was quickly replaced by the D614G mutation, which was replaced by the Alpha strain, which was replaced by the Delta strain. In some regions of the world, other variants like the Beta and Kappa strains also temporarily became common or dominant strains.

Each subsequent strain had advantages over the previous strains that helped it become dominant; typically, the newer strains had mutations that made them more infectious, so that they simply spread faster. Some of the later strains also showed some signs of "immune escape", which resulted in increased re-infections, as well as reduced effectiveness of antibody therapies.

What we are seeing is "evolution live": the SARS-CoV-2 virus adapts more and more to its new host, the human body. The speed of this evolution is directly related to the number of infected "hosts" - the more people are infected, the faster the virus will change, becoming better at infecting more people. We have absolutely no reason to believe that this will stop anytime soon; it seems quite likely that Omicron will be replaced by an even "better" variant before the end of the next year, especially given vaccine "hesitancy" in many countries and regions.

Theoretically, vaccines can be adapted for new strains. However, even though data showed that vaccine efficacy was reduced for the Delta variant, no new vaccine that has been modified for Delta has been approved yet. Technically, modified vaccines could be developed within weeks, and manufactured within maybe a couple of months. However, clinical testing for safety and efficacy will likely add several months, and regulatory approval even more time. Additional complications come into play from travel and other restrictions. For example, if traveling requires full vaccination, should a participant in a trial for a new vaccine be allowed to travel before the vaccine he receives is proven to work?

A lot of the research about new variants simply looks at antibodies: how well do antibodies from a vaccinated or previously infected person work against a virus? That's a reasonable question, but there is another reason why we hear a lot about such studies: they are relatively easy to do, and give quick answers. The answers are often condensed down into a single number. For the Omicron variant, for example, reductions by factors of 20 or 40 are often reported.

Interpreting these numbers, however, requires some understanding of the tests that are being done. Typically, the reported number is basically a dilution factor - the "titer" required to inhibit, for example, the ability of the virus to infect cells in the lab. The titer in effect combines two aspects of the antibody immune response: 1. how well antibodies bind to a virus, and 2. how many antibodies are present in the blood. For Omicron, one study reported that a booster shot gives relatively good protection within the first 2 weeks after the shot, but significantly less protection after 3 months. The difference here is entirely due to the "how many" aspect of the titer: over time, the concentration of antibodies in the blood goes down (assuming no new infection or vaccination). Typical half-lives for antibodies in the blood are in the range of several weeks; so after 3 months, the amount of COVID-19 specific antibodies may have dropped by a factor of 10 to 100.
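The "factor of 10 to 100" follows from simple exponential decay. Here is a minimal sketch, assuming a constant half-life and taking 90 days for "3 months"; the function name and the specific half-lives of 2, 3, and 4 weeks are my illustrative choices for "several weeks", not measured values:

```python
def fold_reduction(days_elapsed: float, half_life_days: float) -> float:
    """Factor by which a concentration drops under simple exponential decay."""
    return 2.0 ** (days_elapsed / half_life_days)


# Hypothetical half-lives standing in for "several weeks".
for half_life_weeks in (2, 3, 4):
    factor = fold_reduction(90, half_life_weeks * 7)
    print(f"half-life of {half_life_weeks} weeks: "
          f"about {factor:.0f}-fold fewer antibodies after 90 days")
```

Under these assumptions, a 4-week half-life gives a drop of a bit under 10-fold, and a 2-week half-life a drop of a bit under 100-fold.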

After a second vaccine shot or a booster shot, the observed antibody titer goes up, which indicates improved protection. This, however, is due to both factors - an increase in the amount of antibodies, and an increase in how well the antibodies bind. Our body is constantly exposed to viruses that want to hijack our cells and multiply - just looking at cold viruses, there are a few hundred different ones out there. All those viruses mutate - some a bit faster than SARS-CoV-2, some slower, but they all change.

To keep up with ever-changing viruses, the immune system has its own "rapid evolution" mechanism. Basically, as the cells that make antibodies against a virus multiply and mature, the antibody changes, due to mutations in "hyper-variable" regions in the antibody genes. So a certain fraction of the antibody-producing cells will have mutations that change the antibody a bit, just like a coronavirus has mutations that change its spike protein. This part is just random, but now comes the "clever" part: if a random mutation improves how well an antibody binds to the "antigen", for example the spike protein, then the cells with this mutation will divide faster than "older" cells with not-quite-as-good antibodies.

This process of "antibody improvement" happens every time after a new infection - and after a new vaccine shot. It can improve the binding of antibodies by factors of more than 10 every time, which is one reason why a vaccine with 2 shots tends to work better than a single-shot vaccine. But it can also make the "necessary changes" if a virus has mutated, causing the old antibodies to be less efficient.

Now let's put this all together, and see what that means with respect to Omicron. Let's assume I have had a booster shot recently: my antibodies have now had two rounds of improvement after the initial shot, so they will bind the spike protein much better than they would have after the first shot, or even after the second shot. Every person makes different antibodies, partly due to different genes, and partly due simply to chance. The Omicron spike protein looks quite different from the spike protein in the original virus, but if I am lucky, some of my antibodies will still bind the modified spike protein well enough. Shortly after the booster, the antibody levels will be high, and my antibodies cover all the spike proteins in all the virus particles that infected me. The virus never stands a chance to multiply to a level where I would have any symptoms.

But more likely, my antibodies recognize the Omicron spike protein, but due to the mutations, don't bind it very well. That means some of the spike proteins remain free of antibodies, and the virus can enter my cells and start multiplying. But some of my antibodies will also do their job, interact with the other parts of my immune system, and start multiplying. As described above, that also means that they'll "check out" some random mutations, and there's a good chance that one of the mutated antibodies binds the Omicron spike protein very well. Maybe it takes a few days for this to happen, and the virus can multiply a bit during that time - enough to give me a sore throat and a mild cough. But after that, my "new and improved" antibodies take over, and the infection is quickly controlled. This is what is often seen in breakthrough infections - the virus initially multiplies and reaches levels similar to what is seen in unvaccinated persons, but then, virus levels drop a lot faster, and symptoms remain lighter.

What I described above partly explains what we see with vaccinations - that they can prevent infections, but are even better at preventing severe COVID-19. In fact, the observed protection against several variants, including Delta, was better than what would have been expected from antibody-binding studies. The explanations above are, of course, greatly simplified. For example, I did not go into the role of the "second arm" of the immune system at all, the T-cell mediated immune response; nor did I mention the effect of repeat immunizations on memory cells (just like learning, repeating usually helps). But the data we have so far are quite clear: vaccines protect; more shots protect better; and more recent is better. So if you qualify for a booster, but have not gotten one yet, go get one!

Time for Coronas

So, the virus causing COVID-19 is here to stay, and will likely keep mutating, perhaps faster than vaccines can keep up. Over time, it is very likely that infections become less dangerous. We are seeing this already in the UK, where case fatality rates have fallen by at least fourfold, and in other countries with high vaccination rates. We will also soon have effective medications that can be taken in pill form, based on similar mechanisms as the protease inhibitors that have worked so well for HIV infections. 

We have seen that protection against COVID-19 seems to fade over time, and that multiple exposures to the vaccine and/or infection increase protection. Given how widespread COVID-19 infections are, and the observed seasonality, we will keep seeing waves of COVID-19 infections for years to come. It is quite likely that the severity will go down over time, if our immune systems are exposed to the virus or a vaccine on a regular enough basis. After three vaccine shots, I'll be happy to expose myself to "natural exposure" again more frequently - time for Coronas, the Mexican kind (or perhaps a local brew instead). So I do hope that this will indeed be my last post on the subject of COVID-19.

I'll finish with a couple of graphs that illustrate the danger of declaring "it's over" too early - something that a lot of states and countries have done. Here's a look at the ICU usage in Florida:

A comparison of excess death rates (in percent relative to expected deaths) for Texas and Massachusetts:

Those are graphs based on all death certificates submitted to the CDC by local authorities, so there is no political meddling possible. The data for the last few weeks are incomplete. The CDC data show 95,147 excess deaths for Texas, 70,138 excess deaths for Florida, and 11,333 excess deaths for Massachusetts (which has about 1/3rd of the population of Florida, and 1/4th of Texas).

Tuesday, January 19, 2021

More Than 560,000 Deaths Due To COVID-19

With the beginning of a new presidency in the US, the number of 400,000 COVID-19 deaths in the US has come up in various news reports. While this is the "official" number of confirmed COVID-19 deaths, a closer look proves that it underestimates the death toll from COVID-19 by a large margin. From the beginning of March 2020 to January 19, 2021, there were more than 560,000 excess deaths in the US - deaths that are in addition to the expected 2.5 million deaths from all causes.

The Numbers: More than half a million additional deaths

Here is a quick outline of where the number of 560,000 excess deaths comes from. The starting point is a spreadsheet from the CDC called "Excess Deaths Associated with COVID-19", which is published every week by the CDC, and which summarizes the death certificates that all US states and territories have to submit to the CDC. The most recent version was updated on 1/13/2021, and includes data up to the week that ended on 1/2/2021.

Since the submission of death certificates often takes several weeks or longer, data for the most recent weeks are incomplete. The CDC tries to correct for missing data to some extent by providing a "weighted" data set with a prediction of the final counts, but the correction is limited to states and regions where a substantial fraction of the expected deaths have been reported, and therefore is always an under-estimate for the most recent 4-6 weeks. Therefore, the starting point of the analysis was the data for the weeks ending between 3/7/2020 and 12/5/2020.

The "average expected count" of deaths from all causes for this period is 2.157 million. The actual number of deaths was 2.567 million. This means there were about 410,000 more deaths than expected. However, we need to correct for a few states that are exceptionally slow in reporting. This includes North Carolina, which had not submitted any death certificates for the weeks after 9/26/2020. Here are the numbers:

  • COVID-19 deaths reported for NC on 12/5/2020: 5,516
  • COVID-19 deaths reported for NC on 9/27/2020: 3,441
  • COVID-19 deaths between 9/27 and 12/5/2020 in NC: 2,075
  • Ratio of excess deaths to reported COVID-19 deaths for NC: 1.4
  • Estimated excess deaths for NC from 9/27 to 12/5/2020: 2,896

There is an additional correction to make for a couple of other states that had very incomplete data for the most recent weeks, which adds another 973 excess deaths. This gives a total estimate of 413,671 excess deaths until 12/5/2020.
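For readers who want to retrace the arithmetic, here is a rough sketch that uses only the rounded figures quoted above. The variable names are mine, and because the inputs are rounded (2.157 and 2.567 million, a ratio of 1.4), the result lands close to the 413,671 figure but does not reproduce it exactly:

```python
# Rounded figures quoted in the post; variable names are illustrative.
expected_deaths = 2_157_000   # "average expected count", 3/7 to 12/5/2020
actual_deaths = 2_567_000     # deaths actually recorded in that period
base_excess = actual_deaths - expected_deaths

# North Carolina: no death certificates submitted after 9/26/2020.
nc_covid_deaths_dec5 = 5_516
nc_covid_deaths_sep27 = 3_441
nc_covid_gap = nc_covid_deaths_dec5 - nc_covid_deaths_sep27
nc_ratio = 1.4                # excess deaths per reported COVID-19 death (rounded)
nc_excess = round(nc_covid_gap * nc_ratio)

other_slow_states = 973       # correction for other states with incomplete data

total_excess = base_excess + nc_excess + other_slow_states
print(f"Base excess deaths:        {base_excess:,}")
print(f"NC COVID-19 deaths in gap: {nc_covid_gap:,}")
print(f"NC estimated excess:       {nc_excess:,}")
print(f"Total through 12/5/2020:   {total_excess:,}")
```

The small gap to the numbers in the text (a few hundred deaths) is what you would expect from working with rounded inputs instead of the spreadsheet values.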

To this number, we have to add the deaths that occurred between 12/6/2020 and 1/19/2021. According to, there were 122,648 confirmed COVID-19 deaths during this period. However, the number of excess deaths in the US has always been substantially higher than the number of confirmed COVID-19 deaths. If we use the empirical factor of 1.38 (from the period between 3/7 and 12/5/2020), we get 169,687 additional excess deaths since 12/5/2020. This gives a total of 583,358 excess deaths in the US between March 2020 and today.

It is possible that the under-reporting of COVID-19 deaths has gone down over time due to various factors, including the increased availability of testing after the first wave of deaths. If we assume that the current correction factor is only 1.2 instead of 1.38, we get 147,178 additional excess deaths since 12/5/2020. This gives a total of 560,849 excess deaths in the US between March 2020 and 1/19/2021.
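The two scenarios can be written down as a short sketch. The function name is mine; the inputs are the numbers from the text. Note that the rounded factor of 1.38 gives a slightly lower result than the 169,687 quoted above, presumably because that figure was calculated with an unrounded factor:

```python
excess_through_dec5 = 413_671    # estimate for 3/7 to 12/5/2020
confirmed_since_dec5 = 122_648   # confirmed COVID-19 deaths, 12/6/2020 to 1/19/2021


def total_excess_deaths(factor: float) -> int:
    """Scale confirmed deaths by an excess/confirmed factor and add the earlier estimate."""
    return excess_through_dec5 + round(confirmed_since_dec5 * factor)


conservative = total_excess_deaths(1.2)   # assumes under-reporting has gone down
empirical = total_excess_deaths(1.38)     # rounded factor from 3/7 to 12/5/2020

print(f"Factor 1.2:  {conservative:,} excess deaths")
print(f"Factor 1.38: {empirical:,} excess deaths")
```

Either way, the total ends up between 550,000 and 600,000.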

There is some uncertainty about the exact numbers of excess deaths due to the incomplete reporting. Over the next week, this uncertainty will be reduced, but it will take several more months before we have a reasonably complete picture. But at this point in time, it seems very likely that the total number of excess deaths linked to COVID-19 so far is between 550,000 and 600,000.

For comparison, the total number of expected deaths for the period between March 2020 and the middle of January 2021 was about 2.5 million. Instead, more than 3 million people died in the US during this time - and the deaths were very closely linked to COVID-19 infections about one month before, regardless of whether states were in lockdown or not, or if the states were controlled by Republicans or Democrats.

One final thing to note is that the actual number of deaths caused directly or indirectly by COVID-19 may be even higher than the numbers I have given above. During the past 10 months, most states were in lockdown for at least several weeks, and some cities, regions, and states have been in extended lockdowns. Both lockdowns and other COVID-19 control measures like social distancing and face masks have the effect of reducing overall deaths. This has been observed in many countries when COVID-19 infection rates were low. We can see the same effect in US states, too. Here are two examples of states that had relatively strict lockdown measures, but low COVID-19 case numbers and deaths, during the period between 3/14/2020 and 5/2/2020:

  • Maine: 59 fewer deaths than the expected 2,392. This is a 2.5% reduction in mortality, even though Maine reported 56 deaths from COVID-19 during this period.
  • Hawaii: 94 fewer deaths than the expected 1,894. This is a 5% reduction in mortality. Hawaii, which has a very similar population size to Maine, reported only 16 COVID-19 deaths during this period.
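A quick check of the percentages for the two states, as a small sketch. The last column rests on an assumption of mine: that the reported COVID-19 deaths are included in the all-cause death counts, so that deaths from all other causes fell by the shortfall plus the COVID-19 deaths:

```python
# (expected deaths, deaths fewer than expected, reported COVID-19 deaths)
# for 3/14/2020 to 5/2/2020, as quoted above.
states = {
    "Maine": (2_392, 59, 56),
    "Hawaii": (1_894, 94, 16),
}


def percent_reduction(expected: int, fewer: int) -> float:
    """Reduction in all-cause mortality, in percent of the expected deaths."""
    return 100 * fewer / expected


for name, (expected, fewer, covid_deaths) in states.items():
    reduction = percent_reduction(expected, fewer)
    # Assumption: COVID-19 deaths are part of the all-cause count.
    non_covid_shortfall = fewer + covid_deaths
    print(f"{name}: {reduction:.1f}% fewer deaths than expected; "
          f"{non_covid_shortfall} fewer deaths from other causes")
```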

If the "expected" death numbers took into account the mortality-reducing effect of lockdowns, the calculated death toll numbers would be even higher.

But regardless of what the exact number is, the bottom line is that more than half a million people have already died because of COVID-19. Since COVID-19 case numbers have remained very high, and deaths are delayed by up to a month after infection, the US will see close to 100,000 additional confirmed COVID-19 deaths over the next 30 days. At that point in time, the death toll will be equal to the entire population of cities like Boston, Washington D.C., or Seattle.







Monday, December 7, 2020

Cool Science and The Importance of Staying Distant

This post looks into several important new insights into COVID-19 that were described in recent publications. I will also look at what this means for us - why it is very important to stick to social distancing and related prevention measures. It's long, so I'll give you a

Short version

  1. If you get exposed to 500 COVID-19 virus particles, you'll probably get infected. That is one "infectious dose".
  2. If you're infected, you probably don't know it. You'll be most likely to infect others before you have any symptoms. 
  3. When talking, an infected person emits about 500 infectious doses per hour, which would theoretically be enough to infect 500 others. Talking loudly, singing, and yelling increase this to as much as 2,500 infectious doses per hour. There are many examples of one person infecting dozens of others.
  4. The virus particles are in aerosol droplets that can remain airborne for up to several hours, and therefore accumulate indoors.
  5. Do the right thing. 

Long version

Let's start with a graph that tells us a lot about COVID-19 infections:

This is from an in-depth study of infections in Austria. The scientists used a combination of contact tracing and DNA sequencing to study how exactly the COVID-19 virus was passed from one patient to the next. To get this information, they looked at new mutations that are always present in a subset of virus particles in each infected person, and compared these to the mutations found in others a patient had infected. This allowed the scientists to estimate how many virus particles (or virions) had been passed from one patient to the next - the "bottleneck size" in the graph above.

The results indicate that in many cases, a couple of thousand virions were passed on during an infection. However, a similar number of patients had been infected by a lower number of virus particles, between 20 and 200. In one of the infections studied, the transmission involved fewer than 10 virions - possibly as few as two.

The science behind these results is, in my opinion, pretty cool. But perhaps I am biased, since it is closely related to the type of research for which I have developed commercial software for the last 20 years. Anyway, the study was a collaboration of more than 30 scientists from several top level labs in Vienna and at Harvard, MIT, and the Dana-Farber Cancer Institute.

The results show what many virologists so far had only suspected: that COVID-19 infections are usually caused by several hundred to a few thousand virus particles that are somehow transferred from an infected person to another person. Infections can also happen at a lower number of virus particles, but that appears to happen only rarely.

The data indicate that the "independent action hypothesis" of viral infections applies to COVID-19. It basically states that each virus particle that enters your body has a small chance of establishing an infection and causing disease (or, for COVID-19, an asymptomatic infection that can nevertheless infect others). The more virus particles enter your body, the higher the chance that you're getting sick. Here is a graph that illustrates this relation:

An important concept here is the "infectious dose". That is the number of virus particles that has a 50% chance of creating a "successful" infection. For COVID-19, that number seems to be somewhere around 500. We don't know the precise number - it could be 200, or it could be 1000, but it is very unlikely that it is below, say, 100 or above, say, 5000. Note that below this dose, the chance of infection is not zero, but rather declines in a near-linear fashion.
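To make the dose-response relation concrete, here is a minimal sketch of the independent action model. It assumes that every virion has the same small, independent chance of starting an infection, and uses the rough estimate of 500 virions for the 50% infectious dose; the function names are mine, and the real number could well be a few-fold higher or lower:

```python
ID50 = 500  # assumed 50% infectious dose in virions (rough estimate)


def per_virion_probability(id50: float = ID50) -> float:
    """Chance that a single virion establishes an infection, chosen so that
    id50 virions together give a 50% chance of infection."""
    return 1 - 0.5 ** (1 / id50)


def infection_probability(virions: float, id50: float = ID50) -> float:
    """Chance that at least one of the virions establishes an infection."""
    p = per_virion_probability(id50)
    return 1 - (1 - p) ** virions


for virions in (10, 50, 100, 500, 1000, 5000):
    chance = infection_probability(virions)
    print(f"{virions:>5} virions: {100 * chance:5.1f}% chance of infection")
```

In this model, the chance of infection at low doses is roughly proportional to the number of virions (about 0.14% per virion for an ID50 of 500), which is the near-linear decline mentioned above. It only levels off as the dose approaches and exceeds the infectious dose.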

A bit simplified, the "infectious dose" (ID) means this: if you get exposed to this many virus particles, the chance that you'll get COVID-19 is about the same as the chance that the COVID-19 virus fails to establish itself in your body. Get more than the ID, and you'll most likely get COVID-19; get less, and your chances of walking away healthy are better. That raises the question:

What are the chances of receiving an "infectious dose" of the COVID-19 virus?

To answer this question, we first need to look at how someone with an active COVID-19 infection passes the virus on to someone else. We know that saliva and lung fluids of an infected person can contain a lot of virus particles - a typical number is 100 million virions per milliliter. That's about 200 thousand infectious doses (or, short, IDs)! Some of this fluid, and the virus particles in it, is emitted as droplets of various sizes when a person coughs, sneezes, speaks, sings, or just breathes - but the number and size of these droplets depend a lot on what exactly a person is doing. Coughing and sneezing can produce the largest droplets - perhaps the word "gross" fits (especially if you speak German). But we now know that most COVID-19 transmissions happen before symptoms appear, when coughing and sneezing cannot play a role. If you're interested in the details, I suggest you check this preprint and the references in it. I'll just summarize the results from a different study that combined knowledge about particle sizes, viral loads, and actual transmissions in five super-spreader events for different activities:

  • Breathing: ~10 IDs / hour (5-10-fold higher during hard exercise)
  • Talking:  ~500 IDs / hour
  • Singing: ~2,500 IDs / hour 

Yelling is similar to singing,  and loud speaking in between talking and singing. Note that these particles are in aerosols that can remain airborne for up to several hours.

Now remember that you want to avoid getting anywhere close to just one ID (that is 500 virus particles). If you are sitting really close to someone, you end up "sharing breath" with that person - a significant percentage of the air he breathes out you breathe in.  If you're close enough to someone in an area with little ventilation, you could "collect" one ID within a couple of hours, even if he is just breathing! Fortunately, "breath sharing" drops very quickly as the distance between two persons increases, and a distance of four feet or more is usually sufficient to reduce your "collection" to a small fraction of one ID per hour. But remember that this only reduces your risk of infection - it does not eliminate it!

But as you can see from the list above, things get a lot worse when talking or singing. When breathing, the droplets you emit are very small; but when talking, yelling, or singing, the droplets are a lot bigger. A 10-fold larger droplet has 1000-fold more volume, so a given number of droplets can contain 1000-fold more virus particles. But most of the "large" droplets generated by talking are still so small that the water in them evaporates very quickly in normal room air, shrinking the particles to sizes that can remain airborne for a long time ... just waiting for you to breathe them in.

With this in mind, let's consider going out to dinner with a couple of friends we have not seen in a while. We'll go to a restaurant and share a nice booth between four people, not knowing that one of them had gotten infected with COVID-19 a few days ago, and is now shedding the virus at a high rate. Someone is talking all the time, but we're nice and take turns, so the infected person is talking just one fourth of the time. After an hour, though, he has emitted about 500 / 4 = 125 "infectious doses". Let's say the booth is about 2 x 2 x 2.5 meters, or about 10 cubic meters. Without any air exchange, the concentration of infectious doses in the booth air would be 125 IDs / 10 cubic meters, or 12.5 IDs / cubic meter. We breathe about half a cubic meter of air per hour - so everyone at the table would have been exposed to about 6 infectious doses of the virus, making it very likely they'd get infected.

The unrealistic assumption in our example is that there is no air exchange. In reality, ventilation in a typical building exchanges the air about 4-10 times per hour. That would drop the virus concentration in our booth by a comparable amount. But it would still leave each of the three uninfected people at the table "collecting" about 1/2 to one infectious dose, meaning it would be almost certain that at least one person would get infected.
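Here is the booth example as a back-of-the-envelope sketch. It uses the numbers from the text and the same shortcut for ventilation (dividing the concentration by the number of air changes per hour). The conversion from dose to chance of infection assumes the simple dose-response curve where one infectious dose means a 50% chance; the function names and that conversion are my additions, not part of the studies mentioned:

```python
EMISSION_TALKING_IDS_PER_HOUR = 500  # infectious doses emitted while talking
TALKING_SHARE = 1 / 4                # the infected person talks a quarter of the time
BOOTH_VOLUME_M3 = 2 * 2 * 2.5        # about 10 cubic meters
AIR_BREATHED_M3_PER_HOUR = 0.5       # air each person breathes in per hour


def inhaled_dose(air_changes_per_hour=None):
    """Infectious doses one person breathes in during one hour in the booth."""
    emitted = EMISSION_TALKING_IDS_PER_HOUR * TALKING_SHARE
    concentration = emitted / BOOTH_VOLUME_M3
    if air_changes_per_hour is not None:
        # Shortcut from the text: ventilation lowers the concentration
        # by about the number of air changes per hour.
        concentration /= air_changes_per_hour
    return concentration * AIR_BREATHED_M3_PER_HOUR


def infection_probability(dose_in_ids):
    """Assumed dose-response curve: one infectious dose gives a 50% chance."""
    return 1 - 0.5 ** dose_in_ids


def at_least_one_infected(dose_in_ids, people=3):
    """Chance that at least one of the uninfected people gets infected."""
    return 1 - (1 - infection_probability(dose_in_ids)) ** people


for label, air_changes in (("no ventilation", None),
                           ("4 air changes per hour", 4),
                           ("10 air changes per hour", 10)):
    dose = inhaled_dose(air_changes)
    print(f"{label}: {dose:.2f} IDs per person, "
          f"{100 * infection_probability(dose):.0f}% chance per person, "
          f"{100 * at_least_one_infected(dose):.0f}% chance that at least one of three is infected")
```

With ventilation, this gives roughly 0.6 to 1.6 infectious doses per person, the same ballpark as the estimate above.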

So far, we have looked at two parts of the "infection equation": how many virus particles are needed to infect someone with COVID-19, and how many virus particles (or infectious doses) an infected person "emits". But we also need to look at a third part: when an infected person emits virus particles, relative to when they were infected and to when they experience symptoms.

Wrong person, place, and time

While there are still plenty of uncertainties and gaps in our knowledge of COVID-19, scientists from all over the world have collected and published a lot of data that can guide our efforts. In some important aspects, COVID-19 is very different from other infections like influenza. Two of these aspects are the highly variable incubation time and the high variability of symptoms, which includes the frequent complete absence of symptoms. Other diseases progress in a well-defined fashion, for example a symptom-free incubation period of 3-4 days, followed by a symptomatic period, with the highest infectivity a few days after first symptoms.

But for COVID-19, the incubation period can vary between 2 and 14 days, and multiple studies have shown that many transmissions occur before a patient has any symptoms, and from persons who never develop any symptoms, or only light, non-specific symptoms. In many super-spreader events, where one infected person infected dozens of others, the "super-spreader" had no or only mild symptoms. Based on these and other data, scientists now think that an infected person is most likely to infect others in just a relatively short period; in symptomatic patients, this is just before symptoms appear.

If an infected person has close contact with many others during this short period of "maximum infectivity", he can infect many of his contacts in a short time - often within an hour or two. Being at the wrong place at the wrong time, and sharing space with the wrong person, makes it extremely likely to get infected. With many contacts, many get infected, as the graph below (from this publication) shows:

What does that mean? Let's go back to our "restaurant with friends" example from above. Since many COVID-19 transmissions happen before symptoms appear, it means very little that our friends do not have any COVID-19 symptoms. The high variability in incubation times means that they could have been infected just a couple of days ago, or two weeks ago, and just now reach their maximum infectivity period, during which we will get infected if we are close to them for an hour. But given the high shedding of infectious virus particles when talking, it might not even be our friends who infect us - it could be someone sitting a couple of tables away, or someone who had our booth before us. Sure, the risk of infection is highest when you are close to an infected person, so you should keep at least a 6 ft distance from others whenever possible. But aerosol particles can quickly travel much farther than 6 ft, and any air flow from ventilation can help distribute them. With incubation times as short as 2 days, even someone who was tested for COVID-19 just two days ago may be infectious now.

Break the infection chain. It's not just about you - it's about the ones you infect, and the ones they infect, and so on.

Keep your distance. 

Wear your face mask. They work. We know why they work. The better it fits, the better it works - especially at protecting you. Perhaps a 20 or 50% lower chance of getting infected does not mean much to you - but if everyone wears masks (and follows other guidelines), it will drop transmissions by 50%, and the epidemic will "just go away". It's not political. It's public health, consideration for others - and the best way to get the economy back on track. Just ask Australia.

Thursday, November 26, 2020

Why COVID-19 Is So Hard To Fight

In this post, I will use one graph to explain why COVID-19 is so hard to understand, and therefore to fight. It shows the daily confirmed COVID-19 cases in the US, and the 5-day change in daily cases, since the end of April:

The blue curve at the bottom shows daily confirmed COVID-19 cases in the US. It shows an initial drop to about 20,000 cases, then the "summer rise" to about 70,000 cases per day in July, another drop to about 40,000 in September, and then a rise to 170,000 cases per day in November.

The red curve, which uses the y-axis on the right, shows an indication of the growth (or drop) in cases: the ratio of cases on a given day, divided by the number of cases 5 days earlier. When this ratio is below 1 (in the green section), case numbers are going down; when it is above 1 (in the light red section), the number of cases is increasing. 
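For readers who want to reproduce the red curve from their own case data, here is a minimal sketch of the 5-day ratio in Python. The function name and the example numbers are made up for illustration; real daily counts should be smoothed first (for example with a 7-day average), because reporting drops on weekends.

```python
def five_day_ratio(daily_cases, lag=5):
    """Return cases[t] / cases[t - lag] for every day that has enough history."""
    ratios = []
    for t in range(lag, len(daily_cases)):
        earlier = daily_cases[t - lag]
        # No ratio can be computed when the earlier count is zero.
        ratios.append(daily_cases[t] / earlier if earlier else None)
    return ratios


# Made-up, already smoothed case counts that grow by 20% every 5 days.
example = [1000 * 1.2 ** (day / 5) for day in range(15)]
print([round(r, 2) for r in five_day_ratio(example)])
```

For this artificial series, every ratio comes out at 1.2, which would put the curve in the light red "growing" section.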

The numbers are based on 5-day periods because that is roughly the average time between getting infected, and passing the infection on to someone else. In scientific jargon, this is often called the "generation interval" or the "serial interval". One way to understand it is to remember that it typically takes about 5 days after infection for symptoms to start, and that the chance of infecting others is largest just before and just after first symptoms.

This means that the 5-day ratio is also very close to the "reproductive number", often called R. In practical terms, R indicates how many others, on average, each infected person infects. If each person infects more than one other person, the number of new infections per day grows; if each person infects less than one other person, the number of daily infections goes down. This can be easily seen in the graph, where the red curve is in the red area whenever the number of daily cases (the blue curve) goes up.

Now take a close look at the values we get for R. In May and August when case numbers were dropping, R was between 0.85 and 1. In the summer and fall periods where case numbers were increasing, R was above 1, but never higher than 1.3. Typical values were around 0.9 in "dropping" periods, and around 1.2 in "rising" periods. The difference is quite small - and therein lies the problem! To understand why, we need to look at this from two angles: the "personal risk" perspective, and the "public health" perspective.

Personal risk: A small increase means very little

When deciding what to do in a public health crisis, the first question most people will ask is "What is the risk to me?" Depending on the answer, and on personal tolerance for risks, they may be inclined to change their behavior more or less. But regardless of what exactly the answer is, everyone will have to accept a certain level of personal risk in the end.

After a few weeks or months of "being good" and, for example, staying away from restaurants and bars, the desire to go back to normal becomes stronger and stronger, and we start doing things again that are slightly more risky. That might be going to restaurants again; meeting with friends; going shopping; not wearing that face mask; or something else. But we'll generally decide that a bit more risk has to be taken. If we are young or healthy, we may well conclude a slight increase in risk still means a very low risk of getting seriously sick. Unless you're a statistician, you probably won't quantify the risk, but just about anybody would agree that a relative risk increase from 0.9 to 1.2 is so small that it's worth taking, if it means we can go back to the gym, the hair dresser, shopping, restaurants, or whatever strikes our fancy. If my personal risk was small to begin with, then even a 2-fold or higher increase in risk may well be worth it.

On a personal level, taking a bit more risk is a perfectly reasonable decision. This also is true if we consider others in our risk assessment, too - kids we send to school, other family members, or friends we meet.

Public health: Small risk increases have disastrous consequences

But what happens if everyone decides that taking a bit more risk is perfectly reasonable, and changes their behavior a bit? Say, for example, in a way that increases the risk of getting COVID-19 by just one third. What happens?

Let's assume we were in a period where new infections were dropping by 10% every 5 days, corresponding to R = 0.9. With 1/3 more infections now, R increases to 1.2: instead of a steady drop, we now have a rapid rise in new infections: 20% more daily infections after 5 days, and 44% more daily infections after 10 days (1.2 x 1.2). After a month of R staying at 1.2, the number of daily infections has grown 3-fold: just about what we saw in the US from October to November. A very small change on the individual level has caused a huge increase on the population level.
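The compounding is easy to check yourself. The short Python sketch below assumes, as in the text, that one "generation" of infections takes 5 days and that R stays constant; the function name is just for illustration.

```python
def growth_factor(r, days, generation_days=5):
    """Multiplicative change in daily infections after `days` at a constant R."""
    return r ** (days / generation_days)


for r in (0.9, 1.2):
    for days in (5, 10, 30):
        factor = growth_factor(r, days)
        print(f"R = {r}: after {days} days, daily infections x {factor:.2f}")
```

With R = 0.9, daily infections fall to about 0.53 times the starting value after 30 days; with R = 1.2, they rise to about 2.99 times the starting value. The same month either roughly halves or triples the daily numbers.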

What is a perfectly reasonable decision on a personal level becomes a public health disaster.

Small things are "driving the pandemic"

Currently, the US is just one of many countries that are failing to control the resurgence of COVID-19 infections. A common theme here is that many regions try to contain COVID-19 with a minimal set of measures, for example limited restaurant hours instead of full closures. Against many measures, an often-heard argument is that "X is not driving the pandemic". Various regions have used this argument to leave schools and colleges open, have restaurants operating with minimal or no restrictions, and so on.

Taken literally, the arguments are correct insofar as each individual "infection place" like schools or restaurants is not causing the majority of new infections. But even measures that eliminate just a small percentage of new infections can make a huge difference, and a few in combination can make the difference between a controlled epidemic with dropping infection numbers, and a rapidly growing, out-of-control epidemic. Therefore relaxing a few such "minor impact" measures may well end up "driving" the epidemic from a "dropping" phase into a "rapid growth" phase. This problem is only made worse by halfhearted interventions, which drop R only just below 1.0. This means that case numbers will drop only very slowly, and rapid growth resumes quickly again after any relaxation.

Over the past six months, I have read several hundred scientific publications about COVID-19. Of all these, one of the publications that stuck in my mind the most was published by scientists from New Zealand. Apparently, it formed the basis of New Zealand's successful complete elimination of COVID-19 cases in the country. It listed a large number of interventions which were used in groups, depending on the current level of infections:

We can only hope that a similar rational approach will be used to control COVID-19 in the US and Europe over the next several months, until vaccines become widely available. Otherwise, we will see hundreds of thousands of additional avoidable COVID-19 deaths.

Monday, November 23, 2020

A Close Look At The Danish Face Mask Study

In the last week, a Danish research study that tried to study the effectiveness of face masks has gotten a lot of attention in the media. The study had been designed so that it should have shown a statistically significant effect if wearing a face mask reduces the risk of COVID-19 infection by 50% or more.

Note that there are a lot of words in italics in the previous sentence. All of these are very important to understand what the outcome of the study really means - and what it does not mean. There have been many articles and posts that explain some of the shortcomings of the study, but many of these miss some very important points. Let's have a closer look, starting with the results.

Study results: Face masks reduce PCR-confirmed infections by 100%, and doctor-confirmed infections by 50%

When looking at a scientific study, the first thing to do is to look at the data. The important results are given in Table 2 of the study:

Let us start with the last two lines of the table (we'll spend plenty of time on the first lines later!). The second-to-last line shows how many study participants had a positive PCR test for the COVID-19 virus. This is the "gold standard" for diagnosis. A positive PCR test is required to be counted as a "confirmed case" in the US and most other countries. The study showed that zero people in the "Face Mask Group" had a positive PCR test for COVID-19. In the control group that did not use face masks, there were 5 confirmed COVID-19 cases.

So, based on the "gold standard" test, the use of face masks prevented 100% of COVID-19 infections in the study!

If we go on to the last line, which shows the number of participants that have been diagnosed with COVID-19 by a health care provider, the picture changes a bit: 5 participants who wore face masks were diagnosed with COVID-19, compared to 10 participants in the "no mask" control group. This means:

Judging by the actual diagnosis from health care providers, face mask use reduced COVID-19 by about 50%.

But that's not what the study claimed, some astute readers may point out. And this brings us to the first lines in the table which we have ignored so far, which describe the results of antibody tests and the "primary composite end point". Which warrants some explanation.

Reading through the paper and the 88-page long supplementary material carefully, we learn that the study heavily relied on "dip stick"-type antibody tests that the participants did at home. The tests work pretty much like pregnancy tests, except that instead of peeing on the stick, you have to put a couple of drops of blood on the stick; and instead of a "+" sign, a positive test gives two lines, as opposed to a single line for a negative test.

The study also sent all participants two swab kits for PCR testing, and instructed them to use the kit and send the sample to a lab for PCR testing if they should develop any COVID-19 symptoms. In addition, participants with symptoms were instructed to seek medical help.

The "primary composite endpoint" now takes the combination of PCR test results, antibody test results, and confirmed medical diagnoses. Any participant who is positive in any of these three results counts towards the "composite endpoint". Participants with a positive antibody result at the beginning of the study were excluded from the analysis.

Looking back at the results table, we see that the antibody results dominate the overall results. In the face mask group, the number of positive antibody results is more than 6-fold higher than the number of confirmed diagnoses. This raises an immediate red flag. One potential reason for this discrepancy is that some participants had asymptomatic infections. However, asymptomatic infections typically account for about 50% of all COVID-19 infections, and all symptomatic patients should have received a confirmation by PCR or from health care providers. Therefore, the number of positive antibody tests should only have been about 2-fold higher. This is a clear indication that the antibody results are possibly very wrong.

Scientists familiar with COVID-19 antibody tests will immediately think about false-positive test results. According to the study, the manufacturer indicated that 0.8% of tests will give a false-positive result; for about 2,500 participants in each group, that would be about 20 false positives. But let's ignore false positives for the time being, and look at a different issue: timing.

Timing is crucial. Timing is crucial.

Yes, timing is crucial in more ways than one for this study. Let me explain.

In the study, participants did a COVID-19 antibody test at the beginning of the study, and then again at the end of the study about 30 days later. Anyone who tested negative at the start, and positive at the end, must have gotten infected during the study period, right? Wrong! Very wrong!

As plausible as the "trivial" conclusion seems, it completely ignores what we know about how long it takes to develop antibodies. A brief visit to the CDC web page about antibody testing for COVID-19 shows that "Antibodies most commonly become detectable 1–3 weeks after symptom onset".  Let me illustrate this with a figure from a blog post on this topic:

In this study, it took about 10 days after the first COVID-19 symptoms before half of the antibody tests gave positive results, and about 2 weeks before close to 100% of the patients had antibodies. Another study showed slightly shorter times, but also showed that it took more than 4-5 days after symptom onset before antibodies were detectable. We also know that it usually takes another 5 days after infection before the first symptoms appear, and in some cases up to 2 weeks. This means that it will take more than 10 days after infection before antibody tests are positive.

In the context of the Danish study, this means that any participants who got infected within about 10 days before the study start gave a negative result in the first test, but most of them had a positive result in the second test. 

Things get worse when we look at a second timing effect: the change in COVID-19 infections in Denmark before and during the study. This is where it gets a bit more complicated. The reported number of confirmed cases peaked on April 9, just before the study started around April 15, and then decreased quickly. But testing increased rapidly after April 19, more than tripling by April 30. One way to eliminate the effect of testing availability and changes is to calculate the actual number of infections from reported COVID-19 deaths. This is shown in the next graph:

The study was done in 2 separate groups starting 2 weeks apart, shown by the blue and red shaded areas. The graph shows a very rapid drop in daily infections from about 2,000 per day to about 500 during the first week of the study, and further drops later. This means that at the start of the first study period, there was a relatively large number of Danes that had been infected in the preceding 10 days. They had not yet developed antibodies to COVID-19 when tested at the start of the study, but would test positive at the second test a month later. This means that the test would wrongly count many infections that occurred before the study began!

We can estimate the actual number of infections that happened in the 10 days before each of the 2 study periods, and compare it to the number of infections during the study period:

The numbers show that there were almost as many infections (22,886) in the 10 days before the study periods as there were during the 30-day study periods (24,943). Since the pre-existing infections could not be affected by face mask wearing, this created a major distortion, increasing the reported infections in the face mask group significantly.

The numbers shown above are calculated for the entire Danish population of about 5.8 million. The groups in the study were about 2,500 participants each, which we can use to calculate the expected number of cases for the face mask group and the control ("no mask") group in the study:

  • For the "no mask" group, the expected number of cases is 20, consisting of about 10 cases infected in the 10 days before the study period, and 10 cases infected during the study period.
  • For the face mask group, we also expect 10 cases from the 10 days before the study, but the number of cases during the study would be reduced by 50% to 5 cases, so we would expect a total of 15 cases in the face mask group.
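The two estimates above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the rounded population and group sizes from the text and assuming, purely for illustration, that masks protect wearers by 50%; the variable names are my own.

```python
POPULATION = 5_800_000      # approximate Danish population
GROUP_SIZE = 2_500          # approximate number of participants per study group
INFECTIONS_BEFORE = 22_886  # estimated infections in the 10 days before the study periods
INFECTIONS_DURING = 24_943  # estimated infections during the 30-day study periods
MASK_PROTECTION = 0.5       # assumed protection, for illustration only


def expected_in_group(national_infections):
    """Scale a national infection estimate down to one study group."""
    return national_infections * GROUP_SIZE / POPULATION


before = expected_in_group(INFECTIONS_BEFORE)
during = expected_in_group(INFECTIONS_DURING)

# Infections from before the study show up in both groups;
# only infections during the study can be reduced by masks.
control = before + during
masked = before + during * (1 - MASK_PROTECTION)
apparent_reduction = 1 - masked / control

print(f"control group: {control:.1f} expected cases")
print(f"face mask group: {masked:.1f} expected cases")
print(f"apparent reduction: {apparent_reduction:.0%}")
```

Under these assumptions, a true 50% protection shows up as an apparent reduction of only about a quarter, because roughly half of the antibody-positive cases were infected before anyone put on a mask.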

Given the design of the study, we would expect to see just 5 fewer cases in the face mask group than in the control group even if face masks reduce the infection of wearers by 50%. The observed number of cases would be 15 in the face mask group, and 20 in the control group. The observed reduction would be smaller than the expected 50% reduction due to 2 effects:

  • the large drop of infections at the start of the study period
  • the fact that only antibody-tests, but not PCR tests, were done at the start of the study period (PCR tests give positive results much earlier than antibody tests, often within 2-4 days after infection)

Comparing actual and expected results

The calculations above show that we would expect 15-20 positives in the two groups, with just 5 cases difference between the groups. Note that we had to make some assumptions, for example about the fatality rates, and that the estimates may be off by a factor of 2 - but not much more.

In the study, the authors reported 10 cases of COVID-19 in the control group that were confirmed by health care providers, and 5 cases in the face mask group. Diagnoses are generally only made for symptomatic cases, which are typically estimated to be about 25-50% of total infections; thus, there is very good agreement between the expected and reported numbers.

The number of positive antibody tests reported is slightly higher, between 31 and 37 for IgM and IgG. The difference between the face mask group and the control group is less pronounced than for the diagnosed cases, which is exactly what we would expect as the effect of including participants that had been infected before the start of the study, but who had not yet progressed to a detectable antibody response. The antibody numbers are roughly 2-fold higher than the expected numbers. Two issues that may have contributed to this difference are false-positive antibody results, and a higher infection rate in the study participants, who spent on average 4.5 hours outside their home each day, relative to the general population.

But while we are seeing good agreement between the expected and the reported numbers, the agreement we see is just qualitative. Due to the relatively small number of cases, and the "contamination" from infection before the start of the study period, it is unlikely that the results reach the typically required levels of statistical significance. For that, the study would have had to be substantially larger, and ideally also have included a PCR test at the start of the study period.

What went wrong

When designing the study, the authors determined the number of study participants they needed based on an estimated infection rate of 2%, which was reasonable at that time. The authors also were looking for a relatively large reduction effect of 50%; to see a smaller effect, a larger study would have been necessary.

However, by the time the study started, the interventions initiated by the Danish government had taken effect, and reduced the number of daily infections by more than 2-fold, with an additional 10-fold reduction by the time the study ended. The overall calculated infection rate for the combined study periods was less than 0.5% - roughly four-fold lower than what had been expected. This resulted in lower case numbers - numbers that are too low to give a statistically meaningful result. To some extent, the study became a "victim" of the success that Denmark had in containing the COVID-19 pandemic in the spring.

A second factor that contributed to the lack of a "clear signal" from the study was that the authors apparently did not consider the lag time between infection and the beginning of a detectable antibody response. In their defense, the data that describe the timing of the antibody response were probably not available when the study was designed. Furthermore, the effect would have been significantly lower if the infection numbers had still been increasing, or at least stable. Nevertheless, the lack of any discussion of the "antibody delay" effect on the study results in the publication is somewhat disappointing. With proper consideration of this effect, the data produced by the study are not only compatible with a 50% protection from wearing face masks - they actually are in agreement, even if they may not be "statistically significant" due to the factors discussed herein.

Friday, October 30, 2020

New Evidence That Face Masks Work

 In the last couple of weeks, several new scientific studies were published that provide solid evidence that face masks can reduce COVID-19 transmissions significantly. In this post, I'll briefly describe several studies, as well as other evidence from trends in the US.

Face mask mandates reduced COVID-19 growth in Missouri counties

One study that was published as a preprint compared the COVID-19 growth in 5 metropolitan regions in Missouri. Two of the regions, St. Louis City and St. Louis County, implemented mask mandates in July, while the three other regions did not. Comparing the COVID-19 growth rates between the "mask mandate" regions and the "maskless" regions, the authors found significantly slower growth of COVID-19 cases in the mask mandate regions: 1.36% per day, almost 2-fold lower than the 2.42% observed in the maskless regions. This difference was significantly larger than the difference in growth seen in the weeks before the mask mandates were issued.

To see if this trend persisted after the time analyzed in the study, I downloaded per-county level data from Johns Hopkins University, and looked at the growth in COVID-19 cases from July 1 to October 27:

 The counties with a mask mandate (in blue) had about 3 to 4-fold more total COVID-19 cases at the end of the period than at the beginning; the counties without mask mandates showed about 10 to 12-fold growths. This indicates that the mask mandates cut COVID-19 infections to about one third.

Face masks reduce in-flight transmission of COVID-19 dramatically

A second recent study examined how well face masks work in flights. It reviewed a number of previous studies, including one where a single passenger in business class had infected 12 other passengers in the business class cabin:

The originally infected passenger ("patient 0") was on seat 5K, shown in red. The passengers she infected during the flight (shown in orange) were mostly seated behind her, and/or to the side. Several of the infected passengers were more than 6 feet away from patient 0. The flight happened early in the epidemic, and the infected passengers did not wear face masks. The review also lists several other flights with in-flight transmission before the use of face masks on flights became common.

In stark contrast to this "superspreader event" flight is a series of flights with Emirates that arrived in Hong Kong in June and July. Overall, 8 flights transported 58 passengers who were COVID-19 infected during the flight. The flights were 8 hours long, and had a total of 1500 to 2000 passengers, who were quarantined and repeatedly tested for COVID-19 in Hong Kong. The testing showed that not a single transmission happened on those flights. Emirates had a strict face mask policy in place at the time of the flights, which was enforced during the flights by flight attendants.

The review also describes a number of other flights, both with and without mandatory face masks, which show that the use of face masks during flights dramatically reduces the number of COVID-19 transmissions on flights.

Evidence from face masks mandates in Germany

While face mask mandates were common in Germany from the early phase of the COVID-19 epidemic on, the dates when mask mandates were issued varied by location. Two separate studies examined the resulting differences in COVID-19 transmissions, with focus on Jena, a city that implemented mask mandates early. One study presents qualitative evidence that face mask mandates reduced transmissions in Jena. The second study used "synthetic control methods" and data from 401 German regions to derive a quantitative estimate, concluding that face mask mandates reduced the daily growth in COVID-19 infections by about 40%.

More evidence from US regions and states

A number of studies have focused on US states or regions, and come to similar conclusions as the studies above.  Here is a figure from one of these studies that shows a drop in COVID-19 infections after the introduction of mask mandates:

The effect shown is not as pronounced as in some of the other studies, which may be due to varying levels of adherence to the mandates, especially since enforcement of mask mandates in many US regions is lax or non-existent. Sadly, the use of face masks has become politicized in the US. Many Republican governors have refused to issue face mask mandates; Republican law enforcement has been reluctant to actually enforce mask laws; and many Republicans refuse to wear face masks.

The effect of the "Republican refusal" can be seen when comparing which US states have the most COVID-19 cases over time. Here are a couple of screen shots from an illustrative animation:

In early June, the most affected states were a roughly even mix of blue and red states. But by late October, the picture had changed dramatically:

Now, Republican states dominate the distribution. Of course, other factors like early re-openings and resistance against new containment measures are also likely to play a role, but negative attitudes towards face mask wearing and face mask mandates are certainly a big factor in this development.

The evidence is very clear - face masks reduce COVID-19 transmissions and save lives. Scientists have a pretty good idea why and how face masks work. I have discussed why misguided "herd immunity" strategies won't work, using Texas as an example. For the US as a whole, just "letting it take its natural course" would cost more than a million additional lives; when taking into account that death rates increase significantly when hospitals are overloaded, the number of additional deaths in the US would more likely exceed 2 million. Anyone who still believes that COVID-19 deaths are overstated in the US needs to have a close look at excess death calculations, which show that the official COVID-19 numbers represent only 2 out of 3 COVID-19 linked deaths.

So, please, if you go to an indoor space where other people are, or if you are outdoors in a crowd, or closer than 6 feet to someone else who does not live with you: wear a mask!

Tuesday, October 27, 2020

Misleading COVID-19 Information in Florida

This page describes a systematic pattern of misrepresenting information about COVID-19 by officials in Florida. These officials include the governor Ron DeSantis, the governor's spokesman Fred Piccolo Jr., Florida's Surgeon General, Dr. Scott A. Rivkees, and Republicans in the Florida House.

A Red Flag: Is COVID-19 Becoming More Deadly in Florida?

 What started my investigation was a strange observation: based on reported COVID-19 confirmed case numbers and deaths, it appeared that COVID-19 is becoming more deadly in Florida than it has been during the summer peak. One way of looking at this is by looking at the relation between reported case numbers and reported death rates; since deaths are typically delayed by several weeks relative to test results, I am comparing death rates to case rates two weeks earlier, using 2-week averages for both deaths and cases:

While the time-adjusted case fatality ratio (CFR) for the US remained almost constant between July and October, it increased from about 1.3% to about 4% for Florida. This peculiar increase prompted me to look for possible explanations.
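For anyone who wants to repeat this comparison for another state, here is a minimal Python sketch of a time-adjusted CFR as described above: 2-week averages of deaths, divided by 2-week averages of cases from two weeks earlier. The function names are hypothetical, and the input is assumed to be two equally long lists of daily counts.

```python
def rolling_mean(values, window=14):
    """Trailing average; the first result covers days 0 .. window - 1."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def lagged_cfr(daily_cases, daily_deaths, lag=14, window=14):
    """Average deaths divided by average cases from `lag` days earlier."""
    cases_avg = rolling_mean(daily_cases, window)
    deaths_avg = rolling_mean(daily_deaths, window)
    # deaths_avg[i + lag] ends `lag` days after cases_avg[i] does.
    return [d / c if c else None for c, d in zip(cases_avg, deaths_avg[lag:])]


# Made-up steady state: 1,000 cases and 20 deaths per day for six weeks.
cfr = lagged_cfr([1000] * 42, [20] * 42)
print(f"{cfr[-1]:.1%}")
```

In this artificial steady state the ratio is 2.0% on every day. A rising value in real data means that deaths are growing faster than the cases reported two weeks before them.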

The Florida COVID-19 Dashboard: How to Understate COVID-19 Deaths

One of the first stops was Florida's official COVID-19 dashboard. The graphs on the right side that depict cases and deaths are interesting:

The top graph shows the new cases, which show an increase over the last month. The bottom graph shows COVID-19 deaths, and the immediate impression is that things must be getting a lot better - the graph shows a clear downward trend in deaths! Wonderful - but in direct contradiction to the increasing fatality rates we had seen in the previous figure. What gives?

The first hint comes from the title "Resident Deaths by Date of Death". That seems reasonable enough - until you read the fine print: "The Deaths by Day chart shows the total number of Florida residents with confirmed COVID-19 that died on each calendar day (12:00 AM - 11:59 PM). Death data often has significant delays in reporting, so data within the past two weeks will be updated frequently."

The key here is that "death data often have significant delays in reporting". That means that the numbers for the last several weeks understate the actual deaths substantially; the numbers for the last few days show only a small fraction of the deaths that actually occurred. But rather than stating this clearly, the fine print states that data "will be updated frequently". Perhaps understating the actual death toll may be a bad thing, but updating frequently must be a good thing, right?

But the Florida government had a reason to choose the "death by day" reporting: it will always show a favorable, downward trend in deaths, since there will always be fewer reported deaths for the last few days. Anyone who looks at the graph without reading and understanding the fine print will always conclude that the COVID-19 situation in Florida is improving. Always. And who reads the fine print?

For an example, we can use the screen shots of the Florida COVID-19 dashboard that the COVID Tracking Project has captured. Here is what the death graph looked like on 8/2/2020:

Death by day 8/2/20

For comparison, here is what the graph looks like when plotting the number of new deaths reported:

That's a very different picture for the last two weeks of July! If we look at the screenshot of the Florida dashboard from 8/15, it gives a very different picture for these weeks:

Florida dashboard as of 8/15

Note that the deaths around 7/20 now hover around 160 per day, instead of the 120 per day as reported on 8/2. For the beginning of August, we now see around 140 deaths per day; two weeks later, this increases to 180 per day.

The bottom line is that the "By day of death" graph on Florida's COVID-19 dashboard will never show an accurate picture of the actual trends in recent weeks. It will always understate deaths for the last 2 weeks substantially, and show a decline of deaths in the most recent days. Given the observed reporting delays, the only apparent purpose of the death graph on Florida's COVID-19 dashboard is to mislead.
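The effect can be reproduced with a toy simulation. The Python sketch below uses entirely made-up numbers: deaths that rise steadily every day, and a hypothetical reporting delay in which each death is equally likely to be reported on any of the 20 days after it happened. It is not a model of Florida's actual reporting.

```python
def deaths_known_by(true_deaths_per_day, delay_weights, as_of_day):
    """Deaths by date of death, counting only reports received by `as_of_day`.

    delay_weights[k] is the assumed fraction of deaths reported k days late.
    """
    visible = []
    for day, deaths in enumerate(true_deaths_per_day[: as_of_day + 1]):
        elapsed = as_of_day - day
        reported_fraction = sum(delay_weights[: elapsed + 1])
        visible.append(deaths * reported_fraction)
    return visible


# Hypothetical delay: reports spread evenly over 20 days.
weights = [1 / 20] * 20
# Made-up epidemic that keeps getting worse: 100 deaths per day, rising by 2 each day.
true_deaths = [100 + 2 * day for day in range(60)]

chart = deaths_known_by(true_deaths, weights, as_of_day=59)
for day in (40, 50, 59):
    print(f"day {day}: actual {true_deaths[day]}, shown on chart {chart[day]:.0f}")
```

Even though actual deaths rise on every single day in this example, the "by date of death" chart falls steadily over its last 20 days, simply because most of those deaths have not been reported yet.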

Even worse, the graph creates an incentive to delay the reporting of COVID-19 deaths. Early in the COVID-19 epidemic, Florida's board of medical examiners published data about COVID-19 deaths directly. However, when the government noticed that the numbers reported by the medical examiners were higher than the numbers reported by the state, the health department stopped the release of the medical examiners' list. Afterwards, only numbers released by the Florida Department of Health were available, whenever the department chose to include a death. When deaths are added with a 2-week delay, as was typical in the summer, it helps to create the impression that the worst problems are in the past. If a death was added more than 30 days after it happened, it would never show in the death graph on Florida's COVID-19 dashboard.

This created a strong incentive to delay death reports in Florida for anyone who wanted to downplay the severity of the COVID-19 epidemic. As a result, the reporting delays increased substantially since the summer:

But while the delayed reporting was welcome when it reduced the number of reported COVID-19 deaths in the summer, it is now creating a problem: eventually, the deaths have to be reported!

Killing Two Birds With One Stone: "Investigate All COVID-19 Deaths!"

On October 21, Florida's Surgeon General, who had remained surprisingly quiet during the COVID-19 epidemic up to this point, issued a press release stating that all COVID-19 fatalities reported to the state will be subject to a "thorough review". In addition to criticizing that some reports were more than 30 days late, he focused on 5 cases where more than three months had elapsed between the COVID-19 diagnosis by PCR test and the eventual deaths.
The issue was quickly picked up by governor DeSantis' spokesman Fred Piccolo Jr., who stated:
"What is different about the deaths, is that the health department was finding people who were admitted as positive as far back as March or April and who passed away in August or September or October. Is that a COVID death?"
Looking at the data in the Surgeon General's press release shows that Piccolo is stretching the truth beyond the breaking point. Questioning if someone who was diagnosed in March and died in October really died of COVID-19 seems reasonable, right? But the earliest test date listed by the Surgeon General was from June, not March or April - three months later. The longest elapsed time between test report and death was 111 days. While this is still a long time, it is shorter than times that have been reported for people who recovered from COVID-19, as a quick Google search shows:
  • A patient in North Carolina was released after 137 days in the hospital. Her complications, which were directly caused by COVID-19, included a heart attack and kidney and lung failure.
  • Two men in Georgia were in the hospital for COVID-19 for more than 4 months. One of the two was released, the other is still in the hospital.
  • A 35-year-old woman in the UK was treated for 141 days in the hospital, which included 105 days on the ventilator.
The last case is interesting because the treatment happened in a hospital run by UK's National Health Service - a public health system that Republicans typically describe as "socialist".

Those are just some random samples from a quick internet search, and all of the listed patients survived. Scientific studies show that survivors typically spend less time in hospitals than patients who die; other studies report very long hospital stays, for example three patients with more than 50 days in a hospital in one early study from China (as well as two more patients who were still in the hospital after 37 days). Other studies show that hospital stays in the US tend to be longer than in China, and that a significant fraction of patients stay in hospital care for more than 40 days. Some patients get admitted to the hospital for COVID-19 multiple times. In one case in Belgium, DNA sequencing proved that a patient had been infected on two separate occasions from different people; this patient died from the second infection.
These examples show that there is plenty of both anecdotal and scientific evidence of patients who require hospital treatment for several months, and that a small number of cases with a long time between diagnosis and death is therefore not suspicious. It is very likely that the investigations will come to the same conclusion, although it is extremely unlikely that the Florida government would announce such a conclusion.

The Pattern: Create Doubt About COVID-19 Deaths

The Surgeon General's press release discussed above is just one of many examples where Republican politicians in Florida try to create doubt about the true number of COVID-19 deaths. A recent example is a "Florida House report" commissioned by Republican House Speaker Jose Oliva. The report says that "60% of death certificates issued for state residents whose deaths were attributed to COVID-19 had reporting errors and most were filed by medical examiners". It speculates that this "may be inflating the COVID-19 death toll by 10%".

Phrased differently, the results could be stated as "a close investigation looking for problems has found that 90% of the reported deaths are definitely due to COVID-19, with the remaining 10% possibly being due to COVID-19 or some other cause". But instead, the House Speaker, who has no medical background, talks about "compromised data".

Note that the reporting about the issue starts with casting doubt on 60% of the death certificates. It is likely that many readers will remember this particular number, and few will remember that in reality, at most 10% of the death certificates are questionable with respect to COVID-19.

Another example of the "cast doubt" strategy is governor DeSantis' mention of a motorcyclist who died in an accident after testing positive, and who was initially included on Florida's list of COVID-19 related deaths. However, even before governor DeSantis made the statement in an interview on July 20, this case had already been removed from the reported death counts. Nevertheless, this example is very "sticky", and comes up frequently in conversations with COVID-19 deniers.

The Reality: Florida Reports Fewer Than 3 Out Of 4 COVID-19 Deaths

There is a simple number that really determines how deadly the COVID-19 epidemic is: the number of people who die in addition to the number who would die in a typical year without COVID-19. This number, called "excess deaths", can easily be looked up based on death certificate data that all states submit to the CDC, and which the CDC publishes on its web site.

Based on spreadsheets last updated on 10/21/2020, and looking at actually submitted death certificates from the weeks ending between 3/7 and 9/19/2020, we can compare the excess deaths to the number of death certificates that listed COVID-19 as a cause of death:

During these roughly 6 months, the number of excess deaths in Florida was 21,263 (note that this number will go up slightly in the next few months, since some death certificates are submitted with delays of up to a year). Of these, 14,795 death certificates listed COVID-19 as a cause of death. This is about 69.6% of the excess deaths. The graph above shows that excess deaths and COVID-19 deaths follow the same pattern, which strongly indicates that the vast majority of excess deaths are most likely caused by COVID-19, and not some other cause like violence or suicide.
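For anyone who wants to check the arithmetic, here is a minimal sketch using only the two totals quoted above. The variable names are my own, and the totals are typed in by hand rather than read from the CDC spreadsheets.

```python
# Totals for Florida, weeks ending 3/7 through 9/19/2020, as quoted in the text.
excess_deaths = 21_263        # deaths above the expected baseline
covid_certificates = 14_795   # death certificates listing COVID-19

# Share of excess deaths that were attributed to COVID-19.
share_attributed = covid_certificates / excess_deaths

# Excess deaths with no COVID-19 listed on the certificate.
unattributed = excess_deaths - covid_certificates

print(f"attributed to COVID-19: {share_attributed:.1%}")      # 69.6%
print(f"not attributed:         {1 - share_attributed:.1%}")  # 30.4%
print(f"unattributed excess deaths: {unattributed:,}")        # 6,468
```

So roughly 3 out of every 10 excess deaths in this period have no COVID-19 entry on the death certificate.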

The data for excess death calculations are readily available. Excess death analyses have been published on many web sites, including the Financial Times and Our World In Data. Several scientific studies have analyzed excess mortality in the US, including a study recently published by the CDC. There is world-wide agreement on using excess mortality analysis to determine the impact of epidemics.

The result of excess death analysis for Florida is clear: the current process fails to correctly identify COVID-19 as a cause of death in 3 out of 10 cases. The COVID-19 reporting problem that Florida has is one of underreporting, not of overreporting. This could be addressed by requiring COVID-19 tests and, if necessary, autopsies for any deaths where COVID-19 cannot be excluded by clear evidence.

Just don't wait for the governor or state Republicans to suggest that.