I always seem to get a couple of people who post a TLDR reply. So here’s my TLDR for them: this post is definitely not for you; come back another day!
Wow, yesterday was quite a day, wasn’t it? I don’t think I ever had that much activity around a daily post. Sure, there were some trolls, some rude people, some obstinate ones — but the vast majority, 95% really, added comments that were an absolute joy to read. So thanks to all of you. I tried to answer every one, but I know I missed some, for which I apologize.
In yesterday’s comments, /u/daltibud asked me if I had detected any changes as a result of data no longer being sent directly to the CDC, but to HHS instead. He pointed me to the David Pakman Show and to some others like David Hester and Philip DeFranco, who made the argument that the White House was “cooking the books.” He made it clear that he took all this with a grain of salt; he just wondered if there was any truth to it.
That piqued my interest. After all, if that could be shown clearly in the data, it would be a very big deal. It would basically destroy the credibility of the HHS and the CDC, just about the last thing we need now. [yes, I know a lot of you think the CDC’s credibility is already destroyed, but I’m not going to get into that again]
DeFranco’s and Hester’s arguments are primarily based on case data, so that’s what I downloaded from covidtracking.com, just as I do for 7 states every day. Only this time, I downloaded case data from June 1 to the present for 23 states: AL, AZ, CA, FL, GA, IL, KS, LA, MD, MI, MS, NC, NH, NM, NV, OK, OR, SC, TN, TX, WA, and WV. Twelve of these states are led by Republican governors; eleven by Democratic governors.
At first blush, it appears that they may be right.
This chart shows the daily case total for red states and blue states. The downward curve after July 17 certainly looks much sharper for red states than for blue ones, at least in absolute terms. Blue states went from 21,188 cases per day to 13,835 cases per day, a 35% drop. Red states went from 43,075 cases per day to 30,093 cases per day, a 30% drop.
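The percentages are easy to check. Here is a minimal sketch in Python that uses only the daily averages quoted above; the function name `percent_drop` is my own, nothing official.

```python
def percent_drop(before: float, after: float) -> float:
    """Return the percentage decrease going from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before * 100.0


# Daily case figures quoted in the text above.
blue_drop = percent_drop(21_188, 13_835)
red_drop = percent_drop(43_075, 30_093)

print(f"Blue states: {blue_drop:.1f}% drop")
print(f"Red states:  {red_drop:.1f}% drop")
```

Rounded to whole numbers, these come out to the 35% and 30% above. Note that the red-state drop is bigger in raw case counts simply because red states started from a much higher number, which is why it pays to compare percentages and not just the shape of the curve.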
While that’s not a huge difference, one could make the case that it’s not “normal.” To see what’s going on, we really have to look at each of the 23 states.
You may have noticed some anomalies with some of those charts. For example, Kansas appears to be missing a lot of data. Washington has a day where they had -500 cases. I can only say that I took the data as-is from covidtracking.com, without any attempt to massage it or fix it.
Kind of hard to come to a firm conclusion from all that, isn’t it? They all have the same polynomial trend line. They all have July 16 clearly marked, i.e., the last day data went to the CDC directly. Eyeballing the trend lines, I classified each state into one of five categories:
- A steep decrease in cases after July 16
- A mild decrease in cases
- Mostly flat
- Mild increase
- Steep increase
I then placed each state into the appropriate category:
The blue states more or less occupy the middle ground, while four red states have steep decreases and two other red states have steep increases.
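If you'd rather not rely on eyeballing, the five categories could be assigned by rule. What follows is only a sketch of one way to do that, not what I actually did: it swaps the polynomial trend line for a plain straight-line fit on the days after July 16, and the two cut-off values (`STEEP` and `MILD`) are numbers I made up for illustration.

```python
# Hypothetical cut-offs: slope of the fitted line as a fraction of the
# average daily case count. Illustrative assumptions, not from the data.
STEEP = 0.02
MILD = 0.005


def linear_slope(ys):
    """Least-squares slope of ys against the day index 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def classify_trend(daily_cases):
    """Put a run of daily case counts into one of the five categories."""
    mean = sum(daily_cases) / len(daily_cases)
    if mean <= 0:
        raise ValueError("average daily cases must be positive")
    rel = linear_slope(daily_cases) / mean  # relative change per day
    if rel <= -STEEP:
        return "steep decrease"
    if rel <= -MILD:
        return "mild decrease"
    if rel < MILD:
        return "mostly flat"
    if rel < STEEP:
        return "mild increase"
    return "steep increase"


# Made-up numbers for illustration only (not covidtracking.com data).
print(classify_trend([1000, 960, 930, 900, 860, 830, 800]))
print(classify_trend([500, 505, 498, 502, 499, 503, 500]))
```

Different cut-offs would move borderline states from one bucket to another, so a rule like this makes the judgment call visible; it doesn't remove it.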
Bottom line: I don’t think the data support the argument of David Hester and Philip DeFranco, which is why I didn’t link to them. If you’re curious, of course you can Google them on your own.
While this may not be a full-fledged conspiracy theory, to me it borders on that. I don’t like conspiracy theories, no matter who they come from or which side they support. They give a false picture of reality, and that’s the last thing we need in a pandemic.
Also, if you really think about it, it would take a LOT to keep something like this a true secret. It’s possible, but unlikely. There are thousands of people handling case data from every state. And while there are certain states that have really bungled the data [look at the Kansas chart, for example], to manipulate case data based on the political party of the governor would require the cooperation of too many people to keep it a secret. There’s an interesting paper on the statistics and probabilities involved with conspiracy theories if you’re interested.
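To get a feel for why headcount matters, here is a back-of-the-envelope sketch. It is my own toy model, not the one in the paper: it assumes each person involved leaks independently with the same small probability, and the 0.1% leak chance is a number I picked purely for illustration.

```python
def prob_stays_secret(n_people: int, p_leak: float) -> float:
    """Chance that nobody leaks, assuming independent, equal leak odds."""
    if n_people < 0:
        raise ValueError("n_people must be non-negative")
    if not 0.0 <= p_leak <= 1.0:
        raise ValueError("p_leak must be between 0 and 1")
    return (1.0 - p_leak) ** n_people


# Hypothetical leak probability per person: 0.1% (an assumption).
P_LEAK = 0.001

for n in (10, 100, 1_000, 5_000):
    print(f"{n:>5} people: {prob_stays_secret(n, P_LEAK):.3f}")
```

Even with a tiny per-person leak chance, the odds of secrecy shrink quickly as the number of people grows. With these made-up inputs, it falls below 1% by the time 5,000 people are involved.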
Is a single event enough to establish correlation?
“In the wake of the recent mass gathering Americans have witnessed in the streets of Portland and Seattle, we are also tracking a significant rise in cases in both metropolitan areas because of what’s been going on.”
Trump, press briefing/campaign briefing, July 28, 2020
There are also “neutral” charts popping up on the Internet. Here is one from /r/dataisbeautiful – a sub I really love, by the way.
The Oregon Health Authority reported that demonstrations were not a big factor in the COVID19 spike. I don’t want to rehash the whole argument about whether BLM protests caused a spike — we’ve had enough of that! The point here is, how reliably can you take a single point in time, compare it to a series of data, and declare that there’s a correlation between the two?
As you may have guessed, this was also the central question involved in the first section of this commentary, i.e., does the data show that HHS “cooked the books”?
Typically, when you’re looking for correlations, you have at least two sets of data. You can plot them on a chart, or figure out if they’re correlated from a statistical calculation. You also need enough data to make a meaningful inference.
There’s an excellent overview from the National Institutes of Health of the common pitfalls in correlation techniques. I won’t repeat all of them here, but two of the most relevant ones for this question are a) you need enough data points in each set; and b) you have to be very careful in discussing correlation, lest you imply causation.
In this case, the death of George Floyd is a single event in time. That set of data contains one element. Daily cases constitute around 60 elements in the second set. Can you make a meaningful inference of correlation from 2 sets of data, one set containing a single element, and the other containing 60?
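To make that concrete, here is a small sketch of the standard Pearson correlation calculation, written out by hand so you can see what it needs. The numbers are made up, and `pearson_r` is just my name for the helper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of paired observations.

    Raises ValueError when the inputs cannot support a correlation.
    """
    if len(xs) != len(ys):
        raise ValueError("series must be paired (same length)")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no variance to correlate")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Two made-up paired series: the calculation works.
print(round(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 3))

# One event versus 60 daily counts (placeholder values): nothing to pair.
event = [1]
daily_cases = list(range(60))
try:
    pearson_r(event, daily_cases)
except ValueError as err:
    print("undefined:", err)
```

The formula needs two series that line up observation by observation, and each one has to vary. A single event gives you neither, so whatever you conclude from lining it up against a case curve is not a correlation in the statistical sense.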
An interview with Multnomah County health officials demonstrates why this is problematic. Dr. Jennifer Vines explained that family and social gatherings were happening a lot more. Worksites were reporting more cases. People were lax with friends and family, and spent more time with them. She did say that only a handful of people connected to the protests had turned up sick. In East Lansing, MI, officials traced 138 cases to a single bar. Scientific American has a very good article about “superspreader events” that drive 80% of the spread of COVID. What do these events have in common? They occur indoors, they involve lots of people, and the people are close to each other for longer periods of time.
The bottom line is this: unless someone can verify the actual setting for an infection, don’t take for granted any assertion that this or that event “caused” or “had a major effect” on case volume. Even if you don’t use the word “caused,” it’s easy to phrase your thesis in a way that directly implies causation.
I remember a long time ago my stat professor made us look at data for stork populations and deliveries outside of hospitals. You guessed it, there’s a correlation!
Understanding COVID is a journey in the accumulation of knowledge. When there is little knowledge of COVID in particular, but lots of knowledge of the potential disastrous effects of a pandemic, health officials have to assess risk with little direct evidence for causality. In February and March, for example, they didn’t know exactly which situations caused outbreaks; they just knew from previous epidemics that limiting human contact was a prudent thing to do. The countries that minimized risk are the ones that in general have far lower death rates than the ones that disregarded risk.
As knowledge builds, so does understanding of causality. For example, we know now that superspreaders are a particular risk. We know what kind of settings are most conducive to superspreaders infecting others. Causality, which in March was behind an opaque wall, is now revealing itself. And it behooves us to do what’s necessary to reduce superspreader events as much as possible.
On the other hand, we also know now that outdoor activities, especially where there is constant movement and little or no physical contact with others outside the immediate family, are not conducive to the spread of the virus. So in my opinion (again, I am not an epidemiologist!), it makes sense to allow people to walk on the beach for exercise, or to sit in the sun there, as long as they’re separated from other people.
Social risk can’t be evaluated as an abstraction. All risk avoidance has a social cost, as does risk complacency. If we evaluated risk from an absolutist point of view, we’d have to stop driving anywhere. We could eliminate COVID altogether if we could drive risk to zero, but the only way to do that would be to convince everyone in the entire world to quarantine for about four weeks.
I’ll finish up by saying that I’m not going to answer any more replies from people who want to argue that BLM protests caused more COVID cases, or that they’re the same as indoor megachurch events.
A couple of people asked about Orange County yesterday. It’s also the home of one of the people we should remember: Parker Green, a promoter of the weekly “Saturate OC” events that follow much the same guidelines as Sean Feucht’s event in Cardiff. So here’s a quick look at OC.
OC’s Zorgi Score has definitely gone down. Here’s why:
The positivity rate has more or less flatlined. It’s too high, at 13%, but at least it’s not going up. Cases are showing a slight decline.
The HUR has pretty much stabilized at around 10% to 12%, and the IUR has declined very slightly. The daily patient load has declined slightly from 800 at the beginning of July, to around 700 now.
Fatalities are higher than they were a month ago, but it’s not a huge increase – from 2 a day to 3 a day. More importantly, the fatality doubling days looked like they were leveling off or even declining a few days ago, but now they’ve started going up again.
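For anyone wondering how a doubling-days figure can be computed, here is one common way to do it, sketched under the assumption that cumulative deaths grow at a steady exponential rate over the window. The function name and the sample numbers are made up for illustration, and this is not necessarily the exact calculation behind my charts.

```python
import math


def doubling_days(start_total: float, end_total: float, days: float) -> float:
    """Days for a cumulative count to double at the window's growth rate."""
    if start_total <= 0 or end_total <= start_total or days <= 0:
        raise ValueError("need a positive window and a growing, positive count")
    return days * math.log(2) / math.log(end_total / start_total)


# Made-up cumulative totals, for illustration only.
print(round(doubling_days(400, 500, 14), 1))  # slower growth
print(round(doubling_days(400, 600, 14), 1))  # faster growth
```

Under this definition, a larger number of doubling days means slower growth, and a smaller number means faster growth.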
OK, by now you’re probably as tired as I am!
Stay well and healthy, wear your masks, and try to talk to one person today who hasn’t voted and convince them to vote!