Still Puzzled

Comments on yesterday’s post convinced me I’m not crazy and that I haven’t suddenly fallen into a spell of innumeracy. Some readers suggested that the discrepancies I couldn’t figure out might be due to “testing adjustments.” I looked into that.

Remember, my cumulative case numbers were very close to what was on the state’s website. That’s to be expected, since they come directly from the CA DHHS database. The top line of the table above has the calculation for the new metric, cases per 100K, from the state, and the second line has my calculation. The numbers for LA are pretty close – 12.7 vs. 13.1, but the numbers for OC and SD are out of the ballpark. The state’s number is 33% less than mine for OC, and 27% less for SD.

Is the difference due to a testing adjustment?

I get my cumulative testing numbers directly from each county’s public health department. They’re the only 3 numbers left that I have to put in manually. Because there is so much daily fluctuation, I calculate a 7-day moving average for each day, shown on the 4th line. From that, I can get a daily testing rate per 100,000 people.
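For anyone who wants to check this kind of arithmetic, here is a minimal sketch in Python. The test counts and the population are made up, the function names are only for illustration, and the trailing (rather than centered) 7-day window is an assumption, not a description of any county's or the state's method.

```python
def daily_from_cumulative(cumulative):
    """Turn a cumulative series into day-over-day increments."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def moving_average(values, window=7):
    """Trailing moving average: one value for each full window of days."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def rate_per_100k(count, population):
    """Scale a count to a rate per 100,000 people."""
    return count * 100_000 / population


# Hypothetical cumulative test counts for eight consecutive days.
cumulative_tests = [1000, 1500, 2100, 2600, 3300, 3700, 4400, 5200]
population = 1_000_000  # hypothetical county population

daily_tests = daily_from_cumulative(cumulative_tests)  # seven daily counts
average_tests = moving_average(daily_tests)            # one 7-day average
testing_rate = [rate_per_100k(a, population) for a in average_tests]
print(testing_rate)
```

Eight cumulative readings give seven daily counts, which is just enough for one 7-day average. A longer series produces one rate per day after the first week.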

Here’s where it gets strange. LA County has a testing rate much higher than the other counties. OC’s testing rate is 36% lower than LA’s, and SD’s rate is 35% lower than LA’s. But LA’s case rate per 100K from the state is 3% higher than a straight calculation, while OC’s is 33% lower, and SD’s is 27% lower.

Is the “testing adjustment” a penalty for doing more testing? I would find that hard to believe.

In other words, I’m no closer to understanding this than I was before. If anything, the adjustment should go the other way. A county that does more testing should get a downward rate adjustment, if that’s what accounts for the difference between a straight calculation and a modified one.

What is “data transparency”?

Almost every epidemiologist says that this is paramount in a pandemic. The public must be able to trust your numbers. As a bulletin from the W.H.O. puts it:

The reality is that most measures for managing public health emergencies rely on public compliance for effectiveness. Measures ranging from hand washing to quarantine require public acceptance of their efficacy, as well as acceptance of the ethical rationale for cooperating with instructions that may limit individual liberty so as to protect the broader public from harm. This requires that the public trust not only the information they are receiving, but also the authorities who are the source of this information, and their decision-making processes.

Data transparency doesn’t just mean publishing the numbers. It also means revealing exactly how the numbers are derived. As the WHO bulletin puts it:

The second dimension to transparency aims to promote trust between the public health authorities and the public by being forthcoming and open on all aspects of an emergency, including the evidence and assumptions used by authorities in making decisions, the manner in which those decisions are being made and by whom.

We’re not talking about nuclear physics here. We’re not even talking about, at this point, the mathematics in establishing R0 for covid-19. This is simple elementary school arithmetic. You need a population number. You need a daily case number. You need whatever numbers you’re going to use to “adjust” the result. Addition, subtraction, multiplication, division: that’s it.
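To show how little is involved, here is a rough Python sketch of the unadjusted calculation and of the gap between it and a published figure. All the inputs are hypothetical, the function names are mine for illustration, and nothing here represents the state's actual adjustment, which is the unknown part.

```python
def cases_per_100k(daily_cases, population):
    """Straight calculation: daily cases scaled to 100,000 residents."""
    return daily_cases * 100_000 / population


def percent_difference(published, straight):
    """Percent by which a published figure differs from the straight one.

    Negative means the published figure is lower.
    """
    return (published - straight) / straight * 100


# Hypothetical inputs: 450 average daily cases in a county of 3 million.
straight = cases_per_100k(daily_cases=450, population=3_000_000)
published = 10.0  # hypothetical published rate for the same county

gap = percent_difference(published, straight)
print(f"straight: {straight:.1f}, published: {published:.1f}, gap: {gap:.0f}%")
```

With the inputs and the adjustment factor in hand, anyone could reproduce the published number. Without the adjustment factor, all you can do is measure the gap.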

Instead, what we have unfortunately had is a confusing jumble of calculations that went into key metrics. Until last weekend, we had a calculation based on a 14 day period, offset from the current day by 3 days, divided by the population, multiplied by 100K, based on cases from the date symptoms were first reported, etc., etc. The database of cases based on first symptom date wasn’t made available to the public. Unless you had access to it, you just had to take the state’s word for it.
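As a rough illustration only, the pieces described above (a 14-day window, a 3-day offset, division by population, multiplication by 100K) could be put together like this in Python. The sketch assumes the cases in the window are summed rather than averaged, it uses made-up daily counts, and it does not have access to the first-symptom-date data, so it is not a reconstruction of the state's figure.

```python
def lagged_window_rate(daily_cases, population, window=14, lag=3):
    """Cases per 100K over `window` days, ending `lag` days before the latest day.

    Assumption: the cases in the window are summed, not averaged.
    """
    if len(daily_cases) < window + lag:
        raise ValueError("need at least window + lag days of data")
    end = len(daily_cases) - lag              # skip the most recent `lag` days
    window_cases = daily_cases[end - window:end]
    return sum(window_cases) * 100_000 / population


# Hypothetical series: 14 steady days, then 3 recent days that are ignored.
daily_cases = [100] * 14 + [999, 999, 999]
rate = lagged_window_rate(daily_cases, population=2_000_000)
print(rate)
```

Even in this simple form there are choices hidden in the result (sum or average, which date each case is assigned to, how long the lag is), which is exactly why the methodology needs to be published along with the number.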

Now we have a new metric, much simpler than the old one, but just as puzzling. For the metric to be truly effective, it has to meet three conditions: a) it has to be based on publicly available data; b) it has to be reproducible; and c) it has to be understandable by the general public.

Here’s an example. Suppose the Zorgi Public Health Dept. wants to communicate what it would take to achieve herd immunity. They publish the number of cases, and then they use an article from the NCBI to explain the calculation:

we will estimate Rt, and we can do this by applying the exponential growth method,4 using data on the daily number of new COVID-19 cases, together with a recent estimate of the serial interval (mean = 4.7 days, standard deviation = 2.9 days),5 at a 0.05 significance level, with the mathematical software R (v3.6.1.). Using these values of Rt, we can then calculate the minimum (‘critical’) level of population immunity, Pcrit, acquired via vaccination or naturally-induced (i.e. after recovery from COVID-19), to halt the spread of infection in that population, using the formula: Pcrit= 1-(1/Rt). So, for example, if the value of Rt = 3 then Pcrit= 0.67, i.e. at least two-thirds of the population need to be immune.

The question is, then, is herd immunity an effective metric? Only partly. It meets the first two conditions, but not the third. Only a small percentage of people would understand the above quote, although most people certainly understand that herd immunity won’t occur until 60% to 70% of the population is immune.
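The last step in the quoted passage, Pcrit = 1 - (1/Rt), is the one part that is plain arithmetic. Here is a small Python sketch of just that step. Estimating Rt itself is the hard part and is not attempted here. Returning zero when Rt is 1 or less is my own simplification, on the reasoning that the infection is already not growing in that case.

```python
def critical_immunity(rt):
    """Minimum immune fraction needed to halt spread: Pcrit = 1 - 1/Rt."""
    if rt <= 1:
        # Simplifying assumption: no threshold is needed when Rt is 1 or less.
        return 0.0
    return 1 - 1 / rt


# The example from the quoted article, plus two values that give 60% and 70%.
for rt in (2.5, 3, 10 / 3):
    print(f"Rt = {rt:.2f} -> Pcrit = {critical_immunity(rt):.2f}")
```

An Rt of 3 gives the two-thirds figure from the quote, and the familiar 60% to 70% range corresponds to Rt values of roughly 2.5 to 3.3.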

Some argue that Rt should be the primary metric. While I agree that Rt is extremely important, it’s not as effective as measurements like daily cases, daily fatalities, and daily hospitalizations, simply because it requires advanced math and statistics to calculate. If you don’t believe me, go to’s covid model and see if you think the average person could understand it.

I’m not arguing that a public health department shouldn’t use any complex metrics. Of course they should use something like Rt. But they also need to rely on metrics that satisfy all three conditions above. If one of your metrics is cases per 100,000 people, then it’s incumbent upon you to publish all the data that goes into that calculation, as well as your methodology. It’s just like you were told for your homework assignments: “show your work!”

And let’s be very clear about one point: I am not saying the CA public health departments shouldn’t be trusted, or that they’re trying to pull one over on us, or that this is some big conspiracy to hide the truth. Nothing of the kind. I’m suggesting an improvement in communications – that’s all.

This is not fundamentally about epidemiology, or I wouldn’t be opining on it. It is about effective communications, and I do have 40 years of experience in that field. If you’re in business, which I was, you have exactly the same challenge. Whether it’s your employees, your investors, or your customers, the metrics you use to define your operations must meet the same conditions.

QAnon, Trump, and the CDC

Yesterday, Twitter had to remove a claim from a QAnon supporter who completely distorted the CDC report on covid-19 deaths. Of course, Trump retweeted it before that, evidently having nothing more important to do than tweet misinformation. In its weekly report, in the section on comorbidities, the CDC said, “For 6% of the deaths, COVID-19 was the only cause mentioned.” Right wing media jumped on this with their typical distortion of reality. They multiplied 6% x 165,000 deaths, and came up with roughly 10,000, supposedly the “true” number of deaths from covid-19.

This fatuous trick was tried all the way back in May by SD County Supervisor and trumpster Jim Desmond, who claimed there were only six “pure” covid deaths. It relies on the idiotic supposition that in order to count something as a cause of death, it must stand alone, with no comorbidities. Anyone who uses this argument has probably never seen a real death certificate, because if they had, they would know that there are four lines describing the cause of death: the first, which is the final disease or condition resulting in death, and three more for the conditions that led to it, along with a separate section for other conditions that contributed.

Some people do know this, and still argue the nonsensical point. These are just propagandists, who rely on ignorance and mass confusion.

Election Day is in 62 Days

Sick of the pandemic and ready for a change? Your vote counts, no matter where you live. So plan now: check your registration, make sure your family and friends do that, and motivate others to save our country. And don’t wait until the last minute to drop your ballot in the mail!

Also, here’s another great site where you can track the status of your ballot:

Finally, I’m showing more of the county charts from, which will become my new source of data for every level from the county on up as soon as we can finish the scripts to load the master spreadsheets. I’ll continue to add the line for the date when CA started relaxing its lockdown – June 19.

Stay safe, be healthy, and wear your masks!

Infection Rates (Rt)

Last updated 8/31/2020. Each data point is a 14-day weighted average. We present the most recent seven days of data as a dashed line, as data is often revised by states several days after reporting.

County Dashboards
