Data Collection/Analysis and Covid-19: TL;DR Edition
Not to encourage people to skip the longer version, but here's a key point.

Let me emphasize a key point that I really wanted to make in my previous, much longer post. It comes from a FiveThirtyEight piece (Why It’s So Freaking Hard To Make A Good COVID-19 Model) I linked therein:
Numbers aren’t facts. They’re the result of a lot of subjective choices that have to be documented transparently and in detail before you can even begin to consider treating the output as fact. How data is gathered — and whether it is gathered the same way each time — matters.
There’s also the issue of uncollected or inaccurate data. To determine the fatality rate, you have to divide the number of people who have died from the disease by the number of people infected with the disease. In this case, we don’t really have a reliable count for the number of people infected — so, to put it mathematically, we don’t know the denominator. (If we’re being honest, we probably don’t know exactly what the first number — the numerator — is, either, but we’re assuming it’s closer to correct.)
In other words, not only is there a lot we don’t know, but even the number of deaths as currently reported is an artifact of uncertainty.
The real death tally is the reported tally +/- some level of error (which, of course, is true about the annual flu-related death rate, or anything else that requires judgment calls and/or has to account for human error).
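To make the denominator problem concrete, here is a minimal Python sketch. Every number in it is hypothetical and chosen only for illustration, and the "undercount multiplier" is my own assumption about how many true infections might exist per confirmed case, not something taken from the FiveThirtyEight piece.

```python
def fatality_rate(deaths, infections):
    """Deaths divided by infections, as described in the quoted passage."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    return deaths / infections


# Hypothetical figures, for illustration only.
reported_deaths = 1_000
confirmed_cases = 20_000

# Assumed ratios of true infections to confirmed cases (hypothetical).
for multiplier in (1, 2, 5, 10):
    true_infections = confirmed_cases * multiplier
    rate = fatality_rate(reported_deaths, true_infections)
    print(f"{multiplier}x confirmed cases -> fatality rate {rate:.2%}")
```

The same death count yields very different rates depending on what one assumes about untested infections, which is why an unknown denominator matters so much.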
The question at the moment is: what is more likely in these conditions? An over-count or an under-count?
I would argue that an under-count is more probable. First and foremost, there is the lack of adequate testing. Second, this is a new phenomenon (unlike the flu) and there is, therefore, no experience with making decisions about how to classify morbidity (and this also raises consistency problems in terms of coding deaths). Third, we are placing a lot of stock in instant counts, but the reality is that in the middle of a crisis we should expect some communication errors and lags.
On that last point let me note: a lag in communication cannot lead to an over-count, it can only lead to an under-count.
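A small sketch of why that is so, assuming (my simplification) that every death is eventually reported exactly once, just some fixed number of days late. The daily figures and the lag are hypothetical.

```python
def true_total(deaths_by_day, today):
    """Deaths that have actually occurred through day `today` (0-indexed)."""
    return sum(deaths_by_day[: today + 1])


def reported_total(deaths_by_day, lag_days, today):
    """Deaths known on day `today` if each day's count arrives lag_days late."""
    return sum(
        count
        for day, count in enumerate(deaths_by_day)
        if day + lag_days <= today
    )


# Hypothetical daily deaths, for illustration only.
deaths_by_day = [1, 2, 4, 8, 16]
today = 4

for lag in (0, 1, 2):
    reported = reported_total(deaths_by_day, lag, today)
    actual = true_total(deaths_by_day, today)
    print(f"lag {lag} days: reported {reported}, actual {actual}")
```

Under these assumptions the reported figure can equal the actual figure (no lag) or fall below it, but it can never exceed it, and the gap is widest when deaths are rising quickly.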
Another problem with getting a good sense of the available data is that there are various time horizons in operation here. California is on one clock, NYC is on its own clock, and Louisiana is on yet another. It is difficult to really assess the effects of various policy choices right now. For example, Florida’s stay-at-home order is only just over a week old as I write this. (And their tests are lagging, as I noted in a link in my previous post.)
Fundamentally, I would argue that the criticisms of the estimates of the death toll are asserting far too much certainty prematurely, because they aren’t thinking through either the quality of the data at the moment or its incompleteness.
Vaccine available in September? I can’t find much of anything but I just saw something saying that researchers at Oxford are 80% sure they have a vaccine and it would be ready sometime in September.
Here is the link, seems real:
https://www.bloomberg.com/news/articles/2020-04-11/coronavirus-vaccine-could-be-ready-in-six-months-times
@senyordave:
Fingers crossed.
@senyordave:
Related to that:
Bill Gates is funding new factories for 7 potential coronavirus vaccines, even though it will waste billions of dollars
Bill Gates is building factories to mass-produce seven vaccine candidates, so that as soon as testing concludes the best one can be made immediately available, accepting the other six as a loss.
Getting overly focused on the lack of accuracy of models is a failure to appreciate their purpose – which is guiding policy, which they can do without being accurate.
My last job consisted primarily of building financial models. One of the people I built models for swore by my work, and he used to say that all my models were wrong, but much less wrong than anybody else’s models.
@Stormy Dragon:
Good on him.
In truth, that’s what for-profit pharmaceutical companies do, too — they fund many drug developments, in hopes that the one success in the bunch will pay for all of the losers. Fewer than 10% of drugs entering Phase 1 trials eventually get approved — and quite a few drugs never make it even to Phase 2.
@senyordave: from that link
So, if they go straight to manufacture of an untested vaccine, and nothing of any sort goes wrong while moving to mass production, and they get lots of luck, then maybe in the fall if her 80% self-confidence is actually true.