What Is the True Active Prevalence of COVID-19?

Authors: Mu-Jeung Yang, David Eccles School of Business; Nathan Seegert, David Eccles School of Business; Maclean Gaulin, David Eccles School of Business; Adam Looney, David Eccles School of Business; Brian Orlean, University of Utah School of Medicine; Andrew T. Pavia, University of Utah School of Medicine; Kristina Stratford, University of Utah School of Medicine; Matthew Samore, University of Utah School of Medicine; Steven Alder, University of Utah School of Medicine

Download the Paper

Estimates derived from random, representative testing in Utah suggest the true prevalence of COVID-19 across all 50 states is 2 to 3 times higher than publicly reported.

Between May 4 and June 10, 2020, the state of Utah, through the Utah HERO project, embarked on an ambitious COVID-19 testing project that would ultimately result in the randomized, representative testing of 10,000 individuals across four counties.

The project tested for both antibodies and active infections to measure prevalence of the virus. Using its data, researchers at the Marriner S. Eccles Institute in the David Eccles School of Business and the University of Utah School of Medicine developed a model that tracks COVID-19 infections in real time and ultimately predicts infections before they appear in public data.

A key parameter of the model is the likelihood ratio of symptoms for infected persons relative to uninfected persons based on response data collected in Utah. This allows the researchers to extrapolate from positive rates in symptomatic people to the positive rate of the underlying (symptomatic and asymptomatic) population.
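One way to see how a symptom likelihood ratio supports this kind of extrapolation is Bayes' rule in odds form. The sketch below is not the authors' model. It assumes that only symptomatic people are tested, that tests are perfectly accurate, and that the likelihood ratio is known. The function name and the input numbers are hypothetical and chosen only for illustration.

```python
def population_prevalence(symptomatic_positive_rate: float,
                          likelihood_ratio: float) -> float:
    """Back out P(infected) from P(infected | symptoms).

    likelihood_ratio = P(symptoms | infected) / P(symptoms | uninfected)

    Bayes' rule in odds form:
        odds(infected | symptoms) = likelihood_ratio * odds(infected)
    so the population odds are the symptomatic odds divided by the ratio.
    """
    if not 0.0 <= symptomatic_positive_rate < 1.0:
        raise ValueError("symptomatic_positive_rate must be in [0, 1)")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")

    # Odds of infection among people with symptoms.
    symptomatic_odds = symptomatic_positive_rate / (1.0 - symptomatic_positive_rate)
    # Odds of infection in the whole (symptomatic and asymptomatic) population.
    population_odds = symptomatic_odds / likelihood_ratio
    # Convert odds back to a probability.
    return population_odds / (1.0 + population_odds)


# Hypothetical inputs: 20% of symptomatic people test positive, and symptoms
# are 4 times as likely among infected as among uninfected people.
estimate = population_prevalence(0.20, 4.0)
print(f"Implied population prevalence: {estimate:.2%}")
```

The larger the likelihood ratio, the more strongly symptoms concentrate infections among those tested, and the further the population rate falls below the positive rate seen in symptomatic people.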

Prevalence of COVID-19 in Utah and across all 50 U.S. states

Randomized viral testing in Utah estimates that the prevalence of COVID-19 in Utah from May 4 to June 11 was 0.23% with a 95% confidence interval and that at any point in time from May to June 11, average viral prevalence was 0.30%.

Given the method’s accuracy, the authors provide estimates for all 50 U.S. states and ultimately find that prevalence of COVID-19 (the number of people who have ever had the virus) is 2 to 3 times higher on average than publicly reported prevalence.

Additionally, by comparing the time series of latent and reported prevalence, the authors show that the ratio of these two time series is not stable. In other words, sample selection will typically be time-varying, which is an important insight for modeling purposes; see Yang et al. (2020).

How researchers conducted random testing in Utah

Between May 4th and June 10th, HERO project workers contacted 25,642 households across four counties in central Utah (Davis, Salt Lake, Summit and Utah Counties). To recruit a representative sample, they randomly selected households from a public list of 657,870 addresses using a stratified sampling approach. Addresses in the first recruitment strategy were contacted by postcard, by letter, and through in-person follow-up from a field team.

The remaining addresses received a letter but were not contacted by the field team. All randomly selected households were encouraged to fill out household and individual surveys and to visit a testing bus to receive a PCR (viral) and serology (antibody) test. Individuals were compensated with a $10 gift card for completing the survey and subsequently being tested.

To make it easier for individuals to be tested, a testing bus was parked in the center of each geographically compact area the researchers selected. Each area consisted of two or more adjacent Census tracts. To ensure the sample was representative, the authors stratified the Census tracts based on publicly reported case prevalence, the portion of the population identified as Hispanic, and the population’s median age.
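As a rough illustration of stratified sampling in general, the sketch below groups units into strata and draws from each stratum in proportion to its size. It is not the HERO project's procedure: proportional allocation, the function name `stratified_sample`, and the toy strata are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_total, seed=0):
    """Draw roughly n_total units, allocating draws to strata by stratum size.

    units      : list of sampling units (e.g. Census tracts or addresses)
    stratum_of : function mapping a unit to a sortable stratum key
    n_total    : target sample size across all strata
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group units by stratum.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Proportional allocation (an assumption of this sketch): at least
        # one draw per stratum, and never more than the stratum contains.
        share = round(n_total * len(members) / len(units))
        k = min(len(members), max(1, share))
        sample.extend(rng.sample(members, k))
    return sample


# Toy example: 100 hypothetical tracts split into 4 equal strata.
tracts = list(range(100))
chosen = stratified_sample(tracts, stratum_of=lambda t: t % 4, n_total=20)
print(len(chosen))
```

In practice a stratum key might combine several characteristics, such as binned case prevalence, Hispanic population share, and median age, into a single tuple.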

For more on the sampling methodology, read the full paper.

Benefits of random, representative testing

The benefits of randomized, representative testing are many. Above all, experts agree it is the only way to truly know how widespread the disease is in our communities. Non-random testing (for example, testing available only to those with symptoms or exposures) often excludes individuals who choose not to seek care or face barriers to seeking care, whose symptoms are too mild to be tested under current protocols, or who may be asymptomatic but contagious.

Even six months into the pandemic in the U.S., most testing is used for the sick and vulnerable and reveals only the tip of the iceberg of the infected population. Random sampling allows public health officials to see the entire infected population.

Random sampling studies, however, can quickly become prohibitively costly and organizationally unwieldy. The authors’ method instead makes it possible for other states to derive estimates of infection within their populations based on lessons from Utah.
