The Color of Coronavirus: COVID-19 Deaths By Race and Ethnicity in the U.S.

Country/area: United States

Organisation: American Public Media, APM Research Lab

Organisation size: Small

Publication date: 8 Apr 2020

Credit: Andi Egbert, Benjamin Clary, Gabriel Cortes, Craig Helmstetter, Kristine Liao

Project description:

“The Color of Coronavirus: COVID-19 Deaths By Race and Ethnicity in the U.S.” is the APM Research Lab’s ongoing nationwide COVID-19 mortality monitoring project, with a focus on racially inequitable deaths. The project combines original data collection, demographic data analysis, visualization and compelling narrative.

Impact reached:

Our data sets have been requested by nearly 70 institutions of higher learning, medical doctors, health equity think tanks and others, including: UC Berkeley, Syracuse University, Los Angeles Economic Development Corporation, Duke University, Margolis Center for Health Policy, Rutgers University, National Women’s Law Center, Johns Hopkins University, Texas Health Institute, Native Hawaiian and other Pacific Islander COVID-19 Resource Team & Asian American and Pacific Islander COVID-19 Policy and Research Team, and more.

Techniques/technologies used:

We scrape and compile data on COVID-19 mortality by race and ethnicity directly from state health department websites. For states that report only percentages by race, not counts, we derive estimated counts. We also compare the data we collect to the time-lagged and often suppressed CDC data on the same topic; in the few cases where the CDC data are found to be more robust (and in the case of the two states not yet reporting these data, North Dakota and West Virginia), we supplement our data file with the CDC’s offerings. Deaths of unknown race are removed for uniformity across states and, importantly, population denominators are aligned with each state’s method of reporting ethnicity (whether Latinos are counted as an overlapping or a distinct group) and race (alone, or alone or in combination with other races).
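
As a rough illustration of the percentage-to-count step, the sketch below multiplies a state's total deaths by each group's reported share and then drops the unknown-race category. The function name, the "Unknown" label and the rounding rule are assumptions made for this example, and the numbers are invented; the exact procedure used in the project may differ.

```python
def estimate_counts(total_deaths, pct_by_group, unknown_label="Unknown"):
    """Estimate death counts per group from reported percentages.

    total_deaths  -- total deaths reported by the state
    pct_by_group  -- mapping of group name to percent of total deaths (0-100)
    unknown_label -- hypothetical label for deaths of unknown race
    """
    # Convert each percentage share into an estimated whole-number count.
    counts = {
        group: round(total_deaths * pct / 100)
        for group, pct in pct_by_group.items()
    }
    # Drop deaths of unknown race so that states are treated uniformly.
    counts.pop(unknown_label, None)
    return counts


# Invented figures for a hypothetical state that reports only percentages.
example = estimate_counts(
    200, {"Black": 30.0, "White": 50.0, "Latino": 10.0, "Unknown": 10.0}
)
print(example)
```

Rounding to whole deaths means the estimated counts may not sum exactly to the reported total, which is one reason such figures are best labeled as estimates.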

We then calculate mortality rates per 100,000 by race and ethnicity for all states and aggregate the states’ data to construct figures for the nation as a whole. Finally, for each release, we calculate the latest national age-specific mortality rates and apply them to each state’s population data by race and age, so that we can indirectly age-adjust the mortality rates. This accounts for varying age distributions across racial groups and states. Age-adjusting reveals even larger mortality disparities between Black, Indigenous and other populations of color and Whites, who experience the lowest age-adjusted rates nationally. It also elevates the mortality rate for Latinos over its actual (crude) rate more than for any other group, revealing that COVID-19 is stealing far more Latino lives than we would expect despite this group’s relative youthfulness.
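
The rate calculations can be sketched as follows. This is a minimal example of textbook indirect standardization, not the project's actual code: expected deaths come from applying national age-specific rates to a group's population by age, and the ratio of observed to expected deaths is then scaled by the national crude rate. All function names, age bands and figures here are hypothetical.

```python
def crude_rate_per_100k(deaths, population):
    """Crude mortality rate: deaths per 100,000 people."""
    return deaths / population * 100_000


def indirect_age_adjusted_rate(observed_deaths, group_pop_by_age,
                               national_rate_by_age, national_crude_rate):
    """Indirectly age-adjusted mortality rate per 100,000.

    group_pop_by_age     -- the group's population in each age band
    national_rate_by_age -- national deaths per 100,000 in each age band
    national_crude_rate  -- national deaths per 100,000, all ages
    """
    # Deaths the group would see if it died at national age-specific rates.
    expected_deaths = sum(
        national_rate_by_age[age] / 100_000 * pop
        for age, pop in group_pop_by_age.items()
    )
    # Ratio of observed to expected deaths (standardized mortality ratio).
    ratio = observed_deaths / expected_deaths
    return ratio * national_crude_rate


# Invented numbers for a relatively young hypothetical group.
national_rates = {"0-44": 2.0, "45-64": 20.0, "65+": 200.0}
group_population = {"0-44": 700_000, "45-64": 200_000, "65+": 100_000}

crude = crude_rate_per_100k(381, sum(group_population.values()))
adjusted = indirect_age_adjusted_rate(381, group_population, national_rates, 40.0)
print(round(crude, 1), round(adjusted, 1))
```

In this invented case the crude rate sits below the national rate while the adjusted rate sits well above it, which is the pattern described above for a comparatively young population.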

What was the hardest part of this project?

One challenge has been presenting such a wealth of data and complex methods such as age-adjustment in an accessible manner. We have sought to make our findings highly readable, highly visual and highly actionable for all who are seeking them out.

From the beginning, we were challenged by reporting discrepancies among state health departments. We contacted numerous states in the first two months of the work to request clearer or more comprehensive data, and our data advocacy succeeded in some cases. However, the lack of uniform reporting is still challenging; e.g., some states do not uniquely identify Indigenous people or Pacific Islanders. Others report Pacific Islanders jointly with Asians.

Treatment of Hispanic ethnicity also varies across states and, in some states, has changed over time. All of this unevenness in the data has required an extra degree of care by our team to select and align the proper denominators at the state level, so that we accurately calculate percentages (of total deaths and population share) and mortality rates. We have observed that other reporting projects that now exist on this topic have not exercised the same degree of caution. The result of this care, however, is the most robust aggregated portrait of COVID-19 mortality by race and ethnicity at the national level.

The most significant challenge has been bearing the weight of this knowledge and the trauma and tragedy that each one of these numbers represents. Feeling the human cost of these data has been difficult to process.

What can others learn from this project?

Journalists across the globe are using this work in real time to inform their reporting on the coronavirus pandemic and health equity, and to draw a fuller picture of key events as they unfold. Journalists and media organizations thinking of doing this kind of work in the future can take away the impact and importance that this kind of quality meaning-making of complex data can have for audiences and for issues of great world importance. This work helps democratize data that are available but not accessible to the general population, and it connects dots that are not otherwise being connected in the collection of these data.

Project links: