Overview
It is clear from US data [1, 2] that fewer people who test positive for the live virus in the US are dying today than was the case early in the pandemic. Can we credit improved Covid-19 treatment? Is this due to the fact that we are testing more people or perhaps that those who are getting tested are less ill than was the case early in the pandemic? Is this related to the sex, race and ethnicity of those infected? Are viral mutations responsible? Is this the result of people of different ages taking different precautions against getting infected? While there are numerous possible reasons for the fact that fewer people who test positive are dying, in this study we investigate only the impact of the ages of the people infected with the virus on the Covid-19 specific mortality rate.
More specifically, this study documents a methodology for calculating the local mortality rate for people infected with Covid-19. The local mortality rate is based on the ages of the people testing positive and dying. We first estimate the mortality rate for age groups spanning 1 decade, then apply those estimates to US regions to yield the estimated local mortality rate vs. time and place since the start of the pandemic.
Estimating Age Dependent Mortality and Local Mortality Rates
As described in the ‘Model Description’ section of the Covid-19 Model page, daily deaths divided by the mortality rate yields the estimated prevalence of the virus. Here, the mortality rate is the rate at which people infected with Covid-19 die. Estimates for the mortality rate vary between 0.001 and 0.03 [3, 4, 5] around the world. A pre-release paper [6] based on a study in Geneva, Switzerland estimates that the mortality rate for an infected population of all ages is 0.0064. Somewhat arbitrarily, we adopt 0.006 as the nominal mortality rate. By nominal mortality rate we mean the mortality rate for a given population when people of all ages in that population interact as they did pre-Covid-19.
To estimate the impact of age on the mortality of a population infected with the virus we subdivide the population into age groups spanning 1 decade from age 0 through age 89. In this study we add people aged 90 and above to the 80 to 89 age group, not in any way as a value judgment, rather to simplify calculations and graphs.
Local mortality rate, $M_L$, is defined as:
$$M_L = F_0 M_0 + F_{10} M_{10} + \dots + F_{80} M_{80} \tag{1}$$
where $F_0$ through $F_{80}$ are the fractions of the population in the decadal age groups 0 through 80 who are exposed to the virus for a specific time and place (in our case a US region), and $M_0$ through $M_{80}$ are the mortality rates for the decadal age groups 0 through 80 independent of time and place.
When considering the local mortality rate, imagine a Covid-19 exposed population comprised of age groups in the exposed fractions $F_0$ through $F_{80}$ of equation (1), whose composition varies with time and place (in this case regions of the US). Said another way, at any time and place different age groups are behaving differently relative to Covid-19 exposure and this is manifested in the ages of those testing positive for the live virus.
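Continuing, the nominal mortality rate, $M_N$, is defined as:
$$M_N = P_0 M_0 + P_{10} M_{10} + \dots + P_{80} M_{80} \tag{2}$$
where $P_0$ through $P_{80}$ are the pre-Covid-19 fractions of the population in the decadal age groups 0 through 80 independent of time and place, and $M_0$ through $M_{80}$ are the age-group mortality rates of equation (1).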
Table 1 lists the number of people and the fraction of the population in each decadal age group in the US. These are the nominal population fractions used in equation (2).
Age Group | Number of People | Fraction of Population |
0 to 9 | 37518750 | 0.112 |
10 to 19 | 43211250 | 0.129 |
20 to 29 | 46057500 | 0.137 |
30 to 39 | 45798750 | 0.137 |
40 to 49 | 41917500 | 0.125 |
50 to 59 | 43987500 | 0.131 |
60 to 69 | 39588750 | 0.118 |
70 to 79 | 24063750 | 0.072 |
80 and over | 13196250 | 0.039 |
Table 1. US population by decades of age.
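The fractions in Table 1 follow directly from the counts in the same table. As a minimal sketch (the variable names are illustrative and not part of the original model), the following Python recomputes them and confirms that the rounded values match the third column:

```python
# Number of people per decadal age group, copied from Table 1.
population = {
    "0 to 9": 37518750,
    "10 to 19": 43211250,
    "20 to 29": 46057500,
    "30 to 39": 45798750,
    "40 to 49": 41917500,
    "50 to 59": 43987500,
    "60 to 69": 39588750,
    "70 to 79": 24063750,
    "80 and over": 13196250,
}

total = sum(population.values())

# Nominal fraction of the population in each age group.
fractions = {group: count / total for group, count in population.items()}

for group, fraction in fractions.items():
    print(f"{group:12s} {fraction:.3f}")
```

The fractions sum to 1 by construction, which is what allows them to act as weights in equation (2).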
With adequate data through
also vary with time and place. Here they are constants.
Again picking on the 80 and above crowd, we rewrite equation (2) as:
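$$M_N = M_{80}\left(P_0 \frac{M_0}{M_{80}} + P_{10} \frac{M_{10}}{M_{80}} + \dots + P_{70} \frac{M_{70}}{M_{80}} + P_{80}\right) \tag{3}$$
Here $M_N$ is the nominal mortality rate, $P_i$ the nominal population fractions, and $M_i$ the age-group mortality rates, so only $M_{80}$ and the ratios $M_i / M_{80}$ remain to be determined.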
To estimate the mortality rate by age group we make a key assumption: the ratio of the mortality rates of any two age groups is proportional to (and here taken as equal to) the ratio of the death rates of those same two age groups. Here, the death rate is the number of people who die from Covid-19 divided by the number of people who test positive for the live virus. Following is an example equation for one pair of age groups:
$$\frac{M_0}{M_{80}} = \frac{D_0 / C_0}{D_{80} / C_{80}} \tag{4}$$
where $M_i$ is the mortality rate of decadal age group $i$, $D_i$ is the number of deaths [7] for that age group through the whole of the pandemic, and $C_i$ is the number of confirmed positive tests [8, 9, 10] in the same age group through the whole of the pandemic.
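After calculating $M_{80}$, the mortality rate of the 80 and over group, we solve algebraically for the remaining age-group mortality rates, $M_0$ through $M_{70}$.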
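The solution step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the death and positive-test counts are made-up placeholders, and the ratio of mortality rates is taken as equal to the ratio of death rates. Only the population fractions (Table 1) and the nominal rate of 0.006 come from the text.

```python
# Nominal population fractions from Table 1 (0-9, 10-19, ..., 80 and over).
POP_FRACTIONS = [0.112, 0.129, 0.137, 0.137, 0.125, 0.131, 0.118, 0.072, 0.039]
NOMINAL_RATE = 0.006


def solve_age_mortality(deaths, cases, pop_fractions, nominal_rate):
    """Return one mortality rate per age group, oldest group last.

    Assumes the ratio of two groups' mortality rates equals the ratio of
    their death rates (deaths divided by positive tests).
    """
    death_rates = [d / c for d, c in zip(deaths, cases)]
    # Ratio of each group's death rate to the oldest group's (last ratio is 1).
    ratios = [r / death_rates[-1] for r in death_rates]
    # The population-weighted sum of rates must equal the nominal rate,
    # which fixes the oldest group's mortality rate.
    oldest_rate = nominal_rate / sum(p * r for p, r in zip(pop_fractions, ratios))
    # Every other rate follows from its ratio to the oldest group.
    return [oldest_rate * r for r in ratios]


# Hypothetical cumulative counts per age group, for illustration only.
deaths = [10, 20, 100, 300, 800, 2500, 6000, 10000, 20000]
cases = [50000, 100000, 300000, 300000, 300000, 300000, 250000, 150000, 150000]

rates = solve_age_mortality(deaths, cases, POP_FRACTIONS, NOMINAL_RATE)
for rate in rates:
    print(f"{rate:.5f}")
```

By construction, weighting the returned rates by the population fractions gives back the nominal rate, so the result is anchored to 0.006 regardless of the absolute size of the counts.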
Note that deaths are reported for the following age groups: under 1, 1 through 4, 5 through 14, 15 through 24, 25 through 34, 35 through 44, 45 through 54, 55 through 64, 65 through 74, 75 through 84, and 85 and over [7]. Positive test results are reported for the following age groups: 0 to 4, 5 to 17, 18 through 49, 50 through 64, and 65 and over [8, 9]. Where available data crosses decadal boundaries we distribute deaths and positive test results uniformly by year and move them into the appropriate decadal age group. One result of this process is that certain decadal age groups share the same number of deaths and positive test results even though this is certainly not the case. We need more granular, or at least consistent, data to improve the effective mortality rate estimate for any given time and place.
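The uniform redistribution can be sketched as follows. The function name, the example counts, and in particular the upper age of 89 used to close the open-ended "and over" groups are assumptions made for this illustration; the text does not say how open-ended groups were bounded.

```python
N_DECADES = 9  # 0-9, 10-19, ..., 70-79, 80 and over


def to_decades(buckets, top_age=89):
    """Spread (low_age, high_age, count) buckets uniformly by year of age.

    Ages are inclusive. A high_age of None marks an open-ended bucket,
    which is closed at top_age (an assumption of this sketch).
    """
    decades = [0.0] * N_DECADES
    for low, high, count in buckets:
        if high is None:
            high = top_age
        per_year = count / (high - low + 1)
        for age in range(low, high + 1):
            # Ages 80 and above all land in the last decadal group.
            decades[min(age // 10, N_DECADES - 1)] += per_year
    return decades


# Age ranges used for reported deaths [7]; the counts are hypothetical.
DEATH_RANGES = [(0, 0), (1, 4), (5, 14), (15, 24), (25, 34), (35, 44),
                (45, 54), (55, 64), (65, 74), (75, 84), (85, None)]
hypothetical_deaths = [20, 10, 15, 150, 800, 2000, 5500, 13000, 22000, 28000, 33000]

buckets = [(low, high, n) for (low, high), n in zip(DEATH_RANGES, hypothetical_deaths)]
deaths_by_decade = to_decades(buckets)
print([round(d, 1) for d in deaths_by_decade])
```

Applied to the positive-test group 18 through 49, this approach gives each of the 20 to 29, 30 to 39 and 40 to 49 groups the same 10/32 share of the count, which is the shared-value effect noted above.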
The assumption that the ratio of mortality rates for two age groups is proportional to the ratio of death rates for two age groups implies that people of all ages infected with the virus get tested at the same rate. For example, if 1 in 5 40-year-olds who are infected with Covid-19 get tested and test positive then 1 in 5 people of all other ages infected with the virus get tested and test positive. This assumption is surely not valid across all age groups. Nonetheless, we proceed with the expectation that the resulting age, time, and place-variable effective mortality rates will yield a better estimate of the prevalence of the virus than would a single mortality rate for all ages.
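To calculate the local mortality rate vs. time and place, all that remains is to calculate the fraction of the population in each age group exposed to the virus ($F_0$ through $F_{80}$) vs. time and place. We use positive test data [8, 9, 10] to determine $F_0$ through $F_{80}$ as follows:
$$F_i = \frac{C_i}{C_0 + C_{10} + \dots + C_{80}} \tag{5}$$
where $C_i$ is here the number of confirmed positive tests in decadal age group $i$ for the time and place in question.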
Note that in the death-based model the local mortality rate, in combination with daily deaths, determines the estimated prevalence of Covid-19. By using age group dependent positive test data to calculate the fraction of the population in each age group that is infected, we once again make the inaccurate assumption that people of all ages get tested at the same rate. We need better data to improve our estimate of the local mortality rate.
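Putting these pieces together, the following Python sketch computes exposed fractions from positive tests, the resulting local mortality rate, and the prevalence estimate obtained by dividing daily deaths by that rate. The function names and the positive-test counts are hypothetical; the age-group mortality rates are the values reported in Table 2.

```python
# Estimated mortality rates by decadal age group, from Table 2.
AGE_RATES = [0.00006, 0.00010, 0.00019, 0.00055, 0.0014,
             0.0042, 0.0112, 0.0262, 0.0496]


def exposed_fractions(positive_tests):
    """Fraction of confirmed positive tests falling in each age group."""
    total = sum(positive_tests)
    if total == 0:
        raise ValueError("no positive tests for this time and place")
    return [count / total for count in positive_tests]


def local_mortality_rate(positive_tests, age_rates=AGE_RATES):
    """Age-weighted mortality rate for one time and place."""
    fractions = exposed_fractions(positive_tests)
    return sum(f * m for f, m in zip(fractions, age_rates))


def estimated_prevalence(daily_deaths, mortality_rate):
    """Daily deaths divided by the mortality rate."""
    return daily_deaths / mortality_rate


# Hypothetical positive tests by age group for two regions in one week.
young_skewed = [300, 900, 2500, 2000, 1500, 1200, 800, 500, 300]
old_skewed = [50, 100, 500, 700, 900, 1500, 2000, 2200, 2050]

for name, tests in [("young-skewed", young_skewed), ("old-skewed", old_skewed)]:
    rate = local_mortality_rate(tests)
    print(name, round(rate, 5), round(estimated_prevalence(10, rate)))
```

For the same number of daily deaths, the region whose positive tests skew younger has a lower local mortality rate and therefore a higher estimated prevalence.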
Results
The methodology described above yields the estimated decadal mortality rates listed in Table 2. These decadal mortality rates are based on all US Covid-19 deaths and confirmed cases through a particular date. We update the decadal mortality rates weekly when new age-based data is published and then use the updated mortality rates in our model as constants vs. time and place. For the last several weeks these mortality rates have changed by at most +/-25% for persons 49 and below and less for persons 50 and above. Note that these estimated mortality rates are anchored to the nominal mortality rate, that is, the mortality rate across all ages in the population. Here the nominal mortality rate is equal to 0.006. The age group specific estimated mortality rates in Table 2 scale linearly with the nominal mortality rate.
Age Group | Estimated Mortality Rate |
0 to 9 | 0.00006 |
10 to 19 | 0.00010 |
20 to 29 | 0.00019 |
30 to 39 | 0.00055 |
40 to 49 | 0.0014 |
50 to 59 | 0.0042 |
60 to 69 | 0.0112 |
70 to 79 | 0.0262 |
80 and over | 0.0496 |
Table 2. Estimated mortality rates for the US by decades of age based on deaths data through 20-07-15.
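As a quick consistency check, weighting the rates in Table 2 by the population fractions in Table 1 should reproduce the nominal mortality rate. The sketch below does this and also illustrates the linear scaling with the nominal rate; the `rescale` helper and the alternative nominal rate of 0.003 are hypothetical and used only for illustration.

```python
# Fractions of the US population by decadal age group (Table 1).
POP_FRACTIONS = [0.112, 0.129, 0.137, 0.137, 0.125, 0.131, 0.118, 0.072, 0.039]
# Estimated mortality rates by decadal age group (Table 2).
AGE_RATES = [0.00006, 0.00010, 0.00019, 0.00055, 0.0014,
             0.0042, 0.0112, 0.0262, 0.0496]

# Population-weighted mortality rate across all ages.
nominal = sum(p * m for p, m in zip(POP_FRACTIONS, AGE_RATES))
print(round(nominal, 5))  # 0.00599, i.e. 0.006 up to rounding in the tables


def rescale(rates, old_nominal, new_nominal):
    """Scale age-group rates linearly to a different nominal rate."""
    return [m * new_nominal / old_nominal for m in rates]


# Hypothetical alternative: halve the nominal rate from 0.006 to 0.003.
halved = rescale(AGE_RATES, 0.006, 0.003)
print([round(m, 5) for m in halved])
```

The small gap between 0.00599 and 0.006 comes from the rounding of the published table values.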
To calculate the local mortality rate first calculate the Covid-19 exposed population fractions by age, time and place. The next 11 graphs present exposed population fractions at the time of writing. Recall that, for better or worse, we equate the exposed population fractions to the fraction of confirmed cases in a particular age group (5). The graphs below are labeled accordingly to make clear the origin of the numbers. The thin lines in the graphs below having constant values versus time represent the nominal fraction of the population for the age group with that same line color. Values prior to the end of March, 2020 are highly suspect due to limited test data.
Finally, the following graph presents the estimated local mortality rate (1) in the US as a function of time and place.
Additional Work
We suggest additional work by posing the following questions:
- How does Covid-19 prevalence actually vary with age?
- Are confirmed cases proportional to Covid-19 prevalence with the same multiplier for all age groups?
- Is there additional data available to improve the estimates presented here?
- How do local mortality rates change if we find a way to account for differing mortality rates vs. sex, race and ethnicity?
- What are the local mortality rates for other countries?
- At the date of writing, schools have reopened in the US with differing Covid-19 mitigation approaches. It seems that children are now exposed to Covid-19 at least as frequently as other age groups. Does the difference between the nominal age group fractions and the fractions of confirmed cases by age group tell us something about the variation in prevalence by age? We strongly suspect that it does and that Covid-19 is even more prevalent in the US than presented in our Data and Projections. Refer to COVID-19 Projections Using Machine Learning [11] for a different assessment of prevalence.
- How have treatment improvements changed local mortality rates? We have made an attempt to answer this question with publicly available data and we plan to document and share those results at a later date.
Additional Assumptions
Deaths and confirmed cases: Covid-19 death data from the Centers for Disease Control and Prevention (CDC) [7] lags the death data reported by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins [1]. In addition, the two sets of deaths data differ slightly when compared daily. To deal with the CDC reporting lag we use CDC age dependent deaths data only over a period of time in which CDC deaths reasonably match CSSE deaths. At the time of writing, that period ended on 20-07-15 (see Table 2).
Of more consequence is the fact that CDC age dependent positive test results data (aka confirmed cases) are substantially lower than confirmed case data reported by the CSSE. Our methodology for estimating local mortality rates requires age dependent Covid-19 data along with the best available data for the ratios in equation (4). Only CDC data is age dependent. To proceed we assume that CDC data accurately represents the age dependence of deaths and positive test results even if it underreports both. We then scale the CDC deaths and positive tests equally by age group to match the daily age independent CSSE data. Having done so, we believe that we are able to generate a reasonable estimate for those ratios, and thereby reasonable estimates of age dependent and local mortality rates. We would welcome better data and methodologies.
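The scaling step amounts to multiplying every age group by the same factor so that the age-dependent counts add up to the age-independent total. A minimal sketch, with a hypothetical function name and made-up counts standing in for one day of CDC and CSSE data:

```python
def scale_to_total(age_counts, reference_total):
    """Scale all age groups equally so they sum to reference_total.

    The age distribution is preserved; only the overall level changes.
    """
    total = sum(age_counts)
    if total == 0:
        # No age information to distribute; return zeros of the same length.
        return [0.0] * len(age_counts)
    factor = reference_total / total
    return [count * factor for count in age_counts]


# Hypothetical age-dependent positive tests (CDC-like) for one day.
cdc_like_cases = [40, 120, 400, 380, 350, 330, 250, 150, 130]
# Hypothetical age-independent total (CSSE-like) for the same day.
csse_like_total = 4300

scaled_cases = scale_to_total(cdc_like_cases, csse_like_total)
print([round(c, 1) for c in scaled_cases])
```

Because the factor is identical across age groups, each group's share of the total is unchanged by the scaling.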
Lab-test data: The CDC reports positive test results by age group for public, commercial, and clinical test labs [8, 9, 10]. Only the commercial lab data is fully subdivided by age and US region. The public labs data reports positive test results by age group for the US as a whole and positive tests by region independent of age. In an effort to get the largest, most representative data set, we distribute the public labs regional positive test data into age group buckets in proportion to the public labs US positive test data. The clinical labs positive test data is reported for the US as a whole and is independent of age. So as not to overlook any positive test data versus time, we distribute clinical labs positive tests between age groups and regions in proportion to the age group and region distribution found in the commercial lab data. Once again, we would welcome better data and methodologies.
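The proportional allocation described for the public labs data can be sketched as below. The function name, region labels and counts are hypothetical; the only idea taken from the text is that each region's age-independent total is split across age groups in proportion to the US-wide age distribution.

```python
def split_by_age(region_totals, us_age_counts):
    """Split each region's total across age groups using US-wide age shares."""
    us_total = sum(us_age_counts)
    if us_total == 0:
        raise ValueError("US-wide age counts are all zero")
    shares = [count / us_total for count in us_age_counts]
    return {
        region: [total * share for share in shares]
        for region, total in region_totals.items()
    }


# Hypothetical public-lab positive tests by region (age independent).
region_totals = {"Region 1": 200, "Region 2": 500}
# Hypothetical public-lab positive tests by age group for the US as a whole.
us_age_counts = [10, 30, 120, 110, 100, 90, 70, 40, 30]

by_region_and_age = split_by_age(region_totals, us_age_counts)
for region, counts in by_region_and_age.items():
    print(region, [round(c, 1) for c in counts])
```

A consequence of this choice is that every region inherits the same age distribution from this data source, so regional differences in age mix can only come from the commercial lab data.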
References
[1] https://raw.githubusercontent.com/CSSEGISandData/COVID19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_US.csv
[2] https://raw.githubusercontent.com/CSSEGISandData/COVID19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv
[3] Early estimation of the case fatality rate of COVID-19 in mainland China, https://pubmed.ncbi.nlm.nih.gov/32175421
[4] Estimating the Global Infection Fatality Rate of COVID-19, https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30246-2/fulltext
[5] The infection fatality rate of COVID-19 inferred from seroprevalence data, https://www.medrxiv.org/content/10.1101/2020.05.13.20101253v3
[6] Serology-informed estimates of SARS-CoV-2 infection fatality risk in Geneva, Switzerland, https://www.thelancet.com/pdfs/journals/laninf/PIIS1473-3099(20)30584-3.pdf
[7] https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-W/vsak-wrfu, Provisional_COVID-19_Death_Counts_by_Sex__Age__and_Week.csv
[8] https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/09182020/public-health-lab.html, public-health-lab.csv
[9] https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/09182020/commercial-labs.html, commercial-lab.csv
[10] https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/09182020/clinical-labs.html, clinical-labs.csv
[11] COVID-19 Projections Using Machine Learning, Youyang Gu, covid19-projections.com