

Lawler: Census 2020, Population Estimates, and Population Projections: Challenges for Demographers
CR Note: Demographics are useful in making economic projections, especially for housing. Demographic changes helped me call the bottom for rents in 2010 / 2011 and helped me predict the recent homebuying boom. For example, from 2016, see: Demographics: Renting vs. Owning.
In this note, housing economist Tom Lawler discusses some of the current challenges for demographers. This is a bit technical.
From Tom Lawler:
Each year Census produces estimates of the US resident population, both totals and characteristics of the population (age, sex, race/ethnicity, etc.). Population estimates are basically “walk-forwards” from a “base” population year, using estimates of births, deaths, and net international migration to produce population estimates for years subsequent to the “base” year.
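As a rough illustration of the “walk-forward” idea described above, here is a minimal Python sketch. The function name `walk_forward` and the input numbers are hypothetical round figures chosen for illustration; they are not Census data, and the actual Census methodology involves far more detail (age, sex, race/ethnicity, geography) than this single-total version.

```python
def walk_forward(base_population, annual_components):
    """Roll a base-year population forward one year at a time.

    annual_components is a sequence of (births, deaths, net_international_migration)
    tuples, one per year after the base year. Returns a list of population
    estimates starting with the base year itself.
    """
    estimates = [base_population]
    for births, deaths, nim in annual_components:
        # Next year's population = prior year + births - deaths + net migration
        estimates.append(estimates[-1] + births - deaths + nim)
    return estimates


# Hypothetical numbers in thousands, for illustration only (NOT Census data).
example = walk_forward(1000, [(12, 9, 3), (11, 10, 4)])
print(example)  # [1000, 1006, 1011]
```

The point of the sketch is simply that any error in the “base” population carries forward into every subsequent year's estimate, which is why the choice of base matters.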
The starting point for population estimates has always been the same date as that of the last decennial census, and historically Census has used the decennial census population results as the “base” population from which to generate population estimates for subsequent years. It has done so even when post-census analysis or demographic analysis has suggested that the “true” characteristics of the population were different from the decennial census results.
For the most recent Census population estimates, however, it was not possible to use decennial census results because processing/other issues significantly delayed the availability of the detailed census 2020 demographic data, which were just released last month. (There were a few other issues as well.)
In response, Census’ Population Estimates Program (PEP) developed a process to produce what it refers to as a “blended base” set of population estimates for April 1, 2020 based on three different data sources: (1) certain internal census 2020 data; (2) the 2020 demographic analysis (a different type of “walk-forward” that starts with births by birth year) for age and sex; and (3) the “vintage 2020” population estimates for the characteristics of the population at the national, state, and county levels. (See METHODOLOGY FOR THE UNITED STATES POPULATION ESTIMATES: VINTAGE 2022 for more details.)
PEP first implemented this new methodology in its “vintage 2021” population estimates, and then incorporated more information into its “vintage 2022” estimates, which were significantly different from those in vintage 2021. Here is how Census characterizes its “blended base” estimates for April 1, 2020:
“At the national level, then, it is accurate to say that resident, household, and group quarters (GQ) population totals are derived from the 2020 Census, age and sex detail is drawn from 2020 DA, and race and Hispanic origin detail comes from V2020.”
Here is a table showing estimates of the April 1, 2020 population by 5-year age groups from Census 2020, Vintage 2022 (the latest available), Vintage 2021, and the “middle” Demographic Analysis (DA) series. For those who wonder, DA estimates start with births in any given year, and assume that the population for any age equals births less cumulative deaths plus cumulative net international migration. The exception is for people born before 1945, where population estimates are based on Medicare enrollment.
Totals may not add up due to rounding.
Compared to the Decennial Census, Vintage 2022 estimates and DA estimates are much higher for young children, which is consistent with post-enumeration survey results. Vintage 2022 and DA estimates are also significantly lower for the 50-74 year age groups, also consistent with the post-enumeration survey results.
While not shown here, neither Vintage 2022 estimates nor DA estimates exhibit the “age-heaping” present in the Census 2020 results. Age-heaping is the statistical observation that the measured population “spikes” for ages ending in either a zero or a five. It is most prevalent when people report only an age and do not include a date of birth, especially with proxy reporting. Age-heaping has been evident in previous decennial censuses, but it was more pronounced in 2020.
One puzzling feature of the Vintage 2022 estimates for 4/1/2020, compared with either the Vintage 2021 or the DA estimates, is the estimated population of 18- to 21-year-olds. Here is a table showing the various estimates of the 18–21-year-old population for 4/1/2020.
As the table shows, the Vintage 2022 18–21-year-old population for 4/1/2020 is 5.8% higher than that for Vintage 2021, and 5.5% higher than DA estimates. In looking at births over the 2001-05 period, as well as cumulative deaths for those birth cohorts, the Vintage 2022 estimates imply what in my view are unrealistically high cumulative net international migration numbers for those age groups. Stated another way, they look “off.”
In terms of the population from 0 to 74, the Vintage 2021 population estimates “fit” the middle DA estimates significantly better than do the Vintage 2022 estimates. (Recall that the DA estimates for 75+ year olds are based on Medicare enrollment data, and not births, cumulative deaths, and cumulative NIM.) As such, one could make a case that the Vintage 2021 estimates for 4/1/2020 might be a better “base” for subsequent-year population estimates. One could also make a case to use just the DA estimates, though I don’t think those are available at the state or county level, and they probably aren’t available by race/ethnicity. It’s also a bit of a pain to use a different “base” and then construct one’s own 2021 and 2022 population estimates. But once one decides not to go with decennial Census results (and there are valid reasons not to), then …
Be that as it may, the latest “official” population estimates for 2020, 2021, and 2022 are from Vintage 2022, released by age this year. For its upcoming population projection update (tentatively scheduled for the fall, and the first population projection update since 2017!), Census will be using the Vintage 2022 estimates as its starting point.
I’ll have more on this topic later, including how I hope to update my short-term population projections.
Appendix: Demographic Analysis
In its Demographic Analysis, Census looks at births by year, cumulative deaths of people born in that year, and estimated cumulative net international migration for people born in that year to get an estimate of the population at a given age. For example, in the 2020 DA “middle scenario” analysis, there were 4.015 million US births from 4/1/2001 to 3/31/2002; there were an estimated 43,000 deaths through 3/31/2020 of people born from 4/1/2001 to 3/31/2002; and cumulative net international migration (NIM) through 3/31/2020 of people born from 4/1/2001 to 3/31/2002 was an estimated 337,000 (this latter estimate is subject to a high degree of error). If all of these estimates were correct, then the population of 18-year-olds as of 4/1/2020 would be 4,015,000 (births) less 43,000 (deaths) plus 337,000 (NIM), or 4.309 million. (The DA estimate is actually 4.308 million due to rounding.) The Vintage 2022 estimate for the 18-year-old population as of 4/1/2020 was 4.49 million. If the birth and death estimates cited above were correct, that would imply that cumulative NIM for this age cohort was 518,000, which appears to be implausibly high.
An even easier example is the under-1-year-old population. CDC estimates that there were about 3.737 million births from 4/1/2019 to 3/31/2020, and there were about 20,000 deaths of under-1-year-olds over that period. As such, excluding NIM, the estimated under-1-year-old population as of 4/1/2020 would be about 3.717 million. Census 2020 “counted” only 3.48 million under-1-year-olds. If all these numbers were accurate, that would imply that over the 12 months ended 3/31/2020 there was a net international EMIGRATION of 237,000 under-1-year-olds, a preposterous figure.
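The arithmetic in the two examples above can be checked with a short Python sketch. The function names (`cohort_population`, `implied_nim`) are hypothetical helpers introduced here for illustration; all inputs are the figures quoted above, expressed in thousands so the calculation stays in whole numbers.

```python
def cohort_population(births, deaths, nim):
    """DA-style cohort estimate: births less cumulative deaths plus cumulative NIM."""
    return births - deaths + nim


def implied_nim(population, births, deaths):
    """Net international migration implied by a population figure,
    taking the birth and death numbers as given."""
    return population - (births - deaths)


# All figures in thousands, taken from the text above.

# 18-year-olds as of 4/1/2020 (born 4/1/2001 to 3/31/2002)
age18_da = cohort_population(births=4015, deaths=43, nim=337)
age18_implied = implied_nim(population=4490, births=4015, deaths=43)

# Under-1-year-olds as of 4/1/2020 (born 4/1/2019 to 3/31/2020)
under1_implied = implied_nim(population=3480, births=3737, deaths=20)

print(age18_da)        # 4309 -> about 4.309 million
print(age18_implied)   # 518  -> NIM implied by the Vintage 2022 estimate
print(under1_implied)  # -237 -> net emigration implied by the Census 2020 count
```

A negative result from `implied_nim` means the population figure is lower than births less deaths, i.e. it could only be reconciled by net emigration.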
Note: This was from housing economist Tom Lawler.