South Africa's 2022 Census May Not Be Accurate Enough for Official Use - Demographers Explain What Went Wrong


The 2022 South African census set an undesirable record: it has the highest undercount of any census for which the undercount has been measured and reported by the United Nations Population Division. The reported undercount of 31% is some 10 percentage points higher than the previous highest reported undercount (in Comoros in 2017).

Although the results from the 2022 South African census, released in October 2023, were adjusted for the undercount, an adjustment of this size means the results are more estimates than counts. It has also produced a number of anomalies in the census data. These call the usefulness of the data into question.

A census is, primarily, as accurate a count as possible of the number of people in a country at a point in time. It attempts to describe the demographic and socioeconomic characteristics of the population. It also provides key benchmark estimates of fertility and mortality of the population.

Census data are also crucial for planning investment, and in determining the allocation of resources by both public and private sector entities. In particular, a census provides information about small area populations which is usually not available from other sources.

Decisions on where to build houses, schools, infrastructure and factories are shaped by census data.

Census data are also used as a sampling frame in other surveys. In South Africa these include the Quarterly Labour Force Survey, national poverty lines and burden of disease studies. These surveys provide more detailed information on the country's population, track the progress made in addressing socioeconomic disparities and provide the denominators for a large number of indicators used to track the Sustainable Development Goals.

We are demographers with long experience in the analysis of census data from many countries, and in particular with data from every post-apartheid South African census. In a technical report recently published by the South African Medical Research Council, we highlight a number of operational and logistical difficulties encountered in planning and running South Africa's 2022 census.

We conclude that these difficulties render the census data collected unfit for purpose. We recommend that the results be used with extreme caution in planning and resource allocation until thorough investigations are made possible by Statistics South Africa.

Identified weaknesses

Rigorous statistical procedures exist to produce final estimates of the population correcting for the undercount using a Post-Enumeration Survey (PES). This is a small-scale survey conducted soon after the census date that seeks to identify who was, and was not, counted in the census.
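
To give a sense of how a PES can be turned into an undercount estimate, the sketch below applies the textbook dual-system (capture-recapture) estimator. It is an illustration only. The estimator Statistics South Africa actually used is not described here, the function names are hypothetical, and the counts are invented so that they reproduce a 31% undercount. The sketch also assumes that being counted in the census and being counted in the PES are independent, and that people are matched between the two without error.

```python
def dual_system_estimate(census_count, pes_count, matched):
    """Textbook dual-system estimate of the true population in the PES areas.

    census_count: people enumerated by the census in the PES sample areas
    pes_count:    people enumerated by the PES in the same areas
    matched:      people found in both the census and the PES
    """
    if matched <= 0:
        raise ValueError("at least one matched person is needed")
    return census_count * pes_count / matched


def undercount_rate(census_count, estimated_true):
    """Share of the estimated true population that the census missed."""
    return 1 - census_count / estimated_true


# Invented counts for illustration only (not figures from the 2022 census).
census_count = 6_900
pes_count = 7_000
matched = 4_830

estimated_true = dual_system_estimate(census_count, pes_count, matched)
rate = undercount_rate(census_count, estimated_true)

print(f"Estimated true population in sample areas: {estimated_true:,.0f}")
print(f"Estimated undercount: {rate:.0%}")
```

With these invented counts the estimate is 10,000 people, of whom the census counted 6,900, giving an undercount of 31%. The fewer people who can be matched, the more the result depends on the independence and matching assumptions.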

Based on the PES conducted in the second half of 2022, Statistics South Africa estimates that the census undercounted the population by 31%. The extent of the undercount was revealed when the census results were released on 10 October 2023. But the undercount was not highlighted by Statistics South Africa. Nor did it attract widespread public attention.

Among our most important findings are that:

  • Based on demographic reconstruction, the national population of 62 million, after adjustment for the estimated undercount, might have been overestimated by around 1 million people (or under 2%). Half of this excess is attributable to overestimates of the Indian/Asian and white population groups, where the estimated undercount exceeded 60% in each group.
  • The excess is concentrated in those aged 50 and over. It cannot plausibly be attributed to net immigration at these ages.
  • There is a significant undercount, even after adjustment, of children aged 5 at their last birthday.
  • There are a number of anomalies in the national and provincial population estimates by age, sex and population group that are inconsistent with data from previous censuses and national vital registration data.
  • The estimates of population numbers at district and municipal levels are highly inconsistent with estimates from a number of other data sources. These include Statistics South Africa's own mid-year population estimates, and data from the voters' roll from the local government elections.
  • There are demographic and statistical anomalies in the adjustments for the undercount used to produce the final population estimates from the enumerated population. These imply a much greater degree of confidence and certainty about the estimates than is reasonably possible.
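
The arithmetic behind these figures can be checked with a short sketch. It uses only the rounded numbers quoted above and assumes that the undercount rate is expressed as the share of the adjusted population that was missed. The variable and function names are hypothetical, and the implied enumerated count is a back-of-envelope value, not a published figure.

```python
ADJUSTED_POPULATION = 62_000_000    # adjusted national total quoted above
NATIONAL_UNDERCOUNT = 0.31          # reported national undercount
POSSIBLE_OVERESTIMATE = 1_000_000   # possible excess from demographic reconstruction


def adjustment_factor(undercount):
    """Number of people each enumerated person represents after adjustment.

    Assumes undercount = 1 - enumerated / adjusted.
    """
    return 1 / (1 - undercount)


# Enumerated count implied by the rounded figures (an approximation).
implied_enumerated = ADJUSTED_POPULATION * (1 - NATIONAL_UNDERCOUNT)

# Possible overestimate as a share of the adjusted total.
overestimate_share = POSSIBLE_OVERESTIMATE / ADJUSTED_POPULATION

national_factor = adjustment_factor(NATIONAL_UNDERCOUNT)
group_factor = adjustment_factor(0.60)  # lower bound where the undercount exceeded 60%

print(f"Implied enumerated population: {implied_enumerated / 1e6:.1f} million")
print(f"Possible overestimate: {overestimate_share:.1%} of the adjusted total")
print(f"National adjustment factor: {national_factor:.2f}")
print(f"Adjustment factor at a 60% undercount: {group_factor:.2f}")
```

Under these assumptions, roughly 42.8 million people were enumerated, and each counted person stands for about 1.45 people in the adjusted total. Where the undercount exceeds 60%, each counted person stands for at least 2.5 people, so any error in the estimated undercount is magnified accordingly.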

Taken together, these findings call into question the reliability of the 2022 South African census data as a source for planning and resource allocation, in particular the Equitable Share Formulae. These are used by the national treasury to determine apportionment of budgets to provinces, districts and municipalities.

What happened?

At this point, and given the paucity of information in the public domain, it is hard to pin down precisely why the undercount in the 2022 census was nearly twice that encountered in 2011.

However, the technical report suggests a range of contributory factors.

  • Attempting to run a census in the middle of the COVID-19 pandemic complicated planning and operations. This decision appears to have been forced by the national treasury's refusal to allow the budget for the census to be held over to the next fiscal year.
  • Delays in recruiting and training field staff to conduct in-person data collection. The original intent had been to collect as much data as possible from households online using a web portal. But a massive fieldwork operation had to be mounted at short notice.
  • Repeated extensions to the period during which data were collected. This was particularly true in the Western Cape, where enumeration finally ended nearly four months after the census date.
  • Problems with the Post-Enumeration Survey used to derive the adjustments that produce the final population estimates. First, the survey was delayed, which meant attempting to match household membership many months after the census date. Second, the scope and size of the survey were far too small to adjust the enumerated population accurately for the undercount identified. Third, there are inexplicable statistical anomalies in the results from the survey and the adjusted census results.
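
Why the size of a survey matters can be illustrated with the standard margin-of-error formula for a proportion. This is a simplified sketch, not the variance calculation used for the PES. The sample sizes are hypothetical (the actual PES sample size is not given here), the function name is ours, and the optional design effect `deff` is an assumed multiplier standing in for the extra variance of a clustered sample.

```python
import math


def undercount_margin(p, n, deff=1.0, z=1.96):
    """Approximate 95% margin of error for an estimated undercount rate.

    p:    estimated undercount rate (e.g. 0.31)
    n:    number of people in the survey sample (hypothetical here)
    deff: assumed design effect; 1.0 means simple random sampling
    z:    normal quantile for the confidence level (1.96 for 95%)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(deff * p * (1 - p) / n)


# Hypothetical sample sizes, e.g. for one small area versus a whole province.
for n in (500, 5_000, 50_000):
    margin = undercount_margin(0.31, n)
    print(f"n = {n:>6,}: 31% +/- {margin * 100:.1f} percentage points")
```

Under simple random sampling the margin shrinks only with the square root of the sample size, so a hundred times as many respondents are needed to make an estimate ten times as precise. A survey large enough to pin down the national undercount can therefore still leave the adjustments for small areas and small population groups highly uncertain.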

Implications - and what needs to be done

The technical report concludes that the data from the 2022 South African census should be used with extreme caution until thorough investigations are made possible by Statistics South Africa. Such investigations would also require the release of the census data collected on fertility, mortality and migration. To date these have not been released.

Pending those investigations, Statistics South Africa is also strongly advised not to base its annual series of projected mid-year population estimates on the results of the 2022 census.

Another census is unlikely to be conducted before 2031. This means that an alternative set of population estimates by age, sex and population group is urgently required. These alternative estimates may more accurately describe the South African population, and provide a better basis for resource allocation and planning.

Tom Moultrie, Professor of Demography, University of Cape Town

Rob Dorrington, Professor Emeritus, University of Cape Town

