Availability: Population data

The following indicator is under consideration for this pilot edition of the Barometer: To what extent is population information available as open data?

Feedback on draft Global Data Barometer Indicators

You are looking at a draft indicator to be included in the expert survey of the Global Data Barometer. Between now and May 10th we are inviting your feedback on this indicator and the elements it contains. You can provide your feedback by (a) completing the feedback form below; or (b) adding in-line annotations.

Feedback form

You can share your feedback on the Availability: Population indicator here, or make use of Hypothes.is annotations

Existence

  • Is this data available online in any form?
    • Data is not available online
      Supporting questions: Are there other offline ways to access this data in the country? (e.g. attending an office to inspect it)
    • Data is available, but not as a result of government action
      Supporting questions: If government is not providing access to data, how is this data available? Please provide a URL for where this data can be found
    • Data is available from government, or because of government actions
      Supporting questions: Please provide a URL for where this data can be found

Elements

Part 1: Data structure and openness.

  • Data is timely and updated. (No, Partially, Yes)

    Supporting questions (conditional)

    When was the most recent update to this dataset?

  • Dataset is available free of charge. (No, Partially, Yes)

  • Data is openly licensed. (No, Partially, Yes)

    Supporting questions (conditional)

    If No: If there are explicit restrictions placed on re-use of the dataset, briefly describe those here.

    If Partially or Yes: If the data is provided with an explicit open license, please provide the name of the license, or a link to it here.

  • Data is available in all the country’s official or national languages. If the country has no official or national languages, data is available in the major languages of the country. (No, Partially, Yes)

    Supporting questions (conditional)

    If Partially or Yes: Please briefly describe the language coverage available.

  • Data is provided in machine-readable format(s) (No, Partially, Yes)

    Supporting questions (conditional)

    If Partially or Yes: Please provide a URL where this machine-readable data can be found. (Additional URLs can be included in the justification and supporting evidence)

    If Partially or Yes: Please provide a comma-separated list of the formats available (e.g. csv, json).

    If Partially: What prevents you from assessing this data as fully machine-readable?

  • The machine-readable dataset is available as a whole. (No, Partially, Yes) Answer No if it's only possible to access individual records; answer Partially if it's possible to export extracts of the data; answer Yes if there are bulk downloads or APIs providing access to the whole dataset without financial, technical or legal barriers.

    Supporting questions (conditional)

    If Partially or Yes: Please provide a URL where bulk download access is available or described.

    If Partially or Yes: If bulk access is provided through an API, please provide a link to where the API is described.

Part 2: Data fields and quality assessment.

  • Data includes information about population totals. (No, Partially, Yes) Some countries use projections to update their population totals between censuses. Here we're looking for population estimates based on recently gathered evidence.

  • Data includes information about de facto population estimates. (No, Partially, Yes) The de facto population is a concept under which individuals (or vital events) are recorded (or are attributed) to the geographical area where they were present (or occurred) at a specified time. It contrasts with the de jure population, under which individuals (or vital events) are recorded (or are attributed) to a geographical area on the basis of the place of residence. (Definitions from OECD's Glossary of Statistical Terms)

  • Data includes information about coverage error and the methods for calculating such error. (No, Partially, Yes)

  • The data integrates geospatial information. (No, Partially, Yes)

  • Data includes information about population projections. (No, Partially, Yes)

  • Historical data is available, to track change over time. (No, Partially, Yes)

Part 3: Data disaggregation or differentiation.

  • Data includes information about individuals' sex or gender. (No, Partially, Yes)

    Supporting questions (conditional)

    If Partially or Yes: Please briefly describe what data includes sex or gender information.

  • Data includes information differentiated by age of individuals. (No, Partially, Yes)

  • Data includes information differentiated by place of birth. (No, Partially, Yes) For individuals born inside the country, place of birth refers to state, province, or other relevant civil division. For individuals born outside the country, place of birth refers to the country in which they were born.

  • Data includes information differentiated by place of usual residence. (No, Partially, Yes) For census purposes, usual residence is defined as the place at which a person lives at the time of the census and at which they have been living for “some time” or intend to stay for “some time.”

  • Data includes information differentiated by migrant status. (No, Partially, Yes)

  • Data includes information differentiated by level of education. (No, Partially, Yes)

  • Data includes information differentiated by level of income. (No, Partially, Yes)

  • Data includes information differentiated by occupation. (No, Partially, Yes)

Extent

  • How comprehensive is the data assessed for this question?
    • The data assessed covers one or more localities, but there are many other localities without available data, or with data of a lesser quality.
      Supporting questions: Which locality does this data cover?
    • The data assessed covers one or more localities, and is a representative example of the kind of data that can be found for all, or most, localities.
    • The data assessed provides national coverage.

Definitions and Identification

Population data should include estimates of the country's de facto population. A de facto population is a concept under which individuals (or vital events) are recorded (or are attributed) to the geographical area where they were present (or occurred) at a specified time. It contrasts with the de jure population, under which individuals (or vital events) are recorded (or are attributed) to a geographical area on the basis of the place of residence. (Definitions from OECD's Glossary of Statistical Terms)

Further, population data should include information about coverage errors and how these have been calculated, integrate geo-references, and provide a sense of population change over time through both historical data and projections. Finally, population data should be meaningfully disaggregated to the lowest possible level compatible with maintaining privacy.

Population data may be generated in various ways, including:

  • censuses, usually conducted every 10 years;
  • continuous population registers, such as birth, marriage, and death registers, or an integrated civil registration and vital statistics (CRVS) system;
  • administrative systems such as tax records or databases for administration of social programmes;
  • population surveys based on a sample of the overall population;
  • novel data collection approaches such as estimates derived from remote sensing or big data.

Start with your country's national statistics office; data about population size and details is commonly published there. Sometimes, however, that data will be too old to fall within the scope of this study, so please check sources and dates carefully. This might happen, for example, if the source of the data is a census that hasn't been conducted recently and the data hasn't otherwise been updated through other means of data collection. For cases where the most up-to-date population data is organized sub-nationally, please see the guidance below.

Starting points

  • Sources:

    • The 2020 World Population and Housing Census Programme lists latest census dates for all countries and includes links to where the data is published for many of them.
    • The Centre of Excellence for CRVS Systems hosts profiles for 27 countries around the world, which include detailed information about CRVS systems, what they contain, and which agencies manage them.
    • UNICEF hosts profiles of CRVS systems for countries in sub-Saharan Africa, drawing on information from 2016–2017; while profiles don't link directly to relevant datasets, they offer insights into what data should be collected in your country and what agencies may be involved.
    • The OECD provides historical population data for 28 countries, located under the 'population statistics' sub-category. To identify the national source of the data, see table A.1 in the related methodological document.
  • Search:

    • The website of your country's national statistics office, census bureau, or civil registration and vital statistics agency.
    • Latest population surveys conducted by central or local public administrative units.
    • Dashboards and other public analytic tools may help you to assess the comprehensiveness and coverage of the population data.
  • Consult:

    • National or local government officials who manage censuses, registration of vital statistics, or population surveys.
    • Officers of civil society organizations dedicated to migrant or refugee welfare.
    • Journalists who report on the changing demographics of the population; immigration, emigration, or diasporic issues.
    • Academics who study the sociological or economic effects of population change.

What to look for?

To complete the assessment for this question you will need to access and explore the available data. This may involve running queries on datasets to check what they include.

Look for evidence of:

  • Population total: Check for estimates of population size. Are these estimates based on recent data or are they projections based on older census data?
  • Coverage errors: Does the data include information about coverage errors and the methods used to calculate such errors?
  • De facto estimates: Do estimates record the de facto population? Or do they record the de jure population or something else?
  • Population change over time: Check for historical data and population projections. The latter will provide information about the future size and structure of a population, for example, by sex and age.
  • Population data disaggregation: How is the data disaggregated or differentiated? Is it, for example, disaggregated by gender, age, place of birth, residence, migrant status, education, income, and occupation?
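
As a rough aid for the disaggregation check above, the following Python sketch scans the header row of a CSV extract for column names that hint at each dimension. It is only an illustration: the keyword lists, the `find_dimensions` and `read_header` helpers, and the sample data are all assumptions made for this example, and real datasets will label their columns differently. A keyword match is a prompt to look closer, not evidence that the data is meaningfully disaggregated.

```python
import csv
import io

# Hypothetical keyword lists: real datasets name their columns differently,
# so adapt these to the dataset being assessed.
DIMENSION_KEYWORDS = {
    "sex or gender": ["sex", "gender"],
    "age": ["age"],
    "place of birth": ["birth"],
    "usual residence": ["residence"],
    "migrant status": ["migrant", "migration"],
    "education": ["education"],
    "income": ["income"],
    "occupation": ["occupation"],
}


def find_dimensions(header):
    """Map each dimension to the column names that appear to cover it.

    Matching is a simple substring test, so it can produce false positives
    (for example, "age" also matches "percentage").
    """
    found = {}
    for dimension, keywords in DIMENSION_KEYWORDS.items():
        found[dimension] = [
            column for column in header
            if any(keyword in column.lower() for keyword in keywords)
        ]
    return found


def read_header(csv_text):
    """Return the first row of a CSV document as a list of column names."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


# Invented sample standing in for a downloaded extract.
SAMPLE = "region,year,sex,age_group,population\nNorth,2020,F,0-4,1200\n"

report = find_dimensions(read_header(SAMPLE))
for dimension, columns in report.items():
    status = ", ".join(columns) if columns else "not found"
    print(f"{dimension}: {status}")
```

Dimensions reported as "not found" may still be present under other labels, in a separate table, or in a codebook, so the output should be confirmed by inspecting the dataset and its documentation directly.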

National and sub-national considerations

Population data is typically published at the national level, by national statistics offices. Sub-national considerations, however, may arise, particularly with regard to the recency and completeness of population data.

To address this possibility, focus on national government first, and then assess whether:

  • National datasets also include data from sub-national or local government units;
  • Equivalent data exists for a selection of sub-national or local government units, but is not nationally aggregated.

To assess countries where the most up-to-date population data is organized sub-nationally, researchers should select the strongest example of sub-national practice, and then indicate whether this is an outlier, or an example of widespread practice.

The availability of accurate and reliable information about a country's population is critical for understanding a country's shifting demographics, and serves as the foundation for evidence-based policy and strategic planning. In 2015, the UN Commission on Population and Development decided to focus its forty-ninth session on "Strengthening the demographic evidence base for the post-2015 development agenda." A subsequent report, prepared by the Population Division of the Department of Economic and Social Affairs, underscored that "Improving the reliability, timeliness, and accessibility of demographic data needs to be a central focus of any effort to strengthen statistical systems for monitoring the Sustainable Development Goals." (2016: 1-2)

Population data not only describes the size of a country's population; it also grounds future projections and helps its users identify inequities and track the effects of interventions designed to address them. However, as Pelletier (2020) notes, there remain "no internationally agreed standards for producing annual population statistics" (8), a problem identified by the United Nations Task Team on Population Estimates in 2008; further, undercounts are common.

Population data becomes more useful when it is differentiated across various dimensions, including age, sex, residence, education, labor force status, occupation, migration status, and other attributes. At the same time, great care must be taken to protect individuals' and communities' privacy.

Population data and migration data intertwine: fundamentally, populations change via birth, death, and migration. Objective 1 of the Global Compact for Safe, Orderly, and Regular Migration expresses the collective commitment of the member states "to collect and utilize accurate and disaggregated data as a basis for evidence-based policies," committing to "improving and investing in the collection, analysis, and dissemination of accurate, reliable, comparable data, disaggregated by sex, age, migration status, and other characteristics relevant in national contexts, while upholding the right to privacy under international human rights law and protecting personal data" (article 17). Along similar lines, in support of SDG 10 (reducing inequalities within and among countries), indicator 10.7.2 examines national migration policies.