Global Data Barometer Handbook

You are viewing the Global Data Barometer Research Handbook (2021).

Introduction

Data is a source of power. It can be exploited for private gain and used to limit freedom, and it can be deployed as a public good—a resource for tackling social challenges, enabling collaboration, driving innovation, and improving accountability.

The Global Data Barometer aims to fill critical knowledge gaps on how the use of data is evolving across different sectors, regions, and countries around the world. The Barometer's design thus investigates how the policies and practices of governing, releasing, and using data for the public good are unfolding across the globe.

Building on the Open Data Barometer, which has been used to drive policymaking, advocacy, and academic research around the world, the Global Data Barometer asks the question: To what extent are countries managing data for the public good?

To answer this question, the Barometer incorporates both quantitative and qualitative assessments, drawing on primary data collected through an expert survey implemented by a network of global partners and researchers, as well as data from existing secondary sources and a complementary government survey.

Primary data collection is structured around a selection of country-wide indicators which seek to measure progress in terms of data governance, capability, availability, and use and impact. In addition, specific thematic indicators focus on the state of data for public good in relation to specific sectors or areas of public policy. This combined breadth and depth approach is designed to:

  • Review and assess the overall environment for data for the public good in each country studied;
  • Explore whether assessments of the overall environment reflect existing data practices in particular thematic sectors;
  • Ensure that the collection of indicator data supports reuse in companion products developed by project partners and maximizes the potential reuse value of the data collected within the broader research and policy community.

This handbook contains the primary indicators included in our expert survey. Prospective indicators drawing from secondary data sources will be released later in 2021. The handbook is a living document, to be updated as the pilot edition moves forward in its study of data for the public good.

2021 Pilot Edition

For the 2021 pilot edition we are combining three approaches to understand data practices and policies around the world:

  • We are investigating core data governance and data capabilities;
  • We are doing a deep dive into a larger data ecosystem with well-established data practices: the intersection of money, property, and power;
  • We are exploring two areas of urgent global concern where changed data practices are increasingly called for: climate action, and health & COVID-19.

The 2021 pilot edition of the Global Data Barometer also explores several research innovations that depart from its predecessor study:

  • An expanded scope that moves beyond the limited open data focus of past indicators to examine data for the public good more broadly;
  • New indicator designs and research guidance;
  • New partnership models for developing and using thematic indicator data;
  • New approaches to fieldwork through a network of regional hubs in order to benefit from localized expertise.

Design Principles

The Barometer’s structure, components, and weighting respond to a specific set of design principles.

Flexibility for government structure

While some countries set policy nationally, many operate within federal systems in which aspects of data policy, capability, availability, and use are shaped by sub-national governments. Indicators have been designed to accommodate this reality. In federal systems, researchers can provide a detailed assessment for a single sub-national context and indicate whether or not it is representative of other sub-national contexts.

Universality

The highest scores in the Barometer are achieved when governance, capability, availability, and use can be shown to be universal—when everyone in the country is covered by, or protected by, governance rules; when everyone has access to capabilities or the development of capabilities; when everyone has access to meaningful data; and when data use has impact for the public good across the country.

Bright spots design

The Barometer has adopted a bright spots design approach with the intention of collecting data on leading examples of good practice, even if the practice is not yet universal within a country.

The Barometer Structure

The overall structure of the Barometer was developed following a participatory design workshop, which identified four pillars or core components, a range of potential themes, and a number of cross-cutting issues. The following description offers a snapshot of how the study’s structure addresses all of these; you can read more about the Barometer’s structure and process in the methodology section of the handbook.

Pillars

The Barometer is organized around four pillars or core areas of assessment: governance, capability, availability, and use and impact.

Themes

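The 2021 pilot edition examines seven themes:

  • Company information;
  • Land;
  • Political integrity;
  • Public finance;
  • Public procurement;
  • Climate action;
  • Health & COVID-19.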

These themes allow us to examine data for public good in significantly different kinds of data ecosystems and to speak to a range of government functions and sustainable development goals, while benefiting, as well, from the deep expertise of partners.

Modules

Modules, divided into core modules and thematic modules, implement the Barometer’s pillars and themes and structure the survey for presentation to researchers, reviewers, and users of GDB data. Modules vary in scope and size; weights applied during index calculations will balance their influence on the overall Barometer score.
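The balancing role of weights can be illustrated with a minimal sketch. It assumes a simple normalized weighted average. The function name, module scores, and weights below are hypothetical and are not the Barometer's actual index calculation, which is described in the methodology section of the handbook.

```python
from typing import Dict


def weighted_score(module_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-module scores into one score using normalized weights.

    Illustrative only: this is an assumed weighted average, not the
    Barometer's published formula.
    """
    missing = set(module_scores) - set(weights)
    if missing:
        raise ValueError(f"no weight for modules: {sorted(missing)}")
    total_weight = sum(weights[m] for m in module_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Dividing by the total weight keeps the result on the same scale as the inputs.
    return sum(module_scores[m] * weights[m] for m in module_scores) / total_weight


# Hypothetical scores (0-100) and weights, for illustration only.
scores = {"governance": 60.0, "capability": 40.0, "land": 80.0}
weights = {"governance": 2.0, "capability": 2.0, "land": 1.0}
overall = weighted_score(scores, weights)
print(overall)
```

Because the weights are normalized, a module with many indicators does not dominate the overall score simply by being larger; its influence is set by its weight.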

Core modules correspond to data governance and capability, two of the Barometer’s pillars. These standalone modules contain indicators designed to provide a country-wide assessment of two of the most critical issue areas for developing an effective data ecosystem. Additional indicators on governance and capability in the thematic modules complement these core modules. The other two pillars of the Barometer, availability and use and impact, are assessed through the thematic modules.

Thematic modules investigate the interaction of governance, capability, availability, and use in specific domains or public policy areas.

Five thematic modules examine data for the public good related to money, property, and power; these five are organized around:

  • Company information;
  • Land;
  • Political integrity;
  • Public finance;
  • Public procurement.

While each of these themes has its own data particularities, they intersect with regard to anti-corruption, integrity, and accountability. In these thematic modules we ask a mixture of questions related to the four pillars of the Barometer.

Our other two thematic modules, climate action and health & COVID-19, are areas that are globally urgent, but often lack open and locally relevant data. Consequently, for these we have designed our indicators to focus on the pillar of availability.
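One way to picture the relationship between modules and pillars is as a simple lookup table. The sketch below is an assumption-laden simplification: it treats each of the five money, property, and power modules as touching all four pillars (the handbook says only that they ask "a mixture of questions" across the pillars), and the variable and function names are hypothetical.

```python
from typing import Dict, List, Tuple

PILLARS: Tuple[str, ...] = ("governance", "capability", "availability", "use and impact")

# Core modules each correspond to a single pillar.
CORE_MODULES: Dict[str, Tuple[str, ...]] = {
    "governance": ("governance",),
    "capability": ("capability",),
}

# Thematic modules. Mapping the first five to all four pillars is a
# simplifying assumption made for this sketch.
THEMATIC_MODULES: Dict[str, Tuple[str, ...]] = {
    "company information": PILLARS,
    "land": PILLARS,
    "political integrity": PILLARS,
    "public finance": PILLARS,
    "public procurement": PILLARS,
    "climate action": ("availability",),
    "health & COVID-19": ("availability",),
}


def thematic_modules_for_pillar(pillar: str) -> List[str]:
    """Return the thematic modules that include indicators for a pillar."""
    if pillar not in PILLARS:
        raise ValueError(f"unknown pillar: {pillar}")
    return [name for name, pillars in THEMATIC_MODULES.items() if pillar in pillars]


print(thematic_modules_for_pillar("availability"))
print(thematic_modules_for_pillar("use and impact"))
```

Under these assumptions, availability is assessed in all seven thematic modules, while use and impact is assessed only in the five money, property, and power modules.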

Cross-cutting Issues

The Barometer’s indicators are also designed to provide insight into a number of cross-cutting concerns: equity and inclusion, COVID-19, emerging AI practices, open data, and data as a critical tool to advance development and meet SDGs. These cross-cutting concerns are reflected variously in the choice of indicators, the sub-questions that indicators contain, and the wording and scoring of indicators. The following sections describe specific ways the Barometer addresses equity and inclusion, COVID-19, and emerging AI practices.

Equity and Inclusion

For the pilot edition of the Barometer, we are testing a variety of methods for investigating equity and inclusion in data policies and practices, drawing on both dedicated indicators and sub-questions used across modules.

One group of indicators and sub-questions examines equity and inclusion issues tied directly to generating and publishing data. Core governance indicators ask researchers to assess countries’ provisions for requiring comprehensive language coverage and compatibility with assistive technologies. A parallel sub-question in all of the thematic availability indicators asks researchers to assess the languages in which datasets are, in practice, available. We also plan to complete an automated assessment of related websites' conformance with WCAG 2.1 (or WCAG 2.2, depending on release date), drawing on the URLs researchers provide as answers throughout the survey.
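A first step toward such an automated assessment is gathering the URLs from researchers' answers. The following is a minimal sketch of that step only, assuming answers are available as free-text strings. The function name, the regular expression, and the example answers are hypothetical, and the accessibility check itself is out of scope here.

```python
import re
from typing import Iterable, List

# Assumed pattern: an http(s) URL runs until whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def collect_urls(answers: Iterable[str]) -> List[str]:
    """Extract unique URLs from free-text answers, keeping first-seen order."""
    seen = set()
    urls = []
    for answer in answers:
        for match in URL_PATTERN.findall(answer):
            # Drop punctuation that commonly trails a URL at the end of a sentence.
            url = match.rstrip(".,;:)]")
            if url and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


# Hypothetical survey answers, for illustration only.
answers = [
    "Dataset published at https://data.example.org/land.",
    "See https://data.example.org/land and http://stats.example.org/budget?year=2021",
]
unique_urls = collect_urls(answers)
print(unique_urls)
```

Deduplicating before assessment avoids checking the same page repeatedly when several indicators cite one portal.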

In the thematic modules our approach is shaped by the particular theme. For example:

  • The land module includes a dedicated use indicator that investigates how land data influences policies with regard to equitable and inclusive land tenure and use;
  • The public procurement module includes a sub-question that asks whether public procurement data analytics are being used to improve access to procurement opportunities for marginalized groups;
  • The climate action module includes a sub-question that asks whether available climate vulnerability data includes information on poverty, gender, and marginalized populations;
  • The health & COVID-19 module includes sub-questions that ask about whether COVID-19 vaccination data is broken down by age, sex and/or gender, disability status, membership in a marginalized population, residency in a long term care facility, and incarceration status.

Note: The above represent a sample of relevant indicators and sub-questions in the pilot edition, not the entirety.

We have designed our indicators and sub-questions with several practical challenges in mind. First, while sex and gender are increasingly recognized as spectrums that are independent of each other, some organizations and cultures use the two terms interchangeably. Consequently, in many cases, sex data is listed under the heading of gender, obscuring or blurring gender data. Our relevant sub-questions thus ask about “sex and/or gender,” and we ask researchers to note when they find both. This clarification will give us a better idea of the state of good practice in the field. The approach is imperfect and frustrating, and it is something we hope to improve in the future.

Second, while marginalization occurs around the world, the specifics of marginalization vary within each country. Consequently, at the beginning of the survey we ask researchers to identify patterns of marginalization in their country; then, for sub-questions that ask about marginalized populations, we ask them to cross-check against this list.
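The cross-check described above can be pictured as a comparison between two lists. This is a minimal sketch, assuming the researcher's list and a dataset's breakdown categories are both available as plain strings. The function name and the example group labels are hypothetical placeholders, not categories defined by the Barometer.

```python
from typing import Iterable, List, Tuple


def cross_check(
    identified_groups: Iterable[str], dataset_breakdowns: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split the researcher's list into groups a dataset covers and groups it omits.

    Matching is case-insensitive and ignores surrounding whitespace; a real
    review would rely on researcher judgment rather than exact string matches.
    """
    present = {b.strip().lower() for b in dataset_breakdowns}
    covered, missing = [], []
    for group in identified_groups:
        (covered if group.strip().lower() in present else missing).append(group)
    return covered, missing


# Hypothetical list recorded at the start of the survey.
identified = ["Rural residents", "Migrants", "Persons with disabilities"]
# Hypothetical breakdown categories found in one dataset.
breakdowns = ["age", "migrants", "rural residents"]

covered, missing = cross_check(identified, breakdowns)
print(covered, missing)
```

Keeping the country-specific list separate from the datasets lets the same sub-question be applied consistently across modules.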

We aim to expand our approach in future editions.

COVID-19

In the health & COVID-19 thematic module, the Barometer assesses COVID-19–related datasets directly, using both primary indicators and others built from secondary data sources. In many countries, the coronavirus pandemic has also affected data policies and practices more broadly, whether through the establishment of new data cooperatives or the suspension of particular data governance provisions. Consequently, to investigate COVID-19 as a cross-cutting concern, we also draw on data collected via sub-questions in other modules to identify regulations that have changed in light of COVID-19, and impacts of COVID-19 on the availability and use of certain datasets.

Artificial Intelligence

The pilot edition of the Barometer examines emerging AI practices through a mixture of primary indicators and others built from secondary data sources. Sub-questions in governance indicators assess whether and how governance frameworks are addressing artificial intelligence; sub-questions in use indicators track algorithmic uses of datasets. We are also exploring secondary data sources, both to complement these approaches and to track relevant capacities to deploy AI for the public good in conjunction with the Barometer’s core capability module.