About AidData's Global Chinese Development Finance Dataset, Version 2.0

Scope -- What does the dataset cover?

All Sources and Types of Transfers from the Official Sector: AidData's Global Chinese Development Finance Dataset (Version 2.0) captures 13,386 Chinese government-financed projects worth $851 billion across 165 countries in every major world region. One of the most important features of the dataset is its comprehensive scope. It covers all regions, all sectors, and all sources and types of financial and in-kind transfers from government and state-owned institutions in China. There are other datasets that capture official financial transfers from China to a single sector (e.g., energy) or region (e.g., Latin America), or that only track certain types of financial flows (e.g., loans) and funding sources (e.g., China’s policy banks). However, the 2.0 version of AidData's dataset is unique in that it captures the full range of projects that align with the OECD’s definitions of Official Development Assistance (ODA) and Other Official Flows (OOF). Therefore, any project that benefits from financial or in-kind support from any official sector institution in China is included, regardless of its purpose, level of financial concessionality, funding source, or overseas destination. The 2.0 dataset captures grants, technical assistance, loans, buyer’s credits, seller’s credits, debt forgiveness, debt rescheduling, scholarships, and training activities.
It captures projects supported by 369 different official sector institutions in China, including central government agencies (like the Ministry of Commerce, the Ministry of Foreign Affairs, and the Ministry of Agriculture), regional and local government agencies (like Chongqing Municipal Health Commission and Tianjin Municipal Government), state-owned enterprises (like CNPC, CMEC, CATIC, and CRBC), state-owned policy banks (like China Development Bank and China Eximbank), state-owned commercial banks (like ICBC, BoC, and CCB), state-owned funds (like the Silk Road Fund), and non-profit government organizations (like Hanban and the China Foundation for Poverty Alleviation).

Temporal Coverage

The 2.0 dataset captures the known universe of projects (with development, commercial, and representational intent) supported by official financial and in-kind commitments (and pledges) from China between 2000 and 2017, with details on the timing of project implementation over a 22-year period (2000-2021). The dataset also assigns every project to one of six status categories (Pipeline: Pledge, Pipeline: Commitment, Implementation, Completed, Suspended, or Cancelled) based on sources that were available as late as August 2021.

Geographic Coverage

AidData used the 2.0 version of the Tracking Underreported Financial Flows (TUFF) methodology to systematically search for projects supported by official financial and in-kind transfers from China across 165 countries and territories. The resulting dataset covers official financial and in-kind transfers from China to every low-income, lower-middle income, and upper-middle income country and territory across every major world region, including Africa, Asia, Oceania, the Middle East, Latin America and the Caribbean, and Central and Eastern Europe. It also covers 11 high-income countries that were included to help ensure comprehensive coverage in each world region to the extent possible. In total, the dataset identifies Chinese government-financed projects in 145 countries and territories, meaning that no projects were identified in 20 countries and territories despite systematic searches. See the CountryList tab in this file for more details.

Total Number of Projects

In total, there are 13,386 project records in the 2.0 dataset. Of these, 10,802 are formally approved, active, and completed projects. The remaining 2,584 projects include (i) projects that secured official financial or in-kind commitments from China but were subsequently suspended or cancelled; (ii) projects that secured pledges of financial or in-kind support from official sector institutions in China but never reached the formal approval (official commitment) stage; and (iii) so-called “umbrella” projects that are designed to support multiple, subsidiary projects. For analysis that requires the aggregation of projects supported by official financial (or in-kind) commitments from China, AidData recommends using the “Recommended for Aggregates” marker variable to isolate formally approved, active, and completed projects (and avoid double-counting projects and monetary amounts).

Total Financial Value

The 10,802 formally approved, active, and completed projects in the 2.0 dataset are collectively worth $851.7 billion (in constant 2017 U.S. dollars). An important feature of the 2.0 dataset is that AidData not only captures official financial commitments in their original (nominal) amounts and currencies of denomination, but also converts these financial commitments into constant 2017 U.S. dollars (using the OECD’s deflation methodology) to adjust for inflation and ensure comparability over time and geographic space.
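
To illustrate the general idea of expressing commitments in constant dollars, the sketch below rebases a nominal U.S. dollar amount to 2017 prices with a generic price index. It is a simplification and not the OECD's deflation procedure: it assumes the amount has already been converted to nominal U.S. dollars, and the function name and index values are hypothetical.

```python
def to_constant_usd(nominal_usd, commitment_year, price_index, base_year=2017):
    """Rebase a nominal USD amount to base-year prices using a price index.

    price_index maps a year to an index level. Only the ratio between the
    base year and the commitment year matters, not the index's scale.
    """
    return nominal_usd * price_index[base_year] / price_index[commitment_year]


# Hypothetical index values for illustration only (not OECD deflators).
hypothetical_index = {2005: 80.0, 2010: 90.0, 2017: 100.0}

# A hypothetical 2010 commitment of 50 million nominal U.S. dollars.
constant_2017 = to_constant_usd(50_000_000, 2010, hypothetical_index)
print(f"{constant_2017:,.0f} constant 2017 USD")
```

Because amounts from every commitment year are rebased to the same year, they can be summed or compared across years without mixing price levels.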

Key Features of the Data

Basic Project Information

The dataset provides foundational information about each project, including its title in English, Chinese, and host country languages, a unique and stable project identification number, the date of the official commitment, the monetary value of the official commitment, the currency in which the official commitment was denominated, the identity of the funder and recipient, the primary purpose of the project, the current status of the project, and URLs for all of the sources that supported the creation of the project record.

Transactional Details

AidData identifies the nature of the financial or in-kind transfer (e.g., grant, loan, technical assistance, export buyer’s credit, export seller’s credit, supplier’s credit, debt forgiveness, debt rescheduling, scholarship/training) supporting each project in the dataset. Whenever applicable, it documents loan and export credit pricing details (interest rate, maturity, grace period, management fee, commitment fee); levels of financial concessionality, as measured by the OECD’s grant element calculator; the monetary value and timing of disbursements and repayments; the use of credit enhancements, including guarantees, insurance, and collateral; the establishment of special purpose vehicles, subsidiary on-lending arrangements, and escrow/revenue/special accounts; and the monetary value and timing of underlying commercial contracts. The 2.0 dataset also provides stable URLs to unredacted grant, loan, export credit, and debt forgiveness/rescheduling agreements whenever they have been successfully retrieved.
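
As a rough illustration of how loan pricing terms translate into a concessionality measure, the sketch below computes a grant element as one minus the present value of scheduled debt service divided by the face value of the loan. It assumes annual payments, interest-only payments during the grace period, equal principal installments afterwards, and a single fixed discount rate. The function name, its parameters, and the 10% discount rate in the example are illustrative assumptions, not the exact specification of the OECD's calculator or AidData's own code.

```python
def grant_element(face_value, interest_rate, maturity_years, grace_years, discount_rate):
    """Return the grant element as a share of face value (0.25 means 25%).

    Simplifying assumptions: annual payments at year end, interest-only
    payments during the grace period, then equal principal installments
    until maturity.
    """
    repayment_years = maturity_years - grace_years
    if repayment_years <= 0:
        raise ValueError("maturity must exceed the grace period")
    installment = face_value / repayment_years
    outstanding = face_value
    present_value = 0.0
    for year in range(1, maturity_years + 1):
        interest = outstanding * interest_rate
        principal = installment if year > grace_years else 0.0
        # Discount this year's debt service back to the commitment date.
        present_value += (interest + principal) / (1 + discount_rate) ** year
        outstanding -= principal
    return 1 - present_value / face_value


# Hypothetical loan: 2% interest, 20-year maturity, 5-year grace period.
example = grant_element(100_000_000, 0.02, 20, 5, 0.10)
print(f"Grant element: {example:.1%}")
```

Under these assumptions, a loan priced exactly at the discount rate has a grant element of zero, and cheaper or longer-dated loans score higher.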

Development Finance Categorization

AidData seeks to designate each project in the 2.0 dataset as Official Development Assistance (ODA) or Other Official Flows (OOF) based on measurement of the project’s primary intent and the concessionality level of the financing provided for the project. AidData adheres closely to the OECD-DAC reporting directives that outline the financial, structural, and intent-related eligibility criteria for ODA and OOF. This unique feature of the dataset allows users to make cross-donor and cross-lender comparisons at global, regional, national, and subnational scales and over time.

Sectoral Categorization

AidData assigns 3-digit OECD sector codes and names to all projects in the 2.0 dataset using the OECD’s classification criteria. This unique feature of the dataset enables cross-donor and cross-lender comparisons at global, regional, national, and subnational scales, since most official sources of international development finance (including all of the members of the OECD-DAC and most multilateral institutions) use the same criteria. It also facilitates analysis of sectoral patterns and trends over geographic space and time.

Stakeholder Organizations

Another feature that sets the 2.0 dataset apart is the level of detail that it provides about the organizations involved in Chinese government-financed development projects. It provides information about five different types of organizations for each project: (1) the official sector institution in China that is responsible for providing funding and/or in-kind support for the project; (2) the co-financing institutions from inside and outside of China that are supporting the same project; (3) the recipient institutions that are responsible for managing incoming funds and in-kind transfers; (4) the contractors and subcontractors that are responsible for project implementation; and (5) the third parties that provide repayment guarantees, credit insurance policies, and collateral that can be seized in the event of default. The 2.0 dataset also categorizes each of these organizations by type (i.e., Government Agency, State-Owned Bank, State-Owned Company, State-Owned Fund, Intergovernmental Organization, Special Purpose Vehicle/Joint Venture, Private Sector, NGO/CSO/Foundation) and country of origin (i.e., China, Recipient Country, or Other). In the latest version of the dataset, AidData identifies 369 official sector financing institutions from China, 443 co-financing institutions, 2,335 recipient institutions, 3,500 implementing institutions, and xx institutions that provide guarantees, insurance, or sources of collateral.

Timing of Project Implementation

The 2.0 dataset records high-precision (i.e., calendar day) data on the implementation start dates and completion dates of Chinese government-financed projects. These variables were also included in an earlier, 1.0 version of the dataset. However, in the 2.0 version of the dataset, these variables have vastly improved coverage rates. Whereas AidData was able to identify precise implementation start dates for 745 projects and precise project completion dates for 906 projects in the 1.0 version of our dataset, AidData has identified precise implementation start dates for 5,366 projects and precise project completion dates for 5,769 projects in the 2.0 version of the dataset. With calendar day-level information on the timing of project implementation and exact locational details, users of the dataset can now measure the spatio-temporal rollout of project implementation with a high level of precision. The 2.0 dataset also provides data on the originally scheduled project implementation start dates and completion dates, so that users can determine if projects have been implemented on schedule, behind schedule, or ahead of schedule.
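
The comparison between scheduled and actual dates can be sketched with calendar-day arithmetic. The example below is a minimal illustration: the function names and the sample dates are hypothetical, and it assumes that both dates for a project are either known to the calendar day or missing.

```python
from datetime import date


def schedule_variance_days(planned, actual):
    """Days between planned and actual dates; positive means later than planned."""
    if planned is None or actual is None:
        return None  # One of the dates was not identified for this project.
    return (actual - planned).days


def classify_schedule(planned, actual):
    """Label a project as behind, ahead of, or on schedule."""
    delta = schedule_variance_days(planned, actual)
    if delta is None:
        return "unknown"
    if delta > 0:
        return "behind schedule"
    if delta < 0:
        return "ahead of schedule"
    return "on schedule"


# Hypothetical project: completion planned for 1 March 2016, achieved 31 March 2016.
planned_completion = date(2016, 3, 1)
actual_completion = date(2016, 3, 31)
print(schedule_variance_days(planned_completion, actual_completion),
      classify_schedule(planned_completion, actual_completion))
```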

Location Details

For projects that have physical footprints or involve specific locations, the 2.0 version of the dataset provides precise locational information that technical users can use to conduct geospatial analysis and non-technical users can use to instantly view where projects are taking (or have taken) place. Written descriptions of the geographical locations and features of project activities and OpenStreetMap links are available in the 'Geographic Location' field, while corresponding GeoJSON files are available for download in the 'GeoJSON URL' field for each relevant project. The 2.0 dataset provides GeoJSON files and OpenStreetMap URLs for 2,630 projects (worth $303 billion). AidData has made the complete set of GeoJSON files, along with usage tips and related documentation, accessible via https://github.com/aiddata/china-osm-geodata
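
For users who want to inspect a project's GeoJSON file without specialized GIS software, the sketch below parses a GeoJSON FeatureCollection with Python's standard library and computes a bounding box. The sample geometry and coordinates are invented for illustration and do not come from the dataset. The code also assumes that every feature has a geometry with a "coordinates" member, so GeometryCollection objects and null geometries would need extra handling.

```python
import json


def iter_positions(coords):
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]  # GeoJSON positions are ordered [lon, lat].
    else:
        for part in coords:
            yield from iter_positions(part)


def bounding_box(geojson):
    """Return (min_lon, min_lat, max_lon, max_lat) for a FeatureCollection."""
    positions = [
        pos
        for feature in geojson["features"]
        for pos in iter_positions(feature["geometry"]["coordinates"])
    ]
    lons = [p[0] for p in positions]
    lats = [p[1] for p in positions]
    return min(lons), min(lats), max(lons), max(lats)


# Invented sample standing in for a downloaded project file.
sample_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {},
   "geometry": {"type": "LineString",
                "coordinates": [[36.80, -1.30], [36.90, -1.25]]}},
  {"type": "Feature", "properties": {},
   "geometry": {"type": "Point", "coordinates": [36.85, -1.40]}}
]}
"""
sample = json.loads(sample_text)
print(bounding_box(sample))
```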

Project Risks, Achievements, Failures, and Setbacks

The 2.0 dataset provides a suite of variables (e.g., Commitment Year, Implementation Start Year, Completion Year, Status) that allow users to track projects over their full life cycles. Whenever possible, the dataset also provides a detailed overview (in the Description field) of the various challenges that arose during project design and implementation (such as strikes, riots, public protests, wars, corruption scandals, natural disasters, public health restrictions, political transitions, bankruptcies, debt defaults, contractual disputes, lawsuits, and ruptures in diplomatic relations) and how funding, receiving, implementing, and accountable institutions responded to these challenges. Additionally, the dataset provides information about project achievements and failures, contractor performance vis-à-vis deadlines and deliverables, and findings from project audits and evaluations.

Using the Data

How can this data be used in analysis?

For most types of analysis that require the aggregation of projects supported by official financial (or in-kind) commitments from China, AidData recommends using the “Recommended for Aggregates” field to isolate formally approved, active, and completed Chinese government-financed projects -- and exclude all cancelled projects, suspended projects, and projects that never reached the formal approval (official commitment) stage. This field is set to “Yes” for all projects with a “Status” designation of Pipeline: Commitment, Implementation, or Completed that have not also been designated as umbrella agreements. It is set to “No” for all cancelled projects, suspended projects, and projects that never reached the official commitment stage (i.e., those projects with a “Status” designation of Pipeline: Pledge, Suspended, or Cancelled). Additionally, to avoid double-counting, the “Recommended for Aggregates” field is set to “No” for all umbrella agreements. For more information on umbrella agreements, see the description of the “Umbrella” field in the “Definitions” tab of this file. Also, note that not all projects with a “Recommended for Aggregates” value of “Yes” identify a financial transaction value (since some transactions are difficult to monetize, such as in-kind donations, technical assistance, scholarships, and training activities).
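
The rule described above can be sketched in a few lines. The example below re-derives the flag from a record's status and umbrella designation, then sums amounts over flagged records while skipping those with no monetized value. It is an illustration only: the records are invented, the key "amount_constant_usd2017" is a hypothetical stand-in for the dataset's constant-dollar amount field, and in practice users would rely on the published “Recommended for Aggregates” field rather than recomputing it.

```python
# Status values that qualify a project for aggregation.
AGGREGATE_STATUSES = {"Pipeline: Commitment", "Implementation", "Completed"}


def recommended_for_aggregates(status, is_umbrella):
    """Re-derive the flag from a record's status and umbrella designation."""
    if status in AGGREGATE_STATUSES and not is_umbrella:
        return "Yes"
    return "No"


def total_committed(records):
    """Sum amounts over flagged records, skipping records with no monetized value."""
    return sum(
        r["amount_constant_usd2017"]
        for r in records
        if r["Recommended for Aggregates"] == "Yes"
        and r["amount_constant_usd2017"] is not None
    )


# Invented records for illustration only.
records = [
    {"Status": "Completed", "Umbrella": "No", "amount_constant_usd2017": 120.0},
    {"Status": "Implementation", "Umbrella": "Yes", "amount_constant_usd2017": 500.0},
    {"Status": "Cancelled", "Umbrella": "No", "amount_constant_usd2017": 75.0},
    {"Status": "Completed", "Umbrella": "No", "amount_constant_usd2017": None},
]
for r in records:
    r["Recommended for Aggregates"] = recommended_for_aggregates(
        r["Status"], r["Umbrella"] == "Yes"
    )

print(total_committed(records))
```

The last invented record is flagged “Yes” but carries no amount, which mirrors the note above that some flagged projects have no monetized transaction value.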

For users who wish to analyze pledged, suspended, or cancelled projects, AidData recommends using the “Status” variable (without any application of the “Recommended for Aggregates” filter) to identify the relevant subset of projects. Umbrella projects should likely be excluded.

For users who wish to analyze cases of debt forgiveness or rescheduling, several additional factors should be considered. In cases of debt forgiveness, the Umbrella field is set to “Yes” if the original contracted loan could be captured elsewhere in the dataset as a loan project record. This is done to avoid double-counting. If the original contracted loans occurred before 2000 (when the dataset begins to track Chinese development finance), then the Umbrella field is set to “No.” As such, if users are interested in isolating all cases of debt forgiveness, AidData recommends turning the “Recommended for Aggregates” filter off and then using the “Flow Type” field to identify all projects assigned to the “Debt Forgiveness” category (irrespective of whether they are coded as umbrella projects). Also, to help users avoid double-counting, AidData does not populate any fields related to transaction amounts [Amount (Original Currency), Original Currency Amount (Constant USD2017), Amount (Nominal)] for projects assigned to the “Debt Rescheduling” category. However, users who wish to undertake analysis of debt reschedulings can find detailed information about the terms and conditions of these reschedulings in the “Description” fields of the project records that are assigned to the “Debt Rescheduling” category.
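
A minimal sketch of the debt forgiveness selection described above follows. It filters on the “Flow Type” field alone and deliberately ignores the “Recommended for Aggregates” and “Umbrella” fields. The records and their "id" values are invented for illustration.

```python
def debt_forgiveness_records(records):
    """Select every debt forgiveness record, ignoring the aggregates and umbrella flags."""
    return [r for r in records if r["Flow Type"] == "Debt Forgiveness"]


# Invented records for illustration only.
sample_records = [
    {"id": 1, "Flow Type": "Loan", "Umbrella": "No",
     "Recommended for Aggregates": "Yes"},
    # Forgiveness of a loan already recorded in the dataset: coded as umbrella.
    {"id": 2, "Flow Type": "Debt Forgiveness", "Umbrella": "Yes",
     "Recommended for Aggregates": "No"},
    # Forgiveness of a loan contracted before 2000: not coded as umbrella.
    {"id": 3, "Flow Type": "Debt Forgiveness", "Umbrella": "No",
     "Recommended for Aggregates": "Yes"},
    {"id": 4, "Flow Type": "Debt Rescheduling", "Umbrella": "No",
     "Recommended for Aggregates": "No"},
]

forgiven = debt_forgiveness_records(sample_records)
print([r["id"] for r in forgiven])
```

Filtering on “Recommended for Aggregates” first would drop the second record, which is why the filter is switched off for this kind of analysis.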

Data Collection

How were the data generated?

This dataset was collected using the Tracking Underreported Financial Flows (TUFF) methodology. TUFF is a transparent, systematic, and replicable methodology that enables the collection of detailed financial, operational, and locational information about Chinese government-financed projects. Data collection was undertaken in three stages: (1) project identification; (2) project verification and enhancement; and (3) quality assurance, first at the level of individual project records (3a) and then for the dataset as a whole (3b).

TUFF 2.0: The 2.0 version of AidData’s Global Chinese Development Finance Dataset was collected using a re-engineered version of the TUFF methodology (“TUFF 2.0”) that involved three major improvements. First, instead of relying on media sources to identify individual projects, AidData began its search process by systematically reviewing tens of thousands of primary, official sources. These sources include unredacted grant and loan agreements published in government registers and gazettes, official records extracted from the aid and debt information management systems of host countries, annual reports published by Chinese state-owned banks, Chinese Embassy and MOFCOM websites, reports published by parliamentary oversight institutions in host countries, and AidData’s direct correspondence with finance ministry officials in developing countries. Official source retrieval was undertaken on a country-by-country basis in order to comprehensively track the full range of financial and in-kind transfers from official sector institutions in China. Then, as a supplement, AidData conducted a set of systematic search procedures in Factiva -- a Dow Jones-owned media database that draws on approximately 33,000 media sources worldwide in 28 languages, including newspapers and radio and television transcripts -- to identify non-official sources that also provide useful information about Chinese government-financed projects. Second, TUFF 2.0 involved the implementation of an enhanced set of data quality assurance protocols to identify important project implementation details, such as calendar day-level commencement and completion dates, precise geographical locations and features of project activities, and the contractors and subcontractors responsible for implementation. 
Third, TUFF 2.0 prioritized the collection of more detailed information on the terms and conditions that govern the loan and export credit contracts issued by Chinese state-owned entities, such as maturities, grace periods, interest rates, grant elements, commitment fees, management fees, and the use of credit enhancements (including collateral, insurance, and repayment guarantees).

Sources

The 2.0 version of AidData’s Global Chinese Development Finance Dataset is underpinned by 90,000 sources (including 62,750 unique sources, of which 35,215 are official sources). Whereas the average project record in the 1.0 dataset was based upon 3.6 sources, the average project record in the 2.0 dataset is based upon 6.7 sources. 88% of the project records in the 2.0 version of the dataset are underpinned by at least 1 official source (compared to 62% of the project records in the 1.0 dataset). The dataset relies primarily upon Chinese-, English-, Spanish-, French-, Portuguese-, Russian-, and Arabic-language sources. However, for certain countries, it uses other local language sources (e.g., Farsi sources in Iran, Dutch sources in Suriname, Vietnamese sources in Vietnam).

The construction of the dataset was only possible because of the collective efforts of hundreds of professional staff, faculty, research assistants, and expert reviewers at multiple institutions over the last five years. More details on the full methodology and how it was applied to construct the 2.0 version of AidData’s Global Chinese Development Finance Dataset can be found in the latest (2.0) version of the TUFF methodology.