Global fire emissions estimates during 1997–2016

Climate, land use, and other anthropogenic and natural drivers have the potential to influence fire dynamics in many regions. To develop a mechanistic understanding of the changing role of these drivers and their impact on atmospheric composition, long-term fire records are needed that fuse information from different satellite and in situ data streams. Here we describe the fourth version of


Introduction
Fires have occurred naturally since the rise of vascular plants on land over 400 million years ago (Scott and Glasspool, 2006), shaping biomes and influencing climate through modulation of the carbon cycle and emissions of greenhouse gases and aerosols (Edwards et al., 2010; Langmann et al., 2009; van Langevelde et al., 2003). During the Anthropocene, humans have become an increasingly important driver of fire occurrence (Bowman et al., 2011). Human activity has enhanced fire activity in locations such as deforestation zones, while fire suppression and the conversion of fire-prone landscapes such as savannas to agriculture in Africa, or of fire-maintained open lands to closed-canopy forests in the eastern US, have generally decreased fire activity (Andela and van der Werf, 2014; Bowman et al., 2009; Nowacki and Abrams, 2008). To study how climate influences fires at the global scale and, in turn, how fires influence the carbon cycle, air quality, and climate, we have developed the Global Fire Emissions Database (GFED).
The scientific community has used past releases of GFED for over a decade. GFED has been used by atmospheric and biogeochemical modeling groups as an input dataset to study the impact of fires on biogeochemical cycles (Chen et al., 2010; Schwietzke et al., 2016), atmospheric chemistry (Aouizerats et al., 2015; Castellanos et al., 2014), and human health (Johnston et al., 2012; Marlier et al., 2013); in assessment reports of the Intergovernmental Panel on Climate Change (IPCC) to estimate the role of fire and deforestation in biogeochemical cycles (Ciais et al., 2013); in the CarbonTracker system of the National Oceanic and Atmospheric Administration (NOAA) (Peters et al., 2007); and in annual updates of the Global Carbon Project (Le Quéré et al., 2015). GFED also serves as a benchmark for optimizing fire modules in dynamic global vegetation and Earth system models (Hantson et al., 2016), and for fire emissions estimates derived from fire radiative power (FRP), including the Global Fire Assimilation System (Kaiser et al., 2012). Finally, burned area from GFED has provided a means for building early warning systems of fire season severity (Chen et al., 2016).
The first version of GFED was released in 2004 and has since undergone several revisions as improved burned area estimates became available. GFED2 was released after Giglio et al. (2006) improved on the mapping of burned area from active fire data. GFED3 was released when this conversion was no longer necessary because almost all burned area in the Moderate Resolution Imaging Spectroradiometer (MODIS) era had been mapped, and the current version follows further improvements in the burned area algorithm. Satellite burned area is the most important input dataset regulating the spatial and temporal pattern of emissions following the Seiler and Crutzen (1980) approach, and is complemented in GFED by a biogeochemical modeling framework that provides estimates of biomass in various carbon "pools" including leaves, grasses, stems, coarse woody debris, and litter. These pools are combusted to different degrees during a fire depending on pool-specific parameters and environmental conditions that influence fuel moisture and the simulated burn depth in organic soils of boreal forests and peatlands.
Over the past decade, a parallel line of research has made considerable progress in estimating emissions using satellite observations of FRP. When continuous observations are available or the FRP diurnal cycle can be modeled, FRP can be integrated over time, yielding fire radiative energy (FRE). FRE is directly related to fire emissions (Wooster, 2002), and approaches using FRP observations can provide emissions estimates in near-real time (Darmenov and da Silva, 2015; Kaiser et al., 2012). Despite progress (Ichoku and Ellison, 2014; Schroeder et al., 2014a), there is still substantial uncertainty, and some of these FRE approaches apply a scaling factor to match GFED. Comparisons between the "classical" burned area approach and the FRP approach, or approaches based on active fire detections in general, have indicated that there is considerable variability in the amount of burned area associated with an individual active fire detection, and thus the two approaches do not always align (Randerson et al., 2012). In general, direct mapping of burned area excels when fires are large but has difficulty in detecting smaller fires, for example, in croplands and in other areas where many fires are smaller than the 21 ha of an individual 500 m MODIS pixel. Combining both burned area and active fire data, Randerson et al. (2012) provided evidence that the total area burned by these relatively small fires could be substantial at the global scale. Therefore, emission estimates based solely on active fires, including the Fire INventory from NCAR, may better capture spatial and temporal variability in regions with many small fires than emission estimates based solely on burned area (Reddington et al., 2016). However, approaches based solely on active fires often do not account for spatial and temporal variability in the amount of burned area per active fire detection or variability in fuel consumption within biomes.
In this paper we describe the emissions estimates associated with the GFED4 burned area product, with or without additional burned area from small fires based on a revised version of the Randerson et al. (2012) small-fire estimation approach. The main focus of our analysis will be on the model version that includes small fires (GFED4s), while the emissions estimates based on burned area without small fires will be referred to as GFED4. We also used a recent meta-analysis (van Leeuwen et al., 2014) to constrain our modeled estimates of fuel consumption. Fuel consumption is the amount of biomass, coarse and fine litter, and soil organic matter consumed per unit area burned and is the product of fuel load and combustion completeness. Besides these two main improvements over earlier versions, we made a number of additional modifications including updated input datasets, the use of satellite-derived estimates of parameters governing fuel consumption and tree mortality in the boreal region, and application of a new emission factor methodology that separates temperate and boreal forest ecosystems. In Sect. 2 we provide more detail on these input datasets, followed by a description of the modeling framework in Sect. 3. Results are given in Sect. 4, followed by a discussion in Sect. 5 that includes a description of the main differences with GFED3 and an assessment of the primary sources of uncertainty in estimating fire emissions. In the conclusions (Sect. 6) we summarize the main points of our analysis and describe several important directions for future work.

Input datasets
Our version of the Carnegie-Ames-Stanford Approach (CASA) model described in Sect. 3 requires input datasets on vegetation characteristics, meteorology, and fire parameters. Most of these datasets are somewhat different from those used in previous versions of GFED, in part because of a need for shorter latency in our updates. We re-gridded all of the input datasets to 0.25° spatial resolution and a monthly temporal resolution. We took additional steps to create estimates of fire dynamics on daily and 3-hourly time steps.

Vegetation characteristics
In CASA, the fraction of absorbed photosynthetically active radiation (fAPAR) is used to estimate net primary production (NPP), fractional tree cover (FTC) is used in the allocation of NPP between living carbon pools, and land cover (LC) is used to set turnover rates for stems and leaves, to apply emission factors, and to categorize fire carbon emissions into various fire types.
We calculated fAPAR based on the Global Inventory Modeling and Mapping Studies (GIMMS) normalized difference vegetation index (NDVI) version 3g (Pinzon and Tucker, 2014) and relations established by Los et al. (2000). This dataset is derived from the Advanced Very High Resolution Radiometer (AVHRR) sensor flying on board several satellites. We capped fAPAR at 0.95, corresponding to an NDVI value of 0.9. Data were not available for several remote islands, including Hawaii and Fiji, and we do not report emissions for these locations.
FTC was derived by aggregating the annual MODIS MOD44B vegetation continuous fields (250 m, V051; Hansen et al., 2005) to 0.25°. In order to provide consistency over the full time period, we used the last year available (2013) and increased FTC in prior years using the fire-driven deforestation rates. These fire-driven deforestation rates were based on the amount of burned area within tropical forests at an annual time step. We used land cover maps from the annual MODIS MCD12C1 land cover type product and University of Maryland (UMD) classification scheme (Friedl et al., 2010). The climate modeling grid (CMG, 0.05°) dataset was resampled to 0.25° based on the most abundant land cover type. This dataset was available for 2001-2012; data from 2001 were applied to earlier years in the time series, and 2012 land cover data were used for years after 2012.
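The resampling of land cover by most abundant type can be illustrated with a minimal sketch. It assumes a 2-D integer class map whose dimensions are exact multiples of the aggregation factor (5 for 0.05° to 0.25°); the function name and the tie-breaking rule are illustrative assumptions, not part of the GFED code.

```python
import numpy as np

def majority_resample(land_cover, factor=5):
    """Aggregate a 2-D class map by the most abundant class per block.

    Hypothetical helper: factor=5 corresponds to 0.05 deg -> 0.25 deg.
    """
    rows, cols = land_cover.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    # Gather the cells of each factor x factor block along the last axis.
    blocks = (land_cover
              .reshape(rows // factor, factor, cols // factor, factor)
              .swapaxes(1, 2)
              .reshape(rows // factor, cols // factor, factor * factor))
    out = np.empty(blocks.shape[:2], dtype=land_cover.dtype)
    for r in range(blocks.shape[0]):
        for c in range(blocks.shape[1]):
            classes, counts = np.unique(blocks[r, c], return_counts=True)
            # Assumption: ties are resolved in favor of the lowest class code.
            out[r, c] = classes[np.argmax(counts)]
    return out
```

How ties and fill values were treated in the actual processing is not stated in the text, so the tie rule above should be read as a placeholder.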

Meteorological datasets
We now use air temperature (t2m), soil moisture (swvl), and solar radiation (ssrd) from the ERA-Interim dataset (Dee et al., 2011) produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). We calculated the monthly mean for all datasets and regridded the 0.75° dataset to our 0.25° resolution without interpolation.
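Regridding without interpolation amounts to copying each coarse value into the finer cells it covers. The sketch below assumes the 0.75° and 0.25° grids are edge-aligned, so that every coarse cell maps onto exactly 3 × 3 fine cells; that alignment and the function name are assumptions made for illustration.

```python
import numpy as np

def replicate_to_finer_grid(coarse, factor=3):
    """Copy each coarse-grid value into factor x factor finer cells."""
    coarse = np.asarray(coarse)
    # Repeat along rows, then along columns; no values are interpolated.
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

# Usage: a 2 x 2 field at 0.75 deg becomes a 6 x 6 field at 0.25 deg.
t2m_fine = replicate_to_finer_grid(np.array([[280.0, 285.0],
                                             [290.0, 295.0]]))
```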
These datasets are somewhat different from inputs for earlier GFED versions but are now internally consistent. Interannual and seasonal variability was relatively similar to datasets previously used in GFED, and these variations have the largest impact on our calculations. The use of soil moisture is new; previously, we used a bucket model based on rainfall and potential evaporation to calculate the wetness of soils, a key input dataset for calculating heterotrophic respiration (R_h) rates and combustion completeness (see Sect. 3). Soil moisture is now transformed to a soil moisture index (SMI) based on soil-type-specific permanent wilting point (PWP) and field capacity (FC) values as described in http://www.ecmwf.int/en/forecasts/documentation-and-support/evolution-ifs/cycles/change-soil-hydrology-scheme-ifs-cycle and is capped at 1. This was done for all four different soil layers (0-7, 8-28, 29-100, 101-255 cm). The SMI for the 0-7 cm layer replaced the scalar used previously for combustion completeness. The average SMI of the top two layers was used to down-regulate NPP in herbaceous vegetation in the light use efficiency model when moisture was limiting, whereas the average of the top four layers was used for NPP in woody vegetation. The average SMI for the upper two layers was also used to represent the influence of soil moisture on the abiotic scalar regulating rates of R_h. Finally, the average SMI of all layers was used in the allocation of assimilated carbon to above- and belowground pools (see Sect. 3).
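A minimal sketch of the soil moisture index follows. The text states only that SMI is based on PWP and FC and capped at 1; the linear rescaling between PWP and FC and the lower bound of 0 are assumptions here, and the PWP and FC values in the usage lines are hypothetical, not actual soil-type parameters.

```python
import numpy as np

def soil_moisture_index(swvl, pwp, fc):
    """Rescale volumetric soil water to an index between PWP and FC.

    Assumed form: SMI = (swvl - PWP) / (FC - PWP), clipped to [0, 1].
    The cap at 1 is from the text; the floor at 0 is an assumption.
    """
    smi = (np.asarray(swvl, dtype=float) - pwp) / (fc - pwp)
    return np.clip(smi, 0.0, 1.0)

# Usage with hypothetical values for the four soil layers of one grid cell.
smi_layers = soil_moisture_index([0.20, 0.25, 0.30, 0.40], pwp=0.10, fc=0.30)
smi_combustion = smi_layers[0]         # 0-7 cm layer, used for combustion completeness
smi_top_two = smi_layers[:2].mean()    # average of the top two layers
```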

Fire processes
We derived burned area (both mapped burned area and active fire detections scaled to burned area) and metrics that can be used to assess fire-induced tree mortality and combustion completeness from satellite data. Our burned area time series is based on MODIS data from August 2000 onwards (the "MODIS era") and on other sensors before that period. In Sect. 2.3.1 we briefly describe the MODIS burned area data, which are described in more detail in Giglio et al. (2013). In Sect. 2.3.2 we then explain how the small fire burned area estimates for the MODIS era were derived based on Randerson et al. (2012). The resulting GFED4s burned area time series was complemented with data from other sensors to cover the full 1997-2016 period (Sect. 2.3.3).

Burned area from MODIS
For the MODIS era we used the MODIS Collection 5.1 MCD64A1 burned area product. Compared with Collection 5 and earlier versions of the MCD64A1, the Collection 5.1 product reduces the unintentional removal of small burns and eliminates some systematic omission errors. The MCD64A1 product maps daily burned area at 500 m spatial resolution; these data are then aggregated to a 0.25° grid (both monthly and daily) to produce the MODIS-era GFED4 burned area product (Fig. 1a).

Small fire burned area during the MODIS era
In the MODIS era, we combined 500 m burned area (see above), 1 km thermal anomalies (active fires) from Terra and Aqua MODIS, and 500 m surface reflectance observations to statistically estimate burned area associated with small fires, BA_sf, in each 0.25° grid cell (i), month (t), and aggregated vegetation type (v):
$$\mathrm{BA}_{\mathrm{sf}}(i,t,v) = \mathrm{FC}_{\mathrm{out}}(i,t,v) \times \alpha(r,s,v,y) \times \gamma(r,s,v,y), \qquad (1)$$
where FC_out is the number of active fire pixels outside of the perimeter of the MCD64A1 burned area, α is a ratio of burned area to active fires within MCD64A1 burned areas, and γ is a correction factor derived by comparing the difference normalized burn ratio (dNBR) of active fires observed outside (dNBR_out) and inside (dNBR_in) of MCD64A1 burned areas with unburned control areas (dNBR_control; see Eq. 4 of Randerson et al., 2012). The α and γ scalars were estimated each year (y), as a function of region (r), seasonal interval (s), and aggregated vegetation type (v). Our method was similar to that described in Randerson et al. (2012), but with several important modifications to each of the three factors on the right-hand side of Eq. (1) as described below. First, we used the MCD64A1 product from Collection 5.1, replacing Collection 5 that was used in Randerson et al. (2012). Second, instead of using a single source of level 3 composited thermal anomaly/fire product from Terra (MOD14A1), here we used individual active fire detections from both Terra and Aqua. Third, to improve geolocation accuracies, we used the MODIS fire location product (MCD14ML) instead of the gridded composite fire product (MOD14A1). To further reduce geolocation uncertainties, we only retained active fire detections with small or moderate scan angles (equal to or less than 0.5 radians). This threshold was somewhat arbitrary and future research is required to identify how a balance between sample size and area of view is best achieved.
Even with the above adjustments to improve georegistration, some remaining resampling error was introduced in the process of projecting the variable-size MODIS fire pixels onto the 500 m sinusoidal grid on which the MCD64A1 burned area product is generated. To partially correct this known bias, we applied region-specific factors ranging from 0.88 in Africa north of the Equator to 1.12 for temperate and boreal Asia. These correction factors, which were derived using a rigorous model of the sample-dependent MODIS pixel shape and size, partially compensated for the simplified, fixed 1 km radius initially used to determine whether an active fire pixel was inside or outside of the MCD64A1 burned area pixels. Finally, to estimate dNBR for active fires inside of MCD64A1 burned area, we only used active fire detections for which all four overlapping 500 m pixels were classified as burned. This was a stricter criterion than in Randerson et al. (2012) that increases dNBR_in and its separation from dNBR_out and other areas used as controls (Fig. 2).
Figure 2. The distribution of difference normalized burn ratio (dNBR) for active fires detected within burned areas from MCD64A1 (red), outside of burned areas (orange), and for control areas (blue) within Northern Hemisphere Africa (NHAF) and Central Asia (CEAS). The distributions, generated using observations in 2001-2012, were constructed during the peak fire month for each region. The improved approach (see Sect. 2.3.2 for details) compressed the distributions in unburned control areas and increased the separation between the three categories.
Earth Syst. Sci. Data, 9, 697-720, 2017, www.earth-syst-sci-data.net/9/697/2017/
It was not possible to apply the same constraint in the calculation of dNBR_out, so this adjustment usually had the effect of lowering γ. We note that dNBR_out in particular is strongly affected by resampling error; thus, the individual γ correction factors are in turn also influenced by resampling error. The net effect is to limit the range of values that may be attained by γ, in a sense leaving an "imprint" of resampling error on the resulting small fire burned area estimates. This imprint is an unavoidable outcome of using relatively coarse 1 km and 500 m gridded time series data to track small, subpixel fires. At the same time, we raised the filtering standard for control pixels (Eq. 4 of Randerson et al., 2012) so that pixels within a 1 km buffer area of active fire detections by either Terra or Aqua MODIS were excluded in the calculation of dNBR for non-burning areas (dNBR_control). During the regional aggregation of dNBR, we excluded 500 m pixels that were marked as "water" by the MODIS land cover type product (MCD12Q1).
During the time both Terra and Aqua fire detections were available (January 2003-December 2016), we calculated BA_sf separately for Terra (MOD) and Aqua (MYD). BA_sf was then estimated as the arithmetic mean of the two estimates. A climatological ratio of BA_sf,MYD / BA_sf,MOD was used to estimate BA_sf,MYD during periods when Aqua MODIS observations were not available (from August 2000 until Aqua detections became available). As expected, burned area from small fires is more prevalent in areas with extensive agriculture and in other human-dominated landscapes (Fig. 1c).
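The small fire calculation of Eq. (1) and the Terra/Aqua averaging can be sketched as follows. The form used for γ is an assumption based on the description of Eq. 4 in Randerson et al. (2012) and should be checked against that paper; all function names and numeric values are hypothetical.

```python
def gamma_factor(dnbr_out, dnbr_in, dnbr_control):
    """Assumed form of the correction factor: the dNBR signal of fires
    outside mapped burns relative to fires inside, both measured against
    unburned control areas."""
    return (dnbr_out - dnbr_control) / (dnbr_in - dnbr_control)

def small_fire_burned_area(fc_out, alpha, gamma):
    """Eq. (1): active fires outside mapped burns, scaled by alpha and gamma."""
    return fc_out * alpha * gamma

def combine_terra_aqua(ba_sf_mod, ba_sf_myd=None, clim_ratio=None):
    """Arithmetic mean of the Terra (MOD) and Aqua (MYD) estimates.

    When no Aqua estimate exists, it is approximated from the Terra
    estimate with a climatological MYD/MOD ratio.
    """
    if ba_sf_myd is None:
        ba_sf_myd = clim_ratio * ba_sf_mod
    return 0.5 * (ba_sf_mod + ba_sf_myd)

# Usage with hypothetical numbers (alpha in hectares per active fire pixel).
gamma = gamma_factor(dnbr_out=0.06, dnbr_in=0.10, dnbr_control=0.02)
ba_terra = small_fire_burned_area(fc_out=10, alpha=50.0, gamma=gamma)
ba_early_record = combine_terra_aqua(ba_terra, clim_ratio=1.5)
```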

Estimating burned area prior to the MODIS era
For the pre-MODIS era, we used monthly active fire data from the Visible and Infrared Scanner (VIRS) aboard the Tropical Rainfall Measuring Mission (TRMM) or the Along Track Scanning Radiometers (ATSR) on board multiple platforms to estimate burned area. Two steps of optimization were used to derive total burned area, starting with the GFED4s product described above. The first step was to develop a relationship between aggregated active fires (from VIRS or ATSR) and burned area during the MODIS era in each GFED region, with the aim of using this relationship to estimate regional burned area during 1997-2000. The second step involved distributing the aggregated burned area within each region to individual 0.25° grid cells.
To calculate the regional sum of BA during the pre-MODIS era, we first performed regression analyses between ATSR or VIRS active fires and the regional sum of GFED4s burned area during the MODIS era. We developed linear regression models for each GFED region (Fig. 3), for each month, and for each of the five aggregated vegetation classes (see Randerson et al., 2012, for a description of the vegetation classes). When ATSR and VIRS active fire data were both available (January 1998-July 2000), the highest performing regression from these two datasets was used to estimate the burned area in each region. Among the 14 continental-scale regions, we used VIRS data in Africa, Southeast Asia, Equatorial Asia, and Australia and ATSR data in all other regions (Fig. 4). Prior to 1998, when VIRS data were not available, regressions based on ATSR were used. If the ATSR or VIRS active fires for any given month were outside the dynamic range of active fires during the MODIS era, we instead used linear regression derived from all of the monthly data during the MODIS era for that region.
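The regression step, including the fallback when a fire count lies outside the dynamic range seen during the MODIS era, might look like the sketch below. It uses ordinary least squares via `numpy.polyfit`; the function names, the tuple layout of a fitted model, and the clipping of negative predictions to zero are assumptions for illustration.

```python
import numpy as np

def fit_regional_regression(active_fires, burned_area):
    """Fit burned area = slope * active fires + intercept for one region,
    month, and vegetation class, and record the fitted range of fire counts."""
    x = np.asarray(active_fires, dtype=float)
    y = np.asarray(burned_area, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept), (float(x.min()), float(x.max()))

def estimate_pre_modis_ba(fire_count, month_model, all_months_model):
    """Apply the month-specific model, or the model fitted to all monthly
    data when the count is outside the month-specific dynamic range."""
    slope, intercept, (lo, hi) = month_model
    if not lo <= fire_count <= hi:
        slope, intercept, _ = all_months_model
    # Assumption: negative predictions are set to zero.
    return max(slope * fire_count + intercept, 0.0)
```

Choosing between ATSR and VIRS would then be a matter of fitting both and keeping the better-performing regression per region, as described above.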
After quantifying the sum of burned area within each region, we distributed it among 0.25° grid cells using the following approach. While active fires from ATSR or VIRS provide some indication about the temporal dynamics of fire in a region, the active fire approach tends to underestimate burning in savannas and other areas with herbaceous fuels. To assess how well active fires captured regional spatial patterns, we estimated the spatial correlation between active fires and burned area in each GFED region during the MODIS era. Higher correlations from these analyses indicated better agreement between the spatial distribution of ATSR/VIRS active fires and GFED4s burned area. Since we found the correlation coefficients varied seasonally, a mean monthly (m) set of spatial correlation coefficients (SC) was derived to determine the level of representation of burned area by ATSR/VIRS active fires. The spatial distribution function of burning was based on a linear combination of climatological distribution of burned area (cl) and the distribution of active fires (FC):
$$\mathrm{BA}(i,t) = \mathrm{BA}_{\mathrm{rs}}(r,t) \times \left[ \mathrm{SC}(r,m) \times \mathrm{SDF}_{\mathrm{FC}}(i,t) + \left(1 - \mathrm{SC}(r,m)\right) \times \mathrm{SDF}_{\mathrm{cl}}(i,m) \right], \qquad (2)$$
where SDF_FC and SDF_cl are unitless spatial distribution functions that each sum to 1 in each GFED region and were derived from active fire detections or the monthly climatology of burned area during the MODIS era from GFED4s, and BA_rs is the regional (r) sum of burned area for that month and region derived from the regressions between GFED4s and ATSR or VIRS active fires described above. In temperate and high-latitude regions, where the spatial correlation between active fires and burned area is relatively high, the equation primarily uses information from the pre-MODIS active fires to assign the spatial distribution of burned area. In regions where the spatial correlation between active fires and burned area is relatively low, the equation relies more on the climatological burned area pattern from the MODIS era.
For consistency with the previous step, the source of the active fires for generating the SDF was the same as active fires used to generate the regional sum of burned area in each region. The contribution of ATSR, VIRS, MCD64A1, and BA_sf to the total burned area is shown in Fig. 4 for the GFED4s time series.
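The distribution of a regional burned area total over grid cells can be sketched as a weighted blend of two normalized patterns. The weights SC and 1 − SC follow the description of the linear combination above and are an assumption about its exact form; the fallback to climatology when a region has no active fires is likewise an illustrative choice.

```python
import numpy as np

def distribute_regional_ba(ba_region, sc, active_fires, clim_burned_area):
    """Spread a regional burned area total over the region's grid cells.

    sc is the mean monthly spatial correlation for the region (0 to 1);
    the two input arrays hold per-cell values for the same set of cells.
    """
    fires = np.asarray(active_fires, dtype=float)
    clim = np.asarray(clim_burned_area, dtype=float)
    sdf_cl = clim / clim.sum()          # climatological pattern, sums to 1
    if fires.sum() > 0:
        sdf_fc = fires / fires.sum()    # active fire pattern, sums to 1
    else:
        sdf_fc = sdf_cl                 # assumption: fall back to climatology
    return ba_region * (sc * sdf_fc + (1.0 - sc) * sdf_cl)

# Usage: three cells, moderate spatial correlation.
cells = distribute_regional_ba(1000.0, 0.6, [1, 3, 0], [50.0, 25.0, 25.0])
```

Because both distribution functions sum to 1, the gridded values always add up to the regional total regardless of SC.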

Combustion completeness and fire-induced mortality in boreal forests
Despite relatively similar environmental conditions and vegetation attributes, the boreal regions in North America and Eurasia exhibit significantly different patterns of fire severity (Wooster and Zhang, 2004). This was shown to primarily be a function of divergent plant traits for the dominant tree species in each continent. Species in North America tend to promote crown fires with higher levels of combustion completeness of the canopy and tree mortality compared to lower-severity surface fires in Eurasia. As with other global fire models, GFED3 did not capture these differences due to biome-wide parameterizations.
To address the large-scale differences in boreal fire effects, we integrated satellite-based metrics of severity from Rogers et al. (2015) including immediate tree mortality and an index of vegetation destruction. These were initially calculated at 1 km and 500 m resolutions, respectively, and aggregated to 1°, but here rescaled to our 0.25° grid without interpolation. Vegetation destruction was derived from three MODIS-based metrics that provide information on immediate fire-induced losses of green vegetation, reduction in canopy and soil water, and landscape charring. These included dNBR, decreases in NDVI, and increases in summer land surface temperature (LST). The original vegetation destruction product used LST from Aqua and was available from 2003 to 2012. We extended it here to 2001 and 2002 using multiple linear regression relationships based on Terra LST, dNBR, and changes in NDVI at 1° (r² = 0.95 for North America, 0.96 for northwest Eurasia, 0.95 for northeast Eurasia, and 0.91 for southern Eurasia). Immediate tree mortality was based on decreases in tree cover and increases in spring albedo 1 year after a fire, and was provided for fires between 2001 and 2009. For both products, grid-cell-specific averages were used in years not covered, and grid cells without valid values were assigned regional burned-area-weighted means. On average, vegetation destruction was 36 % lower and fire-induced tree mortality was 42 % lower in boreal Eurasia compared to boreal North America. More details on model integration are given in Sect. 3.1, and more information on these products can be found in Rogers et al. (2015).

Modeling framework and modifications
GFED is based on the CASA model, which was developed in the early 1990s to simulate the terrestrial carbon cycle using satellite data (Potter et al., 1993; Field et al., 1995; Randerson et al., 1996). In previous work we adjusted the model to account for fires (van der Werf et al., 2003); further revisions were implemented in GFED2 and GFED3, including modifications to estimate the contribution of different fire categories including agricultural waste burning, boreal forest fires, deforestation fires, peatland fires, and savanna fires (van der Werf et al., 2010).
Below we describe the model in general (Sect. 3.1), followed by a more detailed explanation of the changes we made in this version (Sect. 3.2-3.5).

CASA-GFED framework
When CASA was developed it computed carbon fluxes as the difference between NPP and R_h. Both are still calculated for each month and each 0.25° grid cell. NPP is based on a light use efficiency model (Field et al., 1995) and is distributed over various live biomass "pools" (leaves, stems, roots) according to satellite-derived fractional tree cover maps. In forests we allocate NPP to all three live biomass pools, and in grasslands to leaves and roots, accounting for variability in allocation due to gradients in mean annual precipitation as in GFED3. The carbon in these pools is subsequently delivered to nine litter pools at the surface and in the soil with turnover rates set for each pool depending on moisture conditions and temperature.
The turnover rates of the wood pool in GFED4 (the modeling framework used to derive both GFED4 and GFED4s emissions) were adjusted at the biome level to match observed aboveground biomass (Avitabile et al., 2016;Santoro et al., 2015). Wood turnover now varies between 40 years for deciduous broadleaf forest and 65 years for deciduous needleleaf forest, with turnover times for evergreen forest in between those values: 52 years for evergreen needleleaf and 55 for evergreen broadleaf (Fig. 5). Similarly, turnover times of slowly decomposing soil pools were adjusted in GFED4 to better match measured values reported for 0-30 and 30-100 cm (Batjes, 2016).
In GFED1 we added fire, herbivory, and grazing as additional carbon loss pathways besides R_h. Fires transfer carbon to the atmosphere and between the different pools depending on the burned fraction of the grid cell, combustion completeness, fire-induced mortality rates, and information on whether belowground carbon pools are susceptible to fire or not.
Combustion completeness (CC) is treated similarly in GFED4 as in our previous work, with set minimum and maximum values (see Table 1 of that work). We scaled CC using the soil moisture index (SMI) of the top 7 cm such that the 5th and 95th percentiles corresponded with the minimum and maximum values. Fire-induced tree mortality was set to 2 % for low tree cover regions (mainly savannas and agriculture) and 50 % for forests in general, but modified in tropical forests based on fire persistence as in GFED3, and in boreal regions according to satellite-derived proxy datasets (Sect. 2.3.4). More specifically, in boreal forests we used the satellite-derived instantaneous tree mortality to represent fire-induced tree mortality. In addition, we did not use the CC scaling by SMI for the aboveground wood in the boreal region but used the satellite-derived vegetation destruction scalar instead. The combustion completeness of the wood pool ranged between the set minimum and maximum values (0.2 and 0.4, respectively) and depended linearly on the vegetation destruction scalar instead of SMI.
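The linear scaling of combustion completeness between its minimum and maximum can be sketched for the boreal wood pool, where the bounds (0.2 and 0.4) are given in the text. The sketch assumes the vegetation destruction scalar has been normalized to the range 0 to 1, which the text does not state explicitly; the function name is hypothetical.

```python
import numpy as np

def scaled_combustion_completeness(scalar, cc_min, cc_max):
    """Interpolate linearly between cc_min and cc_max.

    scalar is assumed to be normalized so that 0 maps to cc_min and
    1 maps to cc_max; values outside that range are clipped.
    """
    s = np.clip(scalar, 0.0, 1.0)
    return cc_min + (cc_max - cc_min) * s

# Usage: boreal aboveground wood, driven by the vegetation destruction scalar.
cc_wood = scaled_combustion_completeness(0.5, cc_min=0.2, cc_max=0.4)
```

The same interpolation would apply to the other pools with an SMI-based scalar once it has been normalized between its 5th and 95th percentiles.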

Modifying the burned fraction to account for sub-grid-scale heterogeneity in fuels
In our previous model setup, fires lowered the fuel load in each grid cell depending on burned area, combustion completeness, and fire-induced mortality rates. This was done uniformly in the grid cell, not accounting for the fact that fires only lower fuel in the fraction of the grid cell that actually burned. This may have led to an underestimation of emissions in frequently burning regions, especially towards the end of the fire season. For example, in a grassland grid cell that burns in two consecutive months, each with 0.5 burned fraction, modeled fuel loads in the second month are half those of the first month if combustion completeness is set at 100 % (Fig. 6). In reality, the fuel load in that grid cell in the second month should be similar to that in the first month for the part that had not burned, and depleted for the part that had burned. To compensate for this effect we now calculate the modified burned fraction of the grid cell as
$$\mathrm{MBF}(i,t) = \frac{\mathrm{BA}(i,t)}{A(i) - \sum_{n=1}^{3} \mathrm{BA}(i,t-n)}, \qquad (3)$$
where MBF is the modified fraction of the grid cell that burns, BA is the burned area, and A is the area of the grid cell at location (i). In our hypothetical example from above, MBF now becomes 1 in the second month according to Eq. (3), thus generating similar emissions in the 2 months that each burn the same area (Fig. 6). When cumulative burned area over a fire season exceeds the grid cell area, this approach yields negative values towards the end of the season; if this occurs, these values are replaced by the burned area divided by the grid cell area. Because we only take into account the burned area from the actual month and the three preceding months, grid cells with two burning seasons are probably not impacted because the seasons are usually separated in time by more than 3-4 months. Our approach does not influence the burned area datasets but only the way they are used in the conversion of burned area to emissions.
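The modified burned fraction can be illustrated with a short sketch that reproduces the two-month example from the text. It assumes a monthly burned area series for a single grid cell; treating a zero denominator the same way as a negative one is an assumption, since the text only mentions negative values.

```python
import numpy as np

def modified_burned_fraction(burned_area, cell_area, n_prev=3):
    """Burned area divided by the area not burned in the n_prev preceding months."""
    ba = np.asarray(burned_area, dtype=float)
    mbf = np.empty_like(ba)
    for t in range(ba.size):
        # Area already burned in up to n_prev preceding months.
        prior = ba[max(0, t - n_prev):t].sum()
        remaining = cell_area - prior
        if remaining > 0:
            mbf[t] = ba[t] / remaining
        else:
            # Fallback described in the text: plain burned fraction.
            mbf[t] = ba[t] / cell_area
    return mbf

# Usage: half of a 100-unit cell burns in each of two consecutive months.
mbf = modified_burned_fraction([0.0, 50.0, 50.0, 0.0], cell_area=100.0)
```

In the second burning month the remaining unburned area equals the area burned, so the fraction becomes 1 and all remaining fuel is exposed to fire.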

Fuel consumption optimization
Emissions are derived from the multiplication of burned area and fuel consumption per unit burned area, the latter being the product of fuel loads per unit area and combustion completeness. Van Leeuwen et al. (2014) summarized the peer-reviewed literature on fuel consumption rates, consisting of 76 studies and covering 121 unique measurement locations. In addition to these fuel consumption measurements, we also included the fuel load measurements, mostly in savannas, from Scholes et al. (2011) and assumed a combustion completeness of 0.9 for these fuel measurements to calculate fuel consumption. This latter set of 95 measurements was mostly confined to South Africa, Botswana, and Zambia. We used these two compilations to adjust the turnover rates of herbaceous leaf and surface litter pools, where the largest discrepancies between the model and measurements were found. Uncertainties in the comparison stem from comparing different time periods (most measurements were made before our study period) and from comparing local measurements with model estimates for 0.25° grid cells. Fuel consumption rates are highly variable, not only between biomes
Figure 6. Burned area, fuel load, and emissions for a hypothetical grid cell where 50 % of the area burns in month 2 and 50 % in month 3, and assuming a combustion completeness of 100 %. "Previous" refers to our previous work in GFED3 and before where no adjustments were made in the conversion of burned area to the fraction of fuel load that is combusted; "modified" refers to the current approach (GFED4 and GFED4s), where we treat the burned fraction as the fraction of the total remaining fuel in the grid cell that is combusted using Eq. (3).
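The basic emissions relation stated at the start of this section can be written out directly. The units and numeric values below are hypothetical and chosen only to make the arithmetic easy to follow; the combustion completeness of 0.9 is the value the text assumes for the Scholes et al. (2011) fuel load measurements.

```python
def fuel_consumption(fuel_load, combustion_completeness):
    """Fuel consumed per unit burned area (e.g., kg dry matter per m2)."""
    return fuel_load * combustion_completeness

def dry_matter_emissions(burned_area, fuel_load, combustion_completeness):
    """Burned area times fuel consumption per unit burned area."""
    return burned_area * fuel_consumption(fuel_load, combustion_completeness)

# Usage: converting a savanna fuel load measurement into fuel consumption.
fc_savanna = fuel_consumption(fuel_load=0.4, combustion_completeness=0.9)
dm_burned = dry_matter_emissions(burned_area=1.0e6, fuel_load=0.4,
                                 combustion_completeness=0.9)
```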

Emission factors
Emission factors are used to convert dry matter burned into emissions of trace gases and aerosols. These were assigned in GFED3 based on the compilation of Andreae and Merlet (2001) with annual updates. A newer compilation considered a subset of the available literature, focusing on measurements of smoke that had cooled to ambient temperature but had not undergone photochemical processes. In addition to following this approach, which may better match the requirements of the atmospheric community, that compilation reported mean values for more biome categories. The most important change in that regard from the GFED perspective is the partitioning of the extratropical forest category into temperate and boreal forests. We compiled a subset of the available species that are most frequently used in large-scale chemistry transport models and filled missing values using those of Andreae and Merlet (2001) with annual updates (see Table 1). Updates are available at http://bai.acom.ucar.edu/Data/fire/ and will be incorporated into future GFED versions.
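Applying emission factors is a per-species multiplication of dry matter burned. The sketch assumes emission factors expressed in grams of species per kilogram of dry matter, a common convention; the numeric factors shown are hypothetical placeholders, not values from Table 1.

```python
def trace_gas_emissions(dry_matter_kg, emission_factors_g_per_kg):
    """Convert dry matter burned (kg) into species emissions (kg)."""
    return {species: dry_matter_kg * ef / 1000.0
            for species, ef in emission_factors_g_per_kg.items()}

# Hypothetical emission factors for one fire type (g species per kg dry matter).
example_factors = {"CO2": 1600.0, "CO": 60.0}
emitted = trace_gas_emissions(1000.0, example_factors)
```

In practice a separate set of factors would be looked up for each fire category, for example temperate versus boreal forest.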

Redistributing monthly emissions on daily and 3-hourly timescales
We made several improvements to the approach described by Mu et al. (2011) for redistributing monthly emissions to daily and 3-hourly time steps in each 0.25° grid cell. This set of higher temporal resolution emissions was created only for the period of 2003 to the present because of increased MODIS active fire data availability after the launch of Aqua.
To estimate the daily distribution of emissions, we used two sources of information: active fires from MCD14ML and the day of burning reported in the MCD64A1 burned area product. In tropical regions between 25° N and 25° S, we weighted the information content from these two sources equally in grid cells for which both data streams were available. In GFED3, the day of burning was not available for use as a constraint on daily variability. In the extra-tropics (poleward of 25° N and 25° S) we solely used active fires to distribute the daily pattern of emissions. In these regions, gaps between successive overpasses of Aqua and Terra are smaller, and active fires have been shown to be moderately effective in capturing daily variations in fire spread rates (Veraverbeke et al., 2014). We removed persistent active fire locations associated with volcanoes, gas flaring, and many other non-fire sources, using a more recent static hotspot database. A simple 3-day centered mean smoothing filter was applied in tropical regions to adjust for gaps in MODIS coverage, following Mu et al. (2011).
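The daily redistribution described above can be sketched as follows. The order of operations (combine, then smooth, then renormalize), the handling of the first and last day of the month, and the fallback when only one data stream is available are assumptions made for this illustration. All function names are hypothetical.

```python
def normalize(values):
    """Scale non-negative values so they sum to 1; return None if they sum to 0."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else None


def centered_mean_3day(values):
    """3-day centered mean; the first and last day average over the days available."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(i - 1, 0): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


def daily_fractions(active_fires, burn_day_area, tropical):
    """Fraction of the monthly emissions assigned to each day of the month."""
    fire_share = normalize(active_fires)
    if not tropical:
        return fire_share  # extra-tropics: active fires only
    area_share = normalize(burn_day_area)
    if fire_share is None or area_share is None:
        combined = fire_share or area_share  # only one data stream available
    else:
        combined = [0.5 * f + 0.5 * a for f, a in zip(fire_share, area_share)]
    if combined is None:
        return None
    return normalize(centered_mean_3day(combined))


# Hypothetical 5-day month in a tropical grid cell.
example = daily_fractions([0, 4, 0, 0, 0], [0, 0, 2, 0, 0], tropical=True)
monthly_emission = 10.0
daily = [monthly_emission * f for f in example]
print(daily)
```

Renormalizing after smoothing keeps the monthly total unchanged, so the filter only shifts emissions between neighbouring days.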
We created a climatological diurnal cycle of burning in each region and for different aggregated vegetation types to redistribute daily emissions on a 3 h time step. The approach is similar to the one described in Mu et al. (2011), and uses active fire data derived from full hemispheric scans of GOES-11 (west) and GOES-12 (east) observations during 2007-2009 with version 6.0 of the WF_ABBA algorithm (Prins et al., 1998; Reid et al., 2009). Here, we used an improved land cover type product from Friedl et al. (2010), MCD12C1 version 5.1, during 2007-2009 to create diurnal cycles of emissions for three aggregated vegetation classes within continental-scale regions in the western hemisphere. These diurnal cycles were then applied in other regions using the same mapping strategy as described in Mu et al. (2011). An example of the redistribution of emissions using this approach for daily and hourly emissions is shown in Fig. 7, with results broadly comparable to those in GFED3.
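Applying a climatological diurnal cycle to daily emissions amounts to multiplying each day's total by a fixed set of eight 3-hourly shares, as in the sketch below. The shares and the single vegetation class shown are hypothetical and are not derived from the GOES observations.

```python
# Hypothetical climatological diurnal cycle: share of daily emissions in each
# of the eight 3-hourly windows (local time). Values are illustrative only.
DIURNAL_CYCLE = {
    "savanna": [0.02, 0.02, 0.04, 0.17, 0.40, 0.25, 0.07, 0.03],
}


def three_hourly_emissions(daily_emission, vegetation_class, cycles=DIURNAL_CYCLE):
    """Split one day's emissions over eight 3 h steps using a fixed diurnal cycle."""
    shares = cycles[vegetation_class]
    total = sum(shares)  # renormalize in case the shares do not sum exactly to 1
    return [daily_emission * share / total for share in shares]


steps = three_hourly_emissions(24.0, "savanna")
print(steps)
```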

Results
Over the 1997-2016 period, fire emissions according to GFED4s are on average 2.2 Pg C yr⁻¹ with substantial interannual variability. In Sect. 4.1 we discuss the spatial pattern of burned area and the resulting emissions, and in Sect. 4.2 the temporal patterns. We then discuss the modeled fuel consumption (Sect. 4.3) and the greenhouse gas forcing of fires in Sect. 4.4. We also explain the main differences between GFED4s and GFED3 as well as differences in emissions between GFED4s and GFED4, with the latter derived from the same modeling framework but using the burned area dataset without small fires (i.e., with burned area from GFED4) (Sect. 4.5).

Spatial patterns
The spatial patterns of emissions and burned area are similar, but because fuel consumption is, in general, inversely related to fire frequency (Table 2), emissions are less spatially variable than burned area (Fig. 8). About 84 % of global carbon emissions originate in the tropics between 23.5° N and 23.5° S (1830 Tg C yr⁻¹), and 62 % come from tropical savannas (1341 Tg C yr⁻¹), underscoring the importance of fire as a driver of biogeochemical cycles and ecosystem processes in tropical ecosystems.
The relative importance of different regions or continents varies depending on whether one is considering burned area, carbon emissions, or trace gas emissions. For example, while Equatorial Asia (mostly Indonesia) is responsible for only 0.6 % of global burned area, the region accounts for 8 % of carbon emissions and 23 % of CH₄ emissions from global fire activity. Boreal forests offer a similar, although less extreme, example: 2.5 % of global burned area, 9 % of global fire carbon emissions, and 15 % of global fire CH₄ emissions. This difference is due to the large variability in fire behavior and fuel consumption in forested regions with high fuel loads, especially when fires consume organic soils. The larger contribution of coarse fuels and smoldering stages of combustion in organic soils also contributes to higher emission factors for reduced species such as CO and CH₄. More information on the relative contribution of the different regions is provided in Tables 2 and 3 for fire carbon emissions and in Table 1 for mean annual emissions of individual trace gases and aerosols. More time series information on individual trace gases and aerosols can be found at http://www.geo.vu.nl/~gwerf/GFED/GFED4/tables/.
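The contrast between shares of burned area and shares of emissions can be expressed as a simple ratio, using only the percentages quoted above. The ratio itself is an illustrative diagnostic introduced here, not a quantity reported in the tables.

```python
# Shares of the global totals quoted in the text (percent).
shares = {
    "Equatorial Asia": {"burned_area": 0.6, "carbon": 8.0, "CH4": 23.0},
    "Boreal forests": {"burned_area": 2.5, "carbon": 9.0, "CH4": 15.0},
}


def relative_intensity(region, quantity):
    """Ratio of a region's share of emissions to its share of burned area."""
    entry = shares[region]
    return entry[quantity] / entry["burned_area"]


for region in shares:
    print(region,
          round(relative_intensity(region, "carbon"), 1),
          round(relative_intensity(region, "CH4"), 1))
```

A ratio above 1 means that the region emits more per unit burned area than the global mean; the larger ratio for CH₄ than for carbon reflects the higher emission factors for reduced species in these regions.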

Temporal dynamics
Forest fires are the primary driver of interannual variability in fire emissions (Fig. 9, Table 3). In the tropics, much of this variability is linked with sea surface temperatures, including large-scale climate modes such as El Niño, which alter fire risk in tropical forests (Chen et al., 2016). El Niño years including 1997-1998, 2002, and 2015 have relatively large contributions from tropical forests. Peat burning in Equatorial Asia contributes substantially to anomalously high emissions in 1997 and 2015, in part due to the human-ignited fires that burn in drained peatlands during prolonged drought periods associated with El Niño (Field et al., 2016; van der Werf et al., 2008). Most of the interannual variability in emissions originates from regions outside of Africa, which is shown in the top right panel in Fig. 9.
August and September are usually the months with the highest emissions, coinciding with the main austral fire season (Fig. 10). This dominance of the Southern Hemisphere arises because Southern Hemisphere Africa has higher emissions than Northern Hemisphere Africa (especially during the latter part of our time period) and the deforestation regions south of the equator are larger and more active than those north of the equator. Finally, the August-September peak coincides with the burning season in the temperate and boreal Northern Hemisphere summer, which produces far more emissions than these eco-regions in the Southern Hemisphere summer. The inclusion of small fires does not influence these dynamics (Fig. 10), while the modified conversion of burned area to burned fraction of fuel causes a slight delay in the peak fire season, mostly in Africa (Fig. 11).

Fuel consumption
Modeled and measured (van Leeuwen et al., 2014) fuel consumption agree reasonably well when aggregated to biome levels (Fig. 12). Fuel consumption in savannas and other regions with herbaceous fuels is lower in GFED4 (both with and without small fires) than in GFED3 because of increases in the turnover rates of herbaceous leaf and surface litter pools. As a consequence, fuel consumption in GFED4 in savannas has decreased by 30 % compared to GFED3. Compared with the fuel consumption database from van Leeuwen et al. (2014), GFED4 produces estimates that are, on average, 14 % higher than the fuel consumption measured in the collocated grid cells. GFED4 also shows a somewhat lower range than the observations. Fuel consumption in tropical forests is substantially higher (45 %) than measured. However, measured fuel consumption typically does not account for repeated burning during the deforestation process, which can lead to complete combustion over a full fire season following multiple fires (Yokelson et al., 2007). In temperate forests, GFED4 average fuel consumption is 33 % below the measured values, while in boreal forests the model is 39 % higher. The discrepancy in temperate forests can be traced back to one very high measurement in Tasmania that is not reproduced in the collocated grid cell in GFED4; the medians are in close agreement. Pinpointing the reasons for the disagreement in boreal regions is less straightforward; the range, mean, and medians for the modeled values exceed the measured ones. One potential reason might be related to the relatively large number of experimental burns in the database of van Leeuwen et al. (2014) for this biome, which in general occur under conditions less favorable for large fires to prevent them from growing out of control.
For the state of Alaska, GFED4 estimates of fuel consumption are similar to estimates from the Alaska Large Fire Database that rely solely on fuel consumption observations from uncontrolled wildfires (Veraverbeke et al., 2015). The satellite-derived maps of tree mortality and combustion completeness led to an increase in fuel consumption in North America. On average, fuel consumption there is now 38 % higher than in boreal Asia for grid cells north of 55° N and with more than 20 % tree cover. For all other biomes the number of fuel consumption measurements is probably too small for a fair comparison.

Greenhouse gas forcing of fires and potential for mitigation
Fires emit the greenhouse gases CO₂, CH₄, and N₂O and also modify the climate by emitting aerosols and precursors of aerosols and ozone, and by changing surface properties such as albedo, often in complex ways (Ward et al., 2012). Average total annual greenhouse gas emissions according to GFED4s were 7.3 Pg CO₂, 16 Tg CH₄, and 0.9 Tg N₂O. Note that in this section we refer to C emissions in CO₂ mass units rather than the C mass units used in the rest of the paper. Using a 100-year time horizon and based on global warming potentials of 34 for CH₄ and 298 for N₂O (Myhre et al., 2013), this translates to 8.1 Pg CO₂ equivalent annually, or 23 % of global fossil fuel CO₂ emissions in 2014 (Boden et al., 2017; Le Queré et al., 2015). However, fire emissions are not generally a net CO₂ source to the atmosphere, and may be better viewed as "fast respiration", because regrowing vegetation in many burned areas will sequester a roughly equivalent amount of atmospheric CO₂ during post-fire stages of ecosystem recovery over a period of years to decades (Landry and Matthews, 2016). In general, only fires that are not balanced by regrowth are a net CO₂ source. The most obvious fire types in this category are fires used in the deforestation process or those that burn drained peatlands. CO₂ emissions from these two fire types are estimated here to be 0.4 Pg C or 1.3 Pg CO₂ per year. Including CH₄ and N₂O of all fire types, the contribution of fires to the greenhouse gas budget is 2.1 Pg CO₂ equivalent annually or 6 % of global fossil fuel CO₂ emissions in 2014 (Boden et al., 2017). Another category of fire emissions that may add to the build-up of atmospheric CO₂ are those that increase over time, for example increasing burned area or combustion completeness in boreal regions related to climate change. Our time series is too short and our modeling framework is too incomplete to capture the exact magnitude of emissions from a changing boreal fire regime.
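The CO₂-equivalent totals quoted above follow from weighting each gas by its 100-year global warming potential. The short sketch below reproduces that arithmetic from the rounded numbers in the text; the function and variable names are illustrative.

```python
GWP_100 = {"CO2": 1.0, "CH4": 34.0, "N2O": 298.0}  # values quoted in the text


def co2_equivalent_pg(co2_pg, ch4_tg, n2o_tg, gwp=GWP_100):
    """CO2-equivalent emissions in Pg CO2; CH4 and N2O are given in Tg (1 Pg = 1000 Tg)."""
    return co2_pg * gwp["CO2"] + (ch4_tg * gwp["CH4"] + n2o_tg * gwp["N2O"]) / 1000.0


# All fires: 7.3 Pg CO2, 16 Tg CH4, and 0.9 Tg N2O per year.
all_fires = co2_equivalent_pg(7.3, 16.0, 0.9)
# Net source: CO2 from deforestation and peat fires only, CH4 and N2O from all fires.
net_source = co2_equivalent_pg(1.3, 16.0, 0.9)
print(round(all_fires, 1), round(net_source, 1))
```

With these inputs, CH₄ and N₂O together add about 0.8 Pg CO₂ equivalent in both cases, so the difference between the two totals comes entirely from which CO₂ emissions are counted as a net source.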
Savanna fire season management has been proposed as a climate mitigation instrument (Russell-Smith et al., 2013). By burning early in the season instead of late, fires are in general more patchy, release fewer emissions, and prevent large late-season fires. According to GFED4s, total annual tropical savanna fire emissions averaged 4.9 Pg CO₂, 6 Tg CH₄, and 0.6 Tg N₂O. In this case, only CH₄ and N₂O emissions are relevant and combined account for 0.3 Pg CO₂ equivalent of annual emissions. Experiments with early burning in Australia have shown a potential reduction of up to 50 % (Walsh et al., 2014), but it is not known to what extent it is possible to use this approach in other regions, what the side effects will be, and whether some of the mitigation will be offset by higher CH₄ emission factors because early season fires may occur when fuels have had less time to cure. In Australia the latter is probably not the case (Meyer et al., 2012), but whether this is found in other regions remains to be investigated.

Differences between GFED4s, GFED4, and GFED3
In general, small fire burned area (GFED4s) and the modified burned-area-to-burned-fraction conversion (GFED4 and GFED4s) cause emissions to increase, while the optimization of fuel consumption causes emissions to decrease as compared with earlier versions of GFED. On a global scale, these modifications yield a modest net increase in fire carbon emissions in GFED4s as compared with GFED3 (11 % for the overlapping 1997-2011 time period). However, the effects of the three main adjustments vary spatially; on a regional scale the differences are larger (Fig. 13). The relative effect of the small fire burned area is largest in temperate and subtropical regions where agricultural waste burning and shifting cultivation are important drivers of fire activity. The more than doubling of burned area in Central America and Northern Hemisphere South America compared to GFED3 reflects differences in both GFED4 burned area and the inclusion of small fires (Fig. 13). Burned area in Temperate North America and Europe also increases by about a factor of 2, and most of this difference is due to small fire burned area. Our modifications to herbaceous fuel turnover rates cause fuel consumption per unit area (per m² of burned area) to decrease, whether or not small fire burned area is included, in all regions except Central Asia, where consumption increased by approximately 20 to 30 % (Fig. 13). Estimates of fuel consumption per unit area are similar in GFED4 and GFED4s, indicating that fuel loads in areas burned by small fires are not substantially different from those in nearby mapped burned areas (or that our relatively coarse modeling setup cannot resolve finer-scale landscape differences). The exception is Central Asia, where small fire burned area causes a relative increase in burned area in forested regions. In Central America and Equatorial Asia, in contrast, small fire burned area occurs predominantly in areas with relatively low fuel loads.
The modified burned-area-to-burned-fraction parameterization causes an increase of 5 % in carbon emissions (not shown). The new parameterization only influences grid cells that burn for more than 1 month in a season, and has a larger effect in grid cells that have a high burn fraction. Regions with frequent savanna fires therefore have the highest sensitivity, with emissions in Northern Hemisphere Africa, Southern Hemisphere Africa, and Australia increasing by 9, 8, and 6 %, respectively. In other regions, the differences are smaller than 2 %. In addition to the increase in emissions in frequently burning savannas, the new parameterization also changes the temporal dynamics (Fig. 11); early season emissions are lower because less fuel remains from the previous growing season, and late-season emissions are higher because the parameterization has the effect of increasing grid-cell-level fuel consumption later in the fire season.
Without small fire burned area, the impact of decreasing fuel consumption and a minor reduction in burned area (2 % globally) yields a total carbon emissions estimate of 1.5 Pg C yr⁻¹ in GFED4, a 23 % reduction compared to GFED3 during 1997-2011. Although globally GFED4 emissions are lower than GFED3, in some regions both burned area and emissions increase, mostly in temperate regions (Fig. 13). Using the new set of emission factors that separate extratropical forests into boreal forest and temperate forest components generates a larger increase in CO emissions in boreal regions than expected from the change in carbon emissions alone (Fig. 14).

Discussion
We have calculated global carbon emissions from fires by using a biogeochemical model to combine satellite fire observations with estimates of fuel consumption that respond to variations in environmental conditions. In a subsequent step, we have used a higher-resolution set of emission factors to convert carbon emissions into emissions of trace gases and aerosols. Since the publication of GFED3 in 2010, burned area algorithms have been improved considerably, and now include a preliminary estimate of the impact of small fires. In parallel, the fuel consumption database created by van Leeuwen et al. (2014) has enabled the development of an improved parameterization of herbaceous vegetation turnover in grassland and savanna ecosystems, and validation of our modeled values in several other biomes. New emission factor measurements and a more systematic assessment of the available data have led to a more consistent set of emission factors that better resolves extratropical forest biomes. Together, all of the elements required to calculate emissions following the Seiler and Crutzen (1980) paradigm have seen substantial improvements. Our new emission estimates are therefore more reliable than previous estimates because they account for updated information on key components of the fire emissions equation, but uncertainties remain substantial and are difficult to quantify.
The addition of small fire burned area is a key improvement in GFED4s compared to earlier versions, and the modifications we describe in this paper have improved our estimates compared to Randerson et al. (2012). However, the actual magnitude of small fire burned area is difficult to quantify on global scales because it requires a large sample of burned area measurements from sensors with a higher spatial resolution than MODIS. To date, Landsat estimates of burned area have been produced for various regions and purposes including the validation of coarser resolution data (Padilla et al., 2014, 2015; Roy and Boschetti, 2009; Silva et al., 2005), but a publicly available and global-scale database of Landsat burned area is needed to better validate ongoing efforts to produce reliable burned area estimates from coarser resolution satellite imagery. In addition, new missions such as the Visible Infrared Imaging Radiometer Suite (VIIRS) and Landsat-8 also increase the number of active fires detected compared to MODIS (Schroeder et al., 2014b).
A somewhat similar story exists with respect to validating fuel consumption. The fuel consumption database from van Leeuwen et al. (2014) has enabled a more systematic validation, but the number of studies is limited, relatively few measurements were made during our study period, and it is questionable to what degree the local measurements are representative of the 0.25° grid cell averages reported here. Thus, our estimates are likely to remain most useful for large-scale studies. Although recent regional studies have shown that our global modeling framework is indeed capable of generating reliable large-scale emissions in Alaska and the tropics, these studies also show that GFED may have problems capturing finer-scale dynamics (Veraverbeke et al., 2015). While improved satellite missions and combining various data streams may help in improving the fuel consumption parameterization in models, systematic field-based assessments of fuel consumption along gradients of productivity and other factors influencing variability in fuel consumption within biomes are a necessary step in further improving bottom-up fire emission estimates. New satellite estimates of biomass may be helpful in this regard (for example the Global Ecosystem Dynamics Investigation (GEDI) mission), particularly in deforestation and temperate forest and shrubland regions, where aboveground living biomass comprises a large component of fuel consumption.
Given the large uncertainties in bottom-up emission estimates in the past, top-down constraints have often been used to pinpoint discrepancies between modeled and measured atmospheric abundances of trace gases or aerosols. Carbon monoxide (CO) was most often used (Hooghiemstra et al., 2011; Huijnen et al., 2016) because fires are a major source of CO, its lifetime is relatively long, and column CO is measured from several satellite sensors. More recent work also includes other species such as formaldehyde, NO₂, and aerosol optical depth (Bauwens et al., 2016; Mebust et al., 2011; Petrenko et al., 2012). While providing additional information on strengths and weaknesses of inventories such as GFED, for example potentially missing late-season fires (Castellanos et al., 2014), the results of these studies are often contradictory (van Leeuwen et al., 2013), potentially due to the use of different atmospheric models and sources of observations. We would therefore respectfully argue that uncertainties in bottom-up and top-down approaches are overlapping. For example, carbon emissions from Indonesia during the 2015 high fire year according to GFED4s were almost 400 Tg C (Fig. 9, http://www.geo.vu.nl/~gwerf/GFED/GFED4/tables/GFED4.1s_C.txt). Two inversion studies using Measurements of Pollution in the Troposphere (MOPITT) CO measurements derived estimates that were either 100 Tg higher (Yin et al., 2016) or 100 Tg lower (Huijnen et al., 2016). Part of the difference can be attributed to the use of higher CO emission factors in the latter study, which thus requires less carbon burned to match atmospheric observations, but part is also due to differences in model setup and analysis design. The use of different top-down constraints (e.g. Infrared Atmospheric Sounding Interferometer (IASI) versus MOPITT) could lead to additional discrepancies, although studies employing column CO₂ from the Orbiting Carbon Observatory-2 (OCO-2) may avoid some of the issues related to uncertainty in emission factors. Heymann et al. (2017) provided evidence for lower estimates than found in GFED4s in Indonesia for 2015 based on OCO-2 data.
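The dependence of top-down carbon estimates on the assumed CO emission factor can be made explicit with a small sketch. The CO total, the two emission factors, and the carbon fraction of dry matter used below are hypothetical numbers chosen only to show the inverse relationship; they are not taken from the inversion studies cited above.

```python
def carbon_from_co(co_tg, ef_co_g_per_kg, carbon_fraction=0.5):
    """Carbon burned (Tg C) implied by a CO emission total (Tg CO).

    ef_co_g_per_kg: CO emission factor in g CO per kg dry matter.
    carbon_fraction: assumed carbon content of dry matter (hypothetical default).
    """
    dry_matter_tg = co_tg / (ef_co_g_per_kg / 1000.0)
    return dry_matter_tg * carbon_fraction


# Same hypothetical CO constraint, two hypothetical emission factors.
low_ef = carbon_from_co(co_tg=100.0, ef_co_g_per_kg=100.0)
high_ef = carbon_from_co(co_tg=100.0, ef_co_g_per_kg=200.0)
print(low_ef, high_ef)
```

For a fixed atmospheric CO constraint, doubling the emission factor halves the carbon that must have burned, which is why studies using different emission factors can both match the observations and still disagree on carbon emissions.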
Studies focusing on aerosol optical depth (AOD) do not give conflicting results but indicate that bottom-up estimates are roughly a factor of 3 too low (Johnston et al., 2012; Kaiser et al., 2012; Petrenko et al., 2012; Tosca et al., 2013). While some studies have therefore boosted bottom-up emissions or created new inventories with much higher emissions to get AOD values more in line with observations (Liousse et al., 2010), this may jeopardize the reasonable agreement between bottom-up and top-down estimates found for most trace gases. To date, the disagreement between measured and modeled AOD has most often been linked to bottom-up emissions, but AOD calculations in models are uncertain as well. For example, increasing the hygroscopicity reduced the offset in tropical regions (Reddington et al., 2016). Besides exploring the factors that are used to estimate AOD in models such as the hygroscopicity, combining multiple species in inversion studies and better emission factors are needed to resolve one of the most important questions in biomass burning emissions research.
Most of the emission factors (EFs) used in these top-down approaches are based on midday sampling during peak fire emission rates. The EFs measured under these somewhat restricted circumstances are still highly variable, with a coefficient of variation about the mean of about 40 % on average. The diurnal or longer-term variation in EFs should be larger but has not yet been explicitly well measured (Saide et al., 2015). The EFs of many species have rarely been measured in the field for important fire types such as wildfires and for some compound classes, with perhaps the most important missing species being the semi-volatile precursors to organic aerosol, which are difficult to measure even in lab experiments (Gilman et al., 2015). A related area of uncertainty is the temporal evolution of emissions within the fire plume. Only a few field studies have measured how organic aerosol (OA) levels change with time. In one, an increase in OA by a factor of about 2.5 was observed (Yokelson et al., 2009), while in another study OA decreased by about 20 % (Akagi et al., 2012). Understanding what controls secondary OA levels is critical to guide the proper use of AOD in inversions and to understand health and climate impacts.
Additional small errors also occur. In a straightforward application of the carbon mass balance method, the carbon content of the fuel that is actually volatilized is based on a few carbon content measurements of fuel subsamples. EFs are proportional to the carbon content used. This can theoretically cause an overestimation of the EFs by about 4 % if charcoal yields are important (Surawski et al., 2016). On the other hand, uncertainty in what ecosystem components actually burn means that the high carbon components can burn preferentially, leading to underestimated EFs if based on average fuel C content (Santin et al., 2015). In general these small uncertainties may tend to cancel out. EFs may also be systematically overestimated by 1-3 % because many carbon-containing species cannot yet be measured.
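The proportionality between EFs and the assumed fuel carbon content can be seen in a commonly used form of the carbon mass balance method, sketched below. The formulation, the plume composition, and the carbon fractions are assumptions for illustration and are not necessarily those used in the studies cited above.

```python
ATOMIC_MASS_C = 12.011  # g mol-1


def emission_factor(fuel_carbon_fraction, molar_mass, moles_species, moles_total_carbon):
    """Emission factor (g per kg dry fuel) from the carbon mass balance method.

    moles_species / moles_total_carbon is the molar ratio of the species to all
    carbon emitted; fuel_carbon_fraction is the mass fraction of carbon in the fuel.
    """
    return (fuel_carbon_fraction * 1000.0 * (molar_mass / ATOMIC_MASS_C)
            * (moles_species / moles_total_carbon))


# Hypothetical plume in which 8 % of emitted carbon is CO (molar mass 28.01 g mol-1).
ef_reference = emission_factor(0.50, 28.01, 0.08, 1.0)
ef_high_carbon = emission_factor(0.52, 28.01, 0.08, 1.0)  # carbon content 4 % higher
print(round(ef_reference, 1), round(ef_high_carbon / ef_reference, 2))
```

Because the carbon fraction enters as a plain multiplier, a 4 % error in the assumed carbon content translates directly into a 4 % error in every EF derived from the same plume.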
For GFED3, we performed a Monte Carlo simulation to estimate carbon emissions uncertainties based on assumed uncertainties of key input data including burned area and best-guess estimates of various model parameters. We now refrain from estimating formal uncertainties because of difficulties in assessing the uncertainties in the various layers. For example, the burned area in many regions where small fires seem to be important now by far exceeds the range of uncertainty reported for GFED3 burned area. Given the level of agreement between our burned area estimates and more refined regional estimates, and between our modeled biome-average fuel consumption estimates and those measured in the field, a best-guess uncertainty assessment at regional scales could be a 1σ of about 50 % in general but higher in areas where small fire burned area is important or where there is significant fuel consumption in organic soils.
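A Monte Carlo propagation of input uncertainties of the kind mentioned above can be sketched in a few lines. This is not the GFED3 procedure: the independent Gaussian errors, the 30 % relative uncertainties, and the reduction of the model to a product of two terms are all assumptions made for illustration.

```python
import random
import statistics


def monte_carlo_emissions(burned_area, fuel_consumption, rel_sigma_area,
                          rel_sigma_fuel, n_draws=10000, seed=0):
    """Draw emissions = burned area x fuel consumption with independent Gaussian errors."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Negative draws are clipped to zero because neither input can be negative.
        area = max(rng.gauss(burned_area, rel_sigma_area * burned_area), 0.0)
        fuel = max(rng.gauss(fuel_consumption, rel_sigma_fuel * fuel_consumption), 0.0)
        draws.append(area * fuel)
    return draws


draws = monte_carlo_emissions(1.0, 1.0, rel_sigma_area=0.3, rel_sigma_fuel=0.3)
mean = statistics.mean(draws)
relative_sigma = statistics.stdev(draws) / mean
print(round(mean, 2), round(relative_sigma, 2))
```

Even with only two uncertain inputs, the relative spread of the product is larger than that of either input, which illustrates why a regional 1σ on the order of tens of percent is plausible once several uncertain layers are multiplied.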
Lowering and/or better quantifying this uncertainty involves a thorough assessment of the burned area estimates and especially those from small fires, using more direct satellite observations of fire severity and fuel consumption based on FRP data, and new field data on fuel consumption and emission factors along critical gradients such as productivity and grazing intensity. Increasing the spatial resolution of our modeling framework could lower the impact of spatial heterogeneity in fire parameters and make for easier comparisons with or validation using ground-based data. Better understanding and modeling diurnal cycles may be equally important in addressing how variable, for example, the relative importance of flaming and smoldering combustion is. Finally, with new missions such as Suomi-NPP and the various Sentinel satellites now collecting data, an emphasis on merging various time series would help in lengthening the time series over which we have consistent data to over 20 years.

Data availability
GFED data are freely available at http://www.globalfiredata.org. The site provides documentation, related publications, updates, and online analysis tools to compute emissions for custom regions and countries.

Conclusions
We have revised the Global Fire Emissions Database using new observations of burned area including those from smaller fires as well as several other new data streams. In addition we have modified the fuel consumption parameterization in our model to better match observations. Global average fire emissions were estimated to be 2.2 Pg C yr⁻¹ over 1997-2016 with substantial interannual variability. This is an 11 % increase compared to our previous work (GFED3), and in regions where small fires are relatively important such as temperate cropland regions the increase could be as large as 100 %. Net greenhouse gas emissions from all fires were on average 6 % of global 2014 fossil fuel CO₂ emissions, consisting of 0.4 Pg C yr⁻¹ emissions from deforestation and tropical peat fires, which are a net CO₂ source to the atmosphere just like fossil fuel emissions, and 16 Tg CH₄ and 0.9 Tg N₂O yr⁻¹ from all fire types using a 100-year horizon to convert the warming potential of these greenhouse gases to CO₂ equivalents.
Over the past several years, uncertainties in all of the data layers used to calculate emissions (burned area, fuel consumption, and emission factors) have been reduced through new algorithms and greater data availability. While biome-level fuel consumption rates are now more in line with observations than in our previous work, uncertainties are still substantial at higher resolutions as indicated by regional studies. In addition, the small fire burned area approach carries substantial uncertainties and is known to be impacted by resampling error. Merging information from the long-term MODIS era with newer instruments could reduce some of these uncertainties, but carefully designed and interdisciplinary field campaigns measuring fuel consumption, fire dynamics, and emission factors along gradients and throughout fire seasons are equally necessary to further improve biomass burning estimates.