Exchanges of carbon, water and energy between the land surface and the atmosphere are monitored by the eddy covariance technique at the ecosystem level. Currently, the FLUXNET database contains more than 500 registered sites, and up to 250 of them share data (free fair-use data set). Many modelling groups use the FLUXNET data set for evaluating ecosystem models' performance, but this requires uninterrupted time series for the meteorological variables used as input. Because original in situ data often contain gaps, from very short (a few hours) up to relatively long (some months) ones, we develop a new and robust method for filling the gaps in meteorological data measured at site level. Our approach has the benefit of making use of continuous data available globally (ERA-Interim) at a high temporal resolution, spanning from 1989 to today. These data are, however, not measured at site level, and for this reason a method to downscale and correct the ERA-Interim data is needed. We apply this method to the level 4 data (L4) from the La Thuile collection, freely available after registration under a fair-use policy. The performance of the developed method varies across sites and is also a function of the meteorological variable. On average over all sites, applying the bias correction method to the ERA-Interim data reduced the mismatch with the in situ data by 10 to 36 %, depending on the meteorological variable considered. In comparison to the internal variability of the in situ data, the root mean square error (RMSE) between the in situ data and the de-biased ERA-Interim (ERA-I) data remains relatively large (on average over all sites, from 27 to 76 % of the standard deviation of in situ data, depending on the meteorological variable considered). The performance of the method remains poor for the wind speed field, in particular regarding its capacity to conserve a standard deviation similar to the one measured at FLUXNET stations.
The ERA-Interim reanalysis data de-biased at FLUXNET sites can be downloaded
from the PANGAEA data centre (
In the late 1970s and early 1980s, exchanges of carbon, water and energy between
the land surface and the atmosphere began to be monitored by the eddy
covariance technique at the ecosystem level (Desjardins and Lemon, 1974;
Anderson et al., 1984; Anderson and Verma, 1986; Ohtaki, 1984; Desjardins et
al., 1984; Baldocchi, 2003, for a review). Since this period, several
networks of eddy sites have been built, on regional or continental scales:
Euroflux in 1996 for Europe (Aubinet et al., 2000; Valentini et al., 2000),
AmeriFlux in 1997 for North America (Running et al., 1999), AsiaFlux in 1999
for Asia (Kim et al., 2009) and OzFlux in early 2000 for Australia. Currently,
most of these networks have evolved into long-term research infrastructures, such as the Integrated Carbon Observation System (ICOS) ( to quantify the spatial differences in carbon dioxide and water vapour
exchange rates that may be experienced within and across natural ecosystems
and climatic gradients; to quantify the temporal dynamics and variability of carbon, water and
energy flux densities; and to quantify the variations in carbon dioxide and water vapour fluxes due
to changes in insolation, temperature, soil moisture, photosynthetic
capacity, nutrition, canopy structure and ecosystem functional type.
These scientific goals have been largely addressed in several publications;
examples of studies published in recent years are Jung et al. (2010), Teuling et
al. (2010), Beer et al. (2010), Stoy et al. (2009) and Mahecha et al. (2010).
Many modelling groups have also used the FLUXNET data set for evaluating
models' performance at simulating energy, water and carbon exchanges
between the surface and the atmosphere. Krinner et al. (2005) evaluate the
temporal dynamics (mainly the mean diurnal cycle) of the sensible heat,
latent heat, net ecosystem exchange (NEE) and net radiation simulated by the
Organising Carbon and Hydrology In Dynamic Ecosystems (ORCHIDEE) model against
In most of these studies, where models are evaluated against in situ FLUXNET data, the aim is to assess the intrinsic performance of the models and to diagnose parameterization errors or missing processes. Consequently, one wants to make use of meteorological data measured at the FLUXNET sites, jointly with the flux data, to force the models in such a way that errors due to inaccurate meteorological forcing data are avoided. To complement this aim, other studies such as Zhao et al. (2012) examine how errors in meteorological variables impact simulated ecosystem fluxes at FLUXNET sites by using several reanalysis data sets (SAFRAN (Système d'Analyse Fournissant des Renseignements Atmosphériques à la Neige), REMO (Regional Model), ERA-Interim) and in situ data sets.
While models require uninterrupted time series for the meteorological variables used as input, original in situ data often contain gaps, from very short (a few hours) up to relatively long (some months) ones. The reasons why meteorological data are missing are few compared to those for flux data (Baldocchi et al., 2001). In the case of meteorological data, gaps are mainly due to calibration and maintenance operations or system breakdown, in particular at remote sites powered by solar panels. These gaps prevent the direct use of original in situ meteorological data as inputs to the models. A gapfilling procedure using adequate methods is consequently needed.
In some studies, simple gapfilling methods have been developed. For instance, in Blyth et al. (2010), “gap filling involved, for each precise time step that was missing, using the average of values from other years at the same time step”. In Stöckli et al. (2008), “up to two month long successive gaps were filled by applying a 30 day running mean diurnal cycle forwards and backwards through the yearly time series. Years with more than 2 month of consecutive missing data were not used”.
For long gaps, these simple methods may have strong limitations. Even if the evaluation of the modelled fluxes is only performed when in situ meteorological data are available, for some processes accounting for lag effects, periods where no in situ meteorological data are available may have an important impact on modelled fluxes over later periods, when meteorological data are available.
Other studies develop more sophisticated gapfilling procedures. For example,
methods based on the relations between variables, such as artificial neural
networks or look-up tables (e.g. the one presented in Papale, 2012), which are
generally applied to fill gaps in the fluxes, can also be used successfully
for gaps in meteorological data. The problem, however, is that during gap
periods in meteorological data all the variables are often missing, so these
methods cannot be applied. Krinner et al. (2005) used the ECMWF ERA15
The main limitations of these more sophisticated gapfilling methods are the lack of tools for evaluating their performances and a non-standardized application.
To overcome these limitations, we develop a new, robust and powerful method,
making use of the ERA-Interim reanalysis for filling the gaps in
meteorological data measured at FLUXNET sites. This approach has the benefit
of making use of continuous data available globally (ERA-Interim) at a high
temporal resolution, spanning from 1989 to today. The ERA-Interim reanalysis
performs well in simulating most of the atmospheric variables that are used
for the gapfilling method presented here (Dee et al., 2011), but
precipitation is overestimated in tropical areas (Dee et al., 2011; Balsamo
et al., 2015) compared to observation-based estimates of the GPCP (Global
Precipitation Climatology Project; Adler et al., 2003). Zhao et al. (2012)
and Balzarolo et al. (2014) have shown that using raw ERA-Interim data
instead of local atmospheric observations has little or no impact on the
scores of the simulations of a land surface model with respect to local
observations of
We first present the data sets used (the FLUXNET data set and the ERA-Interim reanalysis) and the methods developed for filling the gaps. We then present the results of our gapfilling procedure for the overall fair-use data set of FLUXNET sites and discuss the potential use of this method for the ecosystem modelling community and its main limitations.
We use level 4 data (L4) from the La Thuile collection
(
FLUXNET data are given in Coordinated Universal Time (UTC).
The time (
The variables are classified into two main groups:
instantaneous: this group includes air temperature, vapour pressure
deficit and wind speed, which are state variables for which the instantaneous
measurement is relevant as is;
averaged: this group includes radiation and precipitation, for which the
relevant value is a flux measured over a time range.
Timestamps in the data indicate the time of measurement in the case of
“instantaneous” variables, and in the case of “averaged” variables, the
end of the averaging period, which is, in general, 30 min (i.e. the first
record of the year corresponds to 01 January, 00:30 for the instantaneous
variables and to 01 January, 00:00–00:30 for the averaged variables).
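The timestamp convention above can be illustrated with a minimal sketch. The function name `averaging_window`, the constant `STEP` and the example year are hypothetical and only serve the illustration; the 30 min step is the general case mentioned in the text, not a guarantee for every site.

```python
from datetime import datetime, timedelta

# Assumed record length: 30 min, the general case described in the text.
STEP = timedelta(minutes=30)

def averaging_window(timestamp, step=STEP):
    """Return (start, end) of the averaging period closed by `timestamp`.

    For "averaged" variables the timestamp marks the END of the period,
    so the period starts one step earlier.
    """
    return timestamp - step, timestamp

# First record of an arbitrary example year.
first_record = datetime(2005, 1, 1, 0, 30)
start, end = averaging_window(first_record)
print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"))
```

For an "instantaneous" variable the same timestamp would simply be read as the measurement time, with no window attached.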
The ERA-Interim (ERA-I) is the latest reanalysis (Dee et al., 2011) from the
European Centre for Medium-Range Weather Forecasts (ECMWF). It is available
from 1989 to the present, on a regular grid (0.7
We first change the units of some ERA-I variables to agree with FLUXNET
units:
The Magnus–Tetens relationship (Murray, 1967) is used to calculate
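As an illustration of a Magnus–Tetens type calculation, the sketch below computes saturation vapour pressure and derives a vapour pressure deficit from air temperature and dew point temperature. The coefficients are the commonly quoted Tetens values over liquid water and may differ from those actually used in this study; the choice of dew point as input, the hPa unit and both function names are assumptions made for the example.

```python
import math

def saturation_vapour_pressure_hpa(temp_c):
    """Saturation vapour pressure over water (hPa) for a temperature in deg C.

    Commonly quoted Tetens coefficients (assumption, not taken from the paper).
    """
    return 6.1078 * math.exp(17.269 * temp_c / (temp_c + 237.3))

def vpd_hpa(air_temp_c, dew_point_c):
    """Vapour pressure deficit (hPa): saturation minus actual vapour pressure.

    The actual vapour pressure is taken as the saturation value at the
    dew point temperature (assumed input for this sketch).
    """
    return (saturation_vapour_pressure_hpa(air_temp_c)
            - saturation_vapour_pressure_hpa(dew_point_c))

print(round(vpd_hpa(20.0, 10.0), 2))
```

When air temperature equals dew point, the deficit is zero by construction, which is a convenient sanity check for any implementation of this kind.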
In order to compare ERA-I and FLUXNET data at similar time steps, original
FLUXNET meteorological variables, denoted by
For the instantaneous fields (
When
Appendix A gives an application of each pseudo-algorithm defined in this
paper for a site located in time zone UTC
For the averaged fields (
We denote the original ERA-I meteorological data by
For the precipitation field, we do not expect the timing of precipitation
in the ERA-I data set to be accurate enough for the linear
regression between
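A correction of the cumulative precipitation alone can be sketched as follows. This is a minimal illustration assuming a single multiplicative factor computed over the time steps where in situ data exist; the exact formulation used in the study is not reproduced here, and the function name and the use of `None` for missing observations are hypothetical.

```python
def scale_precipitation(era, obs):
    """Rescale ERA-I precipitation so that its cumulative amount matches
    the observed one over the time steps where observations exist.

    era: sequence of ERA-I values; obs: same length, None where missing.
    Assumes a single multiplicative factor (illustrative choice).
    """
    pairs = [(e, o) for e, o in zip(era, obs) if o is not None]
    era_total = sum(e for e, _ in pairs)
    obs_total = sum(o for _, o in pairs)
    if era_total == 0:
        return list(era)  # no ERA-I precipitation to rescale
    factor = obs_total / era_total
    return [e * factor for e in era]

era = [1.0, 0.0, 2.0, 1.0]
obs = [1.0, 0.0, None, 2.0]   # third time step is a gap
print(scale_precipitation(era, obs))
```

Such a factor leaves the timing of the ERA-I events untouched and only adjusts their amounts, including during gaps in the observations.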
In order to use the de-biased meteorological fields of the ERA-I data set to fill the gaps in the meteorological fields of the FLUXNET data set, they need to be interpolated from the original 3-hourly time step to the half-hourly time step.
For the instantaneous fields (all fields, except for the global radiation, the
longwave radiation and the precipitation fields), the 3-hourly data are
simply linearly interpolated in order to reconstruct a diurnal cycle at a
half-hourly resolution. The half-hourly de-biased field of the ERA-I data set is
denoted
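The linear interpolation from 3-hourly to half-hourly time steps can be sketched as below. The function name is hypothetical, and the sketch ignores time-zone shifts and gaps; it only shows how six half-hourly steps are placed on the straight line between two consecutive 3-hourly values.

```python
def interpolate_to_half_hourly(values_3h):
    """Linearly interpolate 3-hourly values onto a 30 min grid.

    Returns the values from the first to the last 3-hourly point,
    both included (6 half-hourly steps per 3 h interval).
    """
    if not values_3h:
        return []
    steps = 6  # number of 30 min steps in 3 h
    out = []
    for left, right in zip(values_3h, values_3h[1:]):
        for k in range(steps):
            out.append(left + (right - left) * k / steps)
    out.append(values_3h[-1])
    return out

# Air temperature (deg C) at 00:00 and 03:00, as an arbitrary example.
print(interpolate_to_half_hourly([0.0, 6.0]))
```

Because the interpolated values always lie between neighbouring 3-hourly values, sub-3-hourly extremes cannot be recovered this way, which is relevant when judging the reconstructed diurnal cycle.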
For the global radiation,
In order to evaluate the gapfilling method, we compare the
We first combine the root mean square error (RMSE) and the standard deviation
(SD) into two metrics in order to evaluate how the gapfilling
method performs:
We also evaluate how the standard deviation of the ERA-Interim products
before and after correction differs from that of the FLUXNET data set by
calculating normalized standard deviations
(SD(
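The evaluation metrics can be sketched in a few lines. The definitions below are assumptions inferred from the wording of the abstract (a reduction of the mismatch in percent, and an RMSE expressed as a percentage of the in situ standard deviation) and may differ in detail from the equations of the study; all function names are hypothetical.

```python
import math

def rmse(model, obs):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def sd(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def error_reduction(raw, corrected, obs):
    """Assumed definition: relative decrease in RMSE after correction (%)."""
    return 100.0 * (rmse(raw, obs) - rmse(corrected, obs)) / rmse(raw, obs)

def relative_error(corrected, obs):
    """Assumed definition: RMSE after correction relative to SD of obs (%)."""
    return 100.0 * rmse(corrected, obs) / sd(obs)

def normalized_sd(model, obs):
    """Ratio of the model SD to the observed SD."""
    return sd(model) / sd(obs)

obs = [1.0, 2.0, 3.0, 4.0]
raw = [3.0, 4.0, 5.0, 6.0]        # constant bias of +2
corrected = [2.0, 3.0, 4.0, 5.0]  # residual bias of +1
print(error_reduction(raw, corrected, obs), normalized_sd(raw, obs))
```

In this toy case the RMSE is halved while the normalized standard deviation stays at 1, which shows that the two families of metrics capture different aspects of the correction.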
Lastly, we specifically evaluate the diurnal cycle interpolated from the
3-hourly de-biased meteorological fields of the ERA-I data set. To this end,
two new time series have been constructed from
Figure 1. Distribution across sites of the error reduction (top panel) and
relative error (bottom panel) of the bias correction method for air temperature, vapour pressure deficit, wind speed, global radiation and
longwave incoming radiation. The box extends from the lower (25 %) to
upper quartile (75 %) values of the data, with a red line at the median.
The whiskers extend from the box to show the range of the data within 1.5
The mean error reduction for air temperature over all sites equals 14 % (Fig. 1). Scores vary significantly across sites. For most sites, the error reduction is less than 40 % (Fig. 1), showing that most of the mismatch between downscaled and measured data is due to non-systematic bias that our correction approach cannot account for. Sites for which the error reduction is higher than 40 % (IT-LMa, IT-Col, IT-Pia, ES-ES1, ES-ES2 and AT-Neu; Fig. 1) are mountain sites or located near the coast, locations where the meteorological local conditions (as recorded by the meteorological stations at FLUXNET sites) and meteorological conditions provided by ERA-Interim may vary the most.
Figure 2. Distribution across sites of the normalized standard deviation of
the ERA-I data before (left) and after (right) bias correction for
air temperature, vapour pressure deficit, wind speed, global radiation and
longwave incoming radiation. The box extends from the lower (25 %) to
upper quartile (75 %) values of the data, with a red line at the median.
The whiskers extend from the box to show the range of the data within
1.5
The relative error for air temperature varies across sites from low values
(13 % for RU-Ha2 and CA-NS3) up to 50 % or more (BW-Ghg, BW-Ghm, BR-Sa3, ID-Pag,
US-Wi7). Sites where the relative error is low are located in continental
regions where the air temperature varies significantly (by more than
40
The error reduction for the VPD (vapour pressure deficit) signal is of the same order as the one obtained
for air temperature (mean value of 14 %, maximum of up to 60 %), but the
relative error is much larger (mean value of 52 %), with only a few sites
having a relative error less than 40 %. The large relative error, which
reflects the difficulty of correcting the ERA-Interim signal, might be partly
due to the way we calculate VPD for ERA-Interim. It is inferred from the
Wind speed is the meteorological field for which the error reduction is the
largest (mean value of 36 %). This large bias correction mainly reflects
the fact that the reference heights at which the wind speed data are provided
by ERA-Interim (10
The mean error reduction over all sites for global radiation equals 11 % (with only 21 sites having an error reduction higher than 20 %). The global radiation is the field for which the mean error reduction is the lowest. The highest error reductions are obtained for the sites US-Wi7 and US-Wi8, whose global radiation values appear abnormally low, especially when compared to nearby sites such as US-Wi4 or US-Wi5. This could be due to a problem in the units of the original data or in the data processing and correction before their publication in the La Thuile collection. The relative error after bias correction for global radiation (mean value of 34 %) is of the same order as the one obtained for air temperature (mean value of 27 %), but it varies much less across sites.
Figure 3. Distribution across sites of the error on the mean annual
precipitation as measured at FLUXNET stations when using the ERA-I product, in
absolute (mm yr
The longwave incoming radiation has a mean error reduction and relative error similar to the VPD field (17 and 57 %, respectively), with large site-to-site variations.
Figure 2 represents the normalized standard deviation (NSD) of the ERA-I products (Ta, VPD, WS, Rg and LWin) before and after the bias correction and, consequently, gives insights into how the de-biasing procedure impacts the internal variability of the meteorological fields (in comparison with the measured variability). Overall, the bias correction tends to reduce the spread of the NSD across sites. This is especially true for the global radiation field. The mean NSD is not significantly modified by the bias correction for air temperature (mean NSD before correction of 0.91 compared to 0.87 after correction) and global radiation (1.06 compared to 0.93). By contrast, the bias correction negatively affects the NSD of the vapour pressure deficit (mean NSD of 0.94 vs. 0.77), the wind speed (mean NSD of 0.98 vs. 0.65) and the longwave incoming radiation (mean NSD of 0.80 vs. 0.64) from ERA-I. These negative impacts show the limits of a bias correction method based on linear regression for meteorological fields for which the bias between FLUXNET and ERA-I data does not vary linearly.
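The loss of variability after a regression-based correction can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of the in situ values on the ERA-I values, which may differ from the exact regression set-up of the study; the data are invented and all names are hypothetical. Under that assumption, the NSD of the corrected series equals the absolute value of the correlation between the two series, so it falls below 1 whenever the correlation is imperfect.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def fit_linear(era, obs):
    """Ordinary least-squares fit obs ~ slope * era + intercept."""
    mx, my = mean(era), mean(obs)
    sxx = sum((x - mx) ** 2 for x in era)
    sxy = sum((x - mx) * (y - my) for x, y in zip(era, obs))
    slope = sxy / sxx
    return slope, my - slope * mx

def correlation(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (sd(a) * sd(b))

# Invented example values (e.g. wind speed in m/s).
era = [2.0, 4.0, 6.0, 8.0, 10.0]
obs = [1.0, 3.5, 2.5, 6.0, 4.5]

slope, intercept = fit_linear(era, obs)
corrected = [slope * x + intercept for x in era]
nsd_after = sd(corrected) / sd(obs)
r = correlation(era, obs)
print(round(nsd_after, 3), round(abs(r), 3))
```

Read this way, a low NSD after correction, as reported for wind speed, is consistent with a weak linear relation between the reanalysis and the in situ series.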
Regarding the precipitation field, for which we only correct for the
cumulative flux over the observation period, the error reduction can be
large, both in terms of absolute and relative values. Figure 3 and Table B1
show the distribution across sites of the error on the mean annual precipitation
(MAP) field when using ERA-Interim
precipitation fields, in absolute values (
Figure 4. Taylor diagram representing the NSD and correlation (
We evaluate here how well the interpolation of the ERA-I data from the original 3-hourly to half-hourly time steps performs (Fig. 4).
For air temperature, on average, over all sites, the mean correlation (
For vapour pressure deficit, the model–data agreement in terms of diurnal
cycle is lower than the one obtained for air temperature: mean
Wind speed is the meteorological variable for which the diurnal cycle
inferred from the ERA-I data set is least in agreement with the observations
(mean
The diurnal cycle of the global radiation inferred from the ERA-I data set is
in very good agreement with the observed one. None of the sites have values
lower than 0.8 and 0.75 for
The diurnal cycle for the incoming longwave radiation does not match the
observed one, with mean values across sites of 0.51 and 0.64 for
The method presented in this study has shown its capacity for filling the gaps in meteorological data collected at FLUXNET sites. The performance of the method varies across sites and is also a function of the meteorological variable. The results, however, show that when large gaps are present, the proposed methodology is the best available strategy (when no nearby stations are present). Nevertheless, the performance of the method remains poor for the wind speed field, in particular regarding its capacity to conserve a standard deviation similar to the one measured at FLUXNET stations. A significant effort should be undertaken to improve the bias correction method, which could in the future be based on a non-linear fit between the ERA-I and FLUXNET data sets. In addition, some methodological issues remain, which are discussed below.
The method presented in this study is based on the assumption that the ERA-I data contain some biases that we can correct in order to better match local meteorological information at FLUXNET sites. Nevertheless, one may ask whether, for some specific variables at some sites, the diagnosed ERA-I vs. FLUXNET bias reveals a problem in the FLUXNET measurements rather than a bias within the ERA-I data. As presented in Sect. 3, this is possibly the case for, among others, the precipitation field for different sites, the global radiation (e.g. for site US-Wi8) and the air temperature (site US-Wi7). It is not our purpose to point out particular sites but rather to highlight that our method and the associated graphical tools may also serve to support data-quality controls.
As underlined in Sect. 1, the FLUXNET data set is highly valuable for modelling purposes in order to evaluate how terrestrial ecosystem models perform at site level. In order to get the most valuable information at site level, it would be of interest to add the atmospheric pressure field to the standard FLUXNET data sets. Even if atmospheric pressure varies only slightly over time, this variable is a required input of many ecosystem models, and it would be preferable to use the data measured locally instead of relying only on reanalysis data. Similarly, measurement and vegetation heights are key parameters for modelling the turbulent fluxes within and at the top of the canopy; these are not yet routinely available for all the sites in the FLUXNET data set. In our method, we bias-correct the ERA-I wind speed at a height of 10 m to better match the observed values at site level, without knowing the height at which these observations have been collected. Using default values for vegetation and measurement heights may strongly limit the accuracy of some modelled energy fluxes (latent and sensible heat fluxes).
We provide here a numerical application of the main equations used in the
pseudo-algorithms developed in this study for the first day of a data set for
a site located in the time zone UTC
Numerical application of the main equations used in the pseudo-algorithms based on the records from the ERA-Interim data set.
Numerical application of the main equations used in the pseudo-algorithms based on the records from the FLUXNET data set.
Error reduction (ER, %) and relative error (RE, %) of the
bias correction method for air temperature (Ta), vapour pressure deficit
(VPD), wind speed (WS), global radiation (Rg), longwave incoming radiation (LWin) and mean annual precipitation (mm yr
The authors sincerely thank the ECMWF for providing ERA-Interim reanalysis. The authors also thank Frédéric Chevallier, Fabienne Maignan and Nicolas Viovy for their help in using the ERA-Interim reanalysis data set. The authors thank the sites' PIs and staff for making the meteorological data used in this study available. These data were acquired by the FLUXNET community and in particular by the following networks: AmeriFlux (US Department of Energy, Biological and Environmental Research, Terrestrial Carbon Program (DE-FG02-04ER63917 and DE-FG02-04ER63911)), AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada (supported by CFCAS, NSERC, BIOCAP, Environment Canada and NRCan), GreenGrass, KoFlux, LBA, NECC, OzFlux, TCOS-Siberia and USCCC. We acknowledge the financial support to the eddy covariance data harmonization provided by CarboEuropeIP, FAO-GTOS-TCO, iLEAPS, the Max Planck Institute for Biogeochemistry, National Science Foundation, the University of Tuscia, Université Laval, Environment Canada and the US Department of Energy and the database development and technical support from Berkeley Water Center, Lawrence Berkeley National Laboratory, Microsoft Research eScience, Oak Ridge National Laboratory, the University of California – Berkeley, and the University of Virginia. Dario Papale is grateful for support from the GeoCarbon EU project. Edited by: G. König-Langlo