Global radiation, photosynthetically active radiation, and the diffuse components dataset of China, 1981–2010

Solar radiation, especially photosynthetically active radiation (PAR), is the main energy source of plant photosynthesis, and the diffuse component can enhance canopy light use efficiency, thus increasing ecosystem productivity. To predict terrestrial ecosystem productivity precisely, we not only need global radiation and PAR as driving variables, but also need to treat diffuse radiation and diffuse PAR explicitly in ecosystem models. Therefore, we generated a series of radiation datasets, including global radiation, diffuse radiation, PAR, and diffuse PAR of China from 1981 to 2010, based on the observations of the China Meteorology Administration (CMA) and the Chinese Ecosystem Research Network (CERN). The dataset should be useful for analyzing the spatiotemporal variations of solar radiation in China and, through ecosystem models, the impact of diffuse radiation on terrestrial ecosystem productivity. The dataset is freely available from Zenodo at https://zenodo.org/record/1198894 (DOI:10.11922/sciencedb.555).


Introduction
Solar radiation is the primary energy source for life on Earth (Wild, 2009), and the portion of global radiation with wavelengths of 400–700 nm, i.e., photosynthetically active radiation (PAR), is critical for vegetation photosynthesis. Therefore, global radiation or PAR is a prerequisite for the modeling of terrestrial ecosystem productivity (Jacovides et al., 2007). Besides the quantity, the composition of global radiation and PAR, i.e., the proportion of diffuse and direct components, is also important (Farquhar and Roderick, 2003; Lauret et al., 2010), since diffuse radiation can reduce photosynthetic saturation and increase the canopy light use efficiency, thereby enhancing the ecosystem carbon uptake (Kanniah et al., 2012; Mercado et al., 2009). The explicit treatment of diffuse radiation in ecological models is needed to accurately simulate the carbon dynamics of terrestrial ecosystems, making diffuse radiation or diffuse PAR an important environmental driving factor (Gu et al., 2003; Kanniah et al., 2012; Mercado et al., 2009). The effects of diffuse radiation on ecosystem productivity have become a hot issue in carbon cycle research (Alton et al., 2007; Gu et al., 2002, 2003; Mercado et al., 2009; Zhang et al., 2011, 2017). However, in China, global radiation, PAR, diffuse radiation, and diffuse PAR are not measured as routinely as other meteorological variables such as sunshine duration (Ren et al., 2013, 2014). Therefore, a long-term, high-quality reanalysis dataset of global radiation, PAR, diffuse radiation, and diffuse PAR is required for a better understanding of the ecosystem carbon dynamics in China as well as their spatial and temporal variability.
Globally, a widespread decrease in solar radiation between the 1950s and 1980s has been detected, known as global dimming, with a partial recovery thereafter at many locations, known as global brightening (Wild, 2009; Wild et al., 2005). Has China, one of the largest countries in the world, experienced the same variation? Employing the observation data from national meteorological stations of the China Meteorology Administration (CMA) and the field sites of the Chinese Ecosystem Research Network (CERN), Ren et al. (2013, 2014) parameterized the estimation models of global radiation, diffuse radiation, PAR, and diffuse PAR, and performed cross validation, which indicated high estimation accuracy. They then generated the radiation dataset for China from 1981 to 2010 and analyzed its spatiotemporal variations. This dataset has been employed to estimate the above-ground biomass and net ecosystem productivity of alpine grasslands in the Three-River Headwaters Region (Ren et al., 2017a; Zeng et al., 2017) and the gross primary productivity of the alpine grasslands on the Tibetan Plateau (He et al., 2014; Ren et al., 2017b). We have published the monthly diffuse PAR spatial dataset in China Scientific Data, and have briefly introduced the estimation method of diffuse PAR (Ren et al., 2017c).
In this paper, we systematically describe the estimation and interpolation methods of global radiation, diffuse radiation, PAR, and diffuse PAR, and provide the estimated values of model parameters as well as the accuracy of estimation and interpolation. The spatial dataset of monthly and yearly global radiation, diffuse radiation, PAR, and diffuse PAR in China from 1981 to 2010 (hereafter called the radiation dataset) is shared in this paper, providing integral radiation data for ecological modeling and the analysis of the effects of diffuse radiation on terrestrial ecosystem productivity.

Data and method
The schematic workflow of the radiation dataset production is shown in Fig. 1. Since the observation sites of sunshine duration are widely distributed in China (756 sites), we first expanded the observation data of global radiation, diffuse radiation, and PAR from 122 sites, 81 sites, and 39 sites, respectively, to 756 sites based on sunshine duration data through estimation models. Then diffuse PAR, which has few observation sites in China, was estimated through the empirical relationships with global radiation, diffuse radiation, and PAR. Finally, ANUSPLIN software (Hutchinson, 2001) was employed to acquire the spatial global radiation, diffuse radiation, PAR, and diffuse PAR data. The details are described in the following sections.

Basic data
The basic data used here include daily sunshine duration, global radiation, diffuse radiation, and PAR observation data (Fig. 2), as well as digital elevation model (DEM) data. The observed daily sunshine duration data (756 meteorological stations), global radiation data (122 meteorological stations), and diffuse radiation data (81 meteorological stations) during 1981 to 2010 were provided by the CMA (http://data.cma.cn/en, last access: 2 July 2018). We also used the daily global radiation and PAR data observed in 39 CERN field sites from 2004 to 2010 (http://www.cern.ac.cn, last access: 2 July 2018). The DEM data (500 m × 500 m) were from the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences (http://www.resdc.cn, last access: 2 July 2018). It should be noted that due to the station adjustment of the CMA in 1993, the number of stations observing diffuse radiation dropped from more than 70 to 17 after 1993. In sum, there are 81 stations that have more than a 1-year record of diffuse radiation during the period of 1981 to 2010 (Ren et al., 2013).

Quality control of observation data
Quality control is an important part of reanalysis dataset production. Observations with poor quality may bias the parameter values of estimation models, thus affecting the quality of the generated dataset. The CMA and CERN have performed basic quality control on the observational data (Shi et al., 2008). We made further quality checks according to the following criteria. (1) Daily sunshine duration cannot be longer than the daily possible sunshine duration, which is calculated according to the geographical latitude and day of year. (2) Daily extraterrestrial radiation must be larger than daily global radiation, whereby extraterrestrial radiation is determined by the solar constant, the geographical latitude, and the day of year. (3) Daily PAR cannot exceed daily global radiation, and the ratio between daily PAR and daily extraterrestrial radiation cannot be larger than 40 % (Tsubo and Walker, 2005). (4) Daily diffuse radiation cannot exceed daily global radiation, and needs to satisfy the requirements for overcast and clear skies described by Eq. (1) (Reindl et al., 1990; Ren et al., 2013). The numbers of sites with valid observational data for each variable per year from 1981 to 2010 are shown in Fig. 3.
where Q_d, Q, and Q_0 represent daily diffuse radiation, daily global radiation, and daily extraterrestrial radiation, respectively.
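The four criteria above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' code: the possible sunshine duration and extraterrestrial radiation use the widely used FAO-56 style astronomical formulas with an assumed solar constant of 1367 W m^-2, and the overcast and clear-sky thresholds standing in for Eq. (1) are hypothetical defaults that should be replaced by the values given in Reindl et al. (1990) and Ren et al. (2013). The function and parameter names are illustrative.

```python
import math

SOLAR_CONSTANT = 1367.0  # W m^-2; assumed value, not stated in the text


def day_length_and_q0(lat_deg, doy):
    """Return (possible sunshine duration in h, extraterrestrial radiation in MJ m^-2 day^-1)."""
    phi = math.radians(lat_deg)
    decl = 0.409 * math.sin(2.0 * math.pi * doy / 365.0 - 1.39)  # solar declination (rad)
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * doy / 365.0)     # inverse relative Earth-Sun distance
    # Clamp the argument so that polar day or night does not break acos.
    x = max(-1.0, min(1.0, -math.tan(phi) * math.tan(decl)))
    ws = math.acos(x)                                            # sunset hour angle (rad)
    n_possible = 24.0 / math.pi * ws
    q0 = (86400.0 * SOLAR_CONSTANT / math.pi) * dr * (
        ws * math.sin(phi) * math.sin(decl)
        + math.cos(phi) * math.cos(decl) * math.sin(ws)
    ) / 1.0e6
    return n_possible, q0


def passes_quality_control(lat_deg, doy, n=None, q=None, q_d=None, par=None,
                           kt_overcast=0.2, kd_overcast_min=0.9,
                           kt_clear=0.6, kd_clear_max=0.8):
    """Apply criteria (1)-(4) to one station-day; radiation in MJ m^-2 day^-1, n in hours.

    The four threshold keyword arguments are hypothetical placeholders for Eq. (1).
    """
    n_possible, q0 = day_length_and_q0(lat_deg, doy)
    if q0 <= 0.0:
        return False
    if n is not None and n > n_possible:        # (1) sunshine duration vs. possible duration
        return False
    if q is not None and q >= q0:               # (2) global vs. extraterrestrial radiation
        return False
    if par is not None:                         # (3) PAR vs. global and extraterrestrial radiation
        if q is not None and par > q:
            return False
        if par / q0 > 0.40:
            return False
    if q_d is not None and q is not None:       # (4) diffuse vs. global radiation
        if q <= 0.0 or q_d > q:
            return False
        kt, kd = q / q0, q_d / q
        if kt < kt_overcast and kd < kd_overcast_min:   # overcast sky: mostly diffuse expected
            return False
        if kt > kt_clear and kd > kd_clear_max:         # clear sky: mostly direct expected
            return False
    return True
```

For example, `passes_quality_control(40.0, 180, n=8.0, q=20.0, q_d=6.0, par=9.0)` accepts the record, whereas a record with `n=20.0` at the same place and date is rejected because no day at 40° N is that long.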

The expansion and estimation of radiation data at a site scale
The coverage of radiation stations in the CMA network is limited; thus we used estimation models to expand daily global radiation, diffuse radiation, and PAR at a site scale based on the widely distributed daily sunshine duration observations. Diffuse PAR is not generally measured; thus it is estimated using the empirical relationships with global radiation, diffuse radiation, and PAR. Due to the highly heterogeneous topography and climate of China, we estimated the model parameters for eight different geographical regions according to the Chinese Physical Geography Division (Zhao, 1997), including Northwest China, Inner Mongolia, Northeast China, North China, Central China, South China, Southwest China, and the Qinghai-Tibet Plateau. Meanwhile, aerosols can influence both the total amount of solar radiation and its partitioning into direct and diffuse components (Kanniah et al., 2012), and the relative importance of aerosols may differ among regions due to the different levels of human activities (Wild, 2009), which further justifies the separate data expansion for different regions.
1. The expansion of global radiation. A comparison of multiple global radiation models indicated that global radiation in China is estimated more accurately from sunshine duration than from other predictors such as temperature (Chen et al., 2004). Therefore, we used the Ångström model (Eq. 2) to expand daily global radiation (Ångström, 1924; Chen et al., 2004; Ren et al., 2017a). The model was parameterized using daily global radiation and sunshine duration data from 122 CMA stations. Then the daily global radiation of 756 CMA stations was derived using the parameterized model and sunshine duration data from 756 CMA stations.
k_t = Q / Q_0 = a + b (n / N)    (2)
where k_t represents the clearness index, defined as the ratio of the daily global radiation (Q) to the daily extraterrestrial radiation (Q_0), n and N are the actual and possible daily sunshine duration, and a and b are undetermined parameters.
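A minimal sketch of how the Ångström parameters could be estimated and then applied, assuming the linear form k_t = a + b (n / N) implied by the definitions above and an ordinary least-squares fit. The fitting procedure actually used by the authors is not specified here, and the function names are illustrative.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_angstrom(q, q0, n, n_possible):
    """Estimate (a, b) from daily global radiation q, extraterrestrial radiation q0,
    actual sunshine duration n, and possible sunshine duration n_possible."""
    kt = [qi / q0i for qi, q0i in zip(q, q0)]
    sunshine_fraction = [ni / npi for ni, npi in zip(n, n_possible)]
    return linear_fit(sunshine_fraction, kt)


def estimate_global_radiation(a, b, q0, n, n_possible):
    """Apply the fitted model at a station that only records sunshine duration."""
    return [(a + b * ni / npi) * q0i for q0i, ni, npi in zip(q0, n, n_possible)]
```

In the workflow described above, `fit_angstrom` would be run per region on the stations that measure global radiation, and `estimate_global_radiation` would then be applied to all sunshine duration stations in that region.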

2. The expansion of diffuse radiation. There are many radiation decomposition models relating the daily diffuse fraction to the daily clearness index, including the Liu & Jordan model (Liu and Jordan, 1960), the Page model (Page, 1961), the Reindl model (Reindl et al., 1990), and the Boland model (Boland et al., 2001). Using data from several sites in Europe, Africa, Australia, and Asia, Lauret et al. (2010) indicated that the Boland model (Eq. 3) performed as well as or better than other models while having a much simpler structure. We also compared several models and found that the Boland model performed best in our case (Ren et al., 2013). Therefore, we parameterized the Boland model using daily global radiation and diffuse radiation data, and then expanded the daily diffuse radiation data from 81 CMA stations to 756 CMA stations.
k_d = Q_d / Q = 1 / (1 + exp(c + d k_t))    (3)
where k_d represents the daily diffuse fraction, defined as the ratio of daily diffuse radiation (Q_d) to global radiation (Q), and c and d are undetermined parameters.
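The parameterization of the Boland model can be illustrated with a short sketch. It assumes the logistic form k_d = 1 / (1 + exp(c + d k_t)) that is commonly given for this model, and it estimates c and d by linearizing the relation as ln(1 / k_d − 1) = c + d k_t. This linearization is a simplification chosen for brevity; a nonlinear least-squares fit may well have been used in the original work. The function names are illustrative.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_boland(kt, kd):
    """Estimate (c, d) from paired clearness index kt and diffuse fraction kd.

    Days with kd outside (0, 1) cannot be transformed and are skipped.
    """
    pairs = [(t, math.log(1.0 / f - 1.0)) for t, f in zip(kt, kd) if 0.0 < f < 1.0]
    return linear_fit([t for t, _ in pairs], [z for _, z in pairs])


def diffuse_fraction(kt, c, d):
    """Diffuse fraction predicted by the assumed logistic form."""
    return 1.0 / (1.0 + math.exp(c + d * kt))


def estimate_diffuse_radiation(q, q0, c, d):
    """Daily diffuse radiation from global radiation q and extraterrestrial radiation q0."""
    return q * diffuse_fraction(q / q0, c, d)
```

With a negative c and a positive d, the predicted diffuse fraction approaches 1 under overcast skies (small k_t) and falls toward 0 as k_t increases, which matches the behaviour the quality control step expects.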
3. The expansion of PAR. The PAR model (Eq. 4), which has been shown to be applicable in China (Ren et al., 2014; Zhu et al., 2010), was used to expand the daily PAR data from 39 CERN field sites to 756 CMA stations. First, we used the daily PAR and global radiation data measured in 39 CERN field sites to estimate the model parameters, and then utilized the parameterized model and the expanded daily global radiation data to expand daily PAR data.
where e and f are undetermined parameters.
4. The estimation of diffuse PAR. Diffuse PAR is usually roughly estimated by multiplying PAR by the diffuse fraction of global radiation. However, the diffuse fraction of global radiation is not equivalent to the diffuse fraction of PAR: the latter is significantly greater than the former under clear skies, while the two are almost equal under cloudy skies (Ren et al., 2014; Spitters et al., 1986). Therefore, the Spitters model (Spitters et al., 1986) (Eq. 5) was applied to estimate the daily diffuse PAR of 756 CMA stations.
where PAR_d represents diffuse PAR.

Dataset generation
ANUSPLIN software (Hutchinson, 2001) was utilized to generate the reanalysis radiation dataset with 10 km × 10 km spatial resolution in China from 1981 to 2010. ANUSPLIN is a widely used spatial interpolation package, developed by the Centre for Resource and Environmental Studies at the Australian National University (Hijmans et al., 2005; Hutchinson, 1995). The software implements thin plate smoothing splines, which can incorporate covariates in addition to the independent spline variables. We used a three-dimensional spline to interpolate radiation data, with latitude and longitude as the independent variables and elevation as the covariate. The specific steps of this process are shown in Fig. 4. The main procedures are as follows.
1. Scale daily radiation data to a monthly scale and format the data following the instructions of ANUSPLIN using MATLAB software; resample the DEM data to 10 km × 10 km using ArcGIS software.
2. Determine the specific parameter values in the command files of Splina.exe and Lapgrd.exe, which are submodules of ANUSPLIN.
4. Convert the output ASCII files to ArcGIS grid files.
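The first step of the procedure, scaling daily values to monthly totals, was carried out in MATLAB according to the text. The following Python sketch shows the same idea under simple assumptions: daily values are in MJ m^-2 day^-1, missing days are passed as `None` and skipped, and the number of contributing days is returned so that incomplete months can be identified. How incomplete months were treated in the original processing is not stated, so no gap-filling rule is applied here.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_records):
    """Sum daily radiation to monthly totals.

    daily_records: iterable of (datetime.date, value) pairs, value in MJ m^-2 day^-1
                   or None for a missing day.
    Returns {(year, month): (total in MJ m^-2 month^-1, number of valid days)}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily_records:
        if value is None:
            continue  # skip missing observations instead of guessing a value
        key = (day.year, day.month)
        totals[key] += value
        counts[key] += 1
    return {key: (totals[key], counts[key]) for key in sorted(totals)}
```

Yearly totals follow in the same way by grouping on `day.year` only.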

Description and analysis of the radiation dataset
The radiation dataset has four subsets, including global radiation, diffuse radiation, PAR, and diffuse PAR in China with a resolution of 10 km × 10 km from 1981 to 2010. Each subset has 12 × 30 × 2 monthly files and 30 × 2 yearly files, except diffuse PAR, which only has 30 × 2 yearly files. We provide two formats for each data file, i.e., ArcGIS grid and ASCII text, as well as the Python code for the conversion from text to grid. To be comparable with each other, the units of radiation data are all set to MJ m⁻² month⁻¹ (monthly) and MJ m⁻² yr⁻¹ (yearly).
It should be noted that PAR can be measured either as radiation flux density (W m⁻²) or as photosynthetic photon flux density (µmol m⁻² s⁻¹), which are convertible through a conversion coefficient of 4.57 µmol J⁻¹. Users can convert the units of PAR and diffuse PAR from MJ m⁻² to mol m⁻² if needed.
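The unit conversion follows directly from the coefficient given above: 1 MJ corresponds to 10^6 J × 4.57 µmol J^-1 = 4.57 × 10^6 µmol = 4.57 mol of photons, so a value in MJ m^-2 is multiplied by 4.57 to obtain mol m^-2. A small sketch, with illustrative function names:

```python
PAR_UMOL_PER_JOULE = 4.57  # conversion coefficient stated in the text (umol J^-1)


def par_mj_to_mol(par_mj_per_m2):
    """Convert accumulated PAR from MJ m^-2 to mol m^-2.

    1 MJ = 1e6 J and 1 mol = 1e6 umol, so the two factors of 1e6 cancel.
    """
    return par_mj_per_m2 * PAR_UMOL_PER_JOULE


def par_flux_to_ppfd(par_w_per_m2):
    """Convert PAR flux density from W m^-2 to umol m^-2 s^-1 (1 W = 1 J s^-1)."""
    return par_w_per_m2 * PAR_UMOL_PER_JOULE
```

The same functions apply to diffuse PAR, since the coefficient is used for both quantities in the text.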
The spatial patterns of global radiation, diffuse radiation, PAR, and diffuse PAR are shown in Fig. 5. The distribution of radiation in China is spatially inhomogeneous. The global radiation and PAR are higher in the northwest and lower in the southeast, while the diffuse radiation and […]

Earth Syst. Sci. Data, 10, 1217–1226, 2018, www.earth-syst-sci-data.net/10/1217/2018/

[…] 1. The average values of global radiation, diffuse radiation, PAR, and diffuse PAR are 5270.0, 2477.0, 2164.5, and 1106.9 MJ m⁻² yr⁻¹, respectively. Global radiation in China declined during the 1980s, and then started to recover, which is consistent with global findings (Wild, 2009). There were dramatic increases of diffuse radiation in 1982, 1983, 1991, and 1992, which may have been caused by the El Chichón eruption in 1982 and the Pinatubo eruption in 1991 (Ren et al., 2013). Global radiation is the highest in the Qinghai-Tibet Plateau and the lowest in Central China, while the diffuse radiation has the highest value in Southwest China and the lowest value in Northeast China (Table 2). A more detailed discussion about the spatiotemporal variations of radiation in China during 1981–2010 has been reported in previous papers (Ren et al., 2013, 2014).

Validation of data expansion at a site scale
To validate the precision of data expansion at a site scale, we utilized a leave-one-out cross-validation method to calibrate the Ångström model, Boland model, and PAR model and to check their independence of location and time. Taking the time expansion of global radiation as an example, we used the sunshine duration and global radiation data of 29 of the 30 years to calibrate the Ångström model and used the data of the remaining year to perform validation. This process was repeated 30 times, and then the average model performance, measured by the correlation coefficient (R) and the root mean square error (RMSE), was derived. In the case of site expansion, we repeatedly left the data of one site out and fitted the model with the remaining data, with the number of repetitions equal to the number of sites. Then the average performance of site expansion was derived.
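The leave-one-year-out procedure can be sketched as follows for the Ångström model. The sketch assumes the linear form k_t = a + b (n / N) and, for brevity, evaluates R and RMSE on the clearness index rather than on radiation in physical units; the data layout and function names are illustrative. Leave-one-site-out validation works the same way with station identifiers in place of years.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(obs, est):
    """Pearson correlation coefficient between observed and estimated values."""
    mean_o = sum(obs) / len(obs)
    mean_e = sum(est) / len(est)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def rmse(obs, est):
    """Root mean square error between observed and estimated values."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def leave_one_year_out(data_by_year):
    """data_by_year: {year: [(sunshine fraction n/N, clearness index k_t), ...]}.

    Each year is held out once, the model is fitted on the other years, and the
    held-out year is predicted. Returns (mean R, mean RMSE) over all held-out years.
    """
    r_values, rmse_values = [], []
    for held_out, test_points in data_by_year.items():
        train = [p for year, pts in data_by_year.items() if year != held_out for p in pts]
        a, b = linear_fit([s for s, _ in train], [k for _, k in train])
        observed = [k for _, k in test_points]
        estimated = [a + b * s for s, _ in test_points]
        r_values.append(pearson_r(observed, estimated))
        rmse_values.append(rmse(observed, estimated))
    return sum(r_values) / len(r_values), sum(rmse_values) / len(rmse_values)
```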
Table 3 shows the estimated parameter values and validation results for the models, which indicate that the data expansion at a site scale had high accuracy in all regions of China. Almost all the correlation coefficients exceeded 0.8; the only exception was the Boland model in the Qinghai-Tibet Plateau, which might be caused by the large differences in climatic conditions among the sparse stations there (Figs. 2 and 6).

The prediction standard error of spatial interpolation
The ANUSPLIN software can not only interpolate the climatic data but also estimate the prediction standard error (Hutchinson, 2001). The spatial distribution of the interpolation error for global radiation, diffuse radiation, PAR, and diffuse PAR is shown in Fig. 7. The mean error for yearly global radiation, diffuse radiation, PAR, and diffuse PAR is 280.8, 98.9, 107.7, and 40.9 MJ m⁻² yr⁻¹, respectively, and the relative error is 5.3, 4.0, 5.0, and 3.7 %, respectively.
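The reported relative errors are consistent with dividing each mean prediction standard error by the corresponding 1981–2010 average value given in the dataset description (5270.0, 2477.0, 2164.5, and 1106.9 MJ m^-2 yr^-1). The short check below assumes that this is how the percentages were obtained, which is not stated explicitly in the text.

```python
# Mean interpolation error and long-term mean value, both in MJ m^-2 yr^-1,
# as reported in the text.
MEAN_ERROR = {
    "global radiation": 280.8,
    "diffuse radiation": 98.9,
    "PAR": 107.7,
    "diffuse PAR": 40.9,
}
MEAN_VALUE = {
    "global radiation": 5270.0,
    "diffuse radiation": 2477.0,
    "PAR": 2164.5,
    "diffuse PAR": 1106.9,
}


def relative_error_percent(mean_error, mean_value):
    """Relative error in percent, assumed here to be mean error / mean value * 100."""
    return 100.0 * mean_error / mean_value


relative_errors = {
    name: round(relative_error_percent(MEAN_ERROR[name], MEAN_VALUE[name]), 1)
    for name in MEAN_ERROR
}
```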
The interpolation error in the northwestern part of the Qinghai-Tibet Plateau is relatively large, probably because the meteorological stations there are very limited in number (Fig. 2).Because of the absence of observations in Taiwan, the interpolation error there is rather large compared with other areas.It should be noted that the data around the border also have a relatively large error, because we do not have the observation data beyond the border.

Data availability
The spatial dataset is freely available from the Zenodo website at https://zenodo.org/record/1198894#.Wx6--C_MwWo (https://doi.org/10.11922/sciencedb.555, Ren et al., 2018). The dataset is freely accessible, although some users may need to use a keyword search ("global radiation China") to establish initial access. There are four folders for global radiation (i.e., Global radiation.zip), diffuse radiation (i.e., Diffuse radiation.zip), PAR (i.e., PAR.zip), and diffuse PAR (i.e., Diffuse PAR.zip), respectively, as well as a description text file (i.e., Readme.txt). Two formats are offered, i.e., ArcGIS grid and ASCII text, along with the Python code for the conversion from text to grid. As for the original observation data, global users can contact us for global radiation data of CERN field sites for research needs. However, due to national policies, other original observation data of the CMA and the CERN cannot currently be released to global users.
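For users without ArcGIS, the ASCII text files can in principle be read with a few lines of Python. The sketch below assumes that the files follow the common Esri ASCII raster layout, i.e., header lines such as ncols, nrows, xllcorner, yllcorner, cellsize, and NODATA_value followed by rows of cell values. This layout is an assumption and should be checked against Readme.txt; the function name is illustrative and is not part of the conversion code distributed with the dataset.

```python
def parse_ascii_grid(text):
    """Parse an Esri-style ASCII raster given as a string.

    Returns (header, rows): header maps lower-cased keys to floats, and rows is a
    list of lists of floats with NODATA cells replaced by None.
    """
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = {}
    index = 0
    # Header lines have the form "<key> <value>", with a key that starts with a letter.
    while index < len(lines):
        parts = lines[index].split()
        if len(parts) == 2 and parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
            index += 1
        else:
            break
    nodata = header.get("nodata_value")
    rows = []
    for line in lines[index:]:
        values = [float(token) for token in line.split()]
        rows.append([None if nodata is not None and v == nodata else v for v in values])
    # Check the grid shape against the header when the header provides it.
    if "nrows" in header and len(rows) != int(header["nrows"]):
        raise ValueError("row count does not match header")
    if "ncols" in header and any(len(r) != int(header["ncols"]) for r in rows):
        raise ValueError("column count does not match header")
    return header, rows
```

A file would be read with `parse_ascii_grid(open(path).read())`, where `path` points to one of the unpacked text files.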

Conclusions
Solar radiation is pivotal to the modeling of terrestrial ecosystem productivity, and the quantity and quality of solar radiation are both important because of the differences between vegetation light use efficiency for direct and diffuse light.A reanalysis spatial radiation dataset from 1981 to 2010, i.e., monthly and yearly global radiation, diffuse radiation, PAR, and diffuse PAR in China, was produced based on several estimation models and observation data from the CMA and the CERN.It provides a series of systematic and integral radiation data for the community of ecological modeling, making the analysis of the effects of solar radiation and its diffuse components on terrestrial ecosystem productivity in China more convenient.

Figure 2. Distribution of the meteorological stations of the China Meteorology Administration (CMA) that measured sunshine duration, global radiation, and diffuse radiation, and the field sites of the Chinese Ecosystem Research Network (CERN) that measured global radiation and PAR (I: Northwest China; II: Inner Mongolia; III: Northeast China; IV: North China; V: Central China; VI: South China; VII: Southwest China; VIII: Qinghai-Tibet Plateau).

Figure 3. The numbers of sites with valid observational data of sunshine duration, global radiation, diffuse radiation, and photosynthetically active radiation (PAR) per year from 1981 to 2010. CMA and CERN represent the China Meteorology Administration and the Chinese Ecosystem Research Network, respectively.

Figure 4. The workflow of the generation of the radiation dataset.

Figure 6. The numbers of stations observing sunshine duration, global radiation, diffuse radiation, and photosynthetically active radiation (PAR) for each region (I: Northwest China; II: Inner Mongolia; III: Northeast China; IV: North China; V: Central China; VI: South China; VII: Southwest China; VIII: Qinghai-Tibet Plateau). CMA and CERN represent the China Meteorology Administration and the Chinese Ecosystem Research Network, respectively.

Table 1. Global radiation, diffuse radiation, PAR, and diffuse PAR in China for each year. Units are MJ m⁻² yr⁻¹.

Table 2. Global radiation, diffuse radiation, PAR, and diffuse PAR averaged from 1981 to 2010 for each region in China. Units are MJ m⁻² yr⁻¹.

Table 3. Calibration and validation of Ångström, Boland, and PAR models in different regions across China. R and RMSE are the correlation coefficient and the root mean square error, respectively.