Edinburgh Research Explorer Daily gridded datasets of snow depth and snow water equivalent for the Iberian Peninsula from 1980 to 2014

. We present snow observations and a validated daily gridded snowpack dataset that was simulated from downscaled reanalysis of data for the Iberian Peninsula. The Iberian Peninsula has long-lasting seasonal snowpacks in its different mountain ranges, and winter snowfall occurs in most of its area. However, there are only limited direct observations of snow depth (SD) and snow water equivalent (SWE), making it difﬁcult to analyze snow dynamics and the spatiotemporal patterns of snowfall. We used meteorological data from downscaled reanalyses as input of a physically based snow energy balance model to simulate SWE and SD over the Iberian Peninsula from 1980 to 2014. More speciﬁcally, the ERA-Interim reanalysis was downscaled to 10 km × 10 km resolution using the Weather Research and Forecasting (WRF) model. The WRF outputs were used directly, or as input to other submodels, to obtain data needed to drive the Factorial Snow Model (FSM). We used lapse rate coefﬁcients and hygrobarometric adjustments to simulate snow series at 100 m elevations bands for each 10 km × 10 km grid cell in the Iberian Peninsula. The snow series were validated using data from MODIS satellite sensor and ground observations. The overall simulated snow series accurately reproduced the interannual variability of snowpack and the spatial variability of snow accumulation and melting, even in very complex to-pographic terrains. Thus, the presented dataset may be useful for many applications, including land management, hydrometeorological studies, phenology of ﬂora and fauna, winter tourism, and risk management. The data presented here are freely available for download from Zenodo (https://doi.org/10.5281/zenodo.854618). This paper fully describes the work ﬂow, data validation, uncertainty assessment, and possible applications and limitations of the database.


Introduction
Seasonal snowpack exerts an important control on the hydrology and economy of many mountainous and cold regions worldwide (Barnett et al., 2005).Snow variability also affects different ecological processes, such as species composition, distribution, and phenology (Keller et al., 2000;Wipf et al., 2009).For example, snowpack on Mediterranean mountains is a crucial source of water during the dry season (Fayad et al., 2017;García-Ruiz et al., 2011;Viviroli et al., 2007).Long-term data are required to analyze the spatiotemporal dynamics of snowpack, to assess the importance of snow as a resource, and to understand the effect of climatic fluctuations.However, there are only limited in situ observations of snowpack for most mountain regions (Raleigh et al., 2016).Currently remote-sensing techniques can only reliably pro-vide information about snow cover based on observations in the visible spectrum (Dietz et al., 2012).Current spaceborne sensors do not provide accurate data on snow water equivalent (SWE) and/or snow depth (SD) in mountainous regions (Dozier et al., 2016).Microwave imaging has a coarse resolution (grid cell size: ∼ 25 km), so does not characterize snowpack variability in the Mediterranean mountains, which have a high spatial heterogeneity not captured with this resolution.There are also spatial and temporal limitations when attempting to estimate snowpack using close-range remote-sensing techniques such as lidar (Revuelto et al., 2016).
There are limited in situ snow observations and meteorological data at high elevations in the Iberian Peninsula.Although the number of monitored sites has increased in recent years, there are no long-term series and there is insufficient characterization of snowpack dynamics at a regional scale.However, snowpack in the Iberian Peninsula is an important hydrological and also economical resource.An area of 19456.4km 2 in the Iberian Peninsula lies above 1500 m a.s.l., mostly in the five most important mountain ranges (Pyrenees, Cantabrian Mountains, Central System, Iberian Range, and Sierra Nevada).At this elevation, snowpack occurs for at least 4 months of the year (López-Moreno et al., 2011) making it a critical resource for water management in the largest hydrological basins (Morán-Tejeda et al., 2014).Snowpack influences the interannual variability of water resources (López-Moreno and García-Ruiz, 2004) and the timing of the winter low flows and spring peak flows (Sanmiguel-Vallelado et al., 2017).Moreover, winter tourism (mainly skiing) has been increasingly important to the economy of mountain valleys in recent decades, and the large interannual fluctuations of snowpack in the different mountain regions of the Iberian Peninsula affect the economic viability of tourism (Gilaberte-Búrdalo et al., 2014, 2017).
The importance of snow to the environment and economy of the Iberian Peninsula, and the lack of data on snowpack in this region, motivated us to use meteorological outputs from downscaled reanalysis data to simulate snowpack at different elevations in the Iberian Peninsula.Atmospheric reanalyses, based on data assimilation and modeling (Saha et al., 2010), can provide important information about the temporal evolution of the atmosphere.Meteorological variables obtained from reanalysis data can be used as inputs for models of snow mass and energy balance which can be applied to describe the behavior of the snowpack over large areas (Brun et al., 2013;Krogh et al., 2015;Wegmann et al., 2017).However, the coarse resolution (cell size: around tens of kilometers) implies these simulations may have insufficient spatial resolution for characterizing the topographical complexity of mountain areas (Mass et al., 2002).To overcome this limitation, regional climate models (RCMs) are often used to obtain better representations of surface climatology, because they downscale physically reanalysis products (García-Valdecasas Ojeda et al., 2017;Kryza et al., 2017;Warrach-Sagi et al., 2013).Previous studies have used RCMs to study SD and SWE dynamics at finer resolutions (grid cell size: 5 to 11 km) when they are driven with reanalyses, and the resolution increases further (grid cell size: 1 km) when using forecasted data (Bellaire et al., 2011;van Pelt et al., 2016;Quéno et al., 2016;Wu et al., 2016).
van Pelt et al. (2016) used the High Resolution Limited Area Model (HIRLAM) in Svalbard (Norway), with forcing by ERA-40 and ERA-Interim reanalysis, and then used the meteorological simulation as driving data for SnowModel (Liston and Elder, 2006a).Their results support the usefulness of the methodology extracting snowpack trends from these data.Wu et al. (2016) used a similar procedure to describe the behavior of snowpack over the Altai Mountains in China.They coupled outputs from the Weather Research and Forecasting (WRF) model (Skamarock et al., 2008) driven by NCEP/NCAR reanalysis with a temperature index model (based on remote sensing), and their results had low error values.To increase the spatial resolution of the WRF outputs, they used the MICROMET model (Liston and Elder, 2006b), a submodel of SnowModel in which WRF outputs are interpolated to a new grid, and then corrected physically according to topography.Wrzesien et al. (2017) tested the capability of WRF to estimate SWE over complex terrain concluding that WRF simulations can be used over areas with few observational data.
We used a different approach, in an effort to make our database more computationally practicable and to avoid the uncertainties of the statistical interpolations of climatological variables over complex areas.More specifically, we projected WRF outputs to different elevation bands to generate simulations for multiple elevations.This procedure allows one to study the elevation-dependent characteristics of the snowpack over different mountain ranges, preserving the WRF output resolution.
Our procedure uses the physically based Factorial Snow Model (FSM; Essery, 2015), which is fed by ERA-Interim reanalysis (Berrisford et al., 2011), and downscaled by the WRF model.The final products of our analysis are simulated daily time series of SD and SWE at different elevations from 1980 to 2014.

Data and methods
We used an existing WRF simulation (cell size: 10 km) for the whole Iberian Peninsula (Fig. 1), with a 3 h time step from January 1979 to November 2014, as input data for the FSM.Most inputs of the FSM were extracted directly from the WRF simulation, but some were calculated using other submodels.We projected the WRF outputs and derived variables to different elevation bands, from 500 to 2900 m a.s.l. at steps of 100 m, from the WRF pixel elevation using several hygrometric and psychrometric formulas and elevation lapse rates.FSM outputs were aggregated at a daily time step in order to increase the manageability of the data.Validation was performed at different steps of the workflow using different observational data sources.Figure 2 shows the workflow completely, which is described in more detail below.Appendix A lists all the abbreviations used in this study.

Meteorological driving data
The meteorological variables were calculated using the WRF model, a mesoscale climate model.Previous researchers used this model to simulate climate at regional scales for analysis of past, present, and future conditions (Chen et al., 2011;Heikkilä et al., 2011).The spatial resolution of our simulation is 0.088 • (∼ 10 km) and the time step is 3 h.ERA-Interim reanalysis (Berrisford et al., 2011) was used as driving data for the WRF model.With this procedure all meteorological variables for running snowpack models were generated for the whole Iberian Peninsula.The WRF configuration was described in detail by García-Valdecasas Ojeda et al. (2017).This simulation provided the following variables: wind speed (Ua), surface temperature (T ), precipitation (Pr), relative humidity (RH), shortwave incoming radiation (SW), and atmospheric pressure (Ps).

Snow energy and mass balance model
SD and SWE time series were obtained using a mass and energy balance snowpack model.The FSM is a multi-physics snow model that simulates the accumulation and melting of snow (Essery, 2015).This model allows selection of two options for parameterizations of five different process, thereby enabling 32 different model configurations.The configuration used to develop our simulations decreases snow albedo and increases snow density at different rates for cold and melting snow, calculates thermal conductivity as a function of snow density, adjusts the turbulent exchange coefficient as a function of the bulk Richardson number, and allows retention and refreezing of liquid water inside the snowpack.
The model works with different numbers and thicknesses of layers, depending on snowpack depth.Thus, it assumes a single layer when snow depth is less than 0.2 m, and a maximum of three layers when the depth is greater than 0.5 m.This configuration allows the model to characterize the highly variable climatological conditions of the Iberian mountains.In addition to the variables provided by the WRF simulation (listed in Sect.2.1), the FSM also needs estimates of snow rate (Sf), rain rate (Rf), and longwave incoming radiation (LW).To avoid the expense of rerunning WRF in this study, these variables have been reconstructed from available WRF simulation outputs.
To calculate Sf and Rf, we used a psychrometric energy balance method (PPPm; Harder and Pomeroy, 2013), which uses relative humidity and air temperature to calculate the surface temperature of falling hydrometeors.From this value, the fraction of liquid precipitation is as follows: where f r is the percentage of liquid precipitation, T i is the temperature ( • C) of the falling hydrometeor, and b and c are derived from statistical fits (2.50286 and 0.125006, respectively, for hourly time intervals).T i is calculated from Eq. ( 2), which we solved numerically using the method described by Brent and Richard (1972): where T a is the temperature (K), D is the diffusivity of water vapor in air (m 2 s −1 ), λ t is the thermal conductivity of air (W m −1 K −1 ), L is the latent heat of sublimation or vaporization (J kg −1 ), and ρ T a and ρ sat (T i ) (kg m −3 ) are respectively the vapor densities in free atmosphere and at the saturated hydrometeor surface.This methodology gives the percentage of liquid precipitation; the percentage of solid precipitation is directly calculated from f r .
Incoming longwave radiation (W m −2 ) was estimated from the Stefan-Boltzmann law: where σ is the Stefan-Boltzmann constant and ε is the emissivity of the atmosphere.Emissivity was calculated as a function of elevation and cloud cover, as proposed by Liston and Elder (2006b), who used a variation of the methodology described by Iziomon et al. (2003).Thus, emissivity is calculated as follows: where e (Pa) is the atmospheric vapor pressure, cc is the fractional cloud cover, and X s , Y s , and Z s are coefficients that are corrected with elevation: where z ( m) is the elevation above sea level, and X s , Y s , and Z s can be substituted for C, with X 1 = 0.35, X 2 = 0.51, Y 1 = 0.100 K Pa −1 , Y 2 = 0.130 K Pa −1 , Z 1 = 0.224, Z 2 = 1.100, z 1 = 200 m a.s.l., and z 2 = 3000 m a.s.l.Different parameterizations using SW were tested to estimate c c , from potential SW, a more accurate approach than the parameterization proposed by Liston and Elder (2006b), according to Gascoin et al. (2013).This approach uses the relationship between SW and potential SW radiation that is restricted to daylight hours.Thus, in this work, we used the parameterization proposed by Walcek (1994) for c c estimation, which is the original parameterization proposed by Liston and Elder (2006b).
where RH 700 is the relative humidity at 700 mb.
The methodology used to project RH to 700 mb elevation is described below.To scale the snow simulations to different elevations, we first used the internationally accepted standard air temperature lapse rate (β = 0.0065 K m −1 ; Barry and Chorley, 1987;ISO, 1975) to project the surface air temperature.For RH, the methodology proposed by Liston and Elder (2006b) was used, in which a lapse rate is applied to the dew point temperature (HRm).First, we calculated the dew point temperature from RH and the saturation vapor pressure.Then, we applied the standard air temperature lapse rate to the dew point temperature, and recalculated the RH at the target elevation from the scaled dew point temperature and the saturation vapor pressure.Once we rescaled temperature and RH, we calculated the precipitation phase and LW radiation at the different elevations.
Finally, to estimate the scaled surface air pressure we used a generalization of the barometric formula for scenarios that consider air temperature lapse rates (Berberan-Santos et al., 1997): where p (0) is the surface air pressure, z is the elevation difference (m), m is the molecular mass of air (0.0289644 kg mol −1 ), and R is the universal gas constant (8.31432J K −1 mol −1 ).

Validation procedure
Validation was performed at different resolutions and at different steps of the workflow, using all available observational data (Fig. 2).Previous studies (Argüeso et al., 2012;García-Valdecasas-Ojeda et al., 2016)  In this work, we used the Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite sensor to validate our snow cover product for the period September 2000 to November 2014.Similarly, we used data from telenivometers, which were available in the Pyrenees from October 2009 to June 2014.
First, we compared MODIS data with the SD and SWE time series (10 km resolution).MODIS snow maps were generated using the same workflow for each mountain range in the study area (Pyrenees, Cantabrian Mountains, Central System, Iberian Range, and Sierra Nevada).We downloaded all the available MOD10A1 and MYD10A1 products (version 5) from the National Snow and Ice Data Center (Hall et al., 2006).The original granules were mosaicked and reprojected from the sinusoidal system to the Universal Transverse Mercator (UTM) reference system.Then, we ran a gapfilling algorithm, using the binary snow product to avoid data losses due to cloud cover (Gascoin et al., 2015).This provided gap-free daily maps showing the presence and absence of snow in each mountain range from 2000 to 2014.From these maps, the probability of snow was calculated as follows: where P (Snow) is the probability of snow (%), N s is the number of days with snow, and N is the total number of days of the period.Snow probability maps were also calculated from the FSM snow cover maps.In this work, we chose a threshold of 0.11 m for SD and a threshold of 40 mm for SWE (Gascoin et al., 2015) in the FSM time series.This allowed us to generate snow cover maps from FSM outputs.Then, we aggregated the MODIS pixels (500 m) to the simulation grid (∼ 10 km), with averaging of the values of MODIS pixels to make them comparable.
We also used data from 11 telenivometers, which measure sub-hourly SWE and SD using gamma ray attenuation and acoustic sensors.These data were provided by the ERHIN program (Estimación de Recursos Hídricos Procedentes de la Nieve) of the Hydrological Ebro River Basin Authority (Navarro-Serrano and López-Moreno, 2017).Ten telenivometers were located in the Pyrenees, and one in the Cantabrian Mountains.A complete description of the telenivometers and their locations can be found at www. saihhebro.com.We also used an SD sensor in the Central System mountain range (Durán et al., 2017), which is from the National Meteorological Agency of Spain (AEMET).We projected the meteorological variables from the WRF simulation to elevations of the different telenivometers for simulations.Figure 3 shows a comparison of the modeled and observed SD time series at these 10 sites.
It must be noted that it is challenging to validate gridded products from ground-based data (Snauffer et al., 2016).Snowpack can have large variability over small distances (López-Moreno et al., 2015;Meromy et al., 2013).This implies that punctual measurements may not be representative of the 10 km resolution data, even when comparing a simulation at the same elevation as the telenivometer.In addition, snow measurements always include biases from the different measuring devices (Kinar and Pomeroy, 2015).Thus, we focused on the temporal patterns of snowpack during the season.More specifically, we compared the accumulation patterns during the season, assuming that accumulation and melting rates were similar in the simulated and observational data, but that SD and SWE likely differ between the telenivometer and the simulation.
Thus, we first compared different percentiles of SD and SWE in the telenivometer and the simulated time series.Then, using each percentile as a threshold for snow presence, we converted the series into binary data, allowing use of the kappa test (Cohen, 1960) for each percentile.The kappa coefficient ranges from 1 and < 0, but it is difficult to assign an agreement criterion based on kappa value.Thus, we used the thresholds proposed by Landis and Koch (1977), which basically agree with values proposed by Fleiss et al. (1969; < 0.00: poor; 0.00-0.20:slight; 0.21-0.4:fair; 0.41-0.60:moderate; 0.61-0.80:substantial; and 0.81-1.00:almost perfect).We examined percentile values between 10 and 90 %, as more representative of snow accumulation during the season.

Validation
Our analysis of the probability of snow presence from MODIS and FSM shows that the outputs had good correlations (Fig. 4).This analysis compared the probability of snow at each pixel (∼ 10 km × 10 km) from MODIS and FSM outputs for the SWE and SD time series from September 2000 to November 2014.The mean coefficient (R 2 ) was 0.76, and a mean absolute error was 6.3 %.This analysis also shows the correlations for each mountain range, and the distribution of errors for SWE and SD (simulated less observed).
These results also show there are no significant differences in the errors of P (Snow) for the different mountain ranges.However, the correlation was not strong for the Sierra Nevada range, probably due to its limited snow cover, although this remained inside the variability of the scatter plot.
Validation of these results with telenivometers indicated kappa values for thresholds in the 10th to 90th percentiles of each season (Fig. 5).The kappa values were mostly above 0.6, although accuracy declined for the highest percentiles.
The kappa coefficient does not account for the displacement magnitude of the different percentiles, and a difference of a few days at the time of peak accumulation may www.earth-syst-sci-data.net/10/303/2018/ Earth Syst.Sci.Data, 10, 303-315, 2018 cause a sharp decrease in the kappa value.This is the reason for the loss of accuracy at the highest percentiles.Thus, we further analyzed these data to determine the time of the year when snowpack exceeded the 90th, 75th, and 50th percentiles at each telenivometer in the observed (OBS) and simulated (SIM) series (Fig. 5c).This analysis shows that, despite small temporal shifts, the simulated snow series accurately represents the temporal patterns when different snow percentiles are exceeded.The biggest shift in the position of the 90th and 75th percentiles was during the 2011/2012 season.This season was extremely dry on the Iberian Peninsula, and there were very few snowfall events (Fig. 3).Thus, a small bias in the simulation of a single event during this time could lead to a large error in prediction of the magnitude and timing of SD and SWE maxima.

Gridded snow dataset: applications and limitations
The final products of the models are daily gridded datasets (resolution: 0.088 • , ∼ 10 km) of SD and SWE at elevations from 500 to 2900 m a.s.l.(100 m intervals) from 1980 to 2014.The datasets (ncdf4 format) cover the entire Iberian Peninsula, including the north side of the Pyrenees in France.Each dataset contains information of the entire Iberian Peninsula and a mask that covers pixels that do not present areas at the elevations of the simulation estimated from a 250 m resolution DEM.
This snow database provides new opportunities for studies of snow in the Iberian Peninsula.In particular, the tem-Earth Syst.Sci.Data, 10, 303-315, 2018 www.earth-syst-sci-data.net/10/303/2018/  poral resolution and the duration of the series show significant improvements over previous observational data.Also, the geographic data generated on SD and SWE provides the opportunity to obtain more snow and hydrologically relevant information than available from remote sensing alone.It is also possible to develop different snow products at different elevations, allowing for comparison of different elevations and different regions.For example, Fig. 6 shows the longterm average interannual maximum SD and SWE at three different elevations.
Figure 7 shows examples of other snow variables that can be derived from the database: average number of snowfall  events and percentage of days with snow cover at three elevations.These analyses are particularly useful for the development of different snow climatologies for the whole Iberian Peninsula, or for specific areas, in studies that rely on ecological data (e.g., phenology or distribution of plants and animals, forest growth), studies that require hydrological pa-rameters for different catchments, and studies that determine risk maps for snow-related events.
It is also possible to extract daily time series for different areas or elevations at each pixel.For example, Fig. 8 compares SWE series at three elevations in the pixel at the highest peak of the Pyrenees (Aneto Peak, 3404 m a.s.l.).Thus, these series allows for the study of different annual snow accu- mulation and melting patterns at a specific location and how elevation influences snow evolution.Similarly, it enables one to study the existence of temporal trends or the occurrence of extreme snowfall and melting events.
The database contains uncertainties that are not easy to quantify, due to the limited amount of observational data.Biases may be due to uncertainty of the boundary conditions from the ERA-Interim reanalysis (Chaudhuri et al., 2013) since errors from the WRF downscaling model are difficult to quantify in mountain areas (Gutmann et al., 2012), and uncertainties that typically result from simulations of snow mass and energy balance from meteorological data (Essery et al., 1999(Essery et al., , 2013;;Magnusson et al., 2015).The use of the standard air temperature lapse rate could also be a source of uncertainty.Although other studies have observed a decrease in the lapse rate during winter months, this effect is result of thermic inversions that are not considered to be due to the spatial resolution of the simulation.
Despite these limitations, we had very satisfactory results when testing the duration and the interannual variability of the snowpack against MODIS and telenivometer data, which provided reliable observations during several snow seasons.This way, the database presents a reliable validation for more than a third of the time period generated.When using this database, it is important to consider that it was based on the assumption of flat topography within each 10 km × 10 km pixel.Therefore, this dataset is not suitable for studies of snow variability due to terrain aspect, slope, and snow redistribution processes, such as avalanches and wind transport.

Data availability
The data presented here are freely available for download from Zenodo (https://doi.org/10.5281/zenodo.854618).SD and SWE datasets are in ncdf4 format, with one file for each elevation band.The observational information used to validate the main data is also available for download.All telenivometer data are in CSV format.Daily snow cover (derived from MODIS) is provided as five multiband GeoTiff files (one file for each mountain range, each band is a date), and a CSV file indicates the date of each band.

Conclusions
We presented a new daily gridded database of SD and SWE for the Iberian Peninsula from 1980 to 2014 period at a resolution of 0.088 • (∼ 10 km).The database consists of 50 ncdf4 files for SD and SWE from 500 to 2900 m a.s.l., and another 2 files of WRF simulation DEMs, summing more than 652 000 maps.A mask label of "no data" is included if the grid is not found at the elevation of the simulated elevation band.
The scarcity of snow observations in the Iberian Peninsula made it necessary to couple a dynamic downscaling of ERA-Interim reanalysis using the WRF model by use of a snow energy and mass balance model (FSM).Input data of FSM provided directly, or estimated from WRF outputs, were available for the average elevation of each 10 km × 10 km pixel, and these data were transformed to achieve an elevation offset at 100 m intervals.
Despite some uncertainties, the database is consistent with available observational data.More specifically, validation with MODIS data indicated an error of 6.07 % and an R 2 of 0.76 from the analysis of the mean presence of snow.The database also provides good representation of the temporal patterns of the telenivometers, with kappa values generally over 0.6, and above 0.4 for all analyzed percentiles.This database will be an important resource for studies of many different hydrological, environmental, and economic processes in Mediterranean areas.Thus, we expect the database presented here will be useful for future snow-related studies at regional scales on the Iberian Peninsula, and for a broad community of researchers and land managers working in areas where snowfall occurs.

Figure 1 .
Figure 1.Digital elevation model (DEM) of the Iberian Peninsula and locations of the telenivometers, Cotos Pass SD sensor, and MODIS study areas.

Figure 2 .
Figure 2. Simulation workflow.Squared boxes represent modeling steps and rounded boxes represent meteorological variables.Variables that are not inputs or outputs of a model are indicated by dotted lines (see a glossary of abbreviations used in Appendix A).

Figure 3 .
Figure 3.Comparison between modeled (red) and observed (black) SD time series for each telenivometer and the Cotos SD sensor.

Figure 4 .
Figure 4. Correlation between the long-term (2000-2015) mean probability of snow depth (a) and snow water equivalent (b) from MODIS data and from FSM output.Box plot insets show the frequency distributions of errors (%), with the central red lines indicating average errors, boxes indicating the 25th and 75th percentiles, bars indicating the 10th and 90th percentiles, and dots indicating the 5th and 95th percentiles.

Figure 5 .
Figure 5. Kappa values derived from comparison of observed and simulated series for different percentiles of snow depth (a) and snow water equivalent (b), and periods of the year (blue) when snowpack exceeds the 90th, 75th, and 50th percentiles (c).In (c), each pair of bands shows the times when the different percentiles in the observed (OBS) and simulated (SIM) series at each telenivometer exceeded the indicated percentile.