CPLFD-GDPT5: High-resolution gridded daily precipitation and temperature dataset for the two largest Polish river basins

The CHASE-PL Forcing Data Gridded Daily Precipitation & Temperature Dataset 5 km (CPLFD-GDPT5) consists of 1951-2013 daily minimum and maximum air temperatures and precipitation totals interpolated onto a 5 km grid based on daily meteorological observations from the Institute of Meteorology and Water Management (IMGW-PIB; Polish stations), Deutscher Wetterdienst (DWD; German and Czech stations), and ECAD and NOAA-NCDC (Slovak, Ukrainian, and Belarusian stations). The main purpose for constructing this product was the need for long-term areal precipitation and temperature data for earth-system modelling, especially hydrological modelling. The spatial coverage is the union of the Vistula and Odra basins and Polish territory. The number of available meteorological stations varies in time from about 100 for temperature and 300 for precipitation in the 1950s up to about 180 for temperature and 700 for precipitation in the 1990s. The precipitation data set was corrected for snowfall and rainfall under-catch with the Richter method. The interpolation methods were kriging with elevation as external drift for temperatures and indicator kriging combined with universal kriging for precipitation. The kriging cross validation revealed low root-mean-squared errors expressed as a fraction of standard deviation (SD): 0.54 and 0.47 for minimum and maximum temperature, respectively, and 0.79 for precipitation. The correlation scores were 0.84 for minimum temperatures, 0.88 for maximum temperatures, and 0.65 for precipitation. The CPLFD-GDPT5 product is consistent with 1971-2000 climatic data published by IMGW-PIB. We also confirm good skill of the product for hydrological modelling by performing an application using the Soil and Water Assessment Tool (SWAT) in the Vistula and Odra basins.
Link to the dataset: http://data.3tu.nl/repository/uuid:e939aec0-bdd1-440f-bd1e-c49ff10d0a07


Introduction
High-resolution areal precipitation and air temperature data are increasingly in demand as input or verification data for distributed earth-system modelling. One of the most demanding fields for these data is distributed hydrological modelling. Rainfall-run-off models, e.g. Soil and Water Assessment Tool (SWAT; Arnold et al., 1998), WetSpa (Liu and De Smedt, 2004), or TOPMODEL (Beven et al., 1995), rely on precipitation as a driver for hydrological processes. Similarly, integrated models, e.g. MIKE SHE (Abbott et al., 1986), HydroGeoSphere (Brunner and Simmons, 2012), or ParFLOW (Kollet and Maxwell, 2006), and channel and floodplain hydrodynamical models, e.g. LISFLOOD-2D (Bates and De Roo, 2000), can use precipitation data as a boundary condition. There are also numerous applications for precipitation in hydrological engineering, e.g. peak flow at a given return period or run-off coefficient estimations. Applicability of temperature data in hydrological modelling is also very important, albeit less straightforward. Temperature is often used as a variable in subcomponents of hydrological models. Models like WetSpa (Water and Energy Transfer between Soil, Plants, and Atmosphere), SWAT, SRM (Martinec, 1975), or HydroGeoSphere use temperature for snowmelt estimation with the degree-day model. Several models use air temperature and ancillary variables to estimate potential evapotranspiration (e.g. Hargreaves, 1975) or actual evapotranspiration (e.g. Wang et al., 2007). These models can be applied independently from hydrological models in order to generate evapotranspiration series or implemented in the model's source code (e.g. Hargreaves, 1975, in SWAT).
Published by Copernicus Publications.
T. Berezowski et al.: High-resolution gridded daily precipitation and temperature data set
Global data sets which include (sub-)daily gridded precipitation and temperature (Sheffield et al., 2006; Dee et al., 2011; Schamm et al., 2014; Weedon et al., 2014) are available in a range of spatial resolutions, typically with the highest resolution of 0.25° × 0.25°, which is equivalent to 28 × 28 km at the Equator and 28 × 14 km at 60° north. Numerous applications of these data sets are found for large-scale hydrological modelling (Haddeland et al., 2011; Li et al., 2013; Abbaspour et al., 2015). However, the aforementioned spatial resolution is often not high enough when the study area is smaller than one grid cell of the data. For this reason local meteorological gridded data sets are continuously appearing, at countrywide or regional scale (Jones et al., 2009; Rauthe et al., 2013; Isotta et al., 2014; Keller et al., 2015). So far no high-resolution gridded data set exists either for the Vistula and Oder basins or for Polish territory, except a partial cover by the CARPATCLIM (Climate of the Carpathian region; Spinoni et al., 2015) and HYRAS (central European high-resolution gridded daily data sets; Frick et al., 2014) projects. The advantage of the regional data sets is their finer spatial resolution than in the global data sets, usually between 5 × 5 km and 1 × 1 km. Hence, they are better suited for local-scale modelling (e.g. hydrological modelling) than the coarser-resolution global data sets (Berezowski et al., 2015a).
The above-mentioned regional gridded data sets are constructed by interpolating observations from national meteorological networks. However, no clear guideline exists for selecting the optimal method for spatial interpolation of meteorological variables. In general the geostatistical (kriging) and inverse distance weighted (IDW) methods are preferable for seasonal and daily rainfall interpolation over Thiessen polygons, polynomial interpolation, or other deterministic methods (Ly et al., 2013). An earlier study by Szcześniak and Piniewski (2015) showed that kriging interpolation of precipitation for several meso-scale basins in Poland outperformed IDW and Thiessen polygons in skill for hydrological modelling. Indeed, kriging has recently often been used as the interpolation method for precipitation and air temperature, with satisfactory results quantified by correlation coefficients or root-mean-squared errors (Carrera-Hernández and Gaskin, 2007; Hofstra et al., 2008; Ly et al., 2011; Herrera et al., 2012). Each of these studies, however, uses different flavours of kriging, which can be ordinary kriging (assumes constant mean), universal kriging (removal of trend based on the spatial coordinates), kriging with external drift (mean is dependent on an external variable, e.g. an elevation map), cokriging (estimates a variable based on its values and values of other variables), and others. Again, selection of the most appropriate kriging method is variable-dependent (by studying phenomena responsible for observations of the variable, e.g. relations with elevation) and case-study-dependent (by investigating whether any trend is observed at a given spatial and temporal scale, e.g. seasonal and geographical relations with climate).
In this study we show the workflow for constructing the CHASE-PL (Climate change impact assessment for selected sectors in Poland) Forcing Data-Gridded Daily Precipitation & Temperature Dataset-5 km (CPLFD-GDPT5) product. The CPLFD-GDPT5 product is aimed at providing input data for earth-system modelling, especially hydrological modelling. The workflow description is accompanied by detailed verification of the product, a consistency check of the product with long-term climatic maps, and a description of the product applicability. The objective of this study is to give transparent information about the CPLFD-GDPT5 details for users. We also believe that the workflow presented herein can be a guideline for other regional meteorological interpolation studies.

Temporal and spatial representation of the CPLFD-GDPT5
The temporal range for the CPLFD-GDPT5 product is from 1 January 1951 to 31 December 2013 in daily resolution, in total 23 011 days. In the source data some records beginning in 1940 were available; however, the network of meteorological stations was too sparse for reasonable interpolation results before 1951. The spatial extent of the CPLFD-GDPT5 product is the union of the Vistula and Oder basins and the Polish territory (Fig. 1). The spatial resolution is a 5 × 5 km grid. We used the projected coordinate system PUWG 1992 (EPSG:2180), which has the advantage of being valid for our entire area of interest. The coordinate system PUWG 1992 has distortions varying longitudinally from −70 cm km⁻¹ (western study area) to 90 cm km⁻¹ (eastern study area), which are negligible at 5 km grid resolution. Moreover, PUWG 1992 is easily reprojected to other coordinate systems, because it is based on the Geodetic Reference System 1980 (GRS 80) ellipsoid (GRS 80 is almost the same as the World Geodetic System 1984, WGS 84, ellipsoid).
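The stated length of the record can be checked with a few lines of date arithmetic. The following is a minimal Python sketch (the helper name `n_days_inclusive` is introduced here for illustration only and is not part of the published scripts); it counts calendar days with both end dates included.

```python
from datetime import date


def n_days_inclusive(start: date, end: date) -> int:
    """Number of calendar days from start to end, both included."""
    return (end - start).days + 1


# Temporal range of the product as stated in the text.
n_days = n_days_inclusive(date(1951, 1, 1), date(2013, 12, 31))
print(n_days)  # 23011
```

The result agrees with 63 years of 365 days plus the 16 leap days between 1952 and 2012.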

Source data
We have compiled meteorological data from four databases: (Project team ECAD, 2013; Menne et al., 2012) which are also WMO members. All organizations from which we have compiled the data conduct quality control checks for raw data before the data are made publicly available.

No-data filtering and quality check
The IMGW-PIB measurement network is divided into three groups of stations depending on their order: "synoptyczne" (En.: synoptic), "klimatyczne" (En.: climatic), and "opadowe" (En.: precipitation). Precipitation data from the lowest-order stations, opadowe, do not distinguish between "lack of precipitation" (or "0") and "no data". Occurrence of the true no-data records is generally extremely rare, and clearly lack of precipitation is incomparably more frequent. For this reason we have initially changed all no-data records into 0 values for all "holes" whose duration was shorter than 2 months, whereas all holes longer than 2 months were left unchanged. This was based on a valid assumption that, in the Polish climate, periods with no precipitation would never exceed 2 months. In the next step, for each case of a station with a hole we have computed precipitation totals at all neighbouring stations (20 km radius). Depending on the amount of precipitation registered at the neighbouring stations, we have either left the 0 values unchanged or re-established former no-data values. After carrying out these procedures, the mean percentage of 0 values at all stations belonging to the order opadowe was equal to 52 %. In comparison, the respective numbers for two other orders of stations that were free of such problems (synoptyczne and klimatyczne) were equal to 51 and 55 %, respectively, which shows that the applied procedure did not alter the distribution of dry days in the time series.
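The two-step hole treatment described above can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the 60-day limit stands in for "2 months", and the neighbour rule with its `wet_threshold_mm` value is a hypothetical choice, since the text does not state the exact criterion used to re-establish no-data values.

```python
from typing import List, Optional


def fill_short_gaps(series: List[Optional[float]],
                    max_gap_days: int = 60) -> List[Optional[float]]:
    """Step 1: turn runs of None shorter than max_gap_days into 0.0.

    Longer runs are left as None (treated as true no-data).
    max_gap_days = 60 is an assumed stand-in for "2 months".
    """
    out = list(series)
    i, n = 0, len(out)
    while i < n:
        if out[i] is None:
            j = i
            while j < n and out[j] is None:
                j += 1  # j ends one past the last None of this run
            if j - i < max_gap_days:
                for k in range(i, j):
                    out[k] = 0.0
            i = j
        else:
            i += 1
    return out


def restore_if_neighbours_wet(filled: List[Optional[float]],
                              original: List[Optional[float]],
                              neighbour_totals: List[float],
                              wet_threshold_mm: float = 1.0) -> List[Optional[float]]:
    """Step 2: re-establish no-data where neighbours (20 km) were wet.

    wet_threshold_mm is a hypothetical value chosen for illustration.
    """
    out = list(filled)
    for k, (orig, total) in enumerate(zip(original, neighbour_totals)):
        if orig is None and out[k] == 0.0 and total > wet_threshold_mm:
            out[k] = None
    return out


raw = [1.2, None, None, 0.0, 3.4]
step1 = fill_short_gaps(raw)
step2 = restore_if_neighbours_wet(step1, raw, [0.0, 5.0, 0.0, 0.0, 0.0])
print(step1)  # [1.2, 0.0, 0.0, 0.0, 3.4]
print(step2)  # [1.2, None, 0.0, 0.0, 3.4]
```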
We have also removed suspicious values from the temperature time series. There were two cases of suspiciously high values of temperature at two stations, both with periods of these incorrect values no longer than a month. These data were removed from the data set. Possibly the errors may be attributed to the failure of the measuring device. In addition daily and monthly values of precipitation and maximum and

Rainfall and snowfall under-catch correction
Local precipitation measured in a rain gauge is not representative of areal precipitation. This is due to various factors, e.g. wind speed, rain gauge shielding, or precipitation type. Several models for correcting precipitation for under-catch have been developed. In Poland an empirical model proposed by Chomicz (1976) is often used. The Chomicz (1976) model considers only liquid precipitation and has parameters available only within Polish borders. Hence, we had to use another model that is valid internationally and accounts for both solid and liquid precipitation. We have chosen the Richter (1995) model for correcting snowfall and rainfall under-catch, which is recognized by WMO (Goodison et al., 1998). The rainfall and snowfall under-catch correction is applied in the following steps:
1. Mean daily air temperature (°C) is calculated for all precipitation stations as $t = (t_n + t_x)/2$. Here $t_n$ and $t_x$ are the minimum and maximum daily temperatures (°C) obtained from our CPLFD-GDPT5 product, respectively (see Sect. 2.5.1). For the calculations we have used $t_n$ and $t_x$ values from the grid cell containing the respective meteorological precipitation station. We could not use the average temperatures measured at the stations because the latter were not recorded systematically at the majority of the stations. We did not want to use a blended approach (use measured temperature if available, otherwise use interpolated) either, because this would be harmful for calculation consistency.
3. The corrected precipitation (mm) is calculated based on the Richter (1995) formulae as $p_{\mathrm{c}} = p + b\,p^{\varepsilon}$, where $p$ is the measured precipitation total (mm), $b$ is the coefficient (-) for the influence of wind exposition of the measurement site, and $\varepsilon$ is a seasonally varying empirical coefficient (-) for the precipitation type (snow, mixed snow and rain, or rain). The range of $b$ and $\varepsilon$ values used in this paper is available in Richter (1995). The values of $b$ were set as for "medium shielding" for all stations apart from those in the mountains or close to the coast, where $b$ was set as for "low shielding" (Fig. 5). The rationale behind assigning different values of $b$ for different locations lies in the fact that wind speed is generally higher in mountains and at the seaside than in the lowlands.
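As a rough illustration of the first and last steps, the sketch below assumes that the correction has the additive power-law form $p_{\mathrm{c}} = p + b\,p^{\varepsilon}$ usually attributed to Richter (1995). The coefficient values in the example are placeholders chosen only to make the arithmetic visible; they are not taken from Richter's tables, which should be consulted for real use.

```python
def mean_temperature(t_min: float, t_max: float) -> float:
    """Mean daily temperature (deg C) from gridded minimum and maximum."""
    return (t_min + t_max) / 2.0


def corrected_precipitation(p: float, b: float, eps: float) -> float:
    """Under-catch correction, assuming the form p_c = p + b * p**eps.

    p   : measured daily total (mm)
    b   : exposure (shielding) coefficient, dimensionless
    eps : precipitation-type coefficient, dimensionless
    """
    if p <= 0.0:
        return 0.0  # dry days stay dry
    return p + b * p ** eps


# Placeholder coefficients (hypothetical, NOT values from Richter, 1995).
t = mean_temperature(-2.0, 6.0)
p_c = corrected_precipitation(4.0, b=0.3, eps=0.5)
print(t)    # 2.0
print(p_c)  # 4.6
```

In a full workflow the mean temperature would be used to choose the precipitation type, and hence the coefficients, for each station and day; that selection step is not reproduced here.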

Interpolation
Our meteorological observations with spatial coordinates, after the pre-processing steps described above (Sects. 2.2-2.4), were interpolated with two different kriging methods. Minimum and maximum temperatures were interpolated with kriging with external drift, and precipitation with universal kriging. The exponential variogram model has been used in each case, with the variogram parameters estimated automatically for each daily kriging with the weighted least-squares fit (Pebesma, 2014). The block kriging approach was used with the block size equal to the output square grid size, i.e. 5 km. Computations were conducted in the R software (R Core Team, 2015) with the "gstat" package (Pebesma, 2004).
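For readers unfamiliar with the exponential variogram model, the short Python sketch below evaluates one common parameterisation, with a nugget, a partial sill, and a range parameter. The parameter values are hypothetical; in the workflow described here they are fitted anew for every day, and the fitting itself (weighted least squares in gstat) is not reproduced.

```python
import math


def exponential_variogram(h: float, nugget: float,
                          partial_sill: float, range_param: float) -> float:
    """Semivariance at lag distance h for an exponential model.

    gamma(0) = 0 by definition; for h > 0 the curve starts at the nugget
    and approaches nugget + partial_sill asymptotically.
    """
    if h == 0.0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


# Hypothetical parameters: distances in km, semivariance in squared units.
for lag_km in (0.0, 5.0, 50.0, 500.0):
    print(lag_km, round(exponential_variogram(lag_km, 0.1, 1.0, 50.0), 4))
```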

Temperature kriging
Both minimum and maximum temperatures were interpolated in the same way. Kriging with external drift was used in order to account for the temperature variability with elevation. This approach was used in other similar studies (e.g. Hattermann et al., 2005; Haylock et al., 2008). The external drift variable was elevation (m) obtained from the Shuttle Radar Topography Mission (SRTM) digital elevation model aggregated to a 5 km grid. In order to remove the no-data values from the SRTM data, the elevation of seas and oceans was relabelled to 0.0 m.

Precipitation kriging
Precipitation was interpolated using a two-step approach combining the universal kriging of the precipitation data (first step) with the indicator kriging of the precipitation occurrence data (second step). This approach was selected because it gave good results for a similar problem (e.g. Herrera et al., 2012). Universal kriging was chosen because the trend was not removed from the precipitation data beforehand. Indicator kriging was applied on binary data in order to reduce the smoothing effect around 0-value zones. The daily precipitation totals $p$ at each station were reclassified to binary according to $I(p) = 1$ if $p \geq 0.1$ mm and $I(p) = 0$ otherwise. The 0.1 mm threshold value was used as it reflects the error due to rain gauges which is measured by ISO standards. As a result of the indicator kriging a raster of real values ranging between 0 and 1 was obtained; these values represent probabilities of a day being wet, i.e. precipitation being equal to 0.1 mm or higher, or "wet-day probabilities". This raster was used to mask the very small precipitation totals obtained from the universal kriging of precipitation data (first step).
The mask was applied if the indicator kriging interpolation value was smaller than a threshold. Usually a value of 0.5 is selected as the threshold because it represents the 50 % probability, but in our case better results were obtained with a smaller threshold, equal to 0.1. In the final step any negative precipitation values (if still present) were changed to 0.
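The post-processing of the two kriging outputs can be sketched in a few lines of Python. The sketch assumes that "masking" means setting the precipitation of a grid cell to 0 when its wet-day probability is below the threshold; the function names and the sample values are illustrative only.

```python
from typing import List

WET_DAY_MM = 0.1  # wet-day limit stated in the text


def wet_indicator(p_mm: float) -> int:
    """Binary reclassification of a station total: 1 = wet, 0 = dry."""
    return 1 if p_mm >= WET_DAY_MM else 0


def mask_precipitation(p_uk: List[float], wet_prob: List[float],
                       threshold: float = 0.1) -> List[float]:
    """Combine universal-kriging totals with indicator-kriging probabilities.

    Cells with wet-day probability below the threshold are set to 0
    (assumed meaning of the mask); remaining negatives are clipped to 0.
    """
    out = []
    for p, w in zip(p_uk, wet_prob):
        if w < threshold:
            p = 0.0
        out.append(max(p, 0.0))
    return out


masked = mask_precipitation([0.05, 2.0, -0.3, 1.0], [0.02, 0.9, 0.5, 0.05])
print(masked)  # [0.0, 2.0, 0.0, 0.0]
```

With the more conventional threshold of 0.5 a larger share of cells would be set to 0, which is why the choice of threshold matters for the amount of light precipitation retained.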

Cross validation
For each daily interpolation a cross validation was performed for all stations; i.e. each station was removed from the sample one at a time, and the remaining stations were used to predict the value of the missing station.
The cross validation was conducted on both a temporal and spatial scale. On the temporal scale the errors were calculated for each day from all stations having data on this day. Due to a high number of records in the temporal scale, we present the results in the form of a descriptive statistics table. On the spatial scale the errors are calculated for each station from all of a station's available daily values. The number of records on the spatial scale calculation is equal to the number of meteorological stations used; hence, we present the result in the form of maps.
The interpolation errors were quantified using two functions: (1) the Pearson's correlation coefficient (ρ (-)) and (2) the root-mean-squared error normalized to the standard deviation of the observed data (RMSEsd (-)):
$$\mathrm{RMSEsd} = \frac{1}{\sigma_Y}\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)^2},$$
where Y and Ŷ are respectively the observed and interpolated values of a given variable (precipitation or temperature), N is the number of observations (number of stations in the spatial approach or number of days in the temporal approach), and σ_Y is the standard deviation of observations. The ρ values show the collinearity of the observed and interpolated data, and the RMSEsd values show the interpolation error as a fraction of the observation standard deviation. Note that ρ and RMSEsd cannot be calculated for days with no observed precipitation (i.e. precipitation at all stations is 0.0 mm) because the standard deviation is 0.0 mm (N = 154). Because of that, these days were excluded from the cross-validation analysis.
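The leave-one-out procedure and the two scores can be illustrated with the following Python sketch. Two simplifications are made and should be read as assumptions: inverse distance weighting is swapped in for kriging so that the example stays self-contained, and the population standard deviation is used for the normalization because the text does not say which estimator was applied. The station coordinates and values are made up.

```python
import math
from statistics import mean, pstdev


def pearson_rho(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def rmse_sd(obs, pred):
    """RMSE divided by the (population) standard deviation of obs."""
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return rmse / pstdev(obs)


def idw_predict(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate; a stand-in for kriging."""
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value
        w = d ** -power
        num += w * value
        den += w
    return num / den


def leave_one_out(stations):
    """Predict each station from all remaining stations."""
    preds = []
    for i, (x, y, _) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        preds.append(idw_predict(x, y, others))
    return preds


# Hypothetical stations: (x km, y km, observed value).
stations = [(0, 0, 1.0), (10, 0, 2.0), (0, 10, 3.0), (10, 10, 4.0)]
obs = [s[2] for s in stations]
preds = leave_one_out(stations)
print(preds, pearson_rho(obs, preds), rmse_sd(obs, preds))
```

An RMSEsd below 1 means that the interpolation error is smaller than the spread of the observations themselves, which is the reading used throughout the cross-validation results.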

Cross validation of precipitation
The daily ρ and RMSEsd statistics for precipitation show that 75 % of ρ values are higher than 0.47 and RMSEsd values are lower than 0.93 (Table 1). Median ρ is 0.65 and median RMSEsd is 0.79. The majority of the RMSEsd values do not exceed 1 standard deviation, and nearly all ρ values are positive.
The median of daily RMSEsd values aggregated in years is negatively correlated (−0.72) with the number of available stations (Fig. 7), sharply decreasing as the number of stations increases and reaching an equilibrium in the 1980s. This suggests that the kriging errors are dependent on the density of the observation network. We also found that the interpolation results before 1960 are associated with higher uncertainty (Fig. 7).
When considering the RMSEsd calculated spatially for all stations, the results show a clear pattern of higher errors at the edge of the interpolation area, particularly in Belarus, Ukraine, and Slovakia (Fig. 6). Notably, most of the stations with high errors come from sources other than IMGW-PIB. Indeed, stations at the edge of the interpolation area which are managed by IMGW-PIB or DWD do not show the higher errors (cf. NE and W boundary). The median RMSEsd on the spatial scale is 0.50. An analogous situation is obtained for the ρ spatial pattern at the meteorological stations (Fig. 6), with the median ρ equal to 0.87.
A similar interpolation study was conducted by Ly et al. (2011). In their study the error was quantified by RMSE not normalized to standard deviation; thus, the units were millimetres (mm). After recalculation of our RMSEsd back to RMSE, the values show similar ranges to Ly et al. (2011), with an interquartile range of 0.5-2.5 mm and a 97.5 percentile of 5.5 mm, suggesting that our errors are within an acceptable range.

Cross validation of temperature
The statistics of daily RMSEsd show the median equal to 0.54 for the minimum temperature and 0.47 for the maximum temperature (Table 1). Nearly all the RMSEsd values in both cases do not exceed 1 standard deviation, and the ρ values are positive. The daily ρ statistics show that 75 % of ρ values are above 0.77 for minimum temperature and above 0.86 for maximum temperature. The median of ρ is 0.84 for the minimum temperature and 0.88 for the maximum temperature.
The median of daily RMSEsd values for minimum temperature aggregated in years is not correlated (0.00) with the number of available stations (Fig. 10). The RMSEsd reached a minimum in the 1970s. Since then, however, it has been gradually increasing (at a small rate), which does not seem to be related to changes in the number of available stations. The situation is only slightly different for the maximum temperatures (Fig. 11), for which correlation with the number of stations is weak (0.22). Although the lowest RMSEsd can be observed for the 1960s, since then a gradual increase (again, at a small rate) can be observed. It should also be noted that the range of RMSEsd is not very wide, in general (0.48-0.6 for the minimum temperature and 0.42-0.5, neglecting one outlier, for the maximum temperature). Overall, it seems that our kriging errors for temperature are not dependent on the density of the observation network. However, as in the case of precipitation interpolation, the results before 1960 are associated with higher uncertainties.
When the RMSEsd results calculated spatially for all stations are analysed, the minimum temperature shows rather uniformly distributed values with a few outliers located at the boundary of the interpolation area, mostly in the mountains (southern border, Fig. 8). An analogous situation is observed for maximum temperatures (Fig. 9). The median RMSEsd on the spatial scale is equal to 0.17 for minimum temperature and 0.10 for maximum temperature. An analogous situation is observed for the ρ spatial pattern for the meteorological stations, with the median ρ equal to 0.99 both for minimum and maximum temperature. A similar interpolation exercise was conducted by Carrera-Hernández and Gaskin (2007). In their study the error was quantified by ρ². The ranges of ρ² obtained in their study were 0.73-0.88 for the maximum temperature and 0.68-0.96 for the minimum temperature. These results are very similar to ours (after recalculating ρ to ρ²), suggesting good quality of the gridded temperature data set.

Consistency with climatic data
The precipitation and minimum- and maximum-temperature gridded products were analysed for consistency with long-term climatic data. Since the majority of our product spatial coverage is in Poland, we have used long-term climatic maps from the Polish meteorological service (IMGW-PIB) for comparison. The IMGW-PIB maps were developed for the period 1971-2000 and include precipitation totals, 5 % minimum temperature, and 95 % maximum temperature (IMGW-PIB, 2015). For purposes of the comparison analysis we have constructed analogous maps from a 1971-2000 subset of CPLFD-GDPT5 by (1) averaging the annual precipitation totals, (2) calculating the 5 % quantile from the daily minimum temperatures, and (3) calculating the 95 % quantile from the daily maximum temperatures. This comparison analysis has several limitations. Both products were constructed by different interpolation methods and using data from a different collection of meteorological stations. Another limitation in this comparison is that input data were subjected to different preprocessing steps, of which the most important is the different rainfall and snowfall under-catch correction. Due to these limitations we do not expect to have an ideal match between our and IMGW-PIB climatic data. However, we expect that ranges and diversity of climatic data would present similar spatial patterns if our gridded product were constructed properly.
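The three climatic indices are simple per-cell statistics, sketched below in Python for a single grid cell. The quantile definition (linear interpolation between order statistics) is an assumption, as the text does not specify one, and the numbers are toy data rather than values from the product.

```python
from statistics import mean


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def mean_annual_total(daily_by_year):
    """Average of yearly precipitation sums; input maps year -> daily totals."""
    return mean(sum(days) for days in daily_by_year.values())


# Toy single-cell data (hypothetical values).
tmin_series = [float(v) for v in range(-50, 51)]        # 101 daily values
precip = {1971: [1.0, 2.0], 1972: [3.0, 4.0]}           # two short "years"

print(quantile(tmin_series, 0.05))   # 5 % minimum temperature
print(quantile(tmin_series, 0.95))   # would be applied to tmax in practice
print(mean_annual_total(precip))     # mean annual precipitation total
```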
The pattern of latitudinal precipitation totals with 750 mm in the north, a decrease in the centre, and the maximum in the south was well preserved by the CPLFD-GDPT5 product (Fig. 12). The range of precipitation totals in CPLFD-GDPT5 in Poland (1971-2000) was from 552 to 1402 mm, whereas for the IMGW-PIB the range was from 450-500 to 1250-1300 mm. The central regions with the lowest (450-500 mm) precipitation were overestimated in CPLFD-GDPT5. Similarly, CPLFD-GDPT5 overestimates by about 100 mm the highest precipitation totals in the central-southern region. We believe this discrepancy is due to the rainfall and snowfall under-catch correction used in our study, which assigns higher correction factors to snowfall than to rainfall.
The longitudinal 5 % minimum-temperature pattern with −6 °C in the east, −12 °C in the west, and the minimum (−13 °C) in the central-south was well preserved by CPLFD-GDPT5 (Fig. 13). The range of 5 % minimum temperatures in CPLFD-GDPT5 in Poland (1971-2000) was from −15.1 to −5.8 °C, whereas for the IMGW-PIB the range was from −14 to −13, to −5 to −6 °C. We do not identify any substantial difference in the temperature patterns in both data sources except an island of −6 to −7 °C located in the central-eastern region that was not present in CPLFD-GDPT5. We believe that this and other, smaller discrepancies are a result of methodological differences in the construction of both data sources, especially the use of different meteorological stations' data sets.
The complex latitudinal pattern in the north, the longitudinal pattern in the centre and south, and the elevation-dependent pattern in the south (Sudeten and Carpathian Mountains) for 95 % maximum temperatures were well preserved by CPLFD-GDPT5 (Fig. 14). The range of 95 % maximum temperatures in CPLFD-GDPT5 in Poland (1971-2000) was from 15.7 to 28.0 °C, whereas for the IMGW-PIB the range was from 18-19 to 27-28 °C. Again, we do not identify any substantial difference in the temperature patterns in both data sources. However, CPLFD-GDPT5 underestimates the temperature in the peaks of the Carpathian Mountains by about 2 °C. More clutter is also observed in our product in the entire region, especially in the south (mountains). We believe that these discrepancies are a result, as in the previous cases, of methodological differences in the construction of both data sources, especially the use of high-resolution and detailed elevation data in our study, including differences in the available stations used.

Applicability
This product was developed with the purpose of its further (re)use for earth-system modelling, and in particular for hydrological modelling and climate impact studies. The spatial resolution (5 km) of CPLFD-GDPT5 ensures that it will be useful not only for regional studies, where generally lower-resolution data would be appropriate, but also for modelling on the local scale, where high spatial resolution is necessary to capture the variability.
As an example application of CPLFD-GDPT5 for hydrological modelling, we have set up, calibrated, and validated the SWAT model for the Vistula and Oder basins (Piniewski et al., 2016). SWAT is a process-based, semi-distributed, continuous-time hydrological model that simulates the movement of water, sediment, and nutrients on a catchment scale with a daily time step (Arnold et al., 1998). Apart from the CPLFD-GDPT5 precipitation and temperatures that are major input data, the SWAT set-up of the Vistula and Oder basins uses various spatial input data such as topography, hydrography, land cover, and soils, all with quite complex parametrizations. Figure 15a shows spatial variability of mean annual potential evapotranspiration (PET) calculated in SWAT using the Hargreaves (1975) method, relying on our minimum- and maximum-temperature data. In the Hargreaves method PET is proportional to mean temperature (approximated by the arithmetic mean of minimum and maximum temperature) and the difference between maximum and minimum temperature (being a proxy of solar radiation). As illustrated in many studies, the Hargreaves PET is highly correlated to other methods for PET estimations (Lu et al., 2005) and to PET observations (Hargreaves and Allen, 2003). Moreover, it was found particularly useful for SWAT modelling by decreasing the observed PET estimation error when compared to the Penman-Monteith PET estimation (Earls and Dixon, 2008). Not surprisingly, the spatial pattern in simulated PET largely follows the pattern of maximum temperature (Fig. 14).
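To make the dependence of PET on the two temperature grids concrete, the sketch below uses the widely cited temperature-based Hargreaves form, PET = 0.0023 · Ra · (Tmean + 17.8) · sqrt(Tmax − Tmin). This specific form and its constants are an assumption made for illustration; the exact equation implemented in the SWAT version used by the authors should be taken from the SWAT documentation. Extraterrestrial radiation `ra_mm_per_day` is passed in as a ready-made value (expressed as equivalent evaporation) rather than computed from latitude and date.

```python
import math


def hargreaves_pet(t_min: float, t_max: float, ra_mm_per_day: float) -> float:
    """Daily PET (mm) from min/max temperature, assumed Hargreaves form.

    t_min, t_max   : daily minimum and maximum air temperature (deg C)
    ra_mm_per_day  : extraterrestrial radiation as equivalent evaporation
    """
    t_mean = (t_min + t_max) / 2.0
    t_range = max(t_max - t_min, 0.0)  # guard against inverted inputs
    return 0.0023 * ra_mm_per_day * (t_mean + 17.8) * math.sqrt(t_range)


# Hypothetical summer day: 10 to 20 deg C, Ra equivalent to 15 mm per day.
pet = hargreaves_pet(10.0, 20.0, 15.0)
print(round(pet, 2))  # 3.58
```

Because both the mean and the daily range enter the formula, errors in either temperature grid propagate directly into the simulated PET.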
PET constitutes an upper bound for actual evapotranspiration, which is another variable simulated in SWAT, crucial for the process of model calibration, typically performed using measured discharge data. Figure 15b-c show two examples of calibration plots illustrating simulated (95 % prediction uncertainty band and one simulation with the highest value of objective function) and measured daily stream flow from the Vistula and Oder SWAT model developed in Piniewski et al. (2016). The gauges were selected to demonstrate high simulation skill of CPLFD-GDPT5 across a range of scales: the Oder catchment upstream of Gozdowice is 110 000 km², while the Drweca River upstream of Rodzone is 1700 km². Both the positive visual inspection of the hydrograph and the high values of objective functions (e.g. coefficient of determination equal to 0.81 and 0.71, respectively, and percent bias of 1.4 and −5.8 %, respectively) confirm the usefulness and the quality of the developed interpolation product. Piniewski et al. (2016) found that station density (a proxy for kriging error) is positively correlated with the values of goodness-of-fit measures across a set of 80 calibration catchments. Therefore, it is recommended that station density should always be checked prior to the direct use of CPLFD-GDPT5 for hydrological modelling applications in a given study area.
CPLFD-GDPT5 could also be well suited as an observation-based reference data set for bias correction of GCM/RCM climate projections, in the same way as the WFD-ERA40 served as the reference for bias correction at the global scale using the ISI-MIP approach (Hempel et al., 2013) or as the SPAIN02 data set (Herrera et al., 2015) served for bias correction of EURO-CORDEX (Coordinated Downscaling Experiment, European Domain) models at the regional scale (Casanueva et al., 2015). The potential for further use of bias-corrected climate projections (in combination with other requested data sets) in model impact studies is huge and includes such fields as water, agriculture, biomass, coastal infrastructure, and health (Warszawski et al., 2014).
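The two goodness-of-fit measures quoted for the calibration gauges can be computed as in the Python sketch below. The coefficient of determination is taken as the squared Pearson correlation, and percent bias is defined here as 100 · Σ(sim − obs) / Σ(obs); the sign convention for percent bias differs between tools, so this definition is an assumption, and the flow values are invented for illustration.

```python
import math
from statistics import mean


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return (cov / math.sqrt(var_o * var_s)) ** 2


def percent_bias(obs, sim):
    """Percent bias; positive means overestimation under this convention."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


# Hypothetical daily flows (m3/s): the simulation is 10 % too high.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 2.2, 3.3, 4.4]
print(round(r_squared(observed, simulated), 3))     # 1.0
print(round(percent_bias(observed, simulated), 1))  # 10.0
```

The example also shows why both measures are reported together: a simulation can be perfectly correlated with the observations and still carry a systematic bias.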

Data access
The CPLFD-GDPT5 product is available in NetCDF and GeoTIFF formats. The gridded structure of the data and the NetCDF and GeoTIFF data formats ensure that it will be easily processed in GIS and data analysis software (e.g. R for both NetCDF and GeoTIFF; list of NetCDF manipulation software: http://www.unidata.ucar.edu/software/netcdf/software.html). We provide some example R scripts with which to read the data and conduct some basic processing (Supplement).
The data are publicly available in the 3TU.Datacentrum repository under the doi:10.4121/uuid:e939aec0-bdd1-440f-bd1e-c49ff10d0a07 (Berezowski et al., 2015b). The NetCDF file naming convention is VariableForTimeStep.nc. Every NetCDF file is accompanied by an additional description in a *.txt file and follows the CF-1.0 convention. TimeStep can be days, months, or years. Variable can be tmin/tmax for minimum/maximum air temperature (°C) or preci for precipitation (kg m⁻²). Each daily grid for precipitation or temperature is also stored as a separate GeoTIFF file. The naming convention for GeoTIFF is prefixYYYYMMDD.tif, with prefix being "pre" for precipitation, "tmin" for minimum temperature, and "tmax" for maximum temperature, whereas YYYYMMDD is the date format.
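When processing many daily grids it is convenient to generate the GeoTIFF file names from dates. The Python sketch below follows the prefixYYYYMMDD.tif convention as described in the text; the helper name and the dictionary keys are hypothetical, and the resulting names should be checked against the actual repository listing before use.

```python
from datetime import date

# Prefixes as given in the naming convention; the keys are illustrative labels.
GEOTIFF_PREFIX = {"precipitation": "pre", "tmin": "tmin", "tmax": "tmax"}


def geotiff_name(variable: str, day: date) -> str:
    """Build a daily GeoTIFF file name of the form prefixYYYYMMDD.tif."""
    return f"{GEOTIFF_PREFIX[variable]}{day:%Y%m%d}.tif"


print(geotiff_name("precipitation", date(1951, 1, 1)))  # pre19510101.tif
print(geotiff_name("tmax", date(2013, 12, 31)))         # tmax20131231.tif
```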

Conclusions
We have constructed a 5 km gridded product of daily precipitation, minimum air temperature, and maximum air temperature intended primarily for use as input data for local
Earth Syst. Sci. Data, 8, 127-139, 2016, www.earth-syst-sci-data.net/8/127/2016/
Various preprocessing steps were introduced in order to filter missing data and correct the precipitation for under-catch. The quality of the product is assessed by a cross-validation procedure conducted in parallel with the kriging interpolation. The cross validation shows high correlations and root-mean-squared errors lower than 1 standard deviation of the observations in all three interpolated variables. However, some particular stations located at the border of the study area show slightly higher errors and lower correlations. We also evaluated the consistency of our products with climatic data provided by the Polish meteorological organization (IMGW-PIB). The consistency check confirms the high quality of the products, with only small differences resulting from different methodologies and number of selected stations across the region. Finally, we show an example application of the gridded product for hydrological modelling in SWAT. The results show very good discharge modelling efficiency in both a small and a large catchment, which shows that the data set serves its purpose across a wide range of scales. The high-resolution gridded data set will certainly add value when used as a reference in bias adjustment of regional climate model results (e.g. EURO-CORDEX within the CORDEX Initiative) by providing more reliable climate projections for Poland. The data set is provided in GeoTIFF and NetCDF format in order to provide ease of access for most of the modelling community. Nonetheless, we provide sample R scripts for managing the data in the Supplement.
The data set and methods presented herein leave several avenues for further research. It would be interesting to compare the data set with other data sets of lower resolution (e.g. E-OBS) in order to show its added value for high-resolution hydrological modelling. Moreover, the Hargreaves PET estimated with the CPLFD-GDPT5 temperatures could be compared with other PET estimation methods (e.g. Penman-Monteith) and with observations. Lastly, we believe that there is still room to improve the interpolation by testing other algorithms.
The CPLFD-GDPT5 product was constructed for the period 1951-2013. New meteorological data and novel approaches to interpolation algorithms are continuously appearing. Hence, we are planning to update the product both by extending the time span and by testing new interpolation algorithms. The extension is planned on a 3-year basis.
Deutscher Wetterdienst (DWD), European Climate Assessment and Dataset (ECAD), and National Oceanic and Atmospheric Administration-National Climatic Data Center (NOAA-NCDC) for providing meteorological data. The anonymous reviewer is acknowledged for providing comments that considerably improved the manuscript. Tobias Conradt from PIK, Potsdam, is acknowledged for support in getting access to German climate data. Mikołaj Piniewski is grateful for support from the Alexander von Humboldt Foundation.

Figure 1. The spatial extent for the CPLFD-GDPT5 temperature and precipitation products. Countries are labelled with black national codes. The Oder and Vistula basins are labelled in grey.

Figure 3. Number of meteorological stations for temperature observations per year from 1951 to 2013.

Figure 4. Spatial distribution of meteorological stations for temperature and precipitation used for interpolation of the CPLFD-GDPT5 product.

Figure 7. Annual RMSEsd median (blue) and number of available stations per year (pink) for precipitation in the period 1951-2013. The daily results, also summarized in Table 1, were used for calculating the annual medians.

Figure 8. Minimum temperature ρ (top) and RMSEsd (bottom) values calculated for stations in the period 1951-2013. National borders (black lines) are labelled with country codes.

Figure 9. Maximum temperature ρ (top) and RMSEsd (bottom) values calculated for stations in the period 1951-2013. National borders (black lines) are labelled with country codes.

Figure 10. Annual RMSEsd median (blue) and number of available stations per year (pink) for minimum temperature in the period 1951-2013. The daily results, also summarized in Table 1, were used for calculating the annual medians.

Figure 11. Annual RMSEsd median (blue) and number of available stations per year (pink) for maximum temperature in the period 1951-2013. The daily results, also summarized in Table 1, were used for calculating the annual medians.

Figure 12. Comparison of CPLFD-GDPT5 precipitation-totals contours with the IMGW-PIB precipitation-totals map for the 1971-2000 period. Contours are in 50 mm intervals. The IMGW-PIB map also shows major rivers (blue lines) and major cities (labelled black circles). The IMGW-PIB map is adapted from the Climatic Maps of Poland (IMGW-PIB, 2015).

Figure 15. Example application of CPLFD-GDPT5 in the hydrological model SWAT of the Vistula and Oder basins. (a) Mean annual potential evapotranspiration calculated using the Hargreaves method based on daily minimum and maximum temperature data. (b) Simulated and observed daily stream flow on the Oder River at Gozdowice gauging station. (c) Simulated and observed daily stream flow on the Drwęca River at Rodzone gauging station. The green band denotes the 95 % prediction uncertainty; blue and red lines denote observed and simulated (best solution) flows, respectively.