An open-access CMIP5 pattern library for temperature and precipitation: Description and methodology

Pattern scaling is used to efficiently emulate general circulation models and explore uncertainty in climate projections under multiple forcing scenarios. Pattern scaling methods assume that local climate changes scale with a global mean temperature increase, allowing for spatial patterns to be generated for multiple models for any future emission scenario. Of the possible techniques used to generate patterns, the two most prominent are the delta and least squared regression methods. For uncertainty quantification and probabilistic statistical analysis, a library of patterns with descriptive statistics for each file would be beneficial, but such a library does not presently exist. This paper presents patterns from all CMIP5 models for temperature and precipitation on an annual and sub-annual basis, along with the code used to generate these patterns. We explore the differences and statistical significance between patterns generated by each method and assess performance of the generated patterns across methods and scenarios. Regardless of epoch chosen, local temperature sensitivity to global mean temperature change is similar, with differences of ≤ 0.2 °C. Differences in patterns across seasons between methods and epochs were largest in high latitudes (60-90° N/S). Bias and mean errors between modeled and pattern-predicted output were smaller for the linear regression method than for the delta method. Across scenarios, differences in the linear regression method patterns were more statistically significant, especially at high latitudes. We found that pattern generation methodologies were able to approximate the forced signal of change to within ≤ 0.5 °C, but choice of pattern generation methodology for pattern scaling purposes should be informed by user goals and criteria. The dataset and netCDF data generation code are available at http://doi.org/10.5281/zenodo.235905.


Data Analysis
The delta pattern (DP) is described as follows: for each model (M) and future scenario (S), local temperature change (T_L; two-dimensional, with a value for each latitude and longitude pair) is normalized by global mean temperature change (T_G; a scalar value), with respect to a 30 year reference epoch from the CMIP5 historical simulation.
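As an illustration of the delta calculation described above, the following is a minimal sketch in Python with NumPy, not the code used to build the library. The function name, argument layout, and the epoch years in the usage note are assumptions made for the example; the only idea taken from the text is that the local epoch-mean change is divided by the global epoch-mean change.

```python
import numpy as np

def delta_pattern(local_temp, global_temp, years, ref_epoch, fut_epoch):
    """Local epoch-mean change normalized by global epoch-mean change.

    local_temp  : array (time, lat, lon) of annual local temperature
    global_temp : array (time,) of annual global mean temperature
    years       : array (time,) with the calendar year of each time step
    ref_epoch, fut_epoch : (first_year, last_year) tuples, inclusive
                           (hypothetical arguments, chosen for this sketch)
    """
    ref = (years >= ref_epoch[0]) & (years <= ref_epoch[1])
    fut = (years >= fut_epoch[0]) & (years <= fut_epoch[1])
    # Epoch-mean change at every grid cell (two-dimensional field).
    d_local = local_temp[fut].mean(axis=0) - local_temp[ref].mean(axis=0)
    # Epoch-mean change of the global mean (scalar).
    d_global = global_temp[fut].mean() - global_temp[ref].mean()
    return d_local / d_global
```

For example, calling `delta_pattern(local, gmt, years, (1961, 1990), (2071, 2100))` with illustrative 30 year epochs returns a latitude by longitude field of local change per degree of global mean change.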
All epochs were thirty years in length, as it was assumed that the length of epochs used should not alter the resulting pattern. Barnes and Barnes (2015) argue that the ideal epoch length depends on minimizing variable variance by selecting an epoch length with a high signal-to-noise ratio, which in turn depends largely on the length of the time series and on whether the trend in the time series is linear. They found that for temperature, one-third the length of the time series is ideal, and for a 100 year time
In impact studies, a later reference epoch is more suitable because it is more representative of the current climate, and hence of what socio-economic systems were already somewhat adapted to (Fowler et al., 2007; Herger et al., 2015).
In adaptation/mitigation analyses, a pre-industrial control simulation epoch is often used as the baseline from which change is diagnosed, as this period is likely to provide the largest deviation from projected future climate, but for pattern generation, an epoch in the latter half of the 20th Century is often used (Tebaldi and Arblaster, 2014).
For the aforementioned epoch variations, we used two reference epochs to generate patterns: a late 19th and a late 20th Century average, hereafter referred to as L19C and L20C respectively. The bulk of the epoch patterns
The least squared regression (LSR) patterns were calculated from future forcing scenarios only. We use a least squares approach, which provides the best fit for calculating the regression pattern:
T_L = α + β T_G + ε
In this equation, T_G is the GMT time series (one-dimensional, unsmoothed), and T_L is the gridded time series (three-dimensional). β is a two-dimensional field of regression slopes, and ε is a two-dimensional residual term (error) stemming from linearly fitting the dependent variable to the predictor. α is the y-intercept, which we take to be 0 by only computing change, not absolute temperature.
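The regression described above can be sketched per grid cell as follows. This is a hedged illustration in Python with NumPy rather than the library's own code; the function name and array layout are assumptions. It fixes the intercept at zero, as stated in the text, so the slope at each cell is the sum of products of local and global change divided by the sum of squared global change.

```python
import numpy as np

def regression_pattern(local_change, global_change):
    """Zero-intercept least squares slope of local change on global change.

    local_change  : array (time, lat, lon), change relative to a baseline
    global_change : array (time,), GMT change relative to the same baseline
    Returns (beta, residual): beta has shape (lat, lon), residual has the
    shape of local_change.
    """
    nt = global_change.shape[0]
    y = local_change.reshape(nt, -1)               # (time, cells)
    # Closed-form slope for a fit through the origin, one value per cell.
    beta = (global_change @ y) / (global_change @ global_change)
    residual = y - np.outer(global_change, beta)
    return beta.reshape(local_change.shape[1:]), residual.reshape(local_change.shape)
```

Fitting all cells with one matrix product avoids an explicit loop over latitude and longitude; a fit with a free intercept would need the means removed first.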
To examine the assumption that the multi-model ensemble probability distribution and the sample mean between patterns and scenarios generated by each method were not significantly different, we calculated the Student's t-distribution probability.
where A(x) is the area of the grid box x and sums were calculated over all x.
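One common way to form an area-weighted error of the kind referred to above is sketched below in Python with NumPy. It is an assumption-laden example, not the paper's equation: it assumes a regular latitude-longitude grid, approximates the grid box area A(x) by the cosine of latitude, and uses a hypothetical function name.

```python
import numpy as np

def area_weighted_rmse(predicted, actual, lat_deg):
    """Area-weighted root mean square error over a (lat, lon) field.

    predicted, actual : arrays (lat, lon)
    lat_deg           : array (lat,) of grid-cell centre latitudes in degrees
    Assumption: cell area is proportional to cos(latitude).
    """
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones(predicted.shape)
    sq_err = (predicted - actual) ** 2
    # Weighted mean of the squared error, summed over all grid boxes.
    return np.sqrt((weights * sq_err).sum() / weights.sum())
```

Without the weights, the many small grid boxes near the poles would dominate the sum, which matters here because pattern errors are largest at high latitudes.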

Pattern Differences
For the delta methodology, choice of epoch can be important. In our ensemble, at the local spatial scale, absolute temperature differences between reference epochs were small, but differences in future epochs often exceeded 2 °C in rcp8.5, particularly over land and at high latitudes (Figure 2). Differences in variance across epochs were also small (Figure 3), and these relatively small differences in variance between epochs were not likely to affect the resulting temperature patterns. This may not be true when using other climate variables like precipitation, which may have large year-to-year or decadal natural variability in the observed period.
Patterns across epochs were similar despite differences in rate of GMT change and absolute temperature differences in epochs (Figure 4). Differences between reference epoch patterns were largest in the Northern Hemisphere mid and high latitudes, but differences were generally not significant, except for the Great Lakes region of North America in December through
Regardless of epoch chosen for the delta method, the resulting patterns were similar to the regression patterns (Figure 5).
The key idea in either pattern scaling method is that local temperature change scales with global temperature change, despite different ways of calculating the local/global relationship. With the exception of the high latitudes, the differences in the annual pattern were small (< 0.2 °C). Pattern differences were similar across seasons, but differences in patterns were largest in DJF, particularly for the delta pattern with the earlier reference period. The regression patterns have a stronger temperature sensitivity to GMT change in the Northern Hemisphere and a weaker temperature sensitivity in the Southern Hemisphere at high latitudes as compared to the delta methods. These differences in sensitivity stem from how each methodology captures the effect of Arctic amplification, where the warming trend in the Arctic is almost twice as large as the trend in the global average, but the effect of Arctic amplification on pattern generation is not explored here.
There were few regions where the patterns differ significantly (Figure 5), and there were fewer significant differences between the regression method and the delta method when using the L20C epoch than when using the L19C epoch. Significant differences between patterns generated from each method were shown in the Baltic/N. European region for both epochs in the annual and DJF pattern, but in the earlier epoch, significant differences across seasons were shown in the Northwest Pacific region. In general, the temperature patterns across methods were very similar.
To evaluate performance of each pattern methodology, accuracy was based on how well the patterns approximated the linear GMT change of 1 °C simulated by each GCM. For this evaluation metric, the delta patterns largely underestimate the spatial pattern, particularly over land and mid-high Northern latitudes (Figure 6). The Antarctic region is both overestimated (L21C/L19C pattern) and underestimated (L21C/L20C pattern) by a magnitude of ≥ 0.15 °C, which is generally larger than the error in the regression pattern estimates. Also, as shown in Figure 4, the delta patterns have a strong temperature sensitivity over the Baltic/N. European region. Overall, it appears that the regression pattern scaling method underestimates the relationship between global temperature and local temperature, but the magnitude of this error is small (< 0.08 °C).
Emulator performance was also approximated by examining the RMSE between the actual and pattern-predicted anomaly (Table 2). For this metric, the regression patterns also outperform the delta patterns regardless of epoch. DJF RMSE values were higher than JJA RMSE values, and the rcp4.5 RMSE was consistently lower than the rcp8.5 RMSE across methods. This may be because the rcp8.5 patterns largely underestimate the relationship between global and local temperature, as seen in Figure 6. Nevertheless, Table 2 indicates that both methodologies do well at emulating actual model output.
Overall, the annual and seasonal patterns from each method were not significantly different from each other, regardless of reference epoch for the delta method. The differences were slightly larger when using an earlier reference epoch, but the regions where the ensemble differences were significant (above the 95% significance level) were small. Our small ensemble size (12 models with only one realization each) may have contributed to the lack of significance in differences across epoch patterns, particularly when using parametric tests such as the Student's t-test. A more robust analysis would include multiple realizations from all available models.

Scenario Differences
To test the assumption that local temperature sensitivity to global mean temperature change, regardless of methodology ( the calculated linear trend. In the rcp4.5 scenario, the differences between the two ensemble mean GMT changes were as much as 1 °C, which suggests that the way the global signal is calculated and the rate at which the signal changes play a key role in understanding the differences between methods across scenarios. This is further supported by Mitchell (2003), who found that the GMT rate of change can have a significant impact on response patterns.
There were significant differences between patterns generated across scenarios, and the resulting pattern differed by more than 0.5 °C in some regions (Figure 8). For the delta patterns, the largest differences across scenarios were in the Northern Hemisphere at high latitudes, areas where temperature variability is large (Figure 9). The differences in patterns generated by the regression method under different forcing scenarios were generally larger, with statistically significant differences in the mid-high latitudes, particularly in the Arctic, land areas bordering the Mediterranean, and the subtropical South Pacific. The rcp4.5 scenario also has a lower signal-to-noise ratio than the rcp8.5 scenario (Figure 8), which makes the rcp4.5 pattern more difficult to estimate because the signal is harder to distinguish from the noise.
Temperature change at high latitudes cannot be approximated by a linear relationship due to strong regional feedbacks, for example Arctic Amplification (Holland and Bitz, 2003), and therefore is not well predicted when using pattern scaling methods. The differences between scenarios are larger in the regression method, but both methods show similar spatial patterns.
To further examine why the regression method produces larger differences across scenarios, we looked at the linear fit of local temperature to GMT (Figure 10). In the rcp8.5 scenario, the R² values were large, but in the rcp4.5 scenario, R² values were much lower, particularly along the Antarctic continent and in the North Atlantic. Even though the global/local fit is poorer in the rcp4.5 scenario, the lower forcing scenario predicted pattern is more like the actual model output (Table 2).
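A map of fit quality like the one discussed above could be computed as in the following Python/NumPy sketch. It is illustrative only: the function name is hypothetical, and it assumes R² is defined as the squared Pearson correlation between the local series and the GMT series (the ordinary least squares definition with a free intercept), which may differ from the definition behind Figure 10.

```python
import numpy as np

def r_squared_map(local_temp, global_temp):
    """Squared correlation between each grid cell's series and the GMT series.

    local_temp  : array (time, lat, lon)
    global_temp : array (time,)
    Assumption: every grid cell varies in time (a constant cell would give
    a division by zero).
    """
    nt = global_temp.shape[0]
    y = local_temp.reshape(nt, -1)
    x = global_temp - global_temp.mean()
    yc = y - y.mean(axis=0)
    cov = x @ yc                                   # un-normalized covariance per cell
    r2 = cov ** 2 / ((x @ x) * (yc ** 2).sum(axis=0))
    return r2.reshape(local_temp.shape[1:])
```

Values near 1 indicate that local change follows GMT almost linearly, while low values flag cells where internal variability or nonlinear feedbacks dominate.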
Large differences in patterns across scenarios were mainly due to a larger local/global ratio at high latitudes in the rcp4.5 scenario as compared to the rcp8.5 scenario despite lower local and global trends (Figure 11). These differences at high latitudes result from a steep temperature change gradient and the fast rate of change after sea/land ice has melted. Sensitivity of high latitudes to even small changes in GMT is evident across scenarios, but the rcp4.5 scenario overestimates this relationship, resulting in substantial differences in patterns between the scenarios, particularly for the regression methodology.
Differences between patterns across scenarios are further examined by separating the land and ocean patterns (Figure 12).
The differences between scenarios for the regression method when isolating the land/ocean pattern were comparatively large, especially over the Arctic and Antarctic regions. For the regression method, the rcp4.5 ocean-only pattern sensitivity is ≥ 0.5 °C larger than the rcp8.5 ocean-only pattern sensitivity over the Arctic, and the rcp4.5 land-only pattern sensitivity is ≥ 0.5 °C larger than the rcp8.5 land-only pattern sensitivity over the Antarctic. The differences in patterns across scenarios for the delta method when isolating the land/ocean pattern were small except over the Arctic region, which shows strong seasonal differences (≥ 0.5 °C) in boreal autumn (SON). In this way the delta method is more consistent across future forcing scenarios, which should be taken result in larger pattern errors. We found that differences in patterns between scenarios are more evident in the regression method as compared to the delta method, but similar features appear in the patterns produced by the delta method. How models incorporate sea ice may also add to the variability of patterns across models, but this is a subject we have not explored.

Conclusions
The differences in patterns generated by each method were minor except at Northern Hemisphere high latitudes and along the
Choice of scenario can affect the resulting pattern, particularly at high latitudes. With the regression methodology pattern, the local temperature sensitivity to GMT is stronger when using the rcp4.5 scenario because the GMT trend is proportionally smaller and changes in GMT have a stronger effect on local temperature, particularly when strong mitigation is employed later in the simulation. Delta method patterns were more consistent across scenarios, with less heterogeneity in local temporal and spatial GMT sensitivity. With the assumption that different future forcing scenarios should not change the resulting pattern, the delta pattern is more consistent across scenarios, regardless of epoch chosen, despite differences in epoch trends being large.
Our pattern library was created because the online tools and software that generate pattern scaling products do not provide pattern data and diagnostics, and do not offer flexibility in use of an SCM for scaling. We have created a library of patterns with descriptive statistics for each output file, which we believe to be beneficial for uncertainty quantification and probabilistic statistical analysis.

Creation of a pattern library is the first step in our goal of exploring inter-model and future forcing uncertainty in climate projections. Our next steps will be to push the current boundaries of pattern scaling by exploring sub-annual pattern scaling, scaling measures of climate variability, and scaling of different variables, such as pH. Our efforts will be documented in future manuscripts, and all patterns will be added to the repository.

The pattern library is available on GitHub through the Joint Global Change Research Institute repository (https://github.com/JGCRI/).
The purpose of creating this pattern library was to allow researchers across various fields to efficiently use the statistical patterns generated by the described regression method to examine model response to change in global mean temperature for all the available CMIP5 models (41 models, at present). We also intend for those patterns to be easy to scale using a scalar generated from an SCM of one's choosing. To this end, included in each netCDF file for each model is:
The patterns range in size (1 MB to 165 KB) due to spatial resolution, but all patterns were kept at the native resolution of the dependent variable. This was done to retain model-specific information, which may have been lost if regridded to a common spatial resolution.
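Scaling a pattern with a GMT series from an SCM, as intended above, amounts to multiplying the two-dimensional pattern by the global change at each time step. The sketch below, in Python with NumPy, assumes the pattern has already been read from a library file into an array; the function name is hypothetical and no netCDF variable names are implied.

```python
import numpy as np

def scale_pattern(pattern, gmt_change):
    """Emulate local change by scaling a pattern with a GMT change series.

    pattern    : array (lat, lon), local change per degree of GMT change
    gmt_change : array (time,), GMT change from a simple climate model (SCM)
    Returns an array (time, lat, lon) of emulated local change.
    """
    # Broadcasting multiplies the full field by each time step's scalar.
    return gmt_change[:, None, None] * pattern[None, :, :]
```

Because the output is a change relative to the baseline used to build the pattern, a baseline climatology would need to be added back to recover absolute values.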
All source code used to produce patterns is available in the aforementioned repository. Source code is written in NCAR