PeRL: a circum-Arctic Permafrost Region Pond and Lake database

Ponds and lakes are abundant in Arctic permafrost lowlands. They play an important role in Arctic wetland ecosystems by regulating carbon, water, and energy fluxes and providing freshwater habitats. However, ponds, i.e., waterbodies with surface areas smaller than 1.0 × 10⁴ m², have not been inventoried on global and regional scales. The Permafrost Region Pond and Lake (PeRL) database presents the results of a circum-Arctic effort to map ponds and lakes from modern (2002–2013) high-resolution aerial and satellite imagery with a resolution of 5 m or better. The database also includes historical imagery from 1948 to 1965 with a resolution of 6 m or better. PeRL includes 69 maps covering a wide range of environmental conditions from tundra to boreal regions and from continuous to discontinuous permafrost zones. Waterbody maps are linked to regional permafrost landscape maps which provide information on permafrost extent, ground ice volume, geology, and lithology. This paper describes waterbody classification and accuracy, and presents statistics of waterbody distribution for each site. Maps of permafrost landscapes in Alaska, Canada, and Russia are used to extrapolate waterbody statistics from the site level to regional landscape units. PeRL presents pond and lake estimates for a total area of 1.4 × 10⁶ km² across the Arctic, about 17 % of the Arctic lowland (< 300 m a.s.l.) land surface area. PeRL waterbodies with sizes of 1.0 × 10⁶ m² down to 1.0 × 10² m² contributed up to 21 % to the total water fraction. Waterbody density ranged from 1.0 × 10 to 9.4 × 10¹ km⁻². Ponds are the dominant waterbody type by number in all landscapes, representing 45–99 % of the total waterbody number. The implementation of PeRL size distributions in land surface models will greatly improve the investigation and projection of surface inundation and carbon fluxes in permafrost lowlands.
Published by Copernicus Publications.
Waterbody maps, study area boundaries, and maps of regional permafrost landscapes including detailed metadata are available at https://doi.pangaea.de/10.1594/PANGAEA.868349.


Introduction
Globally, Arctic lowlands underlain by permafrost have both the highest number and area fraction of waterbodies (Lehner and Döll, 2004; Grosse et al., 2013; Verpoorter et al., 2014). These landscapes play a key role as a freshwater resource, as habitat for wildlife, and as part of the water, carbon, and energy cycles (Rautio et al., 2011; CAFF, 2013). The rapid warming of the Arctic affects the distribution of surface and subsurface water due to permafrost degradation and increased evapotranspiration (Hinzman et al., 2013). Remote-sensing studies have found both increasing and decreasing trends in surface water extent for waterbodies in permafrost regions across broad spatial and temporal scales (e.g., Carroll et al., 2011; Watts et al., 2012; Boike et al., 2016; Kravtsova and Rodionova, 2016). These studies, however, are limited in their assessment of changes in surface inundation since they only include lakes, i.e., waterbodies with a surface area of 1.0 × 10⁴ m² or larger. Ponds with a surface area smaller than 1.0 × 10⁴ m², on the other hand, have not yet been inventoried on the global scale. Yet ponds dominate the total number of waterbodies in Arctic lowlands, accounting for up to 95 % of individual waterbodies, and may contribute up to 30 % to the total water surface area (Muster et al., 2012; Muster, 2013). Arctic ponds are characterized by intense biogeophysical and biogeochemical processes. They have been identified as a large source of carbon fluxes compared to the surrounding terrestrial environment (Rautio et al., 2011; Laurion et al., 2010; Abnizova et al., 2012; Langer et al., 2015; Wik et al., 2016; Bouchard et al., 2015). Due to their small surface areas and shallow depths, ponds are especially prone to change; various studies reported ponds drying out or increasing in abundance due to new thermokarst or the drainage of large lakes (Jones et al., 2011; Andresen and Lougheed, 2015; Liljedahl et al., 2016).
Such changes in surface inundation may significantly alter regional water, energy, and carbon fluxes (Watts et al., 2014; Lara et al., 2015). Both the monitoring and modeling of pond and lake development are therefore crucial to better understand the trajectories of Arctic land cover dynamics in relation to climate and environmental change. Currently, however, the direction and magnitude of these changes remain uncertain, mainly due to the limited extent of high-resolution studies and the lack of pond representation in global databases. Although recent efforts have produced global land cover maps with resolutions of 30 m (Liao et al., 2014; Verpoorter et al., 2014; Feng et al., 2015; Paltan et al., 2015), these data sets only include lakes.
To complement previous approaches, we present the Permafrost Region Pond and Lake (PeRL) database, a circum-Arctic effort that compiles 69 maps of ponds and lakes from remote-sensing data with high spatial resolution (≤ 6 m) (Fig. 1). This database fills the gap in available global databases that have cutoffs in waterbody surface area at 1.0 × 10⁴ m² or above. In addition, we link PeRL waterbody maps with existing maps of permafrost landscapes to extrapolate waterbody distributions from the individual study areas to larger landscape units. Permafrost landscapes are terrain units characterized by distinct properties such as climate, surficial geology, parent material, permafrost extent, ground ice content, and topography. These properties have been identified as major factors in the evolution and distribution of northern waterbodies (Smith et al., 2007; Grosse et al., 2013; Veremeeva and Glushkova, 2016).
The core objectives of the PeRL database are to (i) archive and disseminate fine-resolution geospatial data of northern high-latitude waterbodies, (ii) quantify the intra- and interregional variability in waterbody size distributions, and (iii) provide regional key statistics, including the uncertainty in waterbody distributions, that can be used to benchmark site-, regional-, and global-scale land models.

Definition of ponds and lakes
The definition of ponds and lakes varies in the literature and depends on the chosen scale and goal of studies when characterizing waterbodies. The Ramsar classification scheme defines ponds as permanently inundated basins smaller than 8.0 × 10⁴ m² in surface area (Ramsar Convention Secretariat, 2010). Studies have also used surface areas smaller than 5.0 × 10⁴ m² (Labrecque et al., 2009) or 1.0 × 10⁴ m² (Rautio et al., 2011) to distinguish ponds from lakes.
Earth Syst. Sci. Data, 9, 317-348, 2017, www.earth-syst-sci-data.net/9/317/2017/
In remote-sensing studies, surface area is the most reliably inferred parameter related to waterbody properties. Physical and biogeochemical processes of waterbodies, however, also strongly depend on waterbody depth. Differences in thermodynamics are associated with water depth, where deeper lakes may develop a stratified water column while shallow ponds remain well mixed. In high latitudes, waterbodies with depths greater than 2 m are likely to remain unfrozen at the bottom throughout winter, thus providing overwintering habitat for fish and other aquatic species. In permafrost regions, a continuously unfrozen layer (talik) may develop underneath such deeper waterbodies, which strongly affects carbon cycling in these sediments (Schuur et al., 2008). Several studies have shown a positive correlation between waterbody surface area and depth (Langer et al., 2015; Wik et al., 2016). However, there is large variability in the area-depth relationship, i.e., there are large but shallow lakes that freeze to the bottom and small but deep ponds that develop a talik, and these characteristics may also change over time with changes in water level and basin morphology. In this study we distinguish ponds and lakes based on their surface area. We adopt the distinction of Rautio et al. (2011) and define ponds as bodies of largely standing water with a surface area smaller than 1.0 × 10⁴ m² and lakes as waterbodies with a surface area of 1.0 × 10⁴ m² or larger.
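The area-based distinction adopted here can be written as a one-line rule. The following is a minimal sketch only; the function and constant names are hypothetical and not part of the PeRL data set.

```python
# Threshold adopted in the text from Rautio et al. (2011): 1.0e4 m^2.
POND_LAKE_THRESHOLD_M2 = 1.0e4


def classify_waterbody(area_m2: float) -> str:
    """Return 'pond' for areas below the threshold, 'lake' otherwise.

    Hypothetical helper for illustration; depth is deliberately ignored,
    as in the surface-area definition used in the text.
    """
    if area_m2 <= 0:
        raise ValueError("surface area must be positive")
    return "pond" if area_m2 < POND_LAKE_THRESHOLD_M2 else "lake"


if __name__ == "__main__":
    for area in (250.0, 9999.9, 1.0e4, 3.2e5):
        print(area, classify_waterbody(area))
```

Note that a waterbody of exactly 1.0 × 10⁴ m² falls into the lake class, following the "or larger" wording of the definition.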

Data sources and processing
PeRL's goal is to make high-resolution waterbody maps available to a large science community. PeRL compiles both previously published and unpublished fine-scale waterbody maps. Maps were included if they met the resolution criteria of 5 m or less for modern imagery and 6 m for historical imagery. Historical imagery was included to enable high-resolution change detection in ponds and lakes. Twenty-nine maps were specifically produced for PeRL to complement the published maps in order to represent a broad range of landscape types with regard to permafrost extent, ground ice content, geology, and ecozone. All waterbody maps were derived from optical or radar airborne or satellite imagery acquired between mid and late summer (July-September), thereby excluding the snowmelt and early summer season. Modern imagery dates from 2002 to 2013, and historical imagery dates from 1948 to 1965. Previously published maps are the product of many independent studies, which leads to a broad range of image types and classification methods used. Details on image processing and classification procedures for already published maps (n = 31) are listed in Table A1 and the respective publications. The processing of new PeRL maps is described in Sect. 3.2.

Image processing
Available high-resolution imagery used for PeRL map production included optical aerial and satellite imagery (GeoEye, QuickBird, WorldView-1 and -2, and the Korean Multi-Purpose Satellite 2 (KOMPSAT-2)) and radar (TerraSAR-X) imagery.
Most optical imagery provided a near-infrared band that was used for classification, with the exception of WorldView-1, which only has a panchromatic band. Preprocessing of the optical imagery involved georeferencing or orthorectification depending on the availability of high-resolution digital elevation models (Appendix A, Table A1).
TerraSAR-X (TSX) imagery was acquired in Stripmap Mode with HH polarization, either as the geocoded Enhanced Ellipsoid Corrected (EEC) product or as the Single Look Slant Range Complex (SSC) product, which was then processed to EEC (Sect. A1). To reduce image noise, TSX images were filtered in ENvironment for Visualizing Images (ENVI) v4.7 (ITTVIS) using a Lee filter with a 3 × 3 pixel window followed by a gamma filter with a 7 × 7 or 11 × 11 window depending on the image quality (Klonus and Ehlers, 2008).

Classification of open water
Imagery was classified using either a density slice or an unsupervised k-means classification in ENVI v4.8 (ITTVIS). The panchromatic, the near-infrared, and the X-Band (HH polarization) show a sharp contrast between open water and surrounding vegetation. Visual inspection of the imagery could therefore be used to determine individual threshold values (in the case of density slice) or to assign classes (in the case of k-means unsupervised classification) for the extraction of open-water surfaces. Threshold values and class boundaries varied between images and sites due to differences in illumination, acquisition geometry, and radiometric properties of images. Detailed information on remote-sensing imagery and the classification procedure for each site is listed in Appendix A (Table A1).
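As an illustration of the density-slice idea, the sketch below marks every pixel whose value falls inside a chosen range as open water. It assumes the band is a nested list of digital numbers; the function name and the example threshold are hypothetical, and the original classification was carried out in ENVI with thresholds picked by visual inspection for each image.

```python
def density_slice(band, low, high):
    """Return a boolean water mask: True where low <= pixel value <= high.

    'band' is a list of rows of pixel values (e.g. near-infrared digital
    numbers, where open water appears dark). The range [low, high] is an
    image-specific choice and is an assumption of this sketch.
    """
    return [[low <= value <= high for value in row] for row in band]


if __name__ == "__main__":
    # Made-up near-infrared values: low numbers represent dark open water.
    nir = [
        [12, 80, 95],
        [15, 14, 90],
        [88, 92, 97],
    ]
    for row in density_slice(nir, 0, 20):
        print(row)
```

Because the threshold differs between images and sites, such a function would be called with a separate range for every scene rather than with one global value.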
The classification procedure in ENVI produces raster images that were converted to ESRI vector files so that each waterbody is represented as a single polygon. Vector files were then manually processed in ArcGIS v10.2 (ESRI) to fill gaps in waterbody surfaces and remove streams, rivers, and shadows due to clouds or topography and partial lakes along the study area boundaries. The minimum waterbody size was set to at least 4 pixels. This equals less than 4 m² for the highest resolutions of less than 1 m and 64 m² for the lowest resolution of 4 m for modern imagery (1.4 × 10² m² for historical imagery with resolutions of 6 m). All classified objects smaller than the minimum size were removed. Partial lakes along the study area boundaries, segments of streams and rivers, and shadows due to clouds or topography were manually removed.
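The 4-pixel minimum size can be made concrete with a small raster example. This is a sketch under stated assumptions: waterbodies are taken to be 4-connected groups of water pixels (the connectivity used in the original workflow is not stated), and all function names are hypothetical.

```python
from collections import deque


def min_waterbody_area_m2(pixel_size_m, min_pixels=4):
    """Smallest mappable area: min_pixels times the area of one pixel."""
    return min_pixels * pixel_size_m ** 2


def label_waterbodies(mask):
    """Group water pixels into 4-connected components (assumed connectivity)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    bodies = []
    for i in range(rows):
        for j in range(cols):
            if mask[i][j] and not seen[i][j]:
                seen[i][j] = True
                queue, pixels = deque([(i, j)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                bodies.append(pixels)
    return bodies


def waterbody_areas_m2(mask, pixel_size_m, min_pixels=4):
    """Areas of all components with at least min_pixels pixels."""
    pixel_area = pixel_size_m ** 2
    return [len(p) * pixel_area for p in label_waterbodies(mask) if len(p) >= min_pixels]


if __name__ == "__main__":
    mask = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 0, 0]]
    print(min_waterbody_area_m2(4.0), min_waterbody_area_m2(6.0))
    print(waterbody_areas_m2(mask, 4.0))
```

With 4 m pixels the threshold is 4 × 16 m² = 64 m², and with 6 m pixels it is 4 × 36 m² = 144 m², which matches the rounded 1.4 × 10² m² quoted for historical imagery.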

Study area boundaries
Each waterbody map is associated with a vector file that delineates the study area's boundary. Boundaries were calculated for each map, whether new or previously published, in ArcGIS by first producing a positive buffer of 1-3 km around each waterbody in the map and merging the individual buffers into one single polygon. From that single polygon we then subtracted the same distance again, which rendered the study area boundary. The area of the boundary is referred to as the total mapped extent of that site (Table 1). For sites with multi-temporal data, the total mapped extent of the oldest classification was chosen as a reference in order to calculate changes in pond and lake statistics over time.
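The buffer-then-shrink procedure is a morphological closing. The sketch below shows a raster analogue on a boolean grid, not the vector workflow used in ArcGIS: it assumes a square neighbourhood of radius r cells in place of a metric buffer, and the function names are hypothetical.

```python
def dilate(mask, r):
    """Positive buffer: mark every cell within r cells of a water cell."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if mask[i][j]:
                for di in range(-r, r + 1):
                    for dj in range(-r, r + 1):
                        if 0 <= i + di < rows and 0 <= j + dj < cols:
                            out[i + di][j + dj] = True
    return out


def erode(mask, r):
    """Negative buffer: keep a cell only if its whole neighbourhood is set.

    Cells outside the grid count as unset, so the grid should be padded
    by at least r cells around the waterbodies.
    """
    rows, cols = len(mask), len(mask[0])
    return [
        [
            all(
                0 <= i + di < rows and 0 <= j + dj < cols and mask[i + di][j + dj]
                for di in range(-r, r + 1)
                for dj in range(-r, r + 1)
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def study_area_mask(water_mask, r):
    """Buffer outwards by r, merge, then shrink by the same r."""
    return erode(dilate(water_mask, r), r)


if __name__ == "__main__":
    water = [[False] * 7 for _ in range(7)]
    water[3][2] = water[3][4] = True  # two waterbodies one cell apart
    area = study_area_mask(water, 1)
    print(sum(map(sum, area)), area[3][3])
```

Waterbodies closer together than twice the buffer distance end up in one connected study area that includes the land between them, whereas isolated waterbodies shrink back to their own outline.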

PeRL statistical analysis
Statistics such as areal fraction of water or average waterbody surface area are meaningful measures to compare waterbody distributions between individual study areas and permafrost landscapes. Statistics were calculated for all waterbodies, as well as separately for ponds and lakes. We calculated areal fraction, i.e., the area fraction of water relative to land (the total mapped area), and waterbody density, i.e., the number of waterbodies per square kilometer, for each site using the software package R version 3.3.1. However, statistics are subject to the size of the study area. Very small study areas may not capture larger waterbodies, which may nonetheless be characteristic of the larger landscape. Very large study areas, on the other hand, may show more spatial variation in waterbody distribution than smaller study areas. In order to make statistics comparable between study areas, we subdivided larger study areas into boxes of 10 km × 10 km. The box size was chosen as a function of the standard error (Sect. A2). We calculated the statistics for each box and then averaged the statistics across all boxes within each study area. This subgrid analysis was conducted for all study areas larger than 300 km² for which at least four boxes could be sampled.
Statistics are also subject to image resolution, which defines the minimum object size that can be confidently mapped. For all modern imagery, the minimum waterbody size included in the calculation of statistics was therefore set to 1.0 × 10² m² (1.4 × 10² m² for historical imagery). Very large lakes are not representative of all study areas and may only be partially mapped within a 10 km × 10 km box. We therefore chose a maximum waterbody size of 1.0 × 10⁶ m² to be included in the calculation of statistics.
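The two site statistics and the box averaging can be sketched as follows. The original analysis was done in R; this Python version is only an illustration, the names are hypothetical, and treating both size limits as inclusive is an assumption.

```python
MIN_AREA_M2 = 1.0e2  # lower size limit for modern imagery, from the text
MAX_AREA_M2 = 1.0e6  # upper size limit, from the text


def site_statistics(areas_m2, mapped_area_km2,
                    min_area_m2=MIN_AREA_M2, max_area_m2=MAX_AREA_M2):
    """Areal fraction (percent) and density (per km^2) of waterbodies.

    Only waterbodies within [min_area_m2, max_area_m2] are counted;
    inclusive bounds are an assumption of this sketch.
    """
    kept = [a for a in areas_m2 if min_area_m2 <= a <= max_area_m2]
    water_km2 = sum(kept) / 1.0e6
    return {
        "areal_fraction_pct": 100.0 * water_km2 / mapped_area_km2,
        "density_per_km2": len(kept) / mapped_area_km2,
    }


def mean_over_boxes(box_stats):
    """Average each statistic across the boxes of one study area."""
    keys = box_stats[0].keys()
    return {k: sum(b[k] for b in box_stats) / len(box_stats) for k in keys}


if __name__ == "__main__":
    # One hypothetical 10 km x 10 km box (100 km^2) with five waterbodies.
    box = site_statistics([50.0, 100.0, 5000.0, 2.0e6, 1.0e6], 100.0)
    print(box)
    print(mean_over_boxes([box, {"areal_fraction_pct": 3.0, "density_per_km2": 0.05}]))
```

In the example, the 50 m² and 2.0 × 10⁶ m² waterbodies are excluded by the size limits, so three waterbodies with a combined area of about 1.005 km² remain in the 100 km² box.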

Regional maps of permafrost landscapes
Waterbody statistics of each site were extrapolated to permafrost landscapes based on the assumption that distributions of ponds and lakes are similar for similar permafrost landscapes, i.e., areas with similar properties regarding climate, geology, lithology (soil texture), permafrost extent, and ground ice volume. Vector maps of permafrost landscapes (PLM) are available on the regional level: the Alaskan map of permafrost characteristics (AK2008) (Jorgenson et al., 2008a), the National Ecological Framework for Canada (NEF) (Marshall et al., 1999), and the Land Resources of Russia (LRR) (Stolbovoi and McCallum, 2002). Despite differences in mapping approaches and terminology, these databases report similar landscape characteristics on comparable scales. All regional maps were available as vector files, which were converted to a common North Pole Lambert Azimuthal Equal-Area (NPLAEA) projection. All PLM were clipped in ArcGIS v10.4 with a lowland mask including only areas with elevations of 300 m or lower. The lowland mask was derived for the entire Arctic using the digital elevation model GTOPO30 (USGS). Details on the properties of each PLM are provided in Appendix B. The original PLM were merged in ArcGIS to produce a unified circum-Arctic vector file and map representation. Landscape attributes that were retained from the original PLM were ecozone, permafrost extent, ground ice volume, surficial geology, and lithology. Variable names were consolidated using uniform variable names (Appendix B, Table B4).

Extrapolation of waterbody statistics to permafrost landscapes
Waterbody maps were spatially linked with their associated permafrost landscape. Maps within the same landscape were combined, whereas maps spanning two or more landscapes were divided by selecting all waterbodies that intersected with the respective permafrost landscape. If several maps were present within one permafrost landscape unit they were combined and average statistics calculated across all maps in that unit. Historical maps and unedited classifications were not used in the extrapolation. Extrapolations were done in Alaska, Canada, and Russia for waterbody maps with a (combined) extent of 100 km² or larger but not for Europe where available waterbody maps were too small. Maps in the Canadian high Arctic were smaller than 1.0 × 10² km² but represent typical wetlands in that region and were therefore included in the extrapolation. Figures D1, D2, D3, and D4 show the location of waterbody maps within their associated permafrost landscape.
Extrapolated statistics were assigned two confidence classes: (1) high and (2) low confidence. Permafrost landscapes were assigned to the high confidence class if a map was present in the permafrost landscape of that ecozone. The low confidence class indicates that statistics were derived from the same permafrost landscape but in a different ecozone. Due to differences in the mapping and generalization of landscape properties of the regional PLM, the extrapolation was conducted only within each region.
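The confidence rule can be expressed as a small lookup. In this sketch a permafrost landscape is identified by a hypothetical key string (region plus landscape properties) and an ecozone name; neither the keys nor the function name come from the database itself.

```python
def confidence_class(landscape_key, ecozone, mapped):
    """Return 1 (high), 2 (low), or None (no extrapolation possible).

    'mapped' is a set of (landscape_key, ecozone) pairs that contain at
    least one waterbody map. Including the region in landscape_key keeps
    the extrapolation within each region, as described in the text.
    """
    if (landscape_key, ecozone) in mapped:
        return 1  # map present in this landscape and this ecozone
    if any(key == landscape_key for key, _ in mapped):
        return 2  # same landscape, but mapped in a different ecozone
    return None  # landscape not covered by any waterbody map


if __name__ == "__main__":
    # Hypothetical key: region/permafrost extent/ground ice/geology/texture.
    key = "russia/continuous/high/alluvial-limnetic/organic"
    mapped = {(key, "tundra")}
    print(confidence_class(key, "tundra", mapped))
    print(confidence_class(key, "taiga", mapped))
    print(confidence_class("alaska/continuous/low/eolian/sand", "tundra", mapped))
```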

PeRL database features
The database provides two different map products: (i) site-level waterbody maps and (ii) an extrapolated circum-Arctic waterbody map. The database also provides different tables which present statistical parameters for each individual waterbody map (Appendix B) and aggregated statistics for permafrost landscape (PL) units in the circum-Arctic map (Table 3).

Site-level waterbody maps

Data set structure
Altogether, the database features 70 individual waterbody maps as ESRI shape files. Each waterbody shape file is named according to a map ID. The map ID consists of a three-letter abbreviation of the site name, followed by a running three-digit number and the acquisition date of the base imagery (YYYY-MM-DD). Vector files were projected to the NPLAEA projection. The area and perimeter of each waterbody and site were calculated in ArcGIS 10.4 in square meters and meters, respectively. Each vector file is accompanied by an xml-file which lists metadata about classification and references as presented in Tables 1 and A1. Each map has a polygon associated with it that contains the study area, i.e., the total land area of the waterbody map. All study area boundaries are stored in the file PeRL_study_areas.shp and can be identified via the map ID (Table C2). The study area shape file also includes the site characteristics listed in Table 2.
Fifty-eight maps are considered "clean", i.e., they have been manually edited to include only ponds and lakes (Table 1). Eight maps are "clean with partial waterbodies". These are multi-temporal maps with very small map extents where partial waterbodies along the study area boundary were not deleted in order to retain information for change detection analysis. Four maps were not manually edited due to their very large map extent and may include partial waterbodies, streams, rivers, or shadows.

Spatial and environmental characteristics
PeRL study areas are widely distributed throughout Arctic lowlands in Alaska, Canada, Russia, and Europe and cover a latitudinal gradient of about 20° (55.3-75.7° N), including tundra to boreal regions, and they are located in continuous, discontinuous, and sporadic permafrost zones (Fig. 1). Mean annual temperature ranges from 0 to −20 °C, and average annual precipitation ranges from 97 to 650 mm (Table 1). Twenty-one sites are located in Alaska, covering a total area of 7.3 × 10³ km². Canada has 14 sites covering 6.4 × 10³ km², and Russia has 30 sites covering 2.9 × 10³ km² in total. Four sites are located in Sweden, with a total mapped area of 41 km². Individual map extents range from 0.2 to 9825.7 km², with a mean of 622.8 km² (Table 1). The database includes six multi-temporal classifications in the Kotzebue Sound lowlands and on the Barrow Peninsula in Alaska (Andresen and Lougheed, 2015), on the Grande Rivière de la Baleine Plateau (Bouchard et al., 2014) and in the Hudson Bay Lowlands in Canada, in Lapland in Sweden, and in the Usa River basin in Russia (Hugelius et al., 2011; Sannel and Kuhry, 2011). Ponds contributed about 45-99 % of the total number of waterbodies, with a mean of 85 ± 14 %, and up to 34 % to the total water surface area, with a mean of 12 ± 8.3 % (Fig. 2 and Appendix E). The water fraction of the total mapped area ranged from about 1 to 21 % for all waterbodies and from > 1 to 6 % for ponds. Waterbody density per square kilometer ranged from 1.0 × 10 km⁻² in the Indigirka lowlands, Russia, to 9.4 × 10¹ km⁻² in the Olenek Channel of the Lena Delta (Table E4).
The unified vector file PeRL_perma_land.shp contains the permafrost landscapes and the extrapolated waterbody statistics (Table 3). Average statistics were calculated for 10 km × 10 km boxes within large maps or when four or more maps were present in the permafrost landscapes.
Average statistics are reported with their relative standard error (RE), i.e., the standard error expressed as a percentage of the mean. The permafrost landscapes are also provided as separate vector files for each region (alaska_perma_land.shp, canada_perma_land.shp, and russia_perma_land.shp), which contain the landscape characteristics of each permafrost landscape as individual attributes (Appendix B, Tables B1, B2, and B3). The unified vector file (PeRL_perma_land.shp) and the regional files can be joined using the common PERMID (Appendix B, Table B4).
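A relative standard error of this kind can be computed as sketched below. The sketch assumes the standard error of the mean based on the sample standard deviation (n − 1 in the denominator); the text does not state which estimator was used, and the function name is hypothetical.

```python
import math
import statistics


def relative_standard_error(values):
    """Standard error of the mean as a percentage of the mean.

    Assumes the sample standard deviation (n - 1); needs at least two
    values and a non-zero mean.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; relative error is undefined")
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return 100.0 * standard_error / mean


if __name__ == "__main__":
    # Hypothetical areal fractions (percent) from four 10 km x 10 km boxes.
    print(round(relative_standard_error([8.0, 10.0, 12.0, 10.0]), 2))
```

For the four example values the mean is 10, the sample standard deviation is about 1.63, and the relative standard error is therefore about 8.2 %.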

Spatial and environmental characteristics
Altogether, we identified 230 different permafrost landscapes in the Russian lowlands, 160 in the Canadian lowlands, and 51 in the lowlands of Alaska. PeRL waterbody maps were located in 28 different permafrost landscapes (Table 4) which cover a total area of 1.4 × 10⁶ km² across the Arctic, of which 1.0 × 10⁶ km² are in Russia, 2.1 × 10⁵ km² in Canada, and 1.7 × 10⁵ km² in Alaska. About 65 % of the extrapolated area was classified as high confidence (Fig. 3). The highest landscape average areal fraction of water surface was 21 % (Fig. 4 and Table 3), and there was a waterbody density per square kilometer of 57 (Fig. 5 and Table 4). RE of areal fraction for different subsets or maps within a permafrost landscape was about 7 % on average, with a maximum of 30 % (Table 4). RE for waterbody density was 8 % on average, with a maximum of 50 %. Our extrapolated area (1.4 × 10⁶ km²) represents 17.0 % of the current Arctic permafrost lowland area (below 300 m a.s.l.). PeRL provides pond and lake estimates for about 29 % (in area) of the Alaskan lowlands, 7 % of the Canadian lowlands, and 21 % of the Russian lowlands. Together all extrapolated landscapes contributed about 7 % to the current estimated Arctic permafrost area (Brown et al., 1998). In Alaska, waterbody maps were missing for permafrost landscapes with isolated permafrost (16 % of total area) or rocky lithology (36 % of total area). Dominant types of surficial geology that were not mapped include colluvial sites and sites with bedrock or of glacial origin, which together contribute 61 % to the total area. In Canada, neither isolated nor sporadic permafrost were inventoried (22 % of total area) nor was this done for areas with a ground ice content of 10-20 % or less (23 % of total area). Six of the nineteen geology classes were inventoried, which contribute 75 % to the total area.
Six of seven lithology types with an areal coverage of 90 % were represented. In Russia, waterbody maps were not available in the discontinuous permafrost zone (13 % of the total area). No maps were present in regions with the geological type "deluvial-coluvial and creep" which accounts for 28 % of the total area.

Figure 3. Confidence for permafrost lowland landscapes. Confidence class 1 (high confidence) designates permafrost landscapes where waterbody maps are available in lowland areas. Confidence class 2 (low confidence) represents permafrost landscapes with extrapolated waterbody statistics. No-value (dark grey) areas indicate that no maps were available in these permafrost landscapes. Light-grey areas indicate terrain with elevations (GTOPO30, USGS) higher than 300 m a.s.l. which were not considered in the extrapolation. Permafrost boundary was derived from the regional databases.

Classification accuracy and variability
The accuracy of the individual waterbody map depends on the spectral and spatial properties of the remote-sensing imagery employed for classification as well as the classification method. In general, open-water surfaces show a high contrast to the surrounding land area in all utilized spectral bands, i.e., panchromatic, near-infrared, and X-band, since water absorbs most of the incoming radiation (Grosse et al., 2005; Muster et al., 2013). Ground surveys of waterbody surface area were available for only a few study sites. Accuracy ranged from 89 % for object-oriented mapping of multispectral imagery (Lara et al., 2015) and 93 % for object-oriented mapping of panchromatic imagery (Andresen and Lougheed, 2015) to more than 95 % for a supervised maximum-likelihood classification of multispectral aerial images (Muster et al., 2012). Errors in the classification may be largely due to commission errors, i.e., the spectral signal is misinterpreted as water where in reality it may be land surface. Many shallow ponds and pond-lake margins are characterized by vegetation growing or floating in the water which cannot be adequately classified from single-band imagery (Sannel and Brown, 2010). PeRL classifications dating from early August are likely most affected since the abundance of aquatic plants peaks around that time of year. In some cases, even multispectral imagery cannot distinguish between lake and land because floating vegetation mats fully underlain by lake water may spectrally appear like a land surface (Parsekian et al., 2011).
Seasonal processes, such as snowmelt, progressing thaw depth, evaporation, and precipitation, affect the extent of surface water. Waterbody maps therefore reflect the local water balance at the time of image acquisition. Seasonal reduction in surface water extent, however, is largest in the first 2 weeks following snowmelt (Bowling et al., 2003). All PeRL maps date from the late summer season so that snowmelt and the early summer season are excluded. Changes of water extent in late summer are primarily due to evaporation and precipitation. In a study area on the Barrow Peninsula, Alaska, we find that the open-water extent varies between 6 and 8 % between the beginning and end of August of different years. However, the effect is hard to quantify as other factors such as spectral properties and resolution also impact classifications of different times at the same site. Seasonal variations may be larger in the case of heavy rain events right before image acquisition but ultimately depend on local conditions which control surface and subsurface runoff.

Field name    Description
PERMA_LAND    Permafrost landscape: permafrost extent/ground ice volume/surficial geology/texture
PERMID    Each permafrost landscape in the vector file is assigned a unique ID (PERMID). The first digit stands for the region (1: Alaska; 2: Canada; 3: Russia), digits 2-6 identify the single polygon, and the last three digits identify the ecozone.
AREA    Area of polygon in square meters
PERIMETER    Perimeter of polygon in meters
Map_ID    Short ID of waterbody map used for extrapolation of statistics
confidence    1: high confidence; 2: low confidence
frac    Areal fraction of waterbodies (1.0 × 10² to 1 × 10⁶ m² in surface area) in percent
frac_re    Relative standard error of areal fraction of waterbodies (1.0 × 10² m² to smaller than 1 × 10⁶ m² in surface area) in percent
dens    Density: number of waterbodies (1.0 × 10² to 1 × 10⁶ m² in surface area) per square kilometer
dens_re    Relative standard error of density of waterbodies (1.0 × 10² to 1 × 10⁶ m² in surface area) in percent
frac_ponds    Areal fraction of waterbodies (1.0 × 10² m² to smaller than 1 × 10⁴ m² in surface area) in percent
frac_po_re    Relative standard error of areal fraction of waterbodies (1.0 × 10² m² to smaller than 1 × 10⁴ m² in surface area) in percent
dens_ponds    Pond density: number of ponds (1.0 × 10² to 1 × 10⁴ m² in surface area) per square kilometer
dens_po_re    Relative standard error of pond density (1.0 × 10² m² to smaller than 1 × 10⁴ m² in surface area) in percent

Figure 4. Areal fraction of waterbodies with surface areas between 1.0 × 10² and 1.0 × 10⁶ m². Permafrost boundary was derived from the regional databases.

Figure 5. Waterbody density per square kilometer for waterbodies with surface areas of between 1.0 × 10² and 1.0 × 10⁶ m² within permafrost landscape units. Permafrost boundary was derived from the regional databases.
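The PERMID layout given in the field descriptions (one region digit, five polygon digits, three ecozone digits) can be decoded with a few string slices. This is a minimal sketch assuming the ID is always exactly nine digits long; the function name and the example ID are hypothetical.

```python
# Region codes as listed in the PERMID field description.
REGIONS = {"1": "Alaska", "2": "Canada", "3": "Russia"}


def parse_permid(permid):
    """Split a PERMID into region, polygon, and ecozone parts.

    Assumes a nine-digit ID: digit 1 = region, digits 2-6 = polygon,
    digits 7-9 = ecozone. Parts are returned as strings to keep
    leading zeros.
    """
    text = str(permid)
    if len(text) != 9 or not text.isdigit():
        raise ValueError("PERMID is expected to have exactly nine digits")
    if text[0] not in REGIONS:
        raise ValueError("unknown region digit: " + text[0])
    return {"region": REGIONS[text[0]], "polygon": text[1:6], "ecozone": text[6:]}


if __name__ == "__main__":
    # 300123045 is a made-up ID used only to show the layout.
    print(parse_permid(300123045))
```

Such a helper may be convenient when joining PeRL_perma_land.shp with the regional files, since the region can be read directly from the first digit.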

Uncertainty of circum-Arctic map
Uncertainties regarding the extrapolation of waterbody distributions arise from (i) the combination of different waterbody maps, (ii) the accuracy of the underlying regional permafrost maps, and (iii) the level of generalization inherent in the permafrost landscape units. Waterbody statistics of permafrost landscapes are derived from diverse remote-sensing imagery. Imagery dates from different years and months and features different image properties. However, the effect of seasonal variability or image properties on the average statistic is small compared to the natural variability within and between permafrost landscape units.
Permafrost landscapes present a unified circum-Arctic categorization to upscale waterbody distributions. Due to the uncertainty and scale of the regional PLM, however, it cannot be expected that nonoverlapping waterbody maps within the same permafrost landscape have the same size distribution. The regional PLM are themselves extrapolated products where finite point sources of information have been used to describe larger spatial domains. No error or uncertainty measure, however, was reported for the regional maps. In addition, the variables used to describe permafrost landscapes present the dominant classes within the landscape unit. Thus, certain waterbody maps may represent landscape subtypes that are not represented by the reported average statistic. For example, two permafrost landscapes have been classified in the Lena Delta in northern Siberia. The southern and eastern parts of the delta are characterized by continuous permafrost with ground ice volumes larger than 40 %, alluvial-limnetic deposits, and organic substrate. Local studies differentiate this region further based on geomorphological differences and ground ice content. The yedoma ice complex in the southern part features a much higher ground ice content of up to 80 % and higher elevations than the eastern part; this is, however, not resolved in the Russian PLM. These subregional landscape variations are also reflected in the waterbody size distributions, which are significantly different for the southern and eastern parts of the delta. In the averaged statistics this is indicated by a high relative error of 11 and 28 % for the areal fraction of waterbodies and ponds, respectively, and of about 50 % for waterbody density estimates. In this case, the permafrost landscape unit in that area does not adequately reflect the known distribution of ground ice and geomorphology and demonstrates the need to further improve PLM in the future.

Potential use of database and future development
Waterbody maps and distribution statistics are the most accurate at site level. On this scale, maps can be used as a baseline to detect changes in surface inundation for seasonal, interannual, and decadal periods. Site-level size distributions can also be used to validate statistical extrapolation methods which have previously been used to extrapolate from coarser databases to finer scales (Downing et al., 2006; Seekell et al., 2010). Validation of these approaches has questioned the validity of power laws for smaller lakes and ponds but has also been limited to waterbodies as small as 1.0 × 10⁴ m², i.e., 2 orders of magnitude larger than the minimum size in PeRL data sets.
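A minimal sketch of such a validation is given here, assuming that waterbody surface areas above a lower cutoff follow a Pareto (power-law) distribution. The shape parameter is fitted to the larger waterbodies only, and the number of waterbodies implied at a smaller cutoff can then be compared with the number actually mapped at the site. The function names, the cutoffs, and the example areas are illustrative assumptions and are not part of PeRL.

```python
import math

def pareto_shape_mle(areas_m2, a_min):
    """Maximum-likelihood (Hill) estimate of the Pareto shape for areas >= a_min."""
    tail = [a for a in areas_m2 if a >= a_min]
    if not tail:
        raise ValueError("no areas at or above a_min")
    log_sum = sum(math.log(a / a_min) for a in tail)
    if log_sum == 0.0:
        raise ValueError("all areas equal a_min; shape is undefined")
    return len(tail) / log_sum

def predicted_count(n_tail, shape, a_min, a_low):
    """Number of waterbodies >= a_low implied by n_tail waterbodies >= a_min.

    Under a Pareto law, the count of waterbodies >= a scales as a ** (-shape).
    """
    return n_tail * (a_low / a_min) ** (-shape)

# Hypothetical site: fit on waterbodies >= 1.0e4 m2, extrapolate down to 1.0e2 m2,
# then compare with the number of waterbodies mapped at the site.
site_areas_m2 = [1.5e2, 4.0e2, 9.0e2, 2.5e3, 7.0e3, 1.2e4, 3.0e4, 8.0e4, 5.0e5]
n_lakes = sum(1 for a in site_areas_m2 if a >= 1.0e4)
shape = pareto_shape_mle(site_areas_m2, a_min=1.0e4)
expected = predicted_count(n_lakes, shape, a_min=1.0e4, a_low=1.0e2)
observed = sum(1 for a in site_areas_m2 if a >= 1.0e2)
print(f"shape={shape:.2f} expected={expected:.1f} observed={observed}")
```

A large gap between the expected and observed counts would indicate that the power law fitted to lakes does not carry over to ponds at that site.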
The circum-Arctic map provides spatially extrapolated information for larger-scale applications. Coarse-scale global databases such as the Global Lakes and Wetlands database (GLWD) by Lehner and Döll (2004) are used in global Earth system models to represent the water fraction in model grid cells (Wania et al., 2013). The GLWD renders a reliable inventory of lakes larger than 1 km² (Lehner and Döll, 2004). Compared to the GLWD, PeRL inventoried up to 21 % additional waterbody area. Moreover, ponds are the most frequent waterbody type (45-99 %). In light of the observed scaling of biogeochemical processes with waterbody surface area (Wik et al., 2016), PeRL results emphasize the need to include waterbodies of 1.0 × 10⁶ m² and smaller in conjunction with their size distributions in physical and biogeochemical models of the high-latitude surface. Moreover, the combination of waterbody size distributions with landscape properties can motivate further study for process-based predictive simulations both on the site and regional scale. However, users should be aware of the map's uncertainty when using it to upscale landscape properties such as methane or heat fluxes. For this purpose, users should refer to the reported spatial variability, confidence class, and extensive metadata.
PeRL's permafrost landscape units represent the least common denominator across the Arctic where landscape properties have been strongly generalized. More detailed information about landscape properties was available for the Canadian database and northern Alaska (Jorgensen et al., 2014) but not for central and southern Alaska or Russia. We suggest that more detailed and accurate classes of ground ice as well as further refinement of physiography within the broad lowland zone will likely explain differences in waterbody distributions between different maps in the same permafrost landscape. Regionally different methodologies currently prohibit a comparison of permafrost landscapes between regions and extrapolation across regions. The harmonization of landscape properties, delineation of common terrain units, and extrapolation methods for the whole Arctic require a coordinated circum-Arctic effort. Our extrapolated area (1.4 × 10⁶ km²) represents only 7.0 % of the current estimated Arctic permafrost area (Brown et al., 1998) but about 17 % of the current Arctic permafrost lowland area (below 300 m a.s.l.) where most of the Arctic lakes are located (Lehner and Döll, 2004; Smith et al., 2007; Grosse et al., 2013). With a few exceptions, the reported sites are predominantly located in coastal areas. In particular, the lake-rich permafrost lowlands of Canada and central Siberia are underrepresented, despite their large spatial coverage. Underrepresented landscape types are areas with discontinuous, isolated, or sporadic permafrost, as well as areas in boreal regions. PeRL maps are conservative estimates of surface inundation as most maps capture open water only and do not include ponds smaller than 1.0 × 10² m² in size. PeRL maps with resolutions finer than 1 m, however, indicate the presence of many waterbodies smaller than the current threshold of 1.0 × 10² m². These very small waterbodies as well as water areas with emersed vegetation are highly productive environments that require attention in future mapping efforts.

Data availability
Waterbody maps, study area boundaries, and maps of regional permafrost landscapes including a link to detailed metadata are available at https://doi.pangaea.de/10.1594/PANGAEA.868349 (Muster et al., 2017).

Conclusions
PeRL maps and statistics provide a valuable resource for a large suite of applications across the Arctic such as resource and habitat management, hydrological and ecological modeling, pond and lake change detection, and upscaling of biogeochemical processes. PeRL maps include waterbodies with surface areas as small as 1.0 × 10² m²; this complements available global databases and increases waterbody size resolution by 2-4 orders of magnitude. Ponds, i.e., waterbodies with surface areas smaller than 1.0 × 10⁴ m², are the dominant waterbody type found in all study areas across the Arctic. This demonstrates the need to include small waterbodies and parameterize size distributions in global land surface models. Furthermore, PeRL presents a baseline that allows future studies to investigate the direction and magnitude of past and future Arctic surface inundation. The current compilation of high-resolution waterbody maps underlines the need to produce more: vast areas in all regions are still unmapped regarding small waterbodies, especially the Canadian lowlands and boreal regions of Russia. Future mapping efforts should therefore focus equally on filling gaps and monitoring inventoried sites. The combination of waterbody statistics and landscape properties has great potential to improve our understanding of environmental drivers of surface inundation in permafrost lowlands. However, permafrost landscape maps need to be improved by increasing the level of detail as well as by harmonizing mapping and extrapolation approaches across Arctic regions.

S. Muster et al.: A circum-Arctic PeRL
Appendix A: Image processing and subgrid sampling

A1 Processing of TerraSAR-X imagery
Geocoded EEC products obtained from the German Space Agency (DLR) are delivered in radar brightness. They are projected to the best available digital elevation model (DEM), i.e., Shuttle Radar Topography Mission (SRTM) X-band DEMs (30 m resolution) and SRTM C-band DEMs (90 m resolution). For the remaining areas, the 1 km resolution Global Land One-kilometer Base Elevation Project (GLOBE) DEM is used. The EEC is a detected multilook product with reduced speckle and approximately square cells on the ground. The slant-range resolution of the image is 1.2 m, which corresponds to 3.3-3.5 m projected on the ground for incidence angles between 45 and 20° and an azimuth resolution of 3.3 m (Eineder et al., 2008). SSC were geocoded to the Data User Element (DUE) Permafrost DEM, and no multi-looking was applied.

A2 Subgrid sampling
In large study areas we performed a subgrid analysis, i.e., we selected waterbodies within equally sized boxes and averaged statistics from all boxes of the study area. In order to determine a representative box size, we compared the variability of waterbody distribution statistics within three study areas in Russia, Canada, and Alaska. In each study area, we selected waterbodies from a minimum of 5 and up to 50 randomly distributed boxes with varying sizes of 5 km × 5 km, 10 km × 10 km, and 20 km × 20 km. We calculated the standard error (SE) of the mean of all statistics across all boxes of the same size. SE of density (waterbody number per square kilometer) and waterbody mean surface area was lowest for 10 km × 10 km boxes. SE increased for 20 km × 20 km boxes, which is probably due to the significantly lower number of boxes that could be sampled. Only 12 PeRL sites have a study area larger than 1000 km² that would allow sampling a minimum of five boxes of 20 km × 20 km in size. A box size of 10 km × 10 km allows the subsampling of 26 sites with a minimum of five boxes. Taking into account the overall variability of distributions and the possible number of subgrid samples, a box size of 10 km × 10 km was chosen for subgrid analysis. Subgrid analysis was conducted for study areas larger than 300 km².
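The box-size comparison can be sketched as follows, assuming waterbodies are given as centroid coordinates in a projected coordinate system (meters) together with their surface areas. The function names, the data layout, the random and possibly overlapping placement of boxes, and the treatment of empty boxes are illustrative assumptions; the original analysis may have differed in these details.

```python
import math
import random
import statistics

def box_statistics(waterbodies, x0, y0, box_size_m):
    """Density (per km2) and mean area (m2) of waterbodies with centroid in one box.

    waterbodies: iterable of (x_m, y_m, area_m2) tuples.
    Empty boxes return a mean area of 0.0 (an assumption of this sketch).
    """
    areas = [a for (x, y, a) in waterbodies
             if x0 <= x < x0 + box_size_m and y0 <= y < y0 + box_size_m]
    box_area_km2 = (box_size_m / 1000.0) ** 2
    density = len(areas) / box_area_km2
    mean_area = statistics.mean(areas) if areas else 0.0
    return density, mean_area

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("need at least two boxes")
    return statistics.stdev(values) / math.sqrt(len(values))

def subgrid_se(waterbodies, extent, box_size_m, n_boxes, seed=0):
    """SE of density and of mean area across n_boxes randomly placed boxes.

    extent: (xmin, ymin, xmax, ymax) of the study area in meters.
    """
    xmin, ymin, xmax, ymax = extent
    if xmax - xmin < box_size_m or ymax - ymin < box_size_m:
        raise ValueError("box does not fit into the study area extent")
    rng = random.Random(seed)
    densities, mean_areas = [], []
    for _ in range(n_boxes):
        x0 = rng.uniform(xmin, xmax - box_size_m)
        y0 = rng.uniform(ymin, ymax - box_size_m)
        density, mean_area = box_statistics(waterbodies, x0, y0, box_size_m)
        densities.append(density)
        mean_areas.append(mean_area)
    return standard_error(densities), standard_error(mean_areas)
```

Calling `subgrid_se` with box sizes of 5000, 10 000, and 20 000 m on the same study area gives the SE values that would be compared to select a box size.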

B1 Alaskan permafrost landscape maps
The permafrost landscape map of Alaska reports surficial geology, mean annual air temperature (MAAT), primary soil texture, permafrost extent, ground ice volume, and primary thermokarst landforms (Jorgenson et al., 2008). A rule-based model was used to incorporate MAAT and surficial geology. Permafrost characteristics were assigned to each surficial deposit under varying temperatures using terrain-permafrost relationships and expert knowledge (Jorgenson et al., 2008b).

B2 Canadian permafrost landscape maps
The permafrost landscapes of Canada are described in the NEF. The NEF distinguishes four levels of generalization nested within each other. Ecozones represent the largest and most generalized units followed by ecoprovinces, ecoregions, and ecodistricts. Ecodistricts were delineated based mainly on differences in parent material, topography, landform, and soil development derived from the Soil Landscapes of Canada Working Group (2010) on a map scale of 1 : 3 000 000 to 1 : 1 000 000 (Ecological Stratification Working Group, 1995; Marshall et al., 1999), whereas ecoregions and ecoprovinces are generalized based mainly on climate, physiography, and vegetation. Ecodistricts were therefore chosen as the most appropriate level to delineate permafrost landscapes. NEF reports the areal fraction of the underlying soil landscape units and attributes nested within each ecodistrict. The dominant fraction of surficial geology, lithology, permafrost extent, and ground ice volume was chosen to describe each ecodistrict. Ecodistricts with the same permafrost landscape type within the same ecozone were then merged into PL units.
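The selection of the dominant class and the subsequent merge can be sketched as follows, assuming each ecodistrict is described by a list of soil landscape components, each with an areal fraction and one value per attribute. The record layout, the attribute names, and the tie-breaking rule are hypothetical and do not reproduce the NEF schema.

```python
from collections import defaultdict

# Hypothetical attribute names for the four properties used in the text.
ATTRIBUTES = ("surficial_geology", "lithology", "permafrost_extent", "ground_ice")

def dominant_class(components, attribute):
    """Return the attribute value with the largest summed areal fraction.

    Ties are resolved in favor of the value encountered first.
    """
    totals = defaultdict(float)
    for component in components:
        totals[component[attribute]] += component["fraction"]
    return max(totals, key=totals.get)

def landscape_type(components):
    """Describe an ecodistrict by the dominant class of each attribute."""
    return tuple(dominant_class(components, attr) for attr in ATTRIBUTES)

def merge_ecodistricts(ecodistricts):
    """Group ecodistrict IDs that share ecozone and landscape type into PL units."""
    units = defaultdict(list)
    for district in ecodistricts:
        key = (district["ecozone"], landscape_type(district["components"]))
        units[key].append(district["id"])
    return dict(units)
```

Each key of the returned dictionary corresponds to one PL unit, and ecodistricts with identical dominant classes in different ecozones remain separate units.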

B3 Russian permafrost landscape characterization
In Russia, information about permafrost extent, ground ice content, generalized geology, and lithology was available only as separate vector maps (Stolbovoi and McCallum, 2002). The individual maps were combined in ArcGIS 10.4 to delineate Russian permafrost landscape units similar to the Canadian and Alaskan databases. Russian ecozones were mapped using the global-scale map by Olson et al. (2001) which conforms to the Alaskan and Canadian ecozones. The geometric union of ecozone, ground ice content, and permafrost extent was calculated in ArcGIS 10.1 with the tool "intersect". Each unique combination of these three variables was assigned the dominant fraction of geology and lithology type.
area of polygon in square meters
AREA_SQKM area of polygon in square kilometers
Figure D2. Study areas and associated permafrost landscapes in Canada. Legend lists type of permafrost extent (C: continuous; D: discontinuous; S: sporadic), ground ice content (vol %), surficial geology, and lithology. Shadowed labels name study areas with waterbody maps. Black lines and labels denote ecozones.
Figure D3. Study areas and associated permafrost landscapes in east Russia. Legend lists type of permafrost extent (C: continuous; D: discontinuous; S: sporadic), ground ice content (vol %), surficial geology, and lithology. Shadowed labels name study areas with waterbody maps. Black lines and labels denote ecozones.
Figure D4. Study areas and associated permafrost landscapes in west Russia. Legend lists type of permafrost extent (C: continuous; D: discontinuous; S: sporadic), ground ice content (vol %), surficial geology, and lithology. Shadowed labels name study areas with waterbody maps. Black lines and labels denote ecozones.
Earth Syst. Sci. Data, 9, 317-348, 2017 www.earth-syst-sci-data.net/9/317/2017/
Appendix E: Areal fraction and density for waterbody maps
Table E1. Areal fraction and density per waterbody map in Alaska. Map IDs with an asterisk were not used for extrapolation. F: areal fraction of waterbodies from 1.0 × 10² to 1.0 × 10⁶ m² in size; REF: relative error of fraction for map subsets of 10 km × 10 km; D: waterbody density per square kilometer; RED: relative error of density; PF: pond areal fraction for waterbodies from 1.0 × 10² m² to smaller than 1.0 × 10⁴ m²; REPF: relative error of pond fraction; PoD: pond density; REPD: relative error of pond density.
Table E2. Areal fraction and density per waterbody map in Canada. Map IDs with an asterisk were not used for extrapolation. F: areal fraction of waterbodies from 1.0 × 10² to 1.0 × 10⁶ m² in size; REF: relative error of fraction for map subsets of 10 km × 10 km; D: waterbody density per square kilometer; RED: relative error of density; PF: pond areal fraction for waterbodies from 1.0 × 10² m² to smaller than 1.0 × 10⁴ m²; REPF: relative error of pond fraction; PD: pond density; REPD: relative error of pond density.
Table E3. Areal fraction and density per waterbody map in Scandinavia. Map IDs with an asterisk were not used for extrapolation. F: areal fraction of waterbodies from 1.0 × 10² to 1.0 × 10⁶ m² in size; REF: relative error of fraction for map subsets of 10 km × 10 km; D: waterbody density per square kilometer; RED: relative error of density; PF: pond areal fraction for waterbodies from 1.0 × 10² m² to smaller than 1.0 × 10⁴ m²; REPF: relative error of pond fraction; PD: pond density; REPD: relative error of pond density.
Table E4. Areal fraction and density per waterbody map in Russia. Map IDs with an asterisk were not used for extrapolation. F: areal fraction of waterbodies from 1.0 × 10² to 1.0 × 10⁶ m² in size; REF: relative error of fraction for map subsets of 10 km × 10 km; D: waterbody density per square kilometer; RED: relative error of density; PF: pond areal fraction for waterbodies from 1.0 × 10² m² to smaller than 1.0 × 10⁴ m²; REPF: relative error of pond fraction; PD: pond density; REPD: relative error of pond density.
1.3 × 10 3 9 5 4 3 0 3 3 3 yam00220100820
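For orientation, the quantities F, D, PF, and PD defined in the table captions can be computed from a list of waterbody surface areas and the mapped study area roughly as follows. This is a sketch under the size limits stated in the captions; the function name, the input layout, the inclusive upper size limit, and the return of areal fractions as unitless ratios rather than percentages are assumptions.

```python
MIN_AREA_M2 = 1.0e2   # smallest waterbody size included in the tables
POND_MAX_M2 = 1.0e4   # ponds are waterbodies smaller than this
MAX_AREA_M2 = 1.0e6   # largest waterbody size included (assumed inclusive)

def waterbody_statistics(areas_m2, study_area_km2):
    """Areal fraction and density of waterbodies and ponds for one map."""
    if study_area_km2 <= 0:
        raise ValueError("study area must be positive")
    study_area_m2 = study_area_km2 * 1.0e6
    selected = [a for a in areas_m2 if MIN_AREA_M2 <= a <= MAX_AREA_M2]
    ponds = [a for a in selected if a < POND_MAX_M2]
    return {
        "F": sum(selected) / study_area_m2,   # waterbody areal fraction
        "D": len(selected) / study_area_km2,  # waterbodies per km2
        "PF": sum(ponds) / study_area_m2,     # pond areal fraction
        "PD": len(ponds) / study_area_km2,    # ponds per km2
    }
```

Applying the function to each 10 km × 10 km subset of a map would yield the per-box values from which the relative errors listed in the captions are derived.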