An explicit GIS-based river basin framework for aquatic ecosystem conservation in the Amazon

Despite large-scale infrastructure development, deforestation, mining and petroleum exploration in the Amazon Basin, relatively little attention has been paid to the management scale required for the protection of wetlands, fisheries and other aspects of aquatic ecosystems. This is due, in part, to the enormous size, multinational composition and interconnected nature of the Amazon River system, as well as to the absence of an adequate spatial model for integrating data across the entire Amazon Basin. In this data article we present a spatially uniform multi-scale GIS framework that was developed especially for the analysis, management and monitoring of various aspects of aquatic systems in the Amazon Basin. The Amazon GIS-Based River Basin Framework is accessible as an ESRI geodatabase at doi:10.5063/F1BG2KX8.


Amazon Basin system
The Amazon is the largest river basin in the world. Its strict hydrographical area covers 6.3 million km² (Milliman and Farnsworth, 2011), and when the Tocantins Basin and estuarine coastal areas are included to define the Amazon region, the total area is 7.287 million km². The average discharge of the Amazon River at its mouth is approximately 206 000 m³ s⁻¹, contributing approximately 17 % of all river water reaching the world's oceans, at least 4 times that of the Congo, the second largest river (Richey et al., 1986; Callede et al., 2004, 2010). Two of the Amazon River's tributaries, the Madeira and Negro, are also among the 10 largest rivers in the world as measured by average discharge (Milliman and Farnsworth, 2011). Wetlands occupy 14 % of the Amazon Basin (Melack and Hess, 2010) and play an important role in the ecology and biogeochemistry of this immense fluvial ecosystem. These environments include nearly all of the 35 inland or coastal wetland types defined by the Ramsar Convention (Mathews, 2013) but are composed primarily of alluvial floodplain habitats. Tree-dominated wetlands are the dominant habitat types on the floodplains, often covering 75 % or more of inundated areas where there has not been deforestation (Melack and Hess, 2010; Junk et al., 2012; Cunha et al., 2015; Melack, 2016). Floodplains are also characterized by lake-like waterbodies where water depth prevents the establishment of forest but where large rooted and floating herbaceous communities develop, especially along whitewater rivers that receive nutrients from the Andes (Junk, 1970; Piedade et al., 2010). Floodplains are under the strong influence of seasonal inundation pulses, which are monomodal for most of the lowland region and range from 5 to 15 m depending on the exact location, but can be bimodal near the Equator or show numerous spikes in or near the Andes. Flooding in the easternmost part of the Amazon floodplain is tidally influenced, though river discharge prevents an invasion of saltwater except during the lowest water period in the Marajó Bay area (Barthem and Schwassmann, 1994). Due to a backwater effect caused by the temporally different contributions of the southern and northern tributaries, the Amazon River and the lower courses of most of its tributaries remain in flood longer than expected from the tributary flood pulses alone (Meade et al., 1991). During the high-water period the lower courses of the tributary basins also become functionally a part of the Amazon main stem and the latter, although not a basin, behaves as an ecologically distinct hydrological unit.
E. Venticinque et al.: An explicit GIS-based river basin framework
The spatial and temporal variability in the river flood pulse and its influence on inundation patterns in floodplain environments play a fundamental role in sustaining the diversity and productivity of floodplain biota and the livelihoods of human populations throughout the Amazon. Infrastructural development, including plans to construct new dams, roads, and hydrovias across the basin, together with accelerating land use and climate change, threatens to disrupt this complex hydro-ecological system, with predictable negative consequences for the biota and river-dwelling populations that depend on its integrity. The conservation and management of the natural resources and services provided by this ecosystem will require a uniform hydrological framework, covering the entire Amazon region, specifically adapted for this objective.

Current spatial framework
River basins are the most natural spatial units of aquatic ecosystems and are also the units generally used by the agencies or authorities (Agência/Autoridad Nacional de Águas/Agua, ANAs) charged with managing waters in Amazonian countries. The ANAs have traditionally used a basin coding system based on the work of Otto Pfafstetter, usually called the Pfafstetter Coding System (Pfafstetter, 1989), and the basins delineated in this system are referred to as Pfafstetter basins (or Otto basins, in Brazil). Each delineated basin is assigned an identification number that establishes a hierarchical and sequential arrangement of basins, often with a larger basin divided into at least nine smaller units (Verdin and Verdin, 1999). The Pfafstetter methodology was applied to the Amazon Basin in the HydroSHEDS product, which includes 12 basin levels (Lehner and Grill, 2013), and has also been applied to North American river basins (Verdin and Verdin, 1999). Pfafstetter basin classifications, especially those used by the ANAs, will undoubtedly continue to be the geographical basis for water use management in Amazonian countries, but complementary classifications adapted for specific local objectives, such as the development of the Strategic Plan of Hydrological Resources of the Right Margin of the Rio Amazonas, have also been adopted (Agência Nacional de Águas (Brasil), ANA, 2012). The Pfafstetter methodology and most other basin classifications used to date in the Amazon have not considered the main stem and its associated floodplains as a hydrological unit. These areas contain the most productive river and wetland habitats and should thus be managed in the same way as large tributary basins. By including the main channel and surrounding floodplains of the Amazon River and its largest tributaries as discrete sub-basins in a regional basin hierarchy, we have produced a new spatially explicit integrated river basin framework, specifically adapted for the management and conservation of the Amazon fluvial ecosystem.
The digital river networks currently available for the Amazon region also lack some aspects essential for the management of aquatic ecosystems. The HydroSHEDS product (http://hydrosheds.cr.usgs.gov/index.php), the most accurate and regionally uniform river network available prior to the present work, lacks lower-order streams, which are important habitats for many aquatic organisms; an equally uniform but higher-resolution vector product was thus needed to include these habitats. Ecologically and geographically important attributes such as stream order, river name, river length and water type are also needed for a spatially robust conservation and management framework.
Accelerating land use, infrastructure development and resource exploitation present a growing threat to the integrity of the Amazon River ecosystem (Castello and Macedo, 2016). The Amazon GIS-Based River Basin Framework presented here, including an ecologically consistent basin hierarchy and a spatially uniform, high-resolution, classified river drainage network, should help by providing a spatial basis to increase the scope of management and conservation efforts to meet the challenges of large-scale impacts.

Data
Two types of hydrological data are included in this spatial framework for the Amazon Basin.
1. Polygon: a hierarchical river basin classification and delineation of main stem floodplains. Main stems are considered the large downstream segments of the Amazon River and its major tributaries. Although not basins, per se, these main stem sub-basins contain large areas of wetlands and are important for fisheries production and aquatic biodiversity in the Amazon Basin.
2. Line: a new high-density drainage network containing important geographical attributes, including stream order (1-11th order), tributary name (6-11th order), river type (6-11th order) and distance above the Amazon River mouth (4-11th order).

Acquisition and correction of DEM (digital elevation model)
To obtain a spatially uniform and high-resolution stream network and drainage basin hierarchy for the Amazon Basin, flow direction and flow accumulation patterns were derived from the 90 m resolution SRTM-DEM, which was the most accurate DEM available for the South American continent.
The near-global Shuttle Radar Topography Mission (SRTM) digital elevation data set (Farr et al., 2007) was developed by NASA and the US National Geospatial-Intelligence Agency for the entire Earth using stereo C-band imagery acquired by the space shuttle Endeavour in February of 2000, which corresponds to the early rising water period in the central Amazon region, when the Amazon main stem begins its 10-12 m annual flood cycle. The data product has a spatial resolution of 3 arcsec, approximately 90 m in the Amazon region, and a vertical accuracy of 1 m locally and 4 m globally.
Like most DEMs derived from synthetic aperture radar, the SRTM-DEM contains regions where useable data were not obtained (voids) and also regions where spatial variation in elevation is close to the vertical accuracy of the product and is consequently poorly represented. These latter areas include large lakes, river channels and wetlands. Furthermore, the SRTM-DEM is not a "bare earth" DEM, but represents the elevation of a scattering centroid that varies as a function of vegetation height and density (Carabajal and Harding, 2005). For our analysis, we used the version 4.1 DEM available through CGIAR-CSI (Lehner et al., 2006). This "void-filled" DEM was provided in 6000 × 6000 pixel panels, which we mosaicked using the "mosaic tool" in ArcGIS 10.1 (ESRI, 2012) to produce a uniform DEM covering all of South America above 22° south latitude.
Three additional modifications of the SRTM-DEM mosaic were performed before flow direction patterns were analyzed to improve the quality of the final drainage network. First, we manually modified the DEM at one location in the headwaters of the Caquetá River in Colombia where the river passed through a channel in a large rock formation that was so narrow that it was not represented in the DEM. To ensure that water "flowed" through this point in the final stream network, it was necessary to "excavate" the channel digitally so that it was wider than the 90 m resolution of the DEM image. This was done by changing the elevation values of the rock formation in the DEM to those of the river channel. In the second modification, the DEM was "reconditioned" to ensure that the main river channels followed a more precise path as they crossed the extensive floodplains in the central Amazon lowlands. This was done by lowering the elevation of all cells in the DEM along channels ≥ 7th order in the lower-resolution HydroSHEDS stream product (Lehner et al., 2008). Finally, a "Fill Sinks" procedure was used to fill any remaining depressions in the reconditioned DEM which might impede water flow. This was done by raising the elevation values of all cells completely surrounded by higher elevation cells. All GIS analyses were performed in ArcGIS 10.1 (ESRI, 2012) and Arc Hydro 2.0 (Arc Hydro, 2011).
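The sink-filling step can be illustrated with a small, self-contained sketch. It is not the Arc Hydro implementation: it swaps in the widely used priority-flood approach, the function name `fill_sinks` is hypothetical, and the DEM is assumed to be a small list of lists rather than a raster file.

```python
import heapq

# Eight neighbours of a raster cell (row offset, column offset).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def fill_sinks(dem):
    """Return a copy of `dem` in which every depression is raised to its spill level.

    Priority-flood sketch (an assumption, not the tool used in the paper):
    cells are visited from the grid edge inwards in order of elevation, and
    any cell lower than the level from which it was reached is raised to it.
    """
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []

    # Edge cells can always drain off the grid, so they seed the queue.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True

    while heap:
        level, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                visited[nr][nc] = True
                # Raise the neighbour if it lies below the current spill level.
                filled[nr][nc] = max(filled[nr][nc], level)
                heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled
```

Filled depressions become flat surfaces, so a production workflow still needs a rule for routing flow across them; that step is omitted here.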

Area of basins and length of river calculations
For all calculations of area of the basins, length of rivers and distance to the mouth we used the Albers projection with the following parameter configuration (Table 1).

Drainage network development
Once the DEM was corrected, a flow direction raster file was generated. In this file, each cell in the original DEM was replaced with a code indicating the direction of the steepest descent, determined by comparing the elevation of that cell to those of the eight surrounding cells in the DEM. This flow direction raster was then used to generate a flow accumulation raster, where each flow direction cell was replaced by a cell containing the accumulated number of cells upstream of that cell. The flow accumulation raster was then used to generate a stream raster file where each pixel having flow accumulation above a user-specified threshold value was replaced with a single nonzero value. The threshold value indicates the number of upstream cells or basin area where the delineation of the drainage network begins and determines the spatial resolution of the final stream network. We chose a stream threshold of 100 cells, which corresponded to a drainage area of approximately 81 ha. All cells with accumulation values below this threshold were attributed a value of zero. The stream raster file was then used together with the flow direction raster to create an ordered stream raster. This was done by replacing the nonzero values in the stream raster with values of stream order as defined by Strahler (1957). This ordered stream raster was then vectorized to produce a single high-resolution stream network shapefile for the entire Amazon Basin containing a stream order attribute.
The calculated stream order varied from 1st to 11th order in this product, which is probably underestimated by 1 order, since the drainage areas of first-order streams, defined by Strahler (1957) as permanent streams with no permanent upstream tributaries, tend to vary from 10 to 50 ha in the central Amazon Basin. Assuming that this is correct, the smallest streams in the stream network developed here would be approximately 2nd order and the Amazon River main channel near its mouth would be 12th order. The order included in the attribute table of the final shapefile was the value generated originally by the stream order tool.
Three different stream network shapefiles were created from this high-resolution product, containing streams from 1st to 11th order, 6th to 11th order and 7th to 11th order, respectively. Tributary names, derived from existing databases, were added to the 6-11th-order river network. The shapefile containing 1-11th-order streams was filtered to remove anomalous first- to third-order streams which were generated on open water surfaces and wetlands due to the inaccuracy of the DEM and the flow direction raster that was generated from it. These anomalies consisted of spurious low-order stream segments, generated predominantly in low-relief wetland environments where variation in elevation was either extremely low (open water environments) or due primarily to variations in vegetation height. The filter eliminated first- to third-order streams present in the wetland mask and stream segments adjacent to and intersecting the mask that were delimited by BL7 basins. While most of the anomalous segments were removed by the filter, some are still apparent at higher resolutions. The length (km) of each segment in the full-resolution network was calculated with the South America Albers Equal Area Conic projection. All GIS analyses were performed in ArcGIS 10.1 (ESRI, 2012) and Arc Hydro 2.0 (Arc Hydro, 2011).
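The chain of rasters described in this section (flow direction, flow accumulation, thresholded stream raster, Strahler order) can be sketched on a toy grid as follows. This is a minimal illustration, not the Arc Hydro workflow: all function names are hypothetical, the DEM is a list of lists assumed to be sink-free, and treating "above the threshold" as a strict inequality is an assumption.

```python
SQRT2 = 2 ** 0.5
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem):
    """For each cell, return the (row, col) of its steepest-descent neighbour, or None."""
    rows, cols = len(dem), len(dem[0])
    target = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = 0.0
            for dr, dc in OFFSETS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dist = SQRT2 if dr and dc else 1.0  # diagonal steps are longer
                    slope = (dem[r][c] - dem[nr][nc]) / dist
                    if slope > best:
                        best, target[r][c] = slope, (nr, nc)
    return target


def _high_to_low(dem):
    """Cells sorted from highest to lowest, so upstream cells come first."""
    cells = [(dem[r][c], r, c) for r in range(len(dem)) for c in range(len(dem[0]))]
    return [(r, c) for _, r, c in sorted(cells, reverse=True)]


def flow_accumulation(dem, target):
    """Number of upstream cells draining through each cell."""
    acc = [[0] * len(dem[0]) for _ in dem]
    for r, c in _high_to_low(dem):
        if target[r][c] is not None:
            tr, tc = target[r][c]
            acc[tr][tc] += acc[r][c] + 1
    return acc


def stream_raster(acc, threshold):
    """1 where accumulation exceeds the threshold (strictness assumed), else 0."""
    return [[1 if a > threshold else 0 for a in row] for row in acc]


def strahler_merge(upstream_orders):
    """Strahler rule: order rises only where two streams of the top order meet."""
    if not upstream_orders:
        return 1
    top = max(upstream_orders)
    return top + 1 if upstream_orders.count(top) >= 2 else top


def strahler_raster(dem, target, stream):
    """Assign a Strahler order to every stream cell (0 elsewhere)."""
    order = [[0] * len(dem[0]) for _ in dem]
    inflow = {}  # stream cell -> orders of the stream cells flowing into it
    for r, c in _high_to_low(dem):
        if stream[r][c]:
            order[r][c] = strahler_merge(inflow.get((r, c), []))
            if target[r][c] is not None:
                inflow.setdefault(target[r][c], []).append(order[r][c])
    return order


def threshold_area_ha(cells, cell_size_m=90):
    """Drainage area in hectares for a threshold given in cells."""
    return cells * cell_size_m ** 2 / 10_000
```

With the 90 m cell size, `threshold_area_ha(100)` reproduces the 81 ha quoted above (100 cells × 8100 m² = 810 000 m²).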

Development of basin hierarchy
Seven different scales or hierarchical levels were delineated in our basin hierarchy, denominated basin level 1 (BL1) to basin level 7 (BL7) (Figs. 1 and 2).
- Basin code generation. Basin codes for BL1 and BL4 basins were derived from the names of the principal rivers in each polygon. Codes for BL5-BL7 basins were created combining the associated BL2 basin name with the ID numbers generated automatically when each basin was delimited.
- Basin level 1 (BL1), regional basins: divides the working area into three drainage polygons: one large polygon containing the Amazon and Tocantins river basins and two smaller ones containing the northern and southern coastal basins draining directly into the Atlantic.
- Basin level 2 (BL2), major Amazon tributary basins: delimits all tributary basins larger than 100 000 km² (main basins) whose main stems flow into the Amazon River main channel, as well as an Amazon River main stem polygon that consists of the open waters of the Amazon River, its floodplain and adjacent small tributary basins (Fig. 3).
- Basin level 3 (BL3), major tributary basins: delimits all basins larger than 100 000 km², including those that do not flow directly into the Amazon River main channel, all tributary basins larger than 10 000 km² and less than 100 000 km² that flow into the Amazon River main stem, and a single central floodplain drainage polygon.
- Basin level 4 (BL4), minor tributary basins: delimits all tributary basins greater than 10 000 km² and less than 100 000 km². Floodplain drainages include all tributaries with basins less than 10 000 km² flowing toward the floodplain at high water.
Polygon shapefiles for major Amazon tributary (BL2), major tributary (BL3) and minor tributary (BL4) basins were created from the flow direction raster and a point shapefile for basin outlets using a "batch watershed delineation" tool. Basin outlets were added to the point shapefile manually using the high-resolution stream network as a guide and the "point generation" feature of the "batch watershed delineation" tool. The flow direction raster was used to delineate the flow divides upstream from these points and define the basin limits. All major and minor tributary basins were attributed areas and the name of the principal tributaries in each polygon.
Sub-basin raster files with sub-basin thresholds of 5000 (BL5), 1000 (BL6) and 300 km² (BL7) were created for the entire Amazon Basin from the flow direction raster and segmented stream rasters developed at these scales, using a "catchment grid delineation" tool. The segmented stream rasters were generated from stream raster files created with these thresholds and the flow direction raster, using a "stream segmentation" tool. This tool separated stream raster reaches between confluences into separate segments and attributed the cells in each segment a unique identifying code. The flow direction raster was then used to aggregate the drainage cells associated with each segment. These raster sub-basins were then transformed into separate polygons using a "catchment polygon processing" tool. This tool delineated the limits of each raster sub-basin. Sub-basin areas in each shapefile ranged between its defining stream raster threshold and the stream raster threshold of the next basin level. General characteristics and statistics for each basin level are summarized in Table 2. All GIS analyses were performed in ArcGIS 10.1 (ESRI, 2012) and Arc Hydro 2.0 (Arc Hydro, 2011).
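The idea behind delineating a basin from an outlet point and a flow direction raster can be sketched as an upstream traversal. The code below is an illustration under simplifying assumptions, not the Arc Hydro tools named above: flow directions are given as a hypothetical dictionary mapping each cell to its downstream cell, and `delineate_watershed` and `km2_to_cells` are made-up names. The second helper converts the BL5-BL7 area thresholds into cell counts, assuming square 90 m cells.

```python
from collections import defaultdict, deque


def delineate_watershed(downstream, outlet):
    """Return the set of cells that drain through `outlet` (outlet included).

    `downstream` maps each cell identifier to the cell it flows into,
    or to None for a terminal cell.
    """
    # Invert the flow direction so each cell knows its direct contributors.
    upstream = defaultdict(list)
    for cell, nxt in downstream.items():
        if nxt is not None:
            upstream[nxt].append(cell)

    basin = {outlet}
    queue = deque([outlet])
    while queue:
        cell = queue.popleft()
        for contributor in upstream[cell]:
            if contributor not in basin:
                basin.add(contributor)
                queue.append(contributor)
    return basin


def km2_to_cells(area_km2, cell_size_m=90):
    """Approximate number of square cells covering `area_km2`."""
    return round(area_km2 * 1_000_000 / cell_size_m ** 2)


# Toy network: A and B join at C; C and D join at E, the terminal cell.
flow = {"A": "C", "B": "C", "C": "E", "D": "E", "E": None}
sub_basin = delineate_watershed(flow, "C")   # cells A, B and C
whole_basin = delineate_watershed(flow, "E")  # all five cells
```

Under the 90 m assumption, the 300 km² BL7 threshold corresponds to roughly 37 000 cells.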

Definition of floodplain drainage polygons
Large river floodplains play an important role in the Amazon, sustaining aquatic primary production and fish yields in the region. At high water, when the inundated area of floodplains is greatest, many small tributaries are completely flooded, altering regional drainage patterns. Many of these tributaries, which are independent of the main channel at low water, are "captured" by flooding and incorporated in the main stem drainage at high water. Due to their ecological importance, we prioritized these high-water drainage patterns in the delineation of floodplain drainage polygons. The drainage areas of major tributary floodplains were delineated initially at the BL4 level with the drainage network derived from the DEM and then adjusted manually with a wetland mask to better represent high-water drainage patterns.
The wetland mask used to identify floodplain environments was generated by Hess et al. (2015a, b) from the analysis of JERS-1 L-band radar imagery covering most of the lowland Amazon Basin acquired during both low- and high-water periods. Detailed methods are provided in the original reference. Wetlands were defined as areas that were inundated during either or both periods, together with areas adjacent to flooded areas which displayed landforms consistent with floodplain geomorphology. Tidal wetlands in the lower Amazon and Tocantins rivers that were missing from this product were delineated here using a similar methodology and then annexed to the larger Amazon Basin mask. The final wetland mask, together with the BL5 and BL7 sub-basin shapefiles, was used to identify and delimit the floodplain drainages of major tributaries. Floodplain drainages were defined to include all main stem floodplain wetlands identified with the mask plus all upland sub-basins less than 10 000 km² that flowed directly into them. All tributary wetland drainage polygons were attributed with the name of the associated major tributary. The floodplain drainage associated with the Amazon River main stem was further divided into four areas based on geomorphology (Dunne et al., 1998), habitat distribution (Hess et al., 2015a) and fisheries. Once all major floodplain drainages were delineated, vectored data and metadata were added and then aggregated as polygons to the BL4 shapefile and as attributes to the BL5, BL6 and BL7 shapefiles.

Classification of river type
Water quality or type varies considerably in the Amazon River system and has been shown to have a major influence on biogeochemical processes and on the distribution and dynamics of aquatic habitats and biota. There are three main types of rivers in the Amazon Basin based on natural differences in water color and quality (Sioli, 1968): (1) whitewater rivers, with neutral pH and rich in suspended sediments and nutrients; (2) blackwater rivers, low in pH, nutrients and suspended sediments and high in dissolved organic carbon; and (3) clearwater rivers, low to neutral pH and low in nutrients, suspended sediments and dissolved organic carbon. The determination of water type (white, black or clear) in 6th-11th-order rivers was based on field observations of apparent river color made by the authors or their regional collaborators. Where direct field observations were unavailable, water color was determined through the visual analysis of river surfaces in cloud-free "natural color" optical images provided by Google Earth (Google, Inc.). These were primarily 15-30 m resolution Landsat images, although higher-resolution SPOT and QuickBird images were also available
Earth Syst. Sci. Data, 8, 651-661, 2016, www.earth-syst-sci-data.net/8/651/2016/

Definition and mapping of fish spawning nodes
Many migratory characiform fish species spawn at the confluences of whitewater and blackwater or clearwater rivers (Goulding, 1980, 1988; Ribeiro and Petrere-Jr., 1990; Araujo-Lima and Ruffino, 2004). These fish spawning nodes were identified and incorporated in a shapefile for 6-11th-order rivers. The "feature vertices to points" tool in ArcGIS 10.1 was used to convert the last downstream drainage line before each confluence in the 6-11th-order river network into a point. Next, a buffer of 1000 m around each point was generated in order to define the confluence areas where spawning takes place. For each buffer area a spatial join was applied for the following information: order and type of tributary and order and type of river into which the tributary flows. Important confluence areas for spawning were then derived from the intersection of spawning nodes and sub-basins or main stem drainages important for commercial fishing. Two shapefiles with confluence nodes were generated: (1) "NodesGeneral", where the nodes represent all confluences between rivers above sixth order, independent of water color, and (2) "Nodes_MainStemFishRegion", where the nodes represent the confluences of all tributaries above sixth order with the main stems of the Amazon River and its principal whitewater tributaries. These whitewater main stem confluences are the most important for commercial fishing activities. The TRIB-SIZE field in the attribute tables of the "NodesGeneral" and "Nodes_MainStemFishRegion" shapefiles refers to the minimum order of the smallest tributary in the confluence, "Big" being > 7th order and "Small" being ≤ 7th order. The resulting distribution of fish spawning zones is indicated in Fig. 5.
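The attribute logic applied to each confluence node can be expressed compactly. The sketch below is only an illustration of the rules stated above: the `Confluence` record and its field names are hypothetical (they are not the field names of the published shapefiles), and reading TRIB-SIZE as the smaller of the two stream orders at the confluence is an interpretation.

```python
from dataclasses import dataclass

# Water types that, when meeting whitewater, mark a potential spawning node.
NON_WHITE_TYPES = {"black", "clear"}


@dataclass
class Confluence:
    """Attributes joined to one 1000 m confluence buffer (hypothetical field names)."""
    tributary_order: int
    tributary_type: str   # "white", "black" or "clear"
    receiving_order: int
    receiving_type: str   # "white", "black" or "clear"


def trib_size(node):
    """'Big' if the smallest river at the confluence is above 7th order, else 'Small'."""
    smallest = min(node.tributary_order, node.receiving_order)
    return "Big" if smallest > 7 else "Small"


def mixes_water_types(node):
    """True where whitewater meets blackwater or clearwater."""
    types = {node.tributary_type, node.receiving_type}
    return "white" in types and bool(types & NON_WHITE_TYPES)


# Example: an 8th-order blackwater tributary entering a 10th-order whitewater river.
example = Confluence(8, "black", 10, "white")
label = trib_size(example)
is_candidate = mixes_water_types(example)
```

Keeping the size class and the water-type test as separate functions mirrors the two shapefiles described above, one of which ignores water color.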

River distances
Distances along the river network from the mouth of the Amazon River to specific points in the river system can be important for characterizing spawning routes and for calculating the residence times and velocities of larval and juvenile fish during downstream migrations, as well as those of other materials in the system. Distances from the Amazon's mouth to all stream segments between 4th and 11th order were calculated using the Barrier Analysis Tool (BAT) extension for ArcMap 10.1 developed for The Nature Conservancy (software developer: Duncan Hornby of the University of Southampton's GeoData Institute). The tool uses point data to divide a routed river network (polylines with from-node and to-node coding) into connected networks from which a direct path distance calculation can then be made. The data provide distances not only to specific points from the Amazon River mouth but also to distant regions (Fig. 6). Distance values and stream order were included as segment attributes in the final river network shapefile.
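A path-distance calculation on a routed network with from-node and to-node coding can be sketched as follows. This is not the Barrier Analysis Tool: the function name, the dictionary layout and the choice to measure from each segment's downstream end are assumptions made for illustration, and the network is assumed to be dendritic (one outgoing segment per node).

```python
def distances_to_mouth(segments, mouth_node):
    """Distance (km) from the downstream end of each segment to the mouth.

    `segments` maps a segment id to (from_node, to_node, length_km), with
    flow running from from_node to to_node.
    """
    # Each node has a single outgoing segment in a dendritic network.
    out_of = {frm: (to, length) for frm, to, length in segments.values()}
    node_dist = {mouth_node: 0.0}

    def node_distance(node):
        # Walk downstream until a node with a known distance is reached,
        # then fill in the distances for the nodes passed on the way.
        path = []
        while node not in node_dist:
            to, length = out_of[node]  # KeyError if the node never reaches the mouth
            path.append((node, length))
            node = to
        total = node_dist[node]
        for passed, length in reversed(path):
            total += length
            node_dist[passed] = total
        return total

    return {seg_id: node_distance(to) for seg_id, (frm, to, length) in segments.items()}


# Toy network (lengths in km): two branches join 100 km above the mouth.
network = {
    "main_lower": ("n1", "mouth", 100.0),
    "main_upper": ("n2", "n1", 50.0),
    "tributary": ("n3", "n1", 30.0),
    "headwater": ("n4", "n2", 20.0),
}
distances = distances_to_mouth(network, "mouth")
```

Adding a segment's own length to its result gives the distance from its upstream end, which may be the more useful attribute for headwater reaches.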

Data availability
Interested researchers can access the data and metadata at doi:10.5063/F1BG2KX8 (Venticinque et al., 2016).

Conclusions
The multi-level basin hierarchy and classified river network presented here provide a new spatial framework for analyzing aquatic and terrestrial data at a variety of sub-basin levels, including the Amazon Basin and Amazon region as a whole. Its architecture is appropriate for use in the monitoring and management of aquatic ecosystems, especially within an integrated river basin management framework at distinct spatial scales. The principal data products provided in the GIS include the following:
1. A multi-level basin hierarchy specifically designed for the conservation and management of river basins and floodplain environments at a variety of basin and sub-basin scales.
2. A high-resolution, spatially uniform, ordered drainage network for the Amazon Basin and its adjacent coastal basins (coastal north, coastal south and Tocantins).
3. A first approximation of river types based on water color as a proxy for distinct chemical characteristics, included as an attribute for 6-11th-order tributaries.
4. Estimates of the distance of individual stream segments from the mouth of the Amazon River, included as an attribute for 4-11th-order streams in the Amazon Basin.
5. A point shapefile indicating confluences (nodes) of different river types that are critical spawning zones for migrating fish species.
This regional hydrological database provides a coherent framework for the integration and analysis of a wide array of spatial data, critical for management and conservation of this valuable fluvial ecosystem.

Figure 1. Cartographic representation of the first four levels of the classification of the Amazon and adjacent coastal basins (south and north): BL1, BL2, BL3 and BL4. BL: basin level.

Figure 4. Cartographic representation of Amazon River type classification.

Figure 5. Cartographic representation of important confluence areas for spawning, derived from the intersection of spawning nodes and sub-basins or main stem drainages important for commercial fishing.

Figure 6. Cartographic representation of river distances from Amazon River mouth.

Table 1. Parameter configuration of projection used for all calculations of area and length in this database.

Table 2. General description of the catchment system for the Amazon region.