Multi-year high-frequency physical and environmental observations at the Guadiana Estuary

High-frequency data collected continuously over a multi-year time frame are required for investigating the various agents that drive ecological and hydrodynamic processes in estuaries. Here, we present water quality and current in situ observations from a fixed monitoring station operating from 2008 to 2014 in the lower Guadiana Estuary, southern Portugal (37°11.30′ N, 7°24.67′ W). The data were recorded by a multi-parametric probe providing hourly records (temperature, salinity, chlorophyll, dissolved oxygen, turbidity, and pH) at a water depth of ∼ 1 m, and by a bottom-mounted acoustic Doppler current profiler measuring the pressure, near-bottom temperature, and flow velocity through the water column every 15 min. The time series data, in particular those from the probe, present substantial gaps arising from equipment failure and maintenance, which are unavoidable with this type of observation in harsh environments. However, prolonged (months-long) periods of multi-parametric observations during contrasting external forcing conditions are available. The raw data are reported together with flags indicating the quality status of each record. River discharge data from two hydrographic stations located near the estuary head are also provided to support data analysis and interpretation. The data set is publicly available in machine-readable format at PANGAEA (doi:10.1594/PANGAEA.845750).


Introduction
Estuaries are one of the most productive types of ecosystems on Earth and are of considerable value to both humans and wildlife. Despite extensive research efforts during past decades, the preservation of these systems demands a greater knowledge and understanding of the complex physical and biological mechanisms that control their health (Kennish, 2002; Zalewski, 2013). In particular, multi-year in situ observations performed at high (minutes to hours) frequencies are desirable for investigating the various agents that drive their ecology and hydrodynamics. Although increasingly implemented in highly developed countries, such monitoring programmes are not yet carried out worldwide in estuaries (Baptista et al., 2008; Dang et al., 2010; Garel et al., 2009a). Furthermore, the collected data are not always made available to the research community.
The SIMPATICO (integrated system for in situ multiparametric monitoring in coastal areas) station operated from March 2008 until April 2014 in the lower Guadiana Estuary for the continuous in situ monitoring of current and water quality (Garel and Ferreira, 2011; Garel et al., 2009a). This estuary, at the southern border between Spain and Portugal, is a rock-bounded system 80 km long, particularly narrow (700 m at most) and relatively shallow (about 5 m deep on average). It is oriented north-south and directly connects the Guadiana River (810 km long, with the 4th largest drainage area of the Iberian Peninsula) to the Gulf of Cadiz (Fig. 1). Numerous studies have established its physical (Boski et al., 2002; Fortunato et al., 2002; Garel and Ferreira, 2013; Garel et al., 2009b, 2014; Lobo et al., [...] land use changes and strong flow regulation owing to increasing freshwater demand (Garel et al., 2009a; Guimarães et al., 2012). In particular, more than 100 dams have been built since the 1950s in the river basin, including the controversial Alqueva dam on the Guadiana River, closed in February 2002 to form the largest reservoir in Western Europe at 80 km from the estuary head.
The SIMPATICO monitoring station included a multiparametric probe providing hourly observations near the surface and a bottom-mounted acoustic Doppler current profiler (ADP) operating at 15 min intervals. This contribution presents the data collected by these instruments between 2008 and 2014, together with the concurrent freshwater discharge into the estuary (Table 1). First, a description of the SIMPATICO system is given (Sect. 2). Then, the time span of the records and the techniques used for their validation are detailed (Sect. 3). In Sect. 4, an overview of the data is provided at both the seasonal and tidal timescales. Access to the data published at the PANGAEA digital library is briefly described in Sect. 5. As a conclusion (Sect. 6), some of the key eco-hydrodynamic aspects that can be addressed by the data set are outlined.

The SIMPATICO monitoring station
The SIMPATICO system is located in the lower Guadiana Estuary, near the mouth, ∼ 3 km from the tips of a pair of jetties that have stabilised the inlet and ∼ 100 m from the Portuguese shore (37°11.30′ N, 7°24.67′ W), in front of Vila Real de Santo Antonio (red star in Fig. 1). The station consists of a foam-hull floating platform (YSY EMM 550; YSI, 2007) measuring 90 cm in diameter, anchored to the seabed with chains and concrete blocks (Fig. 1). Inside the buoy, a water-tight electronic compartment houses a data logger (CR1000; Campbell Scientific, 2006) as well as batteries that power the system and are recharged by three solar panels located on top of the buoy. The logger is equipped with a modem for the automatic downloading of raw data to a remote server through the Global System for Mobile Communications (GSM).
The multi-parameter probe (YSI 6600 V2-4; YSI, 2006) is inserted through the surface buoy and measures water quality parameters hourly at 1 m water depth. It is equipped with three optical sensors to measure turbidity (NTU), oxygen saturation (%) and dissolved oxygen concentration (mg L⁻¹), and chlorophyll concentration (µg L⁻¹, via fluorescence), as well as three non-optical sensors to measure conductivity (µS cm⁻¹), temperature (°C), and pH. Water conductivity is derived from the decrease in voltage recorded within a cell fitted with four nickel electrodes. Temperature (ITS-90 scale) is converted from resistance variations measured with a precision thermistor of sintered metal oxide. Salinity (PSU) is determined internally from the conductivity and temperature readings according to standard methods (Clesceri et al., 1989). The pH probe determines hydrogen ion concentration using a combination electrode consisting of a proton-selective reservoir (filled with a pH 7 buffer) and a Ag/AgCl reference electrode. A fluorometer is used to estimate the concentration of chlorophyll in vivo based on the ability of chlorophyll to fluoresce. Dissolved oxygen is obtained by measuring the luminescence lifetime of a dye exposed to blue light. Turbidity is determined by shining a light beam into the sample solution and then measuring the amount of light scattered off suspended particles. The optical sensors are fitted with wipers that clean their optical face before each measurement. All sensors (except the temperature sensor) require periodic calibration using buffer solutions (for details, see Sect. 3). For chlorophyll, because of the lack of an appropriate buffer, a zero calibration designed to evaluate the sensor drift is performed using distilled water; as such, the fluorescence sensor provides semi-quantitative chlorophyll measurements that are useful for detecting changes over time (rather than accurately measuring concentration levels).
For the other sensors, data accuracies provided by the manufacturers are ±0.15 °C for temperature, ±0.5 % for conductivity, ±1 % for salinity, ±0.2 for pH, ±2 % for dissolved oxygen, and ±5 % for turbidity.
The 750 kHz ADP (Sontek Argonaut XR; Sontek/YSY, 2001) is bottom mounted on a structure (spider) fixed to a concrete block, about 10 m north and 5 m west of the buoy, in 9 m water depth (relative to mean sea level; Fig. 1), i.e., near the deepest part of the estuarine channel (see Garel and Ferreira, 2013). The acoustic signal is emitted at 1 Hz from three beams slanted 25° off the vertical and equally spaced at 120°. The ADP includes a high-resolution pressure sensor as well as compass/tilt and temperature sensors. The distance between the base of the concrete block and the top of the ADP is ∼ 1 m. The three velocity components (east, north, and vertical) of the flow are measured in 10 cells, each 0.8 m thick (multi-cell data, hereafter), along the water column. In addition, depth-integrated velocities are measured in a cell (main-cell data, hereafter) whose vertical extent is adjusted automatically near the surface based on the pressure records.
Both the main-cell and multi-cell measurements start at 0.8 m above the instrument, and therefore 1.8 m above the bottom; however, this distance varied through time owing to changes in bed elevation, in particular the burying and tilting of the mooring structure with the episodic passage of sand dunes (Garel and Ferreira, 2011; Lobo et al., 2004; Morales et al., 2006). ADP ensembles are averages of 5 to 15 min measurement periods (depending on the time series considered), collected at 15 min intervals.

Probe data
The SIMPATICO station was pulled out of the water for complete maintenance after a major system failure related to large floods, producing a large gap in the time series data from 15 February 2010 to 26 January 2012. Episodic probe faults produced additional shorter data gaps of at most 1 month in duration, except from 26 November 2008 to 9 February 2009 (74 days) and from 7 May to 22 September 2009 (138 days; see Fig. 2). Data gaps occurred especially within the 2008-2010 period, while the 2012-2013 period is more complete owing to the improvement of the anti-fouling system (see Garel and Ferreira, 2011) and the higher frequency of probe maintenance operations. Probe maintenance (i.e., cleaning and sensor calibration) was performed depending on boat and staff availability and on weather conditions (Table 2). Following the manufacturer recommendations, a 1- or 2-point calibration (depending on [...] 1-point calibration in a water-saturated air environment; and, for chlorophyll, 1-point calibration with a chlorophyll-free solution. The turbidity- and chlorophyll-free solutions were obtained from distilled water filtered at 0.22 µm. The probe raw data (temperature, pH, dissolved oxygen, turbidity, salinity and chlorophyll) were flagged as invalid, valid or ambiguous. Technical problems caused bad measurements that were invalidated considering the typical range of variations of all parameters according to the season or river discharge (Table 3). Apart from technical issues, the main concern in the continuous acquisition of valid data was biofouling (Garel and Ferreira, 2011). The development of biofouling on the optical sensors produces a characteristic spiky signal tending towards saturation. Likewise, biofouling within the conductivity cell induces a typical progressive decrease in the readings.
The dates of sensor cleaning and calibration (Table 2) were also considered, together with other probe parameters and external factors (such as river discharge or tidal phase and amplitude), to help distinguish between natural and biofouling-induced variations. This distinction was not clear for a few subsets of the pH time series data, which were flagged as ambiguous. Small shifts in intensity are observed in some cases before and after calibration (particularly for salinity and pH records) because of sensor calibration inaccuracy (the quality of each calibration depends on the choice of buffers and procedures, as well as on operator skill). Similarly, inaccurate calibration of the turbidity sensor may produce negative values when the concentration of suspended material is close to zero. The data presenting such calibration shifts and negative values were regarded as valid, except for those presenting pronounced shifts (e.g., > 1 PSU for salinity and > 0.1 for pH) or unusually high values (e.g., > 37 PSU for salinity and > 8.3 for pH), which were flagged as ambiguous. Finally, data spikes were removed through comparison of each value with its three-point moving average, considering threshold values of 5 °C for temperature, 5 PSU for salinity, 10 for pH, 0.2 µg L⁻¹ for chlorophyll, 40 NTU for turbidity and 100 mg L⁻¹ for dissolved oxygen.
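The spike detection step can be illustrated with a minimal sketch. It assumes that the moving average is centred on the tested record and includes that record, which the text does not specify; the function name and the sample values are hypothetical.

```python
def flag_spikes(values, threshold):
    """Return a list of booleans, True where a value departs from its
    centred three-point moving average by more than `threshold`.

    Assumption: the average is centred and includes the tested value.
    The first and last records cannot be tested and are left unflagged.
    """
    flags = [False] * len(values)
    for i in range(1, len(values) - 1):
        window = values[i - 1:i + 2]
        if any(v is None for v in window):
            continue  # skip windows that contain missing records
        mean3 = sum(window) / 3.0
        if abs(values[i] - mean3) > threshold:
            flags[i] = True
    return flags


# Hypothetical hourly salinity records (PSU) with one isolated spike
salinity = [35.1, 35.0, 35.2, 26.0, 35.1, 35.0]
spikes = flag_spikes(salinity, threshold=5.0)  # 5 PSU threshold from the text
```

With a centred window, a very large spike also pulls the average of its two neighbours, so these may exceed the threshold as well; whether that behaviour matches the procedure applied to the data set is not stated in the text.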

ADP data
The ADP time series include the parameters displayed in Table 1. Sensor data include pressure (dbar), pitch and roll (°) and temperature (°C). Diagnostic parameters, which are helpful in assessing data quality, are beam noise, amplitude and strength, reported in counts (an internal logarithmic unit representing 0.43 dB). Velocity data (m s⁻¹) include the main-cell and multi-cell velocities, corrected for the magnetic declination (changing at a rate of 0°7′ E per year in the area). Standard deviations of the pressure, pitch, roll and both main-cell and multi-cell velocities are also reported.
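As an illustration of the declination correction, the sketch below rotates east and north velocity components from a magnetic to a true-north reference frame. It assumes that declination is counted positive eastward; the function name and the declination value used in the example are hypothetical and are not taken from the data set.

```python
import math


def magnetic_to_true(east_mag, north_mag, declination_deg):
    """Rotate velocity components from magnetic to true-north coordinates.

    Assumption: declination is positive when magnetic north lies east
    of true north.
    """
    d = math.radians(declination_deg)
    east_true = east_mag * math.cos(d) + north_mag * math.sin(d)
    north_true = -east_mag * math.sin(d) + north_mag * math.cos(d)
    return east_true, north_true


# Hypothetical example: 1 m/s towards magnetic north, declination of 2 degrees west
east, north = magnetic_to_true(0.0, 1.0, -2.0)
```

The rotation preserves the current speed and only changes its direction, so it can be applied record by record with the declination valid at the time of each measurement.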
The previously mentioned data gap related to system failure stretches from 4 January 2010 to 7 December 2011 for ADP data (Fig. 2). Defective connectors caused another large data gap between 12 December 2012 and 2 March 2013. Overall, three long time series of continuous data (except for a few minor gaps) are available, ranging from (1) [...] (Fig. 2).
Quality control of the ADP data was applied to each individual multi-cell and main-cell (depth-integrated) velocity. Threshold values were selected based on manufacturer recommendations, careful data inspection, and site knowledge (Table 4). A few temperature and pressure records were obviously wrong (e.g., temperature > 50 °C) and were removed from the raw time series. It was checked that the instrument tilt did not exceed 10°, an acceptable value for beam cells to align given the (shallow) mooring water depth. All raw velocity records were included in the data set and flagged based on the results of a quality control consisting first of the application of invalid data detection algorithms. This step was followed by a time consistency check in order to accept invalidated data that are close to values predicted from harmonic analyses of the valid velocity data. Flags indicate if velocities are valid, invalid or acceptable (for invalid but temporally consistent data); for multi-cell velocities, an additional flag indicates when the upper bins are out of the water or affected by reflection at the water surface. The quality control algorithm of the main-cell data includes a check of the beams' signal-to-noise ratio (SNR), of velocity standard deviations, and of the beams' signal strength. The SNR was computed as (Sontek/YSY, 2001):

[Eq. (1)]

The pressure sensor was clogged by sediment from 15 October to 12 December 2012, resulting in unrealistically large pressure values. Main-cell velocity samples depend only weakly on the upper bin measurements and were therefore not significantly affected; the velocity data associated with these bad pressure records were visually good and flagged as acceptable.
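The SNR check can be sketched as follows. This is only an illustration: it assumes that the SNR in dB is obtained from the difference between signal amplitude and noise level, both in counts, using the 0.43 dB per count conversion quoted above, which may differ from the exact form of Eq. (1). The function names and the threshold value are hypothetical and do not reproduce Table 4.

```python
DB_PER_COUNT = 0.43  # conversion between counts and dB quoted in the text


def snr_db(amplitude_counts, noise_counts):
    """Signal-to-noise ratio in dB (assumed form, see lead-in)."""
    return DB_PER_COUNT * (amplitude_counts - noise_counts)


def beams_pass(amplitudes, noises, min_snr_db):
    """True if every beam reaches the minimum SNR.

    `min_snr_db` is a hypothetical threshold, not a value from Table 4.
    """
    return all(snr_db(a, n) >= min_snr_db for a, n in zip(amplitudes, noises))


# Hypothetical three-beam record: the third beam has a weak return
example_snr = snr_db(120, 30)
ok = beams_pass([120, 118, 45], [30, 31, 30], min_snr_db=10.0)
```

A record failing such a test would be a candidate for the invalid flag, before the time consistency check is applied.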
For valid pressure records, it was verified that the upper limit of the sampling volume used to compute the depth-integrated velocities is not significantly lower (i.e., within 1 cell size) than that predicted (CP) based on the pressure (P) and its standard deviation (SD P), using (Sontek/YSY, 2001):

[Eq. (2)]

For multi-cell data, the detection algorithms considered the difference in SNR between the three beams and the velocity standard deviations (Table 4). In addition, the bin velocities out of the water or affected by interference at the surface boundary were identified based on pressure records (using Eq. 2) and signal amplitude (typically, the mean amplitude decreases as the signal propagates upward, but increases near the boundary owing to strong reflection that compromises the accuracy of the readings). Velocity records at the upper (10th) bin were almost always flagged as affected by boundary interference; note that these data were missing before May 2009, due to an inadequate pulse length (2 m, changed to 1 m with no effect on the main-cell velocities). Invalidated (main-cell and multi-cell) velocities were then compared with values obtained from an M2 fit to check for consistency. Harmonic analyses of the (largely predominant) north-velocity component were performed with the T-Tide software (Pawlowicz et al., 2002). Subsets of validated data about 2 months long, collected during low river discharge conditions (summer 2008, 2012 or 2013, depending on the year), were used for this analysis. In each case, the variance of the predicted data was > 0.98 of the variance of the observations, indicating that the model was able to reproduce the actual conditions satisfactorily. Previously invalidated velocities that fell within the range of observations during the period selected for the tidal analysis were accepted. The soundness of this time consistency check was verified by careful visual inspection.
Finally, spikes in the validated and accepted data were discarded using a 3-point moving average and a threshold difference of 0.15 m s⁻¹ between observed and averaged values.
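The principle of the M2 fit used in the time consistency check can be sketched with a single-constituent least-squares fit. This is not the T-Tide analysis applied to the data set: it is a simplified stand-in written in Python, with a hypothetical synthetic series, that fits a mean plus one M2 sinusoid and reports the ratio of predicted to observed variance.

```python
import math

M2_PERIOD_H = 12.4206012  # period of the M2 constituent, in hours


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def fit_m2(times_h, velocities):
    """Least-squares fit of v(t) = mean + a*cos(wt) + b*sin(wt).

    Returns the mean, the M2 amplitude and the predicted series.
    """
    w = 2.0 * math.pi / M2_PERIOD_H
    basis = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times_h]
    ata = [[sum(bi[i] * bi[j] for bi in basis) for j in range(3)] for i in range(3)]
    atb = [sum(bi[i] * v for bi, v in zip(basis, velocities)) for i in range(3)]
    mean, a, b = solve3(ata, atb)
    predicted = [mean + a * c + b * s for _, c, s in basis]
    return mean, math.hypot(a, b), predicted


def variance(x):
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / len(x)


# Synthetic north velocities: 0.8 m/s M2 signal sampled every 15 min for 10 days
times = [0.25 * k for k in range(4 * 24 * 10)]
obs = [0.05 + 0.8 * math.cos(2.0 * math.pi * t / M2_PERIOD_H - 1.0) for t in times]
mean, amplitude, pred = fit_m2(times, obs)
ratio = variance(pred) / variance(obs)
```

For real records the variance ratio would be lower than for this noise-free series, and a full harmonic analysis resolves many constituents rather than M2 alone.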

Useful ADP and probe data
Useful data are defined here as the validated and accepted ADP data, and the validated probe data (without ambiguous records). The total extent of useful probe and ADP records is displayed in Fig. 2. Overall, more than 99 % of the ADP samples were validated or accepted, representing a cumulative time series of about 3.8 years (Table 5). The probe was operating during a cumulative time of ∼ 3 years (25 649 hourly records). During this period, each probe sensor recorded more than 65 % of valid records (∼ 1.9 years of cumulative time), the highest success percentage being for temperature, and the lowest for salinity and pH (Table 5).
Earth Syst. Sci. Data, 7, 299-309, 2015, www.earth-syst-sci-data.net/7/299/2015/
The data from the six probe sensors are all valid during ∼ 18 % of the time with useful velocity data (Fig. 3). This percentage represents a cumulative time of about 6.5 months during which all parameters of the station were recorded successfully. This duration is limited by the rate of valid salinity and pH data. For example, considering four or fewer probe parameters, the time of useful combined ADP-probe data is at least 2 years (Fig. 3). However, it is important to note that ambiguous (pH or salinity) data can also be useful for data analysis, as exemplified in Sect. 4.2.

River discharge
To assist with the interpretation of the data sets and to enhance their potential use, river discharge data (m³ s⁻¹) at two hydrographic stations were compiled from March 2008 to April 2014. These data are freely available from the Portuguese Water Institute (INAG; presently APA, Portuguese Environment Agency) website (http://snirh.pt). The two stations, Pulo do Lobo (managed by INAG/APA) and Pedrogão (managed by EDP, Portugal Electricity), are located about 20 and 50 km from the estuary head, respectively, capturing the runoff from 90 % of the Guadiana river basin. The river discharge at Pulo do Lobo was converted from water level records using calibration discharge curves. The two data sets are complementary: the daily and nearly continuous time series from Pedrogão represents the discharge from the Alqueva dam; the hourly data from Pulo do Lobo station are patchy, in particular from 2010 onwards (due to the interruption of the station maintenance), but include moderate discharge events triggered by intense rainfall in the region (for an example, see Sect. 4).
[...], together with concurrent ADP main-cell velocities (north component) and river discharge records. The previously described shifts at times of sensor calibration are clearly visible in the pH and salinity signals (e.g., in May 2008, and February-March 2009, when the pH data shift is significant and data are flagged as ambiguous). Clogging of the pressure sensor in October-December 2013 did not affect the main-cell velocities, which were then flagged as accepted (Fig. 4b).
At this yearly scale, the expected seasonal temperature variations, inversely correlated with dissolved oxygen (DO) variations, are clearly observed (see also Garel and Ferreira, 2011). Pronounced and rapid temperature variations in summer are induced by the alternation of cold eastward upwelling jets and warm westward counter-currents that characterise the coastal circulation in this region (Relvas and Barton, 2002; Garel et al., 2015). The other parameters are mostly affected by flood events that occurred in January 2010 (up to ∼ 1400 m³ s⁻¹) and April 2013 (up to ∼ 2000 m³ s⁻¹), associated with a dampening of northward velocities and an increase in southward velocities (periods "iii" in Fig. 4). Other smaller discharge events recorded only at Pulo do Lobo, such as in November 2013, are due to intense rainfall events in the region, which occur in winter months only (with high inter-annual variability). Such events also have a clear effect on the probe parameters (in particular salinity, turbidity, chlorophyll and pH) and ADP velocities. This strong reactivity of the estuary to moderate increases in freshwater inflow in response to local rainfall is related to the pronounced shortage of soil and vegetation in the area, and to the (long, narrow and relatively shallow) morphology of the Guadiana estuary. This sensitivity to freshwater inflows also explains the strong variability observed in the salinity signal. In particular, when the discharge from Pedrogão is nearly zero (i.e., when water storage within the Alqueva reservoir is prioritised, such as before 2010 and during the drought year of 2012), the salinity varies from about 25 to 37 PSU (see periods "i" in Fig. 4; Table 3). By contrast, salinity variations are significantly larger when the discharge at Pedrogão is ∼ 50 m³ s⁻¹ (periods "ii" in Fig. 4), corresponding to the so-called environmental flow released from the Alqueva dam for sustaining ecosystem health (Dyson et al., 2008).
In February-June 2012, the progressive decrease in the salinity range corresponding to the transition from ecological flow to nearly zero discharge is clearly observed (Fig. 4b).

Tidal variability
Intra-tidal variability of the recorded parameters is well evidenced at a fortnightly timescale (Fig. 5). Larger turbidity values are also clearly observed at spring tide, in relation to the flooding of higher areas along the margins, and to sediment re-suspension by stronger currents. The near-surface (probe) and near-bed (ADP) temperatures are coherent, but show some small differences at the weakest neap tides (beginning of February and March). Likewise, these periods correspond to a decrease in salinity values. Note that salinity values for this subset are flagged as ambiguous (due to maximum values > 37 PSU), but display variations which are consistent with the estuarine hydrodynamics. Indeed, these temperature and salinity patterns are induced by fortnightly changes in the strength of vertical stratification. More precisely, the lower estuary is well-mixed at spring tide, with unidirectional seaward residual flows, and partly stratified at neap tide, displaying a typical 2-layer flow oriented seaward at the surface and landward near the bed (Garel et al., 2009b). The estuarine circulation at neaps is associated with fresher (hence lighter) water at the surface than at the bed, in agreement with the salinity and temperature observations reported in Fig. 5. Moreover, rivers in the region are cooler than the sea in winter, explaining the surface-bed temperature differences at neap tide when the estuary is partly stratified. This temperature contrast between sea and river can be used as a qualitative surrogate of vertical density differences, and thus of stratification (see Garel et al., 2013).

Data access
The data set (March 2008-April 2014) is deposited at PANGAEA in machine-readable format (tab-delimited text files). The data files (water quality data, current measurements and river discharge) are referred to with explicit file names and include extensive information (in the header) about the site, instrument, hardware, setup, and units. The probe time series data are organised into two data files separated by a large data gap in 2010-2011 related to major system failure: [...]
For probe data, the flagging convention is 0-invalid; 1-valid; 3-ambiguous. For ADP data, it is 0-invalid; 1-valid; 2-affected by surface boundary/out of water; 3-accepted. Missing data and time gaps are indicated with "-999".
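The flagging convention described above can be applied when reading the files, as in the following minimal sketch. The column names ("time", "salinity", "salinity_flag") and the sample records are hypothetical and do not reproduce the actual file layout; any descriptive header preceding the column names would need to be skipped before parsing.

```python
import csv
import io

MISSING = "-999"  # marker for missing data and time gaps
PROBE_FLAGS = {0: "invalid", 1: "valid", 3: "ambiguous"}


def read_flagged(handle, value_col, flag_col, keep_flags=(1,)):
    """Return (time, value) pairs whose flag is in `keep_flags`.

    Column names are passed in by the caller because the real file
    layout is not assumed here.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    records = []
    for row in reader:
        raw, flag = row[value_col], row[flag_col]
        if raw == MISSING or flag == MISSING:
            continue  # skip missing records
        if int(flag) in keep_flags:
            records.append((row["time"], float(raw)))
    return records


# Hypothetical tab-delimited extract with one record of each kind
sample = io.StringIO(
    "time\tsalinity\tsalinity_flag\n"
    "2012-02-01T00:00\t35.2\t1\n"
    "2012-02-01T01:00\t-999\t-999\n"
    "2012-02-01T02:00\t37.4\t3\n"
    "2012-02-01T03:00\t12.0\t0\n"
)
valid = read_flagged(sample, "salinity", "salinity_flag")
```

Passing `keep_flags=(1, 3)` would also retain the ambiguous records, which can remain informative for some analyses.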

Conclusions
This contribution presents flagged data from a current meter and a multi-parametric probe operating between 2008 and 2014 in the lower Guadiana Estuary, together with the concurrent freshwater discharge. Although the time series contain various extended periods with gaps (in particular the probe data in 2008-2010), the scientific interest of these data lies in the availability of physical and environmental observations of the estuarine conditions at high frequencies for prolonged periods and under contrasting external forcing conditions.
One example of application of the presented data set is to support the development of numerical (hydrodynamic and ecological) models. In particular, coupled hydro-ecological models are increasingly used in estuaries (Robson, 2014), but their performance is rarely evaluated against high-frequency data sets several months long. Such assessment using the present data set should be conducted for validation purposes only, as model calibration should be based on observations at various locations along the system (on request, the authors can provide measurements performed episodically at other locations in the estuary together with other complementary data useful for model implementation, such as bathymetric and surface sediment maps). Furthermore, the data set can be compared with the outputs of schematic models or realistic models at other sites to discuss processes.
The presented data set also allows the study of complex interacting hydrodynamic or ecological processes, especially when they require long-term observations at high frequency in order to be specified. For example, previous studies have shown that the vertical structure of the barotropic boundary layer can be distinguished from the two-layer exchange flow of the estuarine circulation to address the dynamics of residual currents at tidal and subtidal timescales (Garel and Ferreira, 2013;Stacey et al., 2001). Other data sets may support these studies, such as wind data, available from the Portuguese Sea and Atmosphere Institute (IPMA, http://www.ipma.pt). With strong flow regulation at the Guadiana Estuary, potential data usages also include studies of the effects of large dams, towards the development of best flow regulation practices.
Another example of potential data usage of global interest is the study of specific events that can significantly affect ecosystems in estuaries and along coastal margins. Floods, for example, have far-reaching consequences in terms of ecology, morphology and water management. Increased knowledge of flood hydrodynamics is crucial for the formulation of robust flood risk management strategies (as required for example by the EU-FLOOD Directive). Unlike larger systems that widen significantly near their mouth, narrow estuaries such as the Guadiana are affected along their entire length by moderate to high freshwater inflows (FitzGerald et al., 2002). Several of these events are included in the data set (Fig. 3). It is of interest, for example, to determine the capability of fish larvae to remain within the estuary during these events (e.g., Teodósio and Garel, 2015). Coastal upwellings are other examples of specific events, occurring typically between March and September along the Portuguese coast (Relvas and Barton, 2002), that are major drivers of primary and secondary productivity along continental margins (Washburn and McPhee-Shaw, 2013).
Finally, the data set provides the research community with an important data source for the study of hydrodynamic and eco-hydrodynamic processes acting in estuaries at intra-tidal to seasonal timescales. It also contributes to the development of comparative eco-hydrological science in an international context, such as the one promoted by the "Our Global Estuary" initiative (http://wordpress.fau.edu/oge/).

Author contributions. Ó. Ferreira developed and directed the monitoring activities at the Guadiana Estuary. E. Garel maintained the station, analysed and organized the data sets, and wrote the paper. Both authors discussed the results and commented on the manuscript.