The ESA GOME-Evolution “Climate” water vapor product: a homogenized time series of H2O columns from GOME, SCIAMACHY, and GOME-2

We present time series of the global distribution of water vapor columns over more than 2 decades based on measurements from the satellite instruments GOME, SCIAMACHY, and GOME-2 in the red spectral range. A particular focus is the consistency amongst the different sensors to avoid jumps from one instrument to another. This is achieved by applying robust and simple retrieval settings consistently. Potential systematic effects due to differences in ground pixel size are avoided by merging SCIAMACHY and GOME-2 observations to GOME spatial resolution, which also allows for a consistent treatment of cloud effects. In addition, the GOME-2 swath is reduced to that of GOME and SCIAMACHY to have consistent viewing geometries. Remaining systematic differences between the different sensors are investigated during overlap periods and are corrected for in the homogenized time series. The resulting “Climate” product v2.2 (https://doi.org/10.1594/WDCC/GOME-EVL_water_vapor_clim_v2.2) allows the study of the temporal evolution of water vapor over the last 20 years on a global scale.


Introduction
Water vapor is a key component for the Earth's climate as it is an important natural greenhouse gas and it drives cloud formation. Thus, for reliable climate modeling, understanding the H2O cycle and possible feedback mechanisms is crucial. The analysis of the temporal evolution or trends of measured H2O on a global scale is thus key for improving our knowledge of the Earth's climate system. International efforts are made to collect, improve, and assess available water vapor measurements, e.g., within the GEWEX Water Vapor Assessment (http://gewex-vap.org) by the WMO World Climate Research Programme.
Total column water vapor (TCWV) measurements can be made from radiosondes or from the analysis of ground-based GPS signals. Both techniques provide good coverage for, e.g., North America and Europe, where many ground stations exist, but only sparse coverage over, e.g., Central Africa or the oceans. Satellite measurements from microwave (MW) or infrared (IR) sensors, however, are primarily sensitive over ocean or land, respectively. In addition, radio occultation (RO) is an accurate method to determine water vapor concentrations in the upper troposphere and lower stratosphere regions and is a key contributor to numerical weather prediction.
Since the launch of GOME (see Table 1 for abbreviations and references) in 1995, spectral measurements of moderate resolution, including the red spectral range, have been available and have been continued by SCIAMACHY and GOME-2 up to now. These measurements allow the retrieval of TCWV (e.g., Noël et al., 1999; Wagner et al., 2003; Lang et al., 2003; Grossi et al., 2015) using differential optical absorption spectroscopy (DOAS) (Platt and Stutz, 2008), providing global coverage with similar sensitivity over both land and ocean. Thus, TCWV products from satellite observations in the red spectral range are a valuable complement to MW, IR, and RO water vapor products, which are sensitive only to specific surfaces or altitude ranges. TCWV products derived from GOME, SCIAMACHY, and GOME-2 have already been used to investigate the water vapor evolution over time on a global scale, e.g., the effects of El Niño (Wagner et al., 2005; Loyola et al., 2006) or trends (Wagner et al., 2006; Mieruch et al., 2008, 2011, 2014).
Published by Copernicus Publications.
The TCWV retrieval implemented in the operational GOME-2 data processor (GDP) (from version 4.7 on) has been developed by MPIC and DLR and is described in detail in Grossi et al. (2015). It is robust and almost independent of external data sets. Essentially, it is based on
a. a DOAS analysis, plus a simple correction for spectral saturation effects;
b. empirical air-mass factors (AMFs) based on the O2 absorption; and
c. a simple cloud masking, again based on O2 absorption.
These steps are briefly explained in Sect. 2; for further details see Grossi et al. (2015) and references therein. Within the ESA GOME-Evolution project, the "Climate" product has been developed, which provides monthly mean TCWV from July 1995 to December 2015 at 1° resolution. The goal of the Climate product is to provide a time series of TCWV from GOME, SCIAMACHY, and GOME-2 that is as consistent as possible. This consistency is achieved by (a) spatial merging of the smaller SCIAMACHY and GOME-2 pixels (60 and 80 km across track, respectively) to the GOME pixel width (320 km) and (b) limiting the broader GOME-2 swath (1920 km) to that of GOME and SCIAMACHY (960 km).
For the Climate product, the TCWV retrieval is in large part similar to Grossi et al. (2015), i.e., it requires almost no external data, but derives the information required for AMF correction and cloud masking directly from the spectral analysis. This allows for a consistent treatment of cloud effects for the different sensors, which would be difficult to achieve based on operational cloud products from different sensors and different algorithms. The resulting climate product is a valuable, independent data set for model evaluation, comparison to other water vapor products, and trend analyses.
The paper is organized as follows: in Sect. 2, the TCWV retrieval used in the GDP is briefly summarized, and modifications for the climate product are explained. In Sect. 3, the climate product is introduced: the spatial merging procedure is described in Sect. 3.3, the consistency across the different instruments during overlap periods is analyzed in Sect. 3.4, offset corrections yielding homogenized time series are introduced in Sect. 3.5, the need for an additional TCWV data field smoothed over ocean is justified in Sect. 3.6, and standard deviations and standard errors of the mean are discussed in Sect. 3.7. In Sect. 4, some specific properties of the Climate product are discussed. Section 5 summarizes the results of validation studies. Details of and a link to the final data product are given in the data availability section, followed by conclusions in Sect. 7.

TCWV retrieval
The retrieval of TCWV from satellite spectra for the climate product is based on the operational GDP TCWV retrieval described in Grossi et al. (2015). Below, we briefly summarize the single steps of the operational retrieval and point out where the climate algorithm differs. The particular operations for the climate product, i.e., the spatial resampling and the homogenization of time series across different satellite instruments, are described in Sect. 3.
The TCWV retrieval is generally kept simple, making it robust and almost independent of external data sets. The impact of some of the simplifications made below is discussed further in Sect. 4. Note, however, that all instruments are affected in the same way and thus trend analysis is not impaired.

Spectral analysis
Slant column densities (SCDs), i.e., concentrations integrated along the effective light path, are derived from the satellite spectra using DOAS (Platt and Stutz, 2008). The retrieval is performed in the red spectral range from 614 to 683 nm, including the O2 and H2O absorption bands at 630 and 650 nm, respectively. Within the spectral analysis, absorption spectra of H2O, O2, and O4 are accounted for. In addition, an inverse irradiance spectrum and a "Ring spectrum" are included, accounting for intensity offsets and Raman scattering, respectively. Furthermore, the spectral signatures from vegetation are considered by including the respective spectral structures deduced from the absorption of deciduous trees, conifers, and grass (Wagner et al., 2007). For SCIAMACHY, polarization correction spectra are included as well in order to account for its particularly strong polarization sensitivity. A polynomial of degree 4 is included in the fit.
Further details on and examples of the spectral retrieval can be found in Wagner and Mies (2011).

Correction of nonlinearity in spectral absorption
The spectrally finely structured absorption bands of water vapor are not resolved by the considered satellite instruments. Consequently, the relationship between the actual TCWV and the retrieved H2O SCD becomes nonlinear. The same holds for O2. This effect can be simply modeled based on synthetic spectra as described in Wagner et al. (2003, 2006) for H2O and O2, respectively. For the GDP and the climate retrieval, the H2O and O2 SCDs resulting from the DOAS analysis are corrected accordingly for nonlinearities in spectral absorption. This correction is also denoted as "saturation correction" in Wagner et al. (2003).
The slightly different spectral properties of SCIAMACHY and GOME-2 compared to GOME affect the saturation correction by less than 1 % for both H2O and O2 SCDs at low latitudes and midlatitudes. These effects are mostly canceled out by the application of the O2 AMF to H2O SCDs (see next section). Only at high latitudes (for high SZA), the impact on the O2 SCD can be up to 3 %. The respective effect on H2O introduced by the O2 AMF is very low in terms of absolute TCWV and corrected by the applied offset correction (see Sect. 3.5).

Air-mass factor
In passive DOAS applications, the derived SCD is usually converted into a vertical column density (VCD) by dividing it by the so-called air-mass factor. The AMF depends on viewing geometry and the vertical concentration profile of the trace gas of interest, and is usually determined by radiative transfer modeling. This is also the procedure used for the complementary GOME-Evolution "Advanced AMF Algorithm (A³)" product which is currently being developed by Wang et al. (2017). For the climate product, as for the GDP, however, we follow the approach proposed by Wagner et al. (2003) which takes the O2 AMF as proxy for the H2O AMF. As the O2 VCD is known, the O2 SCD resulting from the DOAS fit (and corrected for saturation effects) directly yields the O2 AMF. Temporal variations of the actual O2 VCD, driven by pressure and temperature, are neglected, as their impact on the retrieval is far smaller than other potentially systematic impacts of pressure and temperature variations, in particular on cloud conditions.

The climate product
In order to account for the systematic difference in the vertical profiles of O2 and H2O, a correction factor depending on SZA and ground albedo is applied, which is determined from radiative transfer calculations for standard atmosphere conditions (see Grossi et al., 2015, for details). The resulting H2O VCD shows a systematic scan-angle dependency, which is particularly strong over ocean, but small over land, as shown in Fig. 1 in Grossi et al. (2015). Note that, in contrast to the GDP, a scan-angle-dependent correction is not applied for the climate product for two reasons: (1) for the climate product, large scan angles (> 31°), which occur for GOME-2, are skipped (see next section), and (2) the scan-angle dependency is quite complex, i.e., depending on surface (land and ocean), SZA, cloud properties, etc., and the operational scan-angle correction is still imperfect, as the resulting VCDs reveal remaining scan-angle dependencies (Grossi et al., 2015). The impact of the scan-angle dependency on the climate product TCWV is further discussed in Sect. 3.6.
The H2O VCDs (in units of molec cm⁻²) directly correspond to TCWV (in units of kg m⁻²). In the text hereafter, we use the term TCWV (except for issues directly related to the spectral analysis, i.e., SCDs). In the figures, both units (for VCD and TCWV) are given.
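The O2-based AMF and the unit conversion between VCD and TCWV can be illustrated with a minimal sketch. The function names are hypothetical, and the way the profile correction factor enters (as a multiplicative scaling of the O2 AMF) is an assumption made for illustration only; the actual factor and its application are described in Grossi et al. (2015). The conversion itself follows from the molar mass of water and the Avogadro constant.

```python
AVOGADRO = 6.02214076e23   # molecules per mol
M_H2O = 18.01528e-3        # molar mass of water in kg per mol


def o2_amf(scd_o2, vcd_o2):
    """Empirical AMF: measured O2 SCD divided by the known O2 VCD."""
    return scd_o2 / vcd_o2


def h2o_vcd(scd_h2o, scd_o2, vcd_o2, profile_corr=1.0):
    """H2O VCD using the O2 AMF as a proxy for the H2O AMF.

    profile_corr is a hypothetical stand-in for the SZA- and
    albedo-dependent correction factor; scaling the O2 AMF with it
    is an assumption of this sketch.
    """
    return scd_h2o / (o2_amf(scd_o2, vcd_o2) * profile_corr)


def vcd_to_tcwv(vcd_molec_cm2):
    """Convert a column in molec cm^-2 to kg m^-2 (1e4 cm^2 per m^2)."""
    return vcd_molec_cm2 * 1e4 * M_H2O / AVOGADRO
```

With these definitions, a column of about 3.34 × 10²¹ molec cm⁻² corresponds to 1 kg m⁻².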

Cloud masking
Within the GDP algorithm, a simple cloud masking is performed based on the retrieved O2 SCD: as stated in Wagner et al. (2006), pixels with less than 80 % of the maximum O2 SCD (as a function of SZA) are masked as cloudy. For the climate product, we follow the same approach, whereby the maximum O2 SCD has been determined over the Pacific for each satellite instrument individually.
The simplified approach has some drawbacks: at altitudes above 2 km, pressure is reduced to less than 80 % of its sea-level value. Consequently, mountains above about this altitude (at GOME horizontal resolution) are generally skipped by the simple O2 cloud masking, while clouds below this altitude are kept. The advantage of the approach, however, is that it directly provides a simple but consistent treatment of cloud effects across the different satellite instruments (when spatially resampled), as O2 is derived simultaneously with H2O in the spectral analysis.
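The 80 % criterion can be sketched as follows. The reference curve of maximum O2 SCD versus SZA (ref_sza, ref_scd) is a hypothetical input here, and the linear interpolation between its nodes is an assumption of this sketch, not a documented detail of the retrieval.

```python
from bisect import bisect_left


def max_o2_scd(sza, ref_sza, ref_scd):
    """Linearly interpolate the reference maximum O2 SCD at a given SZA.

    ref_sza must be sorted in ascending order; values outside the
    tabulated range are clamped to the end points.
    """
    if sza <= ref_sza[0]:
        return ref_scd[0]
    if sza >= ref_sza[-1]:
        return ref_scd[-1]
    i = bisect_left(ref_sza, sza)  # ref_sza[i-1] < sza <= ref_sza[i]
    x0, x1 = ref_sza[i - 1], ref_sza[i]
    y0, y1 = ref_scd[i - 1], ref_scd[i]
    return y0 + (y1 - y0) * (sza - x0) / (x1 - x0)


def is_cloudy(scd_o2, sza, ref_sza, ref_scd, threshold=0.8):
    """Flag a pixel whose O2 SCD is below 80 % of the reference maximum."""
    return scd_o2 < threshold * max_o2_scd(sza, ref_sza, ref_scd)
```

Because the reference maximum is determined per instrument, the same function would be called with an instrument-specific reference curve.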

Gridding
The TCWV of the cloud-masked satellite pixels with SZA < 85° is gridded on a regular latitude-longitude grid with 1° resolution on a daily basis. Back scans as well as the ascending part of the orbit are skipped. The narrow swath mode (NSM), which is applied about three times (GOME) and once (GOME-2) a month, is discarded.
Subsequently, monthly means are calculated. As an example, Figure 1 shows the monthly mean TCWV from GOME measurements in June 1996.
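The two-step averaging (daily gridding, then monthly means of the daily grids) can be sketched as below. This is a simplified illustration: each pixel is assigned to the single 1° cell containing its center, which is an assumption of the sketch and ignores the true footprint of the ground pixels; the tuple layout of the input is hypothetical.

```python
import math
from collections import defaultdict


def grid_daily(pixels, sza_max=85.0):
    """Average one day of pixels onto a 1-degree grid.

    pixels: iterable of (lat, lon, sza, tcwv, cloudy) tuples.
    Returns {(lat_index, lon_index): daily mean TCWV}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, sza, tcwv, cloudy in pixels:
        if cloudy or sza >= sza_max:
            continue  # skip cloud-masked pixels and SZA >= 85 degrees
        cell = (math.floor(lat), math.floor(lon))
        sums[cell][0] += tcwv
        sums[cell][1] += 1
    return {cell: s / n for cell, (s, n) in sums.items()}


def monthly_mean(daily_grids):
    """Mean over the daily grids, per cell, using only days with data."""
    acc = defaultdict(list)
    for grid in daily_grids:
        for cell, value in grid.items():
            acc[cell].append(value)
    return {cell: sum(v) / len(v) for cell, v in acc.items()}
```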
The goal of the climate product is to provide a time series of TCWV that is as consistent as possible from observations of the satellite instruments GOME, SCIAMACHY, and GOME-2, covering a time period of more than 2 decades. As indicated in Table 1, the ground pixel size differs strongly between GOME and its successors. This has a direct impact on the spatial resolution of the resulting daily and monthly means, and in addition has more subtle consequences related to cloud masking, as the cloud statistics depend on pixel size (Krijger et al., 2007). Thus, for the climate product, "GOME-like" observations are generated from SCIAMACHY and GOME-2 by spatial resampling of SCIAMACHY and GOME-2 pixels to GOME size, and by reducing the GOME-2 swath to the swath of GOME and SCIAMACHY, as explained in detail in Sect. 3.3. The consistency between GOME and the resampled SCIAMACHY and GOME-2 time series is checked in Sect. 3.4. Homogenized time series are constructed by applying offset corrections to GOME and GOME-2 with respect to SCIAMACHY (Sect. 3.5). In Sect. 3.6, an additional product is introduced where monthly mean TCWV is slightly smoothed over ocean in order to remove orbital patterns. Finally, monthly standard deviation and standard error of the mean are presented in Sect. 3.7.

Spatial resampling to GOME pixel size and swath
The spatial resolution of GOME is considerably coarser than that of SCIAMACHY and GOME-2 (Table 1). Thus, in order to construct consistent time series amongst instruments, individual SCIAMACHY and GOME-2 observations are merged down to GOME resolution.
The merging might be realized by co-adding the spectra of the respective satellite pixels. It is much easier, however, to use the existing H2O SCDs for SCIAMACHY and GOME-2 and determine the SCD of the merged pixels as the radiance-weighted mean of the individual SCDs. We have checked this simplification and found very high correlation (R = 0.99998) of the intensity-weighted mean SCD with the "true" merged SCD based on co-added spectra. The slope and intercept of a linear fit are 1.0010 and 0.036 kg m⁻², respectively. Thus we followed this simplified approach. The O2 SCDs, needed for AMF calculation and cloud masking, are merged likewise. The SZA of the merged pixel (needed for the AMF correction factor) is calculated as the mean of all SZAs of the original pixels. Afterwards, the TCWV retrieval steps described above (Sects. 2.2-2.5) are performed for the spatially downsampled SCDs.
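As a minimal sketch of the simplified merging, the SCD of a merged pixel can be computed as the radiance-weighted mean of the individual SCDs, with the SZA taken as the plain mean. The function name and argument layout are hypothetical.

```python
def merge_pixels(scds, radiances, szas):
    """Merge several small pixels into one "GOME-like" pixel.

    scds:      SCDs of the individual pixels (H2O or O2)
    radiances: measured intensities used as weights
    szas:      solar zenith angles of the individual pixels
    Returns (radiance-weighted mean SCD, unweighted mean SZA).
    """
    total = sum(radiances)
    merged_scd = sum(s * r for s, r in zip(scds, radiances)) / total
    merged_sza = sum(szas) / len(szas)
    return merged_scd, merged_sza
```

The weighting by radiance mimics what co-adding the spectra would do: brighter (e.g., partly cloudy) pixels contribute more photons and therefore dominate the merged signal.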
The GOME swath in nominal mode is 960 km wide, corresponding to a scan-angle range of ±31°. The swath contains 3 "forescan" pixels of 320 km × 40 km (across × along track). Back scans as well as orbits with different scan patterns (like NSM) are skipped for the climate product.
For SCIAMACHY, one scan consists of 16 forward pixels with 60 km width. These pixels can only be approximately merged into 3 GOME-like pixels. For the sake of symmetry, we group the 5 westerly, 6 center, and 5 easterly pixels together. The grouping is based on the position of the scan mirror (ESM). Thereby, SCIAMACHY measurements with reduced integration time (corresponding to 30 km across track) are grouped consistently into 10 westerly, 12 center, and 10 easterly pixels. The small difference in along-track extent (30 km for SCIAMACHY versus 40 km for GOME) cannot easily be accounted for and is ignored hereafter.
For GOME-2, grouping is done based on the scan mirror angle as well. Four GOME-2 pixels at a time are merged, matching exactly the extent of 1 GOME pixel. After 8 July 2013, when GOME-2 on Metop-A is switched to "narrow" mode (not to be confused with NSM; the narrow mode still covers half of the original GOME-2 swath, thus matching the GOME swath), 8 GOME-2 pixels (with 40 km width each) are merged by the scan-angle selection. Pixels with scan angles > 31° are skipped such that the swath width of the merged GOME-2 pixels matches that of GOME. Note that for the illustration and discussion of the spatial resampling, results from SCIAMACHY and GOME-2 gained in original resolution are indicated by the subscript "orig", while the reduced (with respect to spatial resolution and swath) product is indicated by the subscript "rdcd". Afterwards (from Sect. 3.4.2 on), all SCIAMACHY and GOME-2 results are derived after spatial resampling at GOME resolution if not explicitly stated differently.
Earth Syst. Sci. Data, 10, 449-468, 2018, www.earth-syst-sci-data.net/10/449/2018/
Figure 2 illustrates the merging procedure for the example of 1 June 2009, when measurements from all three instruments are available over the Northern Atlantic. The subplots are arranged such that GOME is shown in the center (c) as the central reference. In (a) and (b), the original and merged ground pixels for SCIAMACHY are shown, which have a time difference of 29 min with respect to GOME. In (a), the grouping of the original SCIAMACHY pixels into "GOME-like" pixels is indicated by thick rectangles. In (b), the contours of the GOME orbit are added for better comparison to GOME. Similarly, the SCIAMACHY states are displayed in (c) for orientation.
The respective plots for GOME-2 are displayed in (d) and (e). Note that the orbital patterns of GOME-2 are shifted in longitude. Thus, a direct comparison to spatially coincident GOME measurements is not possible. The outermost 6 pixels of GOME-2 on both sides of the swath are skipped by the merging procedure, thereby reducing the swath from 1920 to 960 km for the merged pixels.
Figure 2 clearly illustrates the complex relation of spatial resolution and cloud masking, and suggests that the comparison between the merged SCIAMACHY pixels and GOME is far more meaningful than a comparison at the original SCIAMACHY resolution. In the next section, it is shown that on average the TCWV also agrees much better between GOME and SCIAMACHY if the latter is spatially merged to GOME resolution.
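The pixel grouping described above can be sketched with index-based groups. Note that the actual grouping is based on scan mirror angles; using pixel indices within a forward scan, and the ordering of indices across the swath, are simplifying assumptions of this sketch. The count of 24 GOME-2 forward pixels follows from the 1920 km swath and 80 km pixel width given in the text.

```python
def sciamachy_groups(n_pixels=16):
    """Split a SCIAMACHY forward scan into 3 GOME-like groups.

    16 pixels (60 km) are grouped as 5/6/5; 32 pixels (30 km,
    reduced integration time) are grouped as 10/12/10.
    """
    scale = n_pixels // 16
    bounds = [0, 5 * scale, 11 * scale, 16 * scale]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(3)]


def gome2_groups(n_pixels=24, skip=6, per_group=4):
    """Drop the outermost pixels on both sides and merge the rest.

    Defaults: 24 pixels of 80 km, 6 skipped per side, 4 per group,
    i.e. 3 merged pixels of 320 km covering a 960 km swath.
    """
    kept = list(range(skip, n_pixels - skip))
    return [kept[i:i + per_group] for i in range(0, len(kept), per_group)]
```

For the "narrow" mode after 8 July 2013 (40 km pixels over a 960 km swath), the same helper could be called as gome2_groups(24, skip=0, per_group=8).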

Comparison of different sensors
In this section, TCWV from the different sensors are compared during the available overlap periods. We refer differences to SCIAMACHY, as it serves as a link between the GOME and GOME-2 time series. For the comparison between GOME and SCIAMACHY (Sect. 3.4.1), the improved consistency gained by the adjustment of spatial resolution is clearly illustrated. The remaining systematic offsets between the different sensors are quantified. This will be used for the offset corrections in Sect. 3.5.

GOME versus SCIAMACHY
In Fig. 3, we compare the mean difference between GOME and SCIAMACHY TCWV for the overlap period in three different ways. Figure 3a shows the difference in the mean of monthly means, where SCIAMACHY data at original resolution are used. Here, for each data set all available measurements are considered. In contrast, in Fig. 3b the difference is determined from coincident measurements on orbital basis. This is possible as SCIAMACHY has the same orbital pattern as GOME with a time shift of half an hour. In Fig. 3c, the difference between GOME and coincident SCIAMACHY measurements with reduced resolution is shown.
The comparison of all available measurements for each instrument (Fig. 3a) shows large scatter, caused by the high variability of day-to-day atmospheric water vapor as well as clouds, and the different spatiotemporal sampling for both instruments (missing orbits and SCIAMACHY gaps due to limb measurements). In contrast, the comparison of coincident measurements only (Fig. 3b) shows much smoother patterns, but now also clearly reveals systematic differences down to −3 kg m⁻² in the tropics. Note that this is of similar magnitude to the "level shifts" which have been applied in Mieruch et al. (2008) (see Fig. 13 therein) for the determination of trends from combined GOME-SCIAMACHY measurements.
The systematic difference is largely reduced when SCIAMACHY observations are resampled at GOME resolution (Fig. 3c). This is further illustrated in Fig. 4, where zonal means of GOME and SCIAMACHY TCWV and their difference are shown as a function of latitude. Over ocean, the resampled SCIAMACHY TCWV agrees with GOME within ±0.5 kg m⁻², whereas the original SCIAMACHY TCWV is systematically higher by about 0.3 kg m⁻² for midlatitudes and high latitudes, up to 1.0 kg m⁻² around the Equator. Over land, good agreement is found between GOME and SCIAMACHY, except for in the tropics. Here, the merging of SCIAMACHY pixels halves the systematic difference from −1.0 kg m⁻² to −0.5 kg m⁻².

GOME-2 versus SCIAMACHY
Between GOME-2 and SCIAMACHY, a far longer overlap period is available (January 2007 until March 2012). However, in contrast to the comparison between GOME and SCIAMACHY, the selection of coincident measurements is not beneficial, since the orbital patterns of GOME-2 and SCIAMACHY are shifted in longitude with respect to each other, and the swath width of GOME-2 has been reduced for the merged pixels (see Fig. 2). Thus, "coincident" measurements (with respect to time) are only available for a subset of the orbit, with systematic differences of the respective scan angles of the two instruments.
Therefore, the mean difference of TCWV from GOME-2 and SCIAMACHY is calculated as the mean of monthly means of all available measurements (Fig. 5). Though the overlap period covers more than 5 years, the resulting difference is still noisy, due to the high spatiotemporal variability of H2O and clouds. In addition, it still reveals small but systematic orbital patterns, in particular over ocean. These, however, are not caused by individual orbits, but turned out to be a consequence of the GOME-2 NSM, which is periodically applied.
Over land, GOME-2 TCWV is higher than SCIAMACHY by up to 2 kg m⁻² locally over tropical rainforest. In the zonal mean, GOME-2 and SCIAMACHY agree within ±0.3 kg m⁻². Over ocean, the zonal mean difference is again close to zero at high latitudes, but goes down to about −1 kg m⁻² at the Equator.

GOME-2 versus GOME
GOME lost global coverage due to failure of the onboard tape recorder in June 2003 but continued measurements until July 2011. During that period, the measured spectra have been directly transmitted to an increasing number of ground stations, mostly in the Northern Hemisphere. This allows us to also directly compare GOME-2 and GOME, at least for selected regions. As for the comparison between GOME-2 and SCIAMACHY, coincidence is not demanded. Since the results over ocean are quite noisy again, we perform the comparison separately over land and ocean.
Figure 7a displays the mean difference of GOME-2 and GOME TCWV over land for regions with sufficient coverage. In Fig. 7b, we also derived an indirect comparison between GOME-2 and GOME via the respective differences to SCIAMACHY (i.e., between GOME-2 − SCIAMACHY and GOME − SCIAMACHY), for the same regional selection.
Figure 7 (caption fragment): Oceans and regions with poor GOME coverage are masked out. For comparison, (b) displays the indirect difference between GOME-2 and GOME, as derived from the difference between GOME-2 and SCIAMACHY (Fig. 5) minus the difference between GOME and SCIAMACHY (Fig. 3c) for the same spatial selection.
Figure 8 displays the zonal mean difference between GOME-2 and GOME over ocean, again determined both directly and indirectly.
Thus, though GOME lost global coverage in June 2003, the ongoing measurements still serve as a valuable consistency check and reveal that a direct comparison to GOME-2 yields basically the same results as the two-step comparison via SCIAMACHY.But due to the low spatial coverage, which is also changing over time, GOME measurements after June 2003 are not included in the merged time series.

Merged TCWV time series V
As shown in the previous section, the resampling of SCIAMACHY and GOME-2 pixels to GOME resolution and swath width substantially improves consistency across the different instruments. But still, the comparison of mean TCWV during overlap periods reveals systematic regional differences between the different instruments, in particular in the tropics. These differences might be partly related to instrument characteristics (like polarization sensitivity or spectral resolution), spatiotemporal sampling effects (Coldewey-Egbers et al., 2015), or the imperfect spatial merging of SCIAMACHY pixels to GOME pixel size. Most important, however, is probably the difference in local overpass times (see Table 1). This interpretation is supported by the finding that the offsets of GOME and GOME-2 with respect to SCIAMACHY, i.e., half an hour after and before, are almost mirrored (see Fig. 9). As shown in Diedrich et al. (2016), the change of TCWV between 09:30 and 10:30 LT is typically small (< 1 %, which still might account for about 0.2 to 0.3 kg m⁻² in the tropics). Additional systematic changes of the retrieved TCWV, however, can be easily caused by a systematic change of cloud conditions. The detailed effects of changing cloud fraction and height on the retrieval are complex as they affect both the cloud masking (Sect. 3.1) and the AMF (Sect. 2.3). In particular over dark surfaces like the tropical rainforest, even small changes related to clouds can have significant impact.
If such systematic differences between the instruments were not accounted for in the TCWV time series, discontinuities ("jumps") would occur (compare Mieruch et al., 2008) which impair the analysis of trends. For the climate product, the time series from GOME, SCIAMACHY, and GOME-2 are thus homogenized by applying offset corrections derived from the overlap periods. GOME and GOME-2 are corrected with respect to SCIAMACHY, as the latter serves as link between GOME and GOME-2 time series.
GOME is corrected by subtracting the offset derived during the overlap with SCIAMACHY (Fig. 3c) after applying slight spatial smoothing (see Appendix A for details). For GOME-2, the offset (Fig. 5) is smoothed likewise over land; over ocean, however, the slight smoothing is not sufficient to overcome the patchiness of the observed difference. Thus, the zonal mean TCWV difference is taken for all longitudes over ocean. The resulting offset corrections are displayed in Fig. 9.
The climate product provides a merged time series of monthly mean TCWV V covering the period July 1995 until December 2015. Herein, GOME and GOME-2 monthly means are corrected with respect to the offset determined from comparison to SCIAMACHY. During overlap periods, measurements from all available instruments are averaged. Due to the higher spatial coverage of GOME and GOME-2 [...]
Figure 10a displays the monthly mean TCWV V for September 2015 as an example. The time series of TCWV averaged over longitude are displayed in Fig. D1 (Appendix D).
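The homogenization for a single grid cell and month can be sketched as follows. The offsets are taken as "instrument minus SCIAMACHY" and subtracted, as described above; combining the instruments by a simple unweighted mean during overlap periods is an assumption of this sketch, as the exact averaging (e.g., any weighting by the number of measurements) is not specified here. All names are hypothetical.

```python
def homogenize(gome, scia, gome2, offset_gome, offset_gome2):
    """Merge monthly means of up to three instruments for one grid cell.

    gome, scia, gome2: monthly mean TCWV (kg m^-2) or None if missing.
    offset_gome, offset_gome2: mean differences to SCIAMACHY derived
    from the overlap periods (instrument minus SCIAMACHY).
    Returns the offset-corrected mean, or None if no data are available.
    """
    values = []
    if gome is not None:
        values.append(gome - offset_gome)
    if scia is not None:
        values.append(scia)  # SCIAMACHY is the reference, no correction
    if gome2 is not None:
        values.append(gome2 - offset_gome2)
    if not values:
        return None
    return sum(values) / len(values)  # unweighted mean (assumption)
```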

Smoothed TCWV over ocean V_smooth
As documented in Grossi et al. (2015), the TCWV gained from the GDP retrieval shows a dependency on scan angle, which results from systematic scan-angle dependencies of both H2O and O2 SCDs. The dependency is quite small over land, but strong over ocean (Grossi et al., 2015, Fig. 1 therein). In the GDP, an empirical post-correction is applied. In the climate product, however, no corrections of scan-angle dependencies are applied, as the large viewing angles of GOME-2 are skipped by reducing the swath width to that of GOME. In addition, the scan-angle dependencies also depend on further quantities like SZA, surface albedo, or cloud properties and are thus hard to correct for appropriately (see Grossi et al., 2015, for detailed discussion).
Within a monthly mean, the effects of scan-angle dependencies on TCWV are usually suppressed by averaging observations with different viewing geometries, but not completely removed. Consequently, monthly means reveal faint orbital patterns over ocean (see Fig. 10a).
In longer temporal averages, the scan-angle effects cancel out completely, as long as the spatial sampling with different scan angles is uniformly distributed. This is usually the case, as shown in detail in Appendix B, with two prominent exceptions:
- For GOME, systematic scan-angle biases occur around the calibration region over northern India, as locally measurements from the eastern or western swath pixels dominate (see Fig. B1a).
- For GOME-2, the narrow swath mode is applied regularly at the same geolocations. As the NSM is skipped in the Climate product, these regular gaps result in biased mean scan angles with systematic orbital patterns (see Fig. B1c). This is the reason for the small but systematic orbital patterns in the mean difference between GOME-2 and SCIAMACHY TCWV during overlap periods (Fig. 5). For the applied offset correction, these patterns are removed by taking the zonal mean over ocean for all longitudes (Fig. 9b).
In the climate product, a "warning flag" is provided indicating regions where the mean scan angle systematically deviates from 0 (see Appendix C). In addition, the mean scan angles for each instrument as shown in Fig. B1 are provided so that the user might check whether suspicious spatial patterns might be related to a scan-angle bias.
In order to avoid orbital artefacts caused by systematic scan-angle biases in the climate product, a second version of the climate TCWV time series, V_smooth ("TCWV smooth_ocean"), is added to the data product where monthly means are smoothed over ocean such that the orbital patterns are removed. Smoothing is applied over ocean only, as the scan-angle effects over land are generally negligible (except for the GOME calibration gap). Details of the applied smoothing are provided in Appendix A. The smoothed monthly mean TCWV V_smooth is shown in Fig. 10b for September 2015 as an example.
Note that the scan-angle effects discussed here are generally small: for instance, in September 2015 (as shown in Fig. 10), the difference between the unsmoothed V (where faint orbital patterns can be discerned) and the smoothed V_smooth is about 0.0 ± 1.2 kg m⁻² (mean ± SD) over ocean (excluding coastal regions). The corresponding relative differences (V − V_smooth)/V are 0.00 ± 0.05 (mean ± SD), i.e., typically within 5 %. For the mean of all months, the respective absolute and relative differences are as low as 0.0 ± 0.1 kg m⁻² and 0.00 ± 0.01, i.e., within 1 %. Nevertheless, as the effects are systematic, they can create artificial orbital patterns in trend analyses if ignored.
Thus we generally recommend using the TCWV smooth_ocean product V_smooth for trend analysis. For validation of the climate product or comparisons to other data products, we recommend using V_smooth as well, except for coastal regions where biases due to edge effects of the convolution with C_smooth have to be expected (note that this effect does not affect trend analyses). Here, the unsmoothed V should be used. The potentially affected coastal regions are specified by a "convolution flag" which is also provided in the data product and explained in Appendix C.

Standard deviation and standard error
In addition to monthly mean TCWV V, the standard deviation (SD) σ as well as the number of daily measurements (N) are determined per 1° × 1° pixel for each month and provided in the climate product. Both quantities are displayed for September 2015 as an example in Fig. 10c and d.
The monthly SD σ reflects the day-to-day variability of the water vapor column within a month and allows the magnitude of sampling effects to be assessed.
N generally ranges from 0 (when no measurement meets the 80 % criterion for O2) up to the number of days of the respective month, i.e., 28-31 (at high latitudes, where orbits overlap), when one instrument is available. During overlap periods of two instruments, it can be up to twice as large.
Note that V, the smoothed V, and σ are only provided for grid pixels with N ≥ 2.
With σ and N available, the standard error (SE) of the mean σ_M can be determined as
$$\sigma_M = \frac{\sigma}{\sqrt{N}}.$$
This reflects the statistical uncertainty of the estimated mean and can be considered as the precision of V. Figures D2 and D3 display the relative SD and SE (i.e., σ and σ_M divided by V) averaged over longitude as a function of latitude and time. The temporal pattern of σ is quite consistent over time and for the different satellite instruments. The SD is typically about 12, 28, and 35 % of the mean TCWV V for 0, 30, and 60° latitude, respectively.
The SE, however, reflects the change in the amount of available data N. It is highest during 2004-2006, when only SCIAMACHY measurements are available, and lowest during the overlap periods. For GOME and GOME-2, at the beginning and the end of the time series, σ_M is about 5, 10, and 10 % of the mean TCWV for 0, 30, and 60° latitude, respectively.
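The per-pixel statistics described in this section can be sketched as follows. This is a minimal illustration, not the product's processing code: the function name `monthly_statistics`, the array layout, and the use of the sample SD (`ddof=1`) are assumptions made here.

```python
import numpy as np

def monthly_statistics(daily, min_days=2, ddof=1):
    """Monthly mean, SD, N, and SE per grid pixel from daily TCWV maps.

    daily: float array (n_days, n_lat, n_lon); NaN marks days without a
    valid (cloud-free) measurement. ddof=1 (sample SD) is an assumption.
    """
    daily = np.asarray(daily, dtype=float)
    valid = np.isfinite(daily)
    n = valid.sum(axis=0)                      # number of days N per pixel
    n_safe = np.maximum(n, 1)                  # avoids division by zero
    mean = np.where(valid, daily, 0.0).sum(axis=0) / n_safe
    squares = np.where(valid, (daily - mean) ** 2, 0.0).sum(axis=0)
    sd = np.sqrt(squares / np.maximum(n - ddof, 1))
    se = sd / np.sqrt(n_safe)                  # sigma_M = sigma / sqrt(N)
    keep = n >= min_days                       # pixels with N < 2 are discarded
    mean, sd, se = (np.where(keep, x, np.nan) for x in (mean, sd, se))
    return mean, sd, n, se

# Tiny example: 3 days on a 1 x 2 grid; the second pixel has only one valid day.
daily = np.array([[[10.0, np.nan]],
                  [[14.0, np.nan]],
                  [[np.nan, 5.0]]])
mean, sd, n, se = monthly_statistics(daily)
```

In this example the first pixel keeps its mean and SE, while the second pixel (N = 1) is set to NaN for the mean, SD, and SE but still reports its N.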

Known issues
The climate product is optimized for consistent time series across different satellite instruments. It is thus based on a simple retrieval, merged pixels, and the reduced swath of GOME-2, at the cost of algorithm accuracy, spatial resolution, and spatial coverage. Below we list some aspects of the climate product that have to be kept in mind for data interpretation and comparison to other TCWV products.

Spatial resolution
GOME has a coarse across-track resolution of 320 km. For the climate product, SCIAMACHY and GOME-2 observations are also merged to GOME resolution. Thus, gradients in TCWV or in quantities affecting the AMF (like surface albedo, terrain height, or clouds) are not resolved but smeared out in the climate product. Systematic biases of the climate product TCWV are thus expected, e.g., for coastal sites, and in particular for mountainous islands (compare Van Malderen et al., 2014).
Earth Syst. Sci. Data, 10, 449-468, 2018, www.earth-syst-sci-data.net/10/449/2018/

Spatiotemporal sampling
Satellite measurements from low Earth orbits provide global coverage, but only a limited number of observations at a given location. For the calculation of "monthly means", spatiotemporal sampling is thus an important aspect (Coldewey-Egbers et al., 2015).
The climate product is based on satellite measurements performed around 10:00 local time. The GOME swath width of 960 km corresponds to global coverage within 3 days, i.e., at low latitudes, about 10 overpasses are available per month. The masking of cloudy measurements further reduces the number of days N on which TCWV measurements are available within a 1° × 1° pixel. Thus, the "monthly mean" is often determined from fewer than five snapshots on different days.
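The sampling numbers above can be checked with a rough back-of-the-envelope calculation. The sketch below is only illustrative: the number of orbits per day and the cloud-free fraction are assumptions made here and are not taken from the text.

```python
# Back-of-the-envelope sampling estimate for one low-latitude grid pixel.
EQUATOR_KM = 40075.0      # Earth's equatorial circumference
SWATH_KM = 960.0          # GOME swath width (from the text)
ORBITS_PER_DAY = 14.3     # assumption: typical for an orbit of about 100 min

# Each orbit contributes one daytime swath at the equator.
days_for_global_coverage = EQUATOR_KM / SWATH_KM / ORBITS_PER_DAY

def expected_valid_days(days_in_month=30, revisit_days=3, cloud_free_fraction=0.4):
    """Expected number of days N with a valid measurement.

    cloud_free_fraction is a hypothetical value used for illustration only.
    """
    overpasses = days_in_month / revisit_days   # about 10 per month
    return overpasses * cloud_free_fraction

print(round(days_for_global_coverage, 1))
print(expected_valid_days())
```

With these assumed numbers, global coverage takes close to 3 days, and a cloud-free fraction below one half leaves fewer than five valid days per month, in line with the statement above.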
Note that grid pixels with N < 2 are discarded, resulting in gaps in the climate product monthly means. This regularly happens, mostly around the ITCZ, in particular for SCIAMACHY due to the poorer spatial coverage resulting from the alternating nadir-limb mode.
The simple cloud flagging based on O2 SCDs (Sect. 3.1) also discards observations over high mountains, resulting in persistent gaps in the climate product over the Himalayas, the Andes, or Antarctica. An additional gap is introduced by GOME calibration measurements which are regularly performed north of India. The SD and SE (Sect. 3.7) reflect the statistical variability of water vapor and the precision of the monthly mean TCWV product. In addition, systematic effects (like the fixed local time of the measurements or the selection of cloud-free observations) have to be kept in mind when interpreting the climate data product.

Accuracy
The TCWV Climate algorithm applies a simple empirical AMF correction based on the observed O2 SCDs. The impact of the different vertical profiles of H2O and O2 is corrected for based on mean H2O profiles determined from an average lapse rate. For individual observations, actual AMFs might deviate considerably if the H2O profiles differ from the mean, especially if clouds are present. This might also affect monthly means in the case of systematic differences. However, the simple and robust settings allow for a consistent retrieval (including the treatment of clouds) across the different instruments.
In addition, the selection of cloud-free observations corresponds to generally drier atmospheric conditions, which likely results in low-biased means. This effect is unavoidable for water vapor retrievals from satellite measurements in the visible range, where clouded scenes have to be masked out.
Comparisons to independent measurements result in relative biases of typically −5 to −10 % for the total mean (see Sect. 5).

Validation
Within the ESA GOME-Evolution project, the Climate product has been validated in Danielczok and Schröder (2017) by comparison to TCWV from Global Navigation Satellite System measurements (GNSS, Wang et al., 2007, version 721.1) as well as from the Analysed RadioSoundings Archive (ARSA, version 2.7). The available GNSS and ARSA stations are mostly located over the Northern Hemisphere and do not cover open ocean. Thus, Grossi (2017) performed additional comparisons to TCWV from the European Centre for Medium Range Weather Forecasts (ECMWF) ERA-Interim reanalysis data set (Dee et al., 2011) as well as Special Sensor Microwave/Imager (SSM/I) and Special Sensor Microwave Imager Sounder (SSMIS) observations using the HOAPS 4.0 data record (Andersson et al., 2010, 2017). Below we briefly summarize the validation results concerning the accuracy and temporal stability of the climate product.
Note the following:
- Danielczok and Schröder (2017) and Grossi (2017) are both based on the climate product v2.01. The current version 2.2 presented here uses exactly the same TCWV algorithm, but a slightly different definition of the warning flag. In addition, June 1995 was included in v2.01, but has been skipped from v2.1 on due to the limited number of available GOME measurements, which results in a noisy monthly mean.

Accuracy
Figures 11 and 12 display scatter plots of monthly mean TCWV from GNSS and ARSA stations, respectively, compared to the climate product. TCWV from GNSS and ARSA show good correlation to the climate product. Mean biases are −1.0 and −1.9 kg m−2, respectively. If only station measurements around the satellite local overpass time are considered, biases are reduced to −0.7 and 0.2 kg m−2, respectively. The respective RMS values range between 4.3 and 6.1 kg m−2 (see Table 5-2 in Danielczok and Schröder, 2017 for details).
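For readers who want to reproduce such comparison statistics for their own collocations, a minimal sketch is given below. The sign convention (climate product minus reference) and the definition of RMS as the root mean square of the differences without bias removal are assumptions made here; the validation report may define them differently. The function name and the sample values are purely illustrative.

```python
import numpy as np

def bias_and_rms(product, reference):
    """Mean bias and RMS difference of collocated monthly means.

    Assumed conventions: bias = mean(product - reference); RMS is computed
    from the differences without removing the bias.
    """
    product = np.asarray(product, dtype=float)
    reference = np.asarray(reference, dtype=float)
    ok = np.isfinite(product) & np.isfinite(reference)   # skip gaps
    diff = product[ok] - reference[ok]
    return diff.mean(), np.sqrt((diff ** 2).mean())

# Illustrative values in kg m-2 (not real data); NaN marks a product gap.
product = [20.0, 30.0, np.nan, 10.0]
reference = [22.0, 31.0, 15.0, 10.0]
bias, rms = bias_and_rms(product, reference)
```

Under these conventions, a negative bias means that the climate product is drier than the reference.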
On smaller spatiotemporal scales (seasonal, regional), biases can be higher and can even exceed ±15 kg m−2 at low latitudes, in particular for coastal sites (probably related to the coarse spatial resolution of the climate product).
Overall, the observed biases are comparable to those that have been reported for the GOME-2 GDP 4.7 product in Grossi et al. (2015) and can be understood by the simplifications made in the climate product retrieval (compare Sect. 4.3).

Appendix A: Convolution kernels for spatial smoothing
Spatial smoothing is realized as normalized convolution (Knutsson and Westin, 1993) of monthly mean TCWV maps with a convolution kernel (CK) C on a regular 1° latitude-longitude grid. In contrast to basic matrix convolution, normalized convolution can be applied to matrices containing gaps and removes them (as long as the extent of the CK is larger than the gap).
For convolution, the grid is considered to be cyclic in longitude (i.e., smoothing across the dateline is done appropriately), but finite in latitude (i.e., no smoothing is applied across the poles).
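Normalized convolution on such a grid can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code used for the product: the function name is hypothetical, the boxcar kernel in the example is not one of the CKs of this study, and the kernel is applied without flipping (which is equivalent for symmetric kernels).

```python
import numpy as np

def normalized_convolution(field, kernel):
    """Normalized convolution of a lat-lon map containing NaN gaps.

    field:  (n_lat, n_lon) array, NaN marks missing values.
    kernel: (k_lat, k_lon) array of non-negative weights, odd sizes.
    Longitude is treated as cyclic, latitude as finite.
    """
    field = np.asarray(field, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_lat = field.shape[0]
    h_lat, h_lon = kernel.shape[0] // 2, kernel.shape[1] // 2
    valid = np.isfinite(field)
    data = np.where(valid, field, 0.0)
    weight = valid.astype(float)
    num = np.zeros_like(data)
    den = np.zeros_like(data)
    for i in range(kernel.shape[0]):
        d_lat = i - h_lat
        lo, hi = max(0, -d_lat), min(n_lat, n_lat - d_lat)
        if hi <= lo:
            continue                      # shift exceeds the latitude range
        for j in range(kernel.shape[1]):
            d_lon = j - h_lon
            w = kernel[i, j]
            # np.roll wraps around in longitude (cyclic across the dateline)
            d_shift = np.roll(data, -d_lon, axis=1)
            w_shift = np.roll(weight, -d_lon, axis=1)
            # rows beyond the poles simply do not contribute
            num[lo:hi] += w * d_shift[lo + d_lat:hi + d_lat]
            den[lo:hi] += w * w_shift[lo + d_lat:hi + d_lat]
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

# Example: one latitude row with a gap, smoothed with a 3-point zonal boxcar.
row = np.array([[1.0, 2.0, np.nan, 4.0, 5.0, 6.0]])
smoothed_row = normalized_convolution(row, np.ones((1, 3)))
```

Dividing by the convolved validity mask is what lets gaps be skipped and, where the kernel reaches valid neighbours, filled.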
Below we provide the CKs used for the smoothing of offset maps (Sect. 3.5) and for the smoothed climate product (Sect. 3.6).

A2 Smoothing of climate product
For the smoothed climate product, smoothing is applied primarily zonally in order to remove the artificial orbital patterns over ocean. For this task, the CK C_smooth is applied, which is 11° wide. C_smooth is only applied over ocean. Its impact is illustrated in Fig. 10a and b.
Note that the convolution with C_smooth is not used to fill gaps in order to avoid data entries at locations where actually no measurements are available; i.e., after normalized convolution, any originally missing value in the unsmoothed V is removed from the smoothed V as well.
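The ocean-only application and the restoration of gaps can be illustrated with a simplified zonal smoother. The sketch below makes several assumptions that are not taken from this study: it uses equal (boxcar) weights instead of the actual C_smooth, it smooths along longitude only, and it excludes land pixels from the ocean averages. The function name and the mask are hypothetical.

```python
import numpy as np

def smooth_over_ocean(v, is_ocean, width=11):
    """Zonal normalized boxcar smoothing over ocean without gap filling.

    v:        (n_lat, n_lon) monthly mean map, NaN marks gaps.
    is_ocean: boolean mask of the same shape (hypothetical input).
    width:    kernel width in grid pixels (11 for an 11 degree wide kernel).
    """
    v = np.asarray(v, dtype=float)
    half = width // 2
    use = np.isfinite(v) & is_ocean       # assumption: land does not enter
    data = np.where(use, v, 0.0)
    num = np.zeros_like(data)
    den = np.zeros_like(data)
    for shift in range(-half, half + 1):  # cyclic in longitude
        num += np.roll(data, shift, axis=1)
        den += np.roll(use.astype(float), shift, axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        smoothed = np.where(den > 0, num / den, np.nan)
    out = np.where(is_ocean, smoothed, v)          # land stays unsmoothed
    return np.where(np.isfinite(v), out, np.nan)   # original gaps stay gaps

# Example: one latitude row, a gap at index 3 and a land pixel at index 6.
v = np.arange(12, dtype=float).reshape(1, 12)
v[0, 3] = np.nan
is_ocean = np.ones((1, 12), dtype=bool)
is_ocean[0, 6] = False
v_smooth = smooth_over_ocean(v, is_ocean, width=3)
```

The last masking step is what keeps the smoothed map from containing values at locations without measurements.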
The smoothed V was introduced in version 2.1 of the climate product (see Table 3), which was the basis of the ESSD discussion paper. Within v2.1, monthly mean TCWV was first smoothed for each instrument separately before calculating the merged smoothed V. During this process, the contribution from GOME was accidentally skipped from June 2002 on; i.e., in version 2.1, the smoothed V is empty for June and July 2002, and based on SCIAMACHY measurements alone for the period August 2002 to June 2003. Within v2.2, convolution with C_smooth is applied to the final monthly mean V after merging the different instruments.

Appendix B: Mean scan angles
The retrieved TCWV of individual observations shows a scan-angle dependency (SAD), in particular over ocean, resulting from the scan-angle dependencies of both O2 and H2O SCDs (see Grossi et al., 2015 and Sect. 2.3). Within the climate product, the SAD is not explicitly accounted for in the daily TCWV, as a simple correction is not possible. However, SAD effects are reduced in monthly means as the orbital patterns and thus the viewing geometry change from day to day. In longer temporal averages, the effects cancel out completely as long as the mean scan angle is close to 0 (= nadir). Systematic biases of the mean scan angle, however, can cause small but systematic biases of the mean TCWV, in particular over ocean.
Figure B1 displays the mean scan angle (mean of monthly means) for the considered sensors, which are discussed below. Based on this, warning flags for the climate product are defined in the next section.

B1 GOME
For GOME, the mean scan angle is generally close to 0. But around the calibration region over northern India, large systematic biases are observed, as locally measurements from the eastern or western swath pixels dominate. Less pronounced scan-angle biases are observed for orbital fragments south of India and around 140-160° E.

B2 SCIAMACHY
For SCIAMACHY, the mean scan angle is close to 0 all over the world. A calibration gap as for GOME does not exist. Note that the overall average is slightly negative. This is caused by an asymmetry of the SCIAMACHY scan pattern ranging from −31 to +29° (see Table 3-3 in Gottwald et al., 2010). This is accounted for in the spatial merging of SCIAMACHY pixels to GOME resolution by adjusting the scan-angle thresholds. Consequently, any (small) bias between the different instruments potentially caused by the systematic negative SCIAMACHY scan angles is contained in the offsets determined during overlap periods.

B3 GOME-2
GOME-2 performs measurements in NSM periodically at the same geolocations (GOME-2 Factsheet, 2015). As NSM or- [...]

[...] to edge effects of the applied convolution. Note, however, that this does not affect trend analyses, as all instruments would be affected likewise by such edge effects. The convolution flag is displayed in Fig. C2.

Appendix D: Time series
Figures D1-D3 display the TCWV V, the relative SD σ/V, and the relative SE σ_M/V averaged over longitude as a function of time and latitude.

Figure 1. Sample monthly mean TCWV from GOME measurements in June 1996.

Figure 2. TCWV from the different satellite instruments in original and reduced resolution on 1 June 2009. White pixels are masked by the cloud flag as described in Sect. 3.1. (a) SCIAMACHY pixels in original resolution. The grouping into GOME-like pixels is indicated by thick black lines. Gaps along the orbit are caused by observations in limb mode. (b) SCIAMACHY pixels in reduced resolution. In magenta, the orbital pattern of GOME is displayed for comparison with (c). (c) GOME pixels. In green, the SCIAMACHY states are indicated for better comparison with (a) and (b). The time shift between SCIAMACHY and GOME is 29 min. (d) GOME-2 pixels in reduced resolution. In magenta, the orbital pattern of GOME is displayed for comparison with (c). The GOME-2 orbital patterns are shifted compared to GOME and SCIAMACHY. (e) GOME-2 pixels in original resolution and full GOME-2 swath. The grouping into GOME-like pixels is indicated by thick black lines.

Figure 3. Mean difference of GOME and SCIAMACHY TCWV during the overlap period August 2002 to June 2003 calculated as the mean of monthly means (a) or as mean of coincident measurements on orbital basis (b), (c). In (a) and (b), SCIAMACHY data are in original resolution. In (c), SCIAMACHY pixels are merged to GOME resolution.

Figure 4. (a) Zonal mean TCWV for GOME and SCIAMACHY (at original as well as reduced resolution) as a function of latitude. (b) Differences of zonal mean TCWV between GOME and SCIAMACHY at original (light) and reduced (dark) resolution, separately for land (orange) and ocean (blue).

Figure 6. Zonal mean of TCWV for GOME-2 and SCIAMACHY (a) and the respective differences, separately for land and ocean (b), as a function of latitude.

Figure 7. Mean difference of TCWV between GOME-2 and GOME after tape recorder failure during the overlap period January 2007 to February 2010 calculated as the mean of monthly means (a). Oceans and regions with poor GOME coverage are masked out. For comparison, (b) displays the indirect difference between GOME-2 and GOME, as derived from the difference between GOME-2 and SCIAMACHY (Fig. 3c) minus the difference between GOME and SCIAMACHY (Fig. 5) for the same spatial selection.

Figure 8. Zonal mean of direct and indirect TCWV differences between GOME-2 and GOME over ocean as a function of latitude.

Figure 10. Climate product maps for September 2015 of the TCWV V (a), the TCWV smoothed over ocean (smoothed V) (b), the standard deviation of monthly TCWV σ (c), and the number of available days N (d).

Figure 11. Scatter plot of TCWV monthly means of all available GNSS stations and the climate product. Figure from Danielczok and Schröder (2017).

Figure 12. Scatter plot of TCWV monthly means of all available ARSA stations and the climate product. Figure from Danielczok and Schröder (2017).

Figure 13. Time series of the relative difference between TCWV monthly means from the Climate product and GNSS stations. Only stations which were available for the whole time period are considered. Figure from Danielczok and Schröder (2017).

Figure C2. Convolution flag indicating regions where the smoothed V is likely biased due to edge effects of convolution.

Figure D1. TCWV V averaged over longitude as a function of time and latitude.

Table 1. Characteristics of the satellite instruments used in this study.
Time series of the relative difference between TCWV monthly means from the Climate product and ARSA stations.Only stations which were available for the whole time period are considered.