An internally consistent dataset of δ13C-DIC in the North Atlantic Ocean – NAC13v1

The stable carbon isotope composition of dissolved inorganic carbon (δ13C-DIC) can be used to quantify fluxes within the carbon system. For example, knowing the δ13C signature of the inorganic carbon pool can help in describing the amount of anthropogenic carbon in the water column. The measurements can also be used for evaluating modeled carbon fluxes, for making basin-wide estimates of anthropogenic carbon, and for studying seasonal and interannual variability or decadal trends in interior ocean biogeochemistry. For all these purposes, it is not only important to have a sufficient amount of data, but these data must also be internally consistent and of high quality. In this study, we present a δ13C-DIC dataset for the North Atlantic which has undergone secondary quality control. The data originate from oceanographic research cruises between 1981 and 2014. During a primary quality control step based on simple range tests, obviously bad data were flagged. In a second quality control step, biases between measurements from different cruises were quantified through a crossover analysis using nearby data of the respective cruises, and values of biased cruises were adjusted in the data product. The crossover analysis was possible for 24 of the 32 cruises in our dataset, and adjustments were applied to 11 cruises. The internal accuracy of this dataset is 0.017 ‰. The dataset is available via the Carbon Dioxide Information Analysis Center (CDIAC) at http://cdiac.ornl.gov/oceans/ndp_096/NAC13v1.html, doi:10.3334/CDIAC/OTG.NAC13v1.


Introduction
Stable carbon isotope ratios are utilized as a tracer in several applications in marine carbon research. Particularly the stable carbon isotope ratio of dissolved inorganic carbon (δ13C-DIC) can be used to enhance the understanding of carbon-related processes ranging widely from the estimation of glacial circulation changes (Curry and Oppo, 2005) to testing the performance of ecosystem models (Schmittner et al., 2013). By observing the temporal development of the lightening of the inorganic carbon pool due to the uptake of CO2 originating from the burning of 13C-depleted fossil fuel carbon, a phenomenon also known as the oceanic 13C Suess effect, an estimation of the anthropogenic carbon fraction of DIC is possible (Gruber et al., 2002; Körtzinger et al., 2003; Olsen et al., 2006; Olsen and Ninnemann, 2010; Quay et al., 2007; Racapé et al., 2013). Furthermore, δ13C can provide information concerning the quantification of biological processes such as net community production (Quay et al., 2009). Using the stable carbon isotope signature facilitates the distinction between anthropogenic, biological, and physical drivers of the carbon system (Gruber et al., 1998).
Published by Copernicus Publications. M. Becker et al.: An internally consistent dataset of δ13C-DIC in the North Atlantic Ocean – NAC13v1

A sample's stable carbon isotope ratio, δ13C-DIC, is expressed as per mill deviation from that of the commonly used standard material Vienna Pee Dee Belemnite (V-PDB) (Coplen, 1995):

$$\delta^{13}\mathrm{C} = \left( \frac{^{13}R_{\mathrm{sample}}}{^{13}R_{\mathrm{V\text{-}PDB}}} - 1 \right) \times 1000 \qquad (1)$$

with $^{13}R_{\mathrm{sample}}$ being the ratio of the two stable carbon isotopes 13C and 12C in the sample and $^{13}R_{\mathrm{V\text{-}PDB}}$ the corresponding ratio of the standard.

For basin-wide estimates of carbon fluxes due to primary production, studies of seasonal variations, or interannual trends, it is important to have a dataset of sufficiently high coverage both in space and time. Moreover, the dataset should be free of systematic differences between measurements carried out by different laboratories and on different cruises. However, neither criterion is easily met. Since isotope ratio mass spectrometry (IRMS), the common method to analyze δ13C-DIC, is a very time-consuming and expensive technique that cannot be performed at sea, data coverage has remained relatively poor. Therefore, several efforts have been made to assemble a dataset containing as many cruises as possible.
For oceanic δ13C-DIC data this was first done by Kroopnick (1985), who provided an analysis of the distribution of δ13C-DIC in the world's oceans. Over the years more data were accumulated and different data collections emerged (Gruber et al., 1999; Quay et al., 2003, 2007; Schmittner et al., 2013). During recent years, databases like GLODAP (Global Ocean Data Analysis Project) and CARINA (Carbon dioxide in the Atlantic Ocean) were created for carbon-related parameters (Olsen et al., 2016). These projects not only assembled the data but also conducted a secondary quality control (QC) so that systematic biases between individual cruises could be identified and adjusted for (Velo et al., 2009; Tanhua et al., 2010a; Pierrot et al., 2010). Relative to other parameters such as total alkalinity or DIC, however, the dataset for δ13C-DIC is still small and disorganized. Therefore, no secondary quality control, in which deep-water samples from different cruises at nearby locations (so-called "crossovers") are compared to each other, could be carried out within these collections. Several new cruises have become available for the North Atlantic so that now the present crossover study could be performed for this area. This crossover analysis features 29 cruises, of which 22 could be compared quantitatively. Cruises without a quantitatively evaluable crossover were qualitatively related to the corrected dataset.
Please note that, when applying the crossover and inversion routine, we assume that the deep water masses (below 1500 m) are influenced only negligibly by the increasing amount of anthropogenic carbon. Since the detected differences between some cruises were not consistent with a slowly increasing amount of anthropogenic carbon, we think that this consistent dataset is an important step for improving the study of carbon isotope dynamics in the upper 1500 m. In regions for which the deeper water masses have also been shown to contain a high amount of anthropogenic carbon, we treated crossovers with cruises that took place long before or long after the respective cruise with caution. We believe that no temporal trends have been removed or created by the secondary QC procedures employed here. However, care should be exercised when calculating the accumulation of anthropogenic carbon (C_ant) in water below 1500 m.

Figure 1. Map of all stations with δ13C-DIC data used in this dataset. Data from deeper than 1500 m were available only for the stations in dark red, so only these stations were used for the crossover analysis.

Data provenance and structure
This dataset comprises data and metadata from 32 research cruises/campaigns from several international research groups, in total 6820 samples. Some of these consist of multiple cruises and one is a time series. For the crossover analysis, some consecutive cruises whose data were analyzed together were treated as one cruise. While the focus is on the North Atlantic, four cruises were included that also have stations in the Nordic Seas, and one cruise extends into the South Atlantic. Thereby, consistency with future extended quality-controlled datasets for these regions is ensured. Since only deep (> 1500 m) samples of each cruise are compared in this study, only cruises with at least one deep station could be included in this analysis. Figure 1 shows the locations of all stations with δ13C-DIC data that are part of this compilation. Table 1 shows a summary of the respective cruise dates, the responsible principal investigator, and publications in which the data were used. For cruises that have not been published elsewhere, Table 2 shows the sample handling and time periods during which the samples were analyzed. Some cruises had δ13C-DIC measurements over the entire depth range at every station, whereas others just had one or two stations with deep δ13C-DIC data. Most of the cruises were conducted in the subpolar North Atlantic, while the tropical region has relatively poor coverage. The temporal and latitudinal distributions of the data are displayed in Fig. 2. The data were collected in the North Atlantic between 1981 and 2014, with the majority falling between 1990 and 2005. Considering the seasonal distribution of the data, a bias towards summertime exists, especially towards late summer. The only two cruises that took place between January and March were located south of 42° N.
The uncertainty of the δ13C-DIC samples analyzed by IRMS is usually reported to be between ±0.12 ‰ (Gruber et al., 1999) and ±0.03 ‰.
The presented dataset consists of 19 columns, of which the first 16 are cruise number, station, day, month, year, latitude, longitude, maximal depth, maximal sampling depth, bottle number, cast number, temperature, salinity, depth, CTD (conductivity, temperature, and depth) salinity, and pressure. Column 17 contains the adjusted δ13C-DIC data, column 18 a quality flag (C13f), and column 19 the QC flag (C13qc; see Table 3). For bad data the quality flag was set to "not measured", and therefore column 18 has only two entries (2: good; 9: not measured). Cruises that could be quantitatively compared to each other by the secondary QC have a "1" in the QC flag. All others are flagged with "0".
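As an illustration of how the two flag columns might be used, the following minimal sketch selects only good δ13C-DIC values from cruises that passed the secondary QC. It assumes a comma-separated file without a header line; the column names in `COLUMNS` and the function name `load_good_samples` are hypothetical and chosen only to mirror the column order described above.

```python
import csv
import io

# Column order as described in the text; the names themselves are hypothetical.
COLUMNS = [
    "cruise", "station", "day", "month", "year", "latitude", "longitude",
    "max_depth", "max_sampling_depth", "bottle", "cast", "temperature",
    "salinity", "depth", "ctd_salinity", "pressure",
    "c13", "c13f", "c13qc",
]


def load_good_samples(text, require_secondary_qc=True):
    """Return rows whose delta13C quality flag is 2 (good).

    If require_secondary_qc is True, keep only rows from cruises that were
    quantitatively compared in the secondary QC (C13qc == 1).
    Assumes comma-separated records without a header line.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) != len(COLUMNS):
            continue  # skip empty or malformed lines
        row = dict(zip(COLUMNS, record))
        if int(row["c13f"]) != 2:
            continue  # 9 means "not measured"
        if require_secondary_qc and int(row["c13qc"]) != 1:
            continue
        row["c13"] = float(row["c13"])
        rows.append(row)
    return rows
```

Passing `require_secondary_qc=False` keeps good values from all cruises, including those flagged "0" that could only be related qualitatively to the adjusted dataset.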
Additional parameters to most of the cruises can be found in either GLODAPv2 (Olsen et al., 2016) or CARINA. Only the most recent cruises are not included in these datasets, but the individual cruise files can be found at the Carbon Dioxide Information Analysis Center (CDIAC) website data are only part of CARINA. The respective cruise numbers in GLODAPv2 and CARINA of the cruises shown in the NAC13v1 dataset can be found in the documentation.

Computational analysis
In order to derive an internally consistent set of δ13C-DIC data in the North Atlantic, all publicly available data in this area were assembled and quality-controlled in two steps. At first, a primary QC was performed in order to identify obviously erroneous data, such as wrong positions, time stamps, and depths. Outliers were also identified and then flagged by comparing the δ13C profiles of each cruise internally. After that, the secondary QC procedure was conducted employing a running crossover analysis as described by Tanhua et al. (2010b). This MATLAB-based software package compares two cruises at a time; searches for nearby stations, so-called crossovers; and calculates differences between all crossovers of the two cruises as additive offsets with the unit ‰. As a criterion for identifying crossovers, a maximum of 180 nm (3° of latitude) distance between stations was used. From these crossovers, the δ13C-DIC data collected deeper than 1500 m were compared on equal potential density. Based on the resulting offsets and standard deviations determined for each of these crossovers, a suggestion for a possible adjustment was made. This suggestion was obtained by an inversion routine using a weighted least-squares (WLSQ) and a weighted damped least-squares (WDLSQ) model as described by Johnson et al. (2001). Cruise 33MW19930704-1 was analyzed by a reputable laboratory, has relatively low scatter, and covers wide distances. Therefore this cruise was selected as the core cruise and weighted higher than the other cruises. Unfortunately this was the only cruise to meet these criteria. Several cruises from different years were in good agreement with the core cruise, while the other cruises were adjusted towards it.
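The station-matching step of such a crossover search can be sketched as follows. This is not the MATLAB toolbox of Tanhua et al. (2010b); it is a minimal illustration that assumes each station is a dictionary with the hypothetical keys `id`, `lat`, `lon`, and `max_sample_depth`, and it only finds candidate station pairs. The comparison on potential density surfaces is not shown.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two positions in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def find_crossovers(stations_a, stations_b, max_nm=180.0, min_depth=1500.0):
    """Pair stations of two cruises that lie within max_nm of each other.

    Only stations with samples deeper than min_depth are considered,
    because only deep data enter the crossover comparison.
    """
    pairs = []
    for a in stations_a:
        if a["max_sample_depth"] <= min_depth:
            continue
        for b in stations_b:
            if b["max_sample_depth"] <= min_depth:
                continue
            if distance_nm(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_nm:
                pairs.append((a["id"], b["id"]))
    return pairs
```

Setting `max_nm=120.0` reproduces the stricter distance criterion, and `min_depth=2000.0` the stricter depth limit that can be useful in highly variable regions.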
Choosing the appropriate distance criterion for crossover locations is always a compromise between including as many statistically relevant crossovers as possible by selecting a large enough radius on the one hand and trying to have only crossovers between stations that share similar oceanographic characteristics on the other hand. Reducing the crossover distance to 120 nm, which is the distance commonly used in the CARINA, PACIFICA, and GLODAPv2 data products, reduced the number of crossovers and the number of cruises that could be quantitatively compared to each other but did not significantly change the suggested magnitude of adjustments of the remaining cruises. Therefore, the 3° × 3° criterion was retained. For some crossovers in highly variable regions with deep-water formation, such as the Labrador Sea and the Nordic Seas, the standard deviation of the offset between two cruises was decreased significantly by restricting the comparison depths to > 2000 m. Generally, offsets from crossovers in these highly variable regions, from cruises with a relatively poor data precision or with just a few deep samples, were given less influence in the model by weighting the offsets with their uncertainty. In Fig. 3 all crossovers between the cruises 06MT20030723 and 33MW19930704-1 are shown as an example, both for the uncorrected as well as for the corrected dataset. All crossovers from the adjusted and the unadjusted dataset can be found at http://cdiac.ornl.gov/oceans/ndp_096/NAC13v1.html.
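The idea of weighting each crossover offset by its uncertainty in a damped least-squares inversion can be illustrated with a short sketch. It is not the routine of Johnson et al. (2001); it assumes that each crossover offset is the bias of cruise i minus the bias of cruise j, uses weights of 1/σ², and adds a damping term so that the system has a unique solution. The function names and the default damping value are hypothetical, and the extra weight given to the core cruise is not modeled.

```python
def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def invert_biases(n_cruises, crossovers, damping=0.1):
    """Estimate one additive bias per cruise from pairwise crossover offsets.

    crossovers: list of (i, j, offset, sigma), where offset is the mean
    difference "cruise i minus cruise j" and sigma its standard deviation.
    Minimises sum(((b[i] - b[j] - offset) / sigma)**2) + damping**2 * sum(b**2)
    by solving the damped normal equations.
    """
    A = [[0.0] * n_cruises for _ in range(n_cruises)]
    y = [0.0] * n_cruises
    for i, j, offset, sigma in crossovers:
        w = 1.0 / sigma ** 2  # uncertain crossovers get less influence
        A[i][i] += w
        A[j][j] += w
        A[i][j] -= w
        A[j][i] -= w
        y[i] += w * offset
        y[j] -= w * offset
    for k in range(n_cruises):
        A[k][k] += damping ** 2
    return solve_linear(A, y)
```

The suggested correction for a cruise is the negative of its estimated bias relative to the reference; because only differences between cruises are constrained by the crossovers, the biases should be interpreted relative to a chosen reference such as the core cruise.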
Whether an adjustment was applied to the data was decided somewhat subjectively in each case based on a combination of the shape and distribution of individual crossover differences and the suggestions given by the inversion routine, together with knowledge about the sampling region. After applying the adjustments, the inversion was conducted again and it was checked whether or not the adjustment improved the overall consistency within the entire dataset. Temporal changes of the deep water masses were only considered in this step of the routine when comparing the suggested corrections and the corresponding crossover offsets between cruises in areas where the deep-water δ13C-DIC was also expected to change over time. In order to get a quantitative description of the internal consistency of the final dataset, a weighted mean using the respective offsets of all crossovers and their standard deviation was calculated (Tanhua et al., 2010a).
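$$\mathrm{WM} = \frac{\sum_{i=1}^{L} D(i)/\sigma(i)^{2}}{\sum_{i=1}^{L} 1/\sigma(i)^{2}} \qquad (2)$$

where L refers to the total number of crossovers, D(i) refers to the offset of crossover i, and σ(i) is its standard deviation.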
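A weighted mean of this kind can be computed in a few lines. The sketch below assumes inverse-variance weights (1/σ²) for each crossover offset; the function name is hypothetical, and whether signed or absolute offsets are passed in is left to the caller.

```python
def weighted_mean_offset(offsets, sigmas):
    """Inverse-variance weighted mean of crossover offsets (in per mil).

    offsets: one offset per crossover.
    sigmas: the standard deviation of each offset (must be non-zero).
    """
    if len(offsets) != len(sigmas) or not offsets:
        raise ValueError("offsets and sigmas must be non-empty and of equal length")
    numerator = sum(d / s ** 2 for d, s in zip(offsets, sigmas))
    denominator = sum(1.0 / s ** 2 for s in sigmas)
    return numerator / denominator
```

Crossovers with a small standard deviation dominate the result, so a single noisy crossover in a highly variable region has little effect on the consistency estimate.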
Another method for revealing systematic deviations between different cruises is a regional multi-linear regression (MLR) (Wanninkhof et al., 2003; Jutterström et al., 2010). In this work, an MLR based on core cruise data (deeper than 1500 m) was used to verify the suggested corrections, which resulted from the crossover analysis. Moreover, some cruises without a statistically evaluable crossover could now be related to the other cruises. The following equation was used:

$$\delta^{13}\text{C-DIC}_{\mathrm{MLR}} = a + b\,S + c\,\theta + d\,\mathrm{DIC} \qquad (3)$$

with δ13C-DIC_MLR being the calculated δ13C-DIC, S the salinity, θ the potential temperature in °C, DIC the DIC concentration in µmol kg−1, and a, b, c, and d the regression coefficients. The DIC concentration was chosen because it is strongly related to changes in the isotope composition, and DIC data were available for most cruises. Adding more parameters to the MLR, such as apparent oxygen utilization (AOU) or nutrient concentrations, did not improve the agreement between δ13C-DIC and δ13C-DIC_MLR of the core cruise and reduced the number of cruises that could be compared via the MLR analysis. The limitation of this method is, of course, that the further away in space and time the cruises are from the core cruise, the more likely it is that an observed offset is real. Especially the cruises reaching into the Nordic Seas show significant deviations, which are most likely real differences between the basins. Therefore, the offsets revealed by the MLR analysis were not taken into account for these cruises.
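The MLR check can be sketched as an ordinary least-squares fit on the deep core cruise samples, followed by a comparison of measured and predicted values for another cruise. The sketch below assumes a linear model with an intercept and the three predictors named in the text (salinity, potential temperature, DIC); the function names and the tuple layout of the samples are hypothetical.

```python
def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_mlr(samples):
    """Fit d13C = a + b*S + c*theta + d*DIC by ordinary least squares.

    samples: list of (S, theta, DIC, d13C) tuples from the core cruise.
    Predictors are centred before solving the normal equations to keep them
    well conditioned; the returned coefficients refer to the uncentred model.
    """
    n = len(samples)
    means = [sum(row[k] for row in samples) / n for k in range(3)]
    X = [[1.0] + [row[k] - means[k] for k in range(3)] for row in samples]
    y = [row[3] for row in samples]
    XtX = [[sum(r[p] * r[q] for r in X) for q in range(4)] for p in range(4)]
    Xty = [sum(r[p] * v for r, v in zip(X, y)) for p in range(4)]
    a0, b, c, d = solve_linear(XtX, Xty)
    a = a0 - b * means[0] - c * means[1] - d * means[2]
    return a, b, c, d


def mlr_offset(coeffs, samples):
    """Mean of measured minus predicted d13C for the samples of one cruise."""
    a, b, c, d = coeffs
    residuals = [v - (a + b * s + c * t + d * dic) for s, t, dic, v in samples]
    return sum(residuals) / len(residuals)
```

A positive value returned by `mlr_offset` means that the cruise reads higher than the core cruise regression predicts. As noted above, such an offset is only meaningful for cruises close enough in space and time to the core cruise that the regression can be expected to hold.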

Adjustments
The data of all cruises as well as locations are shown in Fig. 4. The offsets as well as the corrections suggested by the WDLSQ inversion routine, the MLR analysis, and the final adjustments are listed in Table 4. In Fig. 5 the results of the WDLSQ inversion are shown before and after the adjustments were applied. Some cruises show fairly large deviations from the core cruise. However, we do not know the reason for these biases. Besides the actual sample analysis in the laboratory, different sampling routines on board the ship, insufficient poisoning, and the sample storage time can also cause these biases. For most cruises that took place in the North Atlantic, the offsets revealed by the MLR analysis were on the same order of magnitude as the correction suggested by the crossover inversion routine. Cruises reaching far into the Nordic Seas or the South Atlantic show large differences, which are caused by different water mass properties in these areas. A detailed overview of the offset of each crossover in the original as well as the adjusted dataset is given in Table 5 in the Supplement. The evidence for each decision is presented below, cruise by cruise.

Figure 3. Crossovers between cruise 06MT20030723 (blue dots and lines) and the core cruise 33MW19930704-1 (red crosses and lines). The C13 plots show the data and mean profiles of each cruise, and the difference plots show the difference profiles with their standard deviation (black lines) as well as the crossover offset with its standard deviation (red lines). The left-hand plot shows the original, and the right-hand plot the adjusted data. In both cases the distribution of the δ13C-DIC on equal density surfaces (left-hand side) as well as the mean offset between both cruises (right-hand side) is shown. Cruise 06MT20030723 was adjusted by −0.15 ‰.
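Because the offsets are additive, applying an adjustment amounts to adding one constant per cruise to every δ13C-DIC value of that cruise. The following minimal sketch assumes samples are given as (cruise identifier, value) pairs and that missing values are represented by `None`; the function name and data layout are hypothetical.

```python
def apply_adjustments(samples, adjustments):
    """Add a per-cruise additive adjustment (in per mil) to delta13C values.

    samples: list of (cruise_id, value) pairs; value may be None if missing.
    adjustments: dict mapping cruise_id to its adjustment; cruises that are
    not listed are left unchanged.
    """
    adjusted = []
    for cruise_id, value in samples:
        if value is not None:
            value = value + adjustments.get(cruise_id, 0.0)
        adjusted.append((cruise_id, value))
    return adjusted
```

For example, `apply_adjustments(samples, {"06MT20030723": -0.15})` lowers every value of cruise 06MT20030723 by 0.15 ‰ and leaves the core cruise untouched.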

06MT19941012, cruise no. 1
This cruise on the German R/V Meteor is also known as M30-2. The inversion suggested a correction of −0.07 ‰. The mean offset of all crossovers is 0.11 ‰ too high. The MLR analysis revealed a smaller offset of 0.05 ‰, and thus the cruise was adjusted by −0.07 ‰.

06MT19970515, 06MT19970611, and 06MT19970707, here referred to as 06MT1997-M39, cruise no. 2
These cruises are also known as M39 cruises with three legs of δ13C-DIC sampling (M39-2, M39-3, M39-4). Since each leg of this cruise had only a few stations with δ13C-DIC samples, and all these samples were analyzed together, these cruises were combined for the crossover study. Neither the inversion routine nor the single crossover with the adjusted cruises shows evidence for an offset.

06MT19990610 and 06MT19990711, here referred to as 06MT1999-M45, cruise no. 3
These cruises are also known as M45-2 and M45-3. Since both were analyzed together, they were combined for this crossover study. The inversion suggested a correction of −0.15 ‰, and the mean offset of all crossovers was 0.16 ‰ too high. After applying this adjustment and comparing this cruise to the adjusted dataset, the inversion routine still suggested a small correction. Therefore, an adjustment of −0.20 ‰ was applied.

06MT20010507, cruise no. 4
This cruise is also known as M50-1. The inversion routine suggested a correction of −0.24 ‰, whereas the mean offset was 0.16 ‰ too high. The MLR analysis revealed an offset of 0.30 ‰. Based on the southern crossovers with cruises 06MT20040311 and 316N19970717, an adjustment of −0.30 ‰ was applied.

06MT20030723, cruise no. 5
This cruise is also known as M59-2 (Friis et al., 2007). The correction suggested by the inversion routine is −0.15 ‰, which matches the positive offsets of the crossovers, except for those with 35TH20060521. Based on the crossover with the core cruise, an adjustment of −0.15 ‰ was applied.

Table 4. Overview of all cruises in this dataset. The data of some cruises were combined for the analysis. For more information, please see the detailed description in the "Adjustments" section. The mean offsets of the crossovers and the MLR as well as the corrections suggested by the WDLSQ inversion for the original and the adjusted dataset are shown. In the last column the applied adjustments are displayed. NC indicates that these cruises were not considered in the inversion since they had no statistically significant crossover, and the core cruise is marked with C. Cruises with insufficient quality data are denoted "poor" and not included in the further analysis. Cruises marked with a * had fewer than 10 deep samples that were part of the MLR analysis.

06MT20040311, cruise no. 6
This cruise is also known as M60-5. The inversion routine indicates that the δ13C-DIC data of this cruise are 0.10 ‰ too low. Additionally, the mean offset as well as the MLR analysis shows that these data are too low. An adjustment of +0.10 ‰ was applied.

316N19970717, cruise no. 7, and 316N19970815, cruise no. 8
These cruises followed the WOCE/GO-SHIP (World Ocean Circulation Experiment/Global Ocean Ship-based Hydrographic Investigations Program) standard lines A20 and A22. The inversion suggests a correction of −0.06 ‰ for 316N19970717. It shows one crossover with cruise 06MT20040311, in which a significant positive offset is still visible after cruise 06MT20040311 was corrected. Therefore, an adjustment of −0.05 ‰ was applied for cruise 316N19970717. Cruise 316N19970815 does not show a statistically significant crossover.

316N20030922, cruise no. 9, and 316N20031023, cruise no. 10
These cruises, which took place in the tropical western Atlantic following the A20 and A22 lines, have only one deep station each. The crossovers of these stations with the adjusted data of both cruise 06MT20040311 and cruise 316N19970717 show a good agreement, suggesting that no adjustment should be applied.

33RO19980123, cruise no. 11
This cruise has one statistically insignificant crossover with cruise 06MT20040311 and one with cruise 33MW19930704-1. Both seem to be in good agreement, suggesting that no adjustment should be applied.

33MW19910711, cruise no. 12, and 33MW19930704-1, cruise no. 13
Cruise 33MW19930704-1 was considered as the core cruise in the present analysis. Cruise 33MW19910711 extends into the South Atlantic, and its crossover with cruise 13 shows no need for an adjustment.

35TH20020611, cruise no. 14, and 35TH20060521, cruise no. 15
The latter of these two cruises has a few quantitative crossovers, which show a high offset of −0.39 ‰. Furthermore, the inversion suggests a correction of 0.24 ‰. The high variability of the sampling area south of Iceland, as well as an increasing lightening of the deep-water carbon pool over time, does not constitute an adequate explanation for this large deviation; therefore, an adjustment of −0.25 ‰ was applied. Cruise 35TH20020611 shows only qualitatively analyzable crossovers, which show a lighter carbon pool compared to earlier cruises and a heavier one compared to the original data of cruise 35TH20060521 (Racapé et al., 2013). After adjusting cruise 35TH20060521, the two cruises, which were analyzed in the same laboratory, are no longer in good agreement, which suggests that the earlier cruise also has too-low isotope values. The MLR analysis reveals an offset of the 35TH20020611 cruise of −0.23 ‰, which is of the same order of magnitude as the correction suggested by the crossover routine for cruise 35TH20060521. Since the MLR offset for cruise 35TH20020611 is based only on five samples, we applied an adjustment of −0.25 ‰ to secure the internal consistency of these two cruises.

58GS20030922, cruise no. 16
This cruise has only two very weak crossovers: one with the Transient Tracers in the Oceans (TTO) data, which took place 30 years earlier, and one with 74JC20120601. When comparing cruise 58GS20030922 to the latter, the offset seems to be consistent with an increasing lightening of the DIC caused by an increasing amount of anthropogenic carbon, which decreases with increasing depth. However, the crossover with the TTO data is not consistent with this. Therefore, no adjustment was applied.

58JH19920712, cruise no. 17, and 58JH19940723, cruise no. 18
These two cruises took place in a highly variable area. No statistically relevant crossover exists, but the data are in good agreement with the core cruise and the other adjusted cruises in that area.

64TR19900417, cruise no. 19
This cruise shows extreme scatter compared to all other cruises and, therefore, was not included in the adjusted product. When comparing crossover stations, this cruise shows a mean offset to other cruises of about −1.2 ‰.

74DI20120731, cruise no. 20
Both the inversion and the mean offset of the crossovers suggest a correction of +0.13 ‰ for the cruise (Humphreys et al., 2015). This most recent cruise took place near the Scotland-Iceland Ridge, where the deep water masses cannot be assumed to be constant over time. All crossovers indicate a lower δ13C-DIC of this cruise when comparing it with the others, which is consistent with an increased amount of anthropogenic carbon. Therefore, no adjustment was applied.

74JC20120601, cruise no. 21
This cruise has only a few stations with δ13C-DIC data in a highly variable region. It has only one crossover with cruise 58GS20030922. In the MLR analysis, this cruise is too far away from the core cruise to give a reliable outcome. No adjustment was applied.

74JC20140606, cruise no. 22
This cruise covers the North Atlantic between Canada, Greenland, and Scotland. The crossover inversion gives a suggested correction of 0.07 ‰ and the MLR analysis an offset of the same magnitude: −0.09 ‰. Since this cruise took place 20 years after the core cruise, anthropogenic influences cannot be neglected in this case. Therefore, no adjustment was applied.

OMEX1NA, cruise no. 23
During the OMEX1 project in the North Atlantic, δ13C-DIC samples were taken in January 1994. The MLR analysis revealed an offset of −0.26 ‰. In contrast to that, the crossover inversion did not suggest a correction. No adjustment was applied.

316N19810401, cruise no. 24
The cruises 316N19810401, 316N19810416, 316N19810516, 316N19810619, 316N19810721, 316N19810821, and 316N19810923 are combined and usually named Transient Tracers in the Oceans North Atlantic Study (TTO-NAS). The inversion does not suggest any correction for this dataset.

Conclusions
The finalized, quality-controlled dataset of δ13C-DIC presented here consists of 24 cruises (some of which consist of multiple legs that were grouped) that have been quantitatively compared to each other and form an internally consistent dataset. Nine cruises could not be quantitatively compared to the other cruises due to a lack of crossovers and/or deep δ13C-DIC data. The reason for the deviations between single cruises could not be revealed. There was no correlation between a cruise's bias or its scatter and storage time, analyzing period, or volume of HgCl2 added.
The internal consistency of the adjusted dataset was calculated to be 0.017 ‰ based on Eq. (2).