KRILLBASE: a circumpolar database of Antarctic krill and salp numerical densities, 1926-2016

Antarctic krill (Euphausia superba) and salps are major macroplankton contributors to Southern Ocean food webs and krill are also fished commercially. Managing this fishery sustainably, against a backdrop of rapid regional climate change, requires information on distribution and time trends. Many data on the abundance of both taxa have been obtained from net sampling surveys since 1926


Introduction
The crustacean euphausiid species Euphausia superba (hereafter "krill") and the tunicate family Salpidae (hereafter "salps") are key large zooplankton taxa of the Southern Ocean. Both taxa are important in biogeochemical cycling and nutrient export (Pakhomov et al., 2002; Phillips et al., 2009; Gleiber et al., 2012; Schmidt et al., 2016). They are broadly similar in size, but have fundamentally different life cycles, habitat preferences and nutritional composition, and thus have contrasting roles in the food web. Krill is a major food item for a suite of vertebrate and invertebrate predator species (Murphy et al., 2007; Trathan and Hill, 2016). Salps appear in the diets of various invertebrates, fish and birds but do not seem to be as important as krill to most of the air-breathing predator group (Pakhomov et al., 2002). Also, compared to krill, salps seem to prefer warmer, deeper water habitats with moderate food concentrations and less sea ice (Pakhomov et al., 2002; Loeb and Santora, 2012).
Over the past 100 years the Southern Ocean has experienced regional warming (Gille, 2002; Meredith and King, 2005; Whitehouse et al., 2008) and regionally variable changes in sea ice cover (de la Mare, 1997; Murphy et al., 2014; Stammerjohn et al., 2012). Whether there has been a consequent reorganisation of plankton distributions is a topic of much interest and debate (Pakhomov et al., 2002; Atkinson et al., 2004; Ward et al., 2012; Loeb et al., 1997, 2015). Climate model ensembles predict that current positive trends in atmospheric Southern Annular Mode (SAM) anomalies will continue this century (Gillett and Fyfe, 2013). Since the population dynamics of key euphausiid and salp species relate to these climatic drivers (Saba et al., 2014; Ross et al., 2014; Steinberg et al., 2015; Loeb and Santora, 2015), we need to understand the spatial and temporal dynamics of both krill and salps.
In addition to their ecological role, krill are also the dominant fished species in the Southern Ocean in terms of catch weight, with a potential sustainable yield equivalent to 11 % of current global fishery landings (Grant et al., 2013). The Antarctic krill fishery is managed by the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR), which is committed to precautionary, ecosystem-based management. This means that CCAMLR is responsible for managing the impacts of the fishery on the health, resilience and integrity of the wider ecosystem. However, there is little information about many relevant aspects of krill ecology and population dynamics (Siegel and Watkins, 2016), including genetic stock identity (Jarman and Deagle, 2016) and predator-prey relationships (Trathan and Hill, 2016). Reducing these uncertainties might be necessary for CCAMLR to achieve its conservation objectives (Constable, 2011).
Fishery managers and stakeholder groups aim to develop more finely resolved temporal and spatial management approaches, but more information is needed to achieve this (Hill and Cannon, 2013). Thus, understanding krill distribution and dynamics is also important for the development of sustainable fishery management and conservation policy (e.g. identifying suitable Marine Protected Areas and assessing the dynamics of fished stocks). Consequently, a cross-sector group representing the fishing industry, scientists and conservation NGOs has recently called for better availability of information to improve understanding of the state of the krill-based ecosystem and management of the fishery (Hill et al., 2014).
Spatial-temporal information on krill and salps can come from scientific surveys using acoustics or nets, predator studies or data from the fishery. Each has its strengths and weaknesses, and these are expanded on elsewhere (Atkinson et al., 2012b). For net sampling surveys, data are available from a variety of expeditions since the 1920s. These individual surveys provide important snapshots of the ecosystem but in isolation they cannot provide a broader context. Annual monitoring programmes collecting net and acoustics data over standardised survey grids were initiated in the late 1980s and early 1990s (Reiss et al., 2008; Fielding et al., 2014; Steinberg et al., 2015; Kinzey et al., 2015; Krafft et al., 2016). However, despite the technology used, these multi-year time series surveys only cover a tiny fraction of the Southern Ocean area. A larger-scale and longer-term perspective is thus useful to provide context for the standardised monitoring data sets. The KRILLBASE project was started at the end of the 1990s to bring together the data necessary for this broader context. It was initiated by Angus Atkinson, Evgeny Pakhomov and Volker Siegel and is one of many examples of international collaboration in Antarctic research. Over the last 15 years we have documented and collated over 200 data sets, some of which are 90 years old and were previously only available on paper log-sheets, distributed across library archives. KRILLBASE thus pre-dates many other data rescue and compilation initiatives. Only by combining data in this way can we provide coverage on a scale commensurate with that of large marine ecosystems or with management and conservation areas (Fig. 1). The most recent update to KRILLBASE was completed in 2016. Making these data more accessible improves the capacity of a broader community to investigate the dynamics and distribution of ecologically important krill and salps, and to enhance the responsible management of krill fisheries and the conservation of Southern Ocean ecosystems.
The objectives of publishing the revised KRILLBASE are (a) to provide a link to key data and metadata for those wishing access to the krill and salp data sets, (b) to illustrate the scope and coverage, with examples of potential uses of these data, (c) to explain in detail its structure, with caveats and guidelines on how the data can be used, and (d) to provide a single, citable reference for these combined data sets.

KRILLBASE overview: summary
The data introduced here were compiled as part of a long-term project to rescue and compile data on a range of krill and salp variables, derived from net sampling surveys. This paper introduces the most recent version of the krill and salp abundance data. More specifically, the main fields indicate numerical density (i.e. the number of individual postlarval krill or salps under 1 m² of sea-surface area), which we refer to as abundance for brevity. The version of the data that we present here (doi:10.5285/8b00a915-94e3-4a04-a903-dd4956346439, which can be accessed via https://www.bas.ac.uk/project/krillbase) amalgamates existing time series and other surveys of numerical density of postlarval krill, Euphausia superba, and salps. These data span 1926-1939 (plus 1951) and 1976-2016, albeit with variable spatial and temporal coverage. It is important to emphasise that this is a multi-national composite database, not a synoptic snapshot or a true time series, so care is needed when using and interpreting these data due to the different sampling methods used. Table 1 provides a summary of its composite structure. In this paper phrases referring to KRILLBASE column headings are in uppercase italics (e.g. BOTTOM_SAMPLING_DEPTH_M) whereas searchable terms within the data (e.g. stratified haul) are italicised.
The basic data set is in a single table with an accompanying table of column descriptions. These are available either in their entirety as two downloadable CSV files, or as a resource that can be queried online. Both of these versions can be accessed via the doi:10.5285/8b00a915-94e3-4a04-a903-dd4956346439. Metadata are available via (a) this paper, which forms a reference that needs to be cited for the data source, and (b) detailed descriptions of data sources for each row of the data. These data are held at the Polar Data Centre at British Antarctic Survey to allow traceability, continuity of access and future updating.

Relationships to other databases
Antarctic zooplankton data are well represented in a series of databases and metabases, and the inter-relationships among these can be confusing. KRILLBASE and other data collections and time series form a global network entitled IGMETS (International Group of Marine Time Series, http://igmets.net/), linked to the COPEPOD project (http://www.st.nmfs.noaa.gov/copepod/). IGMETS is a metabase that provides a valuable catalogue of marine biological time series.
Table 1. Sources of data for KRILLBASE, according to nation and major sampling programme. Sources are listed in descending order of number of hauls provided. More information on the actual data sources (including the references used where data were transcribed from publications) is provided in the SOURCE field of the database. Coverage is not necessarily evenly spread within the longitudinal boundaries, which are presented in nearest integer degrees. For haul type, H: normal haul; SH: stratified haul that has been pooled into an equivalent "stratified pooled haul"; SM: survey mean haul, where density estimates are only available as a mean from multiple stations comprising a survey (see Sect. 2.3).
Years sampled | Longitude range
1976, 1978, 1980-1986, 1988-1990, 1994, 1995, 1997, 2001, 2004 | 122° W-14
1982, 1985, 1996-1999, 2001-2005, 2007-2009 | 66-26
1981, 1983, 1984, 1986, 1994 | 62-36
1981, 1983-1987, 1991-1993, 1996, 1999, 2001, 2006 | 30-150
1980, 1981, 1983, 1994-1998, 2001, 2003 | 86° W-179

Other initiatives emphasise the spatial and taxonomic component of data records. For example, a previous version of the KRILLBASE data is stored as presence/absence data at SCAR-MarBIN (http://www.scarmarbin.be/; De Broyer et al., 2014). SCAR-MarBIN forms the Antarctic node of global-scale initiatives including the Ocean Biogeographical Information System (OBIS, http://www.iobis.org/) and the Global Biodiversity Information Facility (GBIF, http://www.gbif.org/). Previous versions of KRILLBASE are also available from CCAMLR (https://www.ccamlr.org/) and as part of a gridded global data set of macroplankton biomass (Moriarty et al., 2013). The present version augments this with 50 % more data. If necessary the abundance values can be converted to an approximation of biomass (mg C m⁻³) using, for example, the procedure of Moriarty et al. (2013), who first calculated the number of individuals per m³ by dividing density by sampling depth (BOTTOM_SAMPLING_DEPTH_M − TOP_SAMPLING_DEPTH_M), and then applied fixed conversion factors of 63 and 24 mg C ind⁻¹ for krill and salps respectively.
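The two-step conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not code distributed with KRILLBASE: the function names are hypothetical, and only the two conversion factors (63 and 24 mg C per individual) and the depth difference come from the text.

```python
# Fixed carbon conversion factors quoted in the text (mg C per individual).
KRILL_MG_C_PER_IND = 63.0
SALP_MG_C_PER_IND = 24.0


def individuals_per_m3(n_under_1m2, top_depth_m, bottom_depth_m):
    """Convert a density under 1 m^2 to a mean density per m^3 of the sampled layer."""
    sampled_depth_m = bottom_depth_m - top_depth_m
    if sampled_depth_m <= 0:
        raise ValueError("bottom depth must be deeper than top depth")
    return n_under_1m2 / sampled_depth_m


def biomass_mg_c_per_m3(n_under_1m2, top_depth_m, bottom_depth_m, mg_c_per_ind):
    """Approximate biomass (mg C m^-3) from a numerical density under 1 m^2."""
    return individuals_per_m3(n_under_1m2, top_depth_m, bottom_depth_m) * mg_c_per_ind


# Hypothetical haul: 20 krill under 1 m^2, sampled from the surface to 200 m.
krill_biomass = biomass_mg_c_per_m3(20.0, 0.0, 200.0, KRILL_MG_C_PER_IND)
```

Because the factors are fixed per individual, the result is only as good as the assumption of a constant individual carbon mass; records with absent (NaN) densities simply propagate NaN.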
Two of the data sets used in KRILLBASE are available from their respective data websites (http://pal.lternet.edu/ and https://swfsc.noaa.gov/aerd/). Although these do not include the standardised krill abundances available in KRILLBASE, we refer the user to these two websites to obtain the most up-to-date source data from the Palmer-LTER and US-AMLR time series. A separate data holding external to KRILLBASE, for example including winter krill data from US SO-GLOBEC, is at BCO-DMO (http://www.bco-dmo.org/). The purpose of KRILLBASE is not to duplicate all of these efforts but to bring the krill and salp data together within a single file linked to metadata, with the aim of making them more user-friendly.

Structure of KRILLBASE
It is important to differentiate "records" (i.e. rows of the data in KRILLBASE) from "net hauls" and from "sampling stations". The most common situation is for each record to represent a single net haul at a single station. There is one indexing column (labelled "STATION") and 28 further columns (i.e. fields) describing searchable and filterable date, time, position, sampling and environmental information as well as krill and salp abundance. The detailed description of each of these columns is provided in Table 2, while more detail on the nets used for sampling is in Table 3.
While most of the 14 543 records pertain to a single haul made at a station, there are actually four types of record. These are differentiated in the "RECORD_TYPE" column. The most common record, where a single net haul was taken at the station, is simply labelled "haul". The second category is labelled "stratified haul" (2243 records), and these hauls form part of a depth-resolved stratified series made at a station (e.g. 0-50, 50-100, 100-200 m). The third category is "stratified pooled haul" (567 records) and these pool the abovementioned stratified hauls into a single combined "virtual haul", in this example from 0-200 m. The fourth category (48 records) is labelled "survey mean". In these the record provides the arithmetic mean abundance from multiple stations within a survey. While less than optimal, this aggregated information was the only data recoverable from the relevant surveys, which provided data from a valuable 1290 stations during the 1980s.
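Because a stratified pooled haul is built from its component stratified hauls, an analysis that keeps both record types would count the same catch twice. The sketch below shows one way to avoid this. It assumes each record is a dictionary keyed by column name (as produced by `csv.DictReader`) and that RECORD_TYPE holds exactly the four labels quoted above; the function name and the example STATION values are hypothetical.

```python
def select_non_overlapping(rows, use_pooled=True):
    """Keep either the pooled virtual hauls or their component strata, never both."""
    skip = "stratified haul" if use_pooled else "stratified pooled haul"
    return [row for row in rows if row["RECORD_TYPE"] != skip]


# Hypothetical records: one ordinary haul, one stratified series and its pooled haul.
rows = [
    {"STATION": "A", "RECORD_TYPE": "haul"},
    {"STATION": "B1", "RECORD_TYPE": "stratified haul"},
    {"STATION": "B2", "RECORD_TYPE": "stratified haul"},
    {"STATION": "B", "RECORD_TYPE": "stratified pooled haul"},
    {"STATION": "C", "RECORD_TYPE": "survey mean"},
]

for_mapping = select_non_overlapping(rows, use_pooled=True)
for_depth_profiles = select_non_overlapping(rows, use_pooled=False)
```

Survey mean records are kept in both selections here; whether they belong in a given analysis depends on its spatial scale.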
The krill data are presented as both the observed abundance (NUMBER_OF_KRILL_UNDER_1M2, no. m⁻²) and the abundance standardised relative to a benchmark (STANDARDISED_KRILL_UNDER_1M2, no. m⁻²), which is explained in Sect. 2.7. The salp data are presented as observed abundance for all species combined, where an individual can be either a solitary oozoid or an individual within an aggregate chain (NUMBER_OF_SALPS_UNDER_1M2, no. m⁻²).
Overall there are 15 191 hauls in the database, from 13 542 stations. Of these hauls, 7295 have abundance information on both krill and salps. Others have absent data for either salps or krill, and these are flagged as "not a number" (NaN). This distinguishes absent data clearly from zero, which indicates that the taxon was sampled for but none were caught. Absent data should therefore not be confused with zeros.
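The distinction between absent data and a zero catch is easy to lose when reading a CSV file. The following sketch assumes the abundance fields arrive as text and that absent values are written literally as "NaN" (Python's `float` parses this case-insensitively); the helper names are hypothetical.

```python
import math


def parse_abundance(text):
    """Parse an abundance field; the string 'NaN' becomes a float NaN."""
    return float(text)


def catch_status(value):
    """Classify a parsed abundance as absent data, a zero catch or a positive catch."""
    if math.isnan(value):
        return "absent"
    return "zero catch" if value == 0 else "caught"


def has_both_taxa(row):
    """True when the record carries abundance information for krill and for salps."""
    krill = parse_abundance(row["NUMBER_OF_KRILL_UNDER_1M2"])
    salps = parse_abundance(row["NUMBER_OF_SALPS_UNDER_1M2"])
    return not (math.isnan(krill) or math.isnan(salps))


# Hypothetical record: krill sampled but none caught, salps not reported.
record = {"NUMBER_OF_KRILL_UNDER_1M2": "0", "NUMBER_OF_SALPS_UNDER_1M2": "NaN"}
```

Treating NaN as zero would bias mean densities downward, which is why the two cases are kept separate here.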
In stratified pooled haul records the NUMBER_OF_KRILL_UNDER_1M2 and NUMBER_OF_SALPS_UNDER_1M2 values are the sums of the component stratified hauls, but are not given (NaN) if data were missing from one or more of the stratified hauls. Location information is generally taken from the deepest component stratified haul. Time information is taken from the shallowest component stratified haul as krill densities are most sensitive to light levels in the surface layers.
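The pooling rules above can be expressed compactly. This is a sketch under stated assumptions rather than the procedure actually used to build the database: each stratified haul is represented by a hypothetical dictionary with `top`, `bottom`, `krill`, `time` and `position` keys, and the "generally" in the location rule is treated as "always".

```python
import math


def pool_stratified(hauls):
    """Pool a stratified series into one virtual haul following the stated rules."""
    shallowest = min(hauls, key=lambda h: h["top"])
    deepest = max(hauls, key=lambda h: h["bottom"])
    densities = [h["krill"] for h in hauls]
    # The pooled density is the sum of the strata, or NaN if any stratum is missing.
    total = math.nan if any(math.isnan(d) for d in densities) else sum(densities)
    return {
        "top": shallowest["top"],
        "bottom": deepest["bottom"],
        "krill": total,
        "time": shallowest["time"],        # time from the shallowest net
        "position": deepest["position"],   # location from the deepest net
    }


# Hypothetical 0-50, 50-100, 100-200 m series.
series = [
    {"top": 0, "bottom": 50, "krill": 2.0, "time": "22:10", "position": (-60.5, -45.2)},
    {"top": 50, "bottom": 100, "krill": 4.0, "time": "21:40", "position": (-60.5, -45.3)},
    {"top": 100, "bottom": 200, "krill": 6.0, "time": "21:05", "position": (-60.6, -45.4)},
]
pooled = pool_stratified(series)
```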

Data processing and error checking
Stations were plotted one survey at a time to identify errors in station positions, such as stations plotting on land, or with latitude and longitude transposed or with the wrong sign. Implausibly large distances between consecutive sampling points were identified and corrected. Suspiciously low densities were identified, based on known or estimated volumes filtered by the various nets and the assumption that no fewer than one krill could have been caught. This procedure identified and led to the correction of a major error made on one portion of the data when converting numbers of krill per 1000 m³ to numbers of krill per m². Tests of date, time and position coincidence led to the removal of several portions of data that had been entered twice with different station numbers.
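The low-density check rests on simple arithmetic: a net that filtered a known volume cannot report a non-zero density lower than the one implied by a catch of a single animal. The sketch below illustrates that reasoning together with the unit conversion mentioned above. The function names are hypothetical and the exact screening rules applied to KRILLBASE are not given in the text, so this should be read as an illustration only.

```python
def per_1000m3_to_under_1m2(n_per_1000m3, top_depth_m, bottom_depth_m):
    """Convert numbers per 1000 m^3 to numbers under 1 m^2 of sea surface."""
    return n_per_1000m3 / 1000.0 * (bottom_depth_m - top_depth_m)


def minimum_detectable_density(volume_filtered_m3, top_depth_m, bottom_depth_m):
    """Density under 1 m^2 implied by a catch of exactly one individual."""
    return (1.0 / volume_filtered_m3) * (bottom_depth_m - top_depth_m)


def is_suspiciously_low(density, volume_filtered_m3, top_depth_m, bottom_depth_m):
    """Flag non-zero densities below the one-animal floor (zeros are legitimate)."""
    floor = minimum_detectable_density(volume_filtered_m3, top_depth_m, bottom_depth_m)
    return 0 < density < floor


# Hypothetical haul: 4000 m^3 filtered between the surface and 200 m.
floor = minimum_detectable_density(4000.0, 0.0, 200.0)
```

A value flagged in this way often points to a unit error, for example a density left in numbers per 1000 m³.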
The veracity of high krill abundances is hard to check, since densities in swarms have been estimated in the thousands per m³ of water. The highest density values for krill and salps were 9384 and 5886 ind. m⁻³, respectively. These form a natural tail to the frequency distribution of catch densities (Fig. 2) and are not isolated outliers. They are also
Earth Syst. Sci. Data, 9, 193-210, 2017, www.earth-syst-sci-data.net/9/193/2017/
Table 2. Detailed description of the columns in KRILLBASE.

STATION
Unique identifier for each record (row). The first three letters identify the source of the data (starting letters of the name of the individual, national programme, or country which provided the data). The next four numbers identify the season of sampling (e.g. 1926 spans October 1925 to September 1926). The next three letters provide additional sample information, often referring either to the net type used or the name of the sampling survey. Additional characters at the end list the station numbers etc. These are, as far as possible, the same as used in the original sources, with British Antarctic Survey and Palmer LTER cruise station numbers being replaced by cruise-unique "event numbers". Records are typically resolved to station but see RECORD_TYPE for more information on resolution.

RECORD_TYPE
This is an important field that will need screening before any use of the database. Records labelled "haul" are the usual situation, meaning that the record refers to a single net haul.

DATE
The date of sampling, based on the dates provided to us (see the DATE_ACCURACY column).

DATE_ACCURACY
"D" means the exact day of sampling is known. "M" means that we have been provided only with the month in which samples were taken, so the record's DATE value is entered as the middle of the month. "Y" means only the year of sampling was provided, so the date is recorded here simply as 1 January (this affects one record only).

NET_TIME
This is the time of the haul: either the start, midpoint or end times of hauls were used, as provided to us. Absent data means no net time information was available, or it was not entered into the database because the station was already classified as either day or night (Discovery data net times are recorded in their published "Station Lists" but not entered in KRILLBASE). Net times for stratified pooled hauls represent that of the shallowest net of the series.

GMT_OR_LOCAL
Information on whether the time in the previous column is GMT (labelled "GMT") or local time (labelled "local"). Data which were provided as local times with a stated offset to GMT have been converted to GMT. Data which were provided as local times with no offset have not been converted and are labelled "local". Absent data means there was no net time information.

DAY_NIGHT
This field indicates whether the net was hauled in daylight (labelled "day") or night time (labelled "night") and was used in the calculation of standardised krill densities. See DAY_NIGHT_METHOD for information on the source of these data.

DAY_NIGHT_METHOD
Method used to determine whether the net was hauled in daylight or at night time, which depends on the time information available: 1: DAY_NIGHT is based on calculated solar elevation determined using NET_TIME; 2: DAY_NIGHT is as recorded in the ship's log; 3: no DAY_NIGHT information was available, and standardised krill densities were adjusted for the probability that the haul was conducted in daylight.

MOUTH_AREA_OF_NET_M2
This is a nominal mouth area of the net calculated from the net dimensions. It is typically the simple linear area of the mouth, but for RMT8 and RMT1 nets it is assigned a value of 8 and 1 respectively. Bongo nets are assigned the area of both openings combined and the LHPR value is based on the maximum net diameter; both of these choices crudely compensate for the lack of towing bridles and wire/release gear directly in front of the net, as compared to the standard ring nets, which are often of similar net dimensions.

BOTTOM_SAMPLING_DEPTH_M
Deepest sampling depth (m). Note that whilst most hauls were oblique, double oblique or vertical, a small minority were nearly horizontal, as shown by similar top and bottom depths. These would need to be screened out of nearly all analyses as they provide little information on numerical densities (no. m⁻²).

VOLUME_FILTERED_M3
Volume of water (m³) filtered by the net. This value is given only when it was supplied with the density data.

N_OR_S_POLAR_FRONT
Position (North or South) relative to the Antarctic Polar Front as published by Orsi et al. (1995).

WATER_DEPTH_MEAN_WITHIN_10KM
Mean water depth within a 10 km radius. In South Polar Stereographic projection, the stations were superimposed on the GEBCO 2014 Grid bathymetry (http://www.gebco.net) and all pixels within a 10 km radius of the station were extracted. After removing data above sea level, the remaining pixel values for water depth were averaged.

WATER_DEPTH_RANGE_WITHIN_10KM
Depth range within a 10 km radius. In the procedure above, having removed pixels above sea level, the range in water depth was calculated as the difference between the shallowest and the deepest pixel. This provides an index of evenness of bathymetry (e.g. proximity to seamounts, canyons, continental slope).

CLIMATOLOGICAL_TEMPERATURE
Long-term average February sea-surface temperature for the sampling location. This is not the actual sea temperature at the time of sampling but a climatological mean sea-surface value for February, averaged over the years 1979 to 2014, based on data downloaded July 2016 from http://apps.ecmwf.int/datasets/data/interim-full-moda/levtype=sfc/. Data were provided on a 0.75° by 0.75° grid and we extracted mean values using the same 10 km buffer method used for the bathymetry. These values may indicate a relative thermal regime as a basis for station characterisation.

SD_OF_SURVEY_MEAN_KRILL
The standard deviation of the krill densities extracted from the publications where the survey mean value of krill density is provided (see column RECORD_TYPE).

NUMBER_OF_KRILL_UNDER_1M2
Numerical density, N, of postlarval krill under 1 m² (or, where based on a length frequency distribution as in the Discovery Investigations, of krill > 19 mm in length). Where the numbers of krill n were provided per m³ filtered, the density of krill was calculated based on top-sampling depth t and bottom-sampling depth b in metres as N = n (b − t).

STANDARDISED_KRILL_UNDER_1M2
Standardised numerical density of postlarval krill. To reduce possible artefacts arising from differences in sampling method in KRILLBASE, this column presents krill density according to a single sampling method. This method is a 0-200 m night-time RMT8 haul on 1 January, following the standardisation method in Atkinson et al. (2008). See main text for more details.

CAVEATS
Any issues which might require particular caution when using the data (e.g. potential inaccuracies in estimated date or day/night or sampling depths outside of the normal range) are listed here. Default is blank.

NUMBER_OF_SALPS_UNDER_1M2
The numerical density of salps, calculated as for krill. All individuals are counted, irrespective of which salp species or whether they are solitaries or components of aggregate chains. Standardised salp densities have not been calculated.

SOURCE
Information about the source of the data, including a citable reference where available.

well within expected values (Hamner and Hamner, 2000). The highly patchy spatial distribution of each taxon results in right-skewed frequency distributions, with modes at zero, i.e. no krill caught (Fig. 2). This distribution type is an important consideration in analyses.
Water depths for every net sample were obtained by superimposing the stations on the GEBCO_2014 grid (version 20150318, www.gebco.net) bathymetry using ArcGIS 10.4.1 and extracting the minimum, mean and maximum water depth within 10 km of each station. The bathymetric information derived from this provides an additional check of the veracity of position information. We identified 32 records in which the BOTTOM_SAMPLING_DEPTH_M was implausibly deeper than the maximum depth in the vicinity of the haul. For 10 of these, the longitude or latitude was reported as an integer. Integer coordinates and shallow bathymetry may indicate inaccuracies in position information. Users should be aware that inaccuracies in latitude can also affect the assessment of DAY_NIGHT information used in the calculation of standardised krill abundances. A couple of reported krill catches were from warmer waters north of the Antarctic Polar Front, giving grounds for suspicion, for example of misidentification. We kept these records since expatriated individuals are a possibility and we did not want to judge the data provided. Data caveat issues are indicated and described in the fields DATE_ACCURACY and CAVEATS respectively.
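The bathymetric plausibility checks described above can be illustrated as follows. The sketch assumes the extracted pixels are elevations in metres with negative values below sea level (the usual convention for gridded bathymetry) and stands in for the GIS extraction step, which is not reproduced here; all function names are hypothetical.

```python
def depth_stats(elevations_m):
    """Minimum, mean and maximum water depth (m) from elevation pixels.

    Pixels at or above sea level are removed before the statistics are computed.
    """
    depths = [-e for e in elevations_m if e < 0]
    if not depths:
        raise ValueError("no pixels below sea level")
    return min(depths), sum(depths) / len(depths), max(depths)


def haul_deeper_than_seabed(bottom_sampling_depth_m, max_water_depth_m):
    """True when the net supposedly fished deeper than any nearby seabed."""
    return bottom_sampling_depth_m > max_water_depth_m


def has_integer_coordinate(latitude, longitude):
    """True when either coordinate is a whole number of degrees."""
    return float(latitude).is_integer() or float(longitude).is_integer()


# Hypothetical pixels within 10 km of a station (one pixel is on land).
shallowest, mean_depth, deepest = depth_stats([-100.0, -300.0, 50.0, -200.0])
depth_range = deepest - shallowest
```

A record that fails both checks, as 10 of the 32 flagged records did, is a candidate for an imprecisely reported position rather than a genuinely deep haul.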

Variation in sampling coverage and method
Figure 1 shows that KRILLBASE sampling is highly uneven, focusing on areas of fishing or historical interest to nations in the Atlantic sector (USA, Germany, UK, Poland, South Africa, Spain) or Indian sector (Soviet Union, Japan, Australia). Figure 1 also shows that sampling gaps exist in important areas such as the Ross Sea, Weddell Sea and in large parts of the Pacific sector.
The composite nature of KRILLBASE means that the sampling methods vary. Figure 3 illustrates this with a circumpolar comparison of the seasonal timing of sampling (Fig. 3a), bottom depth of sampling (Fig. 3b) and mouth area of the net (Fig. 3c). Time of year of sampling has a potentially strong influence on the abundance of zooplankton, due to life cycle and behavioural traits such as seasonal vertical migration (Foxton, 1966; Atkinson et al., 2012a; Cleary et al., 2016). While samples were obtained during most months of the year, 89 % of the hauls were conducted in the period December to March (Fig. 4), with no longitudinal bias in timing (Fig. 3a). However, in sparsely sampled areas, particularly north of the Antarctic Polar Front, sample timing varied greatly, underlining the caution needed in interpreting these samples. The original objectives for using KRILLBASE did not require winter samples but some winter data are available from several key surveys (e.g. http://www.bco-dmo.org/) and could be included in subsequent updates of KRILLBASE.
Most hauls in KRILLBASE were made between the surface and 100-200 m depth, but vertical coverage varied greatly between the component surveys, as indicated by the chequered colours of Fig. 3b. Some screening by the user is necessary to remove stations where an unrepresentative portion of the depth distribution was covered. Figure 5 summarises the vertical distribution of krill and salps where stratified series of net hauls were undertaken (269 krill stations and 563 salp stations). This shows the highest densities of krill in the top 200 m, with declining densities below this. KRILLBASE is suitable for exploring the horizontal distribution of krill in the important epipelagic zone, but is unsuitable for mapping horizontal distribution below 200 m. These deeper and near-seabed zones are being increasingly recognised as important habitats for krill (Gutt and Siegel, 1994; Clarke and Tyler, 2008; Schmidt et al., 2011; Cleary et al., 2016).
Salps have a deeper distribution than krill (Fig. 3) as a result of greater diel and seasonal vertical migrations (Foxton, 1966; Loeb and Santora, 2012). Care is therefore needed to avoid negative bias due to shallow net sampling. A standardisation method similar to that applied to krill may reduce these inconsistencies and provide a better picture of the spatial distribution of salps.

Inter-annual coverage
Figure 6 divides the Southern Ocean into broad sectors to illustrate the inter-annual coverage of sampling. The coverage for salps broadly follows that for krill, with good coverage in the Atlantic sector from 1926 to 1938 and after 1976. In the Indian Ocean sector some data exist from the late 1930s when "Discovery" sampling became circumpolar, and reasonable coverage occurred from 1981 to the mid-1990s, but few data have been collected there since. While coverage in the Pacific sector is too sporadic to document time trends, data for the other two sectors are sufficient to examine sectorial patterns of inter-annual and decadal-scale variability of both krill and salps.
The survey mean data are included in Fig. 6, and they provide important information for the period before coordinated monitoring programmes. These data can be included in regional scale analyses (e.g. time series analyses), but since the data pertain only to the whole survey and not the component stations, care is needed when interpreting the data at finer scales than the 3° latitude by 9° longitude grids illustrated.
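For users who wish to aggregate records at the 3° latitude by 9° longitude resolution mentioned above, a grid cell can be assigned with simple arithmetic. The sketch below is an assumption-laden illustration: the text does not state where the grid lines fall, so the cells here are aligned to 0° latitude and 0° longitude, and the function name is hypothetical.

```python
import math


def grid_cell(latitude, longitude, lat_step=3.0, lon_step=9.0):
    """Return the south-west corner (lat, lon) of the grid cell containing a point.

    Cell edges are assumed to be aligned to 0 degrees in both directions.
    """
    cell_lat = math.floor(latitude / lat_step) * lat_step
    cell_lon = math.floor(longitude / lon_step) * lon_step
    return cell_lat, cell_lon


# Hypothetical station at 61 S, 44 W falls in the cell spanning 63-60 S, 45-36 W.
cell = grid_cell(-61.0, -44.0)
```

Survey mean records can then be compared with station-level records only at this or a coarser resolution.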

Standardisation: methods
The compiled data represent a range of sampling methods with different net types, sampling depths, times of day and times of year (Fig. 3). Such differences in sampling strategy could potentially bias the outcome of analyses. For example, differences in net mouth size will lead to variable avoidance and the mesh size will affect retention. Differences in net geometry, towing speed and trajectory will further affect catches, as will light levels and swarm packing density (Hamner and Hamner, 2000; Everson and Bone, 1986; Krag et al., 2014). For example, catchability decreases as light levels increase, meaning that there can be a latitudinal effect because summer days are much longer at high latitudes (Fig. 7). These issues were recognised by Marr (1962) and Mackintosh (1973), who adjusted the densities accordingly when producing circumpolar distribution maps.
To minimise the influence of sampling differences, our database includes both the raw numerical abundances of krill and values standardised to a single sampling method. We calculated the standardised krill abundances using the process and conversion factors described in the supplementary appendix of Atkinson et al. (2008). The standardised abundance (STANDARDISED_KRILL_UNDER_1M2) is an estimate of the krill abundance that would have been observed if the haul had conformed with a sampling method consisting of a night-time haul on 1 January, fishing to a depth of 200 m with a mouth area of 8 m². This strategy achieves close to the maximum krill catch that is possible with scientific nets. Standardisation was implemented by multiplying the raw abundances (NUMBER_OF_KRILL_UNDER_1M2, N) by conditional conversion factors as follows:

where N is the standardised krill abundance, B is the bottom sampling depth, X is a scalar to adjust the day-to-night conversion factor (2.255) and K pred is the expected krill abundance based on a general linear model in which mouth area and time of year are the independent variables (see Table 4 and Atkinson et al., 2008, for further details). X = 1 when the net was hauled in daylight and X = 1/2.255 when it was hauled at night. We also calculated standardised krill densities for nets where there was insufficient information to determine whether hauling occurred in daylight or at night. In these cases the value of X is the probability that the net was hauled in daylight (i.e. day length in hours / 24).

The revision of KRILLBASE included reassessment of the DAY_NIGHT field (indicating whether the net was hauled in the daylight or at night; see Table 5). Where valid sampling time information was available (consisting of a GMT NET_TIME or a local NET_TIME and sufficient information to adjust to GMT), we used the Twilight Excel workbook available from http://www.ecy.wa.gov/programs/eap/models.html to determine whether the haul was conducted in daylight (defined by a solar elevation > −0.833°). Where no valid sampling time information was available, but there was an indication of day or night in the original data, we used this information. Where it was not possible to make this assessment because of insufficient information, we used the Twilight Excel workbook to calculate day length for the sampling date and location, which was then used to adjust the standardised krill density as described above. As this type of standardised krill abundance (indicated by a value of 3 in the DAY_NIGHT_METHOD field) uses a different time of day adjustment from other standardised krill abundances, it is good practice to assess its influence on results.
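The rules for the scalar X and for the daylight threshold can be written down directly. The sketch below covers only these two pieces: it does not reproduce the full standardisation, whose depth and K pred terms depend on coefficients in Table 4 and Atkinson et al. (2008). The function names are hypothetical, the solar elevation is taken as a given input rather than computed, and the day length for unclassified hauls is assumed to be supplied by the user.

```python
DAY_TO_NIGHT_FACTOR = 2.255          # day-to-night conversion factor from the text
DAYLIGHT_ELEVATION_DEG = -0.833      # solar elevation threshold from the text


def is_daylight(solar_elevation_deg):
    """Classify a haul as daylight when the sun is above the stated threshold."""
    return solar_elevation_deg > DAYLIGHT_ELEVATION_DEG


def x_scalar(day_night, day_length_hours=None):
    """Return X for a haul labelled 'day', 'night' or unclassified (None)."""
    if day_night == "day":
        return 1.0
    if day_night == "night":
        return 1.0 / DAY_TO_NIGHT_FACTOR
    if day_length_hours is None:
        raise ValueError("day length is needed when day/night is unknown")
    # Unclassified haul: X is the probability that it was made in daylight.
    return day_length_hours / 24.0


# Hypothetical unclassified haul on a date and latitude with 18 h of daylight.
x_unknown = x_scalar(None, day_length_hours=18.0)
```

With these definitions the product of the factor 2.255 and X is 2.255 for a daylight haul and 1 for a night haul, consistent with night-time being the benchmark.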

Standardisation: caveats on the use of standardised krill densities
KRILLBASE includes standardised krill abundance information for every haul, stratified pooled haul and survey mean except those with TOP_SAMPLING_DEPTH_M deeper than 50 m (because hauls which exclude the surface layers are not comparable with those that include these layers). These standardised densities will be most reliable when the information underlying the standardisation is accurate. Thus, where dates or times have been estimated (for example for survey mean data), the database provides information on the accuracy of date information (DATE_ACCURACY) and the type of time information (DAY_NIGHT_METHOD) available in each record.
Although the ideal method for depth standardisation is to make all hauls equivalent to a haul sampling from 0 to 200 m depth, the standardisation described in Atkinson et al. (2008) and used here is a partial solution which standardises bottom-sampling depth to 200 m when the actual value is less than 200 m. It does not exclude krill caught deeper than 200 m, where krill densities are generally lower (Schmidt et al., 2011), nor does it adjust for nets that did not sample to the surface (TOP_SAMPLING_DEPTH greater than 0 m). Users are advised to screen the data to ensure that top-sampling depths are consistent with their requirements, noting that 691 hauls in the current version of KRILLBASE have top-sampling depths deeper than 5 m and that Atkinson et al. (2008) excluded such hauls before calculating the conversion factors.
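The screening step recommended above might look like the following sketch. It assumes each record has been loaded as a dictionary keyed by field name, with the top-sampling depth under TOP_SAMPLING_DEPTH_M; the 5 m default mirrors the threshold mentioned in the text, and the function name and example records are hypothetical.

```python
def screen_by_top_depth(records, max_top_depth_m=5.0):
    """Split records into those kept and those rejected by a top-sampling depth limit.

    Records with a missing depth are rejected, since their comparability
    with surface-sampling hauls cannot be checked.
    """
    kept, rejected = [], []
    for record in records:
        depth = record.get("TOP_SAMPLING_DEPTH_M")
        if depth is not None and depth <= max_top_depth_m:
            kept.append(record)
        else:
            rejected.append(record)
    return kept, rejected


# Hypothetical records for illustration only (not real KRILLBASE rows).
example_records = [
    {"STATION": "A", "TOP_SAMPLING_DEPTH_M": 0.0},
    {"STATION": "B", "TOP_SAMPLING_DEPTH_M": 20.0},
    {"STATION": "C", "TOP_SAMPLING_DEPTH_M": None},
]
kept_records, rejected_records = screen_by_top_depth(example_records)
```

Returning the rejected records as well makes it easy to report how many hauls a chosen threshold removes before any trend analysis.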
Date information affects the standardisation through the adjustments for time of year and time of day. Atkinson et al. (2008) derived the conversion factors from a data set where the latest sampling date was 26 April. Recent KRILLBASE updates include hauls taken as late as 30 August, but we have not provided standardised krill densities for sampling dates after 30 April because the standardisation is extremely sensitive to dates after this point (e.g. the time-of-year adjustment for 30 August increases krill density by a factor of 3834, compared to a factor of 10 for 26 April, and a factor of 1.16 for 31 January). This strong effect of time of year of sampling on abundance likely reflects both mortality and seasonal vertical migration of krill out of the surface layer late in the season (Cleary et al., 2016).
Inaccuracies in the date will also affect the time-of-year adjustment applied in standardisation. In the single record where the date is given only to the year, the assigned date was 1 January, meaning that there is no time-of-year adjustment and standardised density is conservative. When the date is given for month as well as year, the assigned full date is the middle of the month, meaning that true dates further away from 1 January will be treated more conservatively as a consequence and true dates closer to 1 January will be treated less conservatively. The effect of any date inaccuracies increases with time from 1 January. The DATA_CAVEATS field in the database clearly indicates for each row which, if any, of the above caveats applies.
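The date rules described above can be illustrated with a short Python sketch. Two points are assumptions rather than statements from the database documentation: "middle of the month" is taken as the 15th, and "after 30 April" is read as the months May to September within an austral season running from October to September. The function names are hypothetical.

```python
from datetime import date


def assign_sampling_date(year, month=None, day=None):
    """Assign a full date when only the year, or only year and month, are known."""
    if month is None:
        return date(year, 1, 1)      # year only: 1 January, no time-of-year adjustment
    if day is None:
        return date(year, month, 15)  # assumption: mid-month taken as the 15th
    return date(year, month, day)


def has_standardised_density(sampling_date):
    """True if the date falls in the October to April window for which
    standardised krill densities are provided."""
    return sampling_date.month in (10, 11, 12, 1, 2, 3, 4)
```

For example, `has_standardised_density(date(2000, 8, 30))` is false, matching the decision not to standardise hauls taken as late as 30 August.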

Results and discussion
3.1 Effects of heterogeneous data sources and standardisation: spatial effects
While hauls with zero krill remained as such, median standardised krill abundance of positive hauls was 2.2 times greater than that of unstandardised values. The overall circumpolar pattern of relative abundance is similar whether based on raw or standardised abundances but the detail in some areas does differ. This is likely due to longer summer days at higher latitudes (requiring upwards adjustment of most catches to night values) or the localised use of poor sampling combinations (e.g. smaller nets and/or early or late season sampling).
The patchy distributions of krill and salps and spatial differences in sampling density influence the spatial patterns shown in the maps.A few grid cells suggest extremely high krill or salp abundance, but some of these cells only include a few stations.Conversely, cells suggesting absence frequently have too few stations for a reliable picture.Users need to allow for variable sampling coverage, and while our standardisation attempts to reduce net sampling inconsistencies, it does not adjust for variable precision.
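One way to allow for variable sampling coverage is to carry the number of stations alongside each grid-cell mean, as in the sketch below. It uses the 3° latitude by 9° longitude grid of the circumpolar maps; the minimum station count used to flag a cell is a hypothetical choice, not a threshold taken from the paper, and the input is assumed to be simple (latitude, longitude, density) tuples.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, dlat=3.0, dlon=9.0):
    """Index of the grid cell containing a position (south and west are negative)."""
    return (math.floor(lat / dlat), math.floor(lon / dlon))


def cell_summary(stations, min_stations=5):
    """Arithmetic mean density and station count per grid cell.

    `stations` is an iterable of (latitude, longitude, density) tuples.
    Cells with fewer than `min_stations` stations are flagged as poorly sampled.
    """
    cells = defaultdict(list)
    for lat, lon, density in stations:
        cells[grid_cell(lat, lon)].append(density)
    return {
        cell: {
            "mean": sum(values) / len(values),
            "n": len(values),
            "well_sampled": len(values) >= min_stations,
        }
        for cell, values in cells.items()
    }
```

Plotting or filtering on the `n` value then makes it visible which apparent hotspots or absences rest on only a few stations.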

Effects of heterogeneous data sources and standardisation: temporal effects
The South Georgia area exemplifies the krill-based ecosystem and this has been sampled for many years (Murphy et al., 2007). We have therefore selected a subset of KRILLBASE in this area to show how sampling method can vary from year to year and how this could affect time trends (Fig. 9). This area has been sampled with a wide variety of methods since the 1920s, and the mean krill abundance varies greatly from year to year due to recruitment variability (Fig. 9a; see also Murphy et al., 2007; Fielding et al., 2014). While the standardised annual mean krill abundances are typically greater than the unstandardised values, the offset varies substantially. This is for a number of reasons, including variable mouth areas and sampling depths of the net (Fig. 9b) and variable time of year and time of day of sampling (Fig. 9c). For example, net mouth area is generally larger (albeit more variable) in the modern post-1970s era, concomitant with an increase in bottom-sampling depth of the nets. Likewise, during the modern era, the proportions of hauls in mid-summer and at night have increased.
The above factors are included in the standardisation process, but other issues may be important when deciding how to screen data and interpret time trends from a heterogeneous data set such as KRILLBASE.One factor is the density of sampling coverage within any given year.We have not plotted years when there are very few stations sampled (< 10 stations) because a patchy swarming species like krill is likely to be missed altogether by such limited sampling.However, the number of stations sampled varies greatly from year to year (Fig. 6) so we have scaled the size of the symbols according to numbers of stations to illustrate the variable confidence in the annual means.
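The handling of sparsely sampled years described above can be sketched as follows. The size classes follow the three classes given in the Fig. 9 caption (10-20, 20-50 and > 50 hauls per season); how the shared boundaries at 20 and 50 are assigned is an assumption here, and the function names are hypothetical.

```python
from collections import defaultdict

MIN_HAULS = 10  # seasons with fewer hauls are not plotted


def symbol_size(n_hauls):
    """Symbol size class for a season, or None if it has too few hauls to plot."""
    if n_hauls < MIN_HAULS:
        return None
    if n_hauls < 20:
        return "small"
    if n_hauls <= 50:   # assumption: 20 and 50 both fall in the medium class
        return "medium"
    return "large"


def annual_means(hauls):
    """Arithmetic mean density per season with haul count and symbol size.

    `hauls` is an iterable of (season, density) pairs.
    """
    by_season = defaultdict(list)
    for season, density in hauls:
        by_season[season].append(density)
    result = {}
    for season, values in sorted(by_season.items()):
        size = symbol_size(len(values))
        if size is not None:
            result[season] = (sum(values) / len(values), len(values), size)
    return result
```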
A second important feature may be the geographical coverage of sampling (Fig. 9d). Even within a defined area such as South Georgia, the emphasis of sampling campaigns may change. For example, 1926 and 1927 were local krill surveys aimed at management of the whaling industry then based at South Georgia, but throughout the 1930s "Discovery" sampling became increasingly circumpolar, while monitoring in the 1990s and 2000s was more shelf-orientated.

Data availability
The comprehensive data descriptions in this paper allow potential users to understand the breadth of the database and the main caveats that need to be considered to ensure that interpretations are realistic and valid. 1) regarding queries. This data paper, in addition to the data DOI, should be cited as the metadata and the source of the data, to allow traceability in the use of this database. This will hopefully provide leverage for obtaining future funding to continue rescuing and updating valuable historical data sets from the Southern Ocean. As a final word we urge users to take a few minutes to consult the metadata, in particular Table 2, since almost every use of KRILLBASE will require an initial screening of some of the records.

Uses and limitations of KRILLBASE
The first version of KRILLBASE was used by Atkinson et al. (2004) to quantify the circumpolar distribution of krill and salps, examine regional trends in their densities and determine inter-annual relationships between krill density and winter sea ice cover. Inter-annual changes in mean krill abundance were subsequently related to temperature by Whitehouse et al. (2008), to whale dynamics by Braithwaite et al. (2015) and to the dynamics of other so-called wasp-waist species by Atkinson et al. (2014). The fact that krill and salp abundances vary so much between years is an advantage for this inter-annual scale of analysis, because the signal is stronger than the noise. The spatial component of KRILLBASE has been used more widely. Circumpolar distributions have been used as a context and validation for various models and analyses including biogeochemical carbon cycling (Moriarty, 2009), krill and climate change (Flores et al., 2012; Hill et al., 2013; Piñones and Federov, 2016), population connectivity (Thorpe et al., 2007; Siegel and Watkins, 2016), predator foraging (Pangerc, 2010) and vertical and horizontal krill habitat analyses (Atkinson et al., 2008; Schmidt et al., 2011). These studies have tended to focus on large scales, but smaller-scale analyses of well-sampled areas (as shown in Fig. 10) are amenable to KRILLBASE, for example to interpret predator foraging areas. The caveat here is that these maps are not synoptic, but instead are more akin to probability maps of where krill or salps occur, providing a context for more synoptic snapshots from surveys (Siegel et al., 2004; Kawaguchi et al., 2004).
In parallel to expansion of the abundance component of KRILLBASE, we are generating a large database on krill length frequency, sex, and maturity stage from scientific and fisheries data, a work still in progress. Combining the length frequency and abundance components provides insights into biomass and production at large scales, allowing a degree of scaling-up of acoustics-derived biomass surveys (Atkinson et al., 2009). The sex/length frequency component has since been used, for example, to relate circumpolar trends in body length to feeding conditions (Schmidt et al., 2014), and to examine sex-related changes in seasonal growth and shrinkage (Tarling et al., 2016).
In comparison to krill, fewer studies have used the salp component of KRILLBASE. Lee et al. (2010) examined inter-annual variability in krill and salps simultaneously, emphasising the opposite nature of the trends observed in the two taxa. Given that about half of the current KRILLBASE net hauls have both krill and salps recorded, a simultaneous evaluation of the two taxa would be valuable. In any of these analyses, however, we emphasise that great care is needed when interpreting time trends, in order to prevent aliasing of real patterns with differences in sampling methods. This applies equally to salps and to krill; for example, the seasonal and diel vertical migrations of salps mean they are prone to under-sampling by shallow nets (Fig. 4).
An additional caveat concerns net sampling efficiency for mobile species such as krill. RMT8 catches during night-time were set as our benchmark for standardisation because they were the most efficient means of capturing krill, but even these catches were likely to have underestimated absolute abundance. This is due to both net avoidance and escapement of the smallest juveniles through the meshes. Nevertheless, the overall circumpolar biomass of krill based on averaged KRILLBASE data is 379 Mt, so it is unlikely that this sampling method is yielding order-of-magnitude underestimates (Atkinson et al., 2009). KRILLBASE may provide insights on the relative distribution and temporal variation in krill density, but modern acoustic methods calibrated with nets are the accepted method for determining krill biomass (Fielding et al., 2014). Integrating the assessments from these two fundamentally different types of sampling represents the most robust practice to achieve large-scale and long-term estimates of krill biomass.

Figure 1. Distribution of sampling stations in KRILLBASE, showing generally elevated sampling effort in and around designated areas of protection and management. These stations may have krill or salp data or both; Fig. S1 in the Supplement provides the distribution of just the krill sampling stations.

Figure 2. Frequency distribution of krill and salp abundances in the database. The data were filtered to remove stratified hauls before plotting the frequency of remaining hauls in relation to logarithmic bins. Data are presented for (a) krill raw (unstandardised) abundance, (b) krill standardised abundance and (c) salp (unstandardised) abundance.

Figure 3. Circumpolar variation in sampling method. This plot is based on all data in KRILLBASE, whether for krill or salps or both. (a) Time of year of sampling (mean day from 1 October). (b) Bottom depth of sampling. The data set plotted includes the stratified pooled hauls and thus excludes their component stratified hauls (see Sect. 2.3). (c) Mean mouth area of the net, based on the nominal values presented for each net type in Table 3. Antarctic Polar Front position is from Orsi et al. (1995).

Figure 4. Relative frequency of stations sampled within each month of the year.

Figure 5. Vertical distribution of krill and salps based on 793 stratified krill hauls and 2130 stratified salp hauls. Given the nonstandard depth horizons between the various surveys sampling in this manner, the data were first subdivided into a nominal seven categories of mean sampling depths, namely 0-50, 50-100, 100-150, 150-200, 200-300, 300-500 and > 500 m. Mean krill or salp densities are presented in each of these mean depth groups, plotted against mean sampling depth within each depth band.

Figure 6. Inter-annual sampling coverage. Number of stations sampled south of the Antarctic Polar Front in each austral season (October to following September). These are presented for (a) the Atlantic sector (nominally defined as 90° W-10° E), (b) the Indian sector (10-120° E) and (c) the Pacific sector (120° E-90° W).

Figure 8. Circumpolar distribution maps of krill based on (a) unstandardised krill densities (no. m⁻²), (b) standardised krill densities and (c) unstandardised salp densities, showing the stations sampled for these. All maps are South Polar Stereographic projection with grid size of 3° latitude by 9° longitude. Positions of krill stations are in Fig. S1 in the Supplement. The legend values and colour codings of cells refer to the arithmetic mean krill densities recorded within the cell.

Figure 8 compares the circumpolar distribution of krill and salps, allowing a comparison between the standardised and unstandardised krill values obtained from KRILLBASE.

Figure 9. Inter-annual variability in sampling. Year-to-year variation in net sampling, and its effect on the difference between standardised and unstandardised krill density. Austral season is plotted on the x axis of all panels with a vertical line demarcating the Discovery sampling era from the post-1975 sampling era. (a) Inter-annual variation in arithmetic mean krill densities in the greater South Georgia area (30-40° W, 50-60° S, based on hauls from October to April with a top-sampling depth < 20 m and bottom-sampling depth > 50 m following Atkinson et al., 2008). While we have not plotted data with fewer than 10 hauls in any year, the symbols are in three sizes to illustrate the variability in sampling effort: smallest, 10-20; medium, 20-50; and largest, > 50 hauls per season. (b) Inter-annual variability in mean mouth area of the net and mean bottom-sampling depth of the net from the hauls in panel (a). (c) Inter-annual variability in Julian day of sampling (days from 1 October) and the percentage of night-time hauls. (d) Percentage of hauls over continental shelves of the sampling area, defined as water depth < 1000 m.

Figure 10. Basin-scale krill (a, b) and salp distribution (c, d) within two well-studied sectors of the Southern Ocean, plotted on a finer, 1° latitude by 2° longitude grid to highlight habitat differences between the two taxa.
"Survey mean" represents a record where the krill or salp density represents an arithmetic mean of a group of stations whose central position and sampling point are thus provided in the database with less accuracy then the other records.Survey means are given only when it was not possible to obtain station-specific data."Stratified haul" represents a haul, usually within the top 200 m, which forms part of a stratified series (e.g.0-50, 50-100, 100-200 m)."Stratified pooled haul" represents a record that integrates these respective stratified hauls, whereby the krill or salp densities from the component nets have been summed (in this example into an equivalent 0-200 m haul).Thus to avoid double counting, any use of the data should sift out either stratified hauls or stratified pooled hauls.
NUMBER_OF_STATIONS: For survey mean data (see RECORD_TYPE) this refers to the number of stations that have been averaged to provide the krill or salp density values.
NUMBER_OF_NETS: This refers to the number of sequentially fished nets included in the estimate (e.g. the value would be 3 for a stratified pooled haul consisting of a stratified series sampling 0-50, 50-100 and 100-200 m, and it would be 32 for a survey mean which averages 32 hauls). A LHPR haul counts as one net despite multiple gauzes being cut. This value is also 1 for a paired bongo haul (two nets fished concurrently).
LATITUDE: South is negative. Units are decimal degrees.
LONGITUDE: West is negative. Units are decimal degrees.
SEASON: This is the austral "summer" season of sampling. For example, the 1926 season spans all data from 1 October 1925 through to 30 September 1926.
DAYS_FROM_1ST_OCT: This is the day of sampling during the austral season. Therefore 1 October is DAYS_FROM_1ST_OCT = 1. The values for dates after 28 February vary depending on whether they occur during a leap year.
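The SEASON and DAYS_FROM_1ST_OCT definitions, and the advice to sift out either stratified hauls or stratified pooled hauls, can be sketched in Python as below. This is an illustration under assumptions: records are taken to be dictionaries keyed by field name, the exact spelling of the RECORD_TYPE values is assumed to match the quoted terms, and the function names are hypothetical.

```python
from datetime import date


def season_and_day(sampling_date):
    """Austral season (labelled by the year in which it ends) and day within it.

    The season runs from 1 October to 30 September, and 1 October is day 1.
    """
    if sampling_date.month >= 10:
        season = sampling_date.year + 1
    else:
        season = sampling_date.year
    days_from_1st_oct = (sampling_date - date(season - 1, 10, 1)).days + 1
    return season, days_from_1st_oct


def drop_double_counted(records, keep_pooled=True):
    """Remove either stratified hauls or stratified pooled hauls, never keeping both.

    Assumption: RECORD_TYPE holds the strings "Stratified haul" and
    "Stratified pooled haul" exactly as quoted in the field description.
    """
    unwanted = "Stratified haul" if keep_pooled else "Stratified pooled haul"
    return [r for r in records if r["RECORD_TYPE"] != unwanted]
```

Because the day count is derived from real calendar dates, the shift of one day after 28 February in leap years follows automatically: 1 March is day 152 in the 1926 season and day 153 in the 1928 season.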

Table 2. Continued.
brief name for the sampling net used. See Table 3 for more detailed descriptions of each net.

Table 3. Nets used in KRILLBASE. The nets are listed in alphabetical order.
Change in day length with time of year at various latitudes, indicating the effect of date inaccuracies on time of day adjustments made during standardisation of krill abundance. earth-syst-sci-data.net/9/193/2017/

Table 5. Derivation of day or night information.