The Theia Snow
collection routinely provides high-resolution maps of the snow-covered area from Sentinel-2 and Landsat-8 observations. The collection covers
selected areas worldwide, including the main mountain regions in western
Europe (e.g. Alps, Pyrenees) and the High Atlas in Morocco. Each product of
the Theia Snow collection contains four classes: snow, no snow, cloud and no data.
We present the algorithm to generate the snow products and provide an
evaluation of the accuracy of Sentinel-2 snow products using in situ snow depth
measurements, higher-resolution snow maps and visual control. The results
suggest that the snow is accurately detected in the Theia snow collection
and that the snow detection is more accurate than the Sen2Cor outputs (ESA
level 2 product). An issue that should be addressed in a future release is
the occurrence of false snow detection in some large clouds. The snow maps
are currently produced and freely distributed on average 5 d after the image
acquisition as raster and vector files via the Theia portal
(10.24400/329360/F7Q52MNK).
Introduction
The snow cover is an important driver of many ecological,
climatic and hydrological processes in mountain regions and in high latitude
areas. In these regions, in situ observations are generally insufficient to
characterize the high spatial variability of the snowpack properties. The
snow cover was included as one of the 50 essential climate variables
(ECVs) to be monitored by satellite remote sensing by the Global Observing
System for Climate (GCOS) in accordance with the Committee on Earth
Observation Satellites (CEOS) agencies. ECVs are intended to support the work
of the United Nations Framework Convention on Climate Change and the
Intergovernmental Panel on Climate Change.
The snow cover is not a variable sensu stricto, but an object which
can be characterized through many variables, including snow-covered area
(SCA), fractional area (fSCA), albedo, liquid water content, snow depth and
snow water equivalent . An international survey by the
Cryoland consortium (http://cryoland.enveo.at/consortium, last access:
12 April 2019) shed light on the user
requirements for snow services based on satellite remote sensing. SCA and fSCA
products were ranked as the most important by the respondents
. The survey also indicated the need for operational
products provided on a daily basis, with latency times shorter than 12 h.
High-resolution data down to 50 m resolution were sought by road and
avalanches authorities . The respondents requested
regional products, e.g. on the scale of entire mountain ranges like the Alps or
even the whole of Europe, and preferred the Universal Transverse Mercator (UTM) projection . At
the national level in France there is also a need for operational high-resolution snow-covered area maps as revealed by the recent roadmap for
satellite applications issued by the French Government Plan d'applications satellitaires 2018,. Based on a wide panel of
end users, this plan selected the monitoring of the snow-covered area in French
national parks as a one of the key actions which should be achieved using
Earth observation satellites in the near future (2018–2022).
Operational SCA maps have been generated from satellite observations since
the 1960s . Current SCA and fSCA products are
mostly derived from MODIS data
,
but their spatial resolution (1 km to 250 m) can be too coarse for various
applications, especially in mountain regions where the snow cover properties
vary on scales of 10 to 100 m . High-resolution
(30 m) snow cover maps can be generated from Landsat images but the low
temporal revisit of the Landsat mission (16 d) is an important limitation
for snow cover monitoring, especially considering that the cloud probability
can exceed 50 % in temperate mountains .
Since 2017, with the launch of Sentinel-2B, the Copernicus Sentinel-2 mission
offers the unique opportunity to map the snow cover extent at 20 m
resolution with a revisit time of 5 d (cloud permitting)
. The combination of Sentinel-2 and Landsat-8 data
provides the opportunity for even more frequent observations of the snow
cover with a global median average revisit interval of 2.9 d
.
The principles of snow detection from multispectral optical imagery is well
established since the pioneering work of with the
Landsat Thematic Mapper. Today, the challenge for scientists and end users is
rather to cope with the large amount of data that is generated by a mission
like Sentinel-2. The generation of a single snow map from a Sentinel-2
level-1C tiled product (monodate orthorectified image expressed in
top-of-atmosphere reflectance) involves downloading a
700+ Mb
zip file (once uncompressed, the product contains 12 folders and 108 files,
including 15 raster files and 13 XML metadata files). Since March 2018, ESA
began the operational processing of level 2A (L2A) products (monodate
orthorectified images expressed in surface reflectance, including a cloud
mask). Each ESA L2A product also includes a snow cover mask. However, the
size of a single L2A product before unzipping can exceed 1 Gb (15 folders,
137 files). In addition, the quality of the ESA L2A snow mask can be improved
for two reasons: (i) ESA's L2A algorithm is a general-purpose algorithm, which
was not optimized for snow detection, and (ii) because ESA's L2A processor treats
each image independently i.e. mono-date
approach,, the output cloud mask has a lower accuracy
than a cloud mask generated by a multi-temporal algorithm
.
In this article we introduce the Theia Snow collection, a new collection of
snow cover maps, which are derived from Sentinel-2 at 20 m resolution in an
operational context. The Theia Snow collection was recently upgraded to also
provide snow cover maps at 30 m resolution from Landsat-8 using the same
algorithm. The data are currently being produced and freely distributed via
the Theia land data centre.
Theia is a French inter-agency
organization designed to foster the use of Earth observation for land surface
monitoring for academics and public policy actors.
The snow retrieval
algorithm is based on the Normalized Difference Snow Index
but also uses a digital elevation model to
better constrain the snow detection. We first describe the algorithm
(Sect. ) and its implementation in the let-it-snow processor
(LIS, Sect. ). Then, we provide a detailed description of the
product characteristics (coverage, period, format, etc.;
Sect. ). In Sect. , we present an evaluation of
Theia snow products using in situ snow depth measurements, very high-resolution clear-sky snow maps and visual control. We finally discuss the
main limitations and potential applications of the Theia Snow collection.
AlgorithmScope
An algorithm was designed to determine the snow presence or absence from
Sentinel-2 and Landsat-8 observations outside areas of dense forest with
the following requirements:
It should be scalable, i.e. allow the processing of large areas (104 km2)
with a reasonable computation cost (typically less than 1 h for a single
product).
It should be robust to seasonal and spatial variability of the snow
cover and land surface properties.
It should maximize the number of pixels that are classified as snow or no snow.
It is preferable to falsely classify a pixel as cloud than falsely
classify a pixel as snow or no snow.
Input
The algorithm works with multispectral remotely sensed images, which include
at least a channel in the visible part of the spectrum and a channel near
1.5 µm (referred to as shortwave-infrared or SWIR). It takes
the following as input:
a L2A product, including
the cloud and cloud shadow mask (referred to as an “L2A cloud mask” in the following),
the green, red and SWIR bands from the flat-surface reflectance product.
These images are corrected for atmospheric and terrain slope effects
. The slope correction is important in mountain regions since
it enables us to use the same detection thresholds whatever the sun-slope
geometry.
A digital elevation model (DEM).
Pre-processing
The red and green bands are resampled with the cubic method to a pixel size
of 20 m by 20 m to match the resolution of the SWIR band. The DEM is
generated from the Shuttle Radar Topography Mission seamless DEM
by cubic spline resampling to the same 20 m
resolution grid.
Algorithm descriptionSnow detection
The snow detection is based on the Normalized Difference Snow Index
NDSI, and the reflectance in the red band. The
NDSI is defined as
NDSI=ρgreen-ρSWIRρgreen+ρSWIR,
where ρgreen (or ρSWIR) is the
slope-corrected surface reflectance in the green band (or SWIR at
1.6 µm). The NDSI expresses the fact that only snow surfaces are
very bright in the visible and very dark in the shortwave infrared. Turbid
water surfaces like some lakes or rivers may also have a high NDSI value;
hence a additional criterion on the red reflectance is used to avoid false
snow detection in these areas. A cloud-free pixel is classified as snow
if the following condition is true:
NDSI>niandρred>ri,
where ni and ri are two parameter pairs (i.e. i∈{1,2}) as
explained below. Otherwise the pixel is marked as “no snow”. The values of
the parameters are provided in Table .
Snow line elevation
The algorithm works in two passes. The snow detection
(Sect. ) is performed a first time using parameters n1
and r1, which are set in such way as to minimize the number of false snow
detections (Table ). This first pass enables us to estimate the
minimum elevation of the snow cover zs, above which a second pass
will be performed with less conservative parameters values n2 and r2
(Table ). An example of the evolution of the snow line
elevation over the tile 31TCH in the Pyrenees is shown in Fig.
and the corresponding snow maps are shown in Fig. . Based on
regional analyses of the snow line elevation variability in mountain ranges,
e.g., we assume that large
altitudinal variations of the snow line elevation are not likely on the scale
of a tile of 110 km by 110 km. Therefore, the minimum snow elevation
zs is considered uniform on the scale of the processed image.
Time series of snow and cloud cover area
distribution by elevation band over tile 31TCH (tile location in
Fig. ). The dashed red line indicates the position of
zs as determined by the LIS algorithm. The corresponding snow
maps are shown in Fig. .
Time series of snow maps over tile 31TCH (tile location in
Fig. ). The corresponding charts of snow and cloud cover area
distribution by elevation band are shown in Fig. .
Before proceeding to pass 2, the total snow fraction in the image after
pass 1 is computed. If this snow fraction is below ft
(Table ), then pass 2 is skipped, as the sample of snow
pixels is not considered statistically significant for determining the snow
line elevation. Otherwise, the pass 2 is activated. For that purpose, the DEM
is used to segment the image in elevation bands with a fixed height of dz
(Table ). Then, the fraction of the cloud-free area that is
covered by snow in each band (after pass 1) is computed. The algorithm finds
the lowest elevation band b at which the snow cover fraction is greater
than a given fraction fs (Table ). The value of
zs is the lower edge of the elevation band that is two elevation
bands below band b.
Cloud mask processing
The cloud mask in the input L2A product is conservative because (i) it is
computed at a coarser resolution and (ii) it was developed to remove surface
reflectance variations due to cloud contamination. However, the scattering of
some thin clouds is low in the SWIR, green and red bands, which are used for
snow detection (Sect. ). Hence, the human eye can see
the snow (or the absence thereof) through these semi-transparent clouds in a
colour composite. In addition, the L2A cloud mask tends to falsely classify
the edges of the snow cover as cloud. Therefore, the algorithm includes some
additional steps to recover those pixels from the L2A cloud mask and
reclassify them as snow or no snow. This step is important because it
substantially increases the number of observations as specified in
Sect. .
A pixel from the L2A cloud mask cannot be reclassified as snow or no snow if
any of these conditions are satisfied:
it is coded as “cloud shadow” in the L2A cloud mask;
it is coded as “high-altitude cloud” (or “cirrus”) in the L2A cloud mask;
it is not a “dark cloud” (see below).
The cloud shadows are excluded because the signal-to-noise ratio can be very
low in these areas. The high clouds are excluded because they can have a
similar spectral signature as the snow cover, i.e. a high reflectance in the
visible and a low reflectance in the SWIR. This type of cloud are detected in
Sentinel-2 and Landsat-8 L2A products based on the spectral band centred on
the 1.38 µm wavelength .
We select only the dark clouds as possible reclassification areas,
because the NDSI test is robust to the snow–cloud confusion in this case. The
dark clouds are defined using a threshold in the red band after
downsampling the red band by a factor rf using the bilinear method. This
resampling is applied to smooth locally anomalous pixels, following the MAJA
algorithm, which performs the cloud detection at 240 m for L2A
products . Therefore, if a (non-shadow, non-high-cloud) cloud
pixel has a red reflectance at this coarser resolution that is lower than
rD (Table ), then it is temporarily removed from
the cloud mask and proceeds to the pass 1 snow detection. The new cloud mask
at this stage is the pass 1 cloud mask in Fig. .
After passing the passes 1 and 2, some former cloud pixels (pixels that were
originally marked as cloud in the L2A cloud mask) will not be reclassified as
snow because they did not fulfil the conditions of Eq. ().
These pixels are flagged as cloud in the final snow product if they have a
reflectance in the red that is greater than rB (Table ).
Otherwise they are classified as no snow. Here, the red band at 20 m
resolution is used to allow an accurate detection of the snow-free areas
near the snow cover edges that the L2A product tend to falsely mark as cloud.
The resulting cloud mask is the pass 2 cloud mask in
Fig. .
Flow chart of the snow detection algorithm. Table
gives the description and value of the algorithm parameters (written in red
in this chart). MUSCATE is the scheduler which manages the L2A production for
Theia.
LIS algorithm parameters description and default values for Theia's
L2A Sentinel-2 (and Landsat-8) products.
ParameterDescriptionValuerfResize factor to produce the downsampled red band12 (8)rDMaximum value of the downsampled red-band reflectance used to define a dark cloud pixel0.300n1Minimum value of the NDSI for the pass 1 snow test0.400n2Minimum value of the NDSI for the pass 2 snow test0.150r1Minimum value of the red-band reflectance the pass 1 snow test0.200r2Minimum value of the red-band reflectance the pass 2 snow test0.040dzSize of elevation band in the DEM used to define zs100fsMinimum snow fraction in an elevation band to define zs0.100fctMinimum clear pixels fraction (snow and no snow) in an elevation band used to define zs0.100ftMinimum snow fraction in the image to activate the pass 2 snow test0.001rBMinimum value of the red band reflectance to return a non-snow pixel to the cloud mask0.100Implementation
The algorithm was implemented in an open-source processor called let-it-snow
(LIS). LIS is written in Python 2.7 and C++ and relies on the Orfeo
ToolBox and GDAL libraries. GDAL is used for input and output operations, image
resampling and metadata access . Orfeo Toolbox enables us to make
the computations with good performances under memory constraints (i.e. the
amount of available memory can be predefined), which is critical for
operational production . LIS takes as input a digital
elevation model and a Theia L2A product from Sentinel-2 MSI, Landsat-8 OLI,
SPOT-4 HRVIR or SPOT-5 HRG . The spatial resolution and
central wavelength of the spectral bands used by LIS for each sensor are
given in Table . It is also compatible with Landsat-8 USGS
level 2 products and Sentinel-2 ESA's Sen2Cor level 2A
products .
The parameter values were set based on previous studies with Landsat
and by visually
checking many snow maps and snow fraction histograms. From this set of a
priori parameter values, only r2 was adjusted based on the analysis of a
first batch of products to enhance the snow detection on shaded slopes.
Spatial resolution and central wavelength of the spectral bands used
by LIS for each compatible sensor.
SensorGreen bandRed bandSWIR bandSPOT-4 HRV20 m, 0.55 µm20 m, 0.65 µm20 m, 1.6 µmSPOT-5 HRG10 m, 0.55 µm10 m, 0.65 µm10 m, 1.6 µmSentinel-2 MSI10 m, 0.56 µm10 m, 0.66 µm20 m, 1.6 µmLandsat-8 OLI30 m, 0.56 µm30 m, 0.65 µm30 m, 1.6 µmData description
The Theia Snow collection data can be freely accessed using a web browser by
connecting to http://theia.cnes.fr (last access: 12 April 2019) or using this free
command-line tool: https://github.com/olivierhagolle/theia_download
(last access: 12 April 2019). The user must first create an account in Theia to have the
permission to download the data.
The Theia Snow collection is organized following the tiling system of
Sentinel-2 (Fig. ). Each tile covers a square area of 110 km by
110 km in the UTM coordinate system. There
are currently 127 tiles in the Theia Snow collection, mostly over the mountain
ranges of western continental Europe (France, Spain, Switzerland, Italy,
western Austria). The snow products are also provided for southern Quebec
in Canada, the Issyk-Kul lake catchment in Kyrgyzstan and the Kerguelen Islands.
These extra-European sites were selected to support specific ongoing projects
in relation with Theia. An opportunistic tile is produced in central Nevada,
USA, because this tile is already available as a level-2A version for calibration
purposes. The Theia Snow collection covers a number of mountain ranges with
seasonal snow cover (Table ).
Coverage of the Theia Snow collection tiles.
Non-exhaustive list of mountain ranges covered by the Theia Snow
collection and corresponding tiles. The tiles marked with an asterisk were
made available before the start of the operational production (see
Sect. ).
* These tiles
can be viewed on an interactive map on this page:
https://theia.cnes.fr/atdistrib/rocket/\#/sites (last access: 12 April 2019).
Map of the first two products in the Theia Snow collection and
comparison with MOD10A1.005 snow product of the same day. The map background
is the colourized version of the EU DEM.
The Theia Snow collection products are available for 127 tiles, starting on
December 2017. At the time of writing there were over 17 000 available
products. A set of 1300 products tagged as version 1.0 are available between
November 2015 and June 2017 for a subset of 15 tiles
(Table ). These products were produced in pre-operational
using a different configuration of LIS, which underestimated the snow-covered
area on shaded slopes. All other products were generated with the same
configuration as presented in this article. The products after December 2017
can still have a different version tag because of changing versions in the
upstream L2A product. However, the different L2A versions have a low impact
on the snow products. It is expected that the cloud mask has improved
gradually with time, due to algorithms enhancements in the successive L2A
versions and also because of the increasing revisit frequency of the
Sentinel-2 mission (the full 5 d revisit became nominal above all land
masses in March 2018). The different versions of the Theia Snow collection are
listed and updated on this page:
http://www.cesbio.ups-tlse.fr/multitemp/?page_id=13180 (last access: 12 April 2019).
The snow products are currently routinely generated at CNES using the MUSCATE
scheduler, which also manages the L2A production for Theia. MUSCATE performs
a series of operations in a high-performance computing environment: (i) download
the level 1C product as soon it is available from the French mirror of the
Sentinel products (PEPS, Plateforme d'Exploitation des Produits Sentinel),
(ii) process and distribute the L2A product and (iii) process and distribute the
snow product. This workflow is optimized to reduce the lag time between the
acquisition and the distribution of the products. As an example, in April
2018, nearly 80 % of the snow products were made available online 5 d or
less after the date of the Sentinel-2 acquisition (Fig. ). It
is expected that this time lag should reach a median value of 2 d thanks to
the increasing performance of MUSCATE.
Cumulative probability of the time lags between the Sentinel-2
acquisition and the distribution date of the corresponding snow product,
computed for all snow products with acquisition date in April
2018.
Each snow product is distributed as a zipped file which contains raster and
vector files. The file base name is the product ID, i.e.
productId.zip, where the productId is a character string
which is constructed as follows:
productId = Satellite_AcquisitionDate__L2B-SNOW_TileName_D_Version.
For example, the eastern snow product in Fig. was extracted
from a file named
SENTINEL2A_20151130-105641-486_L2B--SNOW_T31TDH_D_V1-0.zip,
which indicates that this product was generated from a Sentinel-2A image
acquired on 30 November 2015 at 10:56:41.486 UTC and covers tile 31TDH. The
version of the product is 1.0 (see above).
The size of each zipped product varies depending on the complexity of the
snow and cloud mask but generally ranges between 10 and 100 Mb. After
extracting the product, the following files are created:
productID_SNW_R2.tif is an 8 bit single-band GeoTiff image which
provides the following classification for each pixel:
0: no snow,
100: snow,
205: cloud including cloud shadow,
254: no data.
productID_SNW_R2.{shp,shx,dbf,prj} is a vector file in the ESRI
shapefile format, which contains a vectorized version of the
productID_SNW_R2.tif using the polygon geometry.
productID_CMP_R2.tif is an 8 bit RGB GeoTiff image which shows
the outlines of the snow mask (green lines) and cloud mask (magenta lines) on
a false-colour composite of the input L2A image (RGB image made with
SWIR, red and green bands). This composition was chosen because RGB composites using the
SWIR-band image are useful for discriminating the snow cover from the snow-free
areas and from the clouds . Hence, this image mainly
allows the user to visually check the consistency of the snow and cloud mask.
productID_MTD_ALL.xml is a metadata file.
productID_QKL_ALL.jpg is a quick picture of the productID_SNW_R2.tif
image made using this colour map (8 bit RGB code in parentheses): snow in cyan
(0,255,255), cloud in white (255,255,255), no snow in grey (119,119,119) and
no data in black (0,0,0) (see an example in Fig. ).
DATA/productID_HIS_R2.txt (since product version 1.4) is a
text file indicating the cloud, snow and no-snow fraction area by elevation
bins (Sect. ).
MASKS/productID_EXS_R2.tif (in product version 1.0, it is
located in the root of the zipped file) is an 8 bit GeoTiff image for expert
evaluation, where each bit indicates the value of an intermediate computation
mask (Sect. ):
bit 1: snow (pass 1),
bit 2: snow (pass 2),
bit 3: clouds (pass 1),
bit 4: clouds (pass 2),
bit 5: clouds (initial all cloud).
For most applications, only files with the SNW suffix should be
useful.
The shapefile and raster images are referenced in WGS-84 UTM system with the
zone number given by the first two digits of the tile name. All raster files
have a 20 m resolution. Note that no-snow pixels can be any surface,
including water surface. No data pixels are the pixels outside of the
acquisition segment.
Example of a snow map (SNW file) and its corresponding
colour composite (CMP file). In the snow map the snow is represented
in cyan, no snow in grey and cloud in white. In the RGB composite, the snow-covered areas are delineated in magenta, while the clouds are delineated in
green (Theia snow product of tile 32TNS on 27 May 2017).
Evaluation
In this section we present an evaluation of the Theia Snow collection. We first
present the methods developed to do this evaluation and then the results. The
evaluation was based only on Sentinel-2 products, because the production for
Landsat-8 had just started at the time of writing and thus the Landsat-8
products represented a very minor fraction of the Theia Snow collection.
MethodComparison with in situ snow depth measurements
We collected all available snow depth measurements within tiles 31TGM, 31TGL,
31TGK, 32TLS, 32TLR, 32TLQ, 31TDH, 31TCH, 30TYN and 30TNX from the
Météo-France database between 1 September 2017 and 31 August 2018.
These 10 tiles cover most of the French Alps and Pyrenees
(Fig. ). We obtained 120 snow depth time series with at least
one measurement. We gathered all available Sentinel-2 snow products for these
tiles over the same period, i.e. a total of 1134 products. Then, we extracted
the pixel values at the location of each snow measurement station for all
dates. When a station is located in two tiles we only kept the data from the
first tile. We selected the snow depth measurements which were collected on
the same day of a Theia snow product and discarded the measurements
corresponding to a cloud detection in the Theia snow product. The snow depth
measurements were converted to snow presence and absence using a threshold of
SD0=0 m. This eventually allowed us to compute a confusion matrix
between a set of snow presence and absence data from in situ measurement and
a set of simultaneous snow presence and absence data from the Theia Snow
collection across the French Alps and Pyrenees. However, previous studies
comparing MODIS binary snow products with ground measurements showed that a
value of 0 m may not be optimal ; therefore the
sensitivity of the results to SD0 was tested by recomputing the confusion
matrix for 1 cm increments of SD0 from 0 to 1 m.
Map showing the location of the stations in the réseau nivo-météorologique (Météo-France snow observation network) used for this study (source:
Météo-France) and the corresponding Sentinel-2
tiles.
Comparison with snow maps of higher spatial resolution
We used SPOT-6 and SPOT-7 images with a resolution of 1.5 m in panchromatic
and 6 m in multispectral (blue, green, red, near-infrared) to evaluate the
accuracy of the snow detection in Theia snow products. For this comparison,
the LIS processor was run with its current operational configuration (i.e.
the configuration used to routinely produce the Theia snow products since
December 2017 Table ). SPOT-6 and SPOT-7 sensors acquire
images with a large radiometric depth coded in 16 bits, which enables us to
identify surface features in alpine regions with dark shaded slopes and
bright snow surfaces. We searched the catalogue of the Kalideos database
for available SPOT images that could match a cloud-free (or nearly
cloud-free) Theia snow map over the French Alps region
(https://alpes.kalideos.fr, last access: 12 April 2019). We identified six pairs of
images in 2016 and 2017 with a maximum time lag of 6 d
(Table , Fig. ). The SPOT images were obtained as
orthorectified products from the Kalideos database. The SPOT images were used
to generate reference snow maps by a pixel-based supervised classification.
For each SPOT image, we manually drew about 15 polygons of homogeneous
of snow and no-snow surfaces using the panchromatic image. The few cloud
pixels in the SPOT images were also manually delineated with a large buffer
to restrict the classification to strictly cloud-free areas. Colour composites
made from the multispectral images were also used as a visual support to help
the snow and cloud classification. These polygons were then used to extract
training samples from the SPOT products. The samples were taken from both the
panchromatic and the multispectral images. We tested a number of
classification algorithms by splitting the polygons in calibration and
validation data sets. The detail of this analysis is not shown here, but it
allowed us to conclude that the random forest classifier was the best choice
for our purpose, given its good accuracy and its numerical efficiency
.
In addition, we generated the snow maps from the same Sentinel-2 level 1
products using ESA's Sen2Cor version 2.5 . The
Sen2Cor output snow masks are expected to be nearly identical to the ones
included in the distributed L2A products by ESA, since the same processor is
used. We have generated the L2A products ourselves with Sen2Cor for this
study, because the L2A products were not yet available when we made this
analysis (ESA started operational production of Sentinel-2 data at level 2A
in 2017). The Theia and Sen2Cor snow maps were compared to the reference
snow maps from SPOT images using standard metrics derived from the confusion
matrix (accuracy, F1 score, kappa, false-detection rate and false-negative
rate).
Pairs of Sentinel-2 and SPOT-6/7 products used for the evaluation of
the snow detection accuracy (see also Fig. ).
Sentinel-2 product Reference product TileDateSensorDate31TGK13 Aug 2016SPOT-68 Aug 201632TLS12 Oct 2016SPOT-712 Oct 201631TGK11 Dec 2016SPOT-717 Dec 201631TGL1 Dec 2016SPOT-73 Dec 201631TGK11 Mar 2017SPOT-611 Mar 201731TGL11 Mar 2017SPOT-611 Mar 2017
Location of the five SPOT-6/7 products used for the evaluation of
the snow detection accuracy and the corresponding Sentinel-2 tiles (see also
Table ).
Evaluation metrics
From the confusion matrices of Sect.
and , we derived the following metrics: accuracy (the
proportion of the total number of predictions that were correct), false-positive rate (FPR, i.e. the proportion of no-snow pixels that were
incorrectly classified as snow), false-negative rate (FNR, i.e. the
proportion of snow pixels that were incorrectly classified as no snow),
F1 score (harmonic average of the precision and recall) and kappa coefficient
.
Visual verification
The evaluation methods presented above undersample the actual resolution of
the Theia snow collection, both in space (only 120 points of comparisons in
Sect. ) and time (only six dates of comparisons in
Sect. ). Given that we do not have a more extensive
validation data set, we used a time series of 64 Theia snow products from
6 July 2015 to 2 July 2017 over tile 31TCH to control the consistency of the
snow and cloud masks based on the visual inspection of the false-colour
composites. This approach is efficient for evaluating the frequency of gross
errors on the tile scale, i.e. large patches of false snow or false no-snow
detection .
ResultsComparison with in situ snow depth measurements
The confusion matrix between the Theia snow products and in situ snow depth
measurements is given in Table . We find 1414 pairs of data
for the study period (i.e. there are 1414 individual snow depth measurements
for which a cloud-free retrieval can be found in the Theia Snow collection on
the same day). From this confusion matrix, the accuracy (proportion of
correct classifications) is 94 % and the kappa coefficient is 0.83, which
indicates excellent agreement according to .
The false-positive rate (2.8 %) is lower than the false-negative rate
(6.7 %), which means that the Theia snow product is more likely to
underestimate than to overestimate the snow detection at the station
locations. The false-negative rate decreases if SD0 is set to a higher
value; however the false-positive rate also increases in such a way that an
optimum is reached at SD0=2 cm (Fig. ).
The comparisons of each individual time series before matching the data by
date is shown in Appendix . These plots illustrate
that the high revisit time of Sentinel-2 enables us to capture the seasonal
cycle of the snow cover well. Even intermediate melt-out events at lower elevation
stations can be identified over the course of the snow season.
Confusion matrix between the Theia snow products and in situ snow
depth data (SD).
Theia snow products In situ snow depthno snowsnowSD =02768SD >0761054
Sensitivity of the agreement between the in situ and Theia snow product
to SD0 in metres (threshold to convert the measured snow depth to snow presence
or absence). The inset shows a close-up of the region of
0–0.2 m.
Results of the validation of Theia (LIS 1.2.1) and ESA (Sen2Cor
2.5) snow products using SPOT-6/7 reference snow maps.
Sentinel-2 product Accuracy F1 Kappa FPR FNR TileDateSen2CorLISSen2CorLISSen2CorLISSen2CorLISSen2CorLIS31TGK11 Dec 20160.830.920.730.910.610.840.410.040.050.1431TGL11 Mar 20170.890.900.920.930.740.760.090.020.060.1131TGL1 Dec 20160.840.950.780.950.660.900.350.050.010.0631TGK11 Mar 20170.850.890.890.920.650.720.150.010.070.1331TGK13 Aug 20160.980.990.350.770.340.770.780.240.220.2232TLS12 Oct 20160.980.970.630.640.620.620.490.080.170.51Comparison with snow maps of higher spatial resolution
Table shows that the MAJA–LIS workflow provides a better
detection of the snow cover than the Sen2Cor processor in all the studied
cases. Although the improvement in the accuracy coefficient is low in some
cases (31TGL/2017-03-11, 31TGK/2016-08-13) or even slightly negative in one
case (32TLS/2016-10-12), the F1 score and kappa coefficient of the Theia snow
products are always greater than or equal to those of Sen2Cor. The
differences are significant for four products among the six tested. The lower
accuracy in the case of the 32TLS/2016-10-12 product is due to a higher
false-negative rate. The false-positive rates in the Theia snow products are
generally lower than the false-negative rates, which is consistent with the
previous evaluation using in situ data. The spatial analysis of the errors
indicates that the false-positives (i.e. overdetection of snow pixels) in
the Theia products are mostly located near the snow cover edges
(Fig. ). This suggests that these errors might not be due to a
false detection of snow (e.g. confusion with a lake or a cloud surface) but
are rather due to the resolution discrepancy with the reference data. By
contrast, the spatial comparison of the Sen2Cor products with the reference
data shows large patches of false negatives (omission of snow pixels) in the
snow-covered area, which are typically due to the omission of snow pixels in
shaded areas (Fig. ).
Visual verification
We found 4 products among 64 with significant patches of false-positive
pixels (i.e. pixels falsely classified as snow). These false snow areas are
exclusively located in cloud areas (Fig. ). False-negative pixels (i.e. pixels falsely classified as snow-free) can be found
by looking at higher-resolution data along the snow cover edges, but we did not
find large areas of false-negative pixels.
Full time series of Theia snow products from 6 July 2015 to 2 July
2017 over tile 31TCH. Four products with visually evident false snow
detections are framed in red and shown at higher resolution. Red arrows
indicate the location of the false snow patches, while the green arrow
indicates a true snow patch. The product of 16 April 2017 is shown as a
reference to help the visualization of the errors.
Discussion
The evaluation of the Theia Snow collection using both in situ and remote-sensing
reference data sets indicates an excellent accuracy of the snow detection,
albeit with a tendency to snow underdetection. The omission of snow pixels
can be due to the presence of trees or shadows. In the case of the in situ
comparison, part of the errors may be explained by the uncertainty in the
geolocalization of the Sentinel-2 data, which is close to 11 m at 95 %
for both satellites according to the latest Sentinel-2 L1C data quality
report . In addition, in situ measurements may not
be representative of the snow conditions within the Sentinel-2 pixel.
However, the sensitivity analysis of the accuracy and kappa coefficient to
the SD0 value suggests that the discrepancy between the scale of the in
situ observations and the scale of the pixel observation is not significant,
since the optimal SD0 is close to 0 m. In contrast, a similar analysis
with lower-resolution snow products from MODIS found an optimal SD0 of
0.15 m . This illustrates how high-resolution snow
products from Sentinel-2 enable us to partly resolve the typical scale issue
between in situ and remote-sensing products .
The visual inspection of the products reveals that the main issue in some
Theia snow products is rather the occurrence of false positives due to the
confusion with cloud surfaces. This problem can be due to two main factors:
Cold clouds: cold clouds have a spectral signature that is close to the
snow cover since they contain ice crystals (high NDSI). If these clouds are
not accurately detected by the level 2A processor then the LIS algorithm will
also classify them as snow after pass 1 (Sect. ). As a result
the snow line elevation zs is not well estimated, which can
generate erroneous snow patches within these clouds, even at low elevation
(e.g. product of 20 July 2017 in Fig. ).
Shaded clouds: the three-dimensional structure of the clouds can form
shadows within the cloud cover area (e.g. product of 4 March 2017 in
Fig. ). These shaded cloud areas have a lower
reflectance in the visible wavelengths and therefore can be considered as
dark clouds in the LIS algorithm, provided that the shaded cloud cover
area is large enough to significantly reduce the reflectance in the red band,
even after the downsampling step (Sect. ).
Among 64 products, we find 4 cases with significant false snow cover
detection in this tile (Fig. ). Although it is not
shown here, we have made this exercise over other tiles in the Alps and Atlas
mountains, and we estimate that this proportion is similar in these areas.
Regarding this issue we mainly rely on our visual judgment using colour
composites to assess the ability of the algorithm to discriminate the cloud
and the snow cover. However, to further quantify the algorithm performance
and eventually optimize its parameters, we would need a database of
classified imagery with labelled regions of cloud and snow cover in many
different cases.
Both Sen2Cor and LIS use the NDSI to detect the snow cover; however the
algorithms differ in many aspects. The better performances of LIS in
Sect. can be related to the mono-date approach for
atmospheric correction and cloud detection in Sen2Cor, which can cause
snow–cloud confusion in the L2A product, while LIS uses L2A products from the
more accurate multi-temporal MAJA processor . In
addition, a unique feature of the LIS algorithm is the use of the snow line
elevation concept to improve the robustness of the snow detection in mountain
regions. In addition, LIS was designed to retrieve snow below thin clouds and
to reclassify snow pixels which are frequently found near the snow cover
edges in L2A products. Apart from the MAJA processing, however, the effect of
these algorithmic differences could be less evident in flat areas.
Code and data availability
The Theia snow collection can be accessed and cited
using this digital object identifier: 10.24400/329360/F7Q52MNK. The let-it-snow source code is available under the
GNU Affero General Public License v3.0 in this repository: https://gitlab.orfeo-toolbox.org/remote_modules/let-it-snow/ (last access: 12 April 2019).
Conclusions
The Theia Snow collection is a free collection
of snow products which indicate the presence or absence of snow on the land
based on Sentinel-2 (20 m) and Landsat-8 (30 m) data. Most of the snow
products are derived from Sentinel-2. However, in late September 2018,
Landsat-8 snow products started being routinely added to the Theia Snow
collection for the titles in France only. Previous Landsat-8 data were also reprocessed back to March 2017.
At the time of writing (15 March 2019), for example in tile T31TCH in the
Pyrenees, 20.5 % of all available products were derived from Landsat-8
data (63 among 244). In addition, the processing is done to facilitate the
combination of Landsat-8 and Sentinel-2 snow products, since the data format
and the tiling scheme are the same (only the pixel size differs). Although
the co-registration of Sentinel-2 and Landsat-8 data can be still problematic
, the combination of “analysis ready” Landsat-8 and
Sentinel-2 is important for increasing the number of cloud-free observations.
The evaluation presented in this article indicate that the Theia Snow product
allows an accurate detection of the snow cover (and of the absence of snow).
However, it remains a preliminary assessment that should be extended in the
future. For example, the evaluation was based only on Sentinel-2 products,
since the integration of Landsat-8 data started during the writing of this
paper and should be extended to Landsat-8 products. However, the
processing is done with the same algorithm, which is based on the calculation
of the NDSI from slope-corrected surface reflectance (level 2A data). In
addition, the evaluation was limited to the French Alps and French Pyrenees,
while the collection covers other mountain regions. Last, the spatial
evaluation using higher-resolution remote-sensing data was focused on the
accuracy of the snow detection using very high-resolution clear-sky images,
while, as discussed above, the main difficulty is not to detect the snow when
the sky is cloud-free but rather to avoid false snow detection within
clouds. The reduction of the false-positive rate is the main challenge for
the future developments in the LIS algorithm. In the meantime, we also
welcome feedback from other users.
The Theia Snow collection is based on optical observations; therefore it is
not adapted to the detection of the snow cover in dense forest areas where
the ground is obstructed by the canopy . This may
typically occur in evergreen conifer forests of alpine regions (e.g. Alps,
Pyrenees). We have noted that the snow detection can be successful even in
alpine forests, but we lack data that can be used for quantifying its accuracy.
Therefore it is recommended to use a land cover map to exclude these forest
areas from the analysis. For the tiles in France, this can be done using the
Theia land cover map , which is distributed at
the same spatial resolution. In the Pyrenees, the mean altitude of
zero-degree isotherm in winter is close to the treeline elevation at 1600 m;
therefore the impact of the forest cover is limited . The
LIS algorithm may also fail to detect the snow cover on steep, shaded slopes
if the solar elevation is very low (typically below 20∘). This can
occur in midlatitude areas in winter. In this case, the slope correction in
the L2A product is generally not applied. This is indicated in the L2A mask
but not in the snow product. A future release of the Theia Snow collection should
include this information. However, the visual inspection of many images in
regions of complex topography gives us the impression that the snow detection
algorithm already performs well on shaded slopes. We are planning to
investigate this issue further using alternative approaches
.
In spite of all these limitations, the Theia snow products have already been
successfully used for the evaluation of the MODIS snow products
and the assimilation of snow-covered area data
into a snowpack model in the High Atlas . Given
that the snow cover is a key driver of many natural processes in mountains
regions, we envision various potential applications of the Theia snow
products, including the modelling of the distribution of the permafrost in
mountain regions, the validation and calibration of hydrological models in
snow-dominated catchments and the spatial modelling of biodiversity and
productivity of ecosystems in mountain regions.
Spatial comparison between Theia and Sen2Cor snow products with SPOT-6/7 reference data
Time series of Theia snow products and snow depth records at 120 snow
observation stations from 1 September 2017 to 31 August
2018
Magenta dots: snow presence (a) or absence (b) in
the Theia snow collection. Black line: snow depth time series.
Magenta dots: snow presence (a) or absence (b) in
the Theia snow collection. Black line: snow depth time series.
Magenta dots: snow presence (a) or absence (b) in
the Theia snow collection. Black line: snow depth time series.
Magenta dots: snow presence (a) or absence (b) in
the Theia snow collection. Black line: snow depth time series.
Author contributions
SG and OH defined the LIS algorithm. MG and GS worked on
the implementation and development of the LIS processor. MG, MB and SG
worked on the validation of the snow products. SG wrote the manuscript with
inputs from all the co-authors.
Competing interests
The authors declare that they have no conflict of
interest.
Acknowledgements
The development and production of the Theia Snow collection is supported by
the CNES. We thank Tristan Klempka for his contribution to earlier
developments of the LIS processor. We thank Marc Leroy, Arnaud Sellé and
Philippe Pacholczyk for the coordination of the Theia project and Joelle
Donadieu and Céline l'Helguen for the transfer to the production at MUSCATE.
This manuscript was greatly improved thanks to the constructive
comments of Gaia Piazzi, Javier Herrero, Jeff Dozier and an anonymous
referee.
Review statement
This paper was edited by Alexander Kokhanovsky and reviewed
by Gaia Piazzi, Javier Herrero, Jeff Dozier, and one anonymous referee.
ReferencesBaba, M. W., Gascoin, S., and Hanich, L.: Assimilation of Sentinel-2 Data
into a Snowpack Model in the High Atlas of Morocco, Remote Sensing, 10, 1982,
10.3390/rs10121982, 2018.Baetens, L., Desjardins, C., and Hagolle, O.: Validation of Copernicus
Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors
Using Reference Cloud Masks Generated with a Supervised Active Learning
Procedure, Remote Sensing, 11, 433, 10.3390/rs11040433, 2019.
Blöschl, G.: Scaling issues in snow hydrology, Hydrol. Process., 13,
2149–2175, 1999.Bouchet, M.: Validation et amélioration des produits Surfaces
Enneigées Sentinel-2, Zenodo, 10.5281/zenodo.1446460, 2018.
Cohen, J.: A coefficient of agreement for nominal scales, Educ. Psychol.
Meas., 20, 37–46, 1960.Commissariat général au développement durable: Plan d'applications
satellitaires 2018, available at: https://www.ecologique-solidaire.gouv.fr/sites/default/files/Plan%
20d%
E2%
80%
99applications%
20satellitaires%
202018.pdf
(last access: 12 April 2019), 2018.
Dozier, J.: Spectral signature of alpine snow cover from the Landsat Thematic
Mapper, Remote Sens. Environ., 28, 9–22, 1989.Drusch, M., Bello, U. D., Carlier, S., Colin, O., Fernandez, V., Gascon, F.,
Hoersch, B., Isola, C., Laberinti, P., Martimort, P., Meygret, A., Spoto, F.,
Sy, O., Marchese, F., and Bargellini, P.: Sentinel-2: ESA's Optical
High-Resolution Mission for GMES Operational Services, Remote Sens. Environ.,
120, 25–36, 10.1016/j.rse.2011.11.026, 2012.
Fleiss, J. L., Levin, B., and Paik, M. C.: Statistical methods for rates and
proportions, John Wiley & Sons, 2013.Frei, A., Tedesco, M., Lee, S., Foster, J., Hall, D., Kelly, R., and
Robinson, D. A.: A review of global satellite-derived snow products, Adv.
Space Res., 50, 1007–1029, 10.1016/j.asr.2011.12.021, 2012.Gao, B.-C., Goetz, A. F., and Wiscombe, W. J.: Cirrus cloud detection from
airborne imaging spectrometer data using the 1.38 µm water vapor
band, Geophys. Res. Lett., 20, 301–304, 1993.Gascoin, S., Hagolle, O., Huc, M., Jarlan, L., Dejoux, J.-F., Szczypta, C.,
Marti, R., and Sánchez, R.: A snow cover climatology for the Pyrenees
from MODIS snow products, Hydrol. Earth Syst. Sci., 19, 2337–2351,
10.5194/hess-19-2337-2015, 2015.Gascoin, S., Grizonnet, M., Klempka, T., and Salgues, G.: Algorithm
theoretical basis documentation for an operational snow cover product from
Sentinel-2 and Landsat-8 data (Let-it-snow), 10.5281/zenodo.1414452,
2018.Gascoin, S., Grizonnet, M., Hagolle, O., and Salgues, G.: Theia Snow
collection, CNES for THEIA Land data center, 10.24400/329360/f7q52mnk,
2018.GDAL/OGR contributors: GDAL/OGR Geospatial Data Abstraction software
Library, Open Source Geospatial Foundation, http://gdal.org (last
access: 12 April 2019), 2018.Grizonnet, M., Michel, J., Poughon, V., Inglada, J., Savinaud, M., and
Cresson, R.: Orfeo ToolBox: open source processing of remote sensing images,
Open Geospatial Data, Software and Standards, 2, 15,
10.1186/s40965-017-0031-6, 2017.Hagolle, O., Huc, M., Pascual, D. V., and Dedieu, G.: A multi-temporal method
for cloud detection, applied to FORMOSAT-2, VENμS, LANDSAT and SENTINEL-2
images, Remote Sens. Environ., 114, 1747–1755, 2010.Hagolle, O., Huc, M., Desjardins, C., Auer, S., and Richter, R.: MAJA
Algorithm Theoretical Basis Document, 10.5281/zenodo.1209633, 2017.
Hall, D., Riggs, G., Salomonson, V., DiGirolamo, N., and Bayr, K.: MODIS
snow-cover products, Remote Sens. Environ., 83, 181–194, 2002.Inglada, J., Vincent, A., Arias, M., Tardy, B., Morin, D., and Rodes, I.:
Operational High Resolution Land Cover Map Production at the Country Scale
Using Satellite Image Time Series, Remote Sensing, 9, 95,
10.3390/rs9010095, 2017.
Jarvis, A., Guevara, E., Reuter, H. I., and Nelson, A. D.: Hole-filled SRTM for the globe: version 4:
data grid, CGIAR Consortium for Spatial Information, Washington D.C., USA, 2008.Klein, A. and Barnett, A.: Validation of daily MODIS snow cover maps of the
Upper Rio Grande River Basin for the 2000-2001 snow year, Remote Sens.
Environ., 86, 162–176, 10.1016/S0034-4257(03)00097-X, 2003.
Krajčí, P., Holko, L., and Parajka, J.: Variability of snow line
elevation, snow cover area and depletion in the main Slovak basins in winters
2001–2014, J. Hydrol. Hydromech., 64, 12–22, 2016.Li, J. and Roy, D. P.: A Global Analysis of Sentinel-2A, Sentinel-2B and
Landsat-8 Data Revisit Intervals and Implications for Terrestrial Monitoring,
Remote Sensing, 9, 902, 10.3390/rs9090902, 2017.
Louis, J., Debaecker, V., Pflug, B., Main-Korn, M., Bieniarz, J.,
Mueller-Wilm, U., Cadau, E., and Gascon, F.: Sentinel-2 Sen2Cor: L2A
Processor for Users, in: Proc. 'Living Planet Symposium 2016', Prague, Czech
Republic, 9–13 May 2016, ESA SP-740, 9–13, 2016.Malnes, E., Buanes, A., Nagler, T., Bippus, G., Gustafsson, D., Schiller, C.,
Metsämäki, S., Pulliainen, J., Luojus, K., Larsen, H. E., Solberg,
R., Diamandi, A., and Wiesmann, A.: User requirements for the snow and land
ice services – CryoLand, The Cryosphere, 9, 1191–1202,
10.5194/tc-9-1191-2015, 2015.Masson, T., Dumont, M., Mura, M., Sirguey, P., Gascoin, S., Dedieu, J.-P.,
and Chanussot, J.: An Assessment of Existing Methodologies to Retrieve Snow
Cover Fraction from MODIS Data, Remote Sensing, 10, 619,
10.3390/rs10040619, 2018.
Metsämäki, S., Pulliainen, J., Salminen, M., Luojus, K., Wiesmann,
A., Solberg, R., Böttcher, K., Hiltunen, M., and Ripper, E.: Introduction
to GlobSnow Snow Extent products with considerations for accuracy assessment,
Remote Sens. Environ., 156, 96–108, 2015.
MPC Team: Sentinel-2 L1C Data Quality Report, European Space Agency, Tech.
Rep. 31, 80 pp., 2018.
Painter, T. H., Rittger, K., McKenzie, C., Slaughter, P., Davis, R. E., and
Dozier, J.: Retrieval of subpixel snow covered area, grain size, and albedo
from MODIS, Remote Sens. Environ., 113, 868–879, 2009.Parajka, J. and Blöschl, G.: Spatio-temporal combination of MODIS
images–potential for snow cover mapping, Water Resour. Res., 44, W03406,
10.1029/2007WR006204, 2008.Ramsay, B. H.: The interactive multisensor snow and ice mapping system,
Hydrol. Process., 12, 1537–1546, 1998.
Sirguey, P., Mathieu, R., Arnaud, Y., Khan, M., and Chanussot, J.: Improving
MODIS Spatial Resolution for Snow Mapping Using Wavelet Fusion and ARSIS
Concept, IEEE Geosci. Remote S., 5, 78–82, 10.1109/LGRS.2007.908884,
2008.
Sirguey, P., Mathieu, R., and Arnaud, Y.: Subpixel monitoring of the seasonal
snow cover with MODIS at 250 m spatial resolution in the Southern Alps of
New Zealand: Methodology and accuracy assessment, Remote Sens. Environ., 113,
160–181, 2009.Storey, J., Roy, D. P., Masek, J., Gascon, F., Dwyer, J., and Choate, M.: A
note on the temporary misregistration of Landsat-8 Operational Land Imager
(OLI) and Sentinel-2 Multi Spectral Instrument (MSI) imagery, Remote Sens.
Environ., 186, 121–122, 10.1016/j.rse.2016.08.025, 2016.U.S. Geological Survey Earth Resources Observation And Science Center:
Landsat OLI Level-2 Surface Reflectance (SR) Science Product,
10.5066/f78s4mzj, 2014.Vidot, J., Bellec, B., Dumont, M., and Brunel, P.: A daytime VIIRS RGB pseudo
composite for snow detection, Remote Sens. Environ., 196, 134–139,
10.1016/j.rse.2017.04.028, 2017.Xin, Q., Woodcock, C. E., Liu, J., Tan, B., Melloh, R. A., and Davis, R. E.:
View angle effects on MODIS snow mapping in forests, Remote Sensing of
Environment, 118, 50–59, 10.1016/j.rse.2011.10.029, 2012.
Zhu, Z., Wang, S., and Woodcock, C. E.: Improvement and expansion of the
Fmask algorithm: Cloud, cloud shadow, and snow detection for Landsats 4–7,
8, and Sentinel 2 images, Remote Sens. Environ., 159, 269–277, 2015.