Studies of glaciers generally require precise glacier outlines. Where these are not available, extensive manual digitization in a geographic information system (GIS) must be performed, as current algorithms struggle to delineate glacier areas with debris cover or other irregular spectral profiles. Although several approaches have improved upon spectral band ratio delineation of glacier areas, none have entered wide use due to complexity or computational intensity.
In this study, we present a glacier mapping algorithm and apply it in Central
Asia to delineate both clean glacier ice and debris-covered glacier
tongues. The algorithm is built around the unique velocity and topographic
characteristics of glaciers and further leverages spectral and spatial
relationship data. We found that the algorithm misclassifies between 2 and
10 % of glacier areas, as compared to a
The algorithm does not completely solve the difficulties inherent in classifying glacier areas from remotely sensed imagery but does represent a significant improvement over purely spectral-based classification schemes, such as the band ratio of Landsat 7 bands three and five or the normalized difference snow index. The main caveats of the algorithm are (1) classification errors at an individual glacier level, (2) reliance on manual intervention to separate connected glacier areas, and (3) dependence on fidelity of the input Landsat data.
This study focuses on mapping glaciers over a large spatial scale using
publicly available remotely sensed data. Several high-resolution glacier
outline databases have been produced, most notably the Global Land Ice
Measurements from Space (GLIMS) project
Greater study area of the Tien Shan, showing SRTM v4.1
topography
Several methods have been developed to delineate clean glacier ice
In this study we analyze the results of our classification algorithm using a
suite of 40 Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+),
and Operational Land Imager (OLI) images (1998–2013) across a spatially and
topographically diverse set of study sites comprising eight Landsat
footprints (Path/Row combinations: 144/30, 145/30, 147/31, 148/31, 149/31,
151/33, 152/32, 153/33) along a
The study area contains a wide range of glacier types and elevations, with
both small and clean-ice-dominated glaciers, as well as large, low-slope, and
debris-covered glaciers. The diversity in glacier types in the region
provides an ideal test area – particularly in mapping glaciers with long and
irregular debris tongues, such as the Inylchek and Tomur glaciers in the
central Tien Shan
The wintertime climate of the study area is controlled by both the winter
westerly disturbances and the Siberian High, which dominate regional
circulation and create strong precipitation gradients throughout the range,
which extends from Uzbekistan in the west through China in the east
(Fig.
Our glacier mapping algorithm is based on several data sets. The Landsat 5
(TM), 7 (ETM+), and 8 (OLI) platforms were chosen as the primary spectral
data sources, as they provide spatially and temporally extensive coverage of
the study area (Table 1). ASTER can also be used as a source of spectral
information, but here we chose to focus on the larger footprint and longer
time series available through the Landsat archive. In addition to spectral
data, the 2000 Shuttle Radar Topography Mission V4.1 (SRTM) digital elevation
model (DEM) (
Landsat acquisition dates used in this study, organized by WRS2 Path/Row combination. Bold dates indicate images used for velocity profiles.
Our glacier classification algorithm uses several sequential thresholding
steps to delineate glacier outlines. The scripts used in this study are
available in the data repository, with updates posted to
Data pre-processing:
(1a) Velocity fields are calculated with normalized image cross-correlation (manual, can be automated).
(1b) The HydroSHEDS river network is rasterized (manual, can be automated).
(1c) Optional manual debris points are created (manual, optional).
(1d) SRTM data is used to create a hillslope image (Python script).
(1e) All input data sets are matched to a single extent and spatial resolution (30 m) (Python script).
Glacier classification steps:
(2a) Clean-ice glacier outlines are created using Landsat bands 1, 3, and 5 (Matlab script).
(2b) “Potential debris areas” are generated from low-slope areas (Matlab script).
(2c) Low-elevation areas are removed (Matlab script).
(2d) Low-velocity areas are removed (Matlab script).
(2e) Distance-weighting metrics are used to remove areas distant from river networks or clean glacier ice (Matlab script).
(2f) Distance-weighting metrics are used to remove areas very distant from clean glacier ice and manual seed points (Matlab script).
(2g) The resulting glacier outlines are cleaned with statistical filtering (Matlab script).
Post-processing:
(3a) Glacier outlines are exported to ESRI shapefile format for use in a GIS (Python script).
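The sequential thresholding in the classification steps above can be sketched as a chain of boolean raster masks. The following Python sketch is illustrative only and is not the Matlab code used in this study: the function name `classify_sketch`, the use of a plain TM3/TM5 ratio for clean ice (the study also uses TM1), and every threshold value are assumptions chosen for a toy example. The distance-weighting and statistical filtering steps are omitted.

```python
import numpy as np

def classify_sketch(tm3, tm5, slope_deg, elev_m, vel_m_yr, *,
                    ratio_min, slope_max, elev_min, vel_min):
    """Chain boolean masks in the order of the classification steps.

    All thresholds are hypothetical parameters, not values from the study.
    """
    # Clean ice from a TM3/TM5 band ratio (guard against division by zero).
    ratio = tm3 / np.maximum(tm5, 1e-6)
    clean_ice = ratio >= ratio_min
    # Potential debris areas: low-slope pixels that are not clean ice.
    debris = (slope_deg <= slope_max) & ~clean_ice
    # Remove low-elevation areas.
    debris &= elev_m >= elev_min
    # Remove low-velocity areas.
    debris &= vel_m_yr >= vel_min
    return clean_ice, clean_ice | debris

# Toy 2 x 3 rasters; every number here is made up for illustration.
tm3 = np.array([[0.8, 0.2, 0.2], [0.2, 0.2, 0.2]])
tm5 = np.array([[0.1, 0.2, 0.2], [0.2, 0.2, 0.2]])
slope = np.array([[5.0, 5.0, 40.0], [5.0, 5.0, 5.0]])
elev = np.array([[4000.0, 3800.0, 3800.0], [2000.0, 3800.0, 3800.0]])
vel = np.array([[10.0, 10.0, 10.0], [10.0, 1.0, 10.0]])

clean, glacier = classify_sketch(
    tm3, tm5, slope, elev, vel,
    ratio_min=2.0, slope_max=20.0, elev_min=3000.0, vel_min=5.0)
```

In the toy example, one pixel is kept as clean ice, two low-slope pixels survive as debris-covered ice, and the remaining pixels are rejected by the slope, elevation, and velocity thresholds respectively.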
For accurate glacier delineation, we primarily used Landsat images which were
free of new snow, and had less than 10 % cloud cover. However, we have also
included scenes with limited snow and cloud cover in our analysis to
understand their impacts on our classification algorithm. We find that
fresh snow in an image causes the algorithm to overclassify glacier areas and to
classify non-permanent snow as glacier. Additionally, cloud-covered glaciers
cannot be correctly mapped by the algorithm
The algorithm uses Landsat imagery, a void-filled DEM, a velocity surface derived from image cross-correlation, and the HydroSHEDS 15 arcsec river network (buffered by 200 m and converted to a raster) as the primary inputs (steps 1a, 1b). The algorithm generates a slope image from the DEM and rectifies additional input data sets described below for processing by resampling and reprojecting each data set to the same spatial extent and resolution (30 m to match the Landsat data) (steps 1d, 1e). Although the current algorithm leverages a few proprietary Matlab commands, we will continue to update the code with the goal of using only open-source tools and libraries in the future.
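As a minimal sketch of how a slope image can be derived from a DEM, the snippet below uses finite differences on a regular grid. It assumes the DEM has already been reprojected to a metric grid with square 30 m cells; the function name `slope_degrees` and the toy DEM are illustrative assumptions, not the study's Python script.

```python
import numpy as np

def slope_degrees(dem, cell_size=30.0):
    """Slope in degrees from a DEM on a regular, metric grid."""
    # np.gradient returns the derivative along rows (y) and then columns (x).
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Toy DEM: a plane rising 30 m per 30 m cell in x, so slope is 45 degrees.
dem = np.tile(np.arange(5) * 30.0, (4, 1))
slope = slope_degrees(dem)

# A flat DEM has zero slope everywhere.
flat_slope = slope_degrees(np.full((4, 5), 3500.0))
```

Dedicated GIS tools typically use a 3 × 3 kernel for slope, so values near sharp breaks in terrain may differ slightly from this simple gradient.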
Calculations are performed on rasterized versions of each input data set,
which have been standardized to the same matrix size. The first step in the
classification process leverages Landsat 7 bands 1, 3, and 5 (Step 2a). For
Landsat 8 OLI images, a slightly different set of bands is used to conform to
OLI's modified spectral range. For simplicity, bands referenced in this
publication refer to Landsat 7 ETM+ spectral ranges. The ratio of TM3 / TM5
(value
Building on the work of
As can be seen in Fig.
The Correlation Image Analysis Software (CIAS)
We only used one multi-year velocity measurement for each Path/Row
combination to derive general areas of movement/stability for glacier
classification, as using stepped velocity measurements over smaller time
increments did not show a noticeable improvement in glacier classification.
This also improved our classification of slow-moving glaciers, which may not
change significantly over only a single year. These velocities ranged
generally from 4.5 to 30 m yr
The velocity step is most important for removing hard-to-classify pixels along the edges of glaciers and wet sands in riverbeds. These regions are often spectrally indistinguishable from debris tongues but have very different velocity profiles. It is important to note, however, that this step also removes some glacier area, as not all parts of a glacier are moving at the same speed. This can result in small holes in the delineated glaciers, which the algorithm attempts to rectify using statistical filtering. Generating a velocity field is the most computationally expensive step of the algorithm.
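The interaction between velocity filtering and the later statistical filtering can be illustrated with a small sketch. Here a velocity threshold punches a hole in a candidate glacier mask, and a 3 × 3 majority filter restores it. The threshold of 5 m per year and the majority filter itself are assumptions standing in for the study's filters, which are not reproduced here.

```python
import numpy as np

def majority_filter_3x3(mask):
    """Set each pixel to the majority value of its 3x3 neighbourhood."""
    padded = np.pad(mask.astype(int), 1, mode="edge")
    rows, cols = mask.shape
    counts = np.zeros(mask.shape, dtype=int)
    # Sum the nine shifted copies of the mask to count True neighbours.
    for dr in range(3):
        for dc in range(3):
            counts += padded[dr:dr + rows, dc:dc + cols]
    return counts >= 5

# Candidate glacier area with one slow-moving pixel in the middle.
candidate = np.ones((5, 5), dtype=bool)
velocity = np.full((5, 5), 10.0)
velocity[2, 2] = 1.0

vel_min = 5.0  # hypothetical threshold in m per year
after_velocity = candidate & (velocity >= vel_min)
restored = majority_filter_3x3(after_velocity)

# The same filter also removes an isolated pixel.
isolated = np.zeros((5, 5), dtype=bool)
isolated[2, 2] = True
cleaned = majority_filter_3x3(isolated)
```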
After topographic and velocity filtering, a set of spatially weighted filters
was constructed. The first filtering step uses the HydroSHEDS river network
to remove “potential debris areas” which are distant from the center of a
given glacier valley (Fig.
The spatial-weighting step is essential for removing pixels spatially distant
from any clean-ice area. In many cases, large numbers of river pixels and, in
some cases, dry sand pixels have similar spectral and topographic profiles
to debris-covered glaciers. This step effectively removes the majority of
pixels outside the general glacierized area(s) of a Landsat scene, as can be
seen in Fig.
Once the spatial-weighting steps are completed, a set of three filters is
applied to remove isolated pixels, bridge gaps between
isolated glacier areas, and fill holes in large contiguous areas (Step 2g).
First, a 3
Final algorithm outlines (black) with areas classified in addition to the clean-ice delineation in red. Landsat OLI image captured on 25 September 2013 as background.
This step is necessary for filling holes and reconnecting separated glacier
areas that result from the initial threshold-based filtering steps. For
example, slow-moving pixels in the middle of a debris-covered glacier tongue
that were removed based on velocity filtering are often restored by the
statistical filtering (Fig.
Manual control data sets encompassing
Glacier size class distribution (
Before any comparisons between glaciers can be performed, glacier complexes must be split into component parts. A set of manually edited watershed boundaries, derived from the SRTM DEM, was used to split both the manual and algorithm data sets into individual glacier areas for analysis. In this way, the different data sets and classified glacier areas are split into the same subset areas for statistical comparison.
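Splitting two classifications by a shared set of watershed labels can be sketched as a per-label pixel count. The snippet assumes the watershed boundaries have been rasterized to non-negative integer labels on the same 30 m grid as the glacier masks; the function name `area_per_basin_km2` and the toy arrays are illustrative assumptions.

```python
import numpy as np

def area_per_basin_km2(glacier_mask, basin_ids, cell_size=30.0):
    """Glacier area (km^2) within each watershed label."""
    # Count glacier pixels per label; index i holds the count for label i.
    counts = np.bincount(basin_ids[glacier_mask],
                         minlength=int(basin_ids.max()) + 1)
    return counts * cell_size ** 2 / 1e6

# Two toy watersheds (labels 1 and 2) and two classifications of the same scene.
basin_ids = np.array([[1, 1, 2], [1, 2, 2]])
algorithm_mask = np.array([[True, True, True], [False, True, False]])
manual_mask = np.array([[True, True, False], [True, True, False]])

algorithm_area = area_per_basin_km2(algorithm_mask, basin_ids)
manual_area = area_per_basin_km2(manual_mask, basin_ids)
difference = algorithm_area - manual_area
```

Because both masks are indexed by the same labels, the per-glacier differences line up element by element.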
Over the eight Landsat footprints used in this study, we map
A subset of 215 glaciers from the manual control data sets of varying size and
topographic setting was chosen for more detailed analysis. The unedited,
algorithm-generated glacier outlines were compared against spectral
outlines, which only classify the glacier areas via commonly used spectral
subsetting (using TM1, TM3, and TM5, produced in Step 2b), the manual
control data sets, and the CGI v2. Figure
There is some apparent bias in our algorithm towards low-elevation areas, which represent the debris-covered portions of glaciers and are the most difficult areas to classify. This bias also stems from misclassified areas in shadows, particularly in north-facing glaciers. There is also a bias in our control data set towards underclassifying the high-elevation areas, which we attribute to user bias in removing isolated rock outcrops within glaciers, as opposed to simply defining accumulation areas as a single polygon. In general, the algorithm and the control data set are well-matched below 4000 m; above this, the spectral data set and the algorithm data set begin to align closely and generally follow the manually digitized data. This threshold represents the general transition from debris-covered glaciers to clean glacier ice in the study area. Our algorithm output is also well-matched with the CGI v2, except at very high elevations where it overclassifies some areas as compared to the CGI v2.
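An elevation-wise comparison of this kind reduces to binning glacier pixels by elevation for each data set. The following is a minimal sketch under the assumption that the glacier mask and the DEM share a 30 m grid; the function name `hypsometry_km2`, the bin edges, and the toy values are illustrative only.

```python
import numpy as np

def hypsometry_km2(mask, elev_m, bin_edges, cell_size=30.0):
    """Glacier area (km^2) per elevation bin."""
    counts, _ = np.histogram(elev_m[mask], bins=bin_edges)
    return counts * cell_size ** 2 / 1e6

# Toy DEM and glacier mask; bins split at 4000 m.
elev = np.array([[3500.0, 3900.0, 4100.0], [4300.0, 4600.0, 4900.0]])
mask = np.array([[False, True, True], [True, True, True]])
edges = np.array([3000.0, 4000.0, 5000.0])

area_by_bin = hypsometry_km2(mask, elev, edges)
```

Running the same function on the algorithm, spectral, and manual masks gives curves that can be differenced bin by bin.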
Vertex distance distributions for algorithm (blue) and spectral (red) vertices, as compared to a manual control data set, normalized to the maximum distance.
In order to examine inherent bias throughout the algorithm classification,
under- and overclassified areas were examined for a subset of the control
data set. To determine areas of overclassification (underclassification), the
manually (algorithm) generated data set was subtracted from the algorithm
(manual) data set, leaving only pixels that were overclassified
(underclassified). Figure
To investigate sampling bias in our analysis, we used 465 GLIMS glacier
identification numbers (centroids, point features) that overlapped with the
manual control data sets. A random subset of 100 of these points was chosen
for this analysis. As can be seen in Fig.
To capture changes in the shape of the glacier outlines between the initial
spectral classification and the final algorithm output, we computed the
distance between pairs of glacier vertices. We first reduced our manual
control data set to a set of X/Y pairs for each component vertex, which were
then matched to the closest vertex in the resulting spectral and final algorithm
polygons, respectively (Fig.
The distance distribution for the algorithm data set shows generally close
agreement between the algorithm and manual control data sets. The spectral
data set also contains a large percentage of vertices close to a
Several authors have presented alternative debris-covered glacier
classification methods and schemes using thermal and spectral data
Comparison of methods between previous debris-covered glacier mapping studies.
Our study improves on previous work in three main ways: (1) reduced
computational intensity, (2) greater diversity of study area, and (3) increased
temporal range of our data set. The methods proposed in this study,
excepting the generation of a velocity field, require very little processing
power. Once initial input data sets (velocity surface, rasterized river
network) have been created, a Landsat scene can be processed in 3–5 min
(Ubuntu 14.04, 8 cores at 3.6 GHz, 16 GB RAM). When this is compared with the
training data set creation, computationally expensive classification schemes,
and neighborhood analyses employed by other studies, there is a clear
improvement in efficiency. Secondly, we analyze a significantly larger
glacier area than any of the previous studies, which has helped us generalize
our algorithm and methods to a wide range of topographic and land cover
settings. Finally, we process a multi-year data set, encompassing 40 Landsat
scenes with varying land cover and meteorological settings. This has allowed
us to further generalize our algorithm to be effective beyond a single scene
or small set of scenes, and to remain effective across a wide spatial and
temporal range. The time-dynamic aspect of our algorithm can also complement
time-static wide-area data sets, such as the RGI v4.0, the CGI v2, and the
forthcoming GAMDAM data sets
Two additional topographic indices – spatial fast Fourier transforms (FFTs), also known as 2-D FFTs, and ASTER surface roughness measurements – were tested during the development of the algorithm, although neither provided significant improvement. We attempted to derive frequency information from several Landsat and ASTER bands, with limited success. Some glaciers exhibit a unique frequency signature when analyzed using spatial FFTs, although these signatures were not consistent across multiple debris-covered glaciers with differing surface characteristics. Additionally, the FFT approach was tested against a principal component analysis (PCA) image derived from all Landsat bands, without significant improvement to the algorithm.
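For readers unfamiliar with spatial FFTs, the sketch below shows one common way to summarize the frequency content of an image patch: a radially averaged power spectrum. It is not the analysis performed in this study; the function name `radial_power_spectrum` and the synthetic striped patch are assumptions for illustration.

```python
import numpy as np

def radial_power_spectrum(patch):
    """Radially averaged power spectrum of a square image patch."""
    patch = patch - patch.mean()  # remove the zero-frequency offset
    power = np.abs(np.fft.fftshift(np.fft.fft2(patch))) ** 2
    n = patch.shape[0]
    yy, xx = np.indices(patch.shape)
    # Integer radius (in cycles per patch) from the spectrum centre.
    radius = np.hypot(yy - n // 2, xx - n // 2).astype(int)
    total = np.bincount(radius.ravel(), weights=power.ravel())
    count = np.bincount(radius.ravel())
    return total / np.maximum(count, 1)

# Synthetic 32 x 32 patch with 4 stripes across its width.
n = 32
x = np.arange(n)
stripes = np.tile(np.sin(2 * np.pi * 4 * x / n), (n, 1))

spectrum = radial_power_spectrum(stripes)
dominant = int(np.argmax(spectrum))
```

For the synthetic patch the spectrum peaks at four cycles per patch; real debris-covered surfaces would give far less distinct peaks, which is consistent with the limited success reported above.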
We also attempted to integrate surface roughness measurements using the ASTER
sensor, which acquires both nadir-looking (3N) and backward-looking (3B)
images, primarily intended for the generation of
stereoscopic DEMs. The difference in imaging angle provides the opportunity
to examine surface roughness by examining changes in shadowed areas
The glacier outlines provided by the algorithm are a useful first-pass
analysis of glacier area. It is often more efficient to digitize only
misclassified areas, as opposed to digitizing entire glacier areas by hand
Algorithm outlines (yellow) compared to the control data set (black) and the CGI v2 (red) illustrate high fidelity in overall debris-tongue length between the three data sets, although the algorithm outlines exhibit noise along the edges of debris tongues.
The algorithm moves a step further than spectral-only classification and
attempts to classify glacier areas as accurately as possible, including
debris-covered areas. As can be seen in Fig.
Without post-processing, these raw glacier outlines can be used to analyze
regional glacier characteristics, such as slope, aspect, and hypsometry. Even
if glacier outlines are not perfectly rectified in space, at the scale of
watersheds, satellite image footprints, or mountain ranges, errors of under-
and overclassification even out, yielding valuable regional statistics
(Fig.
Algorithm outlines for July 2013 (black) and algorithm outlines for August 2002 (yellow), showing small retreats in glacier areas, particularly at the debris tongues. Vicinity of the Akshiirak glacierized massif, central Tien Shan.
Figure
The second use case for the algorithm is as a substitute for simple spectral
ratios. Manual digitization of glacier tongues is time consuming,
particularly in regions with numerous debris-covered glaciers. Our algorithm
provides a robust baseline set of glacier outlines that can be corrected
manually, with minimal extra processing time. As generating the input
velocity surfaces can take longer than processing glacier outlines from
dozens of Landsat scenes, efficiencies are gained when Landsat scenes are
processed in bulk. The algorithm as presented in this paper takes
Although the algorithm represents a step forward in semi-automated glacier classification, there are several important caveats to keep in mind. (1) Lack of data density and temporal range limits the efficacy of individual glacier analysis; the algorithm presented in this paper was not designed with individual glacier studies in mind, and in many cases, such as in mass balance studies, more accurate manual glacier outlines are necessary. Furthermore, (2) the algorithm relies on manual intervention to separate individual glaciers which are connected through overlapping classified areas or which are part of glacier complexes. Finally, (3) the algorithm relies heavily on the fidelity of the Landsat images provided, in that glacier outlines on images with cloud or snow cover are less likely to be well-defined. This creates a data limitation, as many glacierized areas are subject to frequent cloud and snow cover and thus have a limited number of potentially useful Landsat images for the purpose of this algorithm.
This study presents an enhanced glacier classification methodology based on the spectral, topographic, and spatial characteristics of glaciers. We present a new method of (semi-)automated glacier classification, which builds upon, but is distinct from, the work of previous authors. Although it does not completely solve the difficulties associated with debris-covered glaciers, it can effectively and rapidly characterize glaciers over a wide area. Following an initial delineation of clean glacier ice, a set of velocity, spatial, and statistical filters is applied to accurately delineate glacier outlines, including their debris-covered areas.
When compared visually and statistically against a manually digitized control data set and the high-fidelity CGI v2, our algorithm remains robust across the diverse glacier sizes and types found in Central Asia. The algorithm developed here is applicable to a wide range of glacierized regions, particularly those where debris-covered glaciers are dominant and extensive manual digitization of glacier areas has previously been required. The raw algorithm output is usable for rough statistical queries on glacier area, hypsometry, slope, and aspect; however, manual inspection of algorithm output is necessary before using the generated glacier outlines for more in-depth area change or mass balance studies.
Part of this work was supported by the Earth Research Institute (UCSB) through a Natural Hazards Research Fellowship, as well as by NSF grant AGS-1116105. We would like to thank Frank Paul, Wanqin Guo, and one anonymous reviewer for their detailed and helpful reviews, as well as Tobias Bolch for his contribution to the development of the paper.
Edited by: T. Bolch