Arctic lead detection using a waveform mixture algorithm from CryoSat-2 data

Arctic sea ice leads play a major role in exchanging heat and momentum between the Arctic atmosphere and ocean as well as in the retrieval of sea ice thickness. Although leads cover only a small portion of the Arctic Ocean, they affect the heat budget in the Arctic region considerably. In this study, we propose a waveform mixture analysis to detect leads from CryoSat-2 data, which is novel and different from the existing threshold-based lead detection methods. The waveform mixture analysis adopts the concept of spectral mixture analysis that is widely used in the field of hyperspectral image analysis. This lead detection method, based on the waveform mixture analysis, was evaluated with high resolution (250 m) MODIS images and showed comparable and promising performance in detecting leads when compared to the previous methods. The robustness of the proposed approach also lies in the fact that it does not require the rescaling of parameters (i.e., stack standard deviation, stack skewness, stack kurtosis, pulse peakiness, and backscatter sigma), as it directly uses L1B waveform data unlike the existing threshold-based methods. Monthly lead fraction maps were produced by waveform mixture analysis, which show a strong inter-annual variability of recent sea ice cover during 2011-2016, excluding the summer season (i.e., June to September). We also compared the lead fraction maps to other lead fraction maps generated from previously published data sets, resulting in similar spatiotemporal patterns.


Introduction
Sea ice leads (hereafter referred to as "leads"), linearly elongated cracks in sea ice, are a common feature in the Arctic Ocean. Leads facilitate exchanges of heat and moisture between the atmosphere and the ocean because of the temperature difference between them (Maykut, 1982; Perovich et al., 2011). Although leads occupy a small portion of the Arctic Ocean, far more heat is transferred between the atmosphere and ocean through leads than through sea ice (Maykut, 1978; Marcq and Weiss, 2012). Furthermore, Lüpkes et al. (2008) showed that a 1 % change in sea ice concentration owing to an increase in lead fraction could increase near-surface temperature in the Arctic by 3.5 K. Thus, detecting and monitoring leads in the Arctic Ocean is crucial because they are closely related to the Arctic heat budget and the physical interaction between the atmospheric boundary layers and sea ice in the Arctic.
Satellite sensors have been the most efficient way to monitor leads in the entire Arctic region since the 1990s (Key et al., 1993; Lindsay and Rothrock, 1995; Miles and Barry, 1998). Advanced Very High Resolution Radiometer (AVHRR) and Defense Meteorological Satellite Program (DMSP) satellite visible and thermal images were used to detect leads in the early 1990s. Recently, the Moderate Resolution Imaging Spectroradiometer (MODIS) ice surface temperature (IST) product with 1 km spatial resolution was used to detect leads and map pan-Arctic lead presence (Willmes and Heinemann, 2015, 2016). These authors mitigated cloud interference using a fuzzy cloud artefact filter and investigated lead dynamics based on a comparison between pan-Arctic lead maps and the characteristics of the Arctic Ocean such as shear zones, bathymetry, and currents. While optical sensors have a finer spatial resolution, they are not practical in dark regions during polar nights (from December to February). In addition, lead observations are easily contaminated by clouds. Microwave instruments such as passive microwave sensors and altimeters have also been used to detect leads and produce lead fractions. Röhrs and Kaleschke (2012) utilized the polarization ratio of the Advanced Microwave Scanning Radiometer for EOS (AMSR-E) channels and retrieved daily thin ice concentration. With the help of the thin ice concentration, lead orientations and frequencies were derived using an image analysis technique (i.e., Hough transform) (Bröhan and Kaleschke, 2014). Airborne and space-borne radar altimeters can detect leads as well. Zygmuntowska et al. (2013) used the Airborne Synthetic Aperture and Interferometric Radar Altimeter System (ASIRAS), an instrument similar to that on CryoSat-2, to identify leads based on waveform characteristics and a Bayesian classifier. Zakharova et al. (2015) and Wernecke and Kaleschke (2015) used the space-borne altimeters Satellite with Argos and Altika (SARAL) and CryoSat-2, respectively, to identify leads. While Zakharova et al. (2015) applied simple thresholds to identify leads along SARAL/Altika tracks and estimated regional lead fractions, Wernecke and Kaleschke (2015) optimized thresholds to detect leads and produced pan-Arctic lead fraction maps using CryoSat-2 with an analysis of lead width and sea surface height.
Spectral mixture analysis, based on the assumption that the spectra measured by sensors for a pixel are a linear combination of the spectra for all components within the pixel (Keshava and Mustard, 2002), was first applied to the altimetry research field in the polar regions by Chase and Holyer (1990). They estimated sea ice type and concentration using spectral mixture analysis based on Geosat waveforms. However, Geosat, with a relatively small number of range bins and coarser spatial resolution, is not sufficient to detect small leads in the winter (DJF) and spring (MAM) seasons in the Arctic. In this study, we adapted the linear mixture algorithm concept to waveforms from the Synthetic Aperture Interferometric Radar Altimeter (SIRAL) on board CryoSat-2 to identify leads and produce monthly pan-Arctic lead fractions from January to May and October to December between 2011 and 2016. Waveform endmembers are crucial for implementing the spectral mixture algorithm (Fig. 1). The N-FINDR (N-finder) algorithm was used to select waveform endmembers from waveforms extracted by the decision tree (DT) of Lee et al. (2016), which avoids the subjective selection of endmembers. The detected leads were visually evaluated with MODIS images (at 250 m resolution) and compared with other threshold-based lead detection methods. The proposed waveform mixture algorithm does not require changes to any of the parameters used in the algorithm to detect leads when the CryoSat-2 baseline is updated, which is a significant advantage compared to the existing threshold-based lead detection methods. The main objectives of this study are to (1) develop a novel lead detection method based on the waveform mixture algorithm, (2) compute recent pan-Arctic lead fractions, and (3) examine the spatiotemporal distribution of lead fractions.
CryoSat-2, carrying SIRAL, was launched in April 2010 by the European Space Agency (ESA). CryoSat-2 is a satellite dedicated to polar research. SIRAL is a radar altimeter with a central frequency of 13.575 GHz (Ku-band) and a bandwidth of 320 MHz. With SIRAL, CryoSat-2 can detect smaller leads and uses the instrument's energy more efficiently than previous radar altimeter missions such as Geosat and Jason (Wingham et al., 2006). In this study, we used level 1b baseline C data acquired in synthetic aperture radar (SAR) mode, mainly operating over sea ice regions, and in SAR interferometric (SIN) mode, mainly operating over steep regions such as the margins of ice shelves and ice sheets. The SAR and SIN modes have 256 and 1024 range bins, respectively (Scagliola and Fornari, 2015). The CryoSat-2 level 1b baseline C data used in this study cover January-May and October-December 2011-2016.
CryoSat-2 transmits bursts of 64 radar pulses with a high pulse repetition frequency (PRF, 18.181 kHz), which forms Doppler beams because of the along-track movement of the satellite (Wingham et al., 2006). With the help of the high PRF, each Doppler beam is coherently correlated and pointed at the same location on the Earth's surface. This is called beam stacking. Multi-looking is conducted by averaging the stacked beams to reduce speckle and thermal noise (Salvatore, 2013). Exemplary waveforms in the L1b SAR data are shown in Fig. 1. These waveforms represent the temporal distribution of reflected power when the radar pulses reach the surface, describing a flat or rough surface. Since the leading edge of each waveform starts from a different range bin, the beginning of the waveform was set to 1 % of the maximum echo power (Fig. 1). For a more detailed explanation of the processes used to develop L1b waveform data, refer to Salvatore (2013).
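The alignment step described above can be illustrated with a minimal sketch. It assumes that "set to 1 % of the maximum echo power" means the waveform is shifted so that it starts at the first range bin whose power reaches 1 % of the peak; the function names and the zero padding are hypothetical choices for illustration, not part of the original processing chain.

```python
def leading_edge_start(waveform, fraction=0.01):
    """Index of the first range bin whose power reaches `fraction` of the peak.

    Assumption: the waveform start is defined by the first bin at or above
    1 % of the maximum echo power.
    """
    peak = max(waveform)
    if peak <= 0:
        raise ValueError("waveform has no positive echo power")
    threshold = fraction * peak
    for index, power in enumerate(waveform):
        if power >= threshold:
            return index
    raise ValueError("no bin reaches the threshold")  # defensive; not expected


def align_waveform(waveform, length, fraction=0.01):
    """Shift the waveform to start at the 1 % bin, then trim or zero-pad
    it to a common `length` so that waveforms can be compared bin by bin."""
    start = leading_edge_start(waveform, fraction)
    shifted = list(waveform[start:start + length])
    shifted.extend([0.0] * (length - len(shifted)))
    return shifted
```

Aligning all waveforms to a common starting bin in this way is what makes a bin-by-bin comparison with endmember waveforms meaningful.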

Sea ice edge data
The European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Ocean and Sea Ice Satellite Application Facility (OSI SAF) provides multiple sea ice products such as sea ice concentration, sea ice edge, sea ice type, sea ice emissivity, and sea ice drift. The sea ice edge product was developed using the polarization ratios of 19 and 91 GHz, the spectral gradient ratios of 37 and 19 GHz from the Special Sensor Microwave Imager/Sounder (SSMIS), and anisFMB from the Advanced Scatterometer (ASCAT) with a Bayesian approach (Aaboe et al., 2016). In this study, monthly averaged sea ice edge data were used to mask monthly lead fraction maps. The open ice cover in the sea ice edge product was regarded as open ocean.

Monthly lead fraction maps
Lead fraction maps produced in previous studies (Röhrs and Kaleschke, 2012; Wernecke and Kaleschke, 2015; Willmes and Heinemann, 2016) were compared to the lead fraction maps generated using the proposed waveform mixture algorithm in this study. Röhrs and Kaleschke (2012) produced daily thin ice concentration maps using AMSR-E data with a 6.25 km grid, which can detect leads that are wider than 3 km. A daily thin ice concentration over 0.5 (i.e., 50 %) was considered to be a lead, and the binary daily lead maps were averaged to allow a proper comparison with the other monthly lead fraction maps. A threshold-optimization-based lead detection method with CryoSat-2 was used in Wernecke and Kaleschke (2015), and monthly lead fraction maps were calculated on a 99.5 km grid. The thin ice concentration maps (Röhrs and Kaleschke, 2012) and the lead fraction maps using CryoSat-2 (Wernecke and Kaleschke, 2015) are available on their website (http://icdc.cen.uni-hamburg.de/1/daten/cryosphere.html, last access: 16 April 2017). Willmes and Heinemann (2016) also produced daily lead maps over the entire Arctic region, classifying land, cloud, sea ice, lead artefact, and lead with a spatial resolution of less than 2 km. Only the lead class was considered when calculating daily binary lead maps. The sum of the lead pixels was divided by the number of days in the month (i.e., 28, 30, or 31) to make monthly lead fraction maps. These data are available on their website (http://dx.doi.org/10.1594/PANGAEA.854411, last access: 16 April 2017). In this study, we compared the monthly lead fraction maps from January to March 2011 as AMSR-E-based lead fraction maps were only available until 2011.

Waveform mixture algorithm
An endmember in remote sensing data represents a spectrally pure ground component in a single pixel. For example, it could be a pure water, vegetation, bare ground, or soil crust pixel in remote sensing data. Endmembers play the most important role in conducting spectral mixture analysis. Spectral mixture analysis assumes that the spectra measured by sensors for a pixel are a linear combination of the spectra of all components within the pixel (Keshava and Mustard, 2002). This technique is widely used to resolve spectral mixture problems in image analysis (Foody and Cox, 1994; Lu et al., 2003; Wu, 2004; Iordache et al., 2011). Spectral mixture analysis determines the fractions of the components (i.e., classes) found in mixed pixels by producing abundances of the components based on endmembers. The proposed waveform mixture algorithm adopts the concept of spectral mixture analysis. Since the altimeter waveform within a footprint can be considered to be a mixture of leads and various types of sea ice, spectral mixture analysis can be applied in this framework. In this study, waveforms of CryoSat-2 L1b data, such as the waveforms of pure lead and first-year ice (FYI), were used as endmembers (Fig. 1). The lead and ice endmembers are used as reference data for separating leads and ice. In order to successfully implement the waveform mixture algorithm, the proper selection of lead and ice endmembers is essential.
The basic waveform mixture model is defined as follows in Eq. (1).

$$Y_p = \sum_{k=1}^{k} a_{ik} E_k + r_k \qquad (1)$$

where $Y_p = \{Y_1, Y_2, Y_3, \ldots, Y_k\}$ represents waveform vectors and $k$ denotes a range bin in the waveform. $a_{ik}$ is an abundance fraction, which provides lead and ice proportions in terms of lead and ice endmembers. $E_k$ is the endmember vector, and $r_k$ represents the unmodeled residual. Equation (1) is constrained by $\sum_{k=1}^{k} a_{ik} = 1$ and $a_{ik} \geq 0$. The abundance can be derived by using a least squares method to minimize the unmodeled residual ($r_k$). Chase and Holyer (1990) raised two concerns about the application of spectral mixture analysis to altimeter waveforms. First, the waveform within a footprint may not be linearly mixed between leads and sea ice. CryoSat-2 is more sensitive to the specular reflection of leads than the diffuse reflection of sea ice when both leads and sea ice exist within the same footprint, which implies the waveform may tend to be similar to the endmember of the leads (Chase and Holyer, 1990). Since CryoSat-2 data have a larger number of range bins than Geosat, indicating higher vertical resolution, they could be used to reduce the overestimation of leads. Secondly, the waveform of the altimeter (i.e., Geosat) is weighted toward the center of a footprint rather than representing the entire footprint. This could be an error source when applying spectral mixture analysis to waveform data (Chase and Holyer, 1990). However, the CryoSat-2 L1b waveform is produced by averaging more than 200 weighted waveforms with various incidence angles, which can alleviate this problem.
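As an illustration of the constrained least squares step, the following is a minimal sketch for the simplest case of exactly two endmembers (one lead, one ice). It assumes aligned waveforms of equal length and enforces the sum-to-one and non-negativity constraints by projecting onto the segment between the two endmembers; the function name and the two-endmember restriction are assumptions for illustration, and the solver actually used in this study may differ.

```python
def unmix_two_endmembers(waveform, lead_em, ice_em):
    """Least squares abundances for
        waveform ~ a_lead * lead_em + a_ice * ice_em
    subject to a_lead + a_ice = 1, a_lead >= 0 and a_ice >= 0.

    With two endmembers the problem is one-dimensional: substitute
    a_ice = 1 - a_lead, solve for a_lead, and clip it to [0, 1].
    """
    diff = [l - i for l, i in zip(lead_em, ice_em)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmembers are identical")
    numer = sum((y - i) * d for y, i, d in zip(waveform, ice_em, diff))
    a_lead = min(1.0, max(0.0, numer / denom))
    a_ice = 1.0 - a_lead
    # Unmodeled residual per range bin
    residual = [y - (a_lead * l + a_ice * i)
                for y, l, i in zip(waveform, lead_em, ice_em)]
    return a_lead, a_ice, residual
```

Because the objective is a convex quadratic in a single variable, clipping the unconstrained solution to [0, 1] gives the constrained optimum. With more than two endmembers a general non-negative least squares solver would be needed instead.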

Endmember selection
The selection of endmembers is essential in the framework of the waveform mixture algorithm. Among CryoSat-2 orbit files from January to May and October to December between 2011 and 2016, a total of 48 orbit files that fully traverse the broad Arctic Ocean were selected to extract endmember samples by month (15th day of the month for January to May and October to December) (Fig. 2). The lead and ice waveforms were extracted using the DT algorithm developed for lead detection by Lee et al. (2016). The DT has proven to be very effective in various remote sensing classification tasks (Kim et al., 2015; Torbick and Corbiere, 2015; Amani et al., 2017; Tadesse et al., 2017; Hisabayashi et al., 2018). The lead and sea ice endmembers (i.e., the most representative waveforms) are a key factor in the successful implementation of the waveform mixture algorithm. In order to avoid the subjective selection of endmembers, a number of endmember candidates were extracted by the DT algorithm (Lee et al., 2016) and the N-FINDR algorithm determined the optimum lead and ice endmembers. The N-FINDR algorithm relies on the fact that, in N spectral dimensions, the N-volume (V) of the simplex defined by the purest pixels is always greater than that of any other combination of pixels (Winter, 1999). It operates by inflating a simplex inside the data, starting with any pixel set. Each endmember in turn is replaced with another candidate pixel, and the volume is recalculated. The endmember is replaced with the spectrum of the new pixel if the volume increases. This process repeats until the volume does not increase (i.e., until there is no replacement).
The simplex volume is computed from the endmember matrix

$$E = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ e_1 & e_2 & \cdots & e_l \end{bmatrix},$$

where $e_i$ represents a column vector of endmember $i$ and $l$ is the number of endmembers (Winter, 1999). The volume ($V$) of the simplex containing the endmember set is proportional to the determinant of $E$:

$$V(E) = \frac{1}{(l-1)!} \left| \det(E) \right|.$$

This algorithm has been widely used for automatically selecting representative endmembers (Winter, 1999; Zortea and Plaza, 2009; Ertürk and Plaza, 2015; Ji et al., 2015; Chi et al., 2016). The DT model from Lee et al. (2016) was developed using data (i.e., stack standard deviation, stack skewness, stack kurtosis, pulse peakiness, and backscatter σ0) collected in March-April from 2011 to 2014. Thus, the waveforms in other months and years should be compared with the waveforms in March-April from 2011 to 2014 through visual analysis to identify whether the waveforms derived by the DT model during the study period can appropriately implement the waveform mixture algorithm. Waveforms from March to April between 2011 and 2014 were compared to those from January to May and October to December between 2011 and 2016 (not shown), and little difference was found between them. This justified the use of the DT algorithm proposed by Lee et al. (2016) to extract waveform samples of leads and sea ice. The total numbers of sea ice and lead waveforms are 420 858 and 8501, respectively. However, visual analysis cannot rule out quantitative differences between waveforms by month and year.
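The N-FINDR replacement loop and the determinant-based simplex volume can be sketched as follows. This is a minimal illustration of the general technique of Winter (1999), not the implementation used in this study: it assumes the candidate waveforms have already been reduced to l - 1 dimensions (for l endmembers), starts from the first l candidates, and uses hypothetical function names.

```python
from math import factorial


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return det


def simplex_volume(endmembers):
    """Volume of the simplex spanned by l points of dimension l - 1."""
    l = len(endmembers)
    # First row of ones; each column below it holds one endmember vector.
    matrix = [[1.0] * l] + [[e[d] for e in endmembers] for d in range(l - 1)]
    return abs(determinant(matrix)) / factorial(l - 1)


def n_findr(points, n_endmembers):
    """Swap candidates into the endmember set while the volume grows."""
    chosen = list(range(n_endmembers))  # arbitrary initial set
    best = simplex_volume([points[i] for i in chosen])
    improved = True
    while improved:
        improved = False
        for slot in range(n_endmembers):
            for candidate in range(len(points)):
                if candidate in chosen:
                    continue
                trial = chosen.copy()
                trial[slot] = candidate
                volume = simplex_volume([points[i] for i in trial])
                if volume > best + 1e-12:  # keep only strict improvements
                    chosen, best = trial, volume
                    improved = True
    return sorted(chosen), best
```

For example, with two-dimensional points and three endmembers, the loop converges on the three points that form the largest triangle, and all remaining points lie inside or on it.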
The lead classification based on the waveform mixture algorithm was evaluated with 250 m MODIS images collected from March to May and in October. We used Earth View 250 m Reflective Solar Bands Scaled Integers in MOD02QKM and adjusted the contrast to emphasize leads from sea ice in the images. It should be noted that since MODIS images with a spatial resolution of 250 m were not available in January, February, November, and December due to polar nights, the lead classification results based on CryoSat-2 could not be evaluated with MODIS images in those months. To secure the reliability of the comparison, the temporal difference between the MODIS images and CryoSat-2 data was always under 30 min.
The waveform mixture model produces abundance data (i.e., lead and sea ice abundance) at along-track points with respect to each endmember of the leads and sea ice (Fig. 3). While the lead abundances are high on the leads, the ice abundances are low on the leads, and vice versa (Fig. 3). Thresholds have to be determined for a binary classification between leads and sea ice. Optimum thresholds to produce binary lead classification from lead and sea ice abundances were identified through an automated calibration. To implement the automated calibration, reference point data of leads and sea ice were determined by visual inspection of four MODIS images collected on 17 April 2014, 25 May 2015, 10 October 2015, and 27 March 2016. While the calibration was conducted using half of the randomly selected reference data, the validation was performed using the remaining data. The size of the leads detected by the proposed waveform mixture algorithm is 250 m or greater because the calibration and validation processes were conducted using MODIS images with 250 m spatial resolution. It should be noted that leads smaller than 250 m are hardly seen in MODIS images, which implies that there is some uncertainty in the comparison of the lead detection methods for small leads. Threshold combinations from 0.2 to 0.9 with a step size of 0.01 for both lead and sea ice abundances were tested, and the one resulting in the highest accuracy was determined to be the optimum threshold combination.
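The automated calibration can be sketched as an exhaustive grid search. The example below assumes that a point is classified as a lead when its lead abundance exceeds one threshold and its ice abundance falls below the other, and that "highest accuracy" refers to overall accuracy on the calibration half; the function names, the tie-breaking rule (first best combination wins), and the seeded shuffle are illustrative assumptions.

```python
import random


def is_lead(lead_ab, ice_ab, lead_thr, ice_thr):
    """Binary rule assumed here: high lead abundance and low ice abundance."""
    return lead_ab > lead_thr and ice_ab < ice_thr


def split_reference(samples, seed=0):
    """Randomly split reference samples into calibration and validation halves."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def calibrate_thresholds(samples, start=0.20, stop=0.90, step=0.01):
    """Test every threshold pair on the grid and keep the most accurate one.

    samples: list of (lead_abundance, ice_abundance, reference_is_lead).
    Returns (lead_threshold, ice_threshold, overall_accuracy).
    """
    n_steps = round((stop - start) / step)
    grid = [round(start + i * step, 2) for i in range(n_steps + 1)]
    best = (None, None, -1.0)
    for lead_thr in grid:
        for ice_thr in grid:
            correct = sum(
                is_lead(la, ia, lead_thr, ice_thr) == ref
                for la, ia, ref in samples
            )
            accuracy = correct / len(samples)
            if accuracy > best[2]:
                best = (lead_thr, ice_thr, accuracy)
    return best
```

With 71 candidate values per threshold, the search evaluates 5041 combinations, which is cheap enough that no smarter optimizer is needed.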
Lead detection results were evaluated using three accuracy metrics: producer accuracy, user accuracy, and overall accuracy (Table 1). Producer accuracy (i.e., a/(a + c) in the table), which is associated with omission errors, is calculated as the percentage of correctly classified pixels in terms of all reference samples for each class. User accuracy (i.e., a/(a + b) in the table), which is related to commission errors, is calculated as the fraction of correctly classified pixels with regards to the classified pixels. Overall accuracy (i.e., (a + d)/(a + b + c + d) in the table) is calculated as the total number of correctly classified samples divided by the total number of validation sample data. The lead and ice reference data using MODIS images and CryoSat-2 tracks were labeled through visual interpretation.
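The three metrics follow directly from the error matrix entries. The short sketch below uses the formulas given above for the lead class; the corresponding ice-class formulas are filled in by symmetry and are an assumption, as is the function name.

```python
def accuracy_metrics(a, b, c, d):
    """Accuracy metrics from a 2 x 2 error matrix.

    a: classified lead, reference lead    b: classified lead, reference ice
    c: classified ice, reference lead     d: classified ice, reference ice
    """
    return {
        "producer_lead": a / (a + c),  # omission errors for leads
        "user_lead": a / (a + b),      # commission errors for leads
        "producer_ice": d / (b + d),   # by symmetry (assumption)
        "user_ice": d / (c + d),       # by symmetry (assumption)
        "overall": (a + d) / (a + b + c + d),
    }
```

A method that overdetects leads raises b, which lowers user accuracy for leads while leaving producer accuracy for leads unchanged.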
Table 1. Error matrix for calculation of user, producer and overall accuracy in terms of lead and ice classification.

                      Reference lead   Reference ice   Sum
Classified as lead    a                b               a + b
Classified as ice     c                d               c + d
Sum                   a + c            b + d           a + b + c + d

The monthly lead fraction was derived by dividing the number of lead observations by the number of total observations within a 10 km grid in a month. It is noted that, while there are more than 30 CryoSat-2 observations in the 10 km grid around the center of the Arctic, fewer than five observations are generally found in each 10 km grid in the marginal zones of the Arctic Ocean. This will be dealt with in the results section in more detail. It should also be noted that altimeter-based lead detection methods, such as those in Wernecke and Kaleschke (2015) and this study, can hardly identify the propagation, opening, and closing of leads because sea ice and leads generally move by the time the altimeter revisits a given grid.
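The gridding step can be sketched as a simple binning of along-track classifications. The example assumes that observation positions are already expressed in kilometres in a projected coordinate system; the tuple layout, the function name, and the returned structure are hypothetical.

```python
from collections import defaultdict
from math import floor


def monthly_lead_fraction(observations, cell_km=10.0):
    """Lead fraction per grid cell for one month of observations.

    observations: iterable of (x_km, y_km, is_lead) in projected coordinates.
    Returns {(col, row): (lead_fraction, n_observations)}.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_lead, n_total]
    for x_km, y_km, lead in observations:
        cell = (floor(x_km / cell_km), floor(y_km / cell_km))
        counts[cell][0] += 1 if lead else 0
        counts[cell][1] += 1
    return {cell: (n_lead / n_total, n_total)
            for cell, (n_lead, n_total) in counts.items()}
```

Returning the observation count alongside the fraction keeps cells with only a handful of observations, such as those in the marginal zones, easy to flag.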

Calculation of sensitivity in a 10 km × 10 km grid
Since each grid has a different number of CryoSat-2 observations, a sensitivity analysis was conducted in terms of the number of observations by grid. We tested various percentage values to identify which percentage appropriately represents grid sensitivity. As the percentage increased, the grid sensitivity (i.e., standard deviation) also increased but the spatial difference was not significant; hence 30 % was chosen. Thirty percent of the lead and ice observations in each 10 km × 10 km grid were randomly permuted 50 times, and the standard deviation of the resultant lead fractions over the 50 iterations was calculated for each grid. The higher the standard deviation in a grid, the more sensitive the observed lead fraction is to the number of available observations. It should be noted that the standard deviation is zero when no lead observation is found, which means the lead fraction is also zero. Sensitivities were calculated from January to April 2011 because these months were used to compare the lead fractions from the proposed waveform mixture algorithm to those in the existing literature.
www.the-cryosphere.net/12/1665/2018/ The Cryosphere, 12, 1665-1679, 2018
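One possible reading of the sensitivity calculation is sketched below for a single grid cell. It assumes that each of the 50 iterations draws a random 30 % subset of the cell's observations without replacement and that the population standard deviation of the resulting lead fractions is reported; both choices, the fixed seed, and the function name are assumptions, since the text does not specify them.

```python
import random
import statistics


def grid_sensitivity(observations, fraction=0.30, iterations=50, seed=0):
    """Standard deviation of lead fractions over random subsets of one cell.

    observations: list of booleans for one grid cell (True = lead).
    """
    if not observations:
        raise ValueError("grid cell has no observations")
    rng = random.Random(seed)
    n_subset = max(1, round(fraction * len(observations)))
    fractions = []
    for _ in range(iterations):
        subset = rng.sample(observations, n_subset)
        fractions.append(sum(subset) / n_subset)
    return statistics.pstdev(fractions)
```

A cell with no lead observations yields a lead fraction of zero in every iteration, so its standard deviation is exactly zero, consistent with the behaviour noted above.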

Performance of lead classification
Figure 1 shows representative waveforms of leads and sea ice extracted by the N-FINDR algorithm as endmembers. The waveform of leads is dominated by specular reflection, resulting in a narrow peak curve. The representative waveform of sea ice has a wider distribution due to its rough surface when compared to that of leads. Considering different types of sea ice such as young ice, FYI, and multiyear ice (MYI), the representative waveform of sea ice is not significantly different from that of FYI based on visual inspection (Zygmuntowska et al., 2013; Ricker et al., 2015; Lee et al., 2016).
The optimum thresholds for the lead and sea ice abundances were determined through the automated calibration to be 0.84 and 0.57, respectively. According to the thresholds, leads were identified with the conditions of lead abundance > 0.84 and sea ice abundance < 0.57. Selected examples of lead detection results based on the waveform mixture algorithm are presented in Fig. 4 with threshold-based lead detection results from the existing literature (Rose et al., 2013; Laxon et al., 2013; Lee et al., 2016). Simple thresholding approaches based on two waveform parameters, pulse peakiness (PP) and stack standard deviation (SSD), were used in Rose et al. (2013), Laxon et al. (2013), and Lee et al. (2016), respectively. It should be noted that since the existing methods were developed using parameters such as beam behavior parameters and backscatter σ0 extracted from baseline B data, rescaling was conducted on the parameters extracted from newly updated baseline C data for reasonable comparison. Since the difference between the parameters of baseline B and C data is not linear, we rescaled the parameters by adding their differences between the two baseline data to baseline C data.
Multiple lead classification methods based on CryoSat-2 data were evaluated by visual inspection with high-resolution (250 m) MODIS images. Leads (i.e., red dots) and sea ice (i.e., light blue dots) are distinguished, depending on the surface conditions of lead and sea ice (Fig. 4). For better comparisons, a quantitative assessment is required (Fig. 4). DT from Lee et al. (2016) produced the highest overall accuracy (95.19 %), followed by the waveform mixture algorithm (95 %), Rose et al. (2013) 2016) produced the highest user accuracy for leads, while the proposed approach produced the highest producer accuracy for leads, which implies a slight overdetection of leads by the proposed waveform mixture algorithm. The user accuracy for leads of Laxon et al. (2013) is the lowest, resulting in considerable overdetection of leads (i.e., many leads on sea ice; Fig. 4). Similarly, the user accuracy for ice in Rose et al. (2013) is lower than that of the proposed waveform mixture algorithm, indicating the detection of leads on sea ice, which is shown in Fig. 4b and c. While the performance of the waveform mixture algorithm was comparable to the DT algorithm from Lee et al. (2016), the waveform mixture algorithm slightly overestimated leads, resulting in a lower user accuracy for leads than that by DT (Figs. 4 and 5). These results are inevitable because the waveforms used in the waveform mixture algorithm are extracted by DT from Lee et al. (2016). The lead classification results should be assessed during all months (i.e., January to May, and October to December) and years (i.e., 2011 to 2016), using MODIS images to thoroughly evaluate the proposed waveform-based algorithm for lead detection. However, the lead classification results in January, February, November, and December were not assessed using MODIS images due to polar nights. Thus, the lead classification results in these months possibly have uncertainties. It should also be noted that the validation was limited as the MODIS images did not fully cover the entire Arctic region (top of Fig. 4).

Spatiotemporal distribution of lead fraction maps
The monthly lead fraction maps with a 10 km grid from January to May, and October to December from 2011 to 2016 are shown in Figs. 6 and 7. The period from June to September is generally considered to be the melting season. In this season, both leads and melt ponds are prevalent on sea ice. It is difficult to accurately distinguish leads from sea ice because the waveform of a melt pond is quite similar to that of a lead. Since the lead detection methods for the retrieval of sea ice thickness do not work well in the melting season, the sea ice thickness during the melting season is still unavailable (Tilling et al., 2017). We compared lead fraction maps at different spatial resolutions (i.e., 10, 50, and 100 km) to decide on the proper spatial resolution. The spatial distribution of all lead fraction maps looked similar (not shown) because the ratios of lead observations to the entire CryoSat-2 observations did not significantly change among different spatial resolutions. Although the number of CryoSat-2 observations with a 10 km grid around the coastline is small (5-10), the greater number of observations in larger grids (50 and 100 km) resulted in a similar distribution of lead fraction around the coastline. It is believed that the lead fraction maps with 10 km spatial resolution better represent the detailed spatial distribution of leads. The areas in the marginal ice zones of the Arctic Ocean clearly show a high lead fraction due to the shear zone (i.e., an area of deformed sea ice along the coast and outflow of sea ice) (Serreze and Barry, 2005). In particular, a high lead fraction was found around the Beaufort Sea during the spring season (MAM) because of the Beaufort Gyre, a wind-driven ocean current. It is widely known that the Chukchi Sea is the main strait through which warm Pacific water flows into the Arctic (Woodgate et al., 2006, 2010). However, the lead fraction around the Chukchi Sea was lower than the lead fraction around the Beaufort Sea from January to April (i.e., winter season) between 2011 and 2016, excluding 2015. While the lead fraction decreases from October to March (i.e., freezing season) with a minimum in March, the lead fraction starts to increase from April. Changes in the Arctic Ocean circulation have contributed to the change in the state of sea ice. The lead fraction along the coast of northwestern Greenland in Figs. 6 and 7 is low because of the convergence of sea ice by two major circulations, as shown in Kwok (2015). Kwok et al. (2013) revealed that the current speeds of the Beaufort Gyre and Transpolar Drift increased from 1982 to 2009, leading to a decrease in the fraction of MYI. However, we do not find an increase in lead fraction between 2011 and 2016, likely due to the high interannual variability in lead fraction (Fig. 8). In order to properly compare the Arctic current circulations and lead fraction, long-term lead fraction data are needed.

Grid sensitivity analysis
The high standard deviation values around the coastline of the Arctic Ocean imply that the reliability of lead fractions was low. This might further explain why we do not observe an increase in the lead fraction in marginal zones as reported in the literature. On the other hand, the relatively large number of CryoSat-2 observations around the North Pole produced low standard deviations, indicating less sensitivity (Fig. 9i-l). As mentioned in Sect. 3.2, the number of CryoSat-2 observations decreases from the North Pole toward the coastline of the Arctic Ocean. This results in an increase in statistical uncertainties when calculating monthly lead fraction around the coastline of the Arctic Ocean based on the small number of CryoSat-2 observations. The number of lead and ice observations is shown in Fig. 9a-h. While there are few lead observations in the central Arctic, a large number of ice observations were found there. There was a spatial difference in sensitivity by month (i.e., January to April) because of the different number of lead observations. In particular, since there was no lead observation along the East Siberian coast and in the eastern Laptev Sea, the sensitivity (i.e., standard deviation) was also zero (Fig. 9c and d). It should be noted that the corresponding lead fraction might not represent an actual lead fraction in a 10 km × 10 km grid. This is a drawback when calculating monthly lead fraction maps with satellite altimeters.

Comparison of lead classification methods
Since the overall accuracy metrics of the proposed waveform mixture algorithm were comparable to those of the existing methods, especially DT, the waveform-based method can be used for estimating sea surface height anomaly. Threshold-based lead detection methods have to be rescaled whenever baseline data are updated. For example, beam behavior parameters and backscatter σ0 changed slightly between baseline B and C data. Thus, thresholds must also be updated in order to appropriately identify leads using the threshold-based methods. However, the waveform mixture algorithm is less affected by the change in baseline data because waveforms can still be used to detect leads using updated baseline data. This is the strong point of the waveform mixture algorithm when compared to the existing methods.
The waveform mixture algorithm might not work well for detecting refreezing leads. In Fig. 4c, g, k, and o, the dark area in the MODIS scenes around the latitude of 84.26° N and longitude of 43° W was determined to be a lead class with visual inspection of the images and waveforms. Rose et al. (2013) classified this region as ice. Laxon et al. (2013) and the waveform mixture algorithm detected one lead in that region. In Lee et al. (2016), DT detected more leads in that region than the other methods, but the validation could not entirely cover the dark area. In fact, since the leads are often refrozen, the shape of the waveforms in that region was likely more similar to the FYI waveform than the lead waveform (Zygmuntowska et al., 2013; Ricker et al., 2015; Lee et al., 2016). In the context of the waveform mixture algorithm, this region could be classified as ice. Therefore, in order to more accurately detect leads, a surface elevation anomaly is needed as well as beam behavior parameters, backscatter σ0, and the waveform mixture algorithm because the surface elevation anomaly on refreezing leads would be low, as in other leads.

Comparison to other lead fraction maps
Four monthly lead fraction maps (Röhrs and Kaleschke, 2012; Wernecke and Kaleschke, 2015; Willmes and Heinemann, 2015; and this study) were compared to evaluate the pros and cons of each method used to produce the maps (Fig. 10). All four methods represent the spatiotemporal pattern of leads well for the freezing season from January to March. Scene-based lead fraction maps (i.e., AMSR-E in Fig. 10a-c, and MODIS in Fig. 10d-f) and altimeter-based lead fraction maps (i.e., CryoSat-2 in Fig. 10g-l) have fundamentally different spatial characteristics, as AMSR-E and MODIS are sensitive to different surface features. Scene-based lead fraction maps represent the linear shape of leads and coastal polynyas better than altimeter-based lead fraction maps. Since the AMSR-E-based approach only detects relatively large (∼ 3 km) leads, its lead fractions are generally lower than those from the other approaches. While altimeter-based lead fractions in the Chukchi Sea in January 2011 (Fig. 10g and j) were high, scene-based lead fractions (Fig. 10a-f) were low for the same month. The Chukchi Sea contains deformed and fragmented sea ice, which differs from the general lead shape. Altimeter-based lead detection methods identified leads between deformed and fragmented sea ice, generating a higher lead fraction in the Chukchi Sea in January 2011 (Fig. 10g and j). However, scene-based lead fraction methods did not detect leads in the Chukchi Sea well, resulting in a lower lead fraction. The MODIS-based lead detection method that used IST did not detect leads in the Chukchi Sea (Fig. 10d-f). In the AMSR-E images, sea ice signals were dominant in the footprint around the Chukchi Sea, and cracks between deformed and fragmented sea ice were identified as ice.
Altimeter-based monthly fraction maps might be insufficient to represent monthly lead fractions along the coastline of the Arctic Ocean due to the limited number of CryoSat-2 observations in a month. Nonetheless, altimeter-based lead fraction maps documented the overall spatial distribution of leads reasonably well, in particular the high lead fractions in the shear zone. Wernecke and Kaleschke (2015) used a random cross-validation technique to derive optimum thresholds based on ground references (i.e., MODIS images). They identified leads conservatively to reduce false classifications. The classification results strongly depend on ground reference data. Since relatively high-resolution (250 m) MODIS images were used to construct reference data in this study, the waveform mixture algorithm was able to identify small leads through the calibration process of the abundance data (Fig. 4). Although the proposed waveform mixture algorithm produced lead fraction maps with higher spatial resolution than those in Wernecke and Kaleschke (2015), the lead fractions around the coastline of the Arctic Ocean from Wernecke and Kaleschke (2015) appeared to be less sensitive. This is because their much coarser grid contains a larger number of lead observations per cell than ours. The grid sensitivity analysis should be considered when interpreting the lead fraction maps around the coastline of the Arctic Ocean derived by the proposed waveform mixture algorithm.
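The link between grid size, the number of observations per cell, and the resulting lead fraction can be illustrated with a minimal sketch. It assumes that along-track classifications have already been projected to planar coordinates in kilometres and that the lead fraction of a cell is the number of lead observations divided by the number of all observations in that cell. The function name `grid_lead_fraction`, its arguments, and the sample points are hypothetical and are not taken from the processing chain of this study.

```python
import numpy as np

def grid_lead_fraction(x_km, y_km, is_lead, cell_km, extent_km):
    """Bin point classifications into square cells of size cell_km.

    Returns (fraction, n_total): the lead fraction per cell (NaN where a
    cell has no observations) and the number of observations per cell.
    """
    n = int(np.ceil(extent_km / cell_km))
    ix = np.clip((np.asarray(x_km) // cell_km).astype(int), 0, n - 1)
    iy = np.clip((np.asarray(y_km) // cell_km).astype(int), 0, n - 1)
    lead = np.asarray(is_lead, dtype=bool).astype(int)

    n_total = np.zeros((n, n), dtype=int)
    n_lead = np.zeros((n, n), dtype=int)
    np.add.at(n_total, (iy, ix), 1)      # count every observation
    np.add.at(n_lead, (iy, ix), lead)    # count lead observations only

    fraction = np.full((n, n), np.nan)
    observed = n_total > 0
    fraction[observed] = n_lead[observed] / n_total[observed]
    return fraction, n_total

# Five made-up observations along one track (1 = lead, 0 = ice).
x = [1, 2, 12, 13, 14]
y = [1, 1, 1, 1, 1]
cls = [1, 0, 0, 0, 1]

fine, fine_counts = grid_lead_fraction(x, y, cls, cell_km=10, extent_km=20)
coarse, coarse_counts = grid_lead_fraction(x, y, cls, cell_km=20, extent_km=20)
```

With few observations per cell, one additional lead observation changes the fine-grid fraction strongly, whereas the coarse cell pools all five observations into a single, more stable value.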
The choice of monthly lead fraction maps depends on the user's interest. Scene-based lead fraction maps better represent coastal polynyas and the intrinsic form of leads (Röhrs and Kaleschke, 2012; Willmes and Heinemann, 2016). CryoSat-2-based lead fraction maps might not represent the linear shape of typical leads well, but they do capture cracks between deformed and fragmented sea ice that are not linear in form. Such cracks also exchange heat and momentum between the atmosphere and ocean, and they can be detected as leads.

Novelty and limitations
In this study, we developed an alternative lead detection method (i.e., the waveform mixture algorithm) using CryoSat-2 L1b data, which can overcome the drawbacks of the previous threshold-based lead detection methods. Regardless of an update in CryoSat-2 baseline data, the proposed waveform mixture algorithm can consistently identify leads without rescaling parameters such as beam behavior parameters, pulse peakiness, and backscatter σ⁰. Such parameters must be rescaled to implement threshold-based lead detection methods when using updated CryoSat-2 baseline data.
In addition, the proposed waveform mixture algorithm outperformed the existing simple thresholding-based methods (Rose et al., 2013; Laxon et al., 2013) and was comparable to the machine-learning-based thresholding method (Lee et al., 2016). These advantages make the proposed waveform mixture algorithm useful for integration in operational systems.
However, the waveform mixture algorithm depends on the quality of the endmembers. Although the use of the N-FINDR algorithm reduced the subjectivity of endmember selection, the waveform samples of leads and sea ice derived by the DT algorithm from Lee et al. (2016) may introduce uncertainty because that algorithm was validated for March and April from 2011 to 2014. Leads that are not identifiable in the MODIS images were not considered in this study. The detection of leads smaller than the along-track resolution of CryoSat-2 (∼ 300 m) with various lead detection methods should be examined in detail in future research using high-resolution Landsat or SAR imagery. This is important in the retrieval of sea ice thickness using an altimeter because leads are used as the tie points for the sea surface height (SSH). For example, how leads smaller than the along-track resolution of CryoSat-2 affect the waveform and SSH should be further investigated. The spatial resolution of the monthly lead fraction maps was improved to 10 km, showing a detailed spatial distribution of leads in the Arctic. For example, 10 km lead fractions showed significant variations in some regions, while 50 or 100 km lead fractions did not, because lead fractions are averaged over larger cells, resulting in blurred spatial patterns.
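The blurring effect of coarser grids can be sketched as follows. The example aggregates per-cell observation counts from a fine grid into blocks and recomputes the lead fraction; the function name `coarsen_lead_fraction`, the aggregation factor, and the synthetic counts are illustrative assumptions, not the gridding procedure used to produce the maps in this study.

```python
import numpy as np

def coarsen_lead_fraction(n_lead, n_total, factor):
    """Aggregate fine-grid counts into factor x factor blocks.

    n_lead and n_total are 2-D arrays of lead and total observation counts.
    Returns the lead fraction on the coarse grid (NaN where no observations).
    """
    rows, cols = n_lead.shape
    if rows % factor or cols % factor:
        raise ValueError("grid shape must be divisible by factor")

    def block_sum(a):
        # Split each axis into (blocks, factor) and sum within each block.
        return a.reshape(rows // factor, factor,
                         cols // factor, factor).sum(axis=(1, 3))

    lead_c = block_sum(n_lead)
    total_c = block_sum(n_total)
    fraction = np.full(total_c.shape, np.nan)
    observed = total_c > 0
    fraction[observed] = lead_c[observed] / total_c[observed]
    return fraction

# Synthetic 10 x 10 grid of 10 km cells with 10 observations per cell and
# one narrow column of cells where 8 of the 10 observations are leads.
n_total = np.full((10, 10), 10)
n_lead = np.zeros((10, 10), dtype=int)
n_lead[:, 2] = 8

fine_fraction = n_lead / n_total                               # peak of 0.8
coarse_fraction = coarsen_lead_fraction(n_lead, n_total, 5)    # 50 km cells
```

In this synthetic case the narrow band with a lead fraction of 0.8 on the 10 km grid is diluted to 0.16 in the 50 km cells that contain it, which is the kind of smoothing described above.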

Conclusions
The waveform mixture algorithm was proposed to detect leads with CryoSat-2 L1b data. The lead and sea ice waveforms were considered to be endmembers, which are essential for implementing the waveform mixture algorithm. The endmembers (i.e., representative waveforms of leads and sea ice) were extracted by the N-FINDR algorithm from numerous waveforms (i.e., 420 858 waveforms of sea ice and 8501 waveforms of leads). The thresholds used for the binary classification were determined by calibrating lead and sea ice abundances with reference data extracted from high-resolution (250 m) MODIS images. The results show that the proposed approach robustly classified leads, with performance comparable to DT from Lee et al. (2016) and slightly better than the existing simple thresholding approaches for lead detection (Rose et al., 2013; Laxon et al., 2013). This comparability to the DT-based lead detection method (Lee et al., 2016) suggests that sea ice freeboard can be retrieved with the robust lead detection provided by the waveform mixture algorithm. Monthly lead fraction maps were produced using the proposed waveform mixture approach, showing clear interannual variability. The results of the lead fraction maps are consistent with the findings of recent studies (Tilling et al., 2015; Ricker et al., 2017; Kim et al., 2017).
Threshold-based lead detection methods depend heavily on beam behavior parameters. In contrast, the proposed waveform mixture algorithm uses waveforms directly, so no parameters need to be changed when the CryoSat-2 baseline version is updated. This method can be easily adapted to future missions. In this context, the waveform mixture algorithm can be used to consistently produce monthly lead fraction maps during the extended CryoSat-2 mission for monitoring Arctic sea ice. In addition, this study showed the high interannual variability of pan-Arctic lead fractions in recent years (i.e., 2011-2016).
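The abundance and thresholding steps summarized above can be sketched for the two-endmember case. The sketch assumes a linear mixture with non-negative abundances that sum to one, for which the least-squares lead abundance is the projection of the waveform onto the line between the two endmembers, clipped to [0, 1]. The Gaussian-shaped endmembers, the function names, and the threshold of 0.5 are placeholders for illustration; the endmembers in this study come from N-FINDR and the thresholds from calibration against MODIS reference data.

```python
import numpy as np

def lead_abundance(waveform, e_lead, e_ice):
    """Lead abundance a in the model w = a * e_lead + (1 - a) * e_ice.

    Least-squares solution under the constraints 0 <= a <= 1 and
    a_lead + a_ice = 1 (the ice abundance is 1 - a).
    """
    w = np.asarray(waveform, dtype=float)
    d = np.asarray(e_lead, dtype=float) - np.asarray(e_ice, dtype=float)
    a = np.dot(w - e_ice, d) / np.dot(d, d)
    return float(np.clip(a, 0.0, 1.0))

def classify(waveform, e_lead, e_ice, threshold=0.5):
    """Binary label from the lead abundance; threshold is a placeholder."""
    a = lead_abundance(waveform, e_lead, e_ice)
    return "lead" if a >= threshold else "ice"

# Synthetic stand-ins for the endmembers: a narrow, specular-like peak for
# leads and a broader, diffuse-like return for sea ice (128 range bins).
bins = np.arange(128)
e_lead = np.exp(-0.5 * ((bins - 40) / 1.5) ** 2)
e_ice = np.exp(-0.5 * ((bins - 45) / 8.0) ** 2)

# A waveform built as 70 % lead and 30 % ice.
mixed = 0.7 * e_lead + 0.3 * e_ice
abundance = lead_abundance(mixed, e_lead, e_ice)
label = classify(mixed, e_lead, e_ice)
```

For a noise-free mixture such as `mixed`, the recovered abundance equals the mixing weight up to floating-point error. With more than two endmembers the closed form no longer applies and a constrained least-squares solver would be needed.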

Figure 1. Representative waveforms of (a) leads and (b) sea ice over the Arctic Ocean selected by the N-FINDR algorithm from January to May and October to December between 2011 and 2016. Refer to the methods section for the N-FINDR algorithm.

Figure 2. The 48 CryoSat-2 orbit files from January 2011 to December 2016 used for extracting endmember waveforms. The CryoSat-2 orbit files cover almost the entire Arctic Ocean.

Figure 3. Lead and ice abundance derived by the waveform mixture algorithm on 10 October 2015.(a) Lead abundance, (b) ice abundance.The color bar expresses abundances from 0 to 1.

Figure 4. Visual comparison of lead classifications (a)-(d) based on Rose et al. (2013), (e)-(h) based on Laxon et al. (2013), (i)-(l) based on decision trees from Lee et al. (2016), and (m)-(p) based on the proposed waveform mixture algorithm. The MODIS data were collected on 27 March 2016 (a, e, i, m), 17 April 2014 (b, f, j, n), 25 May 2015 (c, g, k, o), and 10 October 2015 (d, h, l, p). An overview map of the location of cropped MODIS images is at the top of the figure.

Figure 6. Monthly lead fraction maps based on the waveform mixture algorithm from January to May and October to December between 2011 and 2013. The range of the color bar was set from 0 to 0.5 to emphasize lower values.

Figure 9. (a-d) The number of lead observations, (e-h) the number of ice observations, and (i-l) the standard deviation of the results based on the sensitivity analysis of lead fraction from January to April 2011.

Figure 10. Comparison to other lead fraction maps from January to March 2011. (a-c) Monthly mean thin ice concentration maps using AMSR-E from Röhrs and Kaleschke (2012). (d-f) Monthly mean lead fraction maps using MODIS from Willmes and Heinemann (2015). (g-i) Monthly lead fraction maps using CryoSat-2 from Wernecke and Kaleschke (2015). (j-l) Monthly lead fraction maps based on the waveform mixture algorithm using CryoSat-2 in this study.