Assessment of error in satellite derived lead fraction in Arctic



Introduction
In winter, leads control heat transfer between the ocean and the atmosphere despite their relatively small areal coverage. For instance, sensible heat flux through leads can be of the order of 600 W m⁻², compared to an annual average of about 3 W m⁻² over ice (Maykut, 1978). This applies to leads represented by both open water and thin ice, but in winter refreezing happens very quickly and open-water leads exist only for a very short time (Weeks, 2010). Open-water leads alone, even though they cover only 1-2 % of the central Arctic, contribute more than 70 % to the upward heat fluxes (Marcq and Weiss, 2012). Model simulations showed that even a 1 % change in sea ice concentration due to an increase in areal lead fraction can lead to a 3.5 K difference in surface temperature (Lüpkes et al., 2008). Studying signatures of leads and surrounding ice in images from the Moderate Resolution Imaging Spectroradiometer (MODIS), Beitsch et al. (2014) showed that the difference in ice surface temperature between thicker ice and a lead covered by thin ice could be as large as 15-20 K, while open water and thin ice in leads differed in temperature by up to 10 K (Fig. 2 in Beitsch et al., 2014). This makes the surface energy budget in the Arctic very sensitive to the fraction of the surface covered by leads, which has changed in recent years with the shift towards a younger (Maslanik et al., 2007) and mechanically weaker sea ice cover (Rampal et al., 2009).
The areal fraction of leads in Arctic sea ice can be viewed as a parameter reflecting loss of mechanical strength of the ice pack and indicating the degree of mobility of the surrounding sea ice. Rampal et al. (2009) reported a steady increase in sea ice deformation rate and drift during 1979-2007 and argued for a possible correlation between the two. These trends remain a challenge to capture for current sea ice models, especially because the models fail to simulate sea ice fracturing and lead opening with the correct properties. Accurate observations of lead fraction are thus of high importance for model evaluation and for assimilation into models, either as initial conditions or during a simulation. For example, Bouillon and Rampal (2015) and Rampal et al. (2015) recently presented a new sea ice model that is able to use information on lead fraction to constrain the local mechanical response of sea ice to winds and currents, with a significant impact on performance with respect to, e.g., simulated sea ice drift and deformation. In this context, using accurate estimates of lead fraction with their associated uncertainties is crucial.
A method for areal lead fraction (LF) retrieval from the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E) [...] Synthetic Aperture Radar (SAR) images and CryoSat-2 tracks (Röhrs et al., 2012). A daily, light- and cloud-independent pan-Arctic LF dataset (AMSR-E LF) for the winter months November-April from 2002 to 2011 was obtained using this method and published at the Integrated Climate Data Center (ICDC), University of Hamburg (http://icdc.zmaw.de/); it represents a unique and valuable dataset. It was then used to automatically obtain lead location and orientation with a success rate of 57 % (Bröhan and Kaleschke, 2014). Preferred lead orientations were found to be typical for different regions of the Arctic.
The AMSR-E LF method is essentially a thin ice concentration retrieval method, which was adapted to identify leads by using median filtering. This filtering enhances the leads' features due to their narrow and elongated shape. Therefore, other thin ice retrieval methods based on passive microwave observations (e.g., Mäkynen and Similä, 2015; Naoki et al., 2008; Cavalieri, 1994) cannot be used directly for LF retrieval. The sea ice concentration algorithm ASI (Svendsen et al., 1987; Kaleschke et al., 2001; Spreen et al., 2008) was able to identify leads (Beitsch et al., 2014) when implemented at the 89 GHz frequency of AMSR2 on board the Global Change Observation Mission-Water satellite with a resolution of 3.125 km. However, this approach is limited in time coverage because AMSR2 started to deliver data only in 2012 (http://suzaku.eorc.jaxa.jp), and quantitative validation work is still needed.
A lead detection method based on MODIS ice surface temperature was developed by Willmes and Heinemann (2015). The method classifies a scene into leads and artefacts, where for the first class (leads) the success rate is as large as 95 %. However, in the class of artefacts, which are mostly caused by ambiguity in cloud identification, there is a 50 % chance of a feature being either a lead or an artefact. The combined retrieval error from the two classes for a daily map, obtained by averaging (the class errors of 5 % and 50 % average to about 28 %), is estimated to be 28 %. The method gives daily lead occurrence maps at 1 km² resolution.
A number of classifiers applied to CryoSat-2 were tested for lead detection potential, and the most promising one was identified and used to derive LF and lead width distribution (Wernecke and Kaleschke, 2015). The selected classifier was able to detect ∼68 % of leads correctly, and only ∼3 % of ice measurements were falsely identified as leads. Despite such good capability and a fine resolution of 250 m, LF retrievals from CryoSat-2 are limited spatially, because the measurements are conducted along tracks, making daily pan-Arctic coverage impossible, and temporally, because the satellite was launched in 2010. Suggested approaches using laser altimetry for lead detection (e.g., Farrell et al., 2009, with the Ice, Cloud and land Elevation Satellite (ICESat)) have similar limitations.
Lindsay and Rothrock (1995) suggested a method for retrieval of lead widths and LF from the thermal and reflected solar channels of the Advanced Very High Resolution Radiometer (AVHRR). The nominal resolution of the instrument is 1.1 km, and it is also able to resolve subpixel-sized leads due to the strong contrast caused by leads and their network-like pattern. However, an AVHRR-retrieved LF dataset would be limited to cloud-free areas, and its quality would depend on the quality of the cloud masking defining these areas.
Automatic classification of leads from SAR is difficult, because the radar backscatter signature of leads in SAR images can be ambiguous. This is due to wind roughening of the open water in the leads and the occasional presence of frost flowers when new ice has just formed in a lead (Röhrs et al., 2012). To the authors' knowledge, no method addressing automatic LF retrieval from SAR has so far been presented in the literature.
As outlined above, a variety of promising methods is available to detect leads and retrieve LF from satellites. They all have their advantages and disadvantages and, depending on these, can be used for different purposes. The topic of this study is a dataset meeting the following criteria: it retrieves LF (not only lead occurrence, location or orientation), has daily pan-Arctic coverage, is cloud- and light-independent, and covers the longest possible time period. The AMSR-E LF appears to be the only suitable dataset in this context, and we therefore find it necessary to provide quantitative error estimates for this dataset, which has not been done before. Based on analysis of the errors, we introduce a correction factor for the existing dataset and suggest an improvement of the AMSR-E based method itself. In order to achieve the goal of this study, a simple SAR-based method for LF retrieval is suggested. Currently the method is specifically adapted for the purposes of this study, but further development could yield a universal approach for areal LF retrieval from SAR, which would be highly valuable. Following the Introduction, Sect. 2 of the paper describes the data used for the study, and Sect. 3 explains the SAR-based method. The results are presented in Sect. 4, followed by the discussion and conclusions in Sects. 5 and 6.

The AMSR-E LF dataset
The AMSR-E LF dataset for the time period of November 2003-April 2011 was used (downloaded in February 2015, http://icdc.zmaw.de/1/daten/cryosphere/lead-area-fraction-amsre.html). It covers the winter months of November through April and is provided on a polar-stereographic grid with 6.25 km resolution distributed by the National Snow and Ice Data Center (NSIDC). LF is expressed as the percentage of a grid cell covered by leads, which are represented by either open water or thin ice. Since openings refreeze very quickly in winter, the majority of the data entries are thin ice concentrations. The dataset is limited to areas where sea ice concentration is above 90 %, as retrieved by the ASI algorithm.
The AMSR-E LF dataset is shown in Fig. 1 as the number of measurements in each bin expressed in % of the total number of measurements (relative frequency), where each bin has a width of 5 % except the first one, which excludes LF < 1 %. These very small values of LF appeared rather random on the daily maps and were therefore excluded, assuming the method's precision would not have allowed resolving them anyway. All grid cells close to land (within 2 grid cells) were also removed, because these areas contained a large number of near 100 % LF values, which may be caused either by the real presence of coastal polynyas/leads or by an artefact due to the vicinity of land. Figure 1 shows the full dataset covering all the winters from November 2003 through April 2011 (∼26 million measurements) by blue bars, and each month from November 2008 to April 2009 (varying from ∼430 to ∼600 thousand measurements) by different colours. The histograms for these months reflect the tendency observed in the full dataset, thus allowing us to limit the analyses presented in this paper to only this one winter. The last bin (LF 95-100 %), characterised by a significant number of measurements in comparison to the other bins with high LF values, will be addressed in later sections.
For the validation by SAR images, the AMSR-E LF dataset was re-projected onto the domain defined in Sect. 2.2 using Nansat, an open source Python toolbox for processing 2-D satellite earth observation data (Korosov et al., 2015, https://github.com/nansencenter/nansat).

The SAR images
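ENVISAT ASAR WSM (advanced SAR wide swath mode) images at HH-polarisation acquired during the winter of November 2008-April 2009 were used in this study. The area of interest is defined by the geographical coordinates (83° N, 20° W), (87 [...] and is shown in Fig. 2 by the red rectangle. This area, located north of Fram Strait, was chosen due to the relatively large number of leads occurring in this particular region (see e.g. Bröhan and Kaleschke, 2014), so that a sufficient number of AMSR-E LF retrievals would be available for validation, and because this region is well covered by SAR data. The SAR images, originally provided with a spatial resolution of 75 m × 75 m, were re-projected using the Nansat toolbox onto a polar stereographic projection with a nominal resolution of 100 m × 100 m, with latitude of origin and central meridian defined by the central coordinates of the selected area. The calibrated surface backscattering coefficient (ASAR Product Handbook, 2007) normalized over ice was used for this study (we will refer to this value as backscatter). The procedure of normalization is described in Zakhvatkina et al. (2013).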

SAR-based threshold technique
A threshold technique similar to the one developed for lead detection from MODIS-derived ice surface temperature (Willmes and Heinemann, 2015) is suggested for automatic lead identification in SAR scenes. Visual inspection of SAR images shows that leads, in most cases, have lower backscatter than the surrounding thicker ice. The transition is defined by a threshold, which is not constant from one image to another, as we find from automatic lead detection tests conducted on a number of SAR images. Therefore, we use characteristics of the backscatter distribution of each SAR scene instead.
Before the threshold can be applied to a SAR scene (a subset is shown in Fig. 3a and the respective distribution in Fig. 3d, beige bars), the image undergoes median filtering with a window size of five pixels (found experimentally), which reduces the noise while preserving the edges of the features. One such filtered subset of a SAR image is shown in Fig. 3b (distribution in Fig. 3d, blue bars), where dark blue areas correspond to leads. Comparison of the distributions before filtering (wider) and after shows the noise-reducing effect of the median filtering. After applying the threshold, so that all backscatter values below it are classified as leads and the rest as ice, a binary map (Fig. 3c) is retrieved. The threshold ($\sigma_0^t$) is defined as

$$\sigma_0^t = \sigma_0^P - n_\delta\,\delta, \qquad (1)$$

where $\sigma_0^P$ is the backscatter value at the peak of the distribution (blue line in Fig. 3d), $\delta$ is the standard deviation of the distribution, and $n_\delta$ is the number of standard deviations to move away from the peak, which enables automatic identification of leads. The threshold was first tried with $n_\delta = 1$ and $n_\delta = 2$ (dashed red lines), but it was found that an intermediate value $n_\delta = 1.5$ (solid red line) worked better, and it was therefore chosen. For reference, the mean of the distribution is shown by a dashed grey line.

Next, SAR-based LF is calculated for each AMSR-E grid cell where the LF value is above 0.1 %. All the pixels classified as lead by SAR within such a grid cell are added together and divided by the total number of SAR pixels in it, which gives a percentage after multiplying by 100. This method is developed strictly for the purpose of validating the AMSR-E LF dataset and therefore does not represent an independent LF retrieval method from SAR. Its limitations and potential for further development for wider applications are addressed in Sect. 5.
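The steps described in this section (median filtering, a threshold placed $n_\delta$ standard deviations below the distribution peak, and the lead percentage within a cell) can be illustrated with a minimal pure-Python sketch. It is not the code used in the study: the function names, the histogram bin width used to locate the peak, and the edge handling of the filter are assumptions made for illustration only.

```python
from collections import Counter
from statistics import median, pstdev


def median_filter(image, size=5):
    """Median filter with a size x size window (hypothetical helper).

    Assumption: near the image edges only the part of the window that
    lies inside the image is used.
    """
    half = size // 2
    n_rows, n_cols = len(image), len(image[0])
    filtered = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                image[i][j]
                for i in range(max(0, r - half), min(n_rows, r + half + 1))
                for j in range(max(0, c - half), min(n_cols, c + half + 1))
            ]
            row.append(median(window))
        filtered.append(row)
    return filtered


def lead_threshold(image, n_delta=1.5, bin_width=0.5):
    """Threshold = peak backscatter minus n_delta standard deviations.

    Assumption: the peak is taken as the centre of the most populated
    histogram bin of width bin_width.
    """
    values = [v for row in image for v in row]
    bins = Counter(int(v // bin_width) for v in values)
    peak_bin = bins.most_common(1)[0][0]
    sigma_peak = (peak_bin + 0.5) * bin_width
    return sigma_peak - n_delta * pstdev(values)


def lead_fraction(image, threshold):
    """Percentage of pixels with backscatter below the threshold."""
    values = [v for row in image for v in row]
    n_lead = sum(1 for v in values if v < threshold)
    return 100.0 * n_lead / len(values)


# Toy scene: uniform ice with a dark strip three pixels wide.
scene = [[-20.0 if 4 <= c <= 6 else -10.0 for c in range(10)] for _ in range(10)]
filtered = median_filter(scene)
threshold = lead_threshold(filtered)
print(lead_fraction(filtered, threshold))
```

In this toy scene the strip is three pixels wide and therefore survives the five-pixel median filter; a feature narrower than three pixels would be removed by it, which is one reason the window size has to be chosen with the expected lead width in mind.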

Reference lead fraction datasets retrieved from SAR
Using the approach described in Sect. 3, we produced two SAR-based validation datasets: one with manual quality control of each SAR subset of 1000 × 1000 pixels (MQC SAR LF) and one based on an automatic threshold, where quality control is done by discarding images with obviously unsuccessful LF retrievals (SAR LF).

MQC SAR LF
This high-quality dataset was produced in order to verify the larger SAR LF dataset. The significantly larger number of measurements in the SAR LF dataset allows robust statistical analysis, but visual quality control of each image, given that leads are numerous small features, is hardly achievable. For the MQC SAR LF, two criteria need to be verified: (1) whether the classification is successful and (2) whether leads are located in exactly the same places in both SAR- and AMSR-E-retrieved LF. The latter was mostly the case; however, sometimes a lead in AMSR-E LF was displaced by a distance large enough for the two datasets to mismatch. We believe this displacement is caused by relatively fast sea ice drift in the area. If we consider an AMSR-E grid cell of 6.25 km × 6.25 km, a SAR image is taken at a certain time of the day in this grid cell, while AMSR-E LF is a gridded daily product and thus provides an average over all the swaths covering this grid cell collected during 24 h. Within a few hours the lead could have moved fast enough to disappear from the given grid cell. From visual analysis of the images we could say that this situation did not happen very often; however, a quantitative estimate of how much it affects the validation was needed. Therefore, we assume that if the distribution of SAR LF is similar to that of MQC SAR LF, where we made sure every lead was located correctly, such displacements were also rare in the SAR LF dataset.
To produce the MQC SAR LF, five SAR scenes acquired in March 2009 with a sufficient number of easily distinguishable leads were selected. It was found that the quality of LF retrieval increases when SAR scenes are divided into subsets, and a subset size of 1000 × 1000 pixels proved sufficient. Using such small subsets rather than a full SAR image provides more accurate thresholds because it limits the possible variability in conditions within the subset. Such conditions can be wind speed or ice surface properties (wet or dry ice, for example). Defining a threshold locally not only reduces the significance of these effects, but also takes advantage of a smaller variety of surfaces in general. For example, the presence of open water, land, consolidated ice, wet ice, dry ice, and marginal ice zone in one image will make it difficult to find a threshold that identifies only leads, whereas a smaller subset, where only consolidated ice with leads is present, will give a clearer threshold.
The threshold was thus calculated individually for each 1000 × 1000 pixel subset using Eq. (1), and used to calculate LF in the corresponding AMSR-E grid cells. The classification in each subset was then inspected visually, comparing the three collocated maps (backscatter, MQC SAR LF and AMSR-E LF), in order to make sure it was successful. This procedure gave 1645 high-quality MQC SAR LF retrievals, which were then used to verify the findings based on the larger SAR LF dataset.
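Dividing a scene into subsets so that a threshold can be computed locally might look like the following sketch. The helper name `iter_subsets` is hypothetical, and letting the last row and column of subsets be smaller than 1000 pixels is an assumption; the text does not say how partial subsets at the scene edge were treated.

```python
def iter_subsets(n_rows, n_cols, size=1000):
    """Yield (top, bottom, left, right) pixel bounds of square subsets.

    Assumption: subsets at the scene edge are kept even when they are
    smaller than size x size.
    """
    for top in range(0, n_rows, size):
        for left in range(0, n_cols, size):
            yield top, min(top + size, n_rows), left, min(left + size, n_cols)


# A 3500 x 3500 pixel scene gives a 4 x 4 arrangement of subsets.
bounds = list(iter_subsets(3500, 3500))
print(len(bounds), bounds[-1])
```

Each set of bounds would then be used to slice the backscatter array and derive a separate threshold with Eq. (1) for that subset.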

SAR LF
To produce this dataset, SAR subsets of 3500 × 3500 pixels each (on average) were used: the full SAR images were cut to match the region of interest (Fig. 2). The quality control of this validation dataset was done by visual inspection of every classified subset together with the original SAR subset (backscatter) and the collocated AMSR-E LF product. In this process, images were discarded in cases of unsuccessful lead identification, that is, when features that appear to be leads were missed by the method. This was of particular importance in cases where AMSR-E LF identified a feature in the respective location, to secure proper error estimation for the AMSR-E LF product. The majority of subsets contained leads represented by signatures darker than the surrounding background; images in which leads with a brighter signature were present in large numbers were discarded. This means that the majority of the leads in the selected subsets were composed of either thin ice or calm open water. Therefore, wind speed is not taken into account in this study, but for a more general application it would be necessary to account for wind roughening of the open water areas in leads.
As a result we obtained a dataset for the period of November 2008-April 2009, made of 21-47 subsets (3500 × 3500 pixels each) per month, with number of measurements varying from about 8000 to 19 500 (Table 1) depending on the month.

Comparison of the AMSR-E LF and MQC SAR LF
The AMSR-E LF and MQC SAR LF datasets are shown in Fig. 4 as a scatterplot (left) and histograms (right). The scatterplot shows that the majority of the points are located below the 1-to-1 line, which means that in most cases AMSR-E LF overestimates the LF as compared to the SAR retrievals. Note that for an AMSR-E LF value of 100 % there is a wide range of MQC SAR LF values covering almost the full scale from 0 to 100 %. The right panel of Fig. 4 shows histograms of the two datasets, representing the number of measurements in each 5 % bin expressed in % of the total number of measurements (1645 in this case). The distributions of the two datasets look fundamentally different: SAR shows a steep decrease in the number of cases with increasing LF, whereas AMSR-E LF shows a wide distribution of values. Thus, AMSR-E LF seems to largely overestimate the number of cases for LF > 20 % and to underestimate it for lower LF values. Similarly to the full AMSR-E LF dataset (Fig. 1), the near 100 % bin contains a relatively large number of measurements. In fact, about 94 % of all the data in this bin in the full AMSR-E dataset are above 99.9 %. From Fig. 1 one could assume that the number of measurements in this bin should be smaller than in the previous bin, following the gradual decline of the distribution (in accordance with the power-law distributions suggested by Wernecke and Kaleschke, 2015, and Marcq and Weiss, 2012), so that there are many more small leads than large ones. In order to understand the origin of such a large number of LF values near 100 %, we compare spatial maps of LF obtained from AMSR-E and SAR. As an example of such analysis, Fig. 5 shows part of a SAR image overlaid by the collocated AMSR-E LF product, where one can see a general overestimation of LF by AMSR-E (larger grid cells shown as percentage by different colours). This is particularly clear for the LF 100 % cases (red grid cells): these often correspond to a smaller amount of water/thin ice in the SAR image. Four neighbouring AMSR-E grid cells are shown in a close-up inset, where three of them have an LF value of 100 % (the fourth one has no value), while the SAR image in the background clearly contains one lead that covers only about 25 % of the right grid cell, 40 % of the upper grid cell and about 60 % of the left one, where smaller cracks are also present.

Error estimations of the AMSR-E LF based on SAR LF
The same procedure as in Sect. 4.2 is now applied using the large SAR LF dataset. Histograms for the collocated AMSR-E LF and SAR LF datasets are produced for each month of the considered period (Fig. 6). They show the same tendency as when using the smaller high-quality dataset. The distributions here are much smoother because of the significantly larger number of measurements. The similarity of the distributions coming from the high-quality MQC SAR LF and from SAR LF allows us to base our conclusions on the larger dataset (SAR LF), thus providing more accurate estimates of errors.
Having this significant number of collocated SAR and AMSR-E retrievals of LF, we can confirm that the peak in the AMSR-E LF dataset near 100 % represents an artefact. We believe that this is a result of the assumption behind the AMSR-E method for LF retrieval. The method is based on the ratio of the brightness temperatures ($r$) in the 89 and 19 GHz channels (Röhrs et al., 2012). The assumption is that all values of this ratio above a certain constant value (a tie point) will give LF 100 %. All other values are linearly interpolated between a tie point for LF 0 % ($r_0$) and a tie point for LF 100 % ($r_{100}$). If the upper tie point $r_{100}$ is too low, a significant number of LF values assigned a value of 100 % by this cut-off may actually correspond to a variety of LF values much lower than 100 %. This is reflected in Figs. 4 (left) and 5, where values of LF 100 % in the AMSR-E dataset correspond to a variety of values in the SAR dataset. Ideally, an improvement of the AMSR-E LF method is needed, for example by adjusting the upper tie point so that the full range of LF values is covered. We address this further in Sect. 5.

Since improvement of the method and production of a new AMSR-E LF dataset is out of the scope of this study, we suggest imitating the same problem with the SAR LF dataset instead. Introduction of a new upper tie point $r'_{100}$ would be equivalent to dividing all the AMSR-E LF values by a certain factor, defined as $f = (r'_{100} - r_0)/(r_{100} - r_0)$, because the method is based on linear interpolation of all the values between the limits of the range. Since the LF values in the near 100 % bin for AMSR-E LF are unknown, we suggest multiplying the SAR LF dataset by this factor instead. In order to define the value of $f$ (also referred to as the AMSR-E factor), we vary its value from 1 to 5 and calculate the respective root mean square error (RMSE) as a measure of the difference between the histograms of the AMSR-E LF and SAR LF datasets for each month (Fig. 6):

$$\mathrm{RMSE}_h = \sqrt{\frac{1}{n_b}\sum_{i=1}^{n_b}\left(\mathrm{RF}_i^{\mathrm{AMSRE}} - \mathrm{RF}_i^{\mathrm{SAR}}\right)^2}, \qquad (2)$$

where RF stands for the relative frequency in each bin, and $n_b$ is the number of bins.
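The search for the factor $f$ can be sketched as follows, under stated assumptions: LF values are percentages binned into 5 % bins, the scaled SAR LF is capped at 100 % to imitate the cut-off at the upper tie point, and $f$ is scanned in steps of 0.1. The step size, the capping, and all function names are illustrative choices, not details taken from the study.

```python
import math


def relative_frequency(lf_values, bin_width=5.0):
    """Relative frequency (%) of LF values in bins of width bin_width."""
    n_bins = int(100 / bin_width)
    counts = [0] * n_bins
    for lf in lf_values:
        # LF = 100 % is placed in the last bin.
        index = min(int(lf // bin_width), n_bins - 1)
        counts[index] += 1
    return [100.0 * c / len(lf_values) for c in counts]


def rmse_h(lf_amsre, lf_sar, factor=1.0):
    """Histogram RMSE between AMSR-E LF and SAR LF multiplied by factor.

    Assumption: scaled SAR LF values are capped at 100 %.
    """
    scaled = [min(100.0, factor * lf) for lf in lf_sar]
    rf_amsre = relative_frequency(lf_amsre)
    rf_sar = relative_frequency(scaled)
    n_b = len(rf_amsre)
    return math.sqrt(sum((a - s) ** 2 for a, s in zip(rf_amsre, rf_sar)) / n_b)


def best_factor(lf_amsre, lf_sar, start=1.0, stop=5.0, step=0.1):
    """Factor between start and stop that minimises the histogram RMSE."""
    n_steps = int(round((stop - start) / step))
    candidates = [round(start + k * step, 10) for k in range(n_steps + 1)]
    return min(candidates, key=lambda f: rmse_h(lf_amsre, lf_sar, f))


# Synthetic example: "AMSR-E" values are SAR values times 3, capped at 100 %.
lf_sar = [2, 7, 12, 22, 40]
lf_amsre = [min(100, 3 * lf) for lf in lf_sar]
print(best_factor(lf_amsre, lf_sar))
```

Because the comparison is made between histograms rather than point by point, several neighbouring factors can give the same minimum on a small sample; `min` then returns the smallest of them.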
The obtained $\mathrm{RMSE}_h$ is plotted as a function of $f$ in Fig. 7 (left), where each month is assigned a different colour and March 2009 is highlighted by a bold line to illustrate the principle. By minimizing $\mathrm{RMSE}_h$ we find the optimal $f$ value for each month, which amounts to 3.3, 2.5, 2.8, 3.7, 2.8, and 2.[...] (Fig. 7, right). The values in other bins also redistribute in a way that is similar to the AMSR-E LF dataset. The original histograms of AMSR-E LF and SAR LF (same as Fig. 6, but for the full winter) are also shown for reference.
The systematic overestimation of AMSR-E LF data also affects the mean value of the distribution. For winter 2009, the mean value of AMSR-E LF ($\mathrm{LF}_{\mathrm{AMSRE}}$) is equal to 31 %, whereas it is equal to 13 % for the SAR LF ($\mathrm{LF}_{\mathrm{SAR}}$). The absolute relative difference $100 \times |(\mathrm{LF}_{\mathrm{AMSRE}} - \mathrm{LF}_{\mathrm{SAR}})/\mathrm{LF}_{\mathrm{SAR}}|$ decreases from 140 % with no correction to 17 % when using the correction factors found here.
Finally, the agreement between the SAR LF and AMSR-E LF datasets can be estimated by the point-wise RMSE of LF for the whole winter 2009:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{LF}_i^{\mathrm{AMSRE}} - \mathrm{LF}_i^{\mathrm{SAR}}\right)^2}, \qquad (3)$$

where $n$ is the total number of measurements (64 063). Here $\mathrm{LF}_i^{\mathrm{SAR}}$ are the LF values obtained after multiplying by the correction factor, so that the point-wise RMSE is relatively independent of the systematic bias in AMSR-E LF. The point-wise RMSE is equal to 43 % and is an estimate of the standard deviation of the difference between AMSR-E LF and SAR LF. However, a similar computation of RMSE using $\mathrm{LF}_i^{\mathrm{SAR}}$ without correction gives a value of 33 %, suggesting the need for a more physically justified approach, e.g. by improving the AMSR-E based method (see Sect. 5).
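As a small illustration of the two summary statistics used here, the point-wise RMSE and the absolute relative difference of the means, the following sketch may help. The function names are hypothetical and the input lists are assumed to hold collocated LF values in percent.

```python
import math


def pointwise_rmse(lf_amsre, lf_sar):
    """Point-wise RMSE between collocated AMSR-E and SAR LF values."""
    if len(lf_amsre) != len(lf_sar):
        raise ValueError("the two datasets must be collocated")
    n = len(lf_amsre)
    return math.sqrt(sum((a - s) ** 2 for a, s in zip(lf_amsre, lf_sar)) / n)


def relative_mean_difference(lf_amsre, lf_sar):
    """Absolute relative difference of the mean LF values, in percent."""
    mean_amsre = sum(lf_amsre) / len(lf_amsre)
    mean_sar = sum(lf_sar) / len(lf_sar)
    return 100.0 * abs((mean_amsre - mean_sar) / mean_sar)


# Using the reported winter means of 31 % and 13 % as single values:
print(round(relative_mean_difference([31.0], [13.0])))
```

With the means reported in the text, the relative difference evaluates to about 138 %, which is consistent with the rounded value of 140 % given for the uncorrected data.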

Discussion
A method to retrieve LF from the SAR backscattering coefficient is introduced. This simple threshold technique is only suitable for the purposes of this study, and is thus not universal. However, its potential is shown and its limitations are identified, which allows further development of the method. One of the limitations is the ambiguity of SAR signatures corresponding to leads. When a lead is represented by calm open water or thin ice, it has lower backscatter values than the surrounding thicker ice and can therefore be identified by a threshold. In contrast, when wind roughens the open water surface in the lead, its signature becomes brighter. Another case of such ambiguity is the presence of frost flowers on a newly refrozen lead, which also causes brighter signatures (Röhrs et al., 2012). Such leads with a brighter signature than the background are not identified by the presented SAR method, but are sometimes (not always) identified by the AMSR-E method. These cases were rare in the considered examples and were discarded from the analysis, thus not affecting the conclusions. For a more universal SAR-based method such cases can be included by introducing two thresholds: one for the leads appearing darker than the background and one for those appearing brighter. In that case the two sides of the backscatter distribution will be used independently.
Another limitation of the approach used is the presence of areas with presumably wet snow/ice, which appear rather dark in a SAR image and are therefore classified as leads by the threshold method. These cases did not occur often in our selection, and they did not influence the comparison because AMSR-E LF usually does not identify leads in such areas, and we only included the grid cells where the AMSR-E LF dataset had a value above 0.1 %. The threshold is also sensitive to the sea ice thickness. At a given threshold, only leads with sufficiently thin ice will be identified as leads. Since we do not know how thick the ice is, this adds to the ambiguity of such a method. In other words, by selecting a threshold we indirectly set a sea ice thickness limit. When the distribution is bimodal (one mode for leads and one for thicker ice), a value between the peaks can be used as the threshold, as suggested by Lindsay and Rothrock (1995) for distributions of temperature or brightness. However, such cases were so rare in the selected SAR images that this approach was discarded. To achieve a bimodal distribution, the LF calculation procedure can be applied to SAR scenes divided into sub-scenes (of approximately 1000 × 1000 pixels), which will demand more processing time. Such a definition of the threshold could serve as a more robust approach when developing an independent method for automatic SAR LF retrieval. For the purposes of this study, the quality of the suggested simple threshold method was considered sufficient.

A validation dataset is produced using this method in order to quantify errors in the AMSR-E LF estimates. However, these error estimates should be considered rather preliminary, because the AMSR-E LF product in its current form cannot be fairly compared to a validation dataset. We identify an issue related to near 100 % LF values in the AMSR-E LF dataset: they occur very often, which is neither observed in the SAR datasets nor consistent with the power-law model usually assumed to describe the lead width distribution well. Based on these findings and the basics of the AMSR-E method, we assume that the upper tie point in the method should be increased in order to cover the full range of LF values. In order to test this assumption, we implement the method according to Röhrs et al. (2012) and calculate LF from the AMSR-E brightness temperatures on 8 March 2009 with the original tie points (a subset is shown in Fig. 8, upper left), i.e. with the upper tie point $r_{100} = 0.05$. Such calculations give a distribution of LF values (Fig. 8, upper right) similar to that found in the full AMSR-E dataset (Fig. 1). Using the linear relationship between $r_{100}$ and $f$, and the optimal value of $f$ for March 2009 ($f = 2.8$), we find that $r_{100}$ should be increased to 0.113. This new value gives a distribution closer to the SAR LF dataset (Fig. 8, bottom right): the value of $\mathrm{RMSE}_h$ (Eq. 2) decreases from 5.4 % (corresponding to $f = 1$ in Fig. 7, left) to 0.9 %, while the point-wise RMSE (Eq. 3) for this one-day dataset of 750 collocated LF measurements decreases from 45 to 23 %. The close-up insets, similar to the one in Fig. 5, show that the leads are identified in the same locations as before, but the LF values are lower (Fig. 8, bottom left). We thus believe that applying such an adjustment to the full AMSR-E LF dataset will lead to much better agreement with the SAR LF dataset. The new value of $r_{100}$ retrieved for the other months amounts to 0.131, 0.103, 0.113, 0.145 for November 2008-February 2009 respectively, and 0.110 for April 2009. The average value of the new $r_{100}$ weighted by the number of observations for each month is 0.117 and is therefore our best estimate for winter 2008-2009.

It should be noted that even an improved AMSR-E LF method would still have its limitations. For example, it would not be able to capture leads narrower than 3 km due to its resolution, while leads as narrow as a few meters transmit turbulent heat more than twice as efficiently as those hundreds of meters wide (Marcq and Weiss, 2012). For studies such as assessing the integrated heat fluxes through leads in wintertime, the AMSR-E LF dataset alone will thus not be sufficient and other methods should be used in addition. Another limitation of such a method would be the retrieval of LF in summer, when passive microwave observations are challenging.
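The relation between the factor $f$ and the upper tie point discussed in this section can be checked with a short sketch. The text reports the original upper tie point (0.05), the March 2009 factor (2.8) and the adjusted tie point (0.113), but not the lower tie point; the value $r_0 = 0.015$ used below is therefore an inference from those three numbers through $f = (r'_{100} - r_0)/(r_{100} - r_0)$, not a value quoted from the source. Function names are hypothetical.

```python
def lead_fraction_from_ratio(r, r0, r100):
    """LF (%) by linear interpolation between tie points, clipped to 0-100."""
    lf = 100.0 * (r - r0) / (r100 - r0)
    return max(0.0, min(100.0, lf))


def adjusted_upper_tie_point(f, r0, r100):
    """New upper tie point equivalent to dividing LF values by f."""
    return r0 + f * (r100 - r0)


def implied_lower_tie_point(f, r100, r100_new):
    """Lower tie point implied by f and the old and new upper tie points."""
    return (f * r100 - r100_new) / (f - 1.0)


# Reported values for March 2009: r100 = 0.05, f = 2.8, new r100 = 0.113.
r0 = implied_lower_tie_point(2.8, 0.05, 0.113)  # inferred, not from the text
print(round(r0, 3))
print(round(adjusted_upper_tie_point(3.3, r0, 0.05), 3))  # November 2008
```

With this inferred lower tie point, the monthly factors 3.3, 2.5, 2.8 and 3.7 map to upper tie points that round to the reported 0.131, 0.103, 0.113 and 0.145, which supports the linear relation. A ratio equal to the old upper tie point, previously assigned LF 100 %, becomes 100/2.8, or about 36 %, with the adjusted one.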
An additional benefit of an improved and validated AMSR-E LF dataset would be the possibility to refine sea ice concentration datasets. An improved sea ice concentration dataset for Arctic winter could be produced by applying the ASI algorithm, which in itself is more sensitive to leads than other sea ice concentration algorithms (Beitsch et al., 2014), to AMSR2 brightness temperatures and then refining the result with the LF dataset. To achieve even better accuracy, the LF method in this case should also be applied to the 3.125 km resolution AMSR2 brightness temperatures.

Conclusions
This work was partly motivated by the need for an accurate pan-Arctic lead fraction (LF) dataset for initialisation and evaluation of regional sea ice models. One such dataset was identified as having good potential for the purpose: daily pan-Arctic LF retrieved from the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E), a passive microwave instrument independent of cloud cover and light conditions. In this study we set out to evaluate the AMSR-E LF dataset and provide quantitative estimates of possible errors. These can serve as a measure of the uncertainty of the product and as a basis for a correction.
After analysis of the AMSR-E LF dataset and comparison to LF retrievals from Synthetic Aperture Radar (SAR), we identified an issue with the near 100 % LF values in this dataset. More specifically, we concluded that the tie points used in the AMSR-E [...] the full AMSR-E LF dataset will lead to significantly lower errors when evaluated using SAR, making this dataset more valuable for e.g. assimilation into models or model evaluation.

Figure 1. Histograms for the AMSR-E lead fraction (LF) dataset shown as the number of measurements per LF bin of 5 % width, expressed in % of the total number of measurements (relative frequency). The blue bars show the full dataset, while each month of the winter 2008-2009 is shown by other colours (see the legend).

Table 1. Number of measurements in the SAR LF dataset.