Brief communication: The challenge and benefit of using sea ice concentration satellite data products with uncertainty estimates in summer sea ice data assimilation

Data assimilation experiments that aim at improving summer ice concentration and thickness forecasts in the Arctic are carried out. The data assimilation system used is based on the MIT general circulation model (MITgcm) and a local singular evolutive interpolated Kalman (LSEIK) filter. The effect of using sea ice concentration satellite data products with appropriate uncertainty estimates is assessed by three different experiments using sea ice concentration data of the European Space Agency Sea Ice Climate Change Initiative (ESA SICCI), which are provided with a per-grid-cell physically based sea ice concentration uncertainty estimate. The first experiment uses a constant uncertainty, the second one imposes the provided SICCI uncertainty estimate, while the third experiment employs an elevated minimum uncertainty to account for a representation error. Using the observation uncertainties that are provided with the data improves the ensemble mean forecast of ice concentration compared to using constant data errors, but the thickness forecast, based on the sparsely available data, appears to be degraded. Further investigating this lack of positive impact on the sea ice thicknesses leads us to a fundamental mismatch between the satellite-based radiometric concentration and the modeled physical ice concentration in summer: the passive microwave sensors used for deriving the vast majority of the sea ice concentration satellite-based observations cannot distinguish ocean water (in leads) from melt water (in ponds). New data assimilation methodologies that fully account for or mitigate this mismatch must be designed for successful assimilation of sea ice concentration satellite data in summer melt conditions. In our study, thickness forecasts can be slightly improved by adopting the pragmatic solution of raising the minimum observation uncertainty to inflate the data error and ensemble spread.


Introduction
For the past 30 years, the Arctic sea ice extent and volume have consistently decreased in all seasons, with a maximum decline in summer (Vaughan et al., 2013). This retreat has large effects on the climate system. For example, the strong contrast between the albedo of sea ice and open water has a profound effect on the Arctic surface heat budget. This retreat also influences the lower-latitude weather and climate and has been linked to extreme events at midlatitudes, for example, unusually cold and snowy winters in Europe, the USA, and eastern Asia (Liu et al., 2012; Cohen et al., 2012), heat waves and droughts in the USA and in Europe (Tang et al., 2014), and anomalous anticyclone circulation over Eastern Europe and Russia (e.g., Semmler et al., 2012; Yang and Christensen, 2012). Apart from its relevance to regional and global climate, Arctic sea ice decline opens new economic opportunities. Accurate summer sea ice forecasts are therefore urgently required to thoroughly manage the opportunities (e.g., shipping, tourism) and risks (e.g., oil spill, marine emergencies) associated with Arctic opening (Eicken, 2013).
Q. Yang et al.: The challenge and benefit of assimilating sea ice concentration with uncertainty estimates. Published by Copernicus Publications on behalf of the European Geosciences Union.
Sea ice data assimilation plays a pivotal role in sea ice forecasting, as it can provide realistic initial model states and continuously constrain the model state closer to reality. Data assimilation requires both reliable observed quantities and realistic uncertainty estimates. These requirements, especially regarding data uncertainties, are now also increasingly recognized by the sea ice remote-sensing community. Previous studies have shown that the assimilation of sea ice concentration (SIC) data can improve SIC estimates (e.g., Lisaeter et al., 2003; Lindsay and Zhang, 2006; Stark et al., 2008; Tietsche et al., 2013; Buehner et al., 2014) and also constrain the ice thickness and volume (Schweiger et al., 2011; Yang et al., 2015a). Given that error estimates in the studies mentioned above were assumed to be constant, there is scope for further improvement through the use of more realistic uncertainty estimates.
In 2010, the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) Ocean and Sea Ice Satellite Application Facility (OSISAF, http://www.osi-saf.org) released a climate data record of SIC based on SMMR and SSM/I data (Eastwood et al., 2011; product OSI-409). This data set features an explicit correction of the satellite signal due to weather contamination, dynamic adaptation of algorithm tie points, and spatiotemporally varying maps of uncertainties. In fact, this OSI-409 data set and its uncertainties were already successfully used for data assimilation purposes (e.g., Massonnet et al., 2013).
In May 2014, the European Space Agency (ESA) Sea Ice Climate Change Initiative (SICCI) released a SIC data set with associated uncertainty estimates (Version 1.11) to the public. In many respects, the SICCI SIC data set features an update of the algorithms and processing methodologies used for the OSISAF OSI-409 data set and, importantly, revised uncertainty estimates (Lavergne and Rinne, 2014). At the time of writing, these two data sets, SICCI and OSISAF OSI-409, are the only algorithms or products that come with physically based sea ice retrieval uncertainty information, as opposed to an estimate of the spatiotemporal variation of the ice concentration within a certain grid area and time window (e.g., NOAA SIC CDR; Peng et al., 2013). The new SICCI data set (v1.11) provides an opportunity to study the effect of the revised local (i.e., spatially varying) uncertainties on the assimilation of SIC data and hence on sea ice prediction skill.
In this study, we follow the approach of Yang et al. (2015a) and Yang et al. (2015b) by focusing on the summer of 2010 and using the same ensemble-based singular evolutive interpolated Kalman (SEIK) filter (Pham et al., 1998; Pham, 2001) in its local form (LSEIK, Nerger et al., 2006). The SEIK filter algorithm for assimilating the SIC is selected because it is computationally efficient when applied to nonlinear models (Nerger et al., 2005), and a localized implementation of such a filter allows for detailed sampling of forecast uncertainties (Nerger et al., 2006). The LSEIK filter has already been used successfully for SIC data assimilation (Yang et al., 2015a). The purpose of the study is to quantify the impact of different observational uncertainty approximations on sea ice data assimilation through a comparison with independent ice concentration and ice thickness observations.

Forecasting experiment design
We use the MIT general circulation model (MITgcm) sea ice-ocean model (Marshall et al., 1997; Losch et al., 2010). Following Yang et al. (2015a) and Yang et al. (2015b), this study employs an Arctic regional configuration with a horizontal resolution of about 18 km and open boundaries in the North Atlantic and North Pacific (Nguyen et al., 2011). To explicitly include flow-dependent uncertainty in atmospheric forcing, the approach by Yang et al. (2015a) was used, in which UK Met Office (UKMO) ensemble forecasts from the TIGGE archive (THORPEX Interactive Grand Global Ensemble) drive the ensemble of sea ice-ocean models. Each of the selected UKMO ensemble forecasts consists of one unperturbed "control" forecast and an ensemble of 23 forecasts with perturbed initial conditions. For further details the reader is referred to Bowler et al. (2008) and Yang et al. (2015a).
Following Yang et al. (2015a) and Yang et al. (2015b), the system's forecasting skill is evaluated with a series of 24 h forecasts over the period of 1 June to 30 August 2010, during which the LSEIK filter is applied every day. This particular period is chosen because it was the first time that open water was found in the interior pack ice near the North Pole as early as 12 July 2010 (NSIDC, http://nsidc.org/arcticseaicenews/2010/07/). During this summer melting period the Arctic sea ice extent (area with at least 15 % SIC) shrank from 11.8 million km² on 1 June to 5.2 million km² on 30 August 2010 (Sea Ice Index (Version 1.0), Fetterer et al., 2002, NSIDC), which gives a clear picture of sea ice melting in Arctic summer: on 1 June most of the Arctic Ocean was covered with closed pack ice, while on 30 August the sea ice area had shrunk to the central Arctic and the concentration was drastically reduced (Fig. 1).
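The extent definition quoted above (area with at least 15 % SIC) can be illustrated with a minimal sketch. The function name, the flat list layout, and the uniform 18 km cells are assumptions made for illustration only; they are not taken from the model code or from the Sea Ice Index processing.

```python
def sea_ice_extent(sic, cell_area_km2, threshold=0.15):
    """Sum the area of all grid cells whose SIC is at least `threshold`."""
    return sum(area for c, area in zip(sic, cell_area_km2) if c >= threshold)


# Hypothetical example: five cells of 18 km x 18 km each.
sic = [0.95, 0.40, 0.15, 0.10, 0.0]
areas = [18.0 * 18.0] * len(sic)

# Only the first three cells reach the 15 % threshold.
extent_km2 = sea_ice_extent(sic, areas)
```

Note that extent counts the full area of every cell above the threshold, so it is insensitive to concentration changes inside the pack.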
The simulated and satellite-observed SIC are combined using a sequential SEIK filter with second-order exact sampling (Pham et al., 1998; Pham, 2001) coded within the Parallel Data Assimilation Framework (PDAF, Nerger and Hiller, 2013; http://pdaf.awi.de). The filter algorithm includes the following phases: initialization, forecast, analysis, and ensemble transformation. The sequence of forecast, analysis, and ensemble transformation is repeated.
The required initial ensemble approximates the uncertainty in the initial state of the physical phenomena. Following Losa et al. (2012) and Yang et al. (2015a), we used a model integration driven by the 24 h UKMO control forecasts over the period of 1 June to 31 August 2010 to estimate the initial state error covariance matrix of SIC and thickness. The leading empirical orthogonal functions (EOFs) of this covariance matrix representing the model variability are transformed by the second-order exact sampling to generate the initial ensemble of ice concentration and thickness. An ensemble size of 23 states is chosen to match the ensemble size of the UKMO perturbed forcing. In the forecast phase, all ensemble states are dynamically evolved in time with the fully nonlinear sea ice model driven by the UKMO ensemble atmospheric forcing. The analysis step $k$ combines the predicted model state $\mathbf{x}^f_k$ with the observational information $\mathbf{y}_k$ and computes a corrected state $\mathbf{x}^a_k$ every 24 h as follows:

$$\mathbf{x}^a_k = \mathbf{x}^f_k + \mathbf{K}_k \left( \mathbf{y}_k - \mathbf{H}_k \mathbf{x}^f_k \right) \qquad (1)$$

Here $\mathbf{K}_k$ is the so-called Kalman gain that weights the observational information based on the model and data error covariances, $\mathbf{P}^f_k$ and $\mathbf{R}_k$, respectively. $\mathbf{H}_k$ is the observational operator that projects the model variable to the observational space. In the analysis step, the error covariance matrix and the ensemble of model states approximating $\mathbf{P}^a_k$ are updated. With the SEIK filter as a reduced-rank square-root approach, the updated ensemble of model states samples the analyzed model uncertainties according to the leading EOFs. As seen from Eq. (1), the quality of the analysis and, therefore, the system's prediction skill depend on the assumed prior error statistics $\mathbf{P}_k$ and $\mathbf{R}_k$. In this respect it is worth stressing the importance of accounting for representativeness/representation errors. Such errors relate to uncertainties in the projection of model variables to the observational space. For example, the model may represent the observed data on different temporal and spatial scales (grid box averages vs. point measurements) or the model variable may not be directly related to the observation. There are also deficiencies in approximating and sampling the model uncertainties. In practice, it is rather difficult to estimate the representation error a priori, also due to the conditional nature of error statistics specified in data assimilation algorithms. Hence, it may become necessary to enlarge observational uncertainties to account for representation errors.
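How the assumed observation error controls the analysis can be seen in a scalar sketch of the Kalman update. This is a textbook single-variable illustration with a direct observation (observation operator equal to 1), not the reduced-rank LSEIK implementation in PDAF; the function names and the example numbers are assumptions chosen for illustration.

```python
def kalman_gain(p_f, r):
    """Scalar Kalman gain for a directly observed variable (H = 1)."""
    return p_f / (p_f + r)


def analysis(x_f, y, p_f, r):
    """Return the analysis state and analysis error variance."""
    k = kalman_gain(p_f, r)
    x_a = x_f + k * (y - x_f)   # forecast plus weighted innovation
    p_a = (1.0 - k) * p_f       # variance is reduced by the update
    return x_a, p_a


# Hypothetical grid cell: forecast SIC 0.8 (SD 0.10), observed SIC 0.5.
x_f, y, p_f = 0.8, 0.5, 0.10 ** 2

# Large observation SD (0.25): the analysis stays close to the forecast.
x_a_const, p_a_const = analysis(x_f, y, p_f, 0.25 ** 2)

# Small observation SD (0.05): the analysis is pulled towards the data
# and the analysis variance shrinks much more strongly.
x_a_small, p_a_small = analysis(x_f, y, p_f, 0.05 ** 2)
```

In this toy setting the smaller observation error gives both a closer fit to the observation and a smaller posterior variance, which is the scalar analogue of a reduced ensemble spread.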
In Nerger et al. (2006) it was shown that implementing the SEIK analysis in a local context (LSEIK) allows for a more accurate approximation of the forecast error covariance even with a relatively small ensemble size. In our study the LSEIK analysis is performed for each model surface grid point by assimilating the observational information only within a radius of 126 km (∼7 model grid points). Within the radius, we weighted the observations assuming a quasi-Gaussian (Gaspari and Cohn, 1999) dependence of the weights on the distance from the analyzed grid point (see Janjić et al., 2011; Losa et al., 2012). As the atmospheric errors are already explicitly accounted for by the ensemble forcing, an ensemble inflation simulating model errors is not needed in this LSEIK configuration (Yang et al., 2015a).
www.the-cryosphere.net/10/761/2016/ The Cryosphere, 10, 761-774, 2016
Three daily SIC data sets are used in this study. The SICCI fields from AMSR-E (Lavergne and Rinne, 2014) are used in the data assimilation. This product consists of daily fields provided on a 25 km polar-centered EASE2 grid (Brodzick et al., 2012). In the SICCI data set, the North Pole data gap is filled by interpolation, and daily maps of total standard error (the sum of algorithm uncertainties and smearing uncertainties, the latter referring to the representation error on a different grid resolution) are provided. If the uncertainties contain the smearing error, the data assimilative system will account for this. The ice concentration data used for comparison are from the National Snow and Ice Data Center (NSIDC; Cavalieri et al., 1984). This product also consists of daily fields with 25 km grid spacing on a polar stereographic projection. For summer 2010, the NSIDC ice concentration fields are derived from a different passive microwave instrument (SSMI/S onboard DMSP F-17) and with a different algorithm (NASA-Team). AMSR-E has a finer native spatial resolution than SSMI/S so that, although both products are provided on a 25 km grid, the SICCI (AMSR-E-based) fields have more details and appear less smoothed than the NSIDC (SSMI/S-based) fields, especially in the sea ice edge area (Fig. 1). Strictly speaking, the differences between the SICCI and NSIDC products (different Earth grids, polar stereographic vs. EASE2, and finer native spatial resolution of AMSR-E) do not make them independent data, because both are derived from passive microwave instruments, but we may assume that they are sufficiently different to be treated as independent.
As a third data set for comparison and discussion, we use the MODIS-based SIC and melt pond fraction (MPF) data from the University of Hamburg. These data are obtained from surface reflectance in several MODIS frequency bands and a method that is based on the fact that different surface types (melt ponds, sea ice, snow, and open water) have different reflectance spectra (Rösel et al., 2012; Rösel and Kaleschke, 2012). Thus, the MODIS-derived melt pond and open water fractions (OWFs), which are related to SIC by 1 − OWF, are completely independent observations, and as such we can use them for the forecasting system's assessment. Because of the strong influence of cloud cover on MODIS, these data are provided as composites over 8 days on a 12.5 km resolution grid. The absolute MPF, which has not been weighted by the SIC, is used in this study. In order to account for a possible bias in the MODIS-derived MPF and SIC data product (Mäkynen et al., 2014) and other uncertainties (Rösel et al., 2012), we followed Kern et al. (2016) and decreased the MPF estimates by 0.08 and replaced negative values of the MPF by 0. MODIS SIC was increased by 0.03 and limited to a maximum of 1.0.
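The MODIS adjustments described above amount to two clipped offsets and the relation SIC = 1 − OWF. A minimal sketch follows; the function names are hypothetical and the per-cell scalar interface is an assumption made for brevity.

```python
MPF_OFFSET = 0.08   # subtracted from the MODIS melt pond fraction
SIC_OFFSET = 0.03   # added to the MODIS sea ice concentration


def sic_from_open_water(owf):
    """MODIS SIC derived from the open water fraction."""
    return 1.0 - owf


def correct_modis(mpf, sic):
    """Apply the bias adjustments and keep both fractions in a valid range."""
    mpf_corrected = max(mpf - MPF_OFFSET, 0.0)   # negative MPF is set to 0
    sic_corrected = min(sic + SIC_OFFSET, 1.0)   # SIC is capped at 1.0
    return mpf_corrected, sic_corrected


# Hypothetical cells: one in the interior of the range, one hitting both limits.
interior = correct_modis(0.30, sic_from_open_water(0.20))
clipped = correct_modis(0.05, 0.99)
```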
In spite of available satellite-based observations of ice thickness such as ICESat (Kwok and Rothrock, 2009), CryoSat-2 (Laxon et al., 2013; Zygmuntowska et al., 2014), and SMOS (Tian-Kunze et al., 2014), it is currently generally impossible to retrieve reliable sea ice thickness from either laser/radar altimetry or brightness temperature during summer melt conditions due to wet snow conditions or clouds. There are also no airborne summer sea ice thickness data available from Operation Ice Bridge (OIB) campaign flights because these are usually carried out in spring (Kurtz et al., 2013). Instead of satellite-based and airborne remote-sensing data, we compare our simulation results to measurements of sea ice draft from the Beaufort Gyre Exploration Project (BGEP) upward-looking sonar (ULS) moorings located in the Beaufort Sea (BGEP_2009A, BGEP_2009D; see Fig. 1a for the locations) and to sea ice thickness data obtained from autonomous ice mass balance buoys (IMBs; Perovich et al., 2013). The error in ULS measurements of ice draft is estimated as 0.1 m (Krishfield and Proshutinsky, 2006). To facilitate a direct comparison with the model ice thickness, following Vinje et al. (1998) and Hansen et al. (2013), the drafts are converted to thickness by multiplying by a factor of 1.136. This constant ratio between thickness and draft was derived by Vinje and Finnekåsa (1986) through hand drillings. Different ice types and ice densities affect the draft-thickness conversion by introducing uncertainties and nonlinear relationships between thickness and the original drafts (Forsström et al., 2011), but because the seasonal evolution of the ice thickness is more important than the absolute thickness values in this study, these effects are ignored. The IMBs use two acoustic rangefinders to monitor the position of the ice bottom and the snow/ice surface and estimate the sea ice thickness. The accuracy of both sounders is 5 mm (Richter-Menge et al., 2006). In this study, the IMB_2010B was used; its trajectory during summer 2010 is shown in Fig. 1.
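The draft-to-thickness conversion is a single multiplication. The sketch below also scales the quoted 0.1 m draft error by the same factor, which assumes the conversion ratio itself is exact; that assumption is ours and ignores the ice type and density effects mentioned above. The function name is hypothetical.

```python
DRAFT_TO_THICKNESS = 1.136   # constant thickness/draft ratio used in the text


def draft_to_thickness(draft_m):
    """Convert a ULS ice draft (m) to ice thickness (m)."""
    return DRAFT_TO_THICKNESS * draft_m


# Hypothetical draft of 2.0 m with the quoted 0.1 m measurement error.
thickness_m = draft_to_thickness(2.0)
thickness_error_m = draft_to_thickness(0.1)   # valid only if the ratio is exact
```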
Three experiments, which mainly differ in the way observational uncertainties are represented, form the backbone of this study:
1. LSEIK-1: following Yang et al. (2015a), SICCI SIC data are assimilated with a constant uncertainty value of 0.25, i.e., the observation errors are assumed to be Gaussian distributed with standard deviations (SDs) of 0.25, including representation errors.
2. LSEIK-2: same as LSEIK-1 but the uncertainty fields provided with the SICCI product are used (see Fig. 2). A minimum uncertainty of 0.01 is imposed to avoid complications due to divisions by very small numbers.
3. LSEIK-3: same as LSEIK-2 but with a minimum uncertainty of 0.10 to account for a possible representation error. This representation error is difficult to estimate a priori. In order to find an appropriate value, we also tested other values (0.05, 0.15, 0.20) as case studies.
To reflect the increased uncertainty in the extrapolation of the SICCI data into the data-void North Pole region, a constant uncertainty of 0.30 is assigned in this region for all experiments.
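The three uncertainty settings can be summarized in a short sketch that returns the observation SD used for a single grid cell. The function name, the string labels, and the boolean flag for the North Pole data gap are illustrative assumptions, not part of the assimilation code.

```python
CONSTANT_SIGMA = 0.25    # LSEIK-1
MIN_SIGMA_LSEIK2 = 0.01  # lower bound in LSEIK-2
MIN_SIGMA_LSEIK3 = 0.10  # elevated lower bound in LSEIK-3
POLE_HOLE_SIGMA = 0.30   # data-void North Pole region, all experiments


def observation_sigma(experiment, provided_sigma, in_pole_hole=False):
    """Observation SD for one grid cell under the given experiment."""
    if in_pole_hole:
        return POLE_HOLE_SIGMA
    if experiment == "LSEIK-1":
        return CONSTANT_SIGMA
    if experiment == "LSEIK-2":
        return max(provided_sigma, MIN_SIGMA_LSEIK2)
    if experiment == "LSEIK-3":
        return max(provided_sigma, MIN_SIGMA_LSEIK3)
    raise ValueError(f"unknown experiment: {experiment}")


# Hypothetical pack-ice cell with a provided SICCI uncertainty of 0.05.
sigmas = {name: observation_sigma(name, 0.05)
          for name in ("LSEIK-1", "LSEIK-2", "LSEIK-3")}
```

For a cell with a provided uncertainty above 0.10, for example at the ice edge, LSEIK-2 and LSEIK-3 use the same value; the two experiments differ only where the provided uncertainty is small.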
The original observational data uncertainties of ice concentrations that are provided with the SICCI data set and used in LSEIK-2 and LSEIK-3 are displayed in Fig. 2 for 12 July, 20 July, and 13 and 21 August 2010. The uncertainties are about 0.05 over pack ice and open water, but larger uncertainties up to and beyond 0.3 are present at the ice edge and in regions of intermediate ice concentration values. The SICCI total uncertainties are the sum of two components, one characterizing the algorithm uncertainties and the other measuring the uncertainties due to representativity of 25 km daily averages, geo-location, and instrument footprint mismatch (Lavergne and Rinne, 2014). The second component of the total uncertainties is only pronounced in areas of gradients in the SIC observations (typically at the ice edge) and accounts for the inability of such coarse-resolution satellite observations to accurately locate the sea ice edge. Should the SICCI SICs be assimilated in models with significantly better spatial resolution, the enlarged uncertainties allow the model to freely locate its ice edge within the 25 × 25 km grid cells showing intermediate ice concentration values in the data.

Results
Figure 3 shows the effect of assimilating SICCI concentration data on the simulated SIC averaged over August 2010 (Fig. 3b, c, and d). Compared to the SICCI data, the unassimilated model (Fig. 3a) has considerably lower SICs in the pack ice of the Arctic Ocean and considerably higher SICs in the marginal ice zones and the adjacent open water areas. As expected, the three LSEIK experiments correct the model bias towards observed (and assimilated) values. Of these assimilation experiments, LSEIK-2, which uses the originally SICCI-provided uncertainties, gives the best agreement with the SICCI observations (Fig. 3c).
We also compare the predicted SIC against the MODIS-based SIC data (Fig. 4). The reader is reminded that these data are 8-day composites and just 10 such composites are available over the period of interest. Only the grid cells with a cloud cover fraction smaller than 0.10 were considered in order to minimize the influence of clouds. As before, the free run overestimates the SICs over the marginal ice zones (Fig. 4a), and the three LSEIK experiments improve the forecasts (Fig. 4b, c, and d). The differences between the three assimilated solutions are ambiguous. In some regions, for example, Fram Strait, the LSEIK-1 (Fig. 4b) and LSEIK-3 (Fig. 4d) solutions have a strong bias that is corrected in LSEIK-2 (Fig. 4c), but in the western Beaufort Sea LSEIK-2 (Fig. 4c) appears to have larger differences to MODIS SIC than the other solutions. Averaged over the 10 composites and all the available data points, the root mean square errors (RMSEs) of the three LSEIK forecasts with respect to the MODIS SIC all have the same value of 0.10.
Figure 5 compares the RMSE for ensemble mean ice concentration forecasts with and without data assimilation with respect to the assimilated SICCI (Fig. 5a) and the non-assimilated NSIDC (Fig. 5b) ice concentration for the period 1 June to 30 August 2010. Note that Fig. 5 shows only the RMSE for grid locations where the satellite products report ice concentrations below 0.35, that is, mostly locations along the ice edge. This threshold of 0.35 is somewhat arbitrary, but other values, for example, 0.25 or 0.50, lead to similar results. Figure 5 thus mostly assesses how the data assimilation experiments constrain the envelope of Arctic sea ice (cyan color around concentrations of 0.35 in Fig. 1), not the interior. The reason for choosing this range is that all SIC products from passive microwave instruments are inaccurate for high summer concentrations because of the presence of melt ponds (Ivanova et al., 2015). In such a case, documenting that the assimilated state is closer to the NSIDC product is not very conclusive, since NSIDC and SICCI products are probably similarly affected at high concentration values.
Therefore, focusing on regions with lower SICs and a potentially lower influence of melt ponds likely enhances the robustness of our results. In addition, the two data sets treat the open water area adjacent to the ice cover differently. For example, the explicit weather correction method used in the SICCI product does not correct for cloud liquid water and cannot eliminate all weather influences on the ice concentration. In contrast, the weather filter used for the NSIDC data cuts off SIC at various values (Ivanova et al., 2015). It should also be noted that for this comparison, the observations are linearly interpolated to the model grid. Such interpolation could lead to small local changes in SIC, and the related biases are not discussed in this study.
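The restricted RMSE used for Fig. 5 can be sketched as follows, assuming that forecast and observation have already been brought onto the same grid and flattened to equal-length lists. The function name is hypothetical, and whether open-water cells (observed SIC of 0) are included in the mask is an assumption of this sketch.

```python
import math


def edge_rmse(forecast, observed, threshold=0.35):
    """RMSE over cells where the observed SIC is below `threshold`."""
    pairs = [(f, o) for f, o in zip(forecast, observed) if o < threshold]
    if not pairs:
        return float("nan")   # no cell falls below the threshold
    return math.sqrt(sum((f - o) ** 2 for f, o in pairs) / len(pairs))


# Hypothetical cells: the first one (observed 0.95) is excluded by the mask.
forecast = [0.90, 0.50, 0.20, 0.00]
observed = [0.95, 0.30, 0.10, 0.00]
rmse = edge_rmse(forecast, observed)
```

Changing `threshold` to 0.25 or 0.50 reproduces the sensitivity check mentioned in the text.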
All the data assimilation experiments reduce deviations of the forecasted ice concentration from the satellite-based data sets. The RMSE temporal evolutions are associated with the number of available data points that can be used for comparison or with surface forcing. The curves of the MITgcm free runs differ between Fig. 5a and b because the RMSE is calculated with different SIC data sets. Compared to the free run without data assimilation, mean RMSEs of LSEIK-1, LSEIK-2, and LSEIK-3 ensemble mean forecasts with respect to the SICCI data are reduced from an average of 0.56 to 0.18, 0.07, and 0.16, respectively. Similarly, the RMSEs with respect to the NSIDC data are reduced from 0.55 to 0.20, 0.13, and 0.19. At all times, LSEIK-2 and LSEIK-3, using the SICCI-provided uncertainty estimates and adjusted minimum uncertainties, agree better with both the assimilated SICCI and non-assimilated NSIDC observations than LSEIK-1, which employs a constant uncertainty of 0.25. LSEIK-2, with the original SICCI uncertainties, agrees best with both SICCI and NSIDC observations. This shows that for this summer, the forecasting system produces an ensemble mean state for SIC that agrees better with the two ice concentration data sets when the full range of uncertainties provided by the SICCI satellite observation is used.
The time series of daily 24 h forecasts of sea ice thickness are compared to in situ ULS observations BGEP_2009A (Fig. 7a) and BGEP_2009D (Fig. 7b). Note that the numerical model carries mean thickness (volume over area) as a variable. The observed thickness is multiplied by SICCI or NSIDC local ice concentration to arrive at the observed ULS-SICCI or ULS-NSIDC grid-cell mean thicknesses shown in Fig. 7. In spite of some small differences, ULS-SICCI and ULS-NSIDC both reveal a very similar variation: at BGEP_2009A, the grid-cell mean thickness on 1 June was about 2.5 m. The thickness rapidly reduced under melting conditions in July and reached about 0.2 m on 30 August (Fig. 7a). Similarly, the grid-cell mean thickness at BGEP_2009D was about 3.5 m on 1 June and decreased to less than 0.1 m on 30 August (Fig. 7b). All forecasts with data assimilation show improvements over the free-running MITgcm after late July, when the misfit between the observed and modeled SICs becomes significant (figure not shown). This is because the ice thickness is influenced by the data assimilation only through the covariances between the ice concentration and thickness (Yang et al., 2015a). The ice thickness RMSE with respect to ULS-SICCI at BGEP_2009A is reduced from 0.86 m in the free model run to 0.43 m in LSEIK-1, 0.61 m in LSEIK-2, and 0.43 m in LSEIK-3 (Table 1). Similarly, the RMSE with respect to ULS-SICCI at BGEP_2009D is reduced from 0.93 m in the free model run to 0.55 m in LSEIK-1, 0.51 m in LSEIK-2, and 0.59 m in LSEIK-3 (Table 1). The LSEIK-2 solution (with the original SICCI uncertainty) agrees with the in situ observations at BGEP_2009D (Fig. 7b) but overestimates the mean sea ice thickness at BGEP_2009A (Fig. 7a), especially from mid-July to mid-August. The LSEIK-3 thickness (with the modified SICCI uncertainties) agrees better with the BGEP_2009A data and is basically equivalent to LSEIK-1.
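Because the model carries the grid-cell mean thickness (volume over area), the point observation has to be scaled by the local SIC before the comparison. A minimal sketch with a hypothetical function name and made-up input values:

```python
def grid_cell_mean_thickness(ice_thickness_m, sic):
    """Scale an in situ ice thickness by the local SIC (fraction 0..1)."""
    return ice_thickness_m * sic


# Hypothetical values: 2.0 m thick ice in a cell that is half ice covered.
mean_thickness_m = grid_cell_mean_thickness(2.0, 0.5)

# The same ice thickness gives different grid-cell means for two SIC products,
# which is why ULS-SICCI and ULS-NSIDC can differ slightly.
uls_a = grid_cell_mean_thickness(2.0, 0.80)
uls_b = grid_cell_mean_thickness(2.0, 0.75)
```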
The ice thickness at IMB_2010B (Fig. 7d) has only 10 data points in the period 6 June to 8 August, because its snow sounder failed on 7 May, so that ice thickness can only be computed from ice profile data that were available once a week. As for the ULS data, the observed thickness is multiplied by SICCI or NSIDC local ice concentration to arrive at the observed IMB-SICCI or IMB-NSIDC grid-cell mean thicknesses shown in Fig. 7. All 24 h forecasts have a positive bias of about 1.0 m on 6 June, but all LSEIK forecasts capture the downward trend after 11 July better than the free-running model. The LSEIK-3 solution gives the best agreement with the observations. The RMSEs with respect to IMB-SICCI at IMB_2010B are reduced from 0.91 to 0.54 m with LSEIK-1, to 0.73 m with LSEIK-2, and to 0.51 m with LSEIK-3. The reason is discussed in the following section.

Discussion
Based on the recently released SICCI SIC data that provide uncertainty estimates, a series of sensitivity experiments with different data error statistics has been carried out to test the impact of SIC uncertainties in data assimilation. Compared to a data assimilation configuration with a constant uncertainty of 0.25, the assimilation of SICCI data with the provided uncertainties can give better short-range ensemble mean forecasts for SIC in summer. However, the ice thickness forecasts are probably not improved with the provided observational uncertainties. As there are still no satellite-based sea ice thickness data available in summer, the ice thickness evaluation in this study can only be based on two local ULS observations and one IMB-based observation. Also, estimating the grid-cell mean sea ice thickness using the local SICCI or NSIDC SIC data introduces further uncertainties into the thickness calculations. For more robust results for sea ice thickness forecasts, more thickness observations for ground truth evaluation are absolutely necessary, for example, from ice floats and other in situ data sources.
The main message from Figs. 3, 4, and 5 is in fact that the high sensitivity of the data assimilation to the observation uncertainties can be explained by the employed (atmospheric) model and observational error statistics in the LSEIK assimilation system. The spread of the ensemble representing forecast uncertainties in SIC for LSEIK-2 turns out to be relatively small. For example, on 30 August 2010 most of the ensemble-represented SDs in the Arctic central area and the sea ice edge area are less than 0.01 and 0.03, respectively (Fig. 8b). This means that all members are very close to the ensemble mean and the data assimilation will have only little effect. Compared to LSEIK-2, LSEIK-3 has a similar spatial distribution of the ensemble spread, with higher SDs in the sea ice edge area and lower SDs in the concentrated central ice area, but overall higher SDs. Together with the fact that LSEIK-2 does not fit the thickness observations as well as LSEIK-3, this suggests that the ensemble forecast spread for SIC is too low and cannot reflect the true uncertainty. As only observations of SIC are assimilated, sea ice thickness is influenced indirectly during the data assimilation through the point-wise covariance between the ice concentration and thickness, thus through a linear update. Here, the very small SIC ensemble variance leads to a very small sea ice thickness spread (Fig. 9b). This probably explains why the LSEIK-2 system is not very effective at improving the sea ice thickness estimates while LSEIK-3 does somewhat better. The increased ensemble spread in the SIC allows the system to better represent the uncertainties and leads to a larger ice thickness spread (Fig. 9c). The sea ice thickness forecasts are improved accordingly.
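The indirect, linear thickness update can be illustrated with a point-wise ensemble sketch: the thickness increment equals the ensemble covariance between thickness and SIC, divided by the sum of the SIC ensemble variance and the observation error variance, times the SIC innovation. This is a generic ensemble Kalman relation for a single grid cell, not the subspace formulation that the LSEIK filter actually uses; the function names and the three-member ensembles are assumptions chosen so that the arithmetic is easy to follow.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def thickness_increment(sic_ens, thick_ens, sic_obs, obs_sigma):
    """Mean thickness correction implied by a single SIC observation."""
    var_c = covariance(sic_ens, sic_ens)      # SIC ensemble variance
    cov_hc = covariance(thick_ens, sic_ens)   # thickness-SIC covariance
    innovation = sic_obs - mean(sic_ens)
    return cov_hc / (var_c + obs_sigma ** 2) * innovation


# Two hypothetical ensembles with the same means but different spread.
wide = thickness_increment([0.6, 0.7, 0.8], [1.0, 1.5, 2.0], 0.5, 0.10)
narrow = thickness_increment([0.69, 0.70, 0.71], [1.45, 1.50, 1.55], 0.5, 0.10)
```

With the same observation and the same observation error, the collapsed ensemble yields a thickness correction that is many times smaller than that of the wider ensemble, which mirrors the argument made above for LSEIK-2 versus LSEIK-3.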
The relatively enhanced skill of sea ice thickness forecasts by LSEIK-3 with respect to LSEIK-2 thus points to a possible issue with assimilating the summer SICCI ice concentration with the provided uncertainties. At first sight, the data uncertainties in the summer sea ice pack seem to be too low (Fig. 2). For example, on 12 July 2010, when surface ice melting prevails and the microwave-radiometry-based ice concentration estimates are known to underestimate the physical sea ice cover (Ivanova et al., 2015), the provided uncertainties in the sea ice pack area are still lower than 0.06, with few regions exhibiting values around 0.10 (Fig. 2d).
In fact, Ivanova et al. (2015, Sect. 5.3 "Melt ponds") report that AMSR-E and SSM/I, like all other passive microwave sensors, cannot distinguish ocean water (in leads) from melt water (in ponds) because of the very shallow penetration depths of the microwave signal in water. Therefore, these radiometric SICs are closer to one minus MPF than to the physical SIC in our models. This mismatch between the observed and modeled ice concentration (radiometric vs. physical) does not exist in winter when there is no surface melting (Ivanova et al., 2015). However, in summer melt conditions, the observed ice concentration includes an unknown area of pond water. For example, the MODIS-based melt pond distribution data show the distribution of melt ponds over the Arctic sea ice in the summer of 2010 (middle panels in Fig. 10). It has been shown that the passive microwave-based SICs are underestimated in the pond-covered area and overestimated between the melt ponds (Kern et al., 2016). The provided uncertainties are not larger since the radiometric concentration is not more uncertain. This mismatch results in a systematic difference between the two quantities (the physical concentration is larger than the radiometric concentration) that cannot be fully mitigated by enlarging the standard deviations of a Gaussian uncertainty model (Ivanova et al., 2015). The influence of melt ponds on the accuracy of the SICCI data set is documented in Lavergne and Rinne (2014, Sect. 2.2.1.1 "summer melt ponding") and Kern et al. (2016).
The right panels of Fig. 10 show the bias in the SIC model prediction relative to the observation on 12 July, 20 July, and 13 and 21 August 2010. The spatial distribution of the MPF (middle panels in Fig. 10) further supports the conclusion that the data assimilative system performs better when the prior observational error statistics account for some representativeness errors, as in the experiment with the elevated minimum uncertainty. This mismatch between the measured and modeled quantities calls for adopting more advanced data assimilation methodologies, for example, embedding a matching relation in the form of an observation operator for successful assimilation of SIC satellite observations (from passive microwave instruments). Given the scope of this study and the comparisons with the in situ BGEP and IMB ice thickness, the solution implemented in LSEIK-3, that is, to enlarge the observation uncertainties using a minimum value of 0.10, is a pragmatic and effective approach. This simple approach reflects the larger uncertainties in the sea ice edge area and leads to a more reasonable spread in the model ensemble, which in turn leads to a better agreement with the observations and the information about the MPFs.
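The effect of the minimum value of 0.10 can be sketched as follows. The first function applies the floor to a list of provided standard deviations. The other two use the textbook scalar Kalman update to show how a larger observation error reduces the weight of the observation and preserves more spread. This is only a one-variable illustration under assumed numbers; the LSEIK filter works with a localized ensemble and is considerably more involved. All function names and sample values are hypothetical.

```python
MIN_SIC_UNCERTAINTY = 0.10  # minimum value quoted in the text for LSEIK-3


def floor_uncertainty(provided_std, minimum=MIN_SIC_UNCERTAINTY):
    """Raise every provided standard deviation to at least `minimum`."""
    return [max(std, minimum) for std in provided_std]


def scalar_kalman_gain(forecast_std, obs_std):
    """Weight given to the observation in a one-variable Kalman update."""
    forecast_var = forecast_std ** 2
    obs_var = obs_std ** 2
    return forecast_var / (forecast_var + obs_var)


def analysis_std(forecast_std, obs_std):
    """Standard deviation that remains after the one-variable update."""
    gain = scalar_kalman_gain(forecast_std, obs_std)
    return ((1.0 - gain) * forecast_std ** 2) ** 0.5


# Hypothetical per-grid-cell uncertainties and forecast spread.
provided = [0.03, 0.06, 0.10, 0.25]
floored = floor_uncertainty(provided)
forecast_spread = 0.10
for raw, used in zip(provided, floored):
    print(f"provided {raw:.2f} -> used {used:.2f}, "
          f"gain {scalar_kalman_gain(forecast_spread, used):.2f}, "
          f"analysis spread {analysis_std(forecast_spread, used):.3f}")
```

Values already above the floor are left unchanged, so the larger uncertainties near the ice edge are retained.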

Conclusions
In this study, we assimilate the summer SICCI SIC data taking into account the data uncertainties provided by the distributors. Comparisons with the assimilated SICCI, non-assimilated NSIDC, and MODIS ice concentration data and with the BGEP/IMB in situ thickness data show that, even with a constant data uncertainty, assimilating the SICCI data results in better estimates of the SIC and thickness. The SIC estimates are further improved when the SICCI-provided uncertainty estimates are taken into account, but the sea ice thickness cannot be improved.
Moreover, it was found that our data assimilation system cannot give a reasonable ensemble spread of SIC and thickness when we use the provided uncertainty directly. This is because (1) there is a mismatch between the summer SIC as observed by the passive microwave sensors (radiometric concentration) and that simulated by our model (physical concentration), and (2) the provided observation uncertainties do not account for this mismatch. A simple and pragmatic approach appears to bypass this by imposing a minimum threshold value on the provided uncertainties in summer. Fully resolving the mismatch calls for more research, for example by considering melt pond cover and evolution in the models or observation operators in the data assimilation schemes. That would allow one to reduce the representation error. Nevertheless, the part of the error related to possible uncertainties in the approximation of the forecast error statistics and discrepancies in model and data up- or downscaling may still exist and has to be considered in any data assimilation algorithm.

Figure 1. The NSIDC (a, b) and SICCI (c, d) sea ice concentration on 1 June (a, c) and 30 August 2010 (b, d). The locations of BGEP_2009A, BGEP_2009D, and IMB_2010B are shown as a white triangle, a white square, and a white line in image (a). Data-void areas along the coasts are white, and these areas are larger in NSIDC than in SICCI.

Figure 2. The uncertainty provided with SICCI sea ice concentration data on 12 July (a), 20 July (b), 13 August (c), and 21 August (d) 2010. Data-void areas along the coasts are white.

Figure 4. Same as Fig. 3, but "24 h forecasts minus MODIS composites" averaged over the period from 3 June to 21 August 2010. The 24 h forecasts used in the comparisons start on day 5 of the 8-day-composite time period.

Figure 5. Temporal evolution of RMSE differences between sea ice concentration forecasts and the SICCI (a) and NSIDC (b) ice concentration data. The RMSE only includes grid points for which the satellite data have ice concentrations below 0.35 (i.e., mostly in the marginal ice zone). The RMSE of the MITgcm free-run, LSEIK-1, LSEIK-2, and LSEIK-3 24 h forecasts are shown as gray, green, blue, and red solid lines.

Table 1. RMSE of the four forecasting experiments from grid-cell mean ice thickness calculated by the ULS moorings BGEP_2009A, BGEP_2009D, IMB_2010B, and the satellite ice concentration observations. The two values refer to the calculation using two different data sets SICCI-NSIDC.

Figure 7. Evolution of grid-cell mean sea ice thickness (m) at BGEP_2009A (a), BGEP_2009D (b), and IMB_2010B (c) from 1 June to 30 August 2010. The black solid and dashed lines show the grid-cell mean ice thickness using SICCI and NSIDC sea ice concentrations, respectively. The MITgcm free-run, LSEIK-1, LSEIK-2, and LSEIK-3 24 h ice thickness forecasts are shown as gray, green, blue, and red solid lines.

Figure 10. The SICCI sea ice concentration (left panels), the melt pond fraction (middle panels), and the LSEIK-3 forecast skill improvement of sea ice concentration (LSEIK-3 minus SICCI; right panels). From top to bottom, the panels are for 12 July, 20 July, and 13 and 21 August 2010. Note that the melt pond fraction maps are composites of 8 days before the given date.