Estimation of Degree of Sea Ice Ridging Based on Dual-Polarized C-band SAR Data

For navigation in Baltic Sea ice during winter season, parameters such as ice edge, ice concentration, ice thickness, ice drift and degree of ridging are usually reported daily in the manually prepared Ice Charts, which provide icebreakers essentail information for route optimization and fuel calculations. However, manual ice charting requires long analysis times and detailed analysis is not possible for large scale maps (e.g. Arctic Ocean). Here, we propose a method for automatic estimation of degree 5 of ridging density in the Baltic Sea region, based on RADARSAT-2 C-band dual-polarized (HH/HV channels) SAR texture features and the sea ice concentration information extracted from the Finnish Ice Charts. The SAR images were first segmented and then several texture features were extracted for each segment. Using the Random Forest classification, we classified them into four classes of ridging intensity and compared them to the reference data extracted from the digitized Ice Charts. The overall agreement between the ice chart based degree of ice ridging (DIR) and the automated results varied monthly, being 10 83 %, 63 % and 81 % in January, February and March 2013, respectively. The correspondence between the degree of ice riding of the manual Ice Charts and the actual ridge density was good when this issue was studied based on an extensive field campaign data in March 2011.


Introduction
Navigation in sea ice is hampered by rapid changes in the sea ice conditions.Thus, it is essential for wintertime shipping and offshore operations to get reliable and up-to-date information on the prevailing ice conditions.The most important sea ice parameters are the location of the ice edge, ice types, ice thickness, concentration and the amount of ridged ice.Without detailed sea ice information, navigating through heavily ridged sea ice is difficult or even impossible.
The Baltic Sea is a semi-enclosed brackish sea water basin in northern Europe.The ice cover in the Baltic Sea usually begins to form in November and has its largest extent between January and March (Seinä and Peltola, 1991).The normal melting season starts in April, and the ice melts completely by the beginning of June.The maximum annual ice extent ranges from 12 to 100 % of the whole Baltic Sea area with an average of 40 % (Seinä and Palosuo, 1996).Over the last decades a clear decreasing trend can be seen in the maximum ice extent, although the trend has not yet been subject to a detailed investigation.The sea ice edge is used as a parameter of sea ice defining the boundary between the open water and the sea ice area.Here the ice concentration areas higher than 25 % belong to sea ice, and below they belong to open-water.The sea ice in the Baltic Sea can be divided into fast ice and drift ice.Fast ice appears in the coastal and archipelago areas.Drift ice has a dynamic nature due to forcing by winds and currents, which results in an uneven broken ice field with distinct floes, leads and cracks, brash ice barriers, rafted ice and ice ridges.The upper limit for thermodynamically grown ice in the drift ice zone is 70 cm or less during most winters (Palosuo et al., 1982), while the keel depth of ice ridges is typically 5 to 15 m (Leppäranta and Hakala, 1992).The salinity of the Baltic Sea ice is typically only from 0.2 to 2 ‰ depending on the location, time and weather history (Hallikainen, 1992).The low salinity level affects the radar signal response from satellite imagery, resulting in more volume and less surface scattering of the incident signal.Synthetic aperture radar (SAR) satellites such as RADARSAT-2 (RS-2) and Sentinel-1 (S-1) play a major role in operationally monitoring the ice conditions in the Baltic Sea.SAR imaging is practically independent of the atmosphere conditions (e.g.cloud cover) and solar illumination and therefore suitable for operative sea ice monitoring.However, as backscatter information in SAR imagery cannot easily be linked to the different ice types, the expertise of an trained ice analyst is usually required.
In the Baltic Sea, daily ice charts prepared by the Finnish Ice Service (FIS) analysts provide a daily source of information on the ice conditions.The charts divide the ice cover into polygons to which ice types and properties are assigned.The analysis is based on a visual interpretation of the SAR imagery as the principal source of ice information.Currently, RS-2 and S-1 C-band SAR imagery with a wide coverage (e.g.RS-2 ScanSAR Wide Mode with 500 by 500 km image size) are used.The SAR imagery is complemented by visible and thermal infrared imagery, e.g. from Moderate Resolution Imaging Spectroradiometer (MODIS), in situ observations, sea ice information messages from icebreakers and data from sea ice models.The ice chart polygons defined by the ice analysts represent ice areas with similar ice characteristics.Parameters assigned to each polygon are ice concentration, average level-ice thickness, maximum and minimum levelice thickness and the degree of ice ridging (DIR) which is a numeral classifying the ice into five categories, as explained in Sect.2.3.The FIS ice analysts estimate the DIR values mainly using SAR imagery and with additional information on the ice drift based on successive SAR images and results of sea ice models.The main criteria for the visual DIR estimation from the SAR imagery are the SAR backscattering and its visible patterns (SAR texture) (see Fig. 1).
Typically, the ice situation changes little from one day to the next.Hence, when drawing a new ice chart, the ice analysts use the latest chart as the basis for the new one and only adjust the polygon contours and their assigned DIR values to match the new ice situation.This procedure speeds up the process of ice charting but may also introduce a bias if old polygons are used.The quality of the displayed SAR features of sea ice (e.g.magnitude of contrast/intensity, amount of radar noise), the analyst's experience and their style of drawing (more detailed or less detailed) can contribute further to inconsistencies in the finalized ice chart.
In this paper we propose a method to automatize the DIR estimation process based on RS-2 dual-polarized (HH/HV) SAR data acquired under cold conditions and using the FIS ice charts as reference data.The results are then evaluated together with the ice analysts.We do not expect a perfect match between the automatic chart and the manual one.The polygons in the manual charts suppress variation for the smallscale features and merge them into one DIR category.Here we aim to produce a more detailed DIR chart, which closely follows the SAR texture features of sea ice ridges, edges, cracks and leads.This would allow the icebreakers and nonicebreaker vessels to benefit from it in advance route planning and optimization, by taking advantage of the sea ice passages within ridged-ice areas.Manual ice charting should also benefit from a more detailed and automated DIR map which can serve as a basic layer for the final ice chart.
In Sect.2.3 we will describe the different DIR categories used by FIS.As a tool in the DIR classification we use the random forest (RF) algorithm, which will be explained in detail in Sect. 3.3.Using the automated classification procedure we target an efficient exploitation of SAR data and, by means of increased spatial and temporal resolutions, an improved quality (pixel level accuracy and consistency between different analysts) of the ice charts.

Data sets and processing
Our study area in the Baltic Sea is north of the latitude of 61 • N, covering the entire Bay of Bothnia and most of the Sea of Bothnia.The time period is the ice season 2012-2013.The most severe ice conditions in 2012-2013 in the Baltic Sea occurred in our study area.The ice season 2012-2013 was average but the turning point of the winter was late.The weather began to cool during the first week of January and the ice extent increased.In the last week of January the strong winds moved the ice fields and the mild weather melted the ice.At the beginning of February the weather remained similar -at night new ice was formed and then broken by winds during the day.Towards the end of February the weather cooled down and new ice also formed in the Gulf of Finland.At the beginning of March cold arctic air started to flow to Scandinavia and the extent of ice began to grow.The whole of March was extremely cold.The 15 March ice extent reached 177 000 km 2 , which was the maximum of the season.From then on, the cold nights formed new ice but sunny days melted it, and the ice extent did not increase further.

σ o of the Baltic Sea ice
In the following we discuss how in general the C-band backscattering coefficient (σ o ) of the Baltic Sea ice is related to the sea ice properties and especially sea ice ridging.Under cold weather conditions when the snow cover on sea ice is dry the ice surface scattering has been observed to be the dominant component in the total co-polarized σ o at incidence angles below 45 • (Carlström and Ulander, 1993;Dierking et al., 1999).If the ice surface is very smooth and salinity < 0.5 psu, which typically is the case for level fast ice in the Baltic Sea, then the backscattering from the ice-water interface and ice volume is significant.The surface backscattering from level ice is controlled by the statistics of the smallscale roughness as well as the salinity of the ice surface.If sea ice is ridged, the large-scale surface roughness alters the geometry of the surface and, hence, also modifies σ o .Empirical measurements of the Baltic Sea ice C-band σ o have shown that the variation in the large-scale surface roughness mostly modulates σ o and image texture, although changes in the small-scale roughness are also significant (Carlström and Ulander, 1993;Dierking et al., 1999;Mäkynen and Hallikainen, 2004).
The σ o contrast between level ice and ridged ice is on average larger at C-band cross-polarization than at copolarization (Mäkynen and Hallikainen, 2004).The standard deviation of σ o was observed to be larger for ridged-ice types (mixtures of level ice, ice ridges, rubble) than for level-ice types and brash ice in Mäkynen and Hallikainen (2004).The C-band σ o is not directly related to the sea ice thickness, but at least in the Baltic Sea it is possible to estimate the thickness of ridged ice under dry snow conditions through a statistical relationship between ice freeboard, level-ice thickness and σ o (Similä et al., 2010).The variance of the mean freeboard, i.e. large-scale surface roughness, increases with increasing average freeboard, and as the surface roughness increases σ o also typically increases.In general, these previous studies on sea ice σ o signatures show that there is a relation between C-band σ o and DIR, but further studies are needed to better quantify this relation.
Next, different approaches for SAR-based sea ice classification are briefly reviewed.Many SAR-imagery-based sea ice classification systems just perform classification to open water and different ice types, such as new ice, first-year-ice, multi-year-ice, but DIR is not explicitly estimated in more detail.Classification schemes utilizing σ o and SAR texture have been presented, e.g. in Soh et al. (2004); Sandven et al. (2012); Barber and LeDrew (1991) and Clausi (2001).Classification of ice types based on single-polarization Cband SAR backscattering has been studied, e.g. in Karvonen (2004) and Shokr (2009).Sea ice SAR classification using the world meteorological organization (WMO) ice categories (stage of development) (WMO, 2010) has been studied, e.g. in Clausi (2001); Deng and Clausi (2005); Maillard et al. (2005); Yu and Clausi (2007); Clausi et al. (2010); Ochilov and Clausi (2012).These approaches are based on the SAR segmentation and different SAR features, including texture ones.Some of the methods also combine the ice analyst analysis and an automated analysis.A system capable of a semiautomated segmentation and enhanced classification with a digitized ice chart as input is presented in Clausi et al. (2010).It is noted that the ice categories in these studies do not either explicitly or uniquely include DIR classification.

RADARSAT-2 SAR imagery
The SAR imagery used in this study is RADARSAT-2 ScanSAR Wide (SCWA) dual-polarized imagery with the HH/HV polarization combination.The nominal size of an RS-2 SCWA image is around 500 by 500 km, and the pixel size is 50 m.The spatial resolution is around 73-163 m by 100 m (range by azimuth).The incidence angle (θ 0 ) varies from 20 to 49 • .The equivalent number of looks (ENL) is larger than six.The nominal noise floor equivalent σ o at both HH-and HV-polarization varies along the across-track direction as −28.5 ± 2.5 dB and the absolute accuracy of σ o is better than 1 dB (MDA, 2014).
The acquired RS-2 SAR imagery covered the whole of the Baltic Sea.The number of SAR images used in the daily SAR mosaic over the test area varied from one to three SAR frames per day from January to March in 2013.On some days the SAR mosaic was updated twice.We selected from these SAR mosaics the training and the test data using the rule that the time gap between a training and a test mosaic must be at The test data consisted of four mosaics from January, six mosaics from February and three mosaics from March.Some mosaics were not used due to the time constraint.The monthly training and test data refers to the mosaics gathered during the same month.We selected these 3 months for the development and the test work because then the SAR images were mostly acquired under dry snow conditions.Hence, the dominant backscattering source was the sea ice surface and we could expect a statistical relationship between σ o and DIR as reported in Carlström and Ulander (1995); Dierking et al. (1999); Similä et al. (2001Similä et al. ( , 2010)); Mäkynen and Hallikainen (2004).
The preprocessing of the RS-2 SCWA images consisted of calibration (calculation of σ o HH and σ o HV ), georectification, calculation of the incidence angle θ 0 and land masking.First the data were rectified into the Mercator projection with 100 m pixel size.This georectification is compatible with the FIS ice charts and the navigation system of the Finnish and Swedish icebreakers.In this Mercator projection the reference latitude is 61 • 40 N.
As the SAR σ o is dependent on θ 0 , an incidence angle correction is necessary before the classification of the SAR images with wide θ 0 range, such as RS-2 SCWA images.For the HH-polarization images, an incidence angle correction method described in Mäkynen et al. (2002) was applied.This incidence angle correction maps the σ o values using a linear dependence for the σ o in dB-scale to a predefined θ 0 value θ R 0 .In this case, the fixed θ R 0 of 30 • was used.At the HV-polarization the SAR σ o values are close to the instrument noise floor (around −28.5 dB for RS-2 SCWA mode).The noise floor modulates the (low) HV channel signal leading to clearly visible stripes (artefacts) in the HV imagery.These stripes complicate both the visual and automated interpretation of the SAR imagery.The HV channel incidence angle dependence and varying noise floor in the range direction are corrected based on a statistical incidence angle dependence computed for a large number (65) of RS-2 ScanSAR Wide images.Then the incidence angle range is quantized into N bins of 0.01 • (θ 1 , . .., θ N ) covering the whole incidence angle range.We average the σ o values from all the 65 images for each θ i .We denote this average by σ o (θ ).Together these average values create a σ o curve Z(θ ) as a function of the incidence angle over the whole incidence angle range.We consider that each σ o (θ ) consists of the sum of the average σ o , the associated noise floor and the σ o decay as a function of the incidence angle.As we average a large number of values representing different targets for each theta bin, we assume the σ o average to be similar for each bin (i.e.constant over the whole incidence angle range), and then Z(θ ) also presents a sum of the constant value, the noise floor as a function of theta and σ o decay as a func-tion of theta.To minimize the effect of the constant value on the correction we then take the mean of all Z(θ) values over the whole range of theta.We denote this mean by Z. Then the whole Z(θ) curve is made a zero mean signal by subtracting Z from Z(θ ); this zero mean function is denoted by Z(θ ).The incidence angle corrected backscattering coefficient σ o HV corrected (r, c) is now obtained from each calibrated pixel at σ o (r, c) value: σ o HV corrected (r, c) = σ o (r, c)−Z(θ (r, c)), where (r, c) are the row and column coordinates of the image grid and θ (r, c) is the corresponding incidence angle.This correction was proposed in Karvonen (2015).
The equivalent number of looks (ENL), noise equivalent σ o and autocorrelation between neighbouring pixels in the rectified images were studied using homogeneous areas of 3.1 × 3.1 km visually selected from the images over open water areas with a weak texture.The ENL was around 9.5 for the whole θ o range.Thus, the radiometric resolution was around 1.2 dB and the standard deviation (SD) of fading was 1.4 dB.The autocorrelation coefficient between the adjacent 100 m pixels was on average only 0.18.The land masking was based on the GSHHG (Global Self-consistent Hierarchical, High-resolution Geography database from the National Oceanic and Atmospheric Administration, NOAA) coastline data (Wessel and Smith, 1996).
Next, the SAR images were segmented, and the segmentwise features were calculated at a resolution of 100 m; for details see Sect.3.1 and 3.2.Due to the large size of the SAR images and also the feature images they were downsampled to 500 m resolution.Finally, the daily SAR image and feature mosaics were constructed by overlaying all of the SAR data available for each day; i.e. the latest data are shown in the mosaic.The study area was typically fully covered by RS-2 SAR imagery every 1 to 2 days.

Ice charts and degree of ridging
Our reference data set consists of the daily FIS manual ice charts over the Baltic Sea.In the ice charts the degree of ice ridging (DIR) is used to classify sea ice in a way that is relevant for the difficulty of navigation.DIR is manually assigned as a qualitative numeral, ranging from 1 to 6, to each ice chart polygon.The six DIR categories used in the operational ice charting in the Baltic Sea relate to the ridge density variation in an area.The categories are visually identified through changes both in the σ o response as well as in textural characteristics in SAR imagery.Their interpretation is validated through field measurements provided by several operating icebreakers.The task of assigning a DIR value to each ice chart polygon is a complex process requiring a good understanding of the history of the current winter season, i.e. monitoring of changes in the pack ice zone and utilizing the continuous reports on ice conditions provided by the icebreakers.In our study, we only define four DIR categories by combining the brash ice barriers (WMO, 2010) and the heavily ridged-ice category.The brash ice barriers covered a very small fraction of the sea ice area, so that they could not have been treated as a distinct category for classification.The very heavily ridged-ice field category was not present in our data set.The four DIR categories used in this study are defined as follows.
The level-ice area is indicated as DIR 1, which usually looks homogenous and smooth in the SAR imagery with low σ o response.This category also includes slightly rafted ice.The slightly ridged-ice category including heavily rafted ice areas is marked as DIR 2. The ridged and heavily ridged-ice areas corresponding to the DIR 3 and 4 are recognized by the changes in surface roughness at a larger scale, resulting in higher σ o values.As their formation depends on ice pressure, knowledge of the earlier ice and weather conditions is required.The DIR 4 category in our data set included the few occurrences of brash ice barriers.
By visual inspection of the RS-2 and S-1 SAR wide swath imagery with a spatial resolution of approximately 100 m it is not possible to describe the ridging intensity quantitatively.However, it is feasible to assign categories of ridged ice to extended areas for which the actual ridging intensities differ.For justification of the meaningfulness of the areal DIR values see our comparison with the 2011 field campaign data set in Sect.4.1.
The ice charts are saved as numerical grids from the icecharting software with a resolution of approximately 1 nautical mile (NM).In the grid format the ice thickness, ice concentration and DIR value assigned to each ice chart polygon are included.The size of a polygon is usually several hundreds of square kilometres.Additionally, the sea surface temperature is included in the ice chart grids.This practice slightly differs from the ice classes defined by the icecharting guidelines in the Canadian Ice Service, MANICE (2005), in which the ice is classified based on the stage of development and indicated by the so-called WMO egg codes (WMO, 2010).

Surface and thickness profile data on ridged ice
No long-term studies between ice chart DIR and the actual ridging statistics have been published, although field campaigns to measure ridging in the Bay of Bothnia started in the late 1970s.The first campaigns used ship-borne laser profilers and the first extensive airborne laser profiler campaign was conducted in 1988 (Lewis et al., 1993).To compare DIR categories in the FIS charts with actual ridging we utilize data collected during helicopter-borne profiling campaigns in the Bay of Bothnia.The main data set is from the March 2011 campaign with approximately 600 km of measurement lines by a helicopter-borne electromagnetic (HEM) sensor which combines laser surface profiling and inductive distance measurement to the ice-water interface.The measurement system was similar to that described by Haas et al. (2009).The HEM measurements given as a comprehensive understanding on ridging are obtainable from linear profiles (see Fig. 2).
The two profiles provide the total thickness, and the surface laser profile resolves ridge sails.The measurement spacing of the HEM instrument is 3-4 m, while the measurement response is obtained from a footprint which is typically 50 m.Standard inversion of the EM response assumes that the ice has uniform thickness and zero conductivity under the footprint.Neither holds for the roughly triangular ridge keels as their porous lower parts are permeable to electric currents.This results in an underestimation of keel depths with 50 % or even more (Pfaffhuber et al., 2012) and also in an underestimation of the total volume of ice.
The ice season 2010-2011 was severe, with a maximum ice extent of 309 000 km 2 .In the mid-basin of the Bay of Bothnia, the level-ice thickness reached 60 cm with somewhat decreased ridging compared to the average winter with similar wind conditions.
To provide interannual variation for the HEM-campaignbased results of Sect.4.1, we use data from the 1988 campaign and from three other campaigns in 1993, 1994and 1997, summarized in Lensu (2003)).The 1993 campaign was carried out in February and the others in March.They measured in total 1600 km of surface profiles.The 1988 campaign covers the whole of the Bay of Bothnia and the 1994 campaign covers it in the S-N direction, while the 1993 and 1997 campaigns cover the NE quadrant of the basin.From the profile data, ridge sails are selected with the Rayleigh criterion: to include the shallower one of two adjacent sails its height must be at least twice the minimum elevation between the sails.In addition, a cut-off height is imposed.From the sail data the variation in ridge density or the number of ridge sails per kilometre, and sail height can be determined.The ridge densities were 6.4, 7.3, 5.3 and 26.7 per kilometre for the 88, 93, 94 and 97 campaigns.The sail height shows less variation, from 0.58 to 0.66 m.The densities are affected mostly by the number of days with strong winds during the earlier stages of the season when ice is less resistant to the deformation.The threshold wind speed for the onset of deformation is usually 14-16 m s −1 .The average values obscure the large regional variance in ridge density.It typically increases when moving northwards and eastwards.Coastal ridge fields are often created by the closing of refrozen coastal leads and can be continuous rubble fields with densities of up to 100 per kilometre.Sail height depends on the average ice thickness of the basin.Also, the presence of snow reduces the heights in the profile data with values equal to the snow thickness.However, as a first approximation, the average height of sails exceeding 0.4 m can be assumed to be 0.6 m in the Bay of Bothnia.In the interpretation of the profile data it must be taken into account that a considerable fraction of ridges fall below the cut-off.Moreover, the sail heights are sampled from random crossings and also include shallower sections of the sails.In situ field measurements usually select the highest point of the sail and the observed heights are typically 1-3 m and drilled keel depths of 5-15 m (Kankaanpää, 1997).

Correspondence between ice charts and SAR mosaics
For a correct classification of the SAR texture features of sea ice, they need to be consistent with the degree of ridging values assigned for each class in the ice chart for the whole training data set.This consistency, however, cannot be assured by the current ice-charting process for two main reasons.Firstly, the SAR data scenes are not always acquired over the area of interest in time for the ice charting (Rinne and Similä, 2016).This results in the ice analysts' requirement to extract ice information from other available sources, typically consisting of optical sensors (e.g.MODIS), in situ measurements or observations (e.g.icebreakers).Secondly, the degree of ice ridging is divided into four severity classes.These ridging categories are not always clearly separated; thus in many cases they can be mixed.
To minimize the subjective bias and maximize the consistency between the texture features present in the daily SAR mosaics and the allocated ridging class in the corresponding ice charts, we used only those SAR-chart pairs which agree with each other on a daily basis to a decent degree and rejected all others.This was performed by examining every DIR chart during the whole test period from 1 January to 31 March 2013 and comparing it visually to the corresponding SAR mosaic.Visually comparing the sea ice texture features in SAR imagery and evaluating their correspondence to the correct ice class in the FIS charts is not a straightforward task when performed by a non-trained analyst.However, after inspecting several pairs visually and discussing them with the ice analysts, the comparison was much easier and there were several cases in which at least one major disagreement was found.Some of these disagreements arose and were confirmed to have appeared from the lack of fresh SAR imagery at the time of ice charting.In some other cases, the disagreements were clearly visible.If the DIR values assigned in the ice chart for a specific SAR texture region were consistent on multiple occasions, the exception pair was eliminated because it was considered inadequate for the classification.An example of the data selection procedure is presented in Fig. 3, in which we show two SAR-chart pairs from 9 to 12 March 2013 over the Bay of Bothnia and the northern part of the Bothnian Sea.Here, the two daily SAR mosaics show visually similar texture features with very small differences.In spite of that, the corresponding DIR charts show a change in class from slightly ridged ice of class 2 on 9 March to level ice of class 1 on 12 March, along the west coast near the fast-ice region.For ship navigation, the low ice ridging classes (DIR 1 and 2) do not likely pose any real concern.Therefore, if they are assigned differently on different days, the shipping is not affected much.On the other hand, for the automatic classifier this is a confusing case that leads to a decrease in discrimination power between the two ridging categories.A similar effect can be seen in the more central to southern part of the Bay of Bothnia, where the heavily ridged ice of class 4 on 9 March has changed to class 2 on 12 March.In this case, we have accepted only the data pair from 12 March as a good pair, because it was found more consistent with the classification from multiple days (i.e. 13, 14 March).In this case, the navigation in those areas might suffer.
In the end, our selection of daily SAR mosaics and FIS DIR charts pairs resulted in a total of 11 pairs in January, 15 in February and only 8 in March.All other data were rejected from the classification due to inconsistencies in at least one DIR class assigned to the corresponding SAR mosaic region or the time gap restriction (see Sect. 2.2).

Methodology for estimation of the degree of ice ridging
Our classification procedure consists of two stages.First, we segment the SAR imagery.The primary goal in the segmentation is that the resulting segments would mainly be composed of one DIR category.Then for each segment we com- The Cryosphere, 12, 343-364, 2018 pute a set of SAR texture features which are related to the ice ridging information.Their definitions are given below in Sect.3.1.The vector with the computed features as its components is called a feature vector.The second stage is to classify every segment based on the feature vector and assign one DIR category label to each segment.Hence, a successful classification requires that the segmentation succeeds, the features are meaningful and the feature-based classification is efficient.

SAR image segmentation
We denote the set of all sites in I by S, a single site (i.e.pixel location determined by its row and column indices) by s = s(i, j ), with i = {1, . .., M} and j = {1, . .., N } and the set of labels by L = {L 1 , . .., L K }.The configuration of the labels is denoted by .After the initial segmentation we have K classes (labels) with the class-wise means µ = {µ 1 , . .., µ K } and variances σ 2 = {σ 2 1 , . .., σ 2 K }.These means and variances remain fixed after the initial segmentation.We adopt a firstorder neighbourhood system ∂s in which each site has four neighbouring sites: two in the horizontal direction and two in the vertical direction.We use a pairwise clique system.If s and r belong to the same clique of the site s, i.e. c(s) = {s, r}, then s must belong to the neighbourhood ∂r and r must belong to the neighbourhood ∂s.Hence r and s are neighbouring pixels.In the first-order neighbourhood each site has four cliques.The potential function associated with the clique c(s) is denoted by V c(s) or generically V c .The pixel value in one site s is denoted by f s and f = {f s |s ∈ S}.Without a subscript refers to a label configuration in S. We denote by S the set of all the configurations.Assume that we at some iteration have the label configuration * in S. In the next iteration, selecting the best new label Ls for the site s, given f and * S/s , is equivalent to maximizing the probability distribution of labels in s, conditioned by f s and the current label configuration in the neighbourhood * ∂s .This is possible by utilizing the Markov property in MRF (Besag, 1974).The selection of the best new label for the site s can be written as The right-hand side of Eq. ( 1) can be written as the product: where the first term g is the likelihood function and the second term is the potential function.
For pairwise cliques the potential function V c (s, r) is reduced to only two states: www.the-cryosphere.net/12/343/2018/The Cryosphere, 12, 343-364, 2018 where The homogeneity of the region is controlled by the β(> 0) parameter.
We assume that g has a Gaussian distribution with the class-wise mean µ L s and variance σ 2 L s ; i.e. in the Gibbs form it is where Z s is a normalizing constant and U (f s |L s ) is the likelihood energy.
At site s we compute the local energy U s (L), i.e. the logarithm of the product in Eq. ( 2), as Maximizing the product in Eq. ( 2) over L yields the new label Ls .This maximization is equivalent to minimizing U s (L).
In a similar manner we obtain the best new labelling ˆ for the whole image by solving the local minimization of U s (L) for every s ∈ S.So the global minimum U ( ) is achieved by local computations.This procedure results in the MAP estimate for ˆ : These kind of functions can be optimized by various methods, one being the simulated annealing method (Kirkpatrick et al., 1983;Černý, 1985), where a slow decrease in the probability of accepting worse solutions occurs as the algorithm searches the solution space.The method used here is an adaptation of the Metropolis-Hastings algorithm introduced in Metropolis et al. (1953).In the algorithm the labelling is also dependent on the control variable called temperature, T , the value of which decreases as the iteration proceeds.We denote the proposed new label by L * .If the value of the energy function U (L * ) decreases, L * is accepted always.If the value of U (L * ) increases, the label is accepted with probability exp(− U/T ), where U is the energy difference between the new and old configurations.

SAR image features
We studied the classification of DIR categories using the computed SAR features and the DIR values from FMI ice charts.The following SAR features were computed from the SAR images with 100 m pixel size, and their efficiency in the DIR classification was studied.Each feature value is a median value of the feature computed over a single segment.
1. HH polarization SAR backscattering coefficient (σ o HH ), with incidence angle correction applied; 2. HV polarization SAR backscattering coefficient (σ o HV ), with incidence angle correction and noise level equalization applied; 3. HH entropy (E HH ), computed in windows with a radius of 5 pixels; 4. HV entropy (E HV ), computed in windows with a radius of 5 pixels; 5. HH autocorrelation (AC HH ), computed in windows with a radius of 5 pixels; 6. HV autocorrelation (AC HV ), computed in windows with a radius of 5 pixels; 7. HH coefficient of variation (CV HH ), computed in windows with a radius of 5 pixels); 8. HV coefficient of variation (CV HV ), computed in windows with a radius of 5 pixels; 12. HH kurtosis (K HH ), computed in windows with a radius of 5 pixels; 13. HV kurtosis (K HV ), computed in windows with a radius of 5 pixels.
Additionally we extracted the segment mean of sea ice concentration (SIC) from the FMI ice charts.
The coefficient of variation was computed separately as where σ is the standard deviation and µ is the mean over the window.Kurtosis is computed as the fourth moment within the data window.Entropy E (Shannon, 1948) was computed as where p k 's are the proportions of each grey tone k within a window.Autocorrelation, AC (Similä, 1994;Karvonen et al., 2005), was computed as where I (k, l) is the pixel value at location (k, l).Mean over the horizontal, vertical and diagonal directions, i.e. (k, l) = (0, 1), (k, l) = (1, 0), (k, l) = (1, 1) and (k, l) = (1, −1) was used to accomplish directional isotropy.The computation window is denoted by B.
Edge density D was computed for each segment (separately at the HH and HV polarizations) after edge detection by the Canny algorithm (Canny, 1986) as where N e is the number of edge pixels within a segment and N is the segment size in pixels.
Most of the features have a straightforward interpretation.Entropy describes how uniformly the HH/HV values are distributed.Edge density is a measure for edge fragments present in the segment, which we assume to be related to ridging.The coefficient of variation (CV) describes how fast the standard deviation increases with the mean.We expect that in the ridged areas CV is larger than in the homogeneous areas.Kurtosis describes the peakiness of the σ o distribution.With the aid of the spatial autocorrelation we can quantify how structured the ice field in question is in the SAR imagery.We expect that more structural elements appear in the ridged ice than in the level ice for which the spatial σ o variation is more random.
In Fig. 5 the computed features for an area in the central Bay of Bothnia on 15 March 2013 are shown.Moderately and heavily ridged areas were present here.

Random forest classification method
After trying several classification methods (local regression, logistic regression, general additive regression model) we found that the random forest (RF) (Breiman, 2001) approach produced good enough results to be of practical use.Random forest is an ensemble learning method which can be applied www.the-cryosphere.net/12/343/2018/The Cryosphere, 12, 343-364, 2018 to classification and regression.In RF we artificially generate several training sets from a single training set at our disposal using bootstrapping, grow a classification tree for each individual training set, perform classification for each tree and then aggregate the results.The bootstrap aggregation is called bagging.This technique is efficient for reducing variance in high-variance predictions in the same manner as taking an average of samples in Breiman (2001).
For the classification of the daily sea ice data we divided our data into the training and the test data sets (see Sect. 2.2).
In our computations we have used the routines included in the commercial software Matlab.

Description of the algorithm
Here we outline the RF classification method and the notations used in this algorithm.The classes are denoted by C = {1, . .., C}.We have a training set X = {x 1 , . .., x N } where each sample x i consists of a feature vector f i and the corresponding class.When we take a bootstrap sample from X, we denote it Z * .Our bootstrap sample Z * is the same size as the original sample, so on average the fraction 63 % of the original samples of X belongs to it and the rest are duplicates (Efron and Tibshirani, 1993).The samples of X left out from Z * (about 37 % of the samples) are called out-of-bag (OOB) samples.
The classification tree is denoted by T b ( b ) with b ∈ {1, . .., B} and it uses Z * as its training data.Each end node n of T b ( b ) has a class label which is the most frequent class in that node.The parameter b characterizes the bth random forest tree in terms of split variables, split points at each node and terminal node class label.The class label given by T b ( b ) depends on the feature vector f i which is used as input for the tree.It is denoted by Ĉb (f i , b ).We generate B bootstrapped training sets and relying on every training set we grow a classification tree T b ( b ).A classification tree often achieves a rather low bias if it is grown deep with many nodes without pruning (Hastie et al., 2011) .
The impurity measure is the Gini index G: where p(c|n) is the proportion of the samples that belong to class c at a particular node n.G indicates how dominant the class c is in the subtree after the split.A small Gini index value indicates that the subtree contains predominantly observations from a single class.In the split the feature component of the vector f i with the smallest Gini index is utilised (Ripley, 1996).
In classification we record the classes predicted by the ensemble of B trees for a specific feature vector and take a majority vote.The most common class is the class predicted by the ensemble.Then the selected class has a smaller uncertainty than a single classification tree (Hastie et al., 2011), because an average has a smaller variance than a single variable.This is also true for the correlated variables.If B is large enough then the random forest algorithm avoids the tendency of over fitting the model which often occurs in the context of the decision trees.
The problem with bagging is that the grown trees are correlated.To reduce this correlation the RF has a randomisation step.When building trees, each time a split in a tree is considered, a random sample of m predictors is chosen as split candidates from the full set of p predictors (m = 4 and p = 8 here).A new sample of m predictors is taken at each split.This step prevents the same features from dominating every tree.
The flow of the random forest algorithm is described below.b.Grow a random-forest tree T b ( b ) with the bootstrapped data, by recursively repeating the following steps for each terminal node of the tree, until the minimum node size is reached.
i. Select m variables at random from the p variables.ii.Determine the best variable and split-point among the m variables using the Gini index.iii.Split the node into two daughter nodes.

Output the ensemble of trees {T b ( b ) B
1 }.
To classify a new feature vector f n : Classification: Let Ĉb (f n , b ) be the class prediction of the bth random-forest tree.

Selection of the features
Because an ensemble of trees was used in RF and a large number of features were utilized, the results were hard to interpret.To analyse the impact of different features on the class estimation the importance measure was used.The value of the importance measure is called an importance value.This measure is implemented as follows.
For each tree, the classification error on the OOB portion of the data is computed.This gives the baseline error rate for the tree.Then in the OOB set we randomly permute one feature of the feature vector f i and simultaneously keep fixed the other features in f i .We note that the marginal sampling distribution of the picked feature remains the same during the permutation.Next, we recalculate the classification error in the OOB set.This classification error is compared to the baseline error.Usually it is larger than the baseline error.The procedure is repeated for every feature separately.The decrease in classification rate as a result of this permuting is averaged over all trees and is used as a measure of the importance of the chosen feature.
To select the features we run the RF algorithm for several feature combinations and for several different training data sets.The importance of the features as well as the classification accuracy was monitored.This empirical approach led to the choice of 8 features from the computed 13 features introduced in Sect.3.2, including the additional SIC feature.
In summary we found that the RF classification presents the following advantages: (i) RF has the ability to describe complex, non-linear statistical relationships among variables, (ii) RF reduces the uncertainty of the obtained estimate and (iii) RF reduces the possibility of overfitting.The greatest weakness in RF is its relatively weak extrapolation property (Hastie et al., 2011).This property can be seen from the behaviour of the error rates.The RF classifier has a very low training error rate but the error rates increase significantly for the test set.

Ice chart ridging categories vs. surface and thickness profile data
We use the March 2011 HEM campaign data to study how well the DIR categories in the FIS charts describe the actual ridging.As the variation is much larger for ridge density than for sail height, ridging is parameterized here by density only.The other parameter considered is the total thickness of ridged ice.This is a navigationally relevant parameter that can be used to calculate the ice-going speed of ships.To establish compatibility with the ice chart, which employs a 1 × 1 nautical mile (NM) grid, the sail density and ridged-ice thickness from the March 2011 data were calculated as averages for the cells of the grid.There are two different aspects of comparison involved.The first is the relationship between ridge density and DIR in the ice chart data.The other is the relationship between ridge density and ridged-ice thickness, which is in general relevant to the question of how well surface data can represent the total thickness of ice.Thus, the comparison is made between ridge density and DIR and, on the other hand, between ridged-ice thickness and DIR.
Although somewhat qualitative the DIR indices are estimates made by sea ice specialists and refer to the Lagrangian ice chart regions corresponding to various formation and deformation phases of the ice cover.The reliability or their boundaries is usually high.The DIR value 1, corresponding level-and rafted-ice categories, had very small coverage in the data, while DIR 2, the category of slightly ridged ice, was not found at all.The comparison is therefore made for the DIR values 3 and 4, or moderately and heavily ridged ice.The sail height retrieved from the profile data was equal for these categories, while a clear difference was found for the ridge densities and ridged-ice thicknesses; see Table 1.This indicates that a rough but reliable quantification of ridging can be based on DIR values only.A more detailed picture can be obtained from comparisons of Fig. 6 between DIR and, on the other hand, ridge density and total ice thickness from the March 2011 data.For the ridge density the colour bar range is chosen to be from 12.7 to 21.5, which are the average densities corresponding to DIR 3 and 4 in Fig. 6.Ice thickness colour bar was scaled similarly.Thus, all values below the averages corresponding to DIR 3 are blue and all values above averages corresponding to DIR 4 are pink.Above and below the colour bar range the ridge density still has a wide range of variation, as is seen from the histogram in Fig. 6.However, the basic regional characteristics are similarly visible in all three data sets.In spite of the utmost simplicity of the DIR it was in a reasonably good agreement with both ridge density and total thickness.The agreement with DIR was somewhat better for the total ice thickness than for the ridge density.This may be related to the fact that a large fraction of ridging does not show in the density due to the cut-off but affects the SAR-based and visual estimates behind DIR values.The generally good agreement between DIR, ridge density and total thickness means that DIR values estimated from SAR can be translated to navigationally relevant ridging or thickness parameters.The largest differences between the degree of ice ridging and HEM quantities were found in the coastal ridge field extending from 64 • N, 23 • E towards SW (see Fig. 2).Both ridge density and average sail height were lower for this part in comparison with the extension of the same ridge field towards NE from this location.These values were also similar to those found in the mid-basin, so the missing separation of this coastal ridge field into two categories that were apparent from the HEM data is clearly a shortcoming of the ice chart DIR data.

Monthly backscattering statistics
We concentrated our analysis on the areas with SIC over 80 %.In areas with ice concentration varying from 80 to 90 %, the amount of open water can impact the backscattering statistics significantly, particularly during high winds.This SIC limitation excluded the marginal ice zone (MIZ), which is defined as consisting of ice areas with SIC from 15 % to80 %; see e.g.Strong (2012), from our analysis.Almost all areas with SIC from 80 to 90 % belonged to level-ice polygons in our data set.In total the level-ice category DIR 1 covered well over 50 % of all the ice areas during our test period.
According to earlier studies the effect of incidence angle on σ o for level ice and ridged ice is rather similar.In Mäkynen et al. (2002) it was found that the incidence angle dependence of σ o HH in logarithmic scale (dB) can be described by a linear model, with slopes −0.21 dB degree −1 for ridged ice and −0.25 dB degree −1 for level ice.It seems that using a slope of −0.23 for all the data is adequate for automated classification, and the ridged areas and level ice can be distinguished both at near and far range.Also, a more sophisticated approach that iteratively applies different slopes to level ice and ridged ice has been studied in Karvonen et al. (2002),   For the HV channel, the combination of the incidence angle and noise floor correction is essential.Without this correction the HV backscattering and texture features derived from it can not be used in classification as the effect of the varying noise floor is so high (up to about 3 dB) and will cause a significant number of misclassifications.However, after correction the HV channel data can be used in classification and we have not visually observed any significant differences in near-range and far-range σ o HV for either open water, level-ice or ridged-ice classes.
We first looked at how the σ o HH distribution changes monthly from January to March 2013 in two main ice categories: level ice (DIR 1) and ridged ice (DIR > 1) (see Fig. 7).At the beginning of January level ice appeared mainly near the coast of the Sea of Bothnia, and the dominance of mostly thin ice in the Sea of Bothnia continued up to the middle of February, whereas in the Bay of Bothnia ridged-ice areas appeared throughout the whole test period.A significant fraction of the level-ice pixels had a σ o HH value below −18 dB, indicating thin smooth ice.For thin ice relatively high σ o HH (over −18 dB) have also been observed (Mäkynen and Hallikainen, 2004).Also, the presence of the level-ice areas with a relatively low SIC ( 80 HH value from the ridged areas was over 2 dB higher than that of the level-ice areas unlike in previous months.There was a significant increase in the magnitude of σ o HH from ridged areas, whereas this was not the case for the level-ice σ o HH .Based on Fig. 7 it was obvious that the magnitude of σ o HH alone could not be an efficient predictor in the estimation of the DIR value in January and February. We assume that the small separation in the σ o HH values originating from level ice and ridged ice during the first 2 In addition, the level-ice area (DIR 1) had significant uncertainties in the FMI charts.The ice analysts responsible for the charts told us that in several cases it was difficult or nearly impossible to discriminate reliably between level ice and slightly ridged ice (DIR 2).In these cases they usually chose the level-ice category if the icebreaker reports did not indicate any difficulties for merchant ships.If these had been reported, a slightly ridged-ice category (DIR 2) was chosen.
Considering the σ o HH contrast between level-and ridgedice areas, the situation changed gradually in February and March towards a more distinct separation between these ice types.We shall analyse the DIR charts separately for the period of strong thermodynamic ice growth (January-February) and more stable winter conditions (March) in Sect.4.3.
The examination of the monthly σ o HV distributions (see Fig. 8) confirms our findings for the σ o HH distribution.In January and February the values of σ o HV from level-ice areas were close to the noise floor (−28 dB) and hence uninformative for a meaningful analysis.In the ridged areas σ o HV were 2-3 dB higher but still in general rather low, indicating a low sea ice surface roughness.In March the σ o HV distributions both in level-ice areas and ridged areas were on average about 1-2 dB higher than in the previous months.Also, the contrast in the σ o HV between ridged and level ice was more significant.As a consequence, the σ o HV values affected the classification result in March but were not useful in the earlier months.

Classification results for several ridging categories
There was a fundamental imbalance between the sample sizes representing level-ice and ridged-ice classes.The samples from all the ridged-ice classes formed about 40 % of all samples.If we had required that all the classes were of equal size in the training data, the number of observations per ice category would have been low, e.g. less than 20 % of the level-ice samples would have been utilized.When assess-  ing the results we will keep in mind the very different sizes of the ice classes.We run all our random forest classifications with the same set of tuning parameters for routine TreeBagger (Matlab, 2016).From the set of eight (p = 8) features we randomly chose m = 4 features to be used in a split.Often the value m = √ p, i.e. m = 3 here, would have been recommended (Hastie et al., 2011).However, we noted that slightly better results were obtained with m = 4 for our data set.Another fixed option was that the minimum number of data points in the end nodes was set to 10.We grew 200 trees during the classification.Results with more decision trees did not yield any significant improvement of the error rates.
In the first phase we investigated the capacity of the RF classifier to separate level and ridged ice.Data from all 3 months were included in the analysed data set.The results are presented in Table 2.The overall classification rate was 82 % for the whole of winter.
Next we examined the classification of all the four ridging categories throughout the 3-month period.The training and test data sets had been selected from each month in our data set.The overall classification rate for the test period was 71 %.Looking at the confusion matrix in Table 3 we can observe that the level-ice category (93 %) had a very high classification rate.The classification of the three categories for ridged ice was more challenging.The ridged-ice category (DIR 3) was classified correctly in 45 % of the cases but over 30 % of the observations were confused with level ice.The slightly ridged ice (DIR 2) was poorly distinguished.Only in 15 % of the cases was it detected correctly.Most DIR 2 samples (42 %) were assigned to the level-ice category which in light of the previous discussion could be expected, i.e. the preference among the FIS ice analysts to use level-ice category over slightly ridged ice in the manual ice charts.The ridged-ice category with the most accurate classification rate (59 %) was the heavily ridged-ice category (DIR 4).
To obtain more information on how the adopted approach works in rapidly changing ice conditions and in more stable winter conditions, we classified all 3 test months separately so that the training and testing data were collected during the same month.The overall accuracy of the monthly results varied largely, being lowest in February (63 %) and higher in January (83 %) and March (81 %).The corresponding Cohen's kappa figures were 0.60 (substantial agreement), 0.52 (moderate agreement) and 0.68 (substantial agreement).The separation between all ice categories was best in January (overall accuracy 83 %) when basically only three DIR categories appeared.Evidence that the definitions of the different DIR categories were inconsistent with each other in January and February was that in these months the detection www.the-cryosphere.net/12/343/2018/The Cryosphere, 12, 343-364, 2018  In each month level ice was the dominant ice category, being over 50 % across the ice-covered area.The DIR 2 category covered from 6 to 19 % of the ice area depending on the month.In no month was DIR 2 successfully detected due to its ambivalent definition with respect to the DIR 1 category.The DIR 3 category was successfully detected in January when its areal coverage was large (21 %), and in March when the boundaries between different ice categories were best defined during the test period.The heavily ridged-ice fields (DIR 4) were classified well, except in January when such ridged areas were rare (about 1 %).A possible explanation for the lowest accuracy rate in February was that the boundaries between different DIR regions were often visually rather difficult to discern in the SAR imagery according to our experience.Figure 9 shows the variation of the detection rate for each DIR category in all the classification results.The most distinct feature in the results is the consistently poor detection rate for DIR 2.
In Fig. 10 we can see the Baltic Sea ice DIR classification result (panel a) for a dual-polarized SAR image mosaic on 9 February 2013 (Fig. 4a, b).Also, the reference DIR chart is shown for comparison (panel b).The automated DIR chart produced agreed well with the FIS ice charts for DIR values 1 and 3.However, the automated chart estimated a large fraction of DIR 2 category ice to DIR 4 category.The automated DIR chart contained detailed markings of the cracks and openings in the central Bay of Bothnia which were not present in the FIS chart.We remark that the SAR mosaic on 9 February looked very similar to the one on 7 February (2 days earlier), when the same cracks or openings can be found, but the corresponding FIS ice chart DIR showed DIR 4 in the areas to which DIR 3 was now assigned.This can be taken as an example of the subjectivity which is inherent to the manual ice charts.
There is a good overall agreement between the FIS chart and our DIR classification in Fig. 12.Most of the differences occur in the Bothnian Sea.There the FIS chart indicates mostly level ice and to some extent slightly ridged ice.However, the classification assigned the ridged-ice and heavily ridged-ice categories to some FIS level-ice areas.Based on the SAR HH-and HV-polarization mosaics (see Fig. 11) those areas represent broken ice fields, although the ridging intensity is hard to assess visually.more variability between different ridging categories in midwinter.The combination of these two factors determine the accuracy of the final classification.
We studied the success of the segmentation by examining how the large fraction of the segments contained practically just one ridging category; i.e. the area of some ridging category covered over 90 % of the segment area.The results were that in January 93 % of the SAR imagery belonged to the segments, in February 80 % and in March 86 %.The high fraction of well-defined segments in January is easy to understand because most of the ice was level ice (72 % of the area), and just three ridging categories appeared (the heavily ridged area covered less than 1 %).In February the fraction of level ice has decreased to 55 % of the total area, all four ice categories were present, and the total area of well-defined segments decreased to 80 %.In March the level-ice area covered 59 % of the total area and the area of the well-defined segments was 86 %.Hence there was better segmentation accuracy in March than in February.In that month the total area of correctly classified ridging categories was 81 %, 5 % less than the total area of the well-defined segments.In February the total area of correctly classified ridging categories was just 63 % which means 17 percent points less than the total area of the well-defined segments.This analysis suggests that the main separating factor contributing to the classification accuracy was due to the more versatile feature vectors in March.

Importance of features
The selection of the eight features in Sect.3.3.2was based on their importance value.The features consisted of six HHpolarization-based segment-wise features (see Sect. 3.2) and the segment-wise σ o HV as well as the SIC value extracted from the FIS ice chart.Their importance order when the training data covered the whole test period is presented in Table 4.If the training data of just 1 month were used the importance order of features varied slightly.The importance of one specific feature is relative in the sense that it changes when the combination of the used features changes, i.e. the importance of one feature depends on which other features are included.However, the feature SIC remained the most influential feature in every case.This is comprehensible because when SIC was between 80 and 90 %, the ice area in question almost always represented the level-ice category (DIR 1) and the corresponding feature vector was easy to classify correctly.The rather low importance value of σ o HV is probably due to the relative narrow range of the σ o HV values.
To gain more insight into how the eight selected features affected the classification accuracy, we studied the possibility of feature reduction using the March data as a benchmark.The March data were selected because the diversity of ridging categories was largest then (see Table 3).We systematically eliminated the selected features one by one and reclassified the March test data using the remaining features.In none of the cases did the classification accuracy improve with fewer features.For several removed features (E HH , AC HH , K HH , σ o HV ) the classification accuracy decreased by just a few percent points (1-3 %).The removal of the ED HH feature did not practically affect the accuracy at all.A significant misclassification rate increase was observed with the reduction of the σ o HH (−6 %), CV HH (−8 %) and SIC (−12 %).In every case the relative importance of the retained features changed.Hence the importance of the features present in Table 4 is true only in the context of this specific feature combination.
To see more clearly that the features included in the feature vector complement each other and make the classification more robust, we classified the March data using only three basic features (f 3 = (SIC, σ o HH , σ o HV )).The overall accuracy was just 64 %.Then we added the feature CV HH to f 3 because CV HH caused a significant drop in the accuracy.The accuracy remained low, at only 68 %.Our conclusion is that the information provided by the whole feature set is needed for a good description of ridged-ice field in the SAR imagery.If a reduction of one feature already decreases the classification accuracy, the reduction of two or more features would degrade the classification further.The only feature which may be unnecessary is ED HH .It was also the most heuristic one (see Sect. 3.2).Because it does not decrease the classification accuracy, we kept it in our feature combination.We also experimented with replacing the HH-polarization-based features with their HV-polarization counterparts.This led to degradation of the classification accuracy in all of the studied cases.

Discussion and conclusions
The degree of ice ridging is one of the most useful parameters for ice-navigating ships.It basically indicates, together with the ship characteristics, whether a vessel can safely pass through an ice field or not.The DIR also complements the more general risk index outcome (RIO), defined by IMO (2016), as this does not address ridging but relies on WMO categories for this stage of development.We have shown that an automated estimation of the DIR from SAR texture features, together with an ice concentration estimate, performs well when compared to the values extracted from the manual FIS ice charts.The applied features describe statistics of σ o variation in the SAR imagery.DIR estimation is a suitable task for a SAR-based approach because the C-band σ o is sensitive to the large-scale surface roughness due to ridging.
Both independently operating ships as well as ships relying on icebreaker support operate in ice-infested waters.In the Baltic Sea most of the merchant ships need icebreaker assistance.However, ships of the highest Finnish-Swedish ice class in the Baltic Sea, 1A Super, which is equivalent to the Polar Class PC 6, are designed to operate in difficult ice conditions independently.The FIS ice charts are prepared to serve operations in which ships follow an icebreaker in a convoy.Based on discussions with the FIS ice analysts the following remark is made.If the ice conditions in an area do not pose a realistic risk for icebreakers to get stuck, a smaller DIR value is often assigned to this area, even if the area is difficult for independent navigation of merchant ships.This is especially true for DIR 2. Hence, the availability of the icebreaker assistance has an effect on the DIR classifications in the FIS ice charts.
The primary objective of our DIR classification algorithm is to separate the severe ice conditions from the easier navigable ones.To reach this goal our DIR classification mainly relies on SAR image statistics.In some cases this may lead to differences between the FIS ice charts and our classification results because the FIS charts take the icebreaker factor into account, which is not present in the SAR imagery.Hence, these two data sets can be interpreted from slightly different perspectives.An example of this difference is our earlier discussion related to Figs. 10 and 12.One of the essential advantages of the automated DIR charts is that they include leads and small level-ice areas between ridged areas not present in the coarser FIS charts.
We used a two-stage classification system.First, we segmented the dual-pol SAR mosaics.The result was slightly different for different months.The area of level ice always exceeded 50 % of ice cover.In January it was highest, over 70 %.In that month 93 % of the segmented area belonged to the segments dominated by one ice category.In February and March the respective figures were 80 and 86 %.It should be noted that for January only three DIR categories were present, unlike for the last 2 months in which all four DIR categories appeared.We can conclude that the SAR signatures matched the DIR boundaries best in March when the amount of ridging in our test period was at its maximum.
In the second phase of the classification we classified the segments using segment-wise feature vectors, classifying each segment to one ridging category.This was most successful in March (82 %).Then the ridging intensity varied largely in different regions in our test area and the resulting texture of the SAR imagery was more versatile than in the other studied months.It is worth noting that in March the accuracy of the feature-based classification was just five percentage points lower than the total area of the well-defined segments; i.e. the feature-based classification succeeded with the RF classifier.This result can be regarded as a confirmation that the computed features were well suited to describing the ridging in the SAR imagery.In January the classification accuracy was at the same level as in March (83 %) but the area covered by the well-defined segments was much larger, indicating that the feature-based classification did not perform as well as in March.In January and February the ice was thinner and the degree of ridging lower than in March.In these 2 months the σ o HH and σ o HV distributions from level ice and ridged ice overlapped substantively.This weak discrimination between level ice and ridged ice can be partly attributed to the subjective interpretation of the level-ice category at FIS as discussed in Sect.2.5.
Our approach works best in the Baltic Sea when the evolution of winter has passed the freezing phase and a significant amount of ridging has occurred.Then ridging strongly contributes to the texture of the SAR images.
Before setting up an operational DIR estimation system over the Baltic Sea, we need to test our algorithm with more winters data and to optimise it for the best possible result.
In an operational mode we can use the most recent SAR/FIS and SIC data for the training.Instead of using the SIC information present in the FIS charts we can also use an automated radiometer or combined radiometer/SAR-based SIC data.Currently, the finest resolution in operational SIC products is offered by the Advanced Microwave Scanning Radiometer 2 (AMSR2)-based ASI sea ice algorithm (Beitsch et al., 2014).The grid size in the product is 3.125 km.To improve our product during ice forming or melting periods, we can include ice thickness as an additional parameter in future DIR classifications.
We have plans to extend our algorithm to the Arctic Ocean, where there is a high demand for reliable ice information for independently navigating merchant vessels.Harsh ice conditions, such as those in March 2013 in this study, prevail much longer in the Arctic seasonal ice regime than in the Baltic.The consistent availability of DIR charts in Arctic would enable the monitoring of areal evolution of different ridging intensity categories.An automated DIR chart utilizing fine-resolution SAR data and classifying the suitability of different areas for navigation would benefit all Arctic shipping.
The ice ridging and its backscattering mechanisms are similar in the Baltic and Arctic.In general, Arctic sea ice is thicker and ridges can be larger than in the Baltic.However, the backscattering increases as a function of the surface roughness in both areas.We have visually inspected SAR imagery over seasonal sea ice in the Barents and Kara seas and they look similar to corresponding imagery over the Baltic.This builds confidence that our algorithm, with possible minor adjustments, could be applied to the Arctic first-year ice.
A. Gegiuc et al.: Estimation of degree of sea ice ridging Nevertheless, the possible applicability of the method to the multi-year ice areas must be studied separately.

Figure 1 .
Figure 1.Example of RS-2 dual polarized SAR image mosaic (a: HH, b: HV) over the Baltic Sea on 15 March 2013 and the corresponding DIR chart (c) showing the manually drawn polygons of different degrees of ice ridging, including the marginal ice zone detection mask based on the ice concentration values between 25 and 80 % and open-water mask based on the ice concentration values smaller than 25 %.

Figure 2 .
Figure 2. A 20 km section of combined surface laser profile and EM thickness profile, and the corresponding ice thickness histogram for the 2011 field campaign data.The laser profile resolves all ridge sails, while the EM profile averages thickness over an altitude-dependent footprint, typically 50 m.

Figure 3 .
Figure 3. Example of RS-2 SAR data mosaic in HH and HV polarization and the corresponding DIR chart with values extracted from the digitized Finnish ice charts for 9 March 2013 in the upper panel and 12 March in the lower panel.On both days the SAR shows a similar ice situation, although the two DIR charts show changes in the ridging classes: in the NW of the Bay of Bothnia, from slightly ridged ice (DIR 2) to level ice (DIR 1), and in the central to southern part of the Bay of Bothnia, from heavily ridged ice (DIR 4) to slightly ridged ice (DIR 2).In this case, the data from 9 March 2013 were removed from the classification.

9.
Edge density for HH image(ED HH ) with scaling: 1000× N e /A (N e is the number of edge pixels and A is the segment area); 10.Edge density for HV image(ED HV ) with scaling: 1000× N e /A (N e is the number of edge pixels and A is the segment area); 11.Segment size (SSZ);

Figure 4 .
Figure 4. Example of RS2 SAR data from 9 February 2013 in HH (a) and HV (b) polarizations together with the segmentation result (c) and the SIC chart (d).

Algorithm 1
Random forest algorithm for classification 1.For b = 1 to B: a. Draw a bootstrap sample Z * of size N from the training data.

Figure 6 .
Figure 6.Ridge density variation in the test area (a), HEM thickness measurements (b), DIR indices (c) and histogram of ridge densities determined for one 1 × 1 NM cell (d).

Figure 7 .
Figure 7.The monthly HH-polarization backscattering coefficient distribution for level (dashed line) and ridged (solid line) ice areas.The results are for January (a), February (b) and March (c) 2013.
-90 %) meant that openwater patches affected the level ice σ o HH , yielding both high and low σ o HH values.The level-ice σ o HH values above −18 dB were evenly distributed in January.Ridged-ice areas had a large σ o HH peak at −16.5 dB and most of the remaining pixel values ranged from −16 to −12 dB.In February level ice σ o HH still had a sharp peak around −20 dB indicating the presence of very thin ice, but most of the σ o HH values had spread between −16 and −11 dB.In the ridged areas a majority of σ o HH values were in the range between −17 and −12 dB.The mean and median values of σ o HH for level-ice areas and ridged areas were almost identical in January and February.In March the σ o HH statistics for level and ridged areas showed a clear discrepancy here.The level-ice σ o HH values were distributed from −20 to −10 dB, whereas the σ o HH values from ridged areas were concentrated in the range from −15 to −11 dB.The average σ o

Figure 8 .
Figure 8.The monthly HV-polarization backscattering coefficient distribution for level (dashed line) and ridged (solid line) ice areas.The results are for January (a), February (b) and March (c) in 2013.

Figure 9 .
Figure 9.The detection rates for the different DIR categories in all the classification.

Figure 10 .
Figure 10.Degree of ice ridging extracted from the digitized Finnish ice charts on 9 February 2013 (a) and the result of estimated DIR based on our RF approach (b).The DIR charts include the marginal ice zones (25 % < IC < 80 %) extracted from the ice concentration charts (see Fig. 4).

Figure 11 .
Figure 11.Example of RS2 SAR data on 15 March 2013 in HH (a) and HV (b) polarizations.(c) MRF MMD segmentation result for the HH-HV first PCA component.(d) Ice concentration chart extracted from the Finnish ice chart.

Figure 12 .
Figure 12.Degree of ice ridging extracted from the digitized Finnish ice charts on 15 March 2013 (a).Result of estimated DIR based on our RF approach (b).The DIR charts includes the marginal ice zones (25 % < IC < 80 %) extracted from the ice concentration charts (see Fig. 11).
to avoid situations in which the same SAR scene would appear both in the test and the training data set.Hence the training data consisted of five mosaics from January, four mosaics from February and four mosaics from March.

Table 1 .
Comparison of ice ridge statistical parameters derived from the HEM data with ice chart degrees of ice ridging.
but the effect on sea ice classification was minor.When inspecting the SAR mosaics visually, most of the SAR frame boundaries were not visible or were hardly visible, indicating successful σ o HH incidence angle correction.For open water the correction may not work properly as open water σ o signatures depend heavily on wind speed and swell (i.e.surface roughness).

Table 2 .
Confusion matrix for the level ice vs. ridged ice (categories DIR 2 to 4) in the RF classification for the whole test period.The largest estimated DIR category for each reference category is marked in bold.

Table 3 .
Confusion matrix for all ridged-ice categories in the RF classification.The largest estimated DIR category for each reference category is marked in bold.

Table 4 .
The importance of different features when the training data covered the whole test period.