Microstructure representation of snow in coupled snowpack and microwave emission models

. This is the ﬁrst study to encompass a wide range of coupled snow evolution and microwave emission models in a common modelling framework in order to gener-alise the link between snowpack microstructure predicted by the snow evolution models and microstructure required to reproduce observations of brightness temperature as simulated by snow emission models. Brightness temperatures at 18.7 and 36.5 GHz were simulated by 1323 ensemble members, formed from 63 Jules Investigation Model snowpack simulations, three microstructure evolution functions, and seven microwave emission model conﬁgurations. Two years of meteorological data from the Sodankylä Arctic Research Centre, Finland, were used to drive the model over the 2011–2012 and 2012–2013 winter periods. Comparisons between simulated snow grain diameters and ﬁeld measurements with an IceCube instrument showed that the evolution functions from SNTHERM simulated snow grain diameters that were too large (mean error 0.12 to 0.16 mm), whereas MOSES and SNICAR microstructure evolution functions simulated grain diameters that were too small (mean error − 0.16 to − 0.24 mm for MOSES and − 0.14 to − 0.18 SNICAR). No model (HUT, MEMLS, or DMRT-ML) pro-vided a consistently good ﬁt across all frequencies and polarisations. The smallest absolute values of mean bias in brightness temperature over a season for a particular frequency and polarisation ranged from 0.7 to 6.9 K. the snow microstructure model


Introduction
Global observations of the snow cover extent from optical and microwave satellite observations combined with in situ data have shown a reduction in the spring snow cover (Brown et al., 2010;Brown and Robinson, 2011). Observed decline in snow cover extent during 2008-2011 exceeded that predicted by climate models (Derksen and Brown, 2012). Ob-servations also indicate that duration of snow cover is also reducing, but they cannot determine whether mass or volume of snow has changed. Microwave, altimetry, or coarser-scale gravity satellite sensors offer the only feasible way to measure snow mass or depth on a global scale, with microwave observations spanning the longest timescale of these. However, microwave algorithms such as those developed by Chang et al. (1987) and Kelly (2009) can result in large errors because of the high sensitivity of applied forward models to parameterization of the snow microstructure (Davenport et al., 2012). In particular, the assumption of a fixed snow scatterer radius in the Chang et al. (1987) algorithm does not reflect the naturally changing snowpack structure. Errors in snow mass products derived from these algorithms mean that the products are difficult to use for evaluation of snow mass in climate models (Clifford, 2010) and unsuitable for assimilation into land surface models for streamflow forecasts (Andreadis and Lettenmaier, 2006). Development of the assimilation-based technique in GlobSnow allows changes in the snow microstructure to be taken into account through inversion of groundbased observations of snow depth and coinciding microwave brightness temperatures (Takala et al., 2011). Although more accurate than other global products, some errors remain, and the GlobSnow accuracy relies on the proximity and representativity of the ground stations (Hancock et al., 2013). In addition, the intermediate retrieval of the snow "grain size" in GlobSnow is a parameter that also incorporates other land surface features, so is not a true representation of the snow effective diameter .
Snowpack evolution models offer a way to estimate temporal changes in snow microstructural parameters and stratigraphy (e.g. Lehning et al., 2002;Brun et al., 1992). Intercomparison studies have shown large differences between snow evolution models driven by the same forcing data (Rutter et al., 2009). Given that the mass inputs were the same for the 33 snow models considered in the SNOWMIP2 study of Rutter et al. (2009), it is differences in the internal snow physics and model structure (layering assumptions) that result in the wide range of simulated depth and snow mass. Temperature, temperature gradient, and density drive changes in the snow microstructure (e.g. Flanner and Zender, 2006), so it is likely that different snow physics assumptions in a coupled snowpack and emission model result in different thermal structures, microstructure parameterisations, and ultimately different microwave extinction behaviour.
Theoretical differences between specific electromagnetic models have been examined in Löwe and Picard (2015), Pan et al. (2015), and other intercomparisons carried out by . These studies are useful for interpreting differences in electromagnetic model outputs for a snapshot profile of the snowpack properties. Given the dependence of microwave scattering on snow microstructure, a satellite retrieval system needs some quantification of microstructure. Snowpack evolution modelling offers a means to quantify the metamorphic changes in snow microstructure. Indeed, snowpack evolution models have been coupled with microwave emission models to demonstrate the potential of this approach for snow remote sensing applications Andreadis and Lettenmaier, 2012;Brucker et al., 2011;Picard et al., 2009). These studies all examined the accuracy of a single snowpack model coupled with a single microwave emission model.
The purpose of this study is to inform future design of retrieval and assimilation systems where snowpack evolution models may be used to provide microstructural parameters for microwave emission models, by examining how particular snowpack and emission model choices lead to a variation in simulated brightness temperatures throughout the winter period, and evaluate how the simulated values compare to observations. The Jules Investigation Model (JIM; Essery et al., 2013) has been coupled with three widely used microwave emission models: the Dense Media Radiative Transfer Multi-Layer model (DMRT-ML; Picard et al., 2013), the Microwave Emission Model of Multi-Layer Snow (MEMLS; , and the Helsinki University of Technology (HUT) multi-layer model (Lemmetyinen et al., 2010;Pulliainen et al., 1999). Snowpack microstructure metamorphism is represented here by three different options with differing complexity for grain diameter evolution (or equivalently the specific surface area, SSA). These models are the grain growth models of SNTHERM (SNT; Jordan, 1991), SNICAR (SNI;Flanner and Zender, 2006), and MOSES (MOS; Essery et al., 2001). This allowed quantification of the seasonal variation in uncertainty in brightness temperature simulations from 1323 coupled snowpackemission model systems, as evaluated against ground-based observations of brightness temperature.
The study approach, model descriptions, and field measurements are given in Sect. 2. Comparisons between simulations and between simulations and observations are presented in Sect. 3, and the implications for future approaches to the remote sensing of snow mass are discussed in Sect. 4.

Models and methods
This study builds on the work of Essery et al. (2013), who incorporated many published snow model parameterisations within a single model framework, the JIM, which is described in Sect. 2.1. As this earlier study did not incorporate snow microstructure changes, JIM was coupled with three microstructure evolution functions for this study, described in Sect. 2.2, and three distinct snow emission models, detailed in Sect. 2.3. Steps necessary to form the model ensemble, including assumptions about the representation of the soil, are given in Sect. 2.4. A description of the field site, driving, and evaluation data for the simulations in this paper are presented in Sect. 2.5.
Thermal conductivity: 0 Empirical JIM variables are snow density ρ s , overlying snow mass M s , snow temperature T s , air temperature T a , wind speed U a , snow effective thermal conductivity λ s , partial density of liquid water γ w , and partial density of ice γ i . Other symbols represent constants, given in Essery et al. (2013).
2.1 Snow model parameterisation Essery et al. (2013) developed the JIM, a system of 1701 snowpack evolution models to provide a systematic method and common framework to examine how the range of snowpack parameterisations used in land surface models impacts the simulation of snow parameters. Based on this work, a more computationally efficient version, a factorial snowpack model has been developed (Essery, 2015) that allows for 32 model configurations. JIM is based on an Eulerian grid scheme (fixed layer structure), which requires mass redistribution between layers with precipitation events. An alternative approach is a Lagrangian grid scheme: a deforming layer structure that retains much of the same snow material throughout the season (e.g. Jordan, 1991;Brun et al., 1992;Lehning et al., 2002). For this paper, a subset of the original JIM members was selected as these were expected to influence the parameters important for microwave modelling. The subset includes variation in the representation of compaction, the density of newly deposited snow, thermal conductivity, and liquid-water flow (snow hydrology). Table 1 summarises the different approaches taken. Note that a variable fresh snow density (options 0 and 1) cannot be used when the snowpack has fixed density (compaction option 2), so there are only 63 model configurations in the model subset rather than 81. For all other snowpack parameterisations, option "1" from Essery et al. (2013) were used for albedo, surface exchange, and snow fraction representations to form the JIM subset.

Microstructure evolution
JIM subset outputs were used to drive three microstructure models of differing complexity. SNT (Jordan, 1991) growth of snow grain diameter d is based on the rate of vapour transport through the snow (and therefore temperature gradient), which leads to the microstructure evolution function of dry snow in SNT as where g 1 and D eos are empirical constants, P a is the atmospheric pressure, C kT s is the variation of saturation vapour pressure with snow temperature T s , T m = 273.15 K, and ∂T s ∂z is the temperature gradient. Grain growth under wet conditions is more rapid, with empirical constant g 2 and is dependent on the liquid fractional volume, f l by SNI microstructure evolution is a computationally efficient approximation to a model based on physics that uses a lookup table for empirical parameters τ and κ, as described in Flanner and Zender (2006). These parameters are dependent on the snow density, temperature, and temperature gradient. The equation of microstructure evolution in SNI is based on snow SSA: SSA per unit mass of ice (m 2 kg −1 ) can then be converted to grain diameter with D = 6/ (ρ i SSA) (Mätzler, 2002;Montpetit et al., 2012). A third microstructure model, MOS, parameterises snow evolution as a function of grain radius r and snow age: where G r is an empirical temperature-dependent grain area growth rate, S f is the snowfall rate in time interval t, and d 0 is a constant representing the mass of fresh snow required to reset the snow albedo to its maximum value.
Other microstructure parameterisations are available, namely the Crocus (Vionnet et al., 2012) and SNOWPACK (Lehning et al., 2002) microstructure evolution functions. It is not currently possible to couple these with the JIM model due to the Eulerian grid structure of JIM. Mass transfer between layers allows numerical averaging of concepts such as grain diameter and SSA, but not shape-dependent concepts such as dendricity and sphericity. Therefore the Crocus and SNOWPACK functions have not been included in this study.

Microwave emission models
The microwave models chosen for this application span a range of physical complexity in their representation of the snow. The HUT model (Lemmetyinen et al., 2010) is a semiempirical model based on strong forward scattering assumptions, the MEMLS model  is of intermediate complexity and contains the improved Born approximation (Mätzler, 1998), and the DMRT-ML model  is the most physically complex and is based on quasi-crystalline approximation with coherent potential (QCA-CP). Many other microwave emission models have been developed, such as Mie scattering approach of Boyarskii and Tikhonov (2000), Chang et al. (1976), and Eom et al. (1983), strong fluctuation theory (Stogryn, 1986;Song and Zhang, 2007), distorted Born approximation (Tsang et al., 2000), the quasi-crystalline approximation (Grody, 2008), other QCA-CP models (Rosenfeld and Grody, 2000;Jin, 1997), or the numerical method of Maxwell's equations in 3-D (Xu et al., 2012). These references are not exhaustive but do give an illustration of the range of models available. Here, we restrict the comparison to widely available multilayer models that span a range of complexity and whose computational efficiency is such that entire seasons can be simulated rapidly.
Of the models chosen, all are multiple layer and broadly require the same information; i.e. they use layered information on snow temperature, density, and layer thickness as input but differ in their representation of the microstructure. They are all based on radiative transfer theory, which is gov-erned by the following general equation: where θ and φ are the zenith and azimuth angles, µ = cos θ , T B is the brightness temperature vector, which we will assume here to consist of horizontally and vertically polarised brightness temperature only, κ a is the absorption coefficient, and κ e is the extinction coefficient, which is a sum of the absorption coefficient and the scattering coefficient κ s . The models differ in which two coefficients determine the third. In HUT, the derived coefficient is κ s , whereas κ e is derived in MEMLS and κ a in DMRT-ML. Other differences between models include the representation of the phase function (single-stream model with separate up-and downwelling components in HUT, six-stream in MEMLS and multiple streams in DMRT-ML), specification of the absorption coefficient and the numerical techniques applied to solve the radiative transfer equation (Lemmetyinen et al., 2010;Picard et al., 2013;Mätzler and Wiesmann, 1999;Pan et al., 2015). Differences between models are not restated here, but options chosen within each model leading to different model versions are stated in the following subsections.

DMRT-ML
DMRT-ML is based on a sticky hard spheres representation of the microstructure so that the scattering coefficient given by the QCA-CP is given as (7) where k 0 = 2π/λ is the wave number, a is the radius of the spheres, f is the fractional volume of scatterers, s is the permittivity of the scatterers, b is the permittivity of the background, and E eff is the effective permittivity of the medium. t is related to the stickiness factor τ governing the potential of particles to coalesce. For non-sticky particles t = 0 but for sticky particles, it is given by the largest of the two solutions to the quadratic equation: Whilst Löwe and Picard (2015) have shown that it may be possible to determine stickiness from micro-CT measurements of the snow, an appropriate value of stickiness is not known for the field observations used in this paper. Roy et al. (2013) and Löwe and Picard (2015) showed that non-sticky representation in DMRT-ML is inappropriate. For this model ensemble, two DMRT-ML configurations have been chosen to capture the range of brightness temperatures simulated: "DMRT less sticky" (τ = 0.2) and "DMRT very sticky" (τ = 0.1). These two values represent reasonable values used by others (e.g. Tsang et al., 2007;Shih et al., 1997).

MEMLS
Within MEMLS there are a suite of options for the calculation of the scattering coefficient. Two of the options within MEMLS were selected for this study to cover both empirical and theoretical approaches: "MEMLS empirical" and "MEMLS IBA". The empirical version of MEMLS used gives the scattering coefficient as where the correlation length p ec is in mm, density ρ is in g cm −3 , and frequency ν is in GHz. This is suitable for correlation lengths 0.05 < p ec < 0.3 mm and density 0.1 < ρ < 0.4 g cm −3 . MEMLS IBA uses the improved Born approximation theory given in Mätzler (1998) and Mätzler and Wiesmann (1999), where the scattering coefficient is given by the integral of the phase function for polarisation angle χ One further assumption applied to distinguish this MEMLS IBA configuration is that oblate grains are used rather than small spherical scatterers or thin spherical shells. This assumption governs the representation of the mean square field ratio, K 2 , as detailed in Mätzler and Wiesmann (1999). The microstructure length information is contained in I : It should be noted that the choice of oblate grains also affects the effective permittivity in I , represented by an empirical, density-dependent effective permittivity (Wiesmann and Mätzler, 1999, Eqs. 45-47) for this case.

HUT
HUT has three options for the extinction coefficient. These are nominally suited to different grain diameter (d 0 ) ranges, with some overlap between them. All three versions (termed HUT H87, HUT R04, HUT K10) have been included in this version of the model ensemble. HUT H87 is based on the work of Hallikainen et al. (1987): This is nominally appropriate for frequency range ν = 18-60 GHz and d 0 < 1.6 mm.
The extinction coefficient in HUT H04, with a validity range of 1.3 < d 0 < 4 mm was derived by Roy et al. (2004): Kontu and Pulliainen (2010) gave the extinction coefficient for maritime snow, used here in the HUT K10 simulations as Scaling of the grain diameter by the relationship recommended in Kontu and Pulliainen (2010) has not been applied here as it was developed for snow microstructure observations rather than simulated snowpack microstructure.

Model framework
Interfacing of the various model inputs and outputs was enabled through the development of the ensemble framework, via a combination of shell script and Octave/MATLAB code. The DMRT-ML model was run from the shell script, which subsequently calls an Octave/MATLAB script to run HUT and MEMLS. HUT and MEMLS run alternately in this framework as the soil parameters (common between DMRT-ML and HUT) are used to calculate soil reflectivity in HUT, which is then used as the lower boundary condition in MEMLS. Internal parallelisation of the MATLAB code of HUT-MEMLS means that a season-long simulation of one HUT-MEMLS combination with one grain scaling factor takes 9 min over eight cores. For the DMRT-ML FOR-TRAN code, external bash shell parallelisation reduces execution time from 16 to ca. 2 h for one grain scale factor and two parameterisations of stickiness. Over 29 million individual brightness temperatures were simulated for this study.
For the purposes of this study, the effective sphere size in JIM, DMRT-ML, and HUT is assumed to be identical i.e. d HUT = 2 × r DMRT = d JIM . This may not be a good assumption as the empirical extinction coefficient model used in HUT was based on observations of the maximum grain extent rather than effective diameter, which was almost impossible to measure at the time of the original work. The exponential correlation length in MEMLS (in mm) is calculated from the theoretical relationship to the effective grain diameter from JIM (in µm) as Montpetit et al. (2013) and Mätzler (2002): Figure 1 illustrates the flow of information in the model ensemble. Meteorological data are used to drive the 189 configurations of JIM (3 microstructural models for each of the 63 snowpack parameterisations). The outputs from JIM are then reformatted for each of the three electromagnetic models. Table 2 gives a summary of the main differences in inputs

[ mm]
Layer number 1 = base 1 = base 1 = base 1 = top Soil permittivityobs r 0,HUT obs Note that for the purposes of the ensemble, DMRT-ML was adapted to allow the input of diameter rather than radius. HUT was adapted to ensure Fresnel reflectivity for a smooth soil surface and to output the soil reflectivity r 0,HUT at both polarisations for use in MEMLS simulations.
between models. The electromagnetic model inputs are then used to drive the two DMRT-ML model versions (τ = 0.1, τ = 0.2), the two MEMLS model versions (empirical, IBA with oblate grains), and the three HUT versions (three different extinction coefficient models). Meteorological and field data used to drive and evaluate the ensemble are described in the following section.

Data
Model runs for this study were performed for the Intensive Observation Area (IOA) of the Finnish Meteorological Institute Arctic Research Centre (FMI-ARC). The site provides a wealth of forcing and evaluation data, including automated soil, snow and meteorological observations, ground-based microwave radiometry, and a programme of manual snow profile observations. Air temperature, solar ra- pack model evaluation. Snow samples from 31 of these pits were extracted and used to measure profiles of the SSA with the IceCube instrument (Zuanon, 2013). A bulk grain diameter was calculated for the analysis from the snow water equivalent (SWE)-weighted mean SSA, excluding layers without observations. For these two seasons, the real component of the soil permittivity measurements were available at 100 MHz, measured at three locations in the observation area with Delta-T devices ML2x sensors, installed horizontally at a depth of approximately 2 cm beneath the organic surface layer. Mean measurements from the stable winter period (1 December-31 March) were chosen as representative for the entire season, which resulted in values of soil permittivity of 4.4 in 2011-2012 and 4.6 in 2012-2014. For the JIM simulations in this paper, a scaling factor of 1.11 was applied to the 2011-2012 precipitation data, and a scaling factor of 1.06 was applied to the 2012-2013 data to match the measured snow accumulation on the ground better. These factors differ slightly from the values used in the 7-year consolidated data set of Essery et al. (2016).

Simulation methodology
Choices in the snowpack evolution parameterisations made here lead to 189 unique JIM snowpack models. Modelled snowpack profiles of layer thickness, temperature, density, and grain diameter were output daily at noon for this study. These were then applied to the seven microwave emission model combinations, resulting in 1323 sets of brightness temperature simulations per day.
In order to illustrate and analyse the effects of assumptions regarding snowpack evolution and microwave scattering on simulated brightness temperatures over the course of the winter season, the remainder of the paper will do the following: 1. Present the range of brightness temperatures expected for any generic combination of snowpack and emission model.
2. Apply a range of scaling factors (0.1 ≤ ≤ 5.0) to simulated JIM snowpack diameters (d optimal = d JIM ) and calculate the degree of misfit between simulated and observed brightness temperatures using the following cost function (CF): The cost function term is summed over the two polarisations (H-and V-pol) for the two frequencies (18.7 and 36.5 GHz) over the number of days (ndays) when observations and simulations are both available. Due to the observation schedule at the Sodankylä site, the noon "observations" for comparison with the simulations were determined as the mean of the 10 am and 2 pm observations. If observations were missing from either or both of these times, the brightness temperature for that day was excluded from the CF calculation. Optimal were found from the minimisation of the CF.
3. Isolate the effect of snowpack parameterisations on simulated brightness temperature by presenting simulation results grouped by parameterisations of densification, liquid-water flow, initial snow density, and thermal conductivity. This will determine which factors govern the spread in brightness temperature and are therefore important for the design of snow retrieval assimilation systems. Fig. 3. There is a small difference between the automatic measurements and the manual field observations attributable to the spatial variability of the snow and difference in measurement location. Ultrasonic snow depth measurements were on average 12 mm deeper than the snow pit observations in 2011-2012 but were 29 mm shallower than snow pit measurements in 2012-2013. SWE measured automatically by the gamma ray sensor had a mean value of 3.6 mm SWE greater than the pit observations in 2011-2012 but 5.9 mm less in 2012-2013. Although the precipitation inputs were scaled due to known sensor undercatch problems, in 2011-2012 the SWE was underestimated until the end of January, then overestimated until the melt period. Compared with snow pit observations, the SWE bias prior to 1 February was −14.2 mm. Between 1 February and 31 March, the SWE bias was 19.1 mm. From 1 April until the end of the season, the bias was −24.5 mm water equivalent. In 2012-2013, simulated SWE was overestimated for most of the season, with a mean bias of 13.7 mm compared with the snow pit observation. Simulated SWE is relatively insensitive to the snow parameterisation in the accumulation period, but three distinct model groups emerge in the melt period, which are due to the three different representations of the liquid-water flow. The snowpack model parameterisations have a greater impact on the snow depth, which is to be expected as this is directly affected by the representation of densification and initial snow density. Figure 4 demonstrates the impact of the snow model parameterisations on snow grain diameter growth, as simulated with the MOS, SNI, and SNT microstructure evolution functions. Each microstructure model results in a spread of bulk grain diameter due to the 63 snowpack parameterisations, but in general the difference between microstructure models is greater than the difference due to snowpack parameterisations. The simulation range is greatest at the start and at the end of the season, when the snowpacks can be subject to the largest temperature gradients or liquid-water dependent growth. In both years of simulation, the mid-season bulk grain diameter is smallest with MOS and largest with SNT. MOS and SNI are similar in magnitude, but SNT bulk grain diameters were approximately twice as large on average, with a mean ratio over the season of 1.9-2.2, as shown in Table 3. SNT bulk grain diameter was up to 3.2 times larger than MOS bulk grain diameter. Visual estimation of the snow grain diameter gave values that were always larger than all of the simulations. Measured SSA-derived bulk grain diameters generally lay in between the SNT simulations and the simulations with SNI and MOS. The mean absolute error and mean relative difference for these simulations are presented in Table 4. SNT had the lowest bias (0.12 mm) in 2011-2012, whereas SNI had the lowest bias (−0.14 mm) in 2012-2013. Bulk grain diameter simulated by the microstructure models led to a mean difference of between −53 and +45 % relative to the observations. Simulation of mean and range of brightness temperature from the three emission models driven by all snowpack and microstructure model combinations is shown in Fig. 5. Note that excessively low brightness temperatures on 1 November 2011 were excluded from this figure as the snowpack for some JIM members was extremely thin with an unphysically high snow density. In general, HUT with three representations of extinction coefficient showed the smallest range of The Cryosphere, 11, 229-246, 2017

Snow depth and SWE simulated by the JIM is shown in
www.the-cryosphere.net/11/229/2017/  brightness temperature, whereas DMRT-ML (covering both very sticky and less sticky assumptions) had a much greater range, which was nearly as large as MEMLS (empirical representation and improved Born approximation with oblate grains). This is demonstrated by the ratio between the seasonal mean ranges of brightness temperature presented in Ta-Table 4. Mean absolute error (mm) between bulk grain diameter simulated with the microstructure models compared with observations derived from SSA measurements with IceCube. Smallest bias for each year is shown in bold. Percentages are given in parentheses.  ble 5, where the ranges compared with HUT had a ratio of greater than 1. MEMLS had a larger range than DMRT-ML, although at 37 GHz the difference was small. As illustrated in Fig. 5, at 19 GHz, the mean of DMRT-ML simulations were highest and the mean of MEMLS simulations were generally lowest (with the exception of 19H in 2011-2012). At 37 GHz, horizontal and vertical polarisation, HUT gives the highest mean brightness temperature in both years, although the mean DMRT-ML brightness temperature is within 3 K of HUT at horizonal polarisation (both years). MEMLS mean brightness temperatures are the lowest at 37 GHz at both horizontal and vertical polarisation in both years. All ranges exhibit a distinctive "wedge" shape, where the ranges generally increase throughout the season until the collapse of the range in the melt period. Compared with the brightness temperature observations, no model gives a consistently better performance across both frequencies and both polarisations. This is illustrated in Table 6, where mean bias and root mean square error (RMSE) for each season has been presented for each frequency and polarisation combination. The lowest bias was less than 7 K in magnitude, whereas the lowest RMSE for each frequency/polarisation was less than 13 K.     Table 7 indicates scaling factors that would need to be applied to the grain diameter in order to allow a particular microstructure evolution function to minimise the CF given in Eq. (16), i.e. the best agreement with observed brightness temperature for all four frequency and polarisation combinations. A scale factor of 1 suggests a perfect fit between snowpack microstructure and microwave microstructure. A scale factor of less than 1 indicates a snowpack grain diameter overestimate, whereas a scale factor of greater than 1 is an underestimate. For SNT microstructure, a scale factor of less than 1 was required in 2011-2012 for all emission models with the exception of the less sticky (τ = 0.2) application of DMRT-ML. This indicates that the SNT microstructure resulted in grain diameters larger than that required by the emission models for that year. In 2012-2013 SNT microstructure required slight scaling to increase the grain diameter for HUT and for less sticky DMRT-ML, but downscaling for very sticky hard spheres in DMRT-ML and for MEMLS. With the exception of the application to empirical MEMLS in 2011-2012, the SNI and MOS grain diameters were too small and required scaling upwards. A CF minimum was achieved for empirical MEMLS driven by MOS microstructure with no scaling whatsoever in 2011-2012. The pattern is consistent between years, with the greatest interannual difference in scale factor for HUT. Once the microstructure differences have been isolated through application of the optimal scale factor, as shown in Table 8 Differences in brightness temperature also exist in the simulations due to the snowpack parameterisation (i.e. 63 JIM combinations). Empirical MEMLS with MOS microstructure in the 2011-2012 season was chosen as a test case to illustrate the effects of snowpack parameterisation on the brightness temperature because of the equivalence of snowpack and emission model microstructure (no scaling required). This subset of 63 simulations for 37H brightness temperature in 2011-2012 is shown in Fig. 6. There is a seasonal dependence in the range, with model divergence from mid-January onwards. 1 February and 1 May were chosen for cluster analysis to determine which parameterisations caused the split in simulations, as shown in Fig. 7.

2011-
Clear groupings of simulations in Fig. 7, upper left, indicate that the snowpack densification parameterisation has a distinguishable effect on the simulation of brightness temperature. A physical representation of densification (parameterization = 0) gave the lowest brightness temperatures on the 1 February, but the highest by 1 May. In contrast, where no compaction is simulated, i.e. snow density is constant throughout the season (parameterization = 2), the opposite is true. An empirical representation of densification (parameterization = 1) results in brightness temperatures generally between those of the physical, and of no densification. Ther-mal conductivity has no effect on the simulation of brightness temperature, whereas subtle differences are attributable to the fresh snow density value and to the representation of snow hydrology. There is no discernible difference between fresh snow density parameterisation schemes 0 and 1, whereas 2 gives a different set of brightness temperatures. Snow hydrology has very little effect in the early season but can lead to differences in the melt period. Overall, the snowpack parameterisations with MOSES microstructure and empirical MEMLS lead to a mean difference in the 36.5 GHz brightness temperature of 11 K at H-pol and 18 K at V-pol. The maximum difference in 36.5 GHz brightness temperature was 33 K at H-pol and 54 K at V-pol for the 2011-2012 season. The maximum difference between H and V polarisation for all unscaled microstructure-electromagnetic model combinations is demonstrated in Table 9. Large differences in the maximum brightness temperature difference as a result of the 63 snowpack configurations occurred for the SNT microstructure. Except for DMRT-ML less sticky and HUT with MOS or SNI microstructure, the V-pol difference is greater than the H-pol difference.

Discussion
The biggest difference to obtaining accurate simulations would be made by improving the microstructure evolution models within snowpack models because the optimal scale factors are generally larger between microstructure models than between emission models. SNTHERM grains tend to be too large for the emission models and generally require scaling down to smaller values. SNICAR grains are in the mid-range and require a small amount of scaling, generally upwards to larger grains. MOSES grains are the smallest and generally require higher scale factors than SNICAR. These patterns are consistent, regardless of the electromagnetic radiative transfer model used. Differences between microstructure evolution models are so large because they were developed in models with different purposes. MOSES is a large-scale land surface model, requiring snow grain size for albedo calculations (Essery et al., 2001). SNICAR is a snow albedo model (Flanner and Zender, 2006). SNTHERM, in contrast, was developed to predict surface temperature and uses grain diameter in the simulation of liquid-water flow as well as albedo (Jordan, 1991). SNICAR and MOSES grain sizes are closer to the SSA-derived grain diameter as a result. SNTHERM simulates a grain size that is closer in concept to the visual estimates of grain diameter than the other two models. The large spread when coupling snowpack evolution and microwave models, due to the differences in the modelling of snow microstructure, is consistent with the wide range of studies that have investigated how to link snowpack observations of microstructure to the microstructure parameter required in electromagnetic models (e.g Kendra et al., 1998;Du et al., 2005;Liang et al., 2008;The Cryosphere, 11, 229-246, 2017 www.the-cryosphere.net/11/229/2017/   Durand et al., 2008;Brucker et al., 2011;Xu et al., 2012;Montpetit et al., 2013;Roy et al., 2013;Rutter et al., 2014;Picard et al., 2014). Nevertheless there are differences between microwave emission models for a particular microstructure evolution model and even differences within the same family of emission models. "Improvement" in the microstructure for a particular model combination may lead to less accurate simulations at some frequencies and polarisations, which highlights that there is more to understand. In part, this may be due to the methodology of this study as the CF is calculated per microstructure-electromagnetic model configuration, yet the bias and RMSE are presented for each electromagnetic model family. An individual contribution can influence the group in a non-intuitive way.
Here, the lowest bias and RMSE for unscaled microstructure simulations were −6.9 to +6.9 K and 4.2 to 12.2 K, respectively, but depended on microwave model, frequency, and polarisation. In an attempt to put these results into context, there are a number of studies that have quantified brightness temperature simulation errors for these models. These fall into different categories, depending on sensor characteristics, the source of the evaluation data (ground-based, airborne, satellite) and presence of ice lenses , the treatment of the snow microstructure (Picard et al., 2014), snow type, observation angle, and the specific electromagnetic model , and the underlying substrate (Lemmetyinen et al., 2009;Derksen et al., 2014). Examples of unscaled field observations of microstructure compared with ground-based observations include the HUT simulations of , who found an RMSE of 10-34 K, and Rutter et al. (2014), who found a bias of 34-68 K that was reduced to < 0.6 K upon application of grain scale factors of 2.6-5.3. Scaling, or best-fit, relationships were used by Durand et al. (2008) (mean absolute error 3.1 K at V-pol and 9.3 K at H-pol), Montpetit et al. For DMRT-ML, consideration of the stickiness is imperative. Two constant values were considered here: extremely cohesive or less sticky particles. Löwe and Picard (2015) have made progress in understanding stickiness from micro-CT data. There are theoretical limits, based on snow density (Löwe and Picard, 2015, Eqs. 35-36), but in general stickiness is independent of diameter and of density and a constant value should not be used, as was done here. Further research is needed in this regard.
For the HUT radiative transfer model, the optimum combinations of snowpack and microwave model are dependent on both models and, therefore, the end application. The SNT microstructure is most closely matched to the microstructure of the Hallikainen et al. (1987) extinction model. Both were developed with a similar concept of microstructure. With MOS or SNI microstructure, Roy et al. (2004) would be most appropriate. Kontu and Pulliainen (2010) is more broadly applicable as the scale factor always lies between R04 and H87, regardless of the microstructure model. Therefore K10 may be better choice if a range of microstructure models is considered in a data assimilation retrieval scheme but with only one observation operator.
In the case of MEMLS, there are some differences between the empirical model and IBA, but the microstructure model really matters. IBA is a more appropriate model for the larger SNT grains and endorses the recommendation of Mätzler and Wiesmann (1999) for IBA in the simulation of larger grains. The microstructural concept of MOS matches the microstructure of empirical MEMLS very well, with no scaling required in 2011-2012, although SNI is equally appropriate in 2012-2013.
There is little variation between years for the DMRT-ML (sticky) and MEMLS models, and a consistent pattern for HUT. Other studies have investigated the microstructural link between snowpack and microwave models. Wiesmann et al. (2000) found that the scale factor between exponential correlation length in MEMLS and grain diameter in SNTHERM for the Weissfluhjoch site in Davos, Switzerland, was 0.16. Applying Eq. (15) for snow of density 250 kg m −3 , the scale factor to relate the grain diameter of SNTHERM to the exponential length in MEMLS for the Sodankylä site would be 0.24 for IBA and 0.15 for empirical MEMLS for the 2011-2012 data set. This is entirely consistent with the Wiesmann et al. (2000) study, in spite of the different locations and snowpack conditions. Wiesmann et al. (2000) also reported a relationship for Crocus simulations, as did Brucker et al. (2011). At this stage it is not possible to make comparisons of this work with those studies because the Crocus evolution model has not been included in this study due to the difficulty of ap-plying these models to the Eulerian frame snowpack model scheme used here. These two studies are, however, consistent with each other. Wiesmann et al. (2000) found a snow-typedependent scale factor of 0.3-0.4 between MEMLS correlation length and Crocus grain diameter, whereas the range in Brucker et al. (2011) was 0.4-0.25 for snow density between 100 and 400 kg m −3 . The scaling factor between the SNOWPACK-derived correlation length and the correlation length of MEMLS was found to be 0.1  but, again, a comparison with this work is not possible as the SNOWPACK grain evolution model has similar requirements to the Crocus microstructure model as they have a common origin.
When isolating the spread in brightness temperature due to snowpack parameterisations, this spread is largely due to the snowpack model representation of the densification process, with a variable impact throughout the season. After the microstructure model, snow compaction must be considered carefully in the design of a coupled snowpack-microwave model. Liquid-water flow representation in the snowpack model may become important in the melt period, particularly for a snowpack with mid-winter melt periods or if the snowpack model is used to provide information on SWE during melt when microwave observations cannot. If fresh snow is assumed to have a constant density in a retrieval or assimilation system then that value will have an impact but is less important than compaction. Thermal conductivity has no discernable impact on the brightness temperature simulations so the choice of its representation is largely irrelevant for snow mass retrieval and assimilation systems.
Although empirical MEMLS driven by MOSES was chosen as an example to demonstrate the impact of parameterisations, this was purely because of the apparent consistency between the MOSES grain diameter converted to exponential correlation length and MEMLS simulations for 2011-2012 at this site. This is not a general endorsement of empirical models, as those based on physics are expected to be more universally applicable, but the specific application of these models will dictate the balance of accuracy versus simplicity. Extending the analysis beyond this example, snow parameterisations affect other unscaled model combinations to varying degrees. Microstructure scaling factors in Table 7 can be used as a proxy for the degree of scattering in the unscaled simulations. A higher scale factor acts to increase the simulated scattering, so for a scale factor < 1 too much scattering occurs in the unscaled simulations. Snowpack parameterisations have a greater impact for a higher degree of scattering, larger at V-pol than H-pol. This is because scattering is already greater at H-pol so the spread in H-pol simulations as a result of snowpack parameterisations is suppressed by the existing level of scattering. The converse applies for high scaling factors (e.g. MOS with less sticky DMRT-ML).
Although the differences in scale factors between microstructure models are larger than the differences in scale factors between microwave models, this does not negate the need for developments in the microwave models. This is highlighted by the treatment of field observations of SSA to derive optical diameter as even these require some form of scaling (e.g. Montpetit et al., 2013;Picard et al., 2014;Rutter et al., 2014). Use of scale factors can improve brightness temperature accuracy at some frequency and polarisations but may decrease the accuracy at others. The necessity of scale factors indicate the need for a deeper understanding in the role of microstructure in the microwave models. Much of this is discussed from a theoretical perspective by Löwe and Picard (2015). With a sticky hard sphere model of the microstructure, even if the stickiness is known, Löwe and Picard (2015) showed that a scale factor to relate the measured optical diameter to microwave diameter depends on the type of metamorphism the snow has been subjected to. Indeed, here, constant scale factors have been applied with no attempt to assess how these may change over the season. They do not account for the anisotropic nature of the snow, which adds to the complexity both in the modelling of the snowpack (Löwe et al., 2013) and in microwave scattering (Leinss et al., 2016). Some of the fundamental questions on how to relate snowpack and microwave microstructure may be addressed with a better microstructure descriptor of the snowpack rather than a single length scale and would benefit from easy interchangeability between different microwave models and different snowpack evolution models. Ultimately a consistent microstructural treatment will be needed in both snowpack evolution and microwave models.

Conclusions
Future snow mass and depth retrievals systems may rely on snowpack models to provide snow microstructural parameters. To improve accuracy in seasonal simulations of brightness temperature, the largest gains will be achieved by improving the microstructural representation within snowpack models, followed by improvements in the emission models to use accurate microstructural information and reduce bias and RMSE at all frequencies and polarisations simultaneously. For the design of retrieval systems with current capabilities, particular model combinations may be more suitable than others, and careful consideration must be given to snow compaction processes. Snow process representation becomes increasingly important as the snowpack scatters more. The future lies in a better and consistent treatment of snow microstructure in both snowpack and emission model developments.

Data availability
Data are available at the repository doi:10.6084/m9.figshare.4552822 (Sandells et al., 2017). These include JIM outputs, the non-scaled grain diameter and optimal grain diameter brightness temperature simulations, and observations of brightness temperature and grain size. The full brightness temperature data set (all scaling factors) is too large to place in a repository but can be made available upon request.