Evaluation of different methods to model near-surface turbulent fluxes for a mountain glacier in the Cariboo Mountains , BC , Canada

As part of surface energy balance models used to simulate glacier melting, choosing parameterizations to adequately estimate turbulent heat fluxes is extremely challenging. This study aims to evaluate a set of four aerodynamic bulk methods (labeled as C methods), commonly used to estimate turbulent heat fluxes for a sloped glacier surface, and two less commonly used bulk methods developed from katabatic flow models. The C methods differ in their parameterizations of the bulk exchange coefficient that relates the fluxes to the near-surface measurements of mean wind speed, air temperature, and humidity. The methods’ performance in simulating 30 min sensibleand latent-heat fluxes is evaluated against the measured fluxes from an open-path eddycovariance (OPEC) method. The evaluation is performed at a point scale of a mountain glacier, using one-level meteorological and OPEC observations from multi-day periods in the 2010 and 2012 summer seasons. The analysis of the two independent seasons yielded the same key findings, which include the following: first, the bulk method, with or without the commonly used Monin–Obukhov (M–O) stability functions, overestimates the turbulent heat fluxes over the observational period, mainly due to a substantial overestimation of the friction velocity. This overestimation is most pronounced during the katabatic flow conditions, corroborating the previous findings that the M–O theory works poorly in the presence of a low wind speed maximum. Second, the method based on a katabatic flow model (labeled as the KInt method) outperforms any C method in simulating the friction velocity; however, the C methods outperform the KInt method in simulating the sensible-heat fluxes. Third, the best overall performance is given by a hybrid method, which combines theKInt approach with the C method; i.e., it parameterizes eddy viscosity differently than eddy diffusivity. An error analysis reveals that the uncertainties in the measured meteorological variables and the roughness lengths produce errors in the modeled fluxes that are smaller than the differences between the modeled and observed fluxes. This implies that further advances will require improvement to model theory rather than better measurements of input variables. Further data from different glaciers are needed to investigate any universality of these findings.

Abstract.As part of surface energy balance models used to simulate glacier melting, choosing parameterizations to adequately estimate turbulent heat fluxes is extremely challenging.This study aims to evaluate a set of four aerodynamic bulk methods (labeled as C methods), commonly used to estimate turbulent heat fluxes for a sloped glacier surface, and two less commonly used bulk methods developed from katabatic flow models.The C methods differ in their parameterizations of the bulk exchange coefficient that relates the fluxes to the near-surface measurements of mean wind speed, air temperature, and humidity.The methods' performance in simulating 30 min sensible-and latent-heat fluxes is evaluated against the measured fluxes from an open-path eddycovariance (OPEC) method.The evaluation is performed at a point scale of a mountain glacier, using one-level meteorological and OPEC observations from multi-day periods in the 2010 and 2012 summer seasons.The analysis of the two independent seasons yielded the same key findings, which include the following: first, the bulk method, with or without the commonly used Monin-Obukhov (M-O) stability functions, overestimates the turbulent heat fluxes over the observational period, mainly due to a substantial overestimation of the friction velocity.This overestimation is most pronounced during the katabatic flow conditions, corroborating the previous findings that the M-O theory works poorly in the presence of a low wind speed maximum.Second, the method based on a katabatic flow model (labeled as the K Int method) outperforms any C method in simulating the friction velocity; however, the C methods outperform the K Int method in simulating the sensible-heat fluxes.Third, the best overall performance is given by a hybrid method, which combines the K Int approach with the C method; i.e., it parameterizes eddy viscosity differently than eddy diffusivity.An error analysis reveals that the uncertainties in the measured meteorological variables and the roughness lengths produce errors in the modeled fluxes that are smaller than the differences between the modeled and observed fluxes.This implies that further advances will require improvement to model theory rather than better measurements of input variables.Further data from different glaciers are needed to investigate any universality of these findings.

Introduction
Mountain glaciers of British Columbia are experiencing significant mass loss in response to ongoing climate change and are projected to lose most of their current volume by the end of the century (Radić et al., 2014;Clarke et al., 2015).These projections, however, are highly sensitive to the representation of glacier mass balance in the models, in particular to the model parameterizations used to simulate surface melting.All regional-to global-scale projections of glacier mass changes rely on a relatively simple representation of glacier Published by Copernicus Publications on behalf of the European Geosciences Union.
V. Radić et al.: Evaluation of different methods to model near-surface turbulent fluxes melting via so-called "degree-day" or "temperature-index" models (Radić and Hock, 2014).Despite relying on empiricism, these models have relatively successfully dealt with the absence of fine-resolution meteorological input data that are, however, a prerequisite for a good performance of more physics-based surface energy balance (SEB) models.With rapid development of computational resources and availability of global climate models, SEB models will likely be used in future studies of glacier evolution on regional and global scales.To ensure that future models are firmly rooted in theory, new methods are required to test and parameterize turbulent heat fluxes at the glacier surface.
The turbulent sensible-and latent-heat fluxes are recognized as important components of the SEB over midlatitude glaciers worldwide (Hock, 2005).When averaged over periods of weeks to months, these fluxes are generally less than net radiation.Over daily to hourly timescales, however, sensible-and latent-energy transfers can exceed net radiation (Hock, 2005).Highest melt rates at these short temporal scales often coincide with times of greatest turbulent heat flux (Hay and Fitzharris, 1988).Direct measurements of turbulent fluxes, by an eddy-covariance method, are relatively rare because they require sophisticated instrumentation requiring continuous maintenance, making them unsuitable for long-term operational purposes.Few studies exist that have directly measured turbulent fluxes over mountain glaciers, and all of these studies are restricted to shorter time intervals, e.g., weeks to a couple of months (e.g., Munro, 1989;Forrer and Rotach, 1997;van der Avoird and Duynkerke, 1999;Cullen et al., 2007;Guo et al., 2011;Conway and Cullen, 2013).A more common approach in the studies of SEB on glaciers is to derive the turbulent fluxes through parameterization schemes that utilize the observations of mean meteorological variables such as near-surface air temperature, wind, and relative humidity (e.g., Hock and Holmgren, 2005;Sicart et al., 2005;Mölg et al., 2008;Reijmer and Hock, 2008;Mölg et al., 2009;MacDougall and Flowers, 2011;Sicart et al., 2011;Huintjes et al., 2015;Prinz et al., 2016).The simplest and most widely applied method is the bulk aerodynamic method (bulk method) (e.g., Braithwaite et al., 1998;Oerlemans, 2000).The bulk method is an integrated form of a gradient-flux relation, based on the theoretical work of Prandtl (1934) and Lettau (1934).The bulk method assumes constant vertical fluxes of momentum, heat, and humidity in the surface boundary layer, and horizontal homogeneous conditions of the surface.
Despite the common usage of bulk methods in glacier SEB studies, its application is often problematic.Main limitations of the bulk method applied to glacier SEB studies include the following: 1. Surface conditions: the bulk method for sensible-heat fluxes relies on the assumption that the difference between near-surface air temperature, often measured at 2 m above the surface, and surface temperature ade-quately represents the near-surface gradient of air temperature.Furthermore, the surface temperature is rarely measured but rather usually assumed or modeled, which can introduce a significant error in the estimates of the turbulent heat fluxes (Conway and Cullen, 2013).Similarly, for the latent-heat fluxes, the surface vapor pressure is assumed rather than measured.
2. Roughness lengths of momentum (z 0v ), temperature (z 0t ), and humidity (z 0q ): to obtain accurate roughness lengths, detailed measurements at a glacier site are needed, either by an eddy-covariance method or by vertical profile measurements.Without them, roughness lengths from other studies are used assuming that the values are universally applicable on similar glacier surfaces (e.g., Greuell and Oerlemans, 1989;Konzelmann and Braithwaite, 1995).Values of z 0v derived at glacier ice surfaces widely vary in space and time (Bintanja and van den Broeke, 1995;van den Broeke, 1996).There are few directly measured values for the scalar roughness lengths (z 0t and z 0q ) for mountain glaciers, which has led to approaches that assume their values approximate z 0v .While this "effective" roughness length (Braithwaite, 1995) works well as a tuning parameter when the modeled turbulent fluxes are optimized to match the observed ones, it differs from its "actual" z 0v , which only depends on the geometry and distribution of the roughness elements, assuming turbulence is generated by frictional drag.An alternative approach to deriving scalar roughness lengths is to relate them to measured z 0v via the surface renewal theory of Andreas (1987).That approach found that z 0t and z 0q over a glacier surface are approximately 2 orders of magnitude smaller than z 0v in an aerodynamically rough flow (Smeets et al., 1998;Conway and Cullen, 2013), but more studies are needed to investigate whether this finding is invariant in space and time.When the eddy-covariance method is used, accurate roughness lengths can only be confidently derived in near-neutral stability conditions that are rarely present at sloped glacier surfaces, especially during the melt season, when stable atmospheric stratification often accompanied by a downslope drainage (katabatic) flow dominates.Better-constrained values of roughness lengths are needed, especially considering that order-ofmagnitude changes in z 0v and z 0t alter the modeled turbulent heat fluxes by a factor of 2 (Munro, 1989;Hock and Holmgren, 1996).
3. Stability corrections: many SEB models applied to snow and ice assume that logarithmic vertical profiles of wind speed, temperature, and humidity are valid under prevailing stable conditions and no modification due to the stability effects are necessary (e.g., Munro, 1991;Oerlemans, 2000;Machguth et al., 2006).Experimental and theoretical evidence supports the assertion that Monin-Obukhov (M-O) theory, which requires modification of the fluxes due to the stability effects, is not applicable over sloping glacier surfaces due to violation of the assumptions such as homogeneous, infinite, flat terrain and constant fluxes with height (Denby and Greuell, 2000).When the atmospheric stability is considered in the bulk method, two approaches are commonly used: (a) one that accounts for stability through the bulk Richardson number (Ri b ) (e.g., Wagnon et al., 2003;Mölg et al., 2008;Anderson et al., 2010;Gillett and Cullen, 2011) and (b) another that uses a M-O stability parameter z L (e.g., Braithwaite, 1995;Klok et al., 2005;van den Broeke et al., 2005;Hulth et al., 2010).In both approaches, the stability corrections are empirical functions often derived from field studies on a flat and vegetated terrain.Applying the corrections through the two metrics (Ri b and z L ) can significantly alter the modeled heat fluxes, especially in the presence of large positive near-surface temperature gradients during conditions of low wind speed (< 3 m s −1 ) (Conway and Cullen, 2013).

Stable boundary layers accompanied by katabatic flow:
over a sloping glacier surface, a low-level katabatic jet is a common feature, which means that buoyancy enters the horizontal momentum equation and the turbulence arises independently of the surface roughness (e.g., van der Avoird and Duynkerke, 1999).The Prandtl model for katabatic flow that treats eddy viscosity as a constant value (Prandtl, 1942) is unable to correctly describe the sharp near-surface gradients in wind speed and air temperature that are often observed (Munro, 1989;Oerlemans, 1998).Grisogono and Oerlemans (2001) showed that the Prandtl model can be improved if a varying assigned eddy viscosity profile is used instead of a constant value.Using this "enhanced" Prandtl model, Parmhed et al. (2004) found that modeled turbulent fluxes at a glacier surface compare well with the measured fluxes during katabatic flows.An alternative approach to incorporating the model for katabatic flow into the bulk method for turbulent heat fluxes was developed following the classical Prandtl model for slope flows (Oerlemans and Grisogono, 2002).The simple model only agrees in the overall pattern, which shows an increase in sensible-heat flux in response to an increase in air-surface temperature difference.Despite its simplicity, application of the model requires calibration with detailed observations of temperature and wind profiles in the surface boundary layer, in addition to the near-surface meteorological observations.Few such measurements exist on glaciers, and therefore these models have not been rigorously evaluated so far.
In light of the above, further research is needed to evaluate and develop methods that are suitable for determining nearsurface turbulent fluxes over sloping glacier surfaces subject to katabatic flows.In particular, our objective is to evaluate the existing framework of bulk approaches commonly used for parameterizing the turbulent momentum and heat fluxes at a glacier surface.We achieve this by comparing the modeled fluxes, i.e., output of the bulk approaches, to measured fluxes, i.e., the fluxes derived from an open-path eddy-covariance (OPEC) method.OPEC measurements are obtained from a mountain glacier in British Columbia (BC), over a short-term window (weeks) for two summer seasons.We first provide a brief overview of the field site, data collection, eddy-covariance data treatment, and overall methodology.This is followed by a detailed description of the bulk approaches and their evaluation results, and finalized with the discussion and conclusions.
2 Data and methods

Study site and station setup
Castle Creek Glacier lies in the Cariboo Mountains, BC (Fig. 1).The 9.5 km 2 glacier flows north for 5.9 km; has an elevation range of 2827-1810 m a.s.l.; and contributes meltwater to Castle Creek, a tributary of the Fraser River.The glacier's annual mass balance has been monitored since 2009 (Beedle et al., 2014).We utilized the data from an automatic weather station on the glacier (AWS glac ): in summer 2010, the AWS glac was installed at 53 • 3 2.99 N, 120 • 26 39.57W and an altitude of 1967 m a.s.l.; in summer 2012 we installed the AWS glac to within ±10 m of its 2010 position.Two year-round off-glacier automatic weather stations have been in operation since 2007/2008 (Déry et al., 2010).The stations are situated near the glacier's transient snowline and in the glacier forefield, and are herein referred to as AWS up (53 • 2 36 N, 120 • 26 18 W, 2105 m a.s.l.) and AWS low (53 • 3 45 N, 120 • 26 4 W, 1803 m a.s.l.), respectively.At AWS up , the mean annual air temperature over 2007-2010 was −2.6 • C, while summer (June-August) mean air temperature was 6.6 • C.Over the same period, precipitation during summer (June-August) averages 94 mm at AWS up , although rainfall is likely underestimated due to gauge undercatch at the exposed ridge site, where mean monthly wind speeds often exceed 5 m s −1 (Déry et al., 2010).The lower part of the glacier (< 2100 m a.s.l.), where the AWS glac is located, is gently sloping with an approximate mean gradient of 7 • .
The AWS glac recorded data over a 12-day period (1-12 August) in 2010 and a 29-day period (21 August-18 September) in 2012.A Campbell Scientific CR1000 data logger scanned meteorological data with a sampling frequency of 10 s and averaged these measurements every 10 min.A Rotronic HC-S3 T/RH sensor measured temperature and relative humidity at a height z t = 1.7 m above the ice surface, and a RM Young 01503 wind anemometer monitored wind speed and direction at z v = 2.0 m (2010 sea- son) and z v = 1.9 m (2012 season).Kipp & Zonen sensors (CMP6, CM3, CGR3, and NR-Lite) sensed radiation fluxes (incoming and reflected shortwave, incoming longwave, and net radiation).We also respectively recorded surface air pressure and liquid precipitation at the AWS up and AWS low , and we linearly interpolated the measured surface air pressure to obtain surface air pressure at AWS glac .The meteorological sensors at AWS glac , mounted on a floating tripod (Fig. 1), maintained a constant measurement height above the surface over the observational period.We also installed a Campbell Scientific open-path eddy-covariance (OPEC) system and sonic ranger (SR50) on a separate tower drilled into the ice about 3 m upglacier of the AWS glac (Fig. 1) (Jarosch et al., 2011).The OPEC system included a sonic anemometer (CSAT-3) and a krypton hygrometer (KH2O) mounted with an initial height of 1.86 m (2010 season) and 2.0 m (2012 season) above the surface.The data logger recorded threedimensional (3-D) wind speed, sonic temperature (corrected for water vapor according to Scotanus et al., 1983) and vapor density fluctuations at 20 Hz.Vapor density fluctuations were later converted to specific humidity fluctuations.

Eddy-covariance data processing
We applied a series of data processing steps (Burba, 2013) that are part of the EddyPro software used to process the OPEC data and calculate the turbulent fluxes.The EddyPro processing output yields turbulence statistics and calculated fluxes for each 30 min segment of raw data.The height of OPEC sensors relative to the ice surface changed due to the surface melting throughout the observational period: the net change in 2010 was 0.7 m, whereas this change equaled 0.75 m in 2012.We applied the following corrections to the raw OPEC data to correct for differences in the height of the sensors above the ice surface, but none of them yielded substantial (> 1 %) differences in 30 min averaged turbulent fluxes.At the beginning of each observational season we leveled the instruments to horizontal and observed negligible tilt at the end of each observational period.For both seasons, we set the CSAT-3 so that the sensor's arm formed a 90 • angle to the prevailing down-glacier wind direction to minimize flow distortion due to air flow through the station structure.Before calculating the turbulence statistics, we applied a coordinate rotation to the velocity data to align the streamlines into the mean flow using a planar fit method (Wilczak et al., 2001).This method sets the vertical axis of the CSAT-3 perpendicular to a hypothetical plane produced from averaged wind measurements during the observation period.Because the glacier surface changes with time, we applied sensitivity tests to assess whether the alignment of the plane needed adjustment and whether a different coordinate rotation method (e.g., double-or triple-rotation scheme) produced different output.We noted no substantial changes, i.e., no difference > 1 %, in the 30 min turbulent fluxes during these tests, however.The latent-heat flux data were corrected for oxygen absorption by the KH2O (Tanner et al., 1993) and for fluctuations in the water vapor density induced by high-frequency changes in air temperature (Webb-Pearman-Leuning correction;Webb et al., 1980).Finally, we applied a series of standardized processing steps consisting of corrections for potentially high-and low-frequency loss, and sensor separation (Ibrom et al., 2007;Moncrieff et al., 2004;Horst and Lenschow, 2009).

The algorithm and quality control for roughness lengths
The covariances of the 3-D wind velocities, temperature, and specific humidity, derived by the OPEC system for each 30 min data segment, relate to the friction velocity (u * ), temperature scale (θ * ), and specific humidity scale (q * ) using where u and v are fluctuations of the horizontal wind component around their 30 min mean values, w is fluctuation of the vertical wind component, and T and q represent temperature and specific humidity fluctuations.We also used the OPEC system to indirectly measure the Obukhov length (L): where T v is the 30 min averaged virtual temperature (K), g is the acceleration due to gravity (9.81 m s −2 ), and κ (0.40) is the von Kármán constant.Following Cullen et al. (2007), the turbulent scales and L were then used to derive roughness lengths for each 30 min data segment: where U z v , T z t , and q z t are 30 min averages for the wind speed (m s −1 ), air temperature ( • C), and specific humidity (kg kg −1 ), respectively, determined from AWS glac measurements at the senors' heights (z v,t ).The surface temperature (T 0 ) was assumed to be at melting point (0 • C) and its vapor pressure at saturation (6.13 hPa) since we did not measure these variables.To reduce errors in our study, we only use data for which T z t > 1 • C. Our T 0 = 0 • C assumption has been found to be more accurate than estimating the surface temperature from the longwave-radiation measurements (Fairall et al., 1998) or from a SEB closure (Hock, 2005).Nevertheless, when the surface is not consistently melting, SEB closure can give much better results than the assumption of the melting surface (e.g., Conway and Cullen, 2013).Estimating T 0 from our radiation data proved to be unreliable because of the poor accuracy of the NR-Lite net radiometer.As part of our uncertainty analysis, we quantify errors in our results due to the T 0 = 0 • C assumption.We converted vapor pressure (e) to specific humidity using q = 0.622 e p , where p is observed air pressure (hPa).v z v L , t z t L , and q z t L (where t z t L = q z t L ) are integrated forms of universal functions based on M-O theory.Many such functions have been developed, and the most widely used is the Dyer-Businger flux-profile relationship (Businger et al., 1971;Dyer, 1974).Following Conway and Cullen (2013), for stable stratification L > 0 we use the expressions of Holtslag and de Bruin (1988) in the following forms: where a = 1, b = 2 3 , c = 5, and d = 0.35.For unstable stratification z v,t L < 0 , which was rarely present during our observational periods, we use the expressions of Dyer (1974).
Turbulence flux data from OPEC systems often contain spurious measurements.The roughness lengths derived from Eqs. ( 5) to ( 7) widely vary because they rely on several mean and turbulence variables.One way to reduce this scatter is to limit analysis on high-quality data (Andreas et al., 2010).In this study we applied a series of filters to the 30 min data segments (points) obtained during the study period following the approach of Conway and Cullen (2013) and Li et al. (2016).Our filtering algorithm proceeds as follows: a. "Basic" filter for z 0t and z 0v : remove the points for which < 0 and q z t −q 0 q * < 0, which return unrealistically large roughness lengths since the exponents in Eqs. ( 5)-( 7) become much larger than unity.One of the main reasons negative values arise is that the bulk approach (finite differences) is a first-order approximation of the gradient-flux approach, and the surface values for T 0 and q 0 are assumed and kept constant in time rather than measured directly for each 30 min segment.
sector around the glacier centerline to minimize sensor arm interference and ensure the longest on-glacier fetch.
e. "Wind speed" and "u * " filters: use runs for U z v > 3 m s −1 and u * > 0.1 m s −1 as errors in deriving the roughness lengths become comparatively large for low wind speeds and small friction velocities.In addition, accounting for a relatively strong air flow (U z v > 3 m s −1 ) helps reduce potential errors in T z t due to radiative heating of temperature sensor during periods of strong solar radiation (Huwald et al., 2009).
f. "Temperature gradient" filter for z 0t : to guarantee that a sufficiently large temperature gradient is detected between the surface and measurement height, only use data for which T z t > 1 • C. As mentioned earlier, this filter also assures that the assumption of melting surface holds.
g. "Moisture gradient" filter for z 0q : to ensure for sufficiently large moisture gradients between the surface and measurement height, select data that satisfy |e z t − e 0 | > 0.66 hPa, corresponding to a difference between T z t = 0 and h. "Small values" filter applied to z 0t and z 0q : following Andreas et al. (2010) we assumed that the surface exchange of heat and moisture cannot occur at scales smaller than 10 −7 m.This scale is an approximate mean free path of air molecules at sea level.Considering that this is the minimal scale at which molecular diffusion takes place at the surface, z 0t and z 0q smaller than this are treated as unrealistic (i.e., eddy diffusivity can take place only at larger scales).
i. "Large values" filter: no study of turbulent fluxes over glaciers determined roughness lengths larger than 1 m.We thus consider any values of roughness length that exceed 1 m as spurious.
We note that it is common to exclude the data during precipitation events as these greatly increase measurement errors associated with the OPEC system.Nevertheless, this filter was redundant in our study because the above filters already removed the points overlapping with precipitation events recorded at AWS low , and because the OPEC system (in particular KH2O) failed to take measurements during most of these events.To assure that the bulk method evaluation is performed on the high-quality measurements, all of the filters above are applied to the OPEC-measured u * and fluxes of both sensible heat (Q H ) and latent heat (Q E ) except the neutrality and wind speed filters.The latter two filters are modified so that all runs with | z v,t L | < 2 are included in the calculation of fluxes, as well as all runs with U z > 1 m s −1 .The upper threshold of z v,t L = 2 is chosen because the universal stability functions for stable stratification are commonly defined up to z L = 2 (Foken, 2008), which represents strongly stratified stable regime.The "measured" 30 min Q H and Q E are derived from OPEC data as where ρ a is the air density (here OPEC-derived), c p is the specific heat of air at constant pressure (1005 J kg −1 K −1 ), and L v is the latent heat of vaporization (2.514 MJ kg −1 ).

Bulk methods
All bulk methods used in this study are rooted in the gradient transport theory or K theory (Stull, 1988), in which the turbulent fluxes of momentum (τ ), sensible heat (Q H ), and latent heat (Q E ) in the surface boundary layer are proportional to the time-averaged gradients of wind speed (U ), potential temperature (θ), and specific humidity (q), expressed by where z is the height above the surface; K M is the eddy viscosity; and P r is the eddy Prandtl number relating the eddy diffusivities for heat (K H ) and vapor exchange (K E ) to eddy viscosity as K M = P r K H and K M = P r K E , respectively.Equation ( 12) also defines the friction velocity u * , which we use as a surrogate for surface shear stress.We note that by convention in SEB on glaciers the heat fluxes are positive (negative) if they transport heat towards (away) from the surface.
We provide an overview of the methodology used to compare and analyze the performance of bulk methods in simulating the turbulent fluxes of momentum, heat, and humidity.By definition, the bulk method parameterizes the turbulent fluxes with the use of mean meteorological variables (e.g., wind speed, near-surface temperature and relative humidity) and the bulk exchange coefficient.In total we evaluate two groups of bulk methods (i.e., the integrated form of Eqs.12-14): (1) methods that use the bulk exchange coefficient (C) as a function of roughness lengths and atmospheric stability, henceforth labeled as C methods, and (2) methods that incorporate simple katabatic wind models in their parameterizations.As part of the C methods we analyze four bulk methods that are most commonly used on glacier surfaces: (1) the C log method, which relates C only to the roughness lengths assuming logarithmic vertical profiles of wind, temperature, and specific humidity; (2) the C Ri b method, which adds stability corrections in C using the bulk Richardson number (Ri b ); (3) the C M−O method, which adds stability corrections in C using the universal stability functions of z v,t L ; and (4) the C SR method, which is the same as C M−O in its expression for C but uses the surface renewal model of Andreas (1987) to estimate the scalar roughness lengths (z 0t and z 0q ).
If not otherwise stated, the roughness lengths are assumed to be constant in time and equal to the mean log value of the OPEC-derived 30 min roughness lengths observed only during the near-neutral conditions and when all the filters are applied to the OPEC data.The group of bulk methods derived from simple katabatic flow models consists of (5) C kat method -following Oerlemans and Grisogono (2002), and (6) K Int method -using an integrated varying eddy viscosity profile (Grisogono and Oerlemans, 2001;Parmhed et al., 2004).Details of each method (1)-( 6) are provided in the sections that follow.The modeled 30 min turbulent fluxes (u * , Q H , Q E ) are compared to the equivalent OPEC-derived fluxes with the use of standard evaluation metrics: root mean square error (RMSE), mean bias error (MBE), and Pearson correlation coefficient (r).The performance of all six methods will be subjected to a set of sensitivity tests to identify key sources of errors.To do so, we will substitute modeled parameter values, such as friction velocity (u * ) and stability parameter z v,t L , with their equivalent OPEC-derived values.We will present results together with the description of sensitivity tests to facilitate readability.

C log method
According to the mixing-length theory (Prandtl, 1934) for a neutral surface layer, eddy viscosity can be parametrized as (Stull, 1988) Inserting these parameterizations in the gradient-flux relations (Eqs.12-14); treating u * , Q H , and Q E as constants; and integrating the equations with respect to height yield the bulk aerodynamic expressions yields where C 2 v ≡ C D is a dimensionless exchange coefficient ("drag coefficient") for momentum flux, and C v C t ≡ C H and C v C q ≡ C E are dimensionless exchange coefficients for the sensible-and latent-heat flux, respectively.Alternatively, one can rewrite Eqs. ( 17) and ( 18) using u * as Wind speed (U z v ), air temperature (T z t ), and water vapor pressure (e z t ) represent the time-averaged values at their measurement heights z v,t with zero subscript in these variables denoting their surface values (note that z t = z q ).Air density at AWS glac (ρ a ) is derived as the air density at standard sea-level pressure (ρ 0 = 1.29 kg m −3 at 0 • C) multiplied by the ratio between the air pressure estimated at AWS glac (p) and the standard sea-level pressure (p 0 ; 1013 hPa).The dimensionless exchange coefficients for the neutral atmosphere assume a logarithmic profile of wind speed, temperature, and humidity with height, and they take the following form: C q,log = κ

C Ri b method
This method assumes the same parameterization for K M,H,E (Eq.15) as the C log method, but it allows a flux reduction in a stable stratification via the bulk Richardson number (Ri b ), defined as where temperature is given in Kelvin.For stable conditions (0 < Ri b < 0.2), which prevail over a melting glacier surface, the coefficient C for the property y ( v, t, or q) is (Webb, 1970) Starting from the parameterization for neutral conditions (Eq.15) but allowing K M,H,E to vary in response to atmospheric stability (generally, K for statically unstable surface layer > K for neutral > K for statically stable), K M can be expressed as www.the-cryosphere.net/11/2897/2017/The Cryosphere, 11, 2897-2918, 2017 where ψ v is a dimensionless shear and L is Obukhov length.Using this approach for K and integrating the flux-gradient equations, the expression for C (for the property y) becomes We use the integrated form of M-O stability functions from Holtslag and de Bruin (1988) and Dyer (1974) as expressed earlier in Sect.2.3.The calculation of requires an estimate of L, which requires an estimate of Q H and u * .This method thus becomes a root-finding problem for the three coupled equations, which is solved by either Newton's method or a fixed-point iteration method.We chose the latter since the method is shown to successfully work for this set of equations (e.g., Berkowicz and Prahm, 1982;Lee, 1986) and has been widely used in glacier studies (e.g., Munro, 1989;Conway and Cullen, 2013).In this iterative method each 30 min Q H and u * are initially derived assuming the neutral case z v,t L = 0 .This approach allows an estimate of L and , which in the next iteration yields a new estimate for Q H and u * .These steps are repeated until no significant change in Q H occurs, i.e., until the difference becomes < ±1 W m −2 , a condition which is satisfied within the first five iterations (Munro, 1989).

C SR method
In all the parameterizations for C we employ a constant value for roughness lengths derived as the mean log value from the 30 min values (Eqs.5-7) calculated for the neutral conditions only.An alternative approach in deriving the scalar roughness lengths (z 0s ≡ z 0t,0q ) is through the surface renewal model by Andreas (1987).The model uses simple similarity arguments considering the structure of the viscous sublayer to derive z 0s from known z 0v as a function of Reynolds roughness number (Re * = u * z 0v ν , where ν is the kinematic viscosity of air, equal to 1.46 × 10 −5 m 2 s −1 ): where b 0,1,2 are the polynomial empirical coefficients.This model has been successfully tested for relatively smooth snow and ice surfaces, while for a rough, hummocky glacier surface Smeets and van den Broeke (2008) found that somewhat different values for the polynomial coefficients give better performance.

C kat method
Rather than integrating the flux-gradient relations with a chosen parametrization for K, Oerlemans and Grisogono (2002) derived a bulk approach for surface heat fluxes through a simplified scaling of the governing equations for heat and momentum balance in a 1-D katabatic flow model (Prandtl, 1942).Their basic assumption was that the katabatic flow is characterized by a well-defined wind maximum, setting the exchange coefficient for heat proportional to the maximum wind speed and to the height of the wind maximum.With this scaling approach they determined a "katabatic bulk exchange parameter" for heat flux (C kat , in m s −1 ) as where k and k 2 are dimensionless empirical constants, s is the temperature deficit at the glacier surface (negative value; θ (0) = s , θ (z → ∞) = 0), and γ is the background potential temperature lapse rate.Defining the exchange coefficient in this way makes the sensible-heat flux increase quadratically with the surface temperature deficit: s can be replaced by 2 m air-surface temperature difference ( T ), but it only yields a crude estimate of the actual temperature deficit.The latent-heat flux is similarly expressed as Neither Q H nor Q E explicitly depend on u * because the momentum flux and wind speed are obtained from the katabatic flow model in which the eddy viscosity is parametrized through the background variables ( s and γ ).C kat becomes smaller as the atmospheric boundary layer becomes more stable (larger γ ).As noted in Oerlemans and Grisogono (2002), this is a dynamic effect, which should not be confused with the effect of stratification on the exchange coefficients as predicted by the M-O theory.
Our data are not suitable to adequately test this bulk method since we do not have the vertical profile observations of potential temperature needed to estimate s and γ .Nevertheless, the data are sufficient to investigate the validity of the quadratic relation between Q H and T .Assuming that the quadratic fit holds, we optimize C kat for each season independently so that the modeled Q H gives the best fit to the observed Q H .Although this optimization can be performed in multiple different ways, we chose to set all the parameters constant except the one used for model tuning.The values of constant parameters are taken from the field data from a 10 km long mountain glacier in the Austrian Alps (PASTEX-94; Greuell and Struijk, 1994): k 2 = 1, γ = 0.005 K m −1 , P r = 2, and T 0 = 273 K.The chosen optimization parameter k is then evaluated by minimizing the RMSE between modeled and observed 30 min Q H for each season separately, using the air-surface temperature difference (T z t −T 0 ) as a substitute for s .In addition to this optimization, we investigate whether C kat can be expressed as an empirical function of a proxy variable for the background lapse variables ( s and γ ).We make use of near-surface air temperature measurements from the two stations in the glacier vicinity (AWS up and AWS low ) and assume that the proxy variable is the difference between on-glacier (AWS glac ) and off-glacier temperature.We then regress 30 min C kat values, obtained from OPEC-derived Q H and measured T (Eq.30), against the proxy variable.

K Int method
Similarly to the C kat method, this approach assumes a dominant katabatic flow throughout the observational period.Following Grisogono and Oerlemans (2001) and Parmhed et al. (2004) we use the assumed linear-Gaussian profile for eddy viscosity K(z) as where K max is the maximum value of K(z) (in m 2 s −1 ) reached at the height H K (in m) above the surface.Near the surface, this parameterization for K(z) is similar to O'Brien's cubic polynomial approximation applicable to the neutral and stable boundary layer (O'Brien, 1970;Stull, 1988).In the original model of Grisogono and Oerlemans (2001), K max and H K are related to the background variables (γ , s ), slope of the surface, and P r.Inserting the parameterizations for K(z) into the bulk-gradient relations (Eqs.12-13) and integrating them with respect to z (treating K max and H K independent of z), we obtain the following bulk expressions for the turbulent fluxes: where which is valid for z v < H K .
To adequately determine K max and H K , detailed observations of the wind speed and temperature profiles with height are required.In the absence of these observations, and similarly to the C kat method, we derive the parameter values through the optimization method that incorporates dependency of the two parameters on the proxy variable for s and γ (i.e., the difference between on-glacier and off-glacier temperature).Since these two parameters (K max and H K ) are interrelated (Parmhed et al., 2004), it is sufficient to optimize one parameter while keeping the other constant.We focus therefore only on optimizing K max while keeping H K constant.We set H K = 20 m as a mean observed value from a field study on an Icelandic glacier (Parmhed et al., 2004) as this was the only study we found that directly measured the two parameters.The optimized K max value for each 30 min segment is estimated from Eq. (33) using 30 min OPECderived u * and the mean log value of OPEC-derived z 0v (from neutral conditions only).The final step is to regress the optimized parameter against the proxy variable.The regression is performed on the 2012 data with identified katabatic flow conditions only, while the 2010 data are used to validate the regression model.

Uncertainty analysis
For each 30 min roughness length estimate (Eqs.5-7) we calculate its error through the error propagation method for a multi-variable function (e.g., Bevington, 1969).The measurement error for wind speed (δU z v ) and assumed errors for air-surface temperature difference (δ T ) and air-surface difference in water vapor pressure (δ e) are propagated (retaining the linear terms in Taylor expansion) into the final error for the roughness length.The following error values are used: δU z v = 0.11 m s −1 , δ T = 0.5 • C, and δ T = 0.23 hPa.
We quantify uncertainties in the modeled 30 min fluxes due to (1) the uncertainty in the roughness lengths; (2) the assumption that the surface temperature is at melting point; and (3) the systematic error in air temperature due to radiative heating of the temperature sensor.To estimate errors in (1), for each modeled 30 min flux, we use a Monte Carlo approach so that the bulk method is run 1000 times with randomly perturbed roughness length values.The roughness length values are randomly picked from a derived normal distribution of their log values, using the OPEC-derived mean log value and standard deviation (applying all the filters from Sect.2.3) as the distribution parameters.The error in each 30 min modeled flux is represented by a standard deviation of the 1000 ensemble runs.The same Monte Carlo approach is used to quantify the uncertainty in (2).For each Monte Carlo run, we randomly prescribe T 0 , assuming that T 0 has a normal distribution with a mean of 0 • C and a standard deviation of 0.5 • C. Total errors in the modeled fluxes due to (1) and ( 2) are calculated as the root mean sum of all individual 30 min errors and are labeled as δQ z 0 and δQ T 0 , respectively.Finally, to assess errors in (3), we make use of the air temperature from the sonic anemometer (T sa ), where susceptibility to radiative overheating is minimal.The values of T z t are subtracted from T sa , giving T bias , which during nighttime periods (no solar radiation) is calculated to be 2 • C. The nighttime bias is removed from T bias , which is then regressed against the incoming shortwave radiation (K down ) within a www.the-cryosphere.net/11/2897/2017/The Cryosphere, 11, 2897-2918, 2017 set of given ranges of measured wind speed (see below).The rationale here is that, for a given K down , the higher the wind speed, the more efficient the natural ventilation of the sensor, leading to smaller T bias (e.g., Harrison, 2010).The linear regression, applied on the 2012 data, yields the following relationship between T bias and K down : T bias = 0.0026 K down + 0.035, for 1 < U z v < 2, T bias = 0.0018 K down + 0.049, for 2 < U z v < 3, (37) T bias = 0.0010 K down + 0.063, for 3 < U z v , where K down is in watts per square meter (W m −2 ) and U z v is in meters per second (m s −1 ).Corrected air temperatures are then calculated by subtracting T bias from T z t .Differences between the modeled fluxes, derived by the bulk methods with and without corrected temperature, represent the 30 min errors, while their root mean sum gives the total error over the observational period (δQ T bias ).

Estimates of the roughness lengths
We apply the filters discussed in Sect.2.3 to the OPEC data to obtain the roughness lengths using the Eqs.( 5)-( 7).For the 12-day observational period in 2010 there are 515 unfiltered measurements for z 0v,0t and 307 for z 0q .The fewer initial data points for z 0q arises from high sensitivity of the KH2O sensor to the buildup of droplets on the sensor window during precipitation events.During these events, the KH2O sensor produced spurious values or failed to record the data, while CSAT-3 recorded observations, albeit with questionable quality.After filtering, usable data are reduced by 98 %, giving nine high-quality estimates for z 0v,0t but none for z 0q (Fig. 2).Similarly, the initial number of data points in the 29day observational period in 2012 is 1161 for z 0v,0t and 621 for z 0q , while the post-filtering number of data points is 30 for z 0v , 25 for z 0t , and 6 for z 0q .
Among all the filters, the neutrality filter | z v,t L | < 0.1 eliminates most of the data points, followed by the stationarity filter.Despite the significant loss of viable data, these filters are essential to achieve accurate determination of roughness lengths with the bulk method.OPEC-derived fluxes are assumed valid only during the steady-state or stationary conditions (i.e., in the absence of intermittent turbulence and/or gravity waves).For the near-neutral conditions, the choice of the stability correction functions has negligible effect on the calculated fluxes (the correction function approaches zero as | z v,t L | approaches zero).Furthermore, the near-neutral stability conditions are more commonly observed during overcast and windy conditions rather than during clear-sky conditions with prevailing katabatic flows.This fact minimizes the chance of selecting 30 min intervals with katabatic flows for which the bulk method, used to determine the roughness lengths, might be deficient.
We determine the time windows of prevailing katabatic flows (shaded areas in Fig. 3) by identifying longer periods (several hours to days) with almost stationary downglacier wind direction (200 • for our site) and relatively high mean wind speeds (> 3 m s −1 ), preferably reaching maximum 30 min mean values > 5 m s −1 .With this rough identification criterion we may miss some shorter time intervals with katabatic flows, but a more restrictive approach reduces the uncertainty that originates from the selection of katabatic flow periods in our further analysis.The identified prevailing katabatic periods (Fig. 3) overlap with clear-sky conditions and with relatively high near-surface air temperatures (T z t ), as is commonly expected for katabatic flows at glacier surfaces (e.g., Oerlemans et al., 1999).
Each roughness length estimate has its respective error (error bars in Fig. 3 derived from the error propagation; Sect.2.5).To provide the mean value for roughness lengths during neutral conditions, we used the average of their logged values to account for the measurements' log-normal distribution.To incorporate the individual error of each estimate into the mean value, we derive a weighted mean roughness length, where the weights are inversely proportional to the individual errors.The results (Table 1) show that the weighted mean values do not substantially differ from the mean values, and we thus use the non-weighted mean values in the remaining analysis.For both seasons, our results show that mean log z 0v is of the order of 10 −3 m and 2 orders of magnitude larger than mean log z 0t .The overall error of these mean values, expressed as 1σ (± a standard deviation), shows uncertainty of 1 order of magnitude around the mean value (Table 1).To determine z 0q for the 2010 season, which had no high-quality measurements for z 0q , we thus assume z 0q ≈ z 0t since the equality holds for the mean log values in the 2012 season.Table 1.Mean and ± standard deviation (σ ) of log roughness lengths of momentum (z 0v ), temperature (z 0t ), and humidity (z 0q ) derived for near-neutral stability conditions | z L | < 0.1 during 2010 and 2012 observational periods (units are in m).Also shown is weighted mean ± standard deviation (σ w ; see text) and number of points (N ) used to assess these values, i.e., number of points remaining after all the filters have been applied to the original data.

Intercomparison of bulk methods
We compare 30 min turbulent fluxes of momentum, sensible heat, and latent heat, derived from the OPEC data (observed fluxes) with the modeled fluxes from each bulk scheme.To ensure that the evaluation is performed on high-quality data, only the 30 min fluxes that pass the filters in Sect.2.3 are used, albeit without the neutrality filter and allowing for U z > 1 m s −1 (Fig. 2).This filtering yielded 239, 222, and 77 values for u * , Q H , and Q E , respectively, in 2010, while for the 2012 season we obtained 393, 353, and 83 values, respectively.In all methods, if not otherwise mentioned, we assume P r = 1, i.e., Modeled fluxes for the 2012 season (y axis in Fig. 4) are the mean values from the ensemble of Monte Carlo runs with the randomized roughness lengths, whereas the error bars are represented with ± 1 standard deviation (σ ) of the ensemble.All C methods poorly simulate u * (r = 0.25), with substantial overestimation during katabatic flow, and underestimate u * during non-katabatic conditions and relatively low wind speeds (U z < 3 m s −1 ).This type of dependency of the bias www.the-cryosphere.net/11/2897/2017/The Cryosphere, 11, 2897-2918, 2017  (modeled minus observed u * ) on the wind speed is present in each C method (Fig. 5).The K Int method provides the best simulation of u * (r = 0.60), while the bias dependence on the wind speed is less pronounced than in the C methods (Fig. 5).
Good performance of K Int method is expected since the parameter K max is optimized to maximize the match between modeled and observed u * .All C methods simulate Q H and Q E equally well (r ≈ 0.77) despite their equally poor per- The Cryosphere, 11, 2897Cryosphere, 11, -2918Cryosphere, 11, , 2017 www.the-cryosphere.net/11/2897/2017/formance in simulating u * , while the K Int method performs poorly (r = 0.15) for Q H .The C kat method performs best in simulating Q H since its parameter has been optimized with the observed Q H .The scatterplots for the 2010 season (not shown) are similar to those for the 2012 season, while the correlation values are smaller in 2010 due to smaller sampling period (Table 2).The performance of the C SR and C M−O methods accord because the OPEC-derived values for scalar roughness lengths (near-neutral conditions only) largely agree with the estimated values from the surface renewal model (Fig. 6).Large roughness Reynolds numbers (Re * ) estimated for our site indicate a rough flow regime for which the surface renewal model (Andreas, 1987) predicts a scalar roughness length 2 orders of magnitude smaller than z 0v .This prediction is similar to our observed values, especially when the mean log values of roughness lengths rather than their individual 30 min estimates are used.Despite the large scatter of 30 min OPEC-derived z 0t z 0v values around the predicted values from the surface renewal model, more than 50 % of the observed values fall between the predictions of Andreas (1987) and those of Smeets and van den Broeke (2008) expected for hummocky ice with z 0v > 1 × 10 −3 m (Fig. 6).
Intercomparison of the C methods (Fig. 4) reveals that the performance of the C M−O method is similar to that of the C log method.This similarity arises in part because the M-O stability parameter z v,t L , calculated with the fixed-point iterative scheme of Munro (1989), underestimates the stability during katabatic conditions and overestimates it during conditions with low wind speeds (Fig. 7).The stability corrections that depend on the fluxes during the katabatic conditions, while the fluxes are suppressed during the non-katabatic conditions.We also found no correlation between the 30 min OPEC-derived and any of the mean meteorological variables (e.g., temperature, wind speed; Fig. 7), which likely explains the failure of the fixed-point iterative scheme that relies on these dependencies.The poor performance of the stability corrections in the C Ri b method also follows from the lack of correlation between Ri b and the OPEC-derived L (Fig. 7).The lack of a relation between Ri b and z v,t L disagrees with the proposed stratification-dependent conversion of z v,t L into Ri b as presented in Arya (2001).Defining Ri b through U z and T (Eq.24) assumes that the mean variables are a good proxy for the local stability: during the near-neutral conditions, ideally characterized by overcast weather with strong winds and relatively low T , Ri b should be close to zero.On a sloped glacier surface, however, strong winds commonly occur during clear-sky conditions with a katabatic flow (Fig. 7).Ri b close to zero is thus more indicative of strong katabatic flow than neutral stability.

C methods
In the gradient-flux relation, the eddy viscosity is parameterized as a function of z, u * , and M-O stability parameter z L .Because u * and z L , in the C methods, are estimated rather than directly measured, any error in these modeled values can propagate into the flux estimates.Our goal in this section is to investigate the influence of the two variables, u * and z v,t L , on the bulk method performance.To do so, we estimate the turbulent fluxes from each of the four bulk schemes using the OPEC-derived u * and Obukhov length (L).Henceforth the notation u * and/or z L at the end of each method's name indicates the use of OPEC-derived variables (e.g., In each scheme we apply mean log roughness lengths (Table 1).
The following results are found for both the 2010 and 2012 seasons: C log u * and C SR u * performed best (lowest RMSE) in simulating the heat fluxes among the C methods with measured u * and/or z v,t L (Table 2).The variance error (VE, scatter of residuals around 1 : 1 line) in the C log u * method is also smaller than in the C M−O z L method (Fig. 8).Both results suggest that, in order to better simulate Q H , correct estimates of u * are more important than accurately estimating z v L .The C log u * method outperforms any C M−O method, even with OPEC-derived u * and z L .The results for Q E agree with the results for Q H .Despite Q E being less affected by the M-O stability corrections than Q H , the modeled latentheat fluxes are closer to those observed if the OPEC-derived z v L are used.While the original C methods overestimate Q H during the katabatic conditions, the C methods with stability corrections and measured u * underestimate the fluxes (Table 2 and Fig. 8).
Overestimation of Q H during katabatic conditions has also been shown by Denby and Greuell (2000) and explained by a failure of M-O theory in the presence of a shallow katabatic wind speed maximum.At the wind speed maximum, measured u * approaches zero, while the C method assumes constant momentum flux in the surface layer and therefore overestimates u * .The overestimation is less pronounced for Q H than for u * because the reduced turbulence at the wind speed maximum leads to an increase in the air-surface temperature difference and consequently to an increase in the measured Q H .When measured u * is used in the C M−O or C Ri b method, however, the C method underestimates Q H since the air-surface temperature difference alone cannot compensate for the effect of reduced momentum flux.To correct for this bias in Q H , P r would need to decrease (i.e., the C method would need to account for more effective eddy diffusivity than eddy viscosity at the given height).In the absence of wind profile measurements, we can only assume that these effects take place at our site, but we have no observational   evidence for the presence of the wind speed maximum.By plotting P r, derived from Eq. ( 19) using OPEC-derived Q H and u * , against 1/U z , we show that P r < 1 during katabatic conditions with high wind speeds (Fig. 9), yielding an average P r = 0.79 during the prevailing katabatic flow.Due to a self-correlation between P r and M-O stability parameter, we do not assess P r dependence on OPEC-derived z v L ; however, several studies have shown that, as the local stability increases, Prandtl number deceases, leading to P r < 1 for z L > 1 (Grachev et al., 2007).Setting P r = 0.79 in the C M−O method with measured u * (labeled as C M−O u * P r) during the katabatic conditions only improves the performance of the C method in simulating Q H (Fig. 8).

C kat method
The optimization of the k parameter yields k = 4.00 × 10 −4 and k = 5.04 × 10 −4 for the 2010 and 2012 seasons, respectively.When this optimized C kat is used to derive sensibleand latent-heat fluxes (Eqs.30 and 31), the modeled fluxes simulate the observed ones better (smaller RMSE and MBE) than any of the C methods (1)-(4) (Table 2).There is a quadratic fit between OPEC-derived Q H and T (Fig. 10), which explains 61 % of the variance in the 2012 observations and 19 % of variance in 2010 observations.Our results do not guarantee the validity of Eq. ( 30) but do demonstrate that the simple katabatic model of Oerlemans and Grisogono (2002) reproduces the basic characteristics of katabatic flow at our site.This model, when calibrated, can simulate the turbulent heat exchange more successfully than a common bulk method (C method).
In the original model (Oerlemans and Grisogono, 2002), C kat depends on the environmental lapse rate, such that C kat decreases as the static stability increases.Our proxy for stability is the difference between on-glacier (AWS glac ) and off-glacier temperature (AWS up or AWS low ).We assess the standard and log-log correlation between this proxy variable and the "observed" parameter k, whose values are derived from OPEC-derived Q H and measured T (Eq.30) using the same parameter values in Eq. ( 29) as in the prior optimization method.The correlation analysis, however, has not produced any statistically significant correlations at the 95 % confidence level.

K Int method
The relation between the optimized 30 min K max and the proxy variable is strongly nonlinear (Fig. 11).The best-fitting model (r = 0.61, p value < 0.01) that relates K max to the difference in air temperature at AWS up and AWS glac yields the following power-law relation: where the units for K max are square meters per second (m 2 s −1 ).Note that this empirical fit is assumed valid within the range of commonly observed values for K max taken from Parmhed et al. ( 2004): 0.03 < K max < 3 m 2 s −1 .The powerlaw model is calibrated on the 2012 data only (Fig. 11), while the 2010 data are used for the model validation.The validation yields statistically significant correlation coefficients between the 30 min modeled and observed K max values (r = 0.42, p value < 0.01).This result indicates that, for our study site, the same empirical model is applicable to both field seasons (Fig. 11).
The modeled u * derived from the K Int z L method (Eq.33) successfully captures two features observed in the data (Fig. 11): (i) a nonlinear relation between friction velocity and wind speed and (ii) a functional dependence of this nonlinear relation on the proxy variable for the environmental static stability.For comparison, we also show modeled u * vs. U z v , where the C log method is used to determine the friction velocity (Eq.16).For the near-neutral conditions | z v L | < 0.1 , the C log method performs well because of the near-linear dependence between the OPEC-derived u * and wind speed.As stability increases, however, the linearity is replaced by a relation that more closely resembles u * ∝ U z v , where the strength of the proportionality depends on the environmental static stability, approximated here with the off-glacier and on-glacier temperature difference (T up − T z ; Fig. 11).
As already shown (Fig. 4), the K Int method gives the best estimate of the friction velocity and the poorest estimate of Q H among the bulk methods.This performance pattern indicates that K M and K H do not necessarily share the same parametrization.Setting P r = 0.79 for katabatic conditions, the K Int method performance in simulating Q H improves (Table 2).As our final test, we combine the bestperforming C method with the K Int method to investigate whether the top-performance methods in simulating Q H and u * , respectively, when combined, give the best performance overall.This variant of the bulk method calculates the friction velocity from the K Int method (Eqs.33 and 36), which is then inserted in the C log method's equations for Q H and Q E (Eqs.19 and 20) with P r = 1.The results show that this C log K Int "hybrid" method indeed outperforms all the methods in the study (Table 2 and Fig. 8).

Error analysis
We summarize uncertainties between modeled and observed fluxes for the six bulk methods used in the study (four types of C methods and two types of methods based on the katabatic models) including the hybrid C log K Int method (Fig. 12).Error estimates are shown for the 2012 season only and agree well with the estimates for the 2010 season.The differences in model vs. measured fluxes are presented as a mean square error (MSE, which is equal to RMSE 2 from Table 2) decomposed into a squared MBE and a VE, satisfying MSE = MBE 2 + VE.We also show the total errors due to (1) the uncertainty in the roughness lengths (δQ z 0 ); (2) the assumption that the surface temperature is at melting point (δQ T 0 ); and (3) the systematic error in air temperature due to radiative heating of the temperature sensor δQ T bias .The C log K Int hybrid method has the lowest MSE among all bulk methods (Fig. 12).Across the methods, the total error (MSE) is more dominated by the contribution from the VE The Cryosphere, 11, 2897-2918, 2017 Figure 12. (a-c) Mean square error (MSE), squared mean bias error (MBE 2 ), and variance error (VE) for the six main bulk methods and the "hybrid" C log K Int bulk method, for the 2012 dataset.Note that MSE = MBE 2 + VE.(d) Error in Q H for each bulk method is due to (1) the uncertainty in the roughness lengths δQ z 0 ; (2) the assumption that the surface temperature is at melting point δQ T 0 ; and (3) the systematic error in air temperature due to radiative heating of the temperature sensor δQ T bias .
than by the mean bias error.Modeled Q H with the C log and C M−O method has the largest sensitivity to perturbations in roughness lengths (largest δQ z 0 ), likely because the uncertainties in both z 0v and z 0t propagate into the modeled z v,t L and then, via the fixed-point iterative method, into estimated Q H . Errors due to perturbations in T 0 (δQ T 0 ) and bias in T z (δQ T bias ) are small among all of the methods except the K Int method.Most importantly, our analysis reveals that the three sources of errors in modeled Q H (C methods) are all smaller that the RMSE (Table 2) between modeled and observed values.In other words, the "internal" model errors are smaller that the bias in the model performance when compared to the measured fluxes.In short, further advances will require improvement to model theory rather than better measurements of input variables.

Discussion
We discuss our findings in the order that we introduced uncertainties in the bulk methods, particularly in simulating Q H on sub-daily scales.Uncertainties originate from (1) assumed rather than measured surface temperature, (2) estimation of roughness lengths, (3) representation of stability corrections in the bulk methods, and (4) an absence of a successful model applicable during very stable conditions and prevailing katabatic flow on sloped glacier surfaces.
1.In the absence of reliable measurements of surface temperature, T 0 , we assumed the glacier surface to be at the melting point throughout the observational period.
We quantified the uncertainty of this assumption using Monte Carlo sampling.Although significant for the 30 min Q H estimates, model error (δQ T 0 ) is negligible over the whole observational period (Fig. 12).This error estimate does not account for any systematic biases in the surface temperature that might occur throughout the observational period (e.g., overnight cooling or refreezing of the surface; changes in surface temperature due to a debris cover or a formation of water channel at the study site).As long as surface temperature fluctuations are small, random, and centered at 0 • C, assuming the glacier surface is at the melting point does not cause any substantial error in the simulated fluxes over the entire observational period.
2. To estimate roughness lengths (z 0v , z 0t , z 0q ), we used detailed eddy-covariance measurements subjected to a series of corrections and filters.Since the C M−O method is used to derive roughness lengths, it is crucial to assess the roughness lengths during the conditions when the best performance of this method is expected (i.e., during the near-neutral stability conditions when | z v,t L | < 0.1).These conditions rarely occurred in our study, resulting in significantly fewer data.We also assumed the mean log values of z 0v,0t,0q are representative of the entire observational period.For both seasons, we obtained z 0v that scales to 10 −3 m, which is characteristic for hummocky ice, while our z 0t and z 0q estimates turned out to be 2 orders of magnitude smaller than z 0v .These findings are in agreement with the previous studies on glaciers (e.g., Smeets et al., 1998;Conway and Cullen, 2013), while the latter finding is corroborated with the surface renewal theory of Andreas (1987).Similar estimates for z 0v,0t from the two different summers at the same site give some confidence in our method used to determine roughness lengths.
Taking into account small sample size, the absence of any model for predicting a time-varying roughness lengths, and the lack of any significant changes in the surface properties during the observational period, www.the-cryosphere.net/11/2897/2017/The Cryosphere, 11, 2897-2918, 2017 we used constant values for z 0v,0t,0q throughout each season.The uncertainty in the modeled turbulent fluxes due to potentially time-varying roughness lengths was quantified via Monte Carlo sampling (δQ z 0 ) and was found to be smaller than the RMSE between modeled and observed fluxes.The modeled fluxes are thus poorly simulated due to the poorly defined parameterizations, rather than the poorly constrained roughness lengths.In our Monte Carlo assessment of δQ z 0 , we assumed the errors in 30 min z 0v,0t,0q to be random and small, rather than having a systematic error due to any potential changes in the actual surface roughness (e.g., snowfall; development of a surface drainage system; inhomogeneous surface ablation).The assumed random errors are attributed to instrumental (measurement) error and to the choice of the threshold stability value for the near-neutral conditions (i.e., | z v,t L | < 0.1).We posit that the large variance error (the scatter in the graphs showing 30 min modeled vs. observed turbulent fluxes) arises from the variability in the meteorological mean variables (temperature, wind speed, local stability) rather than the variability in the roughness lengths in time.If, instead of the constant mean log values for z 0v,0t,0q , we used their observed values specific to each 30 min segment with the near-neutral conditions, the resulting overfit would not improve the performance over all the points because (i) the overfitted points represent < 5 % of points in the observation period and (ii) the model performs poorly outside near-neutral conditions.
3. When testing the stability corrections in the bulk methods, we assumed that the main predictor of the timevarying bulk exchange coefficient is the time-varying local stability, not the changes in the surface roughness.
Using the two common parameterizations to account for the stability corrections (C Ri b and C M−O method), we found that neither scheme significantly improves estimates of turbulent fluxes relative to the scheme without the stability corrections (C log method).Our analysis suggests two reasons for this: (i) the empirical stability functions ( v,t,q ) taken from the literature perform poorly on our data and (ii) if not directly measured via the eddy-covariance system, the L metric is impossible to accurately obtain because no functional relationship is found between the metric and the mean meteorological variables (e.g., temperature and wind speed).The latter reason may also explain why measured Ri b could not be expressed as a function of OPEC-derived z v,t L .Our findings corroborate previous experimental and theoretical evidence that the M-O theory is poorly applicable over sloping glacier surfaces (Denby and Greuell, 2000).During katabatic flow, the bulk methods that rely on the M-O stability theory strongly overestimate the momentum flux and less strongly overestimate the heat flux.These biases in modeled fluxes are expected in the presence of low-level wind speed maxima that are characteristic during katabatic flow (Denby and Greuell, 2000).We cannot test this expectation since we lack observations of vertical wind profiles for our study, however.The K int method, based on a simple katabatic model, best simulates the friction velocity among all the methods in the study.This method underperforms any other method in simulating Q H , however, revealing a shortcoming in the assumption that the eddy diffusivity (K H ) and eddy viscosity (K M ) share the same parametrization.One way to improve on this shortcoming, as indicated in our results, would be to introduce a vertically varying eddy Prandtl number, P r(z).
4. Under stable stratification and in the presence of a lowlevel jet (katabatic flow), intermittent turbulence and gravity waves are present, and steady-state conditions do not exist (e.g., Foken, 2008;Axelsen and van Dop, 2009).We removed the data in our study that failed to satisfy the criteria of stationarity (stationarity filter).Assuming that our filtered data sample reflects the steadystate conditions, we found that the K Int method can simulate the 30 min u * better than the C methods.Improved performance is due to the parametrization of K(z) which, being independent of u * , yields the bulk method with u * ∝ √ U z .This function u * (U z ) yields a better fit to the observations than the function u * ∝ U z characteristic for the C methods.Despite the absence of vertical profile measurements to adequately test the model for K(z), our derivation of K max as a function of a proxy variable for environmental static stability (i.e., the difference between off-glacier and on-glacier near-surface air temperature) agrees well with the expected dependencies of this parameter on the temperature deficit at the glacier surface (Parmhed et al., 2004).While our results are promising in terms of the applicability of the K Int approach to determine the friction velocity, it remains unclear how our static stability variable relates to the background potential temperature lapse rate, γ .A possible extension of our work would be to investigate whether a more exact parameterization for K (as a function of z, ∂U ∂z , and ∂T ∂z ) in the surface layer can be determined as a solution to the governing equations (heat and momentum balance) for katabatic flow.

Conclusions
The main objective of the study was to evaluate commonly used bulk methods (C methods) for simulating turbulent heat fluxes at a sloped glacier surface and compare these with approaches based on a simple katabatic flow model.We investigated the stability components of the bulk methods and attempted to improve upon these, assuming site-specific roughness length values are available.In addition to the four C methods (C log , C Ri b , C M−O , and C SR method) with different parameterizations for the bulk exchange coefficient (C), we evaluated two types of bulk methods developed from katabatic flow models (C kat and K Int method; Grisogono and Oerlemans, 2001;Oerlemans and Grisogono, 2002).These six bulk schemes -for simulating 30 min turbulent fluxes of momentum, sensible heat, and latent heatwere evaluated against the equivalent fluxes obtained by an OPEC method.The evaluation was performed at a point scale of a mountain glacier in BC, using one-level meteorological and OPEC observations over multiple days for two melt seasons.The OPEC data were subjected to a set of quality control corrections and filters prior to analysis.Few 30 min data segments met the criteria for the near-neutral static stability and the steady-state turbulence conditions, revealing a challenge in performing this type of analysis at sites dominated by the stable conditions and drainage (katabatic) flows.Despite the small OPEC data sample available to determine surface roughness lengths (z 0v,0t,0q ) and the high-quality 30 min turbulent fluxes, the evaluation results derived from the two independent field seasons accord.A summary of our main findings includes the following points: -Bulk exchange coefficients can be derived from the OPEC-derived roughness lengths for neutral stratification, where the mean log values for z 0v,0t,0q over the entire period are assumed representative of the actual surface roughness and constant in time.The widely used bulk methods facilitated by the standard meteorological observations on glaciers provide a good approximation of the OPEC-derived 30 min turbulent fluxes during the neutral stability conditions.
-The Monin-Obukhov stability functions widely used in glacier studies perform relatively poorly at our site.The C methods, with or without the stability corrections, overestimate the sensible-heat fluxes (Q H ) over the observational period.This overestimation arises from a positive mean bias in the modeled friction velocity (u * ), and the bias increases with the katabatic wind speed.Without an adequate model for u * , the use of the stability corrections for Q H will likely fail.Similar findings derived from a different glacier in this region (Fitzpatrick et al., 2017) suggest the transferability of this finding to other mountain glaciers subject to similar climate.
-If OPEC data are not available, L cannot be successfully predicted from the mean meteorological observations (e.g., wind speed, air-surface temperature difference).The proposed fixed-point iterative method for predicting z v,t L suggested by others (Berkowicz and Prahm, 1982;Lee, 1986;Munro, 1989) fails to provide a good approximation of OPEC-derived z v,t L at our site.
-Contrary to the K-theory predictions (Stull, 1988), we found that the relation between the OPEC-derived 30 min u * and observed wind speed is nonlinear.Our data reveal u * ∝ √ U z , where the constant of proportionality is a nonlinear function of the environmental static stability.
-The K Int method outperforms any other bulk scheme in simulating u * partly because u * ∝ √ U z .The K Int method performs poorly in simulating Q H , however, revealing a shortcoming in the commonly used assumption that eddy viscosity and diffusivity can use the same parametrization.Applying the K Int method to assess u * , which is then used to assess Q H in the C log method, yields the best performance across all the bulk methods.The hybrid method, combining a katabatic model to derive the friction velocity and a simple C method to derive Q H , works the best in simulating the sensible-heat fluxes.
-The bulk exchange coefficient in the K Int method is related to a difference between off-glacier and on-glacier near-surface air temperature, which is a proxy variable for the background potential temperature lapse rate (γ ) and the temperature deficit at the glacier surface ( s ) in the original model of Grisogono and Oerlemans (2001).
As expected by the original model and confirmed by our results, the bulk exchange coefficient decreases as the temperature difference increases.Multi-level observations of wind speed and temperature within the boundary layer above a glacier surface are needed to adequately evaluate this model, however.
Data availability.All data used in this study are available upon request from the corresponding author.The raw data are not currently in a standard format, and the metadata are not fully digitized.
Author contributions.VR participated in the 2012 field campaign, developed and performed the analysis, and wrote the initial version of the manuscript.BM provided financial and field support and helped prepare and revise the manuscript.JS led the 2010 field campaign and contributed to the manuscript refinement.NF processed the OPEC data and contributed to the manuscript refinement.MAT contributed to the method development.SJD provided logistical support and meteorological data, and contributed to the manuscript refinement.
Competing interests.The authors declare that they have no conflict of interest.

Figure 1 .
Figure 1.(a) Map of Castle Creek Glacier with locations of automatic weather station on the glacier (AWS glac ) and two AWSs in the glacier vicinity.AWS glac with eddy-covariance tower and floating meteorological tripod during installation, Castle Creek Glacier, (b) 2010 (photo by Theo Mlynowski), (c) 2012 (photo by Valentina Radić).

V
. Radić et al.: Evaluation of different methods to model near-surface turbulent fluxes b. "Stationarity" filter: only use steady-state runs following the method of Foken (2008), which examines the fluxes for different averaging times.Steady-state conditions can be assumed if 5 and 30 min averaged fluxes do not differ by more than 30 %. c. "Neutrality" filter: select near-neutral conditions

Figure 2 .
Figure 2. Initial number of observations for estimating the roughness lengths (from 30 min averages of OPEC data) and remaining number of observations after each filter is applied (ordered from top down) for 2010 and 2012 observational periods.Solid line represents the filters used to determine the roughness lengths, and dotted line represents the filters used to determine the fluxes of momentum, temperature, and humidity.

Figure 3 .
Figure 3. Thirty-minute averages of meteorological variables and SEB fluxes measured at the AWS glac in 2010 (a) and 2012 (b): near-surface air temperature (T z ), accumulated precipitation (P ; measured at AWS low ), wind speed (U z ), wind direction (U dir ), incoming (K down ) and reflected (K up ) shortwave radiation, and OPEC-derived sensible (Q H ) and latent (Q E ) heat flux.Bottom panel shows the OPEC-derived roughness lengths for momentum (z 0v ), temperature (z 0t ), and humidity (z 0q ), after filtering the OPEC data, and their estimated errors.Shaded in blue are the identified time intervals of prevailing katabatic flow (see text).

Figure 4 .
Figure 4. Comparison of 30 min observed (OPEC-derived) vs. modeled friction velocity (u * ), sensible-heat flux (Q H ), and latent-heat flux (Q E ), according to the six bulk methods: C log , C Ri b , C M−O , C SR , C kat , and K Int .Dots and error bars respectively show the mean and standard deviation of Monte Carlo ensembles (see text).No error bars are given for the C kat method since it is independent of roughness lengths.Red points denote values during the near-neutral conditions | z v,t L | < 0.1 only, and blue points indicate values during katabatic flow.Root mean square error (RMSE; W m −2 ), mean bias error (MBE; W m −2 ), and Pearson correlation coefficient (r) are provided for each scatterplot.

Figure 5 .
Figure 5. (a-b) Bias (modeled minus observed values) in u * vs. the wind speed, and (c-d) bias in Q H vs. wind speed.Red points denote values during the near-neutral conditions | z v,t L | < 0.1 only, and blue points indicate values during katabatic flow.

Figure 6 .
Figure 6.(a) Ratio of roughness lengths for temperature (z 0t ) and momentum (z 0v ) vs. roughness Reynolds number (Re * ), and (b) ratio of roughness lengths for humidity (z 0q ) and z 0v vs. Re * .Model 1 (solid line) is the theoretical prediction of Andreas (1987), and model 2 (dashed line) is the prediction of Smeets and van den Broeke (2008).Mean log value of z 0 v specific to each season (Table 1) is used to calculate Re * .
Figure 7. (a) Modeled z/L with the fixed-point iterative scheme in the C M−O method against the OPEC-derived stability parameter (z/L obs ), and (b) Bulk Richardson number (Ri b ) against z/L obs .Dashed black line shows 1 : 1 line.(c) Dependency of z/L obs on the measured wind speed (U z ) and (d) on the near-surface air temperature (T z ).Red (blue) points show values during near-neutral (katabatic) flow conditions.

Figure 8 .
Figure 8.Comparison of 30 min values of observed (OPEC-derived) and modeled sensible-heat flux (Q H ) using six different bulk methods: C log u * , C M−O u * , C M−O u * P r, C M−O and C log K Int .Red (blue) points show values during near-neutral (katabatic) flow conditions.

Figure 9 .
Figure 9. Measured eddy Prandtl number vs. the inverse of the wind speed.

Figure 10 .
Figure 10.Observed 30 min sensible-heat flux (Q H ) vs. observed air-surface temperature difference ( T ), and modeled 30 min Q H fluxes (black line) according to the C kat method, for the 2012 and 2010 seasons.

Figure 11 .
Figure 11.(a-b) K max vs. air temperature difference between AWS up and AWS glac , for the 2012 and 2010 seasons.Black (blue) circles represent all (katabatic) observations, and black line is the empirical fit model for K max .(c) OPEC-derived 30 min friction velocity u * vs. observed wind speed U z v , for the 2012 season, where each observation point is colored according to its 30 min value of OPEC-derived z v L (color bar).Solid colored lines are modeled u * , using the K Int method with H k = 20 m and optimized K max , for five different stability conditions (T up − T z = −1, 1, 3, 5, and 7 K).Dashed line is modeled u * using the C log method.

Table 2 .
Comparison between modeled and OPEC-derived sensible (Q H ) and latent (Q E ) heat fluxes, expressed as root mean square error (RMSE), mean bias error (modeled minus observed; MBE), and Pearson correlation coefficient (r) for 2012 and 2010 (values in parentheses) observational periods, given for a set of six bulk methods and their variants (see text).