Exploring uncertainty in glacier mass balance modelling with Monte Carlo simulation

Abstract. By means of Monte Carlo simulations we calculated uncertainty in modelled cumulative mass balance over 400 days at one particular point on the tongue of Morteratsch Glacier, Switzerland, using a glacier energy balance model of intermediate complexity. Before uncertainty assessment, the model was tuned to observed mass balance for the investigated time period and its robustness was tested by comparing observed and modelled mass balance over 11 years, yielding very small deviations. Both systematic and random uncertainties are assigned to twelve input parameters and their respective values estimated from the literature or from available meteorological data sets. The calculated overall uncertainty in the model output is dominated by systematic errors and amounts to 0.7 m w.e. or approximately 10% of total melt over the investigated time span. In order to provide a first order estimate on variability in uncertainty depending on the quality of input data, we conducted a further experiment, calculating overall uncertainty for different levels of uncertainty in measured global radiation and air temperature. Our results show that the output of a well calibrated model is subject to considerable uncertainties, in particular when applied for extrapolation in time and space where systematic errors are likely to be an important issue.


Introduction
A wide range of approaches to the modelling of mass balance exist, ranging from simple temperature index correlations (e.g., Braithwaite, 1981;Reeh, 1989) through to complex physical models of energy balance and associated melt Correspondence to: H. Machguth (horst.machguth@geo.uzh.ch) (e.g., Brock et al., 2000;Arnold et al., 2006). Typically, models are developed for the general case of modelling mass balance, but are calibrated and validated at a few point locations and therefore for a particular set of topographic and climatic conditions. Thus, for example, temperature index approaches require very little input data (positive degree days and a degree days factor) and could be applied in regions with sparse measurements (e.g., Reeh, 1989). However, they require calibration for each area in order to consider local charateristics (Braithwaite, 1995). By contrast, more complex physical models are assumed to require less tuning, and thus to be more suitable for extrapolating mass-balance in both space and time but at the expense of a higher demand for input data. Several studies exist where models of various complexities have been extrapolated over, for example individual glaciers or mountain ranges to produce seasonal values for mass balance (e.g. Arnold et al., 1996), and a key question in the development of such methods is the uncertainty associated with them.
Any approach to quantifying uncertainty must firstly consider potential sources and techniques for the quantification of uncertainty. Mass balance models typically require both meteorological inputs and snow or ice parameterisations representing the point(s) at which the model is being run. Although many models have been developed using data measured at the same point as mass balance measurements, such an approach is not viable for extrapolating mass balance in space and time, where typical inputs to such a model have to be interpolated from point measurements or the projections of, for instance, general circulation models (GCM) or regional climate models (RCM). Equally, if we wish to explore mass balance in the past, point data are normally not available from the glacier of interest, and meteorological measurements from some long term data series, assumed to be correlated with the glacier location are generally used. In both cases, the meteorological inputs to the mass balance model must be extrapolated or interpolated in space, and perhaps interpolated in time, to derive values appropriate to the modelled mass balance.
Uncertainties in input parameters to mass balance models can thus be characterised as stemming from, in the case of measured data, errors and uncertainties in measured values, and in the case of modelled values, differences between modelled and measured values (which in turn stem from differences between the spatial extent over which these values can be considered to be representative). The methods used to extrapolate/interpolate these data from the measurement location to the model location are other potential sources of uncertainty. Further uncertainties arise as a result of the abstraction of processes themselves within the mass balance model, and resulting generalisation of the real world system -for example, few mass balance models represent changes in snowpack form during melt (e.g. formation of ablation hollows or sun-cups during the ablation season) and the resulting increase in roughness length and change in turbulent energy fluxes. In considering uncertainties only in measured values of meteorological data there are two important sources of uncertainty to consider: random and systematic errors. Random errors are usually assumed to be related to the device making the measurement and its notional precision and are either temporally uncorrelated or only temporally correlated over short periods, whilst systematic errors are some constant offset or trend in measured values over long periods of time. Systematic errors are a well known problem in long term measurement series and can arise from, for example, incorrect calibration of an instrument or changes in a measurement site (e.g. Böhm et al., 2001).
Despite our understanding of the likely sources of uncertainty in mass balance modelling, most papers in the literature limit their exploration of uncertainty to sensitivity studies which explore modelled responses to variation in individual parameters. While such approaches provide useful information and may be adequate for models calibrated and run at the same point in space, they are insufficient to understand the uncertainties in modelled mass balance extrapolated in time and space. For example, Van der Veen (2002) argued that sensitivity studies were inadequate in modelling mass balance for polar ice sheets because they do not provide a probability for a certain result, but rather only the range of possible results for variation in a given input parameter. Furthermore, sensitivity studies do not allow a full exploration of the parameter space and resulting non-linear effects as a result of combined uncertainties in multiple parameters. Parametric uncertainty analysis, in contrast to sensitivity tests, aims to evaluate the multi-dimensional response of a model to combined uncertainty in input parameters with a probability density function as an output (Tatang et al., 1997;Van der Veen, 2002).
In this paper we set out to estimate the uncertainty in mass balance calculations made with a glacier surface energy balance model of intermediate complexity  applied to the Morteratsch valley glacier in the Swiss Alps. The paper has three key aims: 1. to estimate uncertainty for individual input parameters used in Klok and Oerlemans's 2002 model; 2. to calculate a probability density function (PDF) for mass balance as a function of the uncertainties in input parameters at a point on the Morteratsch glacier; and 3. to assess the modelled uncertainties for studies extrapolating glacier mass balance forward in time and space.
In the first part of the paper we introduce the data and test area for the model, before describing the mass balance model. The basic model is compared to measurements over 11 years to validate that it can reproduce measured values. We then explore the size and form of random and systematic errors in the model's input parameters, before running Monte Carlo simulations to derive the overall uncertainty in modelled mass balance. In order to explore the likely implications of our uncertainty study for climate change projections, we then calculate PDFs for two key input parameters which are also outputs of most typical climate models: air temperature and global radiation.

Test site and time frame
Morteratsch is a mid-sized valley glacier in the south-eastern Swiss Alps, extending from approximately 4000 m a.s.l. down to 2020 m a.s.l. and covering an area of about 16 km 2 ( Fig. 1). Mass balance measurements on Morteratsch are embedded in a relatively long term study of the glacier surface energy balance which was initiated by the IMAU, Utrecht, in 1995(e.g. Oerlemans, 2000Oerlemans and Klok, 2002). Since then an automatic weather station (AWS), a sonic ranger for continuous surface height measurements and three stakes have been operated on the tongue of the glacier at approximately 2100 m a.s.l. Mass balance measurements were initiated in 1999 at two other sites and in the following year at a fourth site. In this paper we make use of sonic ranger data and stake measurements from all four sites.
The present study focuses on the mass balance at the AWS over 400 days, starting from 18 October 1998 and ending on 20 November 1999. In the following, this time period is referred to as the "calculation period".
Data from four meteorological stations, operated by Me-teoSwiss, are used in this study as input data for the model or for the assessment of uncertainties: Corvatsch (3315 m a.s.l., located on a summit, 8 km west of the point AWS on Morteratsch Glacier), Hospizio Bernina (2307 m a.s.l., located at a pass, 7 km east), Samedan (1705 m a.s.l., located on a  Paul et al. (2004). The digital elevation model (DHM25 level2) is reproduced by permission of swisstopo (BA081413). wide and flat valley floor, 12 km north) and Weissfluhjoch (2690 m a.s.l., located on a summit, 45 km north) (Fig. 1). Hospizio Bernina is a manual weather station and only daily means are available -we acquired daily mean 2 m air temperatures (T a ). The other stations are automatic and we obtained from all three stations hourly means of T a , relative humidity (e a ) and air pressure (p). In addition, global radiation (S in meas ) and precipitation (P ) were acquired from Corvatsch.

Description of the model
In this study we investigate the numerical mass balance model developed by Klok and Oerlemans (2002). We selected this model because of the detailed and clear description of parameterizations, model output and validation procedure. Furthermore, the original model or parts of it have already been used in other studies e.g. (e.g., Klok and Oerlemans, 2004;Arnold et al., 2006). According to Klok and Oerlemans (2002), the model is based upon the following equation describing the specific mass balance, M(kg m −2 ): Q m is the melt energy involved in melting and Q l the energy involved in sublimation or riming. L m (3.34×10 5 J kg −1 ) is the latent heat of melting, and L s (2.83×10 6 J kg −1 ) is the latent heat of sublimation. P represents the accumulation due to snowfall. The surface energy heat flux (F ) supplies energy for melting (Q m ) or for the glacier heat flux (G), which implies the warming or the cooling of the snow or ice pack.
S in and S out are incoming and reflected solar radiation; L in and L out are incoming and emitted longwave radiation. Sensible and latent heat fluxes are represented by Q h and Q l . Melting can occur only when the surface temperature is at 0 • C and F is positive. If the latter is the case but the surface temperature is below zero, then F =G and the snow pack or ice is heated. The model is driven by data from synoptic weather stations located outside of the glacier boundary layer. Required input from meteorological measurements are: T a , S in meas , P , e a and p. Energy fluxes at the glacier surface are parameterized according to Klok and Oerlemans (2002). In their model parameterizations from Oerlemans and Grisogono (2002) are applied to calculate turbulent fluxes. Katabatic flow is parametrized therein and thus, measured wind speed is not required for input in the model. While writing our program code we closely followed the explanations given by Klok and Oerlemans (2002). Some of the original parameterisations have been modified and are described here. We calculated potential clear sky global radiation according to Corripio (2003) and Iqbal (1983). The snow albedo (α s ) is calculated according to Klok and Oerlemans (2002) and modified formulas from ECMWF, using snowageing functions for melting (exponential) and non-melting Comparison of modelled mass balance and measured mass balance at four points on Morteratsch-Glacier. "SR" stands for sonic ranger, "S" for stake measurements and "mod" for modelled values. The elevation of the four points are: AWS: 2100 m a.s.l., 2: 2500 m a.s.l., 3: 2700 m a.s.l. and 4: 2950 m a.s.l. All data is given in meters ice except for the snow accumulation measured by the sonic ranger which is depicted in meters snow height. conditions (linear). Furthermore, in contrast to Klok and Oerlemans (2002), the new value of α s after a snow fall event is not only a function of total snow depth and the underlying ice albedo (α i ), but also a function of α s , the albedo of the old snow surface before the snow fall event. Klok and Oerlemans (2002) calculate precipitation (P ) from measurements at Samedan and two manual weather stations (Pontresina and Bernina-Curtinatsch) in combination with a multiplication factor. Here we simply use measured P from Corvatsch station in combination with a tuning factor (P corr ). While in the original model a single threshold air temperature (T snow ) of 1.5 • C is used to distinguish snowfall and rain, we apply a gradual transition between 1 • C and 2 • C.

Testing of the mass balance model
The original model is reported as having delivered results in good agreement with measurements on Morteratsch for both a two and a five year model run Oerlemans, 2002, 2004). In order to calibrate our modified model, we first conducted a model run for the 400 days calculation period (Sect. 2). We adjusted P corr over this period to achieve good agreement between modelled and observed date of the disappearence of the winter snow cover at the AWS in spring 1999. Tuning was performed only at the AWS but the value we found for P corr was applied uniformly at all points where mass balance was calculated. Except for the new calculation of P corr , we did not apply any other new tuning to the model. The modelled cumulative mass balance before tuning (P corr =1) was −6.63 m w.e. and after tuning (P corr =2.1) a value of −5.97 m w.e. resulted. According to Klok and Oerlemans (2002), measured melt at the AWS during the melt season of 1999 was −5.9 m w.e., which approximately equals to total observed mass balance for the full 400 days -according to the sonic ranger measurements, the 1998 melt period ended on 18 October and little accumulation (approx. 0.05 m w.e.) occurred between the end of the melt period in 1999 and 20 November 1999. Hence, the cumulative mass balance calculated for the tuning period agrees well with measurements. According to Klok and Oerlemans (2002) the snow cover at the AWS disapeared on 18 May 1999. After tuning of P corr modelled melt out occurs on the same day.
Finally, to test the robustness of the calibrated model, an eleven year model run for four points on Morteratsch Glacier was conducted and the results compared to measurements (Figs. 2 and 3). For the point AWS, modelled cumulative mass balance is −72.5 m ice (m ice) which agrees very well with a measured value of −74 m ice (Fig. 2). The two curves are very similar, deviations during winter time are only apparent since the model results are shown in m ice and the sonic ranger measures surface elevation which corresponds to snow height in winter. The three other points with measurements available were also included in the comparison. The agreement for S-2 and S-3 is also good, however, the model systematically overestimates mass balance at S-4. The comparison of measured and modelled mass balance in a scatter plot (Fig. 3) shows that for individual stake readings deviations occur. There is one clear outlier where measured melt at S-2 is strongly underestimated. The disagreement is likely due to an uncertain stake reading.

The uncertainty model
Different approaches exist to determine the PDF of the output of a model. Analytical solutions are often desirable, however, they become complex or impossible if the set of uncertain parameters is large and nonlinear effects are present. Consequently, methods such as probabilistic collocation (e.g., Tatang et al., 1997) or Monte Carlo simulations are commonly applied to approximate uncertainty. Although computationally expensive, Monte Carlo simulations are popular because they are relatively simple to apply even when working with complex models and large numbers of uncertain parameters. For instance, Van der Veen (2002) used Monte Carlo simulations to assess uncertainty in the mass balance of the Greenland ice sheet.
In a Monte Carlo simulation a certain calculation (in our case a model run) is repeated many times and uncertain input parameters are varied within their uncertainty ranges. Model outputs are stored and a histogram is constructed to obtain the PDF for the desired output variable.
In the context of this paper the calculation being repeated is the modelling of the cumulative mass balance at a point AWS on Morteratsch Glacier over the time span between 18 October 1998 and 20 November 1999, the period over which we tuned the parameter P corr .
For every uncertain parameter we estimated random and systematic uncertainty based on the literature and measured values of these parameters. An explanation of the uncertainties of individual parameters is given in Sect. 4.2 and the values chosen are listed in Table 1.
These uncertainties were then multiplied by normally distributed random numbers with a mean at 0 and a standard deviation of 1, resulting in systematic (ε s ) and random uncertainties (ε r ). The random numbers for the two classes of uncertainties as well as for the different parameters are independent of each other, implying the simplifying assumption that input parameter uncertainty is not correlated. Calculated ε s and ε r are then added to the measured values of the corresponding input parameters. The ε s are calculated at the beginning of every run and remain constant throughout the run. The ε r are treated as either fully random or temporally autocorrelated. In the first case, every second numerical time step (every hour) a new ε r is calculated. In the case of temporal autocorrelation ε r at time step t is correlated to ε r at time steps t+1, t+2, . . . , t+(t decor −1) and only at time step t+t decor is the uncertainty decorrelated from the original un- certainty at t. Where temporal autocorrelation was applied, a typical mean time span t decor and a related standard deviation were defined. Based on the latter two values and another array of normal distributed random numbers, we calculated a series of consecutive time spans, of variable length with a mean length of t decor . At the start of every time span a value for ε r is calculated and finally linear interpolation was applied between two successive ε r . For some parameters the addition of ε s and ε r can result in impossible values (e.g. relative humidity of more than 100%). Physically defined limits were set where necessary (Table 1) and whenever such a limit was violated, the related parameter was set to its limiting value.

Parameters and related uncertainties
Twelve input parameters were assigned uncertainties: All five directly measured values (T a , S in meas , P , e a and p) as well as the parameters used in spatial interpolation, the lapse rate ( T a ) and precipitation gradient ( P ). Furthermore, from the parameters selected by Klok and Oerlemans (2002) for sensitivity testing, we included ice albedo (α i ), T snow , the thickness of the surface layer (z 1 ), cloudiness (n) and the turbulent exchange coefficient (C b ). No uncertainties were assigned to the temperature of the lowermost layer in the three layer subsurface model since sensitivity testing of this parameter has previously shown it to have a neglible influence on mass balance .
We aimed to give independent estimates of random and systematic uncertainties for each parameter. However, where both types of uncertainties are of similar magnitude or available data and literature did not allow for individual estimates of uncertainty, we made the assumption that ε s =ε r .
The present uncertainty analysis aims to illustrate an approach to estimating uncertainty in mass balance modeling. However, not all potential sources of uncertainty could be considered. In particular, no uncertainty was assigned to the parameterization of the snow albedo and potential clear sky global radiation. Furthermore, uncertainties in precipitation are treated in a simplified manner. Strasser et al. (2004) use a measurement error of 0.3 • C for unventilated thermometers of an AWS in their study. The ventilated thermometers from MeteoSwiss are believed to measure T a very accurately. Nevertheless, we used 0.3 • C here for both stations because the microclimate at the respective stations (e.g. the Corvatsch Station is located on the roof of a house) may not be fully representative for nearby locations at the same altitude. Additionally, a systematic error of the same magnitude was introduced. The uncertainty assigned to T a refers only to the uncertainty of the measurement at the synoptic weather stations. Total uncertainty in T a at the point AWS is larger because uncertainty in the lapse rate (cf. Sect. 4.2.6) is multiplied by the difference in altitude and combined with the uncertainty of the measurement to obtain total uncertainty in air temperature.

Measured global radiation
A detailed study on measurement errors in S in meas published by Meteo Swiss concluded that after data corrections the remaining uncertainties are of the order of 5 to 10% (Moesch and Zelenka, 2004). We therefore assigned both systematic and random uncertainties of 7.5%.

Measured precipitation
Measuring precipitation, in particular snowfall, is related to large uncertainties reaching 50% or more (Sevruk, 1985(Sevruk, , 1989. In the present model, P is tuned by means of P corr . Hence, we can not directly apply values on uncertainties in precipitation measurements from the literature. Tuning measured P to observed accumulation on the glacier introduces systematic errors since the observations on the glacier are also related to considerable uncertainties: only accumulation can be observed which is not identical to precipitation because the snow cover is subject to snow drifting, melt, riming and sublimation. Furthermore, spatial variability of snow accumulation is large (e.g., Machguth et al., 2006) and there are difficulties in determining the spatial and temporal variability of snow density. We therefore assigned systematic and random uncertainties of 25%. The latter is treated as fully random because precipitation intensity is spatially and temporally highly variable.

Measured relative humidity
It was difficult to find information on uncertainty in e a that goes beyond technical specifications from typical measurement devices. A systematic and a random uncertainty of both 5% was assigned to e a . The chosen values are a rough estimate to take into account uncertainties in the measurement itself, in the assumption that the values are representative and that they can be interpolated linearly inbetween the two stations. Consequently, the uncertainties at the two stations are assumed to be not correlated.

Measured air pressure
Air pressure has a very small influence on the model outcome and thus uncertainty in this parameter is not discussed in detail here. A systematic and a random uncertainty of 100 pa The Cryosphere, 2, 191-204, 2008 www.the-cryosphere.net/2/191/2008/ was assigned to p, uncertainties at the two stations are considered to be uncorrelated.

Lapse rate
To assess the variability of T a we calculated mean lapse rates from measured T a at the four selected meteorological stations in the vicinity of Morteratsch. The mean hourly T a over the time span from 1995-2006 (approx. 100 000 records for each station) is 0.0049 • C m −1 for Corvatsch-Samedan (13 km apart, 1607 m difference in altitude) and 0.0056 • C m −1 for Corvatsch-Weissfluhjoch (46 km apart, 625 m difference in altitude). The corresponding standard deviations are 0.0037 • C m −1 and 0.0022 • C m −1 , respectively. For Corvatsch-Hospizio Bernina (15 km apart, 1007 m difference in altitude), the mean daily lapse rate for the same time period amounts to 0.0062 • C m −1 with a standard deviation of 0.0022 • C m −1 . Taking into consideration the implication that a greater lapse rate will result in correspondingly higher values of T a at the point AWS, it is at first glance surprising that running the model for the 400 days with data from Corvatsch and Weissfluhjoch results in a less negative mass balance (−5.19 m w.e.) than for Corvatsch and Samedan (−5.97 m w.e., see Sect. 2). Replacing Samedan with Weissfluhjoch also results in modifications of e a and p. However, on closer inspection it becomes apparent that the main reason for reduced melt is the difference in summer lapse rates: from 1 May to 30 September, mean hourly T a (1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006) amounts to 0.0061 • C m −1 for Corvatsch-Samedan, 0.0058 • C m −1 for Corvatsch-Weissfluhjoch and mean daily T a is 0.0074 • C m −1 for Corvatsch-Hospizio Bernina.
Although closer to the glacier than Weissfluhjoch, it is questionable as to whether Samedan better represents meteorological conditions at Morteratsch. Samedan is located on a wide valley floor with large diurnal and annual temperature fluctuations. Furthermore, linear regression of hourly T a yields R 2 =0.61 for Corvatsch-Samedan and R 2 =0.96 for Corvatsch-Weissfluhjoch. On the other hand, T a calculated from the latter pair of stations seems rather low, most probably because Weissfluhjoch is situated further to the north, in an area more open to colder air currents from the north and north-west, whereas the area around Morteratsch is more influenced from the south and south-west. These comparisons show that the calculation of T a is very sensitive to the selection of the synoptic weather stations and that there is considerable uncertainty in its value. However, the small number of available station pairs makes it difficult to determine the magnitude and type of the uncertainty. As a rough estimate we assumed a normally distributed systematic uncertainty of 0.0002 • C m −1 .
Furthermore a temporally autocorrelated random uncertainty of 0.001 • C m −1 is assigned to T a . The assumption of temporal autocorrelation is essential here because otherwise strong hourly fluctuations and jumps in T a would result. To determine the typical time span of temporal autocorrelation, we analyzed both twelve year time series of hourly T a for their respective lag autocorrelation (Since Hospizio Bernina are not available on a hourly basis, this analysis could not be carried out for all three station pairs). Semivariograms of the temporal correlation of both time series are depicted for lags of between 1 and 24 000 h in Fig. 4. In the case of Corvatsch-Samedan clear daily and seasonal variations are present whereas the time series Corvatsch-Weissfluhjoch decorrelates rapidly with little daily or seasonal variation. From these figures we conclude, that T a decorrelate within roughly 24 h. Once again, Fig. 4 shows the dilemma of the two pairs of stations: The valley station shows strong diurnal and seasonal fluctuations which are not present when comparing two summit stations. Although these strong fluctuations are large compared to the conditions at the point AWS (e.g., Oerlemans, 2001), in particular during the melt season, the mean values of T a calculated with data from the valley station may still be more representative for the glacier.

Vertical precipitation gradient
According to Schwarb (2000), P is to a certain extent a virtual parameter because vertical and horizontal components of the precipitation distribution can never be fully distinguished. Klok and Oerlemans (2002) obtained P =0.0004 m m −1 a −1 from the same author who applied a comprehensive set of rain-gauge data to a complex interpolation scheme in order to derive spatially distributed P and P at approximately 2 km resolution. Based on the assumption that the methodology of Schwarb (2000) provides a reliable mean value for P in the Morteratsch area while temporal variability around that mean is large, we assigned a moderate systematic uncertainty (0.0001 m m −1 a −1 ) and a large random uncertainty (0.0004 m m −1 a −1 ). Random uncertainties are not temporally correlated and the occurrence of negative P is allowed.

Ice albedo
In the present model α i was fixed to 0.34 in order to have a good representation for the snow free part of Morteratsch Glacier . It is generally stated that the ice albedo is subject to significant small scale variability over short distances (e.g. Knap et al., 1999), thus a single mean value will result in either under-or overestimations for different parts of the glaciers ice surface (according to Klok and Oerlemans (2002), measured α i at the AWS in summer 1999 was significantly lower than 0.34). In order to approximate the errors that result from assigning a fixed mean albedo to a glacier surface with an albedo distribution varying in both space and time (Klok and Oerlemans, 2004;Paul et al., 2005), we assigned a normal distributed systematic uncertainty of 0.05. No random uncertainty was assigned here because α i is not subject to significant random www.the-cryosphere. changes at an hourly time scale and the determination of a typical time span for a temporally autocorrelated random error seems rather difficult and uncertain.

Threshold temperature snowfall
Long term observations of air temperature and snow -rain transitions, compiled by Rohrer (1989) for a meteorological station in Davos (1590 m a.s.l., 45 km north of Morteratsch) show that the average of the transition range from rain to mixed precipitation to snow is somewhere between 0.75 • C and 1.5 • C with a standard deviation of roughly 0.3-0.5 • C. Furthermore, Rohrer (1989) shows that for the example of Davos, a change in both instrumentation and the measurement site resulted in significant change of the mean and the spread of the transition range. We applied a systematic error of 0.3 • C and a random error of 0.5 • C .

Thickness of the surface layer
The present mass balance model contains a three layer subsurface model to compute heat fluxes to and from the glacier. Since melt can only occur when the surface layer has reached the melting point, the chosen thickness of the surface layer (z 1 ) influences mass balance by controlling the time available for melt. Klok and Oerlemans (2002) varied z 1 by 0.11 m, here we assigned a normally distributed systematic uncertainty of 0.055, but since this parameter is initialised as a constant over an individual model run, it is not assigned a random uncertainty.

Cloudiness
In the present model cloudiness (n) plays an important role. It is derived from the ratio: where S in cs is potential global radiation under clear sky conditions. T cl is used as a reduction factor to compute S in and to derive n according to the following relationship, given by Greuell et al. (1997): Consequently, errors in both S in meas and S in cs affect cloudiness. If, for example, S in meas is above the real value, this will result in an overestimation of T cl . However, n is derived from T cl and will be lowered. Finally, since L in depends on cloudiness, it will also be lowered. Cloudiness during night time is interpolated from n before sunset and after sunrise. Consequently additional S in is present only during daytime while L in is affected 24 h a day. Both effects, enhanced S in and lowered L in are of the same order of magnitude. An error in S in meas will therefore shift the ratio of short-to longwave radiation balance but not have a large influence on their summed value.
In order to reduce this back-coupling effect and to account for uncertainties in the parameterisation of n, we first calculated T cl , applied it as a reduction factor, computed n and only afterwards modified n by adding an uncertainty. According to Greuell et al. (1997) Eq. (3) performed very well in explaining the relationship between observed cloudiness (n obs : eight classes, from 0 to 1 in steps of 1/8) and the mean observed T cl per class of n obs . On the other hand, the mean T cl per class were computed from a larger set of individual values of T cl which showed a large variance. Thus we assigned normally distributed uncertainties, consisting of small systematic (0.03) and a larger random uncertainties (0.2). Since the observed variance of T cl for the individual classes of n might be due to differing effects of the various types of clouds, we introduced a temporal autocorrelation to n with a mean time span of 12 h because cloud types typically persist for more than one hour.

Background turbulent exchange coefficient
The value for C b was found by matching measured and modelled melt at AWS for the year 1999 . Since the measurements are not error free (e.g., Braithwaite et al., 1998), the tuning will result in a systematic over-or underestimation of C b . Consequently, we have assigned a systematic uncertainty to this parameter and we selected the same value (0.0005) as used for the parameter sensitivity testing by Klok and Oerlemans (2002). As with surface layer thickness, this parameter is constant over an individual model run, and is thus not assigned a random uncertainty.

Sensitivity tests
Individual sensitivity studies were performed for every parameter listed in Table 1 to assess the combined influence of random and systematic uncertainties (Fig. 5). For reasons of simplicity, sensitivity studies were also conducted by means of Monte Carlo simulations. Modelled mass balance is most sensitive to the prescribed uncertainties in P and T a , followed by T a and α i . Uncertainties in S in meas have a much smaller influence while the impact of the remaining parameters' uncertainties are less than half the most sensitive parameter.

Parametric uncertainty analysis
In order to assess the required number of runs for the Monte Carlo simulations, we plotted the evolution of the standard deviation, and standard deviation of standard deviation, of modelled mass balances over 5000 runs in a Monte Carlo simulation where all systematic and random uncertainties according to Table 1 were applied. Both parameters are depicted in Fig. 6, indicating that fluctuations in standard deviations become small after roughly 1000 runs, suggesting that the chosen number of runs (5000) is likely to deliver stable results. Figure 7 shows the PDF of the model outcome, resulting from a Monte Carlo simulation applying the full set of systematic and random uncertainties. The mean modelled mass balance is −6.02 m w.e., maximum and minimum values are −3.72 m w.e. and −8.69 m w.e., respectively. The standard deviation is 0.71 m w.e. or roughly 10% of total cumulative melt. The PDF shows one distinctive peak at −5.7 m. However, this peak could not be reproduced in further experiments with different initiation values for the random numbers. From the output of the same Monte Carlo simulation, we depicted the temporal evolution of mean mass balance over the calculation period, growth of the related standard deviation and most and least negative of all 5000 runs in   Fig. 8. In winter uncertainty grows with snow fall events, while during dry periods there is virtually no growth in uncertainty. With the onset of melt, uncertainty starts to grow continuously. Note that both the most and least negative of all runs can not be considered as obviously unrealistic: They show an accumulation and an ablation period. Winter accumulation is a few centimetres water equivalent in the most negative and almost one meter of water equivalent in the least negative. These values roughly mark the bounds of natural variability as observed on Morteratsch (cf. Fig. 2). Furthermore, the dates of the disapearence of the snow cover were stored throughout the simulation and their PDF is shown in Fig. 9. The mean date of melt out is day number 138 (according to Klok and Oerlemans (2002) the melt out happened one day earlier). The probability for snow cover disappearance is clearly not normally distributed but shows distinct peaks and troughs. Finally, the Monte Carlo simulations were repeated to estimate the contribution of random and systematic errors separately. Figure 10 shows PDFs accounting for all systematic or all random uncertainties. Corresponding standard deviations are 0.69 m w.e. and 0.14 m w.e., respectively. Obviously, overall uncertainty is dominated by the systematic uncertainty.
However, uncertainty will clearly vary according to the data set used for model input. In order to make this study applicable to a broader audience working with different data (e.g. data from climate models), a further experiment was conducted, evaluating model response to different levels of uncertainty in S in meas and T a . Here, we varied uncertainty in T a from 0 • C to 2 • C in steps of 0.5 • C and uncertainty in S in meas from 0% to 20% in steps of 5%. The uncertainties in all other parameters were varied in the Monte Carlo simulation according to the values given in Table 1, except for uncertainties in T a which were set to zero in order to have full control over the uncertainty in T a . The number of runs was reduced to 1000 for every combination of uncertainties in the two parameters and resulting model uncertainties are listed in Table 2. Of course, it would be an interesting experiment to vary P similar to S in meas and T a since uncertainties in precipitation are particularly large. However, to our opinion a detailed assessment of the various sources of uncertainty in P is a prerequisite and should be addressed in future studies.

Estimating parameter uncertainties
The first step in any sensitivity test, or a parametric uncertainty analysis, is the selection of parameters for sensitivity testing and estimation of their associated ranges. Most previous research on energy and mass-balance modelling has focussed only on sensitivity testing, with parameter ranges based on a variety of sources. In this work, we have recognised the importance of not only random uncertainty, which can be considered akin to instrument precision and accuracy, but also systematic uncertainties which have generally been ignored. An important limitation is the difficulty in estimating values for both random and systematic uncertainties, and where appropriate, temporal autocorrelation of random uncertainties. However, we believe that an approach based on the selection of parameters through literature is an appropriate one, and that all of these sources of uncertainty must be modelled.

Sensitivity tests
Sensitivity testing was conducted to estimate the contribution of individual parameters to overall model uncertainty.
The model appears to be most sensitive to the prescribed uncertainties in P and T a . However, since air temperature at the AWS is calculated from T a at Corvatsch, T a and the difference in altitude from Corvatsch and AWS, the latter has an amplifying effect on the impact of uncertainties in T a . Nevertheless, the discussion in Sect. 4.2.6 indicates that the lapse rate can vary significantly depending on which station pair is chosen. Air temperature is generally considered a well known parameter, however the sensitivity to the prescribed uncertainties in T a , which might be considered conservative, is among the largest. Consequently, the common assumption of neglible uncertainties in air temperature must be regarded as questionable because air temperature has a large influence on the glacier mass balance (Ohmura, 2001). Sensitivity to uncertainties in S in meas are quite small although realistic levels of uncertainty have been applied. However, the impact of errors in S in meas is partly compensated due to the coupling of shortwave and longwave radiation balance through the parameterisation of cloudiness. Furthermore, it should be noted that the impact of the chosen uncertainties in the two investigated internal model parameters (C b and z 1 ) is smaller than the influence of most of the uncertainties related to the meteorological data or ice albedo.

Parametric uncertainty analysis
Several striking results are apparent from the parametric uncertainty analysis. Firstly, the standard deviation in cumulative melt of 0.71 m w.e. seems at first glance to be contradicted by the very good agreement of modelled and observed melt shown in Fig. 2. However, this good result was obtained by a model tuned to a particular location, over 400 days in the same location as to which it is applied. The uncertainty shown in Fig. 7 indicates the impact that typical uncertainties in measurement could have on such a point measurementand which are in this case accounted for, at least to some degree, by the tuning of the model to the point. Furthermore, Fig. 2 also shows the corresponding drift of modelled melt from measured for sites at which the model was not tuned -these differences are much more similar to the uncertainty associated with our PDF, and also indicated by the increase in uncertainty over time as shown in Fig. 8.
Secondly, whilst the PDF of cumulative mass balance is normally distributed, the PDF of the day of melt out has a distinct bimodal appearance (Fig. 9). This is because uncertainty has a much smaller influence on the average melt out day, since meteorological events which occur over relatively short periods have a strong influence on the timing of melt out -for example, an influx of cold air around day 140 reduces the probability of melt out to almost zero, irrespective of uncertainty. To illustrate this in more detail, daily mean T a at Corvatsch is shown in Fig. 9. Until day 110 air temperatures are low, and only in 6 runs has the snow cover already disappeared by then. Temperatures then rise sharply and reach a first peak at day 119 which also marks the onset of a rise in probability for melt out. However, since the snow cover is still quite thick in most runs and its temperature most likely below 0 • C, the rise in probability for snow cover disappearance lags temperature. T a stays high and the longer the period of warm temperatures last, the higher becomes the probability for melt out. A further rise in T a results in a peak at day 133 where the snow cover disapears in more than 8% of all runs. The number of runs with melt out and T a then show a distinct correlation with two sharp drops in probability of melt out, both caused by an influx of cold air. Around day 145 the probability for melt out starts to diminish regardless of the still increasing T a . By this date temperatures have been favourable for snow melt for almost one month (since approx. day 119), the probability that snow cover persists is already low and thus T a and the day of melt out decorrelate. This result points to the importance of choosing appropriate parameters for model validation -the day of melt out is much less sensitive to uncertainty than cumulative mass balance.
Thirdly, we observe that systematic uncertainties contribute much more to overall uncertainty than random uncertainties which tend to cancel one another out. Although this result is perhaps rather obvious, it indicates the importance of considering techniques for estimating and characterising systematic uncertainties, which are generally ignored despite their well known importance in, for example, the homogenisation of long term temperature series (Böhm et al., 2001). There is no reason to assume that measurements or projections made today are not subject to systematic uncertainties -for example, consistent under or over estimation of albedo through a melt season (Paul et al., 2005) -and these should be accounted for in uncertainty analysis.
We calculated uncertainty to be approximately 10% of total melt at one particular point on one glacier. However, for two main reasons, the results of this study should be transferred to other glaciers with caution. Firstly, as already discussed the uncertainties in the input parameters may differ strongly depending on the data sets used. Second, the uncertainty model contains simplifications which may not apply to other sites. The study focuses on a test site where the total melt is about one order of magnitude larger than accumulation. Consequently only a rough estimate of uncertainty in the latter was applied and thus, the uncertainty model can not yet be applied to an entire glacier surface. Although it would be of particular interest to assess the uncertainty of the mass balance calculation for an entire glacier instead of a single point, such an uncertainty assessment would require a more profound analysis and description of the uncertainties in accumulation modelling, and also consideration of appropriate methods for spatially autocorrelated uncertainty analysis.

Impact of variation in individual parameter uncertainty
Many projections in the future are based on relatively simple models, where one or two parameters are varied to explore the response of a system (for example, increases in air temperature). However, these projections are in themselves subject to the uncertainty found in all input parameters. Thus, we have carried out Monte Carlo simulations on two sets of scenarios based on particularly important parameters -air temperature and measured global radiation. The results demonstrate that the growth in overall uncertainty is not a linear function of uncertainty in T a or S in meas , but rather an exponential one. Uncertainty of more than 1 m w.e. is reached for 1 • C of uncertainty in T a which is still a rather conservative estimate, in particular where a mass balance model is applied to unmeasured glaciers or driven by data from climate models. Uncertainty in measured global radiation is of clearly lower impact for reasons discussed in Sect. 4.2.11. This approach allows the assessment of not only scenarios of future change, but the sensitivity of these scenarios to uncertainty to be estimated.

Implications
According to Anderson and Woessner (1992) an uncertainty analysis should be built into modelling strategies from the onset. Models are often validated by comparing their output to measurements. However, since both observations and model results may be uncertain, meaningful model validation requires not only the mean outputs but also their PDFs (Tatang et al., 1997). A parametric uncertainty analysis can contribute to process understanding by helping to identify modelled values, for example in Fig. 3, that show larger deviations from measurement than their level of uncertainty (obviously, uncertainties of the measurements have to be considered as well). For these values it is likely that the model failed due to conceptualisation problems and not due to errors in input parameters. Identifying such problems by explaining why the model has failed at particular points will lead to an improved mass balance model and potentially add to the understanding of process.
Monte Carlo simulations are also a valuable tool in assessing whether output of a mass balance model is more sensitive to the choice of tuning parameters or to uncertainties stemming from model input and parameterizations.
Finally, particularly when we wish to extrapolate in space or time, a parametric uncertainty analysis becomes essential. Since, uncertainties in data values are likely to increase as we extrapolate further from our observations in space and time, it is important to realise that uncertainties are also unlikely to be constant in space or time and modellers should take care not to over interpret results which are simply the mean of PDFs with potentially large standard deviations.

Conclusions
In this work we have calculated the uncertainty in a glacier mass balance model, estimated at a point on the Morteratsch Glacier in Switzerland over a period of 400 days, using uncertainty values for individual meteorological and model parameters. Despite good agreement of the tuned model with observed mass balance over a period of 11 years, we estimate uncertainty in cumulative mass balance of approximately 10%. The uncertainty is dominated by systematic uncertainties in parameters, which in most studies are not considered. Thus, we believe it is important in future work to consider methods of estimating and quantifying systematic uncertainties in typical parameters.
Our estimate of overall uncertainty is most likely a conservative one: Not all potential sources of uncertainty could be considered and precipitation was tuned without assigning uncertainty to the respective tuning parameter. However, to consider all the different aspects of uncertainty in precipitation or accumulation is not trivial. More detailed studies on this subject are required and would add to a more comprehensive understanding of uncertainties in mass balance modelling.
However, the main implication of this work is related to the extrapolation of model results in time and space. This paper shows, that for a well-tuned model with relatively low values for individual uncertainties, one can expect considerable uncertainty in modelled outputs. This in turn implies that, for a glacier where appropriate data are unavailable for tuning, and where the measured input data are less certain, one could expect increased uncertainties in cumulative mass balance. In future work we will explore how we can integrate parametric uncertainty analysis into models which extrapolate glacier mass balance in both space and time.