Inferred basal friction and surface mass balance of the Northeast Greenland Ice Stream using data assimilation of ICESat (Ice Cloud and land Elevation Satellite) surface altimetry and ISSM (Ice Sheet System Model)

. We present a new data assimilation method within the Ice Sheet System Model (ISSM) framework that is capable of assimilating surface altimetry data from missions such as ICESat (Ice Cloud and land Elevation Satellite) into re-constructions of transient ice ﬂow. The new method relies on algorithmic differentiation to compute gradients of objective functions with respect to model forcings. It is applied to the Northeast Greenland Ice Stream, where surface mass balance and basal friction forcings are temporally inverted, resulting in adjusted modeled surface heights that best ﬁt existing altimetry. This new approach allows for a better quantiﬁcation of basal and surface processes and a better understanding of the physical processes currently missing in transient ice-ﬂow models to better capture the important intra- and interannual variability in surface altimetry. It also demonstrates that large spatial and temporal variability is required in model forcings such as surface mass balance and basal friction, variability that can only be explained by including more complex processes such as snowpack compaction at the surface and basal hydrology at the bottom of the ice sheet. This approach is indeed a ﬁrst step towards assimilating the wealth of high spatial resolution altimetry data available from EnviSat, ICESat, Operation IceBridge and CryoSat-2, and that which will be available in the near future with the launch of ICESat-2.


Introduction
Global mean sea level (GMSL) rise observations show an overall budget in which freshwater contribution from the polar ice sheets represents a significant portion White, 2006, 2011;Stocker et al., 2013), which is actually increasing (Velicogna, 2009;Rignot, 2008) relative to thermosteric expansion and contribution from terrestrial glaciers (Gardner et al., 2013). In order to quantify the contribution of polar ice sheets to GMSL in the near future, accurate mass balance projections must therefore be carried out, which can either be based on extrapolation of current trends (Velicogna and Wahr, 2006;Velicogna, 2009;Shepherd and Wingham, 2007;Rignot et al., 2011) or supported by transient ice-flow models that are physically validated against data. Such models, however, as demonstrated in the SeaRISE (Sea-level Response to Ice Sheet Evolution) and ice2sea intercomparison projects, are not fully capable of capturing the present-day trends (Bindschadler et al., 2013;Nowicki et al., 2013a, b), which hinders our ability to project them into the future with a high degree of confidence. Indeed, one of the critical difficulties faced by ice-sheet modelers is the spin-up of iceflow dynamics in a way that matches present-day observations. At the root of the problem is a complicated interaction between (1) paleo-reconstructions that can match the evolution of the ice volume for the Greenland Ice Sheet (GIS) or Antarctic Ice Sheet (AIS) (Pollard and DeConto, 2009;Huybrechts et al., 2011;Ritz et al., 1997;Greve, 1997a) but which fail at capturing present-day ice-flow dynamics and (2) what is now referred to as "instantaneous spin-ups", which rely on inversion of basal friction at the ice-bed interface from satellite-derived surface velocities (MacAyeal, 1993;Morlighem et al., 2010;Seroussi et al., 2013;Price et al., 2011;Arthern and Gudmundsson, 2010), which are more efficient at capturing present-day ice-flow dynamics but lack the long-term trends in the stress and thermal regime of both ice sheets.
In order to bridge the gap between both approaches, so that paleo-reconstructions of ice sheets and present-day dynamic conditions can be captured in one continuous run of an ice-sheet model, there needs to be a shift in the way we approach integrating data into transient ice-flow models. Indeed, there is a wealth of data available to ice-sheet modelers that is not yet fully leveraged to constrain transient ice-flow models of the GIS and AIS. These data include among others (1) ice coring that provides temperature profiles for calibration of thermal models (Greve, 1997b(Greve, , 2005; (2) radar stratigraphy that provides layering and age structure ; (3) sediment coring in proglacial lakes around the GIS (Briner et al., 2010(Briner et al., , 2011 that provides vital information about margin positions throughout the Holocene; (4) high spatial resolution surface altimetry from small-footprint satellite laser altimeters onboard missions such as EnviSat (2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012), ICESat (Ice Cloud andland Elevation Satellite, 2003-2009) and CryoSat-2 (2010present) as well as ESA and NASA airborne laser altimetry campaigns flown as part of PARCA (Program for Arctic Regional Climate Assessment, 1993Assessment, -2008 (Thomas and Investigators, 2001;Thomas et al., 2004) and Operation Ice-Bridge (2009-present) that provide an almost continuous surface height record for the past 20 years (Csatho et al., 2013); and (5) SAR (Synthetic Aperture Radar) data from missions such as ERS-1,2 (European Remote Sensing Satellites), RADARSAT-1,2, ALOS (Advanced Land-Observing Satellite), PALSAR (Phased Array type L-band Synthetic Aperture Radar) and TerraSAR-X, among others, that provide a surface velocity record (albeit relatively discontinuous compared to the altimetry record) for the last 20 years (Joughin et al., 2004a(Joughin et al., , 2010Mouginot et al., 2014). All these data sets are yet to be systematically assimilated into ice-sheet models, to provide better constraints for transient models.
One of the main difficulties in reconstructing past and present ice-sheet flow that is compatible with observations and ice-flow dynamics is the lack of temporally variable assimilation methods. Most attempts at assimilating surface altimetry so far have relied on indirect approaches. The first one is ensemble methods, where sub-sets of model runs that are compatible with present-day conditions of the ice sheet are down-selected (Applegate et al., 2012). This type of method does improve spin-ups and inform about the level of uncertainty inherent in the model runs, but does not yield information on the underlying boundary conditions, and poten-tial corrections that have to be applied (within specific measurement error margins) for the model to converge to presentday conditions. For example, one such boundary condition that is inherently difficult to reproduce from the previous interglacial onwards is surface mass balance (SMB) (Ritz et al., 1997), which is a critical component of the GIS and AIS mass transport. The second method is the so-called flux-correction method Price et al., 2011), where boundary conditions at the ice front are corrected for in order to match time series of observed fluxes. This is a direct tuning approach which, however, can be incompatible with realistic boundary conditions at the ice front. The third and final method is a quasi-static approach, in which surface velocities are used at different snapshots in time to invert for parameters such as basal yield stress. This approach is used in particular in Habermann et al. (2013) to understand the dynamics of Jakobshavn Isbrae in the past 2 decades. Such methods are indeed precursors to more sophisticated methods which can tightly couple mass transport and stress balance in time-dependent assimilations.
For example, new approaches are emerging, based on time-dependent adjoint modeling Goldberg and Heimbach, 2013), which show great promise in potentially enabling sophisticated data assimilation of disparate data sets into ice-sheet models. Here, we propose one such approach, based on algorithmic differentiation of the Ice Sheet System Model (ISSM), to improve assimilation of surface altimetry data (over the entire ICESat time period) into a transient reconstruction of the ice-flow dynamics of the Northeast Greenland Ice Stream (NEGIS). The goal here is to invert for temporal forcings, namely SMB and basal friction at the ice-bed interface, such that best fit to surface altimetry data is achieved, while respecting the constraints inherent in the physics of an ice-sheet-flow model. On NEGIS, altimetry data provided by ICESat display high levels of spatial and temporal variability, which presents a good case scenario for investigating how model forcings need to be corrected for in order to replicate such variability in surface height, and what are potentially missing components in the way we model surface and basal processes that should be improved accordingly. In addition, recent studies suggest that NEGIS is undergoing dynamic thinning linked to regional warming (Khan et al., 2014), which will exhibit characteristic surface altimetry signatures that should be investigated, especially for a basin that previously was considered stable.
This study is structured as follows: in the next section we describe our forward transient model, the equations, the objective function we are interested in and the algorithmic differentiation approach to computing gradients of such an objective function along with the temporal inversion algorithm necessary to infer temporally variable model forcings that best fit observations. In Sect. 3, we describe the altimetry time series used on NEGIS, our spin-up methodology and all model inputs to our runs. Sections 4 and 5 present our results and discussion, respectively, and the final section concludes on the implications of our new approach towards assimilating altimetry data into projections of ice-sheet mass balance.

Model
Underlying our assimilation of surface altimetry is a transient ice-flow model, which can simulate the temporal evolution of an ice stream. We refer to this model as our forward model. This model is implemented within the Ice Sheet System Model (Larour et al., 2012c;Morlighem et al., 2010). Here, we are interested in the best fit between modeled surface heights and available altimetry observations. The gradient of this diagnostic value with respect to transient forcings (basal friction at the ice-bedrock interface and SMB) can then be used within an inverse method to invert for such forcings, while simultaneously improving our best fit to observations. In this section, we describe our forward model, the computation of the objective cost function and the methodology behind the computation of the gradient as well as the inverse method itself.

Forward model
The flow of NEGIS is characterized by low basal shear stress across the entire basin Schlegel et al., 2013), resulting in high velocities  deep inland towards the ice divide (cf. Fig. 1). Such flow exhibits low vertical shear stress and can therefore be realistically described using the shelfy-stream approximation (SSA) (MacAyeal, 1989). This formulation is a simplification of the full-Stokes equations describing the stress balance of an ice sheet. The simplifications involved include (1) neglecting the ice-flow acceleration (Reist, 2005), (2) neglecting horizontal gradients of vertical velocities compared to vertical gradients of horizontal velocities (Blatter, 1995;Pattyn, 2003), (3) neglecting bridging effects (van der Veen and Whillans, 1989) and (4) neglecting vertical shear altogether (which encompasses assumption 2 also) (MacAyeal, 1989).
Using (1)-(4), the stress balance equilibrium can be reduced to the following two equations expressed in terms of the depth-averaged horizontal velocity (u, v):  Rignot and Mouginot (2012) are displayed overlayed image map (Haran et al., 2013). The data is projecte North projection (EPSG:3411) with central meridian marked in yellow, I and II are used in Fig. (7). the upper surface elevation and (τ bx ,τ by ) the x, y components of the basal shear stress at the ice-bedrock interface. In this formulation, horizontal velocity is considered independent of z, and vertical velocity can be recovered through the incompressibility condition. In the case of NEGIS, the simplifying assumptions (1)-(4) are valid for the entirety of the basin, excluding some very specific areas where some departure to the SSA formulation occurs. For more details on these areas, we refer to Schlegel et al. (2013). The basal shear τ b expressed in terms of horizontal components of the basal shear (τ bx ,τ by ) in Eqs. (1) and (2) is a model forcing which we describe using a viscous linear relationship (MacAyeal, 1989(MacAyeal, , 1993: where v is the velocity parallel to the ice-bedrock interface, approximated here as (u, v); α the friction coefficient; and N eff the effective water pressure (taken here equal to the overburden pressure computed at the ice base). Viscosity in Eqs.
(1) and (2) is described using the following Norton-Hoff law (Glen, 1955): where B is the ice hardness, n Glen's law exponent andε e the effective strain rate. B is temperature dependent and follows an Arrhenius type law calibrated from Paterson (1994). The SSA formulation solves for the stress balance. However, mass conservation also needs to be ensured through the following mass transport equation: where h is the ice thickness; v = (u, v) the depth-averaged velocity; M s the surface mass balance (m a −1 of ice equivalent), positive for accumulation and negative for ablation; and M b the basal melting rate (m a −1 in ice equivalent), positive for melting and negative for freezing. The thermal regime of the ice is not captured in our transient ice-flow model. It is initialized using a thermal steadystate model described in Larour et al. (2012b), which is then assumed constant through time. We believe this approximation to be realistic, assuming there are no short-term thermal transients that develop between 2003 and 2009, the length of our data record .
In terms of boundary conditions, Eqs.
(1) and (2) are solved using observed Dirichlet surface velocity constraints at the boundaries inland, stress-free surface; friction at the ice-bedrock interface (described by Eq. 3); and water pressure at the ice front, which, when depth-averaged, results in the following condition: where σ is the stress tensor, n the unit outward-pointing normal vector at the ice front, ρ w the water density and b the elevation of the ice lower surface. At the ice front, the boundary condition for mass transport (Eq. 5) is specified as a free-flux boundary condition, where the calving rate is taken equal to the normal velocity at the ice front. Equations (1), (2), (4) and (5), along with corresponding boundary conditions, can be discretized and solved implicitly in time using the finite-element method (FEM). We refer to Larour et al. (2012c) for more details on the FEM discretization as well as numerical schemes to handle the material nonlinearity and the stability of our time stepping.

Cost function
Our forward model computes, from model inputs α (friction coefficient) and M s (surface mass balance), time series of surface heights s(t) and depth-averaged horizontal velocities (u(t), v(t)). The diagnostic quantity considered here for our forward model is, however, not the surface height s(t) nor horizontal velocities (u(t), v(t)), but rather the cost function J which describes the time and space-averaged squared difference between the modeled surface heights s(t) and the observed surface elevations from ICESat altimetry. If we name s(t) obs the time-evolving altimetry observations, we can define our cost function as where is the spatial domain (here the entire NEGIS basin), S its surface extent and [0,T] the time domain over which ICESat observations are available. This cost function describes the time-averaged and space-averaged misfit to altimetry observations. It can be decomposed also into a spatial average of a local time-averaged misfit J T (x, y), as follows: where J T is defined as In the remainder of our study, we assimilate temporally variable friction α(t) and surface mass balance M s (t) in order to minimize the cost function J = J (F (α(t), M s (t))), where F is the forward model described in Eqs. (1), (2), (4) and (5). This means that we temporally invert for the friction and SMB that best fit our observations. Our initial values for both forcings come from (1) the time variable SMB from Box (2013) and (2) a time-constant friction which is variable in space, inferred using an adjoint-based inversion of existing present-day surface velocities (Morlighem et al., 2010;Rignot and Mouginot, 2012). The main results of our temporal inversion are temporal corrections to these forcings, and an improved best fit (or minimized J ) to observations.

Algorithmic differentiation of the gradient of J
The basis for inverting forcings α and M s is the determination of time-and space-dependent gradients of our cost function J , namely ∂J /∂α(x, y, t) and ∂J /∂M s (x, y, t). A common approach in the cryosphere community to obtain gradients of forward models has been to rely on the adjoint method (MacAyeal, 1993;Rommelaere andMacAyeal, 1997 Vieli andPayne, 2003;Joughin et al., 2004b;Larour et al., 2005;Vieli et al., 2006;Khazendar et al., 2007Khazendar et al., , 2009Morlighem et al., 2010;Arthern and Gudmundsson, 2010). The approach consists in analytically deriving the adjoint state of the forward model, which allows for an easy computation of the The Cryosphere, 8, 2335-2351, 2014 www.the-cryosphere.net/8/2335/2014/ gradient of the cost function. This approach works particularly well for self-adjoint models, which is the case for the stress balance equations of ice flow when the rheology is considered linear viscous. When nonlinearities are present in the model, as is the case when relying on a material law such as (Eq. 4) with n = 1, the adjoint approach can still be viable if the problem is linearized  or if an exact adjoint can be analytically derived. For more complex ice-flow models, however, the adjoint state is usually not easily derivable, and other methods have to be considered to compute cost-function gradients. The first one is to rely on approximation by forward differencing, as in Larour et al. (2012b, a) and Schlegel et al. (2013). This method is flexible (it can be applied to any type of forward model), but it is computationally expensive. Indeed, for a given cost function and a model input of size m (m being the number of degrees of freedom, such as the mesh size for an FEM discretization, or the grid size for a finite-difference discretization), this method requires at least m+1 computations of the forward model. For transient ice-flow models which are computationally expensive, this is therefore impractical. Furthermore, for highly nonlinear models the choice of perturbation greatly affects the quality of the approximation.
The second method relies on algorithmic differentiation (AD) of the forward model, where the derivative computation is enabled by semantically augmenting the computer program that implements F (Griewank and Walther, 2008). The AD approach has been implemented for common programming languages in a variety of tools. Such tools include source-to-source transformation frameworks (TAF, Giering et al. (2005); OpenAD, Utke et al. (2008); and Tapenade, Hascoët (2004)) and/or overloaded operator frameworks (ADOL-C (Automatic Differentiation of Algorithms), Griewank et al. (1996); Walter et al. (2012)). AD tools can automatically generate derivatives (first-order gradients, Taylor type developments, Hessians) of the forward model at machine precision and at a computational cost that, unlike the cost of forward-differencing methods, is a small fixed factor independent of m. Typically, source-to-source transformation tools can compute a gradient of the forward model for 4-10 times the cost of the forward run itself. This can be somewhat higher (but still fixed) for overloaded-operator approaches, which are not computationally as efficient. AD tools have been leveraged extensively in the oceanic context for state-of-the art ocean models (Marotzke et al., 1999;Heimbach et al., 2002;, and more recently to ice-sheet models (Heimbach and Bugnion, 2009;Goldberg and Heimbach, 2013).
In this paper we work with ISSM, which is written in C++ and uses a variety of object-oriented features. Because of the complexity of the C++ syntax and semantics, there currently are no AD tool for the comprehensive source code transformation of C++ models. Thus the operator-overloading approach becomes the method of choice. In this type of approach, the floating-point operations on quantities which are part of the computation to be differentiated are recorded into a tape during the forward run. This tape can then be interpreted in a reverse sweep to compute the gradient of the cost function with respect to model inputs, using the chain rule in reverse order. Here, the underlying library used to implement the overloading of our floating-point operations in ISSM is ADOL-C (Walter et al., 2012).
Testing and validation of the modifications were carried out against the forward difference approach implemented in Larour et al. (2012b). Benchmarks for computation times were also carried out, showing a computation time for the gradient of the cost function with respect to either α or M s on the order of 4 times the computation time for the forward model. This is for models on the order of 2000 degrees of freedom, which represents a very efficient ratio of performance. In reality, this ratio is much lower than the currently accepted ratio of 10 expected of operator-overloading approaches. The reason for this discrepancy lies in the fact that our stress-balance and mass-transport solvers are not fully scalable (Larour et al., 2012c). Our computation time, irrespective of whether it is carried out in forward or AD mode, is therefore 90 % constrained by the solver, and not the automatic differentiation phase. The ratio of 4 is therefore not representative of what is expected for a fully scalable solution.

Inverse method
If we restrict ourselves to the case of temporally inverting basal friction (the same logic applies to surface mass balance), we can compute ∂J /∂α(x, y, t), the gradient of our cost function J with respect to basal friction α(x, y, t). The cost function is computed using the forward model J = J (F (α(x, y, t))). α is variable both in space and time, as is the gradient itself. Using a classic steepest descent along the vectorial direction set by ∂J /∂α(x, y, t), we can infer an update α to the initial α 0 , such that α = α 0 + α leads to a simulated surface height evolution that minimizes the cost function. Once a minimum J is reached, we have effectively determined a new "inverted" α, for which the modeled icesheet height best fits surface altimetry. The difference between this algorithm and the classic approach presented in inversion studies such as MacAyeal (1993), Morlighem et al. (2010) and Arthern and Gudmundsson (2010) is the fact that we do not need to hand-derive the adjoint state of the model, as it is automatically computed by the AD gradient operator, and that the inversion is temporal in nature.
This inversion is also a data assimilation in that we compute corrections that have to be applied to an existing time series of forcings (here basal friction or surface mass balance) in order for a certain objective function of our model to match observations. Here, we do not invert for both forcings α and M s at the same time. Rather, we invert α given observed surface mass balance M s , and vice-versa. This approach allows us to better understand which parameter results in the best assimilation of existing altimetry observations, and what type of physical processes are involved in best fitting the model results to observed data.

Altimetry
Elevation changes during the 2003-2009 period were reconstructed from ICESat laser altimetry observations using the Surface Elevation Reconstruction And Change detection (SERAC) approach (Schenk and Csatho, 2012). This comprehensive method was developed to determine surface changes by a reconstruction of the surface topography. Although general in the design, SERAC was specifically developed for detecting changes in ice-sheet elevation from ICE-Sat crossover areas. It addresses the problem of computing a surface and surface elevation changes from discrete, irregularly distributed samples of the changing surface. In every sampling period, the distribution and density of the acquired laser points are different. The method is based on fitting an analytical function to the laser points of a surface patch, for example a crossover area, size 1 km × 1 km, or within repeat tracks, for estimating the ice-sheet surface topography. The assumption that a surface patch can be well approximated by analytical functions, e.g., low-order polynomials, is supported by various roughness studies of ice sheets . Considering physical properties of solid surfaces and the rather small size of surface patches suggests that the surface patches are only subject to elevation changes but no significant deformations. That is, the shapes of the 1 by 1 km surface patches centered at the crossovers remain constant and only the vertical position changes as it is confirmed by the low surface fitting errors obtained by SERAC (Schenk and Csatho, 2012). It is important to realize that SERAC determines elevation changes as the difference between surfaces, unlike other methods that take the difference between identical points of two surfaces.
Laser points of all time epochs of a surface patch contribute to the shape parameters, while the laser points of each time period determine the absolute elevation of the surface patch of that period. Since there are many more laser points than unknowns, surface elevation and shape parameters are recovered by a least squares adjustment whose target function minimizes the square sum of residuals between the fitted surface and the data points. The large redundancy makes the surface recovery and elevation change detection very accurate and robust. Moreover, the confidence of the results is quantified by rigorous error propagation.
With SERAC we reconstructed time series of elevation change histories at 837 ICESat ground-track crossover locations within the NEGIS drainage basin (Fig. 2). Assuming that the laser points entering the adjustment model are uncorrelated and have all the same weight, the random errors of elevation as a function of time are determined from the variance-covariance matrix of the normal equation. Elevation changes were corrected to remove the effect of vertical crustal motion due to glacial isostatic adjustment (GIA) and variations of firn compaction rates in [2003][2004][2005][2006][2007][2008][2009]. Indeed, ISSM is a model that relies on the assumption of incompressible ice flow, and the surface elevation must therefore be converted form a snow/ice equivalent (where density throughout the firn layer is variable) to an ice equivalent (where we assume a through-thickness density profile that is constant and equal to 917 kg m −3 ). GIA-related vertical crustal motion estimates are from A. G. et al. (2013), based on ICE-5G ice history, a Mantel Viscosity Model 2 (VM2, Peltier, 2004) and a 1 • by 1 • mesh. Estimates range from −2.7 to 4.6 mm yr −1 , with errors that are negligible compared with elevation changes due to other factors. Variations of firn compaction rates above the equilibrium line altitude (ELA) are from a 5 × 5 km gridded model forced by the output from the HIRHAM5 (subset of the HIRLAM (High Resolution Limited Area Model) and ECHAM (EC Hamburg) models) regional climate model (Sørensen et al., 2011;Lucas-Picher et al., 2012) and range from −0.016 to 0.146 m yr −1 (negative values of firn densification correspond here to cases where pore space increases relatively by addition of fresh new snow). Finally the corrected elevation changes were converted into ice-equivalent elevation changes using a constant ice density of 917 kg m −3 in the ablation and superimposed ice zones and a simple firn-densification model from (Reeh et al., 2005) and (Reeh, 2008) in the accumulation zone. This model assumes that all retained melt water (Superimposed Ice Remaining at the end of the melt season (SIR) = melt − runoff) refreezes at the same annual layer at the end of each balance year (31 August), giving where ρ s is the density of the annual firn layer on the surface; "SIR" is the amount of refrozen ice, estimated as the difference between the annual melt and runoff; M s is the annual net surface mass balance (all from RACMO2/GR, Ettema et al., 2009; van Angelen et al., 2012); ρ ice = 0.917 kg m −3 ; and ρ 0 is the temperature-dependent density of new firn before the formation of ice lenses. The density of the new firn is calculated from the following empirical relationship: ρ 0 = 625 + 18.7T f + 0.293T 2 f and T f = TMA + 26.6SIR, where T f is the firn temperature at 10 m depth and "TMA" is the mean annual temperature (Reeh et al., 2005).
Typical elevation errors for crossover areas higher up on the ice sheet, involving some 100 observations, are about of the firn-densification rate is 0.001 m yr −1 near the ice divide, increasing to 0.03 m yr −1 at the ELA (Sørensen et al., 2011), resulting in a 0.0025-0.075 m total additional elevation error during the ICESat mission. In addition to the GIA, vertical bedrock motion includes the elastic crustal response to contemporary ice-sheet mass changes; e.g., Khan et al. (2010), Sørensen et al. (2011). Uplift rates measured by the Greenland GPS Network (GNET) significantly exceed the predicted GIA rates, indicating that the present-day vertical crustal deformation is dominated by this elastic response (Bevis et al., 2012). Although the magnitude of the elastic response can reach a few cm yr −1 in the coastal regions, we have not attempted to remove it, because of the low temporal resolution of current best reconstructions (3-year averages, Abbas Khan, personal communication, 2014) and the lack of error estimates.

Spin-up and model inputs
Since our assimilation method does not adjust initial conditions of the model, we have to rely on a spun-up model state which more or less closely matches surface altimetry observations for the time period considered. Indeed, the success of inverse methods applied to nonlinear problems often relies, in practice, on initial guesses of the independent variables that yield states that are not too far from observations. However, as demonstrated by the wide range of model outcomes in the SeaRISE experiments (Nowicki et al., 2013a, b), spin-ups are very difficult to calibrate. One approach is to run long-term paleo-reconstructions of the GIS in ways that try to match present-day observations (Huybrechts and Oerlemans, 1988;Ritz et al., 1997;Pollard et al., 2005;Pollard and DeConto, 2009;Greve, 1997a). This approach is usually biased towards conservation of mass, where the diagnostics of choice is the ice thickness. It also usually relies on lower-order ice-flow models, such as the Shallow- Ice Approximation (Hutter, 1982), which are computationally efficient but tend to lead to large misfits to observed surface velocities.
Another approach is to rely on what has sometimes been described as "instantaneous spin-ups", in which inversion methods are used to try to capture the dynamics of ice flow at present time. This involves inferring basal friction at the ice-bed interface in order to match present-day observed surface velocities (MacAyeal, 1993;Morlighem et al., 2010;Arthern and Gudmundsson, 2010;Joughin et al., 2010;Petra et al., 2012). However, this approach relies on a steadystate thermal and mechanical regime for the ice sheet, which is not realistic, and usually leads to lumping any mismatch between model and observations into the inversion itself. In addition, as demonstrated on Nioghalvfjerdsfjorden Glacier (hereafter referred to as 79 North), artifacts in the interpolation of bedrock data can lead to ice-flux divergence anomalies, which are not physical (Rasmussen, 1988;Seroussi et al., 2011) over timescales of 10 to 50 years. This can be mitigated using a mass-conserving (MC) interpolation approach , allowing for transients that spin up in ways that match present-day ice velocities and iceflux divergence. Here, however, we are interested in variations in surface heights that are small and could easily be confused for residual ice-flux divergence anomalies that remain even after implementation of the MC approach.
We therefore opt for the different approach of combining both spin-up methods. We carry out an adjoint-based inversion of the basal friction coefficient (we refer the reader to Morlighem et al. (2010) and Larour et al. (2012c) for details on the implementation of this inversion within ISSM) using a MC bedrock of the area , followed by a relaxation of the ice sheet/ice shelf over a period of 50 000 years (in which forcings are kept constant), until the NEGIS ice volume stabilizes. The climate forcing is represented by a constant SMB taken equal to the average value between 1971 and 1988, when the GIS was considered more or less in steady-state balance . This is then followed by a forcing of the ice-sheet evolution starting from the Little Ice Age (LIA) in 1840 to the start of our assimilation, in 2003, using the SMB time series from Box (2013). This ensures that our spin-up does not exhibit ice-flux divergence anomalies, matches closely the present-day observed surface velocities (±100 m yr −1 over the whole basin, ±300 m yr −1 at the grounding line) and responds to variations in climate forcing over the last 173 years.
The ice boundaries for the NEGIS domain are determined by the position of the ice divide (as determined from the gradient direction of ice-sheet surface topography; ; see Fig. 3f) and ice-ocean as well as ice-rock boundaries from the Greenland Ice Mapping Project (GIMP; Howat et al., 2014). We also make sure, given the bedrock is computed using the MC approach (Fig. 3b), that the domain covers the extent of the 2008 InSAR (interferometric synthetic aperture radar) surface velocities, as shown in Fig. 3f, includ-  (Fig. 3a). The resulting ice thickness is shown in Fig. 3c, from subtracting the MC bedrock to the surface height. SMB for the 1971-1988 period comes from the Box (2013) time series (Fig. 3d). α, the basal drag coefficient used for the entire length of the relaxation period, as well as the 2003 to 2009 run, is inverted from the 2008 InSAR surface velocities (Fig. 3e). The underlying mesh for the FEM model (Fig. 3f) comprises 1409 elements, for a resolution ranging from 70 km near the ice divide to 30 km at the ELA and 5 km on the 79 North and Zachariae Isstrøm ice shelves. The altimetry data are interpolated onto the mesh vertices using a linear interpolation algorithm, between February 2003 and September 2009, when continuous data are available.
Starting 2003, the model is run at a 2-week time step, in order to coincide with the time sampling of the surface altimetry observations, and the cost function J is computed for the entire 2003-2009 time period. The inversion is carried out twice, once for M s and once for α. A simultaneous inversion for both parameters would be more efficient and realistic; however, the ISSM framework does not yet possess the capability to do so, but this type of inversion will definitely be the subject of future studies. Because the model spin-up does not reach a configuration that matches the altimetry time series within a 1σ standard deviation, a spatially variable time-mean bias is removed from the altimetry observations, corresponding to the difference between the 2006 DEM and the spun-up surface height (Fig. 4). This implies that we are here interested in what corrections have to be applied to M s and α to match short-term transient variations in surface height. As to longer term trends, we cannot match them as they depend on assimilating data over much longer time spans, which is not feasible as comprehensive altimetry data coverage on NEGIS prior to 2003 is not available.

Results
Gradients computed for the temporal inversion of α and M s exhibit high variability both in space and time (Fig. 5). This is expected given the quadratic nature of the cost function relied upon for the temporal inversion. Indeed, for our cost function (Eq. 7), the gradient to, for example, α will be of the form showing that the gradient is driven by the misfit s(t)−s obs (t), which is itself time-space variable. Peaks in the magnitude of both gradients are highly localized, with, for example, ∂J /∂α exhibiting clear peaks between 2003 and 2009 over Storstrømmen, which is a surgerecovering glacier (Reeh et al., 1994) where we indeed expect large variations in surface height. Such peaks can reverse sign in time, as is the case for ∂J /∂α 90 km upstream from Zachariae Isstrøm's ice front, which is positive in September 2003 and then turns negative starting June 2006. This is in contrast with vast expanses of the NEGIS domain where gradients can be largely constant in space and time. For example, ∂J /∂α and ∂J /∂M s are almost nil 100 km upstream of 79 North and Zachariae Isstrøm. For ∂J /∂α, this is probably related to the fact that the initial misfit to observations is smaller inland (Fig. 8d), which according to Eq. (11) implies that the gradient in the same area will be relatively small. For ∂J /∂M s , the lack of variation inland suggests that the overall trend in the forward model cap- tures the inland surface height variations realistically, meaning that corrections to the Box (2013) time series will not be significant, and no obvious bias is exhibited by our forward model. Over the mountain ranges between Storstrømmen and Zachariae Isstrøm, variations in ∂J /∂M s are high both temporally and spatially, suggesting that the time series of SMB has a complex signature that may not be fully captured by the SMB forcing. In particular, ∂J /∂M s is positive in September 2003, suggesting that improvements in our best fit to observations will be achieved by decreasing SMB over the mountain range. The situation reverses, with ∂J /∂M s turning negative in June 2006, pointing to the need for increasing SMB over this time period. In December 2009, the situation reverses again.
Overall, the variability in both gradients closely follow the variability in the ICESat surface height time series (Fig. 2). This clearly implies that our model lacks the intrinsic variability required to match observations. In addition, there is a clear demarcation line in the gradient, which runs perpendicular to flow, that coincides with an abrupt transition in ice thickness (see Fig. 3c, 1000 m contour in black) across the entire NEGIS domain. Upstream of this transition line, gradients become much more uniform and diffuse in space, though short variations in time remain significant. Downstream of this line, spatial variations become much sharper, with features developing on the order of 10-20 km, which could again be related to the fact that gradients above this line are smaller due to an initial low local misfit. Figure 6 shows the evolution of J during both inversions, over 35 iterations, after which convergence of the optimization is stopped, for considerations that are computational in nature. Both curves clearly demonstrate that corrections in the SMB time series (computed by the temporal inversion) are much more efficient in terms of reducing the overall misfit to observations than corrections in the basal friction coefficient. This is expected, as SMB is a direct forcing to the mass transport equation (Eq. 5), with a clear equivalence between SMB and surface thickening rate, while basal friction is a direct forcing to the stress-balance equations (Eqs. 1 and 2), which have no direct bearing on the surface thickening rate. In other words, it is much easier for the inversion algorithm to match surface heights by adding or subtracting mass directly from the ice column thickness (which is what SMB captures) than by modifying the state of stress at the ice-bed interface. The difference in convergence between varying alpha or M s is significant, with the SMB inversion reducing misfit J by 68 %, and the basal friction inversion reducing misfit by 14 %. This is in line with how observed surface heights are matched by the model at locations I and II (see Fig. 1). The first location correspond to the trunk of 79 North, while the second location corresponds to Zachariae Isstrøm, near the grounding line. Both locations are in areas of enhanced ice flow. At location I, the inversion increases basal friction over the entire ICESat time period, and the modeled surface height increases by approximately 20 cm starting 2007 (see Fig. 7a and b). The resulting improvement over the initial modeled surface height is not obvious, however, and points rather to an increase in the misfit. This increase is localized, and points to an intrinsic inability of the model to match surface heights at this location through variations in basal friciton. For location II, however, basal friction is decreased by the inversion between 2004 and 2008, and it results in a much better fit to local surface heights. At this location, the model is therefore capable of correcting the basal forcing appropriately to match observations. The local nature of the improvements is confirmed in Fig. 8, which shows that location I is in an area of increase of the misfit to observations, while location II is in the area that sees the most improvement.
For the SMB inversion at location I ( Fig. 7c and d), an increases in SMB slightly prior to 2006 and a significant decrease by almost 30 cm yr −1 after 2007 is modeled. The maximum decrease occurs around summer 2008. The resulting modeled surface height matches observations well, with a decrease in the modeled height reaching up to 60 cm in year 2009. The situation is very similar for location II, albeit with one difference: the magnitude of the SMB correction, which is very large at location II, with significantly more melting modeled by the inversion. Overall, the SMB inversion improves the best fit to observations much better than the basal drag inversion, as confirmed by  Fig. 7. Improvement in the best-fit between modeled surface height and the altimetry record after inversion of basal friction α and surface mass balance M s for the two locations indicated in Fig.(3f) corresponding to the center of 79 North (frames a),b),c) and d)) and a location near the grounding line of Zachariae Isstrøm (frames (e), f), g) and h)). Frames a) and e) show the difference between the original time series α 0 of friction (in black), and the inverted one α (in red). Frames b) and f) show the improvement in the modeled surface height (in red) best-fit to observations (in blue) compared to the original model (in black). Frames c) and g), d) and h) show similar results as for a) and e) and b) and f), with surface mass balance M s being the quantity inverted for instead of α. Errors in observed surface height time series for locations I and II, which are below the ELA, depend on the SERAC processing only (ice is assumed fully dense, with no firn compaction involved), and are estimated respectively at 8 cm and 1.2 m. 32 Figure 7. Improvement in the best fit between modeled surface height and the altimetry record after inversion of basal friction α and surface mass balance M s for the two locations indicated in Fig. (3f) Errors in observed surface height time series for locations I and II, which are below the ELA, depend on the SERAC processing only (ice is assumed fully dense, with no firn compaction involved) and are estimated respectively at 8 cm and 1.2 m. that the structure of the correction tightly matches the structure of the underlying SMB time series itself. Indeed, SMB is corrected mainly between peak summer rates, with the peaks themselves being preserved. Almost no correction to the summer values is detected, which is interesting given that the time step used for the transient runs is 2 weeks, which is short enough to allow the inversion to capture and correct peak summer rates. This suggests that modification in the summer peak magnitude does not profoundly impact the best fit to observed altimetry time series, and that the average summer-to-summer value is what critically controls the interannual variability in surface heights. It must be noted, however, that the inversion also predicts negative values for the melt rate throughout years 2005-2008, which is highly unrealistic. This is due to the fact that our SMB inversion does not constrain the variations of M s within a certain error range, as should be the case. This is certainly an aspect of our method that will be improved in further studies, in order to account for realistic variations in the model inputs.
Overall, the best fit to observations is improved by the inversion. However, locally, the improvement can be widely different. Figure 8 shows how J T (x, y) (cf. Eq. 9) is spatially distributed across the entire basin after inversion and what the resulting improvement J T (x, y) is. As expected, J T is much lower after inversion of SMB as opposed to basal friction (Fig. 8b vs. e, respectively). In terms of local improvement, J T is largest near the coastline, while it is more diffuse across the entire basin. For SMB, a greater decrease occurs near the main trunks of 79 North and Zachariae Isstrøm, but, the initial value of the misfit being also much higher, this still results in large misfit values near the coastline. For basal friction, the descent is much more difficult for the trunks of the two glaciers, with clear decreases of the misfit near the grounding line but small increases directly upstream of the grounding line. Indeed, as demonstrated by Fig. 7 and confirmed in Fig. 8, for location I the local misfit does increase for the basal drag inversion.

E. Larour et al.: NEGIS data assimilation of surface altimetry
This is compensated by large decreases near the grounding line and ice shelves of 79 North and Zachariae Isstrøm. Overall, a significant amount of misfit still remains at the coastline, where both inversions seem to be unable to further accommodate for short-term variations in surface height. Near the ice divide, the SMB inversion improves the best fit most, which can be explained by the uniformity of the ∂J /∂M s gradient in this area, allowing for a fast descent of the inversion algorithm. Of course, the best fit to observation depends on whether the convergence has been reached, and further improvements might be expected if the number of iterations is increased. This is especially true for the SMB inversion, which exhibits variations in J of 0.6 % for each iteration, after 35 iterations, as opposed to 0.15 % for the basal friction inversion. However, these values are significantly below the 1 % threshold, and we therefore do not expect large differences compared to a fully converged inversion.

Discussion
Our temporal inversion allows for the determination of forcings that best fit observations and that satisfy the physics described in the forward model. This is to our knowledge the first time this approach has been successfully carried out using an SSA-based transient ice-flow model, without relying on arbitrary tuning of surface and basal forcings. Our results clearly demonstrate that high variability in the model forcings is necessary to reproduce NEGIS surface altimetry from 2003 to 2009. Such variability is exhibited both in the inversion of M s and in the inversion of α, which suggests that our forward model lacks internal representation of such variability, and that enhancing our representation of boundary conditions M s and α in the forward model is therefore necessary.
At the surface, M s accounts for changes in the altimetry in terms of ice-equivalent mass (see Eq. 5). This is compatible with our ice-flow model, which is based on the assumption of incompressible ice flow, in which ice density is constant and equal to 917 kg m −3 . In order for the altimetry to be converted from the original surface to an iceequivalent surface, two firn-densification models were used. In the percolation-wet-snow-superimposed ice zones, the firn-densification model from Reeh et al. (2005) and Reeh (2008) was used, which includes the effect of ice lensing and is forced by the RACMO2/GR data (Ettema et al., 2009;van Angelen et al., 2012). Above the ELA, the firndensification model from Lucas-Picher et al. (2012) forced by HIRHAM5 climatologies was used, which accounts for densification through pore-space closure. Both models depend on atmospheric constraints such as accumulation rate, surface temperature and surface snow density. By relying on ice-equivalent thickness, it is therefore difficult to attribute which component of the variability exhibited in M s is due to the variability in the climate forcings used in the firn-densification formulation (such as surface snow den-sity, ice-lens content, accumulation rate and surface temperature). Therefore, while our approach clearly demonstrates that SMB time series for the area need to be corrected for, it also shows that without clear representation of firndensification processes in the forward model we cannot improve our understanding of which atmospheric and/or surface processes are most responsible for the surface height signature of NEGIS. It is therefore our intention to refine our approach in further studies, towards temporally inverting for surface snow density, surface temperature and accumulation rate. The goal is to understand which one of these processes is responsible for most of the variability observed on NEGIS. Ultimately, the hope is that inclusion of a firn-densification representation in the forward model will lead to increasingly smaller corrections required on the corresponding forcings, thus ensuring that the model itself intrinsically captures the observed surface height variability.
At the ice-bed interface, the basal drag coefficient exhibits high spatial and temporal variability which could be due to the underlying basal hydrology. Indeed, though a clear relationship has to our knowledge never been demonstrated, calibrated or measured between basal stress τ b (or driving stress τ d ) and sub-glacial water pressure w, empirical arguments such as in Alley (1989) suggest a relationship of the type N eff = k n τ d w , where k n is a basin-scale constant parameter. Because driving stress and basal stress are closely related through the stress balance equation, such relationships hint at a direct link between a highly variable drag coefficient and water pressure. Indeed, assuming this type of relationship holds, our approach can quantify variations in water pressure under the entire basin that can explain the observed variations in surface height. By a reasoning similar to our approach for surface mass balance, our results demonstrate the need for further integration of hydrological models in our forward model so that we can improve our understanding of how surface height variability can be generated by the water pressure forcing.
By design, our inversions were carried out independent of one another, which makes it difficult to attribute to either basal friction or surface mass balance the inferred variability in surface height. The fact that convergence is reached much faster and much more efficiently for SMB than for basal friction is a strong hint that most of the variability is probably atmospheric in nature; however, we cannot disregard entirely the variability in basal water pressure. Indeed, recent studies suggest strong links between water pressure and surface melt water draining through moulins and lakes, which can be seasonally driven (Alley et al., 2005;Luthje et al., 2006;Box and Ski, 2007;McMillan et al., 2007;van der Veen, 2007;Tedesco, 2007;Shepherd et al., 2009;Palmer et al., 2011;Tedesco et al., 2012). Another issue our inversions raise is the fact that surface altimetry is strongly biased towards inferring changes in surface mass balance, which, if we are to improve our understanding of variability and trends in basal hydrology, presents a challenge. Indeed, in order to invert for variations in basal friction, our observable should be surface velocity, as it is directly linked to basal stress through Eqs. (1) and (2), and plays a similar role to M s in Eq. (5). Several studies have demonstrated the usefulness of such an approach for steady-state ice-flow model inversions (MacAyeal, 1993;Morlighem et al., 2010;Vieli et al., 2006;Arthern and Gudmundsson, 2010), and our results suggest this extends to transient ice-flow models as well. Here, as previously alluded to by Heimbach and Bugnion (2009) and explored in Goldberg and Heimbach (2013), a combined approach should be entertained, in which both surface velocity and height be used to invert for the state of the ice at the ice-bedrock interface and at the surface. This puts serious constraints on the rate at which surface velocities from SAR platforms should be collected, but the emergence of satellites such as TerraSAR-X or Sentinel, which can provide high-repeat pass observations -in combination with continuous coverage from altimetry by CryoSat-2, sub-meterresolution stereo imaging from Worldview-2 (Shean et al., 2012) and in the coming years ICESat-2 -shows a highdegree of promise.
Some improvements to our methodology, which we are currently working on, should alleviate some of the issues regarding attribution of surface height signatures to surface or basal processes. Indeed, our inversion is not constrained by error margins in either our model forcings (in particular SMB) or model diagnostics (surface altimetry). By introducing such margins, we would ensure that our forcing corrections remain within the bounds of what is realistic. Indeed, it is highly probable that our cost-function decrease for the SMB inversion is too drastic and generates corrections that are too large to be acceptable within the uncertainty range of the time series from Box (2013). For basal friction, it is very difficult to assess the error margin on the initial time series. However, provided a basal hydrology model is included in the forward model, a better quantification of the uncertainty in the underlying hydrological model should be possible, which should result in a better quantification of the uncertainty in the computation of the basal drag coefficient itself.
Both inversions provide good results inland, where misfit is lowered significantly. Near the coastline, however, misfit remains significant (Fig. 8). The coastline is a very mountainous area, with few outlet glaciers (79 North, Zachariae Isstrøm, Storstrømmen) that are in contact with the ocean. For these outlet glaciers, the misfit can probably be attributed to a lack of representation of ice-ocean interactions. Indeed, melting rate under the ice shelf is taken constant during grounding line retreat, which does not take into account variations in sub-ice-shelf cavity circulation. For the remainder of the area, however, in the mountain ranges near the coast, high misfit is still observed. Given that ice velocities are negligible there, the misfit must be attributed to variations in SMB that are not captured in the initial forcing. This suggests large corrections are still required in the SMB local to these high-altitude areas. This could suggest two things: (1) that the altitude/lapse rate parametrizations need to be improved and (2) that our inversion needs to be locally and temporally refined in these specific areas. Indeed for the latter, our gradients computed in Fig. 5 provide a basin-scale vectorial direction along which the steepest-descent algorithm optimizes the cost function. However, smaller areas of NEGIS could be considered for the SMB inversion, for example those areas which exhibit high hypsometry only.

Conclusions
We presented a new data assimilation system within the ISSM framework that is capable of assimilating surface altimetry data from missions such as ICESat into reconstructions of transient ice flow. This system relies on algorithmic differentiation at its core to compute gradients of objective functions (such as a cost function between modeled and observed surface height) with respect to model forcings. An application to the Northeast Greenland Ice Stream was provided, where surface mass balance and basal friction forcings were temporally inverted, resulting in significant improvements in the best fit to observations. This new approach allows for a better understanding of which processes can be characterized by altimetry, and it illustrates the need for combining different data sets such as altimetry and satellitederived surface velocities into inversions of basal friction and surface mass balance. It also enables a better quantification of the contribution of each forcing to the model best fit to observations, and a better understanding of which type of physics are currently missing from transient ice-flow models in order to better characterize the important intra-and interannual variability in surface heights. Our results also demonstrate that large spatial and temporal variability is required in model forcings such as surface mass balance and basal friction, variability that can only be explained by including more complex processes such as snowpack compaction at the surface and basal hydrology at the bottom of the ice sheet. Our new approach, once combined with estimates of errors in the model inputs, should allow for a better identification of which underlying processes are responsible for specific signatures in the observed surface altimetry. This approach is indeed a first step towards assimilating the wealth of surface altimetry data that is currently available from EnviSat, ICESat, Operation IceBridge and CryoSat-2, and that will be available in the near future with the launch of ICESat-2.
Propulsion Laboratory's Research and Technology Development Program and the President's and Director's Fund Program. We would also like to acknowledge J. Box's help in providing SMB time series for the initial values of our inversion, J. van Angelen and M. van den Broeke for providing climatology for the density estimation and S. B. Simonsen for providing firn compaction estimates.
Edited by: A. Vieli