Simulated high-latitude soil thermal dynamics during the past 4 decades

Soil temperature (Ts) change is a key indicator of the dynamics of permafrost. On seasonal and interannual timescales, the variability of Ts determines the activelayer depth, which regulates hydrological soil properties and biogeochemical processes. On the multi-decadal scale, increasing Ts not only drives permafrost thaw/retreat but can also trigger and accelerate the decomposition of soil organic carbon. The magnitude of permafrost carbon feedbacks is thus closely linked to the rate of change of soil thermal regimes. In this study, we used nine process-based ecosystem models with permafrost processes, all forced by different observation-based climate forcing during the period 1960–2000, to characterize the warming rate of Ts in permafrost regions. There is a large spread of Ts trends at 20 cm depth across the models, with trend values ranging from 0.010± 0.003 to 0.031± 0.005 C yr. Most models show smaller increase in Ts with increasing depth. Air temperature (Ta) and longwave downward radiation (LWDR) are the main drivers of Ts trends, but their relative contributions differ amongst the models. Different trends of LWDR used in the forcing of models can explain 61 % of their differences in Ts trends, while trends of Ta only explain 5 % of the differences in Ts trends. Uncertain climate forcing contributes a larger uncertainty in Ts trends (0.021± 0.008 C yr, mean± standard deviation) than the uncertainty of model structure (0.012± 0.001 C yr), diagnosed from the range Published by Copernicus Publications on behalf of the European Geosciences Union. 180 S. Peng et al.: Simulated high-latitude soil thermal dynamics during the past 4 decades of response between different models, normalized to the same forcing. In addition, the loss rate of near-surface permafrost area, defined as total area where the maximum seasonal active-layer thickness (ALT) is less than 3 m loss rate, is found to be significantly correlated with the magnitude of the trends of Ts at 1 m depth across the models (R =−0.85, P = 0.003), but not with the initial total nearsurface permafrost area (R =−0.30, P = 0.438). The sensitivity of the total boreal near-surface permafrost area to Ts at 1 m is estimated to be of −2.80± 0.67 million km C. Finally, by using two long-term LWDR data sets and relationships between trends of LWDR and Ts across models, we infer an observation-constrained total boreal near-surface permafrost area decrease comprising between 39± 14× 10 and 75± 14× 10 km yr from 1960 to 2000. This corresponds to 9–18 % degradation of the current permafrost area.

S. Peng et al.: Simulated high-latitude soil thermal dynamics during the past 4 decades of response between different models, normalized to the same forcing. In addition, the loss rate of near-surface permafrost area, defined as total area where the maximum seasonal active-layer thickness (ALT) is less than 3 m loss rate, is found to be significantly correlated with the magnitude of the trends of T s at 1 m depth across the models (R = −0.85, P = 0.003), but not with the initial total nearsurface permafrost area (R = −0.30, P = 0.438). The sensitivity of the total boreal near-surface permafrost area to T s at 1 m is estimated to be of −2.80 ± 0.67 million km 2 • C −1 . Finally, by using two long-term LWDR data sets and relationships between trends of LWDR and T s across models, we infer an observation-constrained total boreal near-surface permafrost area decrease comprising between 39 ± 14 × 10 3 and 75 ± 14 × 10 3 km 2 yr −1 from 1960 to 2000. This corresponds to 9-18 % degradation of the current permafrost area.

Introduction
Arctic permafrost regions store ∼ 1300 Pg carbon (C) in the soil, including ∼ 1100 Pg C in frozen soil and deposits (Hugelius et al., 2014). Decomposition of these large carbon pools in response to permafrost thawing from projected future warming is expected to be a positive feedback on climate warming through increased emissions of CO 2 and CH 4 (Khvorostyanov et al., 2008;Schuur et al., 2008;McGuire et al., 2009;Koven et al., 2011;Schaefer et al., 2011). The magnitude of permafrost soil carbon feedbacks on climate depends on the rate of soil carbon decomposition, which is related to permafrost thaw; soil water and temperature changes; the quantity and quality of soil carbon available as a substrate for decomposition; and the concentration of oxygen in the soil, which determines the CH 4 vs. CO 2 production ratio (Schuur et al., 2008;Schädel et al., 2014;Elberling et al., 2013). Both the rate of permafrost thaw and the rate of soil carbon decomposition are closely related to soil thermal dynamics (Koven et al., 2011;Schädel et al., 2014;Elberling et al., 2013).
Measurements of active-layer depth across circumpolar regions and borehole temperature profiles indicate that activelayer thickness (ALT) on top of boreal permafrost has been increasing in response to the warming that occurred during recent decades in North America, northern Europe, and Russia (e.g., Zhang et al., 2001;Qian et al., 2011;Smith et al., 2005Smith et al., , 2010Romanovsky et al., , 2010. For example, the borehole record of Alert in Canada (82 • 30 N,62 • 25 W) shows that soil temperature at 9, 15, and 24 m increased at rates of 0.6, 0.4, and 0.2 • C decade −1 from 1978 to 2007, respectively . These observations provide long-term local monitoring of changes in active-layer thickness and soil temperature, but the measurement sites are sparse, and their temporal sampling frequency is often low (Romanovsky et al., 2010). Because site measurements can-not document permafrost area loss on a large scale, land surface models including "cold processes", such as soil freezethaw and the thermal and radiative properties of snow, are important tools for quantifying the rate of permafrost degradation on a large scale, and its evolution in response to climate change scenarios.
However, there are large uncertainties in soil thermal dynamics in land surface models (e.g., Koven et al., 2013), and these uncertainties also impact predictions of carboncycle feedbacks on climate. To quantify and reduce the uncertainty of modeled soil temperature (T s ), the driving factors of T s trends need to be investigated. Besides the uncertainty in model parameterization and structure, the gridded climate forcing for offline land surface models over highlatitude regions has large uncertainty (e.g., Troy and Wood, 2009;Rawlins et al., 2010). It is also important to distinguish the uncertainty caused by assigned parameter values and model structure from the uncertainty attributable to uncertain climate-forcing data.
In this study, nine process-based models that participated in the Permafrost Carbon Network (PCN, www. permafrostcarbon.org) were used (1) to compare trends of simulated T s at different depths over the boreal permafrost regions during the past 4 decades and to assess the uncertainty of modeled T s trends; (2) to identify which factors drive trends of permafrost T s ; and (3) to quantify the sensitivity of changes in near-surface permafrost area to warming.

Models and simulations
The nine land surface models that were used to simulate T s in permafrost regions organized by the PCN (www. permafrostcarbon.org) are listed in Table 1. All the models used a finite-difference solution of heat equation with phase change to simulate T s , but models have different soil depths, snow parameterizations, and soil thermal conductivities (Table 1). Three models (CLM, ISBA, UW-VIC) explicitly considered organic soil insulation, and seven models explicitly considered the effect of water in soil on phase change. All models explicitly considered snow insulation but with different snow layers. The soil thermal conductivity depends on soil moisture in all models. More details can be found in Rawlins et al. (2015) and Koven et al. (2015). We defined the Northern Hemisphere permafrost spatial domain as in Fig. 1, and the analysis considers three permafrost regions: boreal North America (BONA), boreal Europe (BOEU), and boreal Asia (BOAS) (Fig. 1;Brown et al., 1998). We did not include the Tibetan Plateau because not all the models covered this region. Hereafter, the term "boreal regions" is used for the sum of the three sub-regions BONA, BOEU, and BOAS in Fig. 1 (2001); Adam et al. (2006); Kalnay et al. (1996) Bohn et al. (2013) Surface shortwave and longwave downward radiation were internally calculated.
Following the simulation protocol of the PCN project, nine land surface models performed historical simulations from 1960 to 2000, using different forcing data sets (Table 1). The different modeling groups in this study used different forcing data sets for climate and other model boundary conditions (Table 1), which collectively represent uncertainty both from climate forcing (and other forcing files) and from model parameterization and structure in simulating soil thermal dynamics across the permafrost region. Climate-forcing data chosen by each group are presented in Table 1, and the differences in the trend of T a , precipitation, and radiative forcing are summarized in Figs. S1 and S2 in the Supplement. How differences between these drivers are related to differences of the modeled T s is discussed in the Results and discussion section.
To separate the contributions of the trends of four forcing variables (T a , atmospheric CO 2 , precipitation, and longwave downward radiation (LWDR)) to permafrost thermal dynamics and carbon stocks, six out of the nine models conducted factorial simulations (R01-R04). The ORCHIDEE and JULES performed two additional simulations (R05-R06) to isolate the contribution of LWDR to T s trends (Tables 2 and 3). In the reference simulation R01, all drivers varied at the same time. In R02 T a was detrended; in R03 atmospheric CO 2 was set constant to the observed 1960 level of 316 ppmv; in R04 both T a and precipitation were detrended; in R05 T a and LWDR were detrended; and in R06 T a , precipitation, and LWDR were detrended. Differences between two simulations were used to separate the controlling effect of each driver on T s . The interactions between CO 2 and T a as well as precipitation are also included in the differences variable, but with detrended T a and precipitation variable R05 variable, but with detrended T a and LWDR variable R06 variable, but with detrended T a , precipitation, and LWDR variable Table 3. The trends of annual air temperature (T a ), precipitation, and longwave downward radiation (LWDR) in the second to fourth columns. The fifth column shows the trends of annual T s at 20 cm in the reference simulation (R01). The last four columns show the contributions of drivers (T a , precipitation, CO 2 , and LWDR) to the trend of T s as mentioned in the Methods section. The relative contributions (divided by the trend of T s in Ref) are shown in the parentheses. The bold font indicates statistical significance (P < 0.05).

Model
Trend of T a ( • C yr −1 ) Trend of precipitation between the two simulations. For example, enhanced vegetation growth by increased T a /precipitation may transpire less water under higher CO2 conditions.

Analysis
Modeled monthly T s at 5, 20, 50, 100, 200, and 300 cm depths in every grid cell of each model was calculated by linear interpolation of T s between the central depths of two adjacent layers. Modeled T s at depths deeper than 300 cm (six models modeled Ts deeper than 300 cm, except CoLM, JULES and LPJ-GUESS was not extrapolated (the maximum soil depth of each model is shown in Table 1). For each of the boreal sub-regions -BONA, BOEU, and BOAS ( Fig. 1) -T s was first averaged over all grid cells and the trend of regional mean T s (denotedṪ s ) was calculated from a linear regression. The statistical significance ofṪ s is evaluated by a t test.
To estimate the uncertainty ofṪ s caused by differences in the trend of each climate input variable, we regressedṪ s against the trends of T a , precipitation, shortwave downward radiation (SWDR), and LWDR using the output of R01. The uncertainty ofṪ s attributed to each forcing variable was defined as the resulting range ofṪ s associated with different trends in each forcing variable in the models. To achieve this aim, we regressedṪ s against forcing variable across the models, and the uncertainty ofṪ s resulting from uncertain forcing data was calculated as the range ofṪ s from the maximum and minimum values of forcing data in the regression equation. Then we define theṪ s uncertainty attributed to model structure, which reflects the differences in model parameteriza- The Cryosphere, 10, 179-192, 2016 www.the-cryosphere.net/10/179/2016/ tions and parameter values, as the uncertainty ofṪ s assuming all models were using the same climate-forcing data.
Here, we defined near-surface permafrost as in previous studies (e.g., Schneider von Deimling et al., 2012): nearsurface permafrost is defined as where the maximum seasonal thaw depth (i.e., ALT) is less than 3 m. The total nearsurface permafrost area (NSPA) is the sum of the areas of grid cells that fulfill this condition.
We used monthly LWDR data from CRUNCEP v5.2 (http://dods.extra.cea.fr/data/p529viov/cruncep) and WATCH (Weedon et al., 2011) with a spatial resolution of 0.5 • by 0.5 • during the period 1960-2000 to derive the trend of LWDR. The CRUNCEP LWDR data set was derived from CRU TS3.21 and NCEP reanalysis meteorology, and ancillary data sets (e.g., Wei et al., 2014). The WATCH LWDR data set was derived from ERA-40 reanalysis (Weedon et al., 2011). Because there is no long-term, large-scale LWDR observation product available, we did an experiment using LWDR from CRUNCEP and WATCH data to estimate the loss of permafrost area during the period 1960-2000 by an empirical relationship between the loss of permafrost area and LWDR trends in seven out of the nine models (except LPJ-GUESS and UVic because LWDR was not used by these two models) (see Sect. 3.4 below).

Trend in upper-layer soil temperature over boreal regions
The simulated values ofṪ s at 20 cm depth averaged over boreal regions range from 0.010 ± 0.003 • C yr −1 (CoLM) to 0.031 ± 0.005 • C yr −1 (UVic) during the period 1960-2000 ( Fig. 2). Figure 3 showsṪ s at 20 cm for BONA, BOEU, and BOAS regions. Six out of the nine models show the largestṪ s at 20 cm in BOAS, followed by BONA and BOEU. The other three models (CoLM, JULES, and UW-VIC) show the small-estṪ s at 20 cm in BOAS. Among the six models with smalleṙ T s at 20 cm in BOEU, we found thatṪ s at 20 cm in BOEU is significantly lower than in BOAS and in BONA (P < 0.001, two-sample t test). This is also shown in the spatial distribution ofṪ s at 20 cm ( Fig. 4). For example, in northern Siberia, T s at 20 cm increased by more than 0.02 • C yr −1 in five out of the nine models (ISBA, LPJ-GUESS, MICRO-ESM, OR-CHIDEE, and UVic) but decreased in two models (CoLM and JULES). All models show an increase of T s at 20 cm in northern BONA, but this increase is of different magnitude between models (Fig. 4). Six models show significantṪ s at 20 cm over northern and western Siberia, but all models show non-significantṪ s at 20 cm over northern BOEU (Fig. 4).

Attenuation of the trend in soil temperature with soil depth
The trend of T s at different soil depths is shown in Fig. 5 for each model. Based on ground soil temperature observation, annual T s at 1.6 m increased by 0.02-0.03 • C yr −1 from the 1960s to 2000s in Russia (Park et al., 2014). The simulated trends of T s at 1.6 m over BOAS in most models are within this range (Fig. S3). Two models (CoLM and JULES) show vertically quasi-uniformṪ s over the upper 3 m of soil, probably because of too-quick soil thermal equilibrium in these two models. The seven other models show decreasing values ofṪ s with increasing soil depth, but the vertical gradient ofṪ s varies among them (Fig. 5a). UW-VIC has the largest negative vertical gradient ofṪ s (−0.0052 ± 0.0001 • C yr −1 m −1 ), followed by ISBA, MICRO-ESM, ORCHIDEE, and UVic (∼ −0.0030 ± 0.0003 • C yr −1 m −1 ) and by near-zero vertical gradient ofṪ s in CLM (−0.0009 ± 0.0003 • C yr −1 m −1 ) and in LPJ-GUESS (−0.0014 ± 0.0000 • C yr −1 m −1 ). Figure 5b shows the trend of T s in all soil layers over boreal regions. CLM and UVic show an increase of T s even at depths deeper than 40 m, but T s exhibited no changes deeper than 22 m in ORCHIDEE (Fig. 5b). from these two models. UW-VIC shows a negative trend of T s (i.e., cooling) at depths deeper than 2.5 m, which may be related to higher soil heat capacities with increased soil moisture, resulting in cooler summertime soil temperatures and shallower active layers in the regions . The trends of T s over BONA, BOEU, and BOAS regions decrease in magnitude with increasing soil depth, but they show different vertical gradients. In Fig. S3, the vertical gradient ofṪ s is shown to be larger in BONA and BOAS than that in BOEU for most models. Figure 6 shows the spatial distribution of the difference inṪ s at depths between 0.2 and 3 m.Ṫ s at 0.2 m is larger than that at 3 m over most regions in BONA, BOEU, and BOAS in seven out of the nine models, except JULES and CoLM. Generally, borehole records show that mean annual soil temperature at depths between 10 and 30 m has increased during the last 3 decades over the circumpolar northern permafrost regions (Osterkamp, 2003(Osterkamp, , 2007Romanovsky et al., 2010;Smith et al., 2005Smith et al., , 2012Vaughan et al., 2013). In Alaska, T s at 20 m from boreholes increased by ∼ 1 • C between the early 1980s and 2001 (Osterkamp, 2003). The observed value ofṪ s at one of the Alert (BH3) boreholes is ∼ 0.04 • C yr −1 at ∼ 2.5 m depth and nearly zero at ∼ 27 m depth during the period 1979-2004 (see Fig. 9 in  Smith et al., 2012). Some boreholes (BH1 and BH2) at Alert, however, still indicated a small warming during the period 1979-2008  at 37 m. This suggests that much deeper maximum soil depth than the currently prescribed maximum soil depths (Table 1) is needed for some models to calculate the heat flux into the entire soil profile (Stevens et al., 2007). CoLM, JULES, and LPJ-GUESS have too shallow a maximum soil depth for the calculation of permafrost soil temperature trends over the last 4 decades, which makes these models even less realistic for deeper T s projections over the next century (e.g., Alexeev et al., 2007). Compared to the increased ground temperature at depths deeper than 20 m in boreholes during the past 3 decades (Vaughan et al., 2013), most models that do not have deeper soil depth seem to underestimate the penetration of heat into deep soil layers (Fig. 5b). For the bottom boundary geothermal heat flux, eight out of the nine models are assumed to be zero. The ignored boundary geothermal heat flux is valid for the upper 20-30 m of soil within century scale  flux should not be ignored. Note that this comparison may be biased because of different periods and climate records between sites and model grid cells. It is also recommended that simulations at site level using in situ local climate forcing be compared with temperature profiles of boreholes  to evaluate why models underestimate the warming of T s at deeper depths.

Drivers of trends in soil temperature
We used the sensitivity runs (R02-R06) compared with the reference simulation with all drivers varying together (R01) to separate the effects of T a , CO 2 , precipitation, and LWDR onṪ s during 1960-2000 (Table 3). Seven of the nine models only provided results from R02, R03, and R04. Except for JULES, all the models show a positive response of T s to increasing T a , albeit with different sensitivities (Table 3). The fraction of the trend of T s explained by air temperature increase alone (R01-R02) is nearly 100 % in CLM and ISBA, and more than 100 % in UW-VIC, against only 34, 56, and 67 % in ORCHIDEE, UVic, and LPJ-GUESS, respectively. This indicates the importance of increasing T a for the trend of T s and is consistent with observations. Based on 30 cli-mate station observations in Canada during the period 1958-2008, T s at 10 cm significantly and positively correlates with T a at most sites (> 90 %) in spring, but at fewer sites (< 30 %) in winter (Qian et al., 2011). For winter T s , the winter snow depth was found to have significant and positive correlation with T s in shallow soil layers (e.g., Zhang et al., 2001;Qian et al., 2011). Recent increases in T a also explain the trend of T s at 1.6 m measured at Churapcha metrological station (62.02 • N, 132.36 • E), and at 5 m measured in a borehole at Iqaluit (63.47 • N, 68.48 • W) in Canada (Smith et al., 2005;. To some extent, the trend of T a is a good indicator for the trend of deep permafrost ground temperature with some time lag . For the modeled T s in land surface models, the effects of T a on T s depend on surface energy balance and ground heat flux into soil; i.e., the extent of coupled T a on T s is related to the surface properties such as snow, organic soil horizons, and roughness in the models. The different relative contributions of the trend of T a to the trend of T s in these models perhaps mainly result from the different model parameterization and structures, as the trends of T a (∼ 0.03 • C yr −1 ) in the climate forcing do not have a large spread (Fig. 7). The increase of atmospheric CO 2 concentration has almost no effect on the increase of T s in most models (−5 to +4 % of increase of T s , Table 3). This is expected since CO 2 has no direct effect on T s apart from its impact on climate. The only indirect effect of rising CO 2 on T s trends could result from feedbacks between plant productivity driven by rising CO 2 , soil carbon changes, and soil thermal properties. For instance, if models include heat production from microbial decomposition of soil organic carbon (Khvorostyanov et al., 2008) or if changes occur in soil organic carbon from the balance of net primary productivity (NPP) input and decomposition, these could impact the soil temperature directly or the profile of soil heat conductivity and capacity. In that case, the expected response is that a CO 2 -driven increase of productivity will increase soil organic carbon, which will enhance the insulation effect of soil organic carbon in the soil and lower the trend of T s Koven et al., 2009). Further, complex changes in the surface energy balance from changes in evapotranspiration under higher CO 2 concentrations can influence soil moisture content and affect T s trends (e.g., Field et al., 1995). Most models do not have a feedback between soil organic carbon dynamics and soil thermal properties, and the increase in soil organic carbon due to rising CO 2 is relatively small in the models compared to the initial soil organic carbon storage (< 0.1 %). The changes in evapotranspiration because of increasing CO 2 are also relatively small (−3 to +1 %). Therefore, the increased CO 2 concentration has a very small effect onṪ s from 1960 to 2000.
Precipitation shows an increase in BONA and BOEU and a decrease in BOAS in the climate forcing used by most models (Fig. S1b) (Table 3). Increasing winter snowfall can enhance T s in winter through the snow insulation effect (e.g., Smith et al., 2010;Koven et al., 2013). All models in this study indeed show higher winter T s where winter snow depth became deeper, albeit with different magnitudes of snow insulation effects across the models. The snow insulation effects are smaller in ISBA, LPJ-GUESS, and UVic than those in the other models. A decrease in snowfall could contribute to a negative trend of T s in CLM, and an increase in snowfall could enhance T s in ORCHIDEE ( Fig. S4; Table 3). In addition, increased rainfall in summer can cause an increase in evapotranspiration during the growing seasons, which could reduce the increase of T s . The effects of snowfall trends and growing-season precipitation trends may oppose each other as mentioned above. These two contrasting effects cannot be separated in this analysis, because models did not run simulations with seasonally detrended precipitation. But the different effects of seasonal precipitation on T s should be studied in the future.
LWDR significantly increased after 1960 in all models, albeit with different trends in the forcing data used by each modeling group (0.058 ∼ 0.200 W m −2 yr −1 ) (Fig. S2a). LWDR forcing is mainly from two reanalysis data sets (ERA and NCEP) with corrections (e.g., Weedon et al., 2011;http://dods.extra.cea.fr/data/p529viov/cruncep). OR-CHIDEE and JULES performed the simulation R05 with detrended LWDR. The results of R02-R05, allowingṪ s to be attributed to trends of LWDR, indicate that the increase of LWDR explains 56 and 31 % of the trend of T s since 1960 in ORCHIDEE and JULES, respectively. Increased LWDR provides additional energy to the surface and dominates the atmosphere-to-soil energy flux in winter over boreal regions when shortwave radiation is low. Even in summer, LWDR contributes ∼ 60 % of total downward radiation (SWDR+LWDR) over boreal regions in CRUNCEP. An in-crease of LWDR with time thus increases the surface energy input, which accelerates the warming of T s in case the extra energy is not dissipated by an increase of sensible and latent heat flux. The contribution of changes in LWDR, T a , and other factors to all components of the surface energy budget and to T s could be further studied by testing models against observations from eddy-flux towers located in permafrost soils.

Uncertainty of modeled soil temperature trends
The uncertainty of modeledṪ s at 20 cm is large, as given by the spread of model results (0.010-0.031 • C yr −1 ). The uncertainty ofṪ s across the models can be conceptually decomposed into two components: a forcing uncertainty (FU) reflecting how different climate input data used by each modeling group contribute to the spread ofṪ s (Table 1), and a structural uncertainty (SU) related to uncertain parameter values and different equations and parameterizations of processes in models. Since T a and LWDR are the two main drivers of the increase of T s in most of the models (Sect. 3.3), we re-gressedṪ s during 1960-2000 against the trends of T a and LWDR, in order to estimate the FU. We then estimated SU from the uncertainty of parameters in the regression equation for a normalized same climate forcing across the models.
We found no significant correlation betweenṪ a andṪ s over boreal regions or sub-regions across the nine models ( Fig. 7 and Fig. S5), indicating that a bias ofṪ a forcing is not simply associated with the bias ofṪ s in a particular model compared to the others. We also found that trends of SWDR and precipitation do not significantly explain differences inṪ s at 20 cm across the models (P > 0.05; 21 and 19 % explanation of differences inṪ s at 20 cm for trends of SWDR and precipitation, respectively; Fig. S6). The correlations between trends in winter snowfall and trends of annual or winter T s at 20 cm are not significant (P > 0.05) across the models for boreal regions or sub-regions. However, the trend of LWDR (LWDR) can explain 61 % of the differences inṪ s at 20 cm across the models (Fig. 8). This result indicates that, throughout the model ensemble, differences ofṪ s at 20 cm between models are positively correlated (R = 0.78, P = 0.037) with differences ofLWDR used by the different modeling groups.Ṫ s at 1 m also significantly correlated withLWDR (R = 0.79, P = 0.034) across the models. The values ofLWDR used by different models averaged over permafrost regions range from 0.058 to 0.200 W m −2 yr −1 , statistically explaining a range of simu-latedṪ s at 20 cm of 0.021 ± 0.005 • C yr −1 (solid blue arrow in Fig. 8). ThisṪ s range defines the FU (the range ofṪ s tȯ LWDR from 0.058 to 0.200 W m −2 yr −1 based on the linear regression of Fig. 8). We also used multiple linear regression betweenṪ s at 20 cm depth andṪ a , withLWDR as the independent variable across the models, to derive an estimation of the FU inṪ s of 0.021 ± 0.008 • C yr −1 (the deviation was derived from the uncertainty of regression coef- ficients in the multiple linear regression). However, the uncertainty of the linear regression ofṪ s at 20 cm byLWDR orṪ a andLWDR shows that, if all the models used the same climate-forcing data, the SU would be 0.012 ± 0.001 • C yr −1 (solid orange arrow in Fig. 8). If all models use LWDR from CRUNCEP or WATCH, then, applying the trend of annual LWDR (0.087 ± 0.023 W m −2 yr −1 from CRUNCEP and 0.187 ± 0.028 W m −2 yr −1 from WATCH) during the period 1960-2000 as an emerging observation constraint empirical relationship in Fig. 8, the posterior range is reduced compared with the priorṪ s range (black curve in right panel of Fig. 8). Overall, the total uncertainty range ofṪ s at 20 cm (∼ 0.02 • C yr −1 , defined as the spread ofṪ s at 20 cm across the models) can be broken down into FU (0.021 ± 0.008 • C yr −1 ) and SU (0.012 ± 0.001 • C yr −1 ). Since FU and SU are not independent, the total uncertainty ofṪ s at 20 cm is not the sum of FU and SU. Further, we found that correlation coefficients between trends of summer T s at 20 cm and at 1 m and summer LWDR over boreal regions are statistically significant (P < 0.05) (Fig. S7). This is also found for winter (November to March) T s at 20 cm and 1 m (Fig. S8). Trends of summer and winter T s at 20 cm or 1 m are not significantly correlated with climate drivers other than LWDR (snowfall, rainfall, T a , and SWDR) across the models (P > 0.05).
Meteorological stations are sparse in the cold permafrost regions. For example, there are only 8.8 stations per million square kilometers north of 60 • N in the CRU TS3.22 gridded air temperature product, compared to 41.1 stations per million square kilometers between 25 and 60 • N. This results in uncertainty in gridded climate products over Arctic regions, especially for trends of Arctic climate variables (Mitchell and Jones, 2005;Troy and Wood, 2009;Rawlins et al., 2010;Weedon et al., 2011). Troy and Wood (2009) reported 15-20 W m −2 of differences in radiative fluxes on seasonal timescales over northern Eurasia, among six gridded products. Among different gridded observations and reanalysis precipitation products, the magnitude of Arctic precipitation ranges from 410 to 520 mm yr −1 , and the trend of Arctic precipitation also has a large spread (Rawlins et al., 2010). These large uncertainties in climate forcing in the Arctic undoubtedly can cause a large spread of modeled T s . We found that the FU dominates the total uncertainty ofṪ s . This suggests that modelers not only need to improve their models, but they also need better climate-forcing data (or need to test the effects of different climate input data) when modeling long-term changes of T s in permafrost regions. However, to quantify the SU, simulations using the same agreed-upon climate-forcing data are highly recommended to further attribute the contribution of each process in the soil thermal dynamics of models such as organic carbon insulation effects, snow insulation effects, latent heat formation and emission, soil conductivity, and surface properties (see Koven et al., 2009;Bonfils et al., 2012;Gouttevin et al., 2012). In addition, important processes in permafrost regions such as dynamics of excessive ground ice (e.g., ice wedge growth and degradation) and thermokarst lakes (formation, expansion, and drainage) should be developed and evaluated in land surface models to improve the prediction of future permafrost feedbacks (e.g., van Huissteden et al., 2013;Lee et al., 2014).

Emerging constraint on how much near-surface permafrost has disappeared
The total boreal NSPA during 1960-2000 estimated by the nine models ranges from 6.8 million km 2 (CoLM) to 19.7 million km 2 (ORCHIDEE). The average of total NSPA in the nine-model ensemble (12.5 million km 2 ) is smaller than the estimate from the International Permafrost Association (IPA) map (16.2 million km 2 ; Brown et al, 1998;Slater and Lawrence et al., 2013). A statistic model based on relationships between air temperature and permafrost shows that permafrost extent over the Northern Hemisphere was also estimated in the range 12.9-17.8 million km 2 (Gruber, 2012), and six out of the nine models are within this range. Eight out of the nine models show a significant decrease in NSPA with climate warming during 1960-2000 (except UW-VIC). The loss rate of NSPA is found to vary by a factor of 13 across the nine models, varying from −4 × 10 3 km 2 yr −1 in MIROC-ESM to −50 × 10 3 km 2 yr −1 in JULES (Fig. 9a). The average loss rate of NSPA across the models (−23 ± 23 × 10 3 km 2 yr −1 ) is smaller than in the previous estimations of Burke et al. (2013) and Slater and Lawrence (2013). For example, the loss rate of NSPA was estimated at −81 × 10 3 to −55 × 10 3 km 2 yr −1 during the period 1967-2000 by JULES offline simulations with different climate-forcing data sets (Burke et al., 2013). The ranges of loss rate of NSPA in BONA, BOEU, and BOAS across the models are −16.6 × 10 3 to 2.2 × 10 3 km 2 yr −1 , −4.0 × 10 3 to 0.0 × 10 3 km 2 yr −1 , and −34.2 × 10 3 to −1.1 × 10 3 km 2 yr −1 , respectively (Fig. 9). This is consistent with the observed permafrost degradation (decrease in thickness) in these regions (Vaughan et al., 2013). The retreat rate of NSPA is not correlated significantly with the initial NSPA of each model (R = −0.30, P = 0.438), implying that the initial state of the models is less important than their response to climate change in determining NSPA loss rates. Contrary to the small effect of initial NSPA, the trend of summer T s at 1 m is found to be strongly correlated with NSPA loss rates across the models of the ensemble. Figure 9 shows that the trend of summer T s at 1 m can explain 73 % of the differences in NSPA loss rates between models. The sensitivity of NSPA loss rate to summerṪ s at 1 m is estimated to be −2.80 ± 0.67 million km 2 • C −1 , based on the linear regression between the loss rate of NSPA and the trend of summer T s at 1 m across the nine models (Fig. 9). For the BONA, BOEU, and BOAS sub-regions, the sensitivities of NSPA loss rate to summerṪ s at 1 m are −0.74 ± 0.10, −0.09 ± 0.03, and −1.74 ± 0.59 million km 2 • C −1 , respectively (Fig. 9). The sensitivity of future total NSPA changes to T a over pan-Arctic regions was estimated to be −1.67 ± 0.7 million km 2 • C −1 , ranging from 0.2 to 3.5 million km 2 • C −1 in the CMIP5 multi-model ensemble (Slater and Lawrence, 2013;Koven et al., 2013). The average of trends in summer T s at 1 m is only 70 % (43-100 %) ofṪ a in the nine models, so the sensitivity of total NSPA to T a over boreal regions in the nine models is about −2.00 ± 0.47 million km 2 • C −1 , which is larger than that from the the CMIP5 multi-model ensemble but comparable within the uncertainties of each estimate (Slater and Lawrence, 2013). Six out of the nine models of this study were also used as land surface schemes of the coupled CMIP5 models, but possibly for different versions.
A mean positive trend of summer LWDR of 0.073 ± 0.030 and 0.210 ± 0.027 W m 2 yr −1 over boreal regions from 1960 to 2000 is derived from the CRUNCEP and WATCH data sets, respectively. We applied this trend of LWDR to an emerging constraint on summer T s trends from the relationship between the trend of summer LWDR and the trend of summer T s at 1 m (Fig. S7). This approach constrains the trend of summer T s to 0.014 ± 0.004 • C yr −1 with CRUN-CEP and to 0.027 ± 0.004 • C yr −1 with WATCH. The uncertainty is reduced by 50 % from the prior range including different models and different forcings. A total NSPA loss rate of 39 ± 14 × 10 3 km 2 yr −1 can be constrained by multiplying the sensitivity of total NSPA loss rate to sum-merṪ s at 1 m (−2.80 ± 0.67 million km 2 • C −1 ) by the trend of T s at 1 m, itself empirically estimated byLWDR during 1960-2000 from CRUNCEP (0.014 ± 0.004 • C yr −1 ). The constrained loss rate of NSPA over BONA, BOEU, and BOAS based upon the CRUNCEPLWDR from 1960 to 2000 is 11 ± 5 × 10 3 , 1 ± 1 × 10 3 , and 25 ± 11 × 10 3 km 2 yr −1 , respectively. Similarly, if WATCHLWDR is used to constrain the NSPA loss rate, the total NSPA loss rate is 75 ± 14 × 10 3 km 2 yr −1 , and the loss rate of NSPA over BONA, BOEU, and BOAS is estimated to be 28 ± 10 × 10 3 , 2 ± 1 × 10 3 , and 39 ± 19 × 10 3 km 2 yr −1 , respectively. The southern boundary of the discontinuous permafrost zone has been observed to shift northward during recent decades (Vaughan et al., 2013), which is generally consistent with the simulations reported in this study. The larger warming rate and higher sensitivity of NSPA loss to T s over BOAS could explain the reason for significant degradation of permafrost over BOAS compared to the other boreal regions (Vaughan et al., 2013). The larger permafrost degradation rate in BOAS than that in BONA may have larger effects on changes in vegetation distribution and growth, and permafrost carbon in these two regions, and it can be quantified in future studies. Obviously, there is a large difference in constrained NSPA between CRUNCEP and WATCH. In the future, long-term climate reanalysis including radiation evaluated against sites with long-term radiation measurements (http://www.geba.ethz.ch) would be extremely useful for land surface models to provide improved estimate of NSPA.

Conclusions
In this study, trends of soil temperature (T s ) over boreal regions from nine process-based models were analyzed for the past 40 years. All models produce a warming of T s , but the trends of T s at 20 cm depth range from 0.010 ± 0.003 • C yr −1 (CoLM) to 0.031 ± 0.005 • C yr −1 (UVic) during 1960-2000. Most models show a smaller increase of T s with deeper depth. Air temperature (T a ) and LWDR are found to be the predominant drivers of the increase in T s averaged across large spatial scales. The relative contribution of T a and LWDR trends to the increase of T s is, however, different across the models. Note that the relative contribution of LWDR is based on only two models in this study, and this needs further investigation. The total uncertainty of the trend of T s at 20 cm is decomposed into the uncertainty contributed by uncertain climate-forcing data sets (0.021 ± 0.008 • C yr −1 ) and the uncertainty reflecting model structure (0.012 ± 0.001 • C yr −1 ). The NSPA loss rate is significantly correlated among the model results with the simulated trend of T s at 1 m, with a linear sensitivity of total NSPA loss rate to summer trend of T s (Ṫ s ) at 1 m of −2.80 ± 0.67 million km 2 • C −1 . Based on LWDR from CRUNCEP and WATCH data, the total NSPA decrease is estimated to be 39 ± 14 × 10 3 -75 ± 14 × 10 3 km 2 yr −1 from 1960 to 2000. The constraint method used in this study could be applied to estimate historical and future permafrost degradation rate, and further to quantify the permafrost carbon loss by a permafrost carbon distribution map (Hugelius et al., 2014).
Given that meteorological stations are sparse in the cold permafrost regions, especially in Siberia and other unpopulated land in the north, the gridded climate products over high-latitude regions have a large uncertainty as well (Mitchell and Jones, 2005;Rawlins et al., 2010;Weedon et al., 2011). This large uncertainty could propagate into simulated permafrost dynamics and feedbacks. More sites are needed in high-latitude regions for reducing the climate uncertainty. Future model intercomparisons on permafrost dynamics should investigate the full uncertainty by conducting simulations for multiple climate-forcing data sets. Since the beginning of the satellite era, microwave emissivity data related to land surface temperature have become increasingly available (e.g., Smith et al., 2004). These images could be used to independently evaluate soil surface temperature in models on a large scale or be integrated in ground temperature models (e.g., Westermann et al., 2015), although they have their own uncertainties. In addition, many complex processes affect permafrost thermal dynamics in the models, such as soil organic insulation effects, snow insulation effects, and soil freeze-thaw cycles; it is valuable to evaluate the uncertainty of each process effects on soil thermal dynamic simulations based on site measurements. This could be helpful for reducing permafrost simulation uncertainty.