Organic Content and explore associated Hydrological and Thermal Dynamics in an Arctic Tundra

Quantitative characterization of soil organic carbon (OC) content is essential due to its significant impacts on surface–subsurface hydrological-thermal processes and microbial 10 decomposition of OC, which both in turn are important for predicting carbon-climate feedbacks. While such quantification is particularly important in the vulnerable organic-rich Arctic region, it is challenging to achieve due to the general limitations of conventional core sampling and analysis methods, and to the extremely dynamic nature of hydrological-thermal processes associated with annual freeze-thaw events. In this study, we develop and test an inversion scheme that can flexibly 15 use single or multiple datasets, including soil water liquid, temperature and electrical resistivity data (ERT), to estimate the vertical distribution of OC content. We subsequently explore the control of OC on hydrological-thermal behavior. We employ the Community Land Model to simulate nonisothermal surface-subsurface hydrological dynamics from the bedrock to the top of canopy, with consideration of land surface processes and ice/liquid water phase transitions. For 20 inversion, we combine a deterministic and an adaptive Markov chain Monte Carlo (MCMC) optimization algorithm to estimate posterior distributions of desired model parameters. For hydrological-thermal to geophysical variable transformation, the simulated subsurface temperature, liquid and ice water content are explicitly linked to soil apparent resistivity via petrophysical and geophysical models. We validate the developed scheme using different 25 numerical experiments and evaluate the influence of measurement errors and benefit of joint inversion on the estimation of OC and other parameters. We also quantified the propagation of uncertainty from the estimated parameters to prediction of hydrological-thermal responses. We find that compared to inversion of single dataset (either temperature or liquid or apparent resistivity), joint inversion of these datasets significantly reduces parameter uncertainty. We find 30 that the joint inversion approach is able to estimate OC and sand content within the shallow active layer (0.3 m) with high reliability. Due to the small variations of temperature and moisture within The Cryosphere Discuss., doi:10.5194/tc-2017-1, 2017 Manuscript under review for journal The Cryosphere Published: 7 February 2017 c © Author(s) 2017. CC-BY 3.0 License.


Introduction
Soil organic carbon (OC) and its influence on terrestrial ecosystem feedbacks to global warming in permafrost regions is particularly important for the calculation of global carbon budget and prediction of future climate variation. Warmer air temperature leads to permafrost degradation, 10 which is expected to enhance decomposition of huge pools of previously-frozen OC, releasing carbon dioxide and methane to the atmosphere, and enhancing global warming (Koven et al., 2011;Schaphoff et al., 2013;Schuur et al., 2015). In that context, accurate estimation of OC content stored in both active layer and permafrost is crucial for investigation of carbon stocks exposing for microbial decomposition. 15 Predictive understanding of ecosystem feedbacks to climate in permafrost regions requires quantitative knowledge of surface-subsurface hydrological-thermal dynamics, which in turn are strongly governed by the hydrological-thermal properties of soil OC .
In particular, there are dramatic differences between thermal and hydraulic properties of OC and 20 mineral soil, both of which typically co-exist in shallow permafrost systems. OC's thermal conductivity (e.g., !",!"# = 0.05 W/mK) is significantly lower than that of mineral soil (e.g., !"#$ =8.4 W/mK). By contrast, its heat capacity is higher than mineral soil. Considering hydrological properties, the hydraulic conductivity of OC is higher and its capillary pressure is smaller than mineral soil (Lawrence and Slater, 2008). In addition, while mineral soil porosity 25 typically ranges from 0.4 to 0.6, the porosity of OC soil is usually greater than 0.8. Due to its low thermal conductivity, a top OC layer can behave as an insulator that reduces the magnitude of heat and energy exchange between the atmosphere and deeper soil (e.g., Hinzman et al., 1991;Rinke et al., 2008). Nicolsky et al. (2007) and Jafarov and Schaefer (2016) reported that inclusion of vertical OC content profile into land surface model can considerably improve prediction of 30 subsurface moisture, temperature and carbon dynamics. However, our ability to measure or estimate the distribution of OC is currently challenging, which inhibits accurate model prediction.
OC content is usually measured from core samples, which are collected from field sites and then analyzed in the laboratory (e.g., Kern, 1994). While this method is relatively accurate, it is labor intensive and typically limited in spatial coverage. There have been few studies attempting to indirectly estimate OC content (or its hydrological-thermal properties) by minimizing the difference between numerical simulations and measurements of soil temperature or soil moisture 5 in an inversion framework. For example, Nicolsky et al. (2009) employed a variational data assimilation technique to estimate the thermal parameters and porosity of different OC and mineral soil layers. Atchley et al. (2015) used soil temperature data to calibrate the hydrologicalthermal properties of moss, peat and mineral soils. However, these studies only used single dataset for inversion. There has been no study that jointly uses multiple types of data to improve the OC 10 content estimation.
Geophysical methods hold potential for characterizing the subsurface in permafrost regions as well as their associated physical, hydrological and thermal processes. Geophysical techniques offer an advantage over conventional point measurement techniques because they provide spatially 15 extensive information in a minimally invasive manner (e.g., Hubbard and Rubin, 2005). In spite of the potential benefits offered by geophysical data for characterizing permafrost systems, 25 geophysical inversion approaches typically suffer from several challenges. First, inversion methods are often ill-posed due to the fact that geophysical observables are sensitive to different soil properties. Secondly, inversion approaches often typically require petrophysical models to link the geophysical observables with the property of interest. Finally, there are differences between the geophysical support scale and the scale of the imaging target (Hubbard and Linde, 30 2011). In order to take advantage of information inherent in geophysical signatures yet minimize the non-uniqueness challenges described above, many recent studies have explored the value of coupled hydrogeophysical inversion frameworks for estimating soil properties (e.g., Kowalsky et al., 2011;Johnson et al., 2009;Tran et al., 2016). In these studies, the hydrological and geophysical models are coupled together so that geophysical data are used to estimate soil properties that control the subsurface hydrological-thermal dynamics. However, coupled hydrogeophysical inversion approaches developed to date are not adequate for use in permafrost 5 systems due to several gaps. Developed methods have only been applied to snow-free systems without consideration of the significant dynamics associated with the free-thaw transition.
Developed coupled hydrogeophysical inversion approaches have also not yet incorporated surfacesubsurface interactions (e.g., evapotranspiration, energy balance, plant water uptake). Finally, while a few studies have used Soil Vegetation Atmospheric Transfer (SVAT) models to 10 qualitatively interpret geophysical data (e.g., McClymont et al., 2013), to date, no study has coupled SVAT and geophysical models and data to improve property estimation.
Building on recent advances in the use of electrical methods to characterize permafrost properties as well as coupled hydrogeophysical inversion approaches described above, this study focuses on 15 the development of a hydrological-thermal-geophysical inverse approach to estimate OC and its control on hydrological-thermal responses in Arctic systems using single or multiple datasets. Our approach advances and couples several algorithms. We use a SVAT model known as Community Land Model (CLM4.5, Oleson et al., 2013) to simulate water, heat and energy exchange from the bedrock to the top of canopy. The model considers most of the land surface processes, ice/liquid 20 phase change and surface-subsurface hydrological-thermal dynamics. For parameter estimation, we combined deterministic and stochastic optimization algorithms to concurrently obtain the best parameter estimates and their associated uncertainties. The deterministic optimization algorithm is employed to estimate the initial parameter set and covariance matrix of the proposal distribution.
For the stochastic optimization, we used an advanced MCMC method known as Delayed rejection 25 Adaptive Metropolis (DRAM, Haario et al., 2006). With this implementation together with the adaptive MCMC algorithm, we expect to obtain the posterior probability distributions (pdfs) of the desired model parameters more quickly than the traditional MCMC technique. For hydrologicalthermal to geophysical transformation, we explicitly consider the dependence of the soil apparent resistivity on the soil ice/liquid water content and soil temperature via petrophysical and forward 30 geophysical models. This study advances capabilities to estimate and understand the controls of OC on hydrological and thermal properties through developing a hydrological-thermal-geophysical inversion scheme and exploring its potential to estimate the vertical distribution of OC and mineral content at several depths within a representative synthetic Artic soil column. Herein, we use synthetic studies to: 1) evaluate the relationship between the measurement error and uncertainties of parameter 5 estimates, 2) examine the improvement in parameter estimation offered by including various datasets in the inversion, including electrical resistivity data, 3) investigate how OC estimation changes if the mineral and petrophysical parameters are unknown, 4) explore how parameter estimation changes when soil porosity functionally correlates with the OC and mineral content, and 5) investigate the uncertainty propagation from the OC and mineral content to the 10 hydrological-thermal prediction.
The paper is organized as follows. Section 2 describes the development of the hydrologicalthermal-geophysical inversion scheme. Section 3 analyzes and discusses the results of different synthetic experiments. Summary and concluding remarks are provided in Section 4.

Methodology
The joint hydrological-thermal-geophysical inversion scheme developed in this study is shown in Figure 1. Generally, the scheme includes two main components: 1) A forward coupled hydrological-thermal-geophysical model that generates the subsurface state variables (i.e., ice/liquid content and temperature), and then transforms these variables to the apparent resistivity 20 using a set of petrophysical formulas and a forward electrical resistivity model ( Figure 1a); 2) A combined deterministic-stochastic optimization algorithm to estimate the pdfs of desired model parameters ( ), which include the OC content vertical profile (scenarios from 1 to 9), sand content vertical profile (scenarios 8-9) and petrophysical parameters (scenarios 8-9) (see Table 2) by minimizing the misfit between measured and simulated data. It is worth noting that the scheme is 25 developed so that single (e.g., soil temperature or liquid water content or apparent resistivity) or multiple datasets can be used for inversion.

Hydrological-thermal model
In this study, we employed CLM4.5 model (hereafter referred to as 'CLM'), which can effectively 30 simulate different land surface energy balance and surface-subsurface hydrological-thermal processes (Oleson et al., 2013). CLM represents horizontal heterogeneity using multiple parallel soil/snow columns having different land use and plant function types. The lateral flow between the soil columns is not accounted for in CLM. The model simulates the freeze-thaw dynamics by considering two phases of water: liquid and ice. The rate of phase change depends on the energy excess (for the ice to liquid transition) or deficit (for the liquid to ice transition) from the soil 5 temperature to the freezing temperature. Given CLM's ability to simulate different hydrologicalthermal processes in cold regions, we found it suitable for Arctic soil column simulations. The minimum requirements for the top boundary conditions in CLM include precipitation, incident solar, air temperature and wind speed. The land use and plant type information can be provided by users or extracted from the available model database. CLM assumes that soil is a mixture of three 10 soil types, namely, OC, sand and clay. It calculates the soil hydrological-thermal parameters based on the fractions of these soil types and their corresponding hydrological-thermal properties (see Appendix A for more detailed information on these relationships). In this study, we focus on estimating OC and mineral content and exploring their control on hydrological-thermal behavior under freeze-thaw conditions. 15 For more detailed exploration of the vertical variability of subsurface properties and associated hydrological-thermal dynamics, we increased the default number of soil layers in the CLM from 15 to 32 layers, and defined the depth of layer i th (z i ) as: (1) 20 Of these 32 layers, CLM assumes that the 5 bottom layers are bedrock layers; hydrologicalthermal dynamics are simulated in the top 27 soil layers, while only thermal dynamics are simulated in the 5 bottom bedrock layers. Equation 1 was used to ensure that the layer thicknesses near soil surface are thinner than those near the bottom (as shown in Figure 3) in order to capture the important hydrological and thermal dynamics at the topsoil active layers. 25 Moreover, in order to explore how the soil porosity influences the estimation of soil OC and sand content, we modified the CLM to consider two cases 1) the soil porosity profile was fixed and independent from the soil OC and sand content (see scenarios from 1 to 8 in Table 1), and 2) the soil porosity was calculated from the OC and sand content as default in the CLM (see scenario 9 30 in Table 1) in which is the soil porosity; %OC is the OC content in soil; Φ !"# and Φ !" are, the porosity of mineral and OC, respectively. In the CLM, the OC porosity is given as Φ !" = 0.9 and the mineral conductivity is calculated from sand fraction as properties with respect to the OC content, sand content and liquid saturation is similar for both cases. When the OC fraction increases from 0 to 100%, the soil thermal conductivity decreases and the soil heat capacity increases. By contrast, higher sand fraction leads to higher thermal conductivity and lower heat capacity. These relationships are expected, given that OC has a 15 considerably smaller thermal conductivity and a slightly higher heat capacity compared to sand.
The figure also shows that both soil thermal conductivity and heat capacity significantly increase with increasing liquid saturation. This is also reasonable, as the thermal conductivity and heat capacity of liquid water are much higher than that of air. The thermal diffusivity is defined as the ratio between the thermal conductivity and heat capacity. The figure indicates that the diffusivity 20 increases when the OC decreases and sand content increases.
Compared the two cases of soil porosity, the figure shows that when the soil porosity is defined as a function of OC and sand content, the soil thermal properties change more quickly in larger ranges with the variation of OC content, sand content and liquid saturation. It is because while the 25 soil porosity is fixed at 0.7 in the first case (Figure a

Petrophysical and geophysical transformation
In our inverse scheme, we link the output of hydrological-thermal simulation described above (soil ice/liquid saturation and temperature) to soil electrical conductivity using the following petrophysical relationship: in which is the porosity; S w is the liquid water saturation in the pore space; m and n are the cementation and saturation indexes, respectively, and ! is the soil electrical conduction, which was fixed at ! =0.005 S/m in this study ( Table 1). The water electrical conductivity is calculated from the concentration of all ions in water as (Minsley et al., 2015): in which ! , and z i are the ionic mobility and valence of the i th ion, respectively. Similar to Minsley et al. (2015), we assumed that Na + and Clare two main ions in this synthetic study. F c is Faraday's constant. ! is the concentration of the i th ion, which depends on the ice/liquid fraction as: 15 in which S fi and S fw are, respectively, the fraction of ice and liquid in ice-liquid water (S fi +S fw =1); varies from 0 to 1 which is the coefficient accounting for the reduction of soil water salinity when liquid saturation decreases. A larger implies a larger increasing rate of ion concentration with decreasing liquid fraction. The concentrations of ions in the ice-free water ( ! (! !" !!) ) can be obtained from samples in the summer season. The values of m, n, ! , , F c , ! , C i and used in 20 this synthetic study are presented in Table 1. Except for m, n and ! , the other parameters were taken from Minsley et al. (2015). Of these parameters, and m are two most important parameters that control the relationship between geophysical and hydrological-thermal variables, we estimated them by inverting soil moisture, temperature and geophysical data in scenarios 8 and 9 (see Table 2). 25 The effect of soil temperature (T) on the soil electrical conductivity is formulated as: The linkage between soil electrical conductivity with the apparent resistivity is established by the electrical forward model. In this study, we used the forward model of the Boundless Electrical

Stochastic and deterministic parameter estimation 5
In this section, we present the combination of deterministic and stochastic optimization algorithms to estimate the model parameters and their uncertainties. The stochastic optimization algorithm relies on the Bayesian inference and DRAM MCMC technique. The deterministic optimization technique was used to approximate the initial set of model parameters and initial covariance matrix of the proposal distribution for DRAM. Using this combined deterministic-stochastic 10 algorithm, the posterior distributions of estimated parameters are expected to more rapidly obtain than using stochastic algorithm because the initial set of estimated parameters and proposal covariance matrix for MCMC simulations are inherited from deterministic optimization rather than from arbitrary values. In addition, the DRAM optimization algorithm allows us to sequentially update the proposal covariance matrix and perform multiple tries to improve the 15 acceptance rate. This algorithm has been proved to be more efficient than the commonly-used MCMC Metropolis-Hasting method.

Bayesian inference
In the stochastic parameter estimation, the objective is to find the posterior probability distribution 20 | of parameters conditioned on the measurements Y from which we can extract the bestestimated parameters and their uncertainties. Based on Bayesian rule, this posterior distribution is formulated as follows: in which is priori parameter distribution of parameter and | is the likelihood 25 function. Assuming that the error residuals are uncorrelated, the likelihood function can be written as: where ! ! ! | denotes the probability density function (pdf) of measurement ! at time t i given the model parameters . If we further assume the error residuals (difference between modeling 30 and measurement) to be normally distributed, then ! ! ! | can be written as: and Equation 9 becomes: (11) in which ! ( ) is the model response at time t i ; ! ! is the variance of measurement error at time t i . 5

Delayed rejection adaptive Metropolis (DRAM) Markov Chain Monte Carlo method
Once posterior density distribution | of the model parameters is defined, we need to determine its statistical properties (e.g., mean, covariance). However, due to the nonlinearity of the dynamic model, it is usually difficult to analytically obtain these properties. In that respect, the 10 Monte Carlo methods can be used to generate samples from this posterior distribution and then calculate these properties. We employed the DRAM method that was improved from the Metropolis-Hasting MCMC method for this purpose. Basically, this method is a combination of the adaptive Metropolis and delayed rejection algorithm and briefly presented as follows:

15
Metropolis-Hasting: Given the current parameter set ! at iteration k th , the candidate for the next move ( !!! ! ) from the current value is generated from a proposal distribution ! ( ! , !!! ! ). The acceptance ratio is calculated as below: where ( ) is the target distribution needed to approximate ( | ). The next sample moves to 20 if > with as a random variable generated from uniform distribution U(0,1). Otherwise, the candidate is rejected and the next sample stays at the current location, !!! = ! .
Delayed rejection: In delayed rejection, once the candidate is rejected, instead of staying at the 25 current sample, a second ( !!! !! ) try is proposed. The acceptance ratio for this try is: If the second try is rejected, the third try can be generated and so on. The number of tries is specified by users.
Adaptation: One of the key limitations of the MCMC technique is the selection of the proposal 5 distribution model. In adaptive Metropolis, the proposal distribution is assumed to be Gaussian centered at the current sample ( ! , ! ) with the covariance matrix ! adapted from the previous samples as: In Equation 14, ! is the scaling parameter that depends on the length (d) of the estimated 10 parameter vector , which is set to ! = 2.4 ! ; ε > 0 is a very small constant to inhibit C k from becoming singular; and I d signifies the d-dimensional identity matrix.
The assessment of convergence of the MCMC chain is analyzed by Geweke's criterion, which compares the means and variances of the beginning and end segments of the chain as below: 15 where a denotes the beginning interval, which was selected as the first 10% of the chain, and where b denotes the end interval, which was selected as the last 50% of the chain. !,! , !,! are, respectively, the mean of parameters i th of segments a and b; n a , n b are the number of samples in a 20 and b segments; and !,! and !,! are their corresponding consistent spectral density estimates at zero frequency. The chain is considered to be converged if the G i score is within the 95% interval of the standard Gaussian distribution (−1.96 ≤ ! ≤ 1.96).

Deterministic optimization for approximating starting parameters and proposal 25 distribution
The speed of convergence of the MCMC optimization algorithm strongly depends on the initial model parameter ! and initial proposal distribution ! . In order to reduce the number of iterations needed to obtain the posterior distribution of model parameters, we used the local optimization Nelder-Mead Simplex Method to approximate the starting model parameter and initial covariance matrix of the proposal distribution. The starting point of the DRAM is the best-estimated set of model parameters obtained by the Nelder-Mead method. The covariance matrix of the proposal distribution is assumed to be similar to that of the model parameters, which are locally calculated at the optimal solution obtained by the Nelder-Mead method as below: in which ! ! denotes the initial covariance matrix of the proposal distribution ! and J is the Jacobian matrix, which is defined as below: The partial derivatives are calculated at the optimal solution of 10 the Nelder-Mead Simplex Method. Because these derivatives cannot be solved using analytical methods, we approximated them using: in which ∆ ! was set at 5% of the parameter ! . To test the value of the developed joint inversion approach under a range of conditions and assumptions, we performed several synthetic case studies using the numerical soil column illustrated in Figure 3. The synthetic column was developed to mimic typical soil and petrophysical properties associated with a high-centered polygon at the Next Generation Environmental Ecosystem, Artic (NGEE-Artic) tundra site near Barrow, Alaska (e.g., Hubbard et 5 al., 2013). Figure 4 shows the intensive study transect. The transect is 35 m in length and covers three typical topography types in Barrow, namely, high-centered (HCP), flat-centered (FCP) and low-centered polygon (LCP). The thawing occurs during the growing season lasts from the beginning of June to the end of September. In the growing season while the LCP is fully saturated with standing water above soil surface, the HCP is relatively dry and unsaturated. ERT 10 measurements were performed along the transect daily using Wenner-Schlumberger configuration with an electrode spacing of 0.5 m. Other measurements and conditions useful for our synthetic studies, including soil temperature, soil moisture, thaw depth, snow dynamics, and climate conditions were also measured (Dafflon et al., 2017). These data will be used for real application of the joint inversion scheme that is carrying on. 15 Soil properties and petrophysical information used for the synthetic studies are provided in Table   1. The "true" soil properties are based on the core sample analysis at the Barrow, AK site (Dafflon, personal communication) and the "true" petrophysical parameters were obtained from Minsley et al. (2015). It is worth noting that soil is represented in the CLM as a mixture of OC, 20 sand and clay. As such, in order to estimate the soil mixture, it was sufficient for us to consider OC and sand content (in sand-clay mineral mixture) only.
We assumed that the vertical profiles of soil properties (porosity, OC and sand content) were constructed by interpolating their corresponding values at 4 depths z k =0.15, 0.3, 0.6 and 1 m as 25 below: where f z are the soil properties at depth z and f k are the soil properties at the corresponding depth We synthetically explored 9 scenarios using the newly developed inversion procedure.
Descriptions of these scenarios are presented in Table 2. The objectives of these scenarios are as follows: 1. Scenarios 1 and 2: Evaluate the effect of measurement errors on uncertainties of soil OC 5 estimates (using electrical resistivity data as an example).  For all scenarios, we used daily time step meteorological forcing data (including air temperature, 20 wind speed, short-wave and long-wave radiation and precipitation) collected at the Barrow site over a year period from 01/01/2013 to 31/12/2013, which includes a time period over which some of the soil and electrical datasets were also collected at the NGEE-Arctic site. The plant functional type information was obtained from the CLM database for the Artic region. The general approach that we followed to perform all synthetic scenarios is presented in Figure 5. 25 In order to account for the measurement errors, we assumed that the error distribution was Gaussian, and added error to synthetic data to obtain "noisy" synthetic data (hereafter referred to as observation data). Information on observation data is presented in Table 2. We set the standard deviation of ERT measurement error to 2% of synthetic data for scenario 1 (low measurement 30 error) and to 5% for the other scenarios. We used a standard deviation of measurement errors of 0.5°C for soil temperature and 0.08 for soil liquid content. The standard deviation of liquid content

Simulation results
In order to estimate the posterior pdf of OC and sand content as well as petrophysical parameters, we generate 8000 samples for scenarios from 1 to 7 and 15000 samples for scenarios 8 and 9. The 20 number of samples in scenarios 8 and 9 are larger because there is more number of estimated parameters in these scenarios. We selected the last 5000 samples having a Geweke's score less than 0.4 to construct the pdfs of these parameters. Their best estimates and associated uncertainties (σ) are, respectively, represented by the means and standard deviations of the samples and summarized in Table 2. Discussion and comparison of the scenarios are presented below. 25

Effect of measurement error on parameter uncertainty
The influence measurement error on the parameter uncertainties was considered by comparing scenario 1 and 2 using of apparent resistivity as an example. Scenario 1 assumed that the standard deviation of measurement error is 2% of synthetic apparent resistivity data (small measurement 30 error), while this value for scenario 2 is 5% (large measurement error). For these two scenarios, we estimated the OC content at z= 0.1, 0.3 and 0.6 m Figure 6 shows the probability functions of the OC at these depths. The figures indicate that the uncertainties of the estimated OC content at z= 0.1 and 0.3 m are considerably higher when the measurement error is larger. As shown in Table   2, when the measurement error of apparent resistivity increases from 2% to 5%, the standard deviation of the posterior OC samples increases three times from 0.2 to 0.6 at z=0.1 m and more 5 than two times from 2.6 to 5.5 at z=0.2 m. At z=0.6 m, the OC content cannot be reliably obtained by both scenarios.
In order to investigate the non-uniqueness problem and the correlation between parameters, we estimated the misfit (sum of square differences) between the synthetic and sampled electrical 10 resistivity data as a function of the OC content at z=0.1, 0.3 and 0.6 m for scenario 2 (Figure 7). This figure indicates that while the OC at z=0.1 m is well identified, the misfit negligibly changes when the OC content at z=0.6 m varies from 20 to 50%. This indicates that the apparent resistivity data is insensitive to OC content at z=0.6 m. This is reasonable, because this depth is within the permafrost (see Figure 13), where temperature insignificantly changes over time. Figure 7 also  15 shows that there is a negative correlation between the OC content at z=0.3 and 0.6 m, which increases the uncertainties of OC estimates at both depths.

Influence of joint inversion of multiple data on parameter uncertainty
The effectiveness of the joint inversion of multiple datasets on the OC content estimation (at 20 z=0.1, 0.3 and 0.6 m) was investigated by comparing results obtained from 6 scenarios that used 1) single apparent resistivity (scenario 2); 2) single temperature (scenario 3); 3) single liquid content (scenario 4); 4) temperature and apparent resistivity data (scenario 5); 5) liquid water content and apparent resistivity (scenario 6); and 6) liquid water content, temperature and apparent resistivity data (scenario 7). Figure 8 presents the probability distribution of the OC content of these 25 scenarios. The figures indicate that joint inversion of apparent resistivity with either temperature and/or liquid content data significantly reduces the uncertainties of OC content at z=0.1 and 0.3 m.
For example, compared to use single temperature dataset, the uncertainty of OC content reduces from 0.4 to 0.2 at z=0.1 m, and from 4.6 to 1.9 at z=0.3 m when joint using temperature, liquid content and apparent resistivity datasets. Finally, we found that even when all "observation" data 30 are used, there is no improvement in the OC content estimate at z=0.6 m. These synthetic experiments suggest that given this depth is located within the permafrost (see Figure 13 apparent resistivity, liquid content and temperature data are insensitive to OC content.

Effect of mineral content and petrophysical parameters
In scenario 8, in addition to the OC content, we assumed that the sand content and petrophysical parameters m and α are unknown and estimated these parameters using the apparent resistivity, 5 temperature and liquid content data. Their posterior probability distributions are presented in sand-clay mineral (92%) dominates the OC content (8%), and therefore, the hydrological-thermal data are relatively sensitive to this parameter. By contrast, the OC and sand content at z=0.6 m are unidentifiable with uncertainties up to 6.5 and 8.2, respectively. Finally, both of the petrophysical parameters m and α are well specified but the parameter α is better obtained. This implies that α is more sensitive to the apparent resistivity than to m. 15 The pairwise relationships between estimated parameters are shown in Figure 10

Effect of porosity dependence on OC and mineral content
In this section, we evaluate how the parameter uncertainties change when the porosity is determined as a function of the OC and mineral content. We perform this evaluation by comparing scenarios 8 and 9. Table 2 shows that all information in scenario 9 is similar to that of scenario 8 30 except for the soil porosity. While the soil porosity in scenario 8 was fixed and independent from The pdfs of all estimated parameters for scenario 9 are shown in Figure 11. The figure indicates that compared to scenario 8, all uncertainties of sand and OC content in scenario 9 are smaller. 5 Especially, the uncertainties of these parameters at z=0.6 m are significantly decrease from 6.5 to 3.8 (for OC content) and from 8.2 to 1.8 (for sand content). This can be explained by the fact that, in addition to thermal parameters, the OC and sand content in scenario 9 controls the soil porosity, which also influences the subsurface hydrological-thermal dynamics (see Figure 2). As a result, the temperature, liquid water and apparent resistivity data in this scenario are more sensitive to 10 variations of OC and sand content than those in scenario 8. Consequently, these parameters are more identifiable. By contrast, the uncertainty of petrophysical parameters and m considerably increases from 0.002 to 0.022 (for ) and from 0.042 to 0.066 (for m). This is because while it was fixed in scenario 8, the soil porosity depends on the OC and sand content in scenario 9. Therefore, the soil porosity in scenario 9 is also uncertain due to the uncertainties of the OC and sand content. 15 Because the soil porosity, and m are closely correlated (see Equation 4), the uncertainty of soil porosity causes higher uncertainties of and m.

Uncertainty propagation from parameters to the hydrological-thermal and thaw layer thickness prediction 20
In this section, we evaluate the impact of parameter uncertainties on the prediction of hydrological-thermal dynamics. Posterior samples of the OC, sand content and petrophysical parameters m and α of scenario 8 were used for this analysis. Figure 12   The comparison between synthetic and predicted apparent resistivity data is presented in Figure   14. The figure shows that there is a very good agreement between them with no bias, which implies that our inversion scheme converges to the best solution region. The confidence ranges 15 corresponding to a level of 95% vary from 1.4 to 9.4% of the "observation" resistivity, which is suitable with the relative measurement error of 5%.

Summary and Conclusions
In this study, we developed and tested a surface-subsurface coupled hydrogeophysical inversion 20 approach to estimate OC and its influence on hydrological-thermal behavior under Arctic freezethaw conditions. In our inversion scheme, the CLM model serves as a forward model to simulate the land-surface energy balance and surface-subsurface hydrological-thermal processes. The new scheme can jointly use different types of data for the inversion, including electrical resistivity data.
The dependence of soil electrical resistivity on temperature and ice/liquid content are explicitly 25 accounted for within the inversion. and the initial covariance matrix of the proposal distribution for the stochastic optimization, which helps to more quickly converge to the targeted posterior distribution.
We tested the inversion scheme using multiple synthetic experiments in 1-D soil column 5 representative of the Artic tundra, where surface-subsurface hydrological and thermal regimes cointeract and are influenced by soil organic carbon and mineral content. The obtained results show that the new inversion approach well reproduced the synthetic data in all experiments. The shallow (upper 0.3 m) active layer OC and sand content and the petrophysical parameters can be reliably obtained using ERT data and the inversion approach. When the soil porosity is fixed, the 10 uncertainties of OC and sand content are very high in the permafrost section (0.6 m), even when soil temperature, liquid saturation and apparent resistivity data were jointly used in the estimation procedure. This suggests that when the porosity is fixed, the inversion approach is unable to significantly improve the estimation of OC within the permafrost, due to the small magnitude of temporal variation of both temperature and soil moisture in that section. However, if the soil 15 porosity is considered as a function of OC and sand content, the permafrost parameters can be reliably obtained because the variation of porosity with OC and sand content increases the sensitivity to ice/liquid water and temperature. Examining the relationship between measurement errors and parameter uncertainties, we found that the uncertainties of estimated parameters increases with increasing measurement error. We also explored the improvement in parameter 20 estimation when jointly using multiple data for the inversion. Compared to single dataset inversion (either temperature or soil moisture or electrical resistivity), joint inversion significantly reduces the uncertainties of estimated parameters, especially at 0.3 m depth. Finally, we quantified the influence of parameter uncertainties on the prediction of hydrological-thermal and thaw depth dynamics. The obtained results show that the soil water liquid content prediction is more uncertain 25 than the soil temperature and apparent resistivity predictions, due to its large measurement error and its high sensitivity to OC and sand content. The uncertainties in OC and sand content have an impact on the thaw depth estimation only during the warmest months of the year (August and September).

30
This study developed and tested a novel approach to estimate soil OC content using inverse modeling that can incorporate diverse hydrological, thermal and ERT datasets. In addition, the study also permitted exploration of surface-subsurface hydrological-thermal dynamics and spatiotemporal variations associated with free-thaw transitions. Given the importance of characterizing OC as part of ecosystem and climate studies, the typical challenges associated with collecting and analyzing 'sufficient' core data to characterize the vertical and horizontal variability of OC associated with a field study site, and the increasing use of electrical resistivity data to characterize vertical, horizontal and temporal variability in shallow systems, the new inversion approach offers significant potential for improved characterization of OC over field-relevant 5 conditions and scales. It also offers significant potential for improving our understanding of hydrological-thermal behavior of naturally heterogeneous permafrost systems. The successful validation of this approach using 1D synthetic studies provides a foundation for extending it to 2D and applying it to real field data, which is research that is currently underway.

10
In this study, we concentrated on the impact of OC on the soil electrical resistivity via its hydrological-thermal properties. Recent studies indicated that the soil OC content largely influences ionic mobility, and therefore, changes the polarization and relaxation time of soil response to the applied current, which can be measured by spectral induced polarization (SIP) (e.g., Schwartz, N., & Furman, 2015). As a result, our future study will explore the possibility to 15 integrate SIP measurements into our coupled hydrological-thermal-geophysical inversion scheme.
In that case, the OC content is linked to SIP measurements both by its hydrological-thermal and electrical polarization properties.
This study assumed that the salinity depends only on the water liquid/ice concentration. We are 20 planning to integrate a solute transport model into hydrological-thermal model to account for the spatiotemporal dynamics of ions in water.
With advancements in data acquisition, the surface-subsurface hydrological-thermal dynamics now can be monitored in real-time with a high temporal resolution using multiple above-and 25 below-ground measurements including geophysical techniques. Our next stage is to expand the inversion scheme so that it can assimilates these data into hydrological-thermal models to improve the model prediction in real-time.  Table 2