Important

The 2014-10 release is compatible with GIFtools v2.30 and earlier. This version of the DCIPoctree package will not be maintained. For documentation of the latest version, see latest DCIP octree manual .

2. Background theory¶

2.1. Introduction¶

This manual presents theoretical background, numerical examples, and explanation for implementing the program library DCIPoctree. This suite of algorithms, developed at UBC-GIF, is needed to efficiently invert large sets of DC potential data and IP responses over a 3D earth structure. The manual is designed so that a geophysicist who has understanding of DC resistivity and induced polarization field experiments, but who is not necessarily versed in the details of inverse theory, can use the codes to invert his or her data.

../_images/potentials.png — Fig. 2.1 Definition of three potentials associated with DC/IP experiments.¶

A typical DC/IP experiment involves inputting a current \(I\) to the ground and measuring the potential away from the source. In a time-domain system, the current alternates in direction and has off-times between the current pulses at which the IP voltages are measured. A typical time-domain signature is shown in Fig. 2.1. In that figure, \(\phi_\sigma\) is the potential that is measured in the absence of chargeability effects. This is the instantaneous value of the potential measured when the current is turned on. In mathematical terms, this potential is related to the electrical conductivity \(\sigma\) by:

\[\phi_\sigma = \mathcal{F}_{dc}[\sigma]\]

where forward mapping operator \(\mathcal{F}_{dc}\) is defined by the equation:

(2.1)¶\[\nabla\cdot(\sigma\nabla \phi_\sigma)=-I\delta(\mathbf{r}-\mathbf{r}_s)\]

and appropriate boundary conditions. In Equation (2.1), \(\sigma\) is the electrical conductivity in Siemens/meter (S/m), \(\nabla\) is the gradient operator, math:I is the strength of the input current in Amperes (A), and \(\mathbf{r}_s\) is the location of the current source. For typical earth structures, \(\sigma\), while positive, can vary over many orders of magnitude. The potential in Equation (2.1) is the potential due to a single current. This is the value that would be measured in a pole-pole experiment. If potentials from pole-dipole or dipole-dipole surveys are to be generated then they can be obtained by using Equation (2.1) and the principle of superposition.

When the earth material is chargeable, the measured voltage will change with time and reach a limit value which is denoted by :math:phi_eta: in Fig. 2.1. There are a multitude of microscopic polarization phenomena which when combined produce this response but all of these effects can be consolidated into a single macroscopic parameter called chargeability. We denote chargeability by the symbol \(\eta\). Chargeability is dimensionless, positive, and confined to the region [0,1).

To carry out forward modelling to compute \(\phi_{\eta}\), we adopt the formulation of [Sie59] which states that the effect of a chargeable ground is modelled by using the DC resistivity forward mapping \(\mathcal{F}_{dc}\) but with the conductivity replaced by \(\sigma=\sigma(1-\eta)\). Thus:

(2.2)¶\[\phi_\eta=\mathcal{F}_{dc}[\sigma(1-\eta)]\]

or

(2.3)¶\[\nabla\cdot(\sigma(1-\eta)\nabla\phi_\eta)=-I\delta(\mathbf{r}-\mathbf{r}_s)\]

The IP datum can be either the secondary potential (\(\phi_s\)) or the apparent chargeability (\(\eta_a\)). The former is the difference of the forward modelled potentials with, and without, the IP effect:

(2.4)¶\[\phi_s=\phi_\eta-\phi_\sigma=\mathcal{F}_{dc}[\sigma(1-\eta)]-\mathcal{F}_{dc}[\sigma]\]

The apparent chargeability is then given by the ratio:

(2.5)¶\[\eta_a=\frac{\phi_s}{\phi_\eta}=\frac{\mathcal{F}_{dc}[\sigma(1-\eta)]-\mathcal{F}_{dc}[\sigma]}{\mathcal{F}_{dc}[\sigma(1-\eta)]}\]

In this definition, the apparent chargeability is dimensionless and, in the case of data acquired over an earth having constant chargeability \(\eta_0\), we have \(\eta_a=\eta_0\). Equations (2.4) and (2.5) show that the IP data can be computed by carrying out two DC resistivity forward modellings with conductivities \(\sigma\) and \(\sigma(1-\eta)\). The secondary potential is the more general form of IP data and the apparent chargeability is only defined when the linear (or polar) arrays are used along a line on the surface or in the same borehole. When the current and potential dipole-electrodes are arranged in 3D space and so they are not aligned, the total potential can take on positive, zero, or negative values. The cross-line experiments on the surface and cross-hole experiment on boreholes are examples of such situations. Because of the zero-crossing in the total potentials, the commonly used apparent chargeability is undefined. In these cases, the appropriate data to measure the IP effect is the secondary potential. Therefore, we will use secondary potential as the basic IP datum except in the case of linear arrays.

The field data from a DC/IP survey are a set of N potentials (ideally \(\phi_\sigma\), but usually \(\phi_\eta\)) and a set of N secondary potentials \(\phi_s\) or a quantity that is related to \(\phi_s\). The goal of the inversionist is to use these data to acquire quantitative information about the distribution of the two physical parameters of interest: conductivity \(\sigma(x,y,z)\) and chargeability \(\eta(x,y,z)\).

The distribution of conductivity and chargeability in the earth can be extremely complicated. Both quantities vary as functions of position in 3D space. In addition, there is often large topographic relief. In this program library, the 3D nature of the physical properties and surface topography are fully incorporated. The Earth model is divided into prismatic cells each having a constant value of conductivity and chargeability. The surface topography is approximated by a piecewise constant surface.

2.2. Forward modelling¶

The forward modelling for the DC potentials and IP apparent chargeabilities is accomplished using a finite volume method [DM79] and a pre-conditioned conjugate gradient technique to solve Equation (2.1). The program that performs these calculations is DCIPoctreeFwd. The DC modelling is performed by a single solution of Equation (2.1), and the IP modelling is performed by carrying out two DC forward modellings. The IP data are generated according to the operations indicated in Equations (2.4) and (2.5). To illustrate the DC resistivity and IP forward modelling algorithm, we generate synthetic data that would be acquired over the 3D conductivity structure shown in Fig. 2.2.

../_images/5prisms1.png — Fig. 2.2 The synthetic model consists of five rectangular blocs in a uniform halfspace. The blocks S1, S2, and B2 are more conductive than the uniform halfspace; and blocks S3 and B1 are more resistive. All five blocks are chargeable. There are seven lines in east-west direction which are spaced 100 m apart. There are also four boreholes that extend to a depth of 400 m.¶

The model consists of five rectangular blocks buried in a uniform halfspace. Three smaller blocks are placed on the surface while two larger blocks are at depth to simulate the target of the survey. The blocks S1, S2, and B2 are more conductive than the uniform halfspace; and blocks S3 and B1 are more resistive. All five blocks are chargeable. Data from ten east-west lines surface lines with a line spacing of 100 m and four vertical boreholes are forward modelled. The surface experiment is carried out using pole-dipole data with a = 50 m and n = 1, 6, while the borehole experiments use a cross-hole pole-dipole configuration with a 50 m potential dipole.

Fig. 2.3 displays the DC resistivity data from three selected lines for the surface experiment. The data are displayed in pseudo-section format. Note the strong responses to the conductivity anomalies on the surface. They appear as pant-legs extending from small n-spacing all the way to the largest n-spacing. The buried blocks are hardly identifiable since their responses have low amplitudes and broad distributions and are masked by the surface anomalies. The apparent chargeability pseudo-sections from the same lines are shown in Fig. 2.4 (note that the apparent chargeability is well-defined in this case). The masking effects of the surface blocks are also evident in the IP data. Thus inversion is required.

../_images/FWD_cond.png — Fig. 2.3 Examples of the apparent conductivity pseudo-sections along three east-west traverses. The data are simulated for a pole-dipole array with a = 50 m and n = 1 to 6. The forward modelled data have been contaminated by independent Gaussian noise with a standard deviation equal to 2% of the accurate datum magnitude and mean of zero. The pseudo-sections are dominated by the surface responses, but there are some indications of the buried prisms. The colormap shows the apparent conductivity in mS/m.¶

../_images/FWD_chg.png — Fig. 2.4 Examples of the apparent chargeability pseudo-sections along three east-west traverses. The data have been contaminated by independent Gaussian noise with a standard deviation equal to 2% of the accurate datum magnitude plus a minimum of 0.001. The same masking effect of near-surface prisms observed in apparent conductivity pseudo-sections is also present here. The colormap shows the apparent chargeability multiplied by 100.¶

Since we intend to invert these data, we have added independent Gaussian noise. The standard deviation of the noise is equal to 2% of the datum magnitude plus a small threshold to deal with near zero data. The effect of the added noise can be seen in Fig. 2.3 and Fig. 2.4.

2.3. General inversion methodology¶

The inverse problem is formulated as an optimization problem where an objective function of the model is minimized subject to the constraints in Equation (2.1) for DC resistivity data or Equation (2.3) for IP data. To outline our methodology, it is convenient to introduce a single notation for the data and for the model. We let \(\mathbf{d} = (d_1,d_2,...,d_N)^T\) denote the data, where \(N\) is the number of data. Using this notation, \(d_i\) is either the \(i^{th}\) potential in a DC resistivity data set, or the \(i^{th}\) secondary potential/apparent chargeability in an IP survey. Let the physical property of interest be denoted by the generic symbol \(m\) for the model element. The quantity \(m_i\) denotes the conductivity or chargeability of the \(i^{th}\) model cell. For the inversion, we choose \(m_i=\ln(\sigma_i)\) when inverting for conductivities, and \(m_i=\eta_i\) when reconstructing the chargeability distribution.

Having defined a model, we next construct an objective function which, when minimized, produces a model that is geophysically interpretable and reproduces the data \(\mathbf{d}\) to a justifiable level based on their associated uncertainties. The details of the objective function are problem dependent but generally we need the flexibility to be close to a reference model \(m_o\) and also require that the recovered model be relatively smooth in all three spatial directions. Here we adopt a right-handed Cartesian coordinate system with \(y\) positive north and and \(z\) positive up. In defining the model objective function, the reference model will generally be included in the first component of the objective function but it can be removed, if desired, from the remaining derivative terms since we are often more confident in specifying the value of the model at a particular point than in supplying an estimate of the gradient. This leads to the following two distinct formulations of the model objective function.

(2.6)¶\[\begin{split}\Phi_m = &&\alpha_s\int\int\ w_s(m-m_0)^2dv + \alpha_x\int\int w_x\left(\frac{\partial{(m-m_0)}}{\partial x}\right)^2dv+ \nonumber \\ &&\alpha_y\int\int w_y\left(\frac{\partial{(m-m_0)}}{\partial y}\right)^2 dv + \alpha_z\int\int\ w_z\left(\frac{\partial{(m-m_0)}}{\partial z}\right)^2dv,\end{split}\]

(2.7)¶\[\begin{split}\Phi_m = &&\alpha_s\int\int\ w_s(m-m_0)^2dv + \alpha_x\int\int w_x\left(\frac{\partial{m}}{\partial x}\right)^2dv+ \nonumber \\ &&\alpha_y\int\int w_y\left(\frac{\partial{m}}{\partial y}\right)^2 dv + \alpha_z\int\int\ w_z\left(\frac{\partial{m}}{\partial z}\right)^2dv,\end{split}\]

where the weighting functions \(w_s\), \(w_x\), \(w_y\) and \(w_z\) are spatially dependent, and \(\alpha_s\), \(\alpha_x\), \(\alpha_y\) and \(\alpha_z\) are coefficients which affect the relative importance of different components in the model objective function. The reference model \(m_o\) may be a general background model that is estimated from previous investigations or it could be a zero model.

The model objective function in Equation (2.6) is used when the SMOOTH_MOD_DIF option is selected in the inversion input control file while Equation (2.7) is used when the SMOOTH_MOD option is selected in the inversion input control file. The choice of whether or not to include \(m_o\) in the derivative terms can have significant effect on the recovered model.

The relative closeness of the final model to the reference model at any location is controlled by the function \(w_s\). For example, if the interpreter has high confidence in the reference model at a particular region, he can specify \(w_s\) to have increased amplitude there compared to other regions of the model. The interface weighting functions \(w_x\), \(w_y\), and \(w_z\) can be designed to enhance or attenuate structures in various regions in the model domain. If geology suggests a rapid transition zone in the model, then a decreased weighting for flatness can be put there and the constructed model will exhibit higher gradients provided that this feature does not contradict the data.

To perform a numerical solution, we discretize the model objective functions in Equations (2.6) and (2.7) using a finite difference approximation on the mesh defining the conductivity/chargeability model. This yields:

(2.8)¶\[\begin{split}\Phi_m(\mathbf{m})&=&(\mathbf{m}-\mathbf{m}_o)^T(\alpha_s \mathbf{W}_s^T\mathbf{W}_s+\alpha_x \mathbf{W}_x^T\mathbf{W}_x+\alpha_y \mathbf{W}_y^T\mathbf{W}_y+\alpha_z \mathbf{W}_z^T\mathbf{W}_z)(\mathbf{m}-\mathbf{m}_o), \nonumber\\ &\equiv&(\mathbf{m}-\mathbf{m}_o)^T(\mathbf{W}_m^T\mathbf{W}_m)(\mathbf{m}-\mathbf{m}_o), \nonumber\\ &= &\left \| \mathbf{W}_m(\mathbf{m}-\mathbf{m}_o) \right \|^2,\end{split}\]

for Equation (2.6) and the following for Equation (2.7).

(2.9)¶\[\begin{split}\Phi_m(\mathbf{m}) & = &(\mathbf{m}-\mathbf{m}_o)^T(\alpha_s \mathbf{W}_s^T\mathbf{W}_s)(\mathbf{m}-\mathbf{m}_o)+\mathbf{m}^T(\alpha_x \mathbf{W}_x^T\mathbf{W}_x+\alpha_y \mathbf{W}_y^T\mathbf{W}_y+\alpha_z \mathbf{W}_z^T\mathbf{W}_z)\mathbf{m}, \nonumber\\ &\equiv&(\mathbf{m}-\mathbf{m}_o)^T(\mathbf{W}_s^T\mathbf{W}_s)(\mathbf{m}-\mathbf{m}_o)+\mathbf{m}^T\mathbf{W}^T\mathbf{W}\mathbf{m},\end{split}\]

where \(\mathbf{m}\) and \(\mathbf{m}_o\) are \(M\)-length discretized model vectors which characterize the conductivity/chargeability distributions within the current model and reference model, respectively. The individual matrices \(\mathbf{W}_s\) , \(\mathbf{W}_x\), \(\mathbf{W}_y\), and \(\mathbf{W}_z\) are straight-forwardly calculated once the model mesh and the weighting functions \(w_s\) , \(w_x\), \(w_y\), \(w_z\) are defined. The cumulative matrix \(\mathbf{W}_m^T\mathbf{W}_m\) is then formed.

Having chosen an appropriate model objective function the next step in setting up the inversion is to define a data misfit measure. Here we use the \(l_2\)-norm measure:

(2.10)¶\[\Phi_d = \left\| \textbf{W}_d(\textbf{d}-\textbf{d}^{obs})\right\|^2_2\]

and assume that the contaminating noise in the data is independent and Gaussian with zero mean. Specifying \(\mathbf{W}_d\) to be a diagonal datum weighting matrix whose \(i^{th}\) element is \(1/\epsilon_i\), where \(\epsilon_i\) is the standard deviation of the \(i^{th}\) datum, makes \(\Phi_d\) a chi-squared variable distributed with \(N\) degrees of freedom. Accordingly \(E[\chi^2]=N\) provides a target misfit for the inversion.

The inverse problem is solved by finding a model m which minimizes \(\phi_m\) and misfits the data by a pre-determined amount. Thus the solution is obtained by the following minimization problem of a global objective function \(\phi\),

(2.11)¶\[\begin{split}\min \Phi = \Phi_d+\beta\Phi_m \\ \mbox{s. t. } \Phi_{d}=\Phi_{d}^* \text{and optionally} ~ m^l\leq m\leq m^u, \nonumber\end{split}\]

where \(\beta\) is a trade-off parameter that controls the relative importance of the model norm and data misfit. When the standard deviations of data errors are known, the acceptable misfit is given by the expected value \(\phi_{d}^*\). In general, each parameter in the recovered model (\(\mathbf{m}\)) lies within its respective lower (\(\mathbf{m}^l\)) and upper (\(\mathbf{m}^u\)) bound. Chargeability is positive by definition so bounds are used in all IP inversions to implement the positivity constraint.

The choice of the regularization parameter \(\beta\) in the DC resistivity or IP inversion ultimately depends upon the magnitude of the error associated with the data. The inversion of noisier data requires heavier regularization, thus a larger value of \(\beta\) is required. Since the inversion of DC resistivity data is nonlinear, it is also important need to avoid the possibility of getting trapped in a local minima. The following strategy is implemented to determine an adequate \(\beta\) in the program library DCIPoctree.

For known uncertainty distributions, the expected value of \(\phi_d\) is easily calculated. For example, independent data with Gaussian noise of zero mean has an expected target misfit (\(\phi_{d}^*\)) of \(N\) number of data. The value of \(\beta\) should be such that the expected misfit is achieved.

A line search based on the misfit curve as a function of beta is performed to approximate the optimal value of \(\beta\). Due to the high computational expense associated with the inversion, we generally cannot afford to perform the line search by carrying out complete solutions for a series of \(\beta\)’s. Starting with a sufficiently large value of \(\beta\) ensures that the line search will successfully find an appropriate value while avoiding the computational expense of a full line search.

By reducing \(\beta\) by a fixed factor and performing one or two Gauss-Newton updates (which brings the recovered model close to its final solution for that \(\beta\)) for each value in the decreasing sequence it is possible to determine a general range for the optimal \(\beta\) value. Once this range is established the inversion is run to convergence for a few \(\beta\) values using the recovered model from a nearby \(\beta\) value inversion as the initial model for the next inversion. This greatly reduces the computational expense, by limiting the number of iterations required for convergence. The way optimal \(\beta\) value determined using the same basic strategy in both the DC and IP inversion codes. The only difference is that which the DC inversion we need to factor the forward modeling matrix every time that the conductivity model is updated, while in the IP case, only one (initial) factorization is required. The pseudo-code for computing the optimal \(\beta\) is shown in Fig. 2.5.

../_images/chart.png — Fig. 2.5 Pseudo-code describing the DC/IP inversion algorithm.¶

This inversion methodology provides a basic framework for solving a 3D geophysical inversion with arbitrary observation locations. The basic components are: the forward modelling operator, a model objective function that incorporates information about the reference model, a data misfit function, a trade-off parameter that ultimately determines how well the data will be reproduced, and an optimization algorithm that minimizes an objective function, subject to optional bound constraints. The specifics of the DC and IP data inversion are discussed in the following sections.

2.4. Inversion of DC resistivity data¶

The program library DCIPoctree provides a DC resistivity inversion program, DCoctreeInv. The inversion of DC resistivity data, formulated as the minimization of the global objective function (see Equation (2.11)), is nonlinear since the data do not depend linearly upon the conductivity model. A Gauss-Newton approach is used in which the objective function is linearized about a current model, \(\mathbf{m}^{(n)}\), a model perturbation is computed, and then used to update the current model. Substituting \(\mathbf{m}^{(n+1)}=\mathbf{m}^{(n)}+\delta\mathbf{m}\) into the global objective function (Equation (2.11)) gives:

(2.12)¶\[\phi(\mathbf{m}+\delta\mathbf{m})=\left \| \mathbf{W}_d(\mathbf{d}^{(n)}+\mathbf{J}\delta\mathbf{m}-\mathbf{d}) \right \|^2+\beta\left \| \mathbf{W}(\mathbf{m}+\delta\mathbf{m}- \mathbf{m}_0) \right \|^2+H.O.T\]

where \(\mathbf{J}\) is the sensitivity matrix and the element \(J_{ij}\) quantifies the influence of the model change in j-th cell on the i-th datum,

(2.13)¶\[J_{ij}=\frac{\partial d_i}{\partial m_j}=\frac{\partial \phi_i}{\partial ln(\sigma_i)}\]

Neglecting the higher order terms (H.O.T.) and setting to zero the derivative with respect to \(\delta\mathbf{m}\) yields the following system to solve for the model objective function (Equation (2.6)) used when the SMOOTH_MOD_DIF parameter is specified in the inversion input control file:

(2.14)¶\[(\mathbf{J}^T\mathbf{J}+\beta \mathbf{W}_m^{T}\mathbf{W}_m)\delta\mathbf{m} = -\mathbf{J}^T(\mathbf{d}^{(n)}-\mathbf{d})-\beta \mathbf{W}_m^T\mathbf{W}_m(\mathbf{m}^{(n)}-\mathbf{m}_0)\]

where \(\mathbf{W}_m^T\mathbf{W}_m\) is defined by Equation (2.8).

Similarly, the following system arises when the model objective function (Equation (2.7)) is used (i.e. the SMOOTH_MOD parameter is specified in the inversion input control file):

(2.15)¶\[(\mathbf{J}^T\mathbf{J}+\beta(\mathbf{W}_{s}^{T}\mathbf{W}_{s}+\mathbf{W}^{T}\mathbf{W}))\delta\mathbf{m} = -\mathbf{J}^T(\mathbf{d}^{(n)}-\mathbf{d})-\beta(\mathbf{W}_{s}^T\mathbf{W}_{s}(\mathbf{m}^{(n)}-\mathbf{m}_0)+\mathbf{W}^{T}\mathbf{W}\mathbf{m})\]

In these formulations we assume that the matrix \(\mathbf{W}_d\) has been absorbed into the sensitivity matrix and data vectors. By solving either of these inverse problems you obtain the model perturbation, which then allows you to generate a new model according to the following relation:

(2.16)¶\[\mathbf{m}^{(n+1)}=\mathbf{m}^{(n)} + \alpha \delta \mathbf{m},\]

where \(\alpha\) in (0,1] limits the step size and is chosen to ensure that the total objective function is reduced.

The major computational effort in this approach includes the calculation of the sensitivity matrix, solution of the basic linearized Equation (2.14), and the choice of regularization parameter \(\beta\). The sensitivity is computed using the standard adjoint equation approach, and Equation (2.14) or (2.15) is solved using a pre-conditioned conjugate gradient (CG) technique.

2.5. Inversion of IP data¶

To invert IP data it is necessary to linearize Equation (2.4). Let \(\eta_i\) and \(\sigma_i\) denote the chargeability and electrical conductivity of the \(i^{th}\) model cell. Linearizing the potential \(\phi_\eta\) about the conductivity model \(\sigma\) yields:

(2.17)¶\[\phi_\eta=\phi(\sigma-\eta \sigma)=\phi(\sigma)-\sum_{j=1}^{M}\frac{\partial \phi}{\partial \sigma_j}\eta_j\sigma_i+H.O.T\]

Substituting into Equation (2.4) yields:

(2.18)¶\[\phi_s=\phi_\eta-\phi_\sigma=-\sum_{j=1}^{M}\frac{\partial \phi}{\partial \sigma_j}\eta_j\sigma_i+H.O.T\]

When apparent chargeability is used as the IP data, substituting the above equation into Equation (2.5), yields:

(2.19)¶\[\eta_a=-\sum_{j}\frac{\sigma_j}{\phi_i}\frac{\partial \phi_i}{\partial \sigma_j}\eta_j =-\sum_{j}\sigma_j\frac{\partial ln(\phi)}{\partial ln(\sigma_j)}\eta_j\]

Thus the \(i^{th}\) datum (either secondary potential or apparent chargeability) is exposed as:

(2.20)¶\[d_i=\sum_{j=1}^{M}J_{ij}\eta_{ij}\]

where

(2.21)¶\[\begin{split}\left\{ \begin{array}{cl} \frac{\partial \phi_i \left[ \sigma \right]}{\partial ln\sigma_j}, &\mathbf{d}=\phi_s\\ \\ \frac{\partial ln\phi_i\left [ \sigma \right ]}{\partial ln\sigma_j},& \mathbf{d}=\eta_a \end{array}\right\}\end{split}\]

is the sensitivity matrix. Our inverse problem is formulated as:

(2.22)¶\[\begin{split}\min \phi_m=\left \| \mathbf{W}_m(\eta-\eta_0) \right \|^2 \nonumber \\ \mbox{s. t. } \phi_{d}=\phi_{d}^* \\ \text{and} \\ \eta\geq 0\end{split}\]

where \(\phi_d^{*}\) is a target misfit. Again, for ease of future notation we incorporate the diagonal weighting matrix (\(\mathbf{W}_d\)) into \(\mathbf{J}\) and \(\mathbf{d}\). In practice the true conductivity \(\sigma\) is not known and so we must use the conductivity found from the inversion of the DC resistivity data to construct the sensitivity matrix elements in Equation (2.21).