# 2. Background theory¶

## 2.1. Introduction¶

This manual presents theoretical background for DCIP OcTree modeling package. This suite of algorithms, developed at UBC-GIF, is needed to efficiently invert large sets of DC potential data and IP responses over a 3D earth structure. The manual is designed so that a geophysicist who has understanding of DC resistivity and induced polarization field experiments, but who is not necessarily versed in the details of inverse theory, can use the codes to invert his or her data.

A typical DC/IP experiment involves inputting a current \(I\) to the ground and measuring the potential away from the source. In a time-domain system, the current alternates in direction and has off-times between the current pulses at which the IP voltages are measured. A typical time-domain signature is shown in Fig. 2.1. In that figure, \(\phi_\sigma\) is the potential that is measured in the absence of chargeability effects. This is the instantaneous value of the potential measured when the current is turned on. In mathematical terms, this potential is related to the electrical conductivity \(\sigma\) by:

where forward mapping operator \(\mathcal{F}_{dc}\) is defined by the equation:

and appropriate boundary conditions. In Equation (2.1), \(\sigma\) is the electrical conductivity in Siemens/meter (S/m), \(\nabla\) is the gradient operator, math:I is the strength of the input current in Amperes (A), and \(\boldsymbol{r}_s\) is the location of the current source. For typical earth structures, \(\sigma\), while positive, can vary over many orders of magnitude. The potential in Equation (2.1) is the potential due to a single current. This is the value that would be measured in a pole-pole experiment. If potentials from pole-dipole or dipole-dipole surveys are to be generated then they can be obtained by using Equation (2.1) and the principle of superposition.

When the earth material is chargeable, the measured voltage will change with time and reach a limit value which is denoted by :math:phi_eta: in Fig. 2.1. There are a multitude of microscopic polarization phenomena which when combined produce this response but all of these effects can be consolidated into a single macroscopic parameter called chargeability. We denote chargeability by the symbol \(\eta\). Chargeability is dimensionless, positive, and confined to the region [0,1).

To carry out forward modelling to compute \(\phi_{\eta}\), we adopt the formulation of [Sie59] which states that the effect of a chargeable ground is modelled by using the DC resistivity forward mapping \(\mathcal{F}_{dc}\) but with the conductivity replaced by \(\sigma=\sigma(1-\eta)\). Thus:

or

The IP datum can be either the secondary potential (\(\phi_s\)) or the apparent chargeability (\(\eta_a\)). The former is the difference of the forward modelled potentials with, and without, the IP effect:

The apparent chargeability is then given by the ratio:

In this definition, the apparent chargeability is dimensionless and, in the case of data acquired over an earth having constant chargeability \(\eta_0\), we have \(\eta_a=\eta_0\). Equations (2.4) and (2.5) show that the IP data can be computed by carrying out two DC resistivity forward modellings with conductivities \(\sigma\) and \(\sigma(1-\eta)\). The secondary potential is the more general form of IP data and the apparent chargeability is only defined when the linear (or polar) arrays are used along a line on the surface or in the same borehole. When the current and potential dipole-electrodes are arranged in 3D space and so they are not aligned, the total potential can take on positive, zero, or negative values. The cross-line experiments on the surface and cross-hole experiment on boreholes are examples of such situations. Because of the zero-crossing in the total potentials, the commonly used apparent chargeability is undefined. In these cases, the appropriate data to measure the IP effect is the secondary potential. Therefore, we will use secondary potential as the basic IP datum except in the case of linear arrays.

The field data from a DC/IP survey are a set of N potentials (ideally \(\phi_\sigma\), but usually \(\phi_\eta\)) and a set of N secondary potentials \(\phi_s\) or a quantity that is related to \(\phi_s\). The goal of the inversionist is to use these data to acquire quantitative information about the distribution of the two physical parameters of interest: conductivity \(\sigma(x,y,z)\) and chargeability \(\eta(x,y,z)\).

The distribution of conductivity and chargeability in the earth can be extremely complicated. Both quantities vary as functions of position in 3D space. In addition, there is often large topographic relief. In this program library, the 3D nature of the physical properties and surface topography are fully incorporated. The Earth model is divided into prismatic cells each having a constant value of conductivity and chargeability. The surface topography is approximated by a piecewise constant surface.

## 2.2. Forward modelling¶

### 2.2.1. Discretized System¶

The forward modelling for the DC potentials and IP apparent chargeabilities is accomplished using a finite volume method [DM79] and a pre-conditioned conjugate gradient technique.

**For the DC problem**, (2.1) is discretized and the electric potential at cell centers (\(\boldsymbol{\phi_\sigma}\)) are computed by solving the following linear system:

where

\(\boldsymbol{D}\) is a discretize divergence operator whose transpose acts as a modified gradient operator

\(\boldsymbol{M_c} = diag(v)\) is a sparse diagonal matrix containing the cell volumes

\(\boldsymbol{M_{f\sigma}} = diag \big ( \boldsymbol{A_{fc}^T (v \odot \sigma^{-1})} \big )\) where \(\boldsymbol{A_{fc}}\) projects from faces to cell centers

\(\boldsymbol{q}\) is an integrated source term that lives at cell centers.

Once the system is solved, a sparse projection matrix \(\mathbf{P}\) maps the potentials at cell centers to the electrode positions and computes the data, i.e.:

**For the IP problem**, the secondary potential due to the IP is computed according to (2.4). \(\phi_\eta\) is obtained by replacing \(\sigma\) with \(\sigma (1 - \eta)\) in (2.6). Therefore:

where \(\boldsymbol{M_{f\eta}} = diag \big ( \boldsymbol{A_{fc}^T (v \odot \Delta\sigma^{-1} )} \big )\) such that \(\boldsymbol{\Delta \sigma} = \boldsymbol{\sigma \odot (1 - \eta)}\).

Thus using (2.6) and (2.7), the secondary potential at cell centers due to IP is:

The secondary potential data is given by:

And the apparent chargeabilities are given by:

### 2.2.2. Source Term¶

The right-hand size of (2.6) represents an integrated source term. For the DCIP OcTree package, the right-hand side is formed using an analytic primary field solution. For a homogeneous background conductivity model \(\boldsymbol{\sigma_0}\), a method of images approach is used to analytically compute the electric potential \(\boldsymbol{\phi_0}\) at cell centers.

Given that we know the background model \(\boldsymbol{\sigma_0}\) and background solution \(\boldsymbol{\phi_0}\), we can use (2.6) to compute the associated right-hand side, i.e.:

Important

The method of images solution **does** consider topography. It is extremely accurate when the surface topography is flat, however it becomes less accurate when topography is extreme.

## 2.3. General inversion methodology¶

The inverse problem is formulated as an optimization problem where an objective function of the model is minimized subject to the constraints in Equation (2.1) for DC resistivity data or Equation (2.3) for IP data. To outline our methodology, it is convenient to introduce a single notation for the data and for the model. We let \(\boldsymbol{d} = (d_1,d_2,...,d_N)^T\) denote the data, where \(N\) is the number of data. Using this notation, \(d_i\) is either the \(i^{th}\) potential in a DC resistivity data set, or the \(i^{th}\) secondary potential/apparent chargeability in an IP survey. Let the physical property of interest be denoted by the generic symbol \(m\) for the model element. The quantity \(m_i\) denotes the conductivity or chargeability of the \(i^{th}\) model cell. For the inversion, we choose \(m_i=\ln(\sigma_i)\) when inverting for conductivities, and \(m_i=\eta_i\) when reconstructing the chargeability distribution.

Having defined a model, we next construct an objective function which, when minimized, produces a model that is geophysically interpretable and reproduces the data \(\boldsymbol{d}\) to a justifiable level based on their associated uncertainties. The details of the objective function are problem dependent but generally we need the flexibility to be close to a reference model \(m_o\) and also require that the recovered model be relatively smooth in all three spatial directions. Here we adopt a right-handed Cartesian coordinate system with \(y\) positive north and and \(z\) positive up. In defining the model objective function, the reference model will generally be included in the first component of the objective function but it can be removed, if desired, from the remaining derivative terms since we are often more confident in specifying the value of the model at a particular point than in supplying an estimate of the gradient. This leads to the following two distinct formulations of the model objective function.

where the weighting functions \(w_s\), \(w_x\), \(w_y\) and \(w_z\) are spatially dependent, and \(\alpha_s\), \(\alpha_x\), \(\alpha_y\) and \(\alpha_z\) are coefficients which affect the relative importance of different components in the model objective function. The reference model \(m_o\) may be a general background model that is estimated from previous investigations or it could be a zero model.

The model objective function in Equation (2.8) is used when the `SMOOTH_MOD_DIF`

option is selected in the inversion input control file while Equation (2.9) is used when the `SMOOTH_MOD`

option is selected in the inversion input control file. The choice of whether or not to include \(m_o\) in the derivative terms can have significant effect on the recovered model.

The relative closeness of the final model to the reference model at any location is controlled by the function \(w_s\). For example, if the interpreter has high confidence in the reference model at a particular region, he can specify \(w_s\) to have increased amplitude there compared to other regions of the model. The interface weighting functions \(w_x\), \(w_y\), and \(w_z\) can be designed to enhance or attenuate structures in various regions in the model domain. If geology suggests a rapid transition zone in the model, then a decreased weighting for flatness can be put there and the constructed model will exhibit higher gradients provided that this feature does not contradict the data.

To perform a numerical solution, we discretize the model objective functions in Equations (2.8) and (2.9) using a finite difference approximation on the mesh defining the conductivity/chargeability model. This yields:

for Equation (2.8) and the following for Equation (2.9).

where \(\boldsymbol{m}\) and \(\boldsymbol{m}_o\) are \(M\)-length discretized model vectors which characterize the conductivity/chargeability distributions within the current model and reference model, respectively. The individual matrices \(\boldsymbol{W}_s\) , \(\boldsymbol{W}_x\), \(\boldsymbol{W}_y\), and \(\boldsymbol{W}_z\) are straight-forwardly calculated once the model mesh and the weighting functions \(w_s\) , \(w_x\), \(w_y\), \(w_z\) are defined. The cumulative matrix \(\boldsymbol{W}_m^T\boldsymbol{W}_m\) is then formed.

Having chosen an appropriate model objective function the next step in setting up the inversion is to define a data misfit measure. Here we use the \(l_2\)-norm measure:

and assume that the contaminating noise in the data is independent and Gaussian with zero mean. Specifying \(\boldsymbol{W}_d\) to be a diagonal datum weighting matrix whose \(i^{th}\) element is \(1/\epsilon_i\), where \(\epsilon_i\) is the standard deviation of the \(i^{th}\) datum, makes \(\Phi_d\) a chi-squared variable distributed with \(N\) degrees of freedom. Accordingly \(E[\chi^2]=N\) provides a target misfit for the inversion.

The inverse problem is solved by finding a model m which minimizes \(\phi_m\) and misfits the data by a pre-determined amount. Thus the solution is obtained by the following minimization problem of a global objective function \(\phi\),

where \(\beta\) is a trade-off parameter that controls the relative importance of the model norm and data misfit. When the standard deviations of data errors are known, the acceptable misfit is given by the expected value \(\phi_{d}^*\). In general, each parameter in the recovered model (\(\boldsymbol{m}\)) lies within its respective lower (\(\boldsymbol{m}^l\)) and upper (\(\boldsymbol{m}^u\)) bound. Chargeability is positive by definition so bounds are used in all IP inversions to implement the positivity constraint.

The choice of the regularization parameter \(\beta\) in the DC resistivity or IP inversion ultimately depends upon the magnitude of the error associated with the data. The inversion of noisier data requires heavier regularization, thus a larger value of \(\beta\) is required. Since the inversion of DC resistivity data is nonlinear, it is also important need to avoid the possibility of getting trapped in a local minima. The following strategy is implemented to determine an adequate \(\beta\) in the program library DCIPoctree.

For known uncertainty distributions, the expected value of \(\phi_d\) is easily calculated. For example, independent data with Gaussian noise of zero mean has an expected target misfit (\(\phi_{d}^*\)) of \(N\) number of data. The value of \(\beta\) should be such that the expected misfit is achieved.

A line search based on the misfit curve as a function of beta is performed to approximate the optimal value of \(\beta\). Due to the high computational expense associated with the inversion, we generally cannot afford to perform the line search by carrying out complete solutions for a series of \(\beta\)’s. Starting with a sufficiently large value of \(\beta\) ensures that the line search will successfully find an appropriate value while avoiding the computational expense of a full line search.

By reducing \(\beta\) by a fixed factor and performing one or two Gauss-Newton updates (which brings the recovered model close to its final solution for that \(\beta\)) for each value in the decreasing sequence it is possible to determine a general range for the optimal \(\beta\) value. Once this range is established the inversion is run to convergence for a few \(\beta\) values using the recovered model from a nearby \(\beta\) value inversion as the initial model for the next inversion. This greatly reduces the computational expense, by limiting the number of iterations required for convergence. The way optimal \(\beta\) value determined using the same basic strategy in both the DC and IP inversion codes. The only difference is that which the DC inversion we need to factor the forward modeling matrix every time that the conductivity model is updated, while in the IP case, only one (initial) factorization is required. The pseudo-code for computing the optimal \(\beta\) is shown in Fig. 2.2.

This inversion methodology provides a basic framework for solving a 3D geophysical inversion with arbitrary observation locations. The basic components are: the forward modelling operator, a model objective function that incorporates information about the reference model, a data misfit function, a trade-off parameter that ultimately determines how well the data will be reproduced, and an optimization algorithm that minimizes an objective function, subject to optional bound constraints. The specifics of the DC and IP data inversion are discussed in the following sections.

## 2.4. Inversion of DC resistivity data¶

The program library DCIPoctree provides a DC resistivity inversion program, `DCoctreeInv`

. The inversion of DC resistivity data, formulated as the minimization of the global objective function (see Equation (2.13)), is nonlinear since the data do not depend linearly upon the conductivity model. A Gauss-Newton approach is used in which the objective function is linearized about a current model, \(\boldsymbol{m}^{(n)}\), a model perturbation is computed, and then used to update the current model. Substituting \(\boldsymbol{m}^{(n+1)}=\boldsymbol{m}^{(n)}+\delta\boldsymbol{m}\) into the global objective function (Equation (2.13)) gives:

where \(\boldsymbol{J}\) is the sensitivity matrix and the element \(J_{ij}\) quantifies the influence of the model change in j-th cell on the i-th datum,

Neglecting the higher order terms (H.O.T.) and setting to zero the derivative with respect to \(\delta\boldsymbol{m}\) yields the following system to solve for the model objective function (Equation (2.8)) used when the `SMOOTH_MOD_DIF`

parameter is specified in the inversion input control file:

where \(\boldsymbol{W}_m^T\boldsymbol{W}_m\) is defined by Equation (2.10).

Similarly, the following system arises when the model objective function (Equation (2.9)) is used (i.e. the `SMOOTH_MOD`

parameter is specified in the inversion input control file):

In these formulations we assume that the matrix \(\boldsymbol{W}_d\) has been absorbed into the sensitivity matrix and data vectors. By solving either of these inverse problems you obtain the model perturbation, which then allows you to generate a new model according to the following relation:

where \(\alpha\) in (0,1] limits the step size and is chosen to ensure that the total objective function is reduced.

The major computational effort in this approach includes the calculation of the sensitivity matrix, solution of the basic linearized Equation (2.16), and the choice of regularization parameter \(\beta\). The sensitivity is computed using the standard adjoint equation approach, and Equation (2.16) or (2.17) is solved using a pre-conditioned conjugate gradient (CG) technique.

## 2.5. Sensitivity Weights¶

### 2.5.1. Computing Sensitivities¶

The computation of sensitivities is useful for depth of investigation analysis and for creating cell weights; the latter of which counteracts the natural tendency of DC/IP inversions to place anomalous structures very near to the electrodes. Instead of computing and outputting the full sensitivities, we instead prefer to compute and output the average or root mean squared sensitivities.

Where \(n\) is the number of data and \(v_j\) is the volume of cell \(j\), the **average sensitivity** of cell \(j\) is given by:

Another quantity used to characterize the sensitivities is the **root mean squared sensitivity** :

For the root mean squared sensitivities, we must compute the diagonal elements of \(\boldsymbol{J^T \! J}\). This can be done analytically or approximated iteratively using **Hutchinson’s method** :

where \(\boldsymbol{u}\) is a random vector and the accuracy of the approximation improves as \(K\) increases. Hutchinson’s method is easy to implement since we have sub-routines for computing the products of \(\boldsymbol{J}\) and \(\boldsymbol{J^T}\) with a vector.

### 2.5.2. Constructing Sensitivity Weights¶

Both the average and root mean squared sensitivities work well for depth of investigation analysis but they cannot be directly implemented as sensitivity weights. To create a sensitivity weights model, we must normalize, truncate and/or smooth the sensitivities. This process is explained as follows.

If the original vector \(\boldsymbol{s}\) was computed using Huchinson’s method or displays strong pixelated features, we may choose too apply smoothing. The smoothing can be applied multiple times and impacts the locations with the largest sensitivities (i.e. near electrodes).

For a vector of average or RMS sensitivities \(\boldsymbol{s}\), and a truncation factor \(0 < \tau < 1\), the sensitivity weights are obtained by:

normalizing \(\boldsymbol{s}\) by its maximum value

replacing any entries \(s_j < \tau\) with \(\tau\) (truncation)

then dividing by \(\tau\) so that the smallest value in the resulting cell weights is equal to 1.

## 2.6. Inversion of IP data¶

To invert IP data it is necessary to linearize Equation (2.4). Let \(\eta_i\) and \(\sigma_i\) denote the chargeability and electrical conductivity of the \(i^{th}\) model cell. Linearizing the potential \(\phi_\eta\) about the conductivity model \(\sigma\) yields:

Substituting into Equation (2.4) yields:

When apparent chargeability is used as the IP data, substituting the above equation into Equation (2.5), yields:

Thus the \(i^{th}\) datum (either secondary potential or apparent chargeability) is exposed as:

where

is the sensitivity matrix. Our inverse problem is formulated as:

where \(\phi_d^{*}\) is a target misfit. Again, for ease of future notation we incorporate the diagonal weighting matrix (\(\boldsymbol{W}_d\)) into \(\boldsymbol{J}\) and \(\boldsymbol{d}\). In practice the true conductivity \(\sigma\) is not known and so we must use the conductivity found from the inversion of the DC resistivity data to construct the sensitivity matrix elements in Equation (2.23).