Image reconstruction in diffuse optical imaging

Linear image reconstruction
The forward problem, discussed above, involves calculating simulated data y (which may be CW, time-domain or frequency-domain data), given a forward operator F and knowledge of the sources q and internal optical properties x (which may include µa and µ's). The forward problem may therefore be formulated as:

$$y=F(x)$$. (3)

To reconstruct an image, it is necessary to solve the inverse problem, which is to calculate the internal optical properties x, given data y and sources q (Arridge 1999):

$$x=F^{-1}(y)$$. (4)

This is a non-linear problem but it can be linearised if the actual optical properties x are close to an initial estimate x0 and the measured data y is close to the simulated measurements y0. This is typically the case in difference imaging where measurements are taken before and after a small change in the optical properties. Then we can expand (3) about x0 in a Taylor series:

$$y=y_0 + F'(x_0)(x-x_0)+F''(x_0(x-x_0)^2+...$$. (5)

where and are the first and second-order Fréchet derivatives of F, respectively. The Fréchet derivative is a linear integral operator mapping functions in the image space to functions in the data space. In some cases, such as in section 3.3.1, the kernel of the integral operator is known in terms of the Green's functions. This approach was taken by Boas et al. (1994).

More generally, the forward problem can be solved by a numeric technique by representing the Fréchet derivatives and by matrices J (the Jacobian) and H (the Hessian), respectively. Equation 5 can then be linearised by neglecting higher order terms and considering changes in the optical properties and data  to give the linear problem (6):

$$\Delta y = \mathbf{J} \Delta x$$. (6)

Linearising the change in intensity in this way gives rise to the Born approximation; linearising the change in log intensity instead is the Rytov approximation and may lead to improved images by reducing the dynamic range of y. Either way, image reconstruction consists of the problem of inverting the matrix J, or some form of normalised J. This may be large, underdetermined and ill-posed but standard matrix inversion methods can be used. These techniques differ in the way in which the matrix inversion is regularised to suppress the effect of measurement noise and modelling errors. Perhaps the most common techniques are truncated singular value decomposition, Tikhonov regularisation, and the algebraic reconstruction technique (ART) (Gaudette et al. 2000). The Moore-Penrose inverse J-1=JT(J JT)-1 offers a more efficient inversion if J is underdetermined and leads naturally to a Tikhonov-type formulation (7) where I is the identity matrix and &lambda; is a regularisation parameter. J itself is calculated from the forward model, most efficiently using the adjoint method of Arridge and Schweiger (1995). The linear inverse problem can then be expressed as:

$$\Delta x = \mathbf{J}^T (\mathbf{J J}^T + \lambda \mathbf{I})^{-1} \Delta y$$. (7)

Non-linear image reconstruction
If the inverse problem cannot be assumed to be that of reconstructing the difference between two similar states, but is instead a reconstruction from a single acquisition to obtain absolute values of the optical properties, the linear approximation cannot be justified and the full non-linear problem must be solved. This is commonly the case in breast imaging (Dehghani et al. 2003), and in static imaging of the brain (Bluestone et al. 2001, Hintz et al. 2001, Hebden et al. 2002). To solve the non-linear problem, we define an objective function &psi;, which represents the difference between the measured data y and data which is simulated using the forward model F(x). If $$\hat x$$ is the distribution of optical parameters which minimises &psi;, then it is also the model which best fits the data, and is therefore taken to be the image. However, because the problem is ill-posed, it must be regularised, and can be expressed as:

$$\psi = \left \Vert y - f(x)\right \Vert ^2 + \alpha \left \Vert \Pi \right \Vert ^2$$, (8)

where &alpha; is a regularisation parameter, and &Pi; represents prior information. This may be very simple, for example &Pi; = I, in which case (8) becomes an iterative solution of (7). However, it is increasingly common for &Pi; to include anatomical and other information, see section 3.4.3. $$\left \Vert. \right \Vert ^2$$ represents the L2-norm and gives the least squares solution. Equation 8 can be extended to include the covariance of the data CP and the covariance of the image CQ as follows:

$$\psi = (y-f(x))^T C_P (y-f(x)) + \alpha (\Pi ^T C_Q \Pi)$$. (9)

This approach gives a weighted least squares solution which reduces the influence of noise and cross-talk both in the data and in the image. Measuring CP and CQ may not be straightforward. Equation 9 is a non-linear minimisation problem which is generally solved either by a Newton method such as the Levenberg-Marqardt algorithm, or by a gradient method such as conjugate gradients (Arridge 1999).

Uniqueness
No general uniqueness results exist for optical imaging, because of the range of different unknowns (µa, µ's, refractive index, coupling coefficients, anisotropy, geometry, etc.) and the range of different measurables (time-domain or frequency-domain measurements or CW measurements of intensity alone). Some useful results do exist, however, to illustrate situations which may not yield unique solutions (Isakov 1993).

The first important uniqueness result directly relevant to the diffusion equation was derived by Arridge and Lionheart (1998) and developed further by Arridge (1999). They demonstrated that measurements of intensity alone cannot separate the effects of absorption and scatter. Furthermore, if the refractive index is also unknown, even frequency-domain or time-domain measurements are not sufficient for uniqueness, although Matcher (1999) showed that uniqueness may be restored if the underlying model is taken to be the P1 approximation. The specific case of optical mammography using a slab geometry was analysed by Romanov and He (2000), who showed that if full time-domain data are available then reflection measurements (where the sources and detectors are on the same side of the slab) can uniquely determine a piecewise continuous distribution of either µa or µ's if the other parameter is given, but transmission measurements can distinguish µa and µ's if both are unknown. In an interesting application of uniqueness theory, Corlu et al. (2003) used a uniqueness argument to determine which wavelengths gave the optimal separation between different chromophores in spectroscopic imaging, and to reconstruct directly for chromophore concentration.

Arridge and Lionheart (1998) illustrated their uniqueness result with an example of two distributions of µa and µ's which gave indistinguishable intensity data on the boundary, and Hoenders (1997) described a whole class of solutions which could not be distinguished from one another. More specifically, Schweiger and Arridge (1999a) plotted the objective function of the inverse problem against µa and µ's for a range of different datatypes extracted from the TPSF and showed that, where certain combinations of datatypes gave a well-behaved minimum, measurements of intensity alone gave an objective function with an extended minimum at approximately µa µ's = constant, suggesting that µa and µ's could not be separated.

However, as Hoenders (1997) points out, the existence of a solution to an inverse problem depends on more than just the uniqueness. It also depends on other issues including the quality of the data and the statistical behaviour of the problem. In particular, even if a problem is non-unique, it may be possible to distinguish between two identical data sets if appropriate prior information is available, whether it is used implicitly in the form of regularisation or normalisation of the Jacobian (Pei et al. 2001, Xu et al. 2002), or explicitly. This may enable the theoretical results outlined above to be reconciled with experimental work by Schmitz et al. (2002) and Pei et al. (2001), and by Jiang et al. (2002) and Xu et al. (2002) who have reconstructed images showing some separation between µa and µ's from intensity data alone. It remains to be seen how non-uniqueness affects reconstruction of clinical images, where the potential for a reconstruction to converge to an incorrect solution may be unacceptable.

Use of prior information
Advances in modelling and reconstruction techniques cannot overcome the fact that diffuse optical imaging is a non-unique, ill-posed, underdetermined problem and that this puts a limit on the image quality, particularly the spatial resolution, which can be achieved. One way to improve the quality of the image reconstruction is to make maximum use of prior information, which can be obtained from anatomical imaging techniques or by considering the physics and the physiology of the problem. This reduces the effect of the ill-posedness by improving the accuracy of the model, and makes the problem better determined by allowing the small number of measurements which are obtained to be used in a more effective way. For example, Ntziachristos et al. (2000) used an MR image of a breast to provide the location of a tumour a priori. Using this approach, the spatial resolution of the optical image effectively becomes that of the MR image. The reduction in quantitative accuracy due to the partial volume effect seen in optical images reconstructed without prior information does not occur, and truly quantitative images of the haemodynamic properties of the tumour can be obtained.

The most straightforward way to include prior information into the image reconstruction is to use an anatomically realistic forward model (section 3.3.3). This increases the accuracy of the forward model so that it can represent the measurements more precisely, thereby improving the image reconstruction. Further improvements can be made by including information about the covariance of the data CP, and of the image CQ, as in (9). The diagonal of CP is simply the variance of each measurement, and gives an indication of the reliability of that measurement. Off-diagonal elements are the covariance between each pair of measurements and give an indication of the interdependence between measurements. Introducing CP into (9) transforms the fit between the data and the model into a basis where all measurements are independent. CQ contains information about the predicted smoothness of the image. Prior knowledge of the anatomy can also be incorporated into the inverse problem. In the example of a breast tumour given above, the change in optical properties is assumed to come from either a region of interest defined from anatomy (Ntziachristos et al. 2002) or be heavily biased towards that region (Brooksby et al. 2003, Li et al. 2003). The degradation in image quality due to uncertainties in the anatomical prior, and the best way to minimise this degradation, are as yet uncertain (Schweiger and Arridge 1999b). Including prior anatomical information has advantages for both linear and non-linear reconstruction (Boas et al. 2004a)

Another approach is to express (8) as a statistical inverse problem in a Bayesian framework. Bayes theorem can be written as p(x|y) &prop; p(y|x)p(x) where p(x|y) is the posterior probability of obtaining the image x given the data y, p(y|x) is the likelihood of obtaining y given x (which is obtained by solving the forward problem), and p(x) is the prior density, which in these terms is simply an estimate, made before carrying out the experiment, of the most likely image. If we assume that the distribution of errors is Gaussian, then Bayes theorem takes the form

$$p(x|y) \propto exp(-\left \Vert y - f(x)) \right \Vert ^2) exp(-\left \Vert \Pi \right \Vert^2)$$. (10)

where the likelihood $$p(y|x) \propto exp(- \left \Vert y - f(x) \right \Vert ^2)$$ and the prior $$p(y|x) \propto exp(- \left \Vert \Pi \right \Vert ^2)$$. The value of x which maximises (10) is the maximum a posteriori (MAP) estimate. Maximising (10) is equivalent to minimising its negative logarithm, which is (8), but can offer further approaches which deal with prior information and covariance estimates in a direct and intuitive manner (Kwee 1999, Kohlemainen 2001, Mosegaard and Sambridge 2002, Evans and Stark 2002, Oh et al. 2002).

Recently, efforts have been made to incorporate Monte Carlo techniques into the inverse problem. Standard Monte Carlo techniques are too slow to be used for practical image reconstruction, although faster methods are available. One example is the Metropolis-Hastings algorithm which is an example of a Markov Chain Monte Carlo (MCMC) technique, in which individual photon steps are drawn from a biased random walk which is chosen to minimise the number of discarded photons (Mosegaard and Sambridge 2002). These techniques have been used as the basis of image reconstruction algorithms in EIT (Kaipio et al. 2000, West et al. 2004), partly because the Monte Carlo simulations form a probability density which can be interpreted as the Bayesian maximum a posteriori estimate, leading very naturally to a statistical expression of the inverse problem (Evans and Stark 2002).

The approaches discussed above assume that the prior information, whether in terms of the data covariance or the image statistics, is governed by Gaussian statistics. This is not necessarily the case: a Poisson model for the data may be more appropriate if photon counting errors are significant, and assuming that the pixel values are distributed as Gaussian random variables forces the image to be homogeneously and isotropically smooth. The latter issue has been identified as a significant weakness and has been addressed in two ways. First, Kaipio et al. (1999) introduced an inhomogeneous, anisotropic image covariance term which allowed, for example, a change in optical properties in the brain to be correlated more closely with neighbouring pixels in the brain than with those in the skull. A second approach is to minimise the total variation, defined as the integral of the absolute gradient of the image, rather than the L2-norm (Borsic 2002). The L2-norm forces the image to be smooth, and is optimal if the optical properties are distributed as a Gaussian random field. If the distribution of optical properties is known to be piecewise constant, the L2-norm will smooth the image, whereas minimising total variation will reduce oscillations in the image but still allow boundaries to be sharp. Other non-Gaussian priors may be considered.

Methods that exploit symmetry
Some improvements to speed and robustness of the inverse problem can be obtained for systems that possess symmetry. Examples are rotational symmetry in a cylinder or sphere, or translational symmetry in a slab. It is easiest to develop this idea for the analytical version of the linear inversion kernels based on Green's functions. Green's functions are the kernels of the inverse solver for a PDE, and when the domain possesses symmetry, the PDE may often be solved by a separation of variables technique, which implies that the inversion kernels can be expressed as the product of functions in the separated variables. In the work of Markel and Schotland (2001), the slab geometry gives rise to a product of transverse plane waves and one-dimensional integral equations in the perpendicular direction. It is this decoupling of transverse propagating waves, and perpendicular scalar waves that allows faster inversion of a set of one dimensional integral equations, rather than one coupled 3D integral equation. The solution of these linear equations builds up a solution in the basis of the transverse component, which can then be transformed into image space by a fast transform method. Similar methods apply in the cylindrical and spherical geometries where the transverse components are Fourier and spherical harmonic waves respectively. Furthermore the case where only part of the data is sampled can be included by convolution with a masking function. Finally Markel et al. (2003) considered extension to the nonlinear case by applying an iterative method, linearised around the symmetric case.

As well as the analytical case, a discrete implementation such as FEM can also be developed. This approach was developed by Metherall et al. (1996) for EIT, and applied to optical tomography by Hampel and Freyer (1998).

Solving for coupling coefficients
The measured data in optical tomography depends not only on the properties of the object under examination, but also on the characteristics of the source and detector fibres, instabilities in the source power and detector efficiency, and on the efficiency with which light is coupled into and out of the medium. A well-designed calibration procedure can minimise the effects of the optical fibres and the instrumentation (Hillman et al. 2000, McBride et al. 2001) but characterising the coupling between the fibres and the tissue can be more difficult. One approach which has been successfully adopted, particularly with regard to optical topography, is difference imaging, in which images are reconstructed using the difference between measurements recorded at two states in such a way that coupling effects cancel (Section 4.1). Alternatively, images can be reconstructed using temporal data only (Hebden et al. 2002), although this may reduce the ability to separate the absorption and scattering properties. These approaches are not always appropriate, however, and so techniques have been developed to correct for coupling effects during image reconstruction.

Schmitz et al. (2000) and Boas et al. (2001) introduced a method for solving for the unknown coupling coefficients as part of the image reconstruction problem. Each source and detector was associated with a complex coupling term. A measurement Mij between source i and detector j was modelled as Mij = SiDjmij where Si and Dj are the source and detector scaling factors and mij is the ideal measurement, depending only on the internal optical properties of the object. The reconstruction problem is then posed in terms of minimising the difference between the measured data and SiDjmij, where Si and Dj are solved for in the inverse problem. Images could be reconstructed successfully if up to 80% uncertainty in amplitude was added to simulated data (Boas et al. 2001). The technique has since been extended to allow small errors in optode position to be corrected (Culver et al. 2003, Stott et al. 2003) and has been used for dynamic breast imaging (Intes et al. 2003). Vilhunen et al. (2004) have implemented a related method which can be used when the sources and detectors have rotational symmetry. Oh et al. (2002) used a Bayesian formulation to reconstruct 3D images non-linearly by treating the unknown source-detector coupling coefficients as unwanted nuisance parameters.

Use of dynamic information
Image reconstruction is generally applied to the spatial domain but additional information can be obtained by examining the temporal evolution of the signal. Physiological signals from the heart rate, ventilation and blood vessels are time-dependent. If the image acquisition time is short compared to the physiological fluctuations, an image time series can be obtained, which can then be processed to give dynamic images showing parameters such as the covariance or the amplitude and phase at a particular frequency of physiological interest (Barbour et al. 2001, Graber et al. 2002).

A different situation occurs when the image acquisition rate is not short compared to the physiological changes. In this case, the optical properties change during the acquisition of each image and so a straightforward static image reconstruction would not be valid. The Kalman filter technique has been used to model the image space as a state whose properties evolve with time in a known manner, the details of which are supplied as prior information. It may be expressed as:

$$x_{t+1} = \mathbf{K} x_t + n_t$$, (11)

where xt is the image x at time t, K is the state transition matrix and nt is additive noise at time t. In the simplest case, K is the identity matrix, in which case the update is a random walk. This approach helps to condition the inverse problem by using the current state, and knowledge of how it evolves, to constrain the possible solutions for the next state. It can be used to image events which occur on a shorter timescale than the image acquisition by calculating an updated image using the Kalman filter when only a limited subset of data has been acquired. This is ideally suited for systems which acquire data from sources sequentially but the detectors simultaneously, in which case an updated state can be estimated when each source is activated. This approach has been successfully applied to both simulated data (Kohlemainen et al. 2003) and real data (Prince et al. 2003) in optical tomography. Eppstein et al. (2001, 2002) have developed an image reconstruction technique for fluorescence tomography which uses the Kalman filter to update each iteration of a static non-linear reconstruction procedure.

Recovery of object shape
Another approach to optimising the use of data in optical tomography is not to reconstruct for a large number of image pixels, which leads to a highly underdetermined problem, but instead to reconstruct for the boundaries and properties of internal regions which can be assumed to have piecewise constant optical properties.

One method to solve for the boundaries of internal regions is to formulate a minimisation problem where the image parameters are the Fourier coefficients of the smooth region boundaries. This was first solved assuming the optical properties of the regions are known a priori (Kohlemainen et al. 1999), and later extended to solve simultaneously for both the region boundaries and the optical properties (Kohlemainen et al. 2000a). A further refinement was to use an initial static image reconstruction to identify the number of regions and their approximate locations before finally solving for the boundaries and optical properties of the regions (Kohlemainen et al. 2000b). In 3D, it may be more appropriate to assume that regions of interest are ellipsoidal and to solve for the parameters of the ellipse rather than Fourier components (Kilmer et al. 2003).

A promising technique for the inverse shape estimation problem is provided by representing internal regions using a level set function. This is a function which is 1 inside the regions and 0 elsewhere. This method assumes that the optical properties of the background and regions of interest are known, but does not place constraints on the number of regions or their shapes. Dorn (2002) implemented this method, thresholding an initial spatial reconstruction to provide a starting estimate for the internal objects which are then described using a level set whose boundaries evolve as the reconstruction progresses.