A Fast Nonlinear Method for Parametric Imaging of Myocardial Perfusion by Dynamic 13N-Ammonia PET
=================================================================================================

* S. Raymond Golish
* Jens D. Hove
* Heinrich R. Schelbert
* Sanjiv S. Gambhir

## Abstract

A parametric image of myocardial perfusion (mL/min/g) is a quantitative image generated by fitting a tracer kinetic model to dynamic 13N-ammonia PET data on a pixel-by-pixel basis. There are several methods for such parameter estimation problems, including weighted nonlinear regression (WNLR) and a fast linearizing method known as Patlak analysis. Previous work showed that sigmoidal networks can be used for parameter estimation of mono- and biexponential models. The method used in this study is a hybrid of WNLR and sigmoidal networks called nonlinear regression estimation (NRE). The purpose of the study is to compare NRE with WNLR and Patlak analysis for parametric imaging of perfusion in the canine heart by 13N-ammonia PET. **Methods:** A simulation study measured the statistical performance of NRE, WNLR, and Patlak analysis for a probabilistic model of time–activity curves. Four canine subjects were injected with 740 MBq 13N-ammonia and scanned dynamically. Images were reconstructed with filtered backprojection and resliced into short-axis cuts. Parametric images of a single midventricular plane per subject were generated by NRE, WNLR, and Patlak analysis. Small regions of interest (ROIs) were drawn on each parametric image (8 ROIs per subject for a total of 32). **Results:** For the simulation study, the median absolute value of the relative error for a perfusion value of 1.0 mL/min/g was 16.6% for NRE, 17.9% for WNLR, 19.5% for Patlak analysis, and 14.5% for an optimal WNLR method (computable by simulation only). All methods are unbiased conditioned on a wide range of perfusion values. For the canine studies, the least squares line fits comparing NRE (*y*) and Patlak analysis (*z*) with WNLR (*x*) for all 32 ROIs were *y* = 1.02*x* − 0.028 and *z* = 0.90x + 0.019, respectively. Both NRE and Patlak analysis generate 128 × 128 parametric images in seconds. **Conclusion:** The statistical performance of NRE is competitive with WNLR and superior to Patlak analysis for parametric imaging of myocardial perfusion. NRE is a fast nonlinear alternative to Patlak analysis and other fast linearizing methods for parametric imaging. NRE should be applicable to many other tracers and tracer kinetic models.

*   parametric imaging
*   13N-ammonia PET
*   sigmoidal networks
*   artificial neural networks
*   function estimation
*   nonlinear regression

Estimation of the parameters of tracer kinetic models is increasingly important in clinical and research PET studies. In parametric imaging, a tracer kinetic model is fit to time–activity curve data on a pixel-by-pixel basis. The result is a quantitative image of the parameters of a physiologic or biochemical model. An important application of parametric imaging is myocardial perfusion imaging. 13N-ammonia PET is widely used for the estimation of myocardial perfusion (1,2). Parke-Davis has sponsored a multicenter, 300-patient trial in which a compartmental model is being fit to dynamic 13N-ammonia data in the resting and stressed heart.

Numerical methods for parametric imaging differ in statistical performance, computational cost, and generality. Weighted nonlinear regression (WNLR) approaches are the most general. Although the basic WNLR approach offers good statistical performance, it derives from approximations that introduce error (3). Many refined WNLR approaches have also been studied (4–7). However, WNLR methods are iterative and time-consuming to compute.

Patlak analysis is a type of fast linearizing method. These methods differ considerably, but all use integration and linear regression to speed solution (8–11). Linearizing approaches are accurate and fast for a variety of problems (1,2,12,13), but they still have drawbacks. Though some linearizing methods handle all linear compartmental models (11), none handles nonlinear tracer kinetic models (except for linearizing perturbation experiments). Also, linearizing methods often do not estimate the microparameters directly but require model-specific nonlinear algebra to recover them from the macroparameters. For these reasons, it would be desirable to have a fast, nonlinear method that learns parametric imaging of the microparameters from simulated data.

Previous work from this group (14,15) and others (16,17) showed that sigmoidal networks can be used for estimation of the parameters of compartmental models. Sigmoidal networks are a type of nonlinear function estimator. Function estimators are empirical tools from statistics, connectionism, and fuzzy systems that learn nonlinear input-output mappings from data (18,19). They have been used successfully for most major engineering problems including control, prediction, filtering, pattern recognition, and system identification. Whereas WNLR methods for parametric imaging approximate the maximum likelihood (ML) estimate, function estimation methods approximate conditional expectation. Here, the approaches are combined to approximate the ML estimate orders of magnitude faster than is otherwise possible. This hybrid method is called nonlinear regression estimation (NRE) because it uses a function estimator trained on optimal WNLR estimates.

The goal of this study is to compare the statistical performance and computational cost of NRE with WNLR and with Patlak analysis for parametric imaging of myocardial perfusion by 13N-ammonia PET. A simulation study estimated the performance of all 3 methods for ischemic, resting, and hyperemic myocardium. Parametric images of 4 canine subjects with resting and ischemic myocardium were computed by NRE, WNLR, and Patlak analysis. The results of this study show that NRE is an accurate, very fast method for perfusion imaging and has the potential to be an accurate, very fast, nonlinear method for parametric imaging with many tracers and tracer kinetic models.

## MATERIALS AND METHODS

### Simulation of Time–Activity Data

This section describes simulation of time–activity curve data used for training NRE and testing all estimators. A 2-compartment model for the tracer kinetics of 13N-ammonia (20,21) was used. The distribution of tracer is described by a continuous-time, linear time-invariant (LTI) system: \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [\mathbf{{\dot{q}}}{\,}{=}{\,}\mathbf{Aq}{\,}{+}{\,}\mathbf{Bc}.] \end{document}Eq. 1 ![Formula][1]</img> \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [K_{1}{=}\left(\right.\mathrm{exp}\left(\right.\frac{0.5F\ {+}\ 1.25}{F}\right)\ {-}\ 1\ \right),] \end{document}Eq. 2 where the state variables are Qf(*t*), the free activity in a relatively fast compartment (Bq/g), and Qs(*t*), the metabolically trapped activity in a relatively slow compartment (Bq/g). The input to the model is Cp(*t*), the activity in the perfusing coronary vasculature (Bq/mL), which was assumed to be equal to the left ventricular blood activity. The parameters of the model are *F*, the myocardial perfusion (mL/min/g); *f*, the spillover from blood to tissue (which is unitless); *Vf*, the (actual) volume of distribution; and *kr*, the return rate constant. In practice, the volume of distribution and return rate constant are fixed to known values (*Vf* = 0.8 mL/g and *kr* = 0 for these experiments).

The model of the input function is a modified version of the model of Thompson et al. (22). The modification is an asymptotic recirculation term that yields the model \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [C\_{p}(\mathit{t}){\,}{=}{\,}C\_{max}\left(\frac{exp(1)}{{\alpha}{\beta}}{\,}(t{\,}{-}{\,}t\_{0})\right)^{{\alpha}}{\ }exp\left(\frac{{\,}{-}{\,}(t{\,}{-}{\,}t_{0})}{{\beta}}\right)] \end{document}Eq. 3 \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [{+}C_{0}(1{-}exp\mathit{({-}(t{-}t_{0}}){/}{\tau})),] \end{document} where Cp(*t*) is the activity in the vasculature (Bq/mL); α (unitless), β (s), and Cmax (Bq/mL) are parameters of the original Thompson model; C (Bq/mL) and τ (s) are parameters of the modified model; and t (s) is the appearance time. The model is 0 before time t. In practice, the appearance time is fixed to a known value (5 s). Also, the time constant τ is automatically scaled so that the recirculation term reaches a maximum when the original model falls off to 5% of its peak. Therefore, there are only 4 independent input function parameters: α, β, Cmax, and C.

Sampling errors were modeled as follows. The input function samples are temporal averages of the input function model: \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [\mathbf{{\bar{c}}}{\,}{=}{\,}\frac{1}{t\_{i{\,}{+}{\,}1}{\,}{-}{\,}t\_{t}}{\,}{{\int}\_{\mathit{t\_{i}}}^{\mathit{t\_{i{\,}{+}{\,}1}}}}{\,}\mathit{C}\_{p}\mathit{(t)dt},] \end{document}Eq. 4 where **m̄** denotes a sequence of model output for *i* = 1..*m* − 1, where *m* is the number of time intervals. The effect of integration is significant for the input function model because we use left ventricular PET data rather than arterial sampling data. The blood time–activity curve is sparsely sampled in the typical PET protocol, but this is largely unavoidable. By contrast, the tissue time–activity curve is usually well sampled and piecewise linear over the observation intervals. In this case, ideal samples are equivalent to temporal averages, and the midpoint approximation applies \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [\mathbf{{\bar{y}}}{\,}{=}{\,}\mathit{y}((\mathit{t}_{\mathit{i}{\,}\mathit{{+}{\,}1}}{\,}{+}{\,}\mathit{t_{i}}){/}2),] \end{document}Eq. 5 where ȳ denotes a sequence of model output over the observation intervals and y(.) denotes Equation 2 evaluated at 1 time point.

There are several sources of random error in reconstructed PET time–activity curves. We assumed that noise is additive, independent, and gaussian with zero mean and with variance directly proportional to signal (1,3,7). Several noise models have been proposed for time–activity curve data. Other noise models are SD proportional to variance (also called constant coefficient of variation) (4,23) and noise equivalent counts (24).

Table 1 presents the ranges of 8 parameters used for simulation: 4 parameters of the blood time–activity curve signal model, 2 parameters of the tissue time–activity curve signal model, 1 parameter of the blood time–activity curve noise model, and 1 of the tissue time–activity curve noise model. The probability density for perfusion is an exponentially decaying density between the limit values. All other parameters have uniform density. The parameter ranges of the blood time–activity curve model were estimated from a set of 14 canine blood time–activity curves taken from studies at University of California, Los Angeles; none of these blood time–activity curves was from the 4 canine study subjects. A typical constrained regression strategy was used to fit the 14 blood time–activity curves.

View this table:
[TABLE 1](http://jnm.snmjournals.org/content/42/6/924/T1)

TABLE 1 
Ranges of Probability Densities for 8 Parameters Used in Monte Carlo Simulation of Blood and Tissue Time–Activity Curves

To simulate time–activity curves, 8 parameter values were drawn from a standard linear congruential pseudorandom number generator with a period much larger than the sample size. Using these parameters, 100 ideal samples of the blood time–activity curve model were computed, 1 sample per second over the range 1–100 s. These were used as input to a discrete time system, the state transition matrix (STM) solution of a continuous-time LTI system with first-order hold (FOH) input. An FOH input is a theoretical function that reconstructs discrete samples of the input function. The STM solution is exact in the case that the input function is actually FOH. In practice, the STM solution is nearly exact if the discrete input samples are fast, noiseless, and ideal. That is the case in this study. Other general solutions to LTI systems (time-domain and frequency-domain) have similar restrictions. The output of the STM solution was 100 ideal samples of the 2-compartment model. The sampling scheme for experimental PET data is a set of 10 time intervals of 10 s each between 0 and 100 s. To convert simulated data to this sampling scheme, the input samples were integrated with a trapezoid rule whereas the output samples were handled with the midpoint approximation. Gaussian noise with zero mean and variance proportional to signal was added to both the output and integrated input signals. Simulated time–activity curves with negative values caused by the noise model were discarded, because these time–activity curves do not model the negative values that sometimes occur in reconstructed time–activity curves. (Conditioning on positivity of the time–activity curves introduces only a very small bias.) Figure 1 is an example of a simulated pair of blood and tissue time–activity curves.

![FIGURE 1.](http://jnm.snmjournals.org/https://jnm.snmjournals.org/content/jnumed/42/6/924/F1.medium.gif)

[FIGURE 1.](http://jnm.snmjournals.org/content/42/6/924/F1)

FIGURE 1. 
Example of 1 simulated pair of blood and tissue time–activity curves. Curve that spikes is blood time–activity curve. Smooth line is continuous activity; ▿ = integrated activity; ▵ = noisy activity (final time–activity curve). Curve that does not spike is tissue time–activity curve. Smooth line is continuous activity; ○ = integrated activity; × = noisy activity (final time–activity curve). For simulation: perfusion = 1.0; spillover = 0.0; blood time–activity curve noise level = 2%; tissue time–activity curve noise level = 20%; all other parameters are drawn at random from Table 1. Note high noise level in tissue time–activity curve and sampling error at 5 s for blood time–activity curve = ▿.

### Weighted Nonlinear Regression

The most common approach to parametric imaging is likelihood inference. For independent gaussian errors, the ML estimate is a minimum of the log-likelihood or weighted residual sum of squares: ![Formula][2]</img> where ŷi are the elements of **ŷ**, which is the PET tissue time–activity curve; *ȳi***(p)** are elements of **ȳ****(p)**, which is the model output; **p** denotes the vector of parameters; ςi is the SD of *ŷi* and ∥ ∥ is a weighted 2-norm. WNLR is the process of computing Equation 6 by an iterative method for optimization or nonlinear equations.

For Equation 6 to define the ML estimate, the ςi values and Cp(*t*) must be known. But in practice, these are replaced with estimates. Because of the proportional variance noise model, the *ŷ*i values are an estimate of the ςi values up to a constant that does not affect WNLR. To reconstruct Cp(*t*), it was assumed that PET blood time–activity curves are fast, noiseless, ideal samples (3). The blood time–activity curves were linearly interpolated and then used as input to the STM solution with FOH. This implies the midpoint approximation, which does not strictly hold. The error introduced into WNLR by estimates for the ςi values and Cp(*t*) is significant, but the method is still a good one. This is the standard WNLR method for parametric imaging. More advanced methods are considered in the Discussion.

WNLR for parametric imaging may be ill posed: The solution may not be unique (multiple local optima) or may not exist (nonconvergence or instability). Only the latter problem was experienced during this study, though the former may also arise in parameter estimation (14). Nonconvergence necessitates that prior knowledge be used. During this study, parameter ranges were constrained with simple bounds. In likelihood inference, these are deterministic constraints on the ML estimate. In Bayesian inference, they are uniform priors. Because the same bounds were used for estimation as for Monte Carlo simulation of the forward model (Table 1), WNLR approximates the maximum a posteriori probability estimate. For all WNLR, perfusion and spillover were estimated using a Levenberg-Marquardt method with simple bounds through active sets (25).

### Nonlinear Regression Estimation

Methods for function estimation or statistical learning have arisen in the fields of nonparametric statistics, neural networks, and fuzzy systems. Sigmoidal networks, nonparametric kernel estimators, and support vector machines are closely related methods that use a univariate nonlinear transformation (usually to a higher dimension) followed by linear regression on the basis functions (26–31). In the deterministic case, such methods approximate nonlinear functions. In the probabilistic case, they estimate functions from noisy data. In theory, if a function approximator with a set of input random variables is brought into mean-squared correspondence with an output random variable, then it approximates conditional expectation (18) \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [\mathbf{{\hat{p}}}(\mathbf{{\hat{y}}}){\,}{=}{\,}\mathit{E}(\mathbf{p}{\vert}\mathbf{{\hat{y}}}).] \end{document}Eq. 7

Conditional expectation is the mean-squared optimal estimator (32). Fuzzy systems also approximate conditional expectation (19).

In this study, sigmoidal networks were used: \batchmode \documentclass[fleqn,10pt,legalpaper]{article} \usepackage{amssymb} \usepackage{amsfonts} \usepackage{amsmath} \pagestyle{empty} \begin{document} [o(\mathbf{i}){\,}{=}{\,}{\sum}\_{\mathit{j}{\,}{=}{\,}1}^{s}{\,}\mathbf{w}_{0}^{(j)}g(\mathbf{w}_{\mathit{j}}{\,}{\cdot}{\,}\mathbf{i}{\,}{+}{\,}\mathit{b_{j}}),] \end{document}Eq. 8 where i is the input, w and b are parameters, g(.) is a univariate sigmoidal function, and *s* is the number of basis functions. The superscript denotes the elements of a vector, and the subscript denotes the elements of an ordered set. Sigmoidal networks are shown graphically in Figure 2. Sigmoidal networks are often called feedforward artificial neural networks.

![FIGURE 2.](http://jnm.snmjournals.org/https://jnm.snmjournals.org/content/jnumed/42/6/924/F2.medium.gif)

[FIGURE 2.](http://jnm.snmjournals.org/content/42/6/924/F2)

FIGURE 2. 
Graphical representation of sigmoidal networks. Here, *r* is dimensionality of input, *s* is number of basis functions, and each node represents dot product of input and local weights, followed by a sigmoidal transfer function. Output node has linear transfer function (linear regression on basis functions). Sigmoidal network for perfusion imaging has 20 inputs (10 samples each of blood and tissue time–activity curve), 52 basis functions, and 1 output (perfusion). Use of single output is explained in Discussion.

To approximate Equation 7 for parametric imaging of perfusion, a sigmoidal network would be trained with simulated time–activity curve data for inputs and known perfusion values for outputs using a mean squared error criterion. But instead of using the known perfusion values as network outputs, an optimal WNLR estimate was precalculated as follows. Whereas real-world data analysis requires estimates for Cp(*t*) and the ςi values, simulation allows WNLR with the known values for Cp(*t*) and the ςi values. This is very close to an optimal estimate. The only difficulty in learning the optimal WNLR estimate is that it is conditioned on the known ςi values and Cp(*t*) in addition to the observed data. But the sigmoidal network compensates by approximating conditional expectation to yield ![Formula][3]</img>Eq. 9 where ĉ is the observed blood time–activity curve. The use of a function estimator to learn an optimal WNLR estimate is NRE, a fully nonlinear method for parameter estimation.

To compute the optimal WNLR estimate to be used for sigmoidal network training, an important modification to Equations 1 and 2 was used. The logarithm of perfusion (log-perfusion) was considered, rather than perfusion directly. To simulate time–activity curves, a uniform density was used over log-perfusion; this is an exponentially decaying density for perfusion (Table 1). Instead of estimating perfusion directly, log-perfusion and its variance were estimated with optimal WNLR.

To train a sigmoidal network for NRE, a training set of 20,000 pairs of blood time–activity curves and tissue time–activity curves was simulated. The optimal WNLR estimate of log-perfusion was computed for each pair. Ten samples of the blood time–activity curve and 10 samples of the tissue time–activity curve were the inputs to a sigmoidal network, and the optimal WNLR estimate of log-perfusion was the output. The network used a logistic transfer function and a linear output function (linear regression on the basis functions). To estimate network weights, a simple cross-validation strategy was used, in which 3 sets of samples were simulated: training, validation, and test samples. The latter 2 samples each consisted of 25% of the number of training samples (18). Using a conjugate gradient method, the weighted mean squared error (WMSE) was optimized between the network and the log-perfusion training data; the log-perfusion variance estimates were the weights. Convergence was defined as a decrease in the training set and an increase in the validation set for 20 consecutive iterations (additional convergence criteria were defined but did not result in termination). After convergence, the WMSE of the test set was computed as a final check against overfitting. The number of basis functions was chosen pragmatically. The initial estimate was 20 basis functions, the number of inputs (18). The number was sequentially increased by 4, monitoring the WMSE of the test set.

### Canine and Simulation Studies

For the simulation study, 2,000 pairs of blood and tissue time–activity curves were simulated for each of 7 perfusion values: 0.25, 0.50, 0.75, 1.00, 2.00, 3.00, 4.00 (mL/min/g). The other parameters were as in Table 1. Perfusion for each of the 7 sets of 2,000 samples was estimated by WNLR, NRE, Patlak analysis, and optimal WNLR. For Patlak analysis, the Patlak transform was used followed by unweighted linear regression; leading zeros and 1 early data point were removed, as is necessary for the method.

In the canine studies, 13N-ammonia (740 MBq) was administered to anesthetized animals with 1 ligated vessel. The protocol for handling the subjects has been described (33). The PET data were reconstructed with filtered backprojection and resliced to achieve short-axis cuts. For each subject, the blood time–activity curve was obtained from an ROI in the left ventricular chamber and averaged over several planes. A parametric image was generated for a single midventricular short-axis cut per subject. After calculating parametric images, 8 small circular ROIs (29 pixels) were drawn for each subject, and the spatial average perfusion for each ROI was computed to yield a total of 32 ROI measurements of perfusion. No partial-volume correction was applied, because this correction can be applied to any parametric imaging method without modification. No metabolite correction or decay correction is needed for data to 100 s (34).

The numerical code is in Matlab 5.3 (The Mathworks Inc., Natick, MA). The platform is a 233-MHz Pentium II system with 128 MB of memory.

## RESULTS

### Simulation study

Table 2 presents the results of the simulation study conditioned on a wide range of perfusion values and marginal over all other parameters in the model. All methods are unbiased (by *t* tests) for all entries in Table 2. The median absolute value of the relative error is a nonparametric measure of total error due to both bias and variance. The ideal WNLR estimator provides a benchmark for performance. The bias and spread of the WNLR and Patlak estimators are similar to those of other reports (1,2).

View this table:
[TABLE 2](http://jnm.snmjournals.org/content/42/6/924/T2)

TABLE 2 
Perfusion Estimates for Simulation Study

### Canine Studies

Figure 3 compares parametric imaging by NRE, WNLR, and Patlak analysis in 1 canine subject. Visually, the NRE perfusion images are quite similar to the WNLR images for all 4 subjects. Figure 4 compares WNLR with NRE and Patlak analysis for 32 ROIs (8 ROIs per subject). The slope between Patlak analysis and WNLR is similar to that previously reported (2).

![FIGURE 3.](http://jnm.snmjournals.org/https://jnm.snmjournals.org/content/jnumed/42/6/924/F3.medium.gif)

[FIGURE 3.](http://jnm.snmjournals.org/content/42/6/924/F3)

FIGURE 3. 
Parametric images of myocardial perfusion by NRE (A), WNLR (B), and Patlak analysis (C). Pseudocolored scale bar shows units of mL/min/g. Ischemic region at upper right is caused by vessel ligation.

![FIGURE 4.](http://jnm.snmjournals.org/https://jnm.snmjournals.org/content/jnumed/42/6/924/F4.medium.gif)

[FIGURE 4.](http://jnm.snmjournals.org/content/42/6/924/F4)

FIGURE 4. 
Scatter plot of NRE versus WNLR (○) and Patlak analysis versus WNLR (×) for 32 ROIs taken from 4 canine subjects. Least squares line fits are also shown.

WNLR required about 2 h to process a 128 × 128 parametric image (between 5,000 and 7,000 time–activity curves are processed after thresholding the sum activity image). Patlak analysis required 15 s and NRE required 3 s. Initially, sigmoidal network training for NRE required about 4 h and terminated in 400 iterations with a WMSE of 0.012 for a final value of 52 basis functions. Simulation of time–activity curves for training and calculation of optimal WNLR estimates required 8 h for 20,000 time–activity curves. The 12-h training process need only be done once, after which NRE generates parametric images in 3 s.

## DISCUSSION

The simulation studies and canine studies show that the statistical performance of NRE is competitive with WNLR. NRE is accurate and stable because it combines the optimal WNLR method with the ability of the sigmoidal network to approximate conditional expectation. Many other refined WNLR methods exist, including those that use input function models (4,5,35), wavelet domain filters (36), and multivariate statistics (e.g., principal components (37), factor analysis (38), and independent components (39)). Feng et al. (6) validated a double modeling approach; O’Sullivan and Saha (7) used ridge regression. Like these methods, NRE achieves good statistical performance and can be combined with any refined method. However, NRE is the only nonlinear method that also speeds evaluation by orders of magnitude.

For fast parametric imaging, linearizing methods are usually used. The simulation and canine studies show that the statistical performance of NRE is superior to Patlak analysis, a well-validated linearizing method for perfusion imaging with 13N-ammonia (2). Feng’s generalized linear least squares (GLLS) (11) is an accurate and fast linearizing method that is well validated for perfusion imaging (1). Other linearizing methods include weighted integration (9) and linear least squares (10). Linearizing methods compare favorably with WNLR for parametric imaging of cerebral metabolic rate of glucose (12) and cerebral blood flow (13). Thus, the statistical performance of linearizing methods can be quite good. The results of this study suggest that the performance of NRE may be competitive with any method.

The NRE approach has advantages over traditional WNLR. Because the optimal WNLR estimate is very stable, multiple parameters can be estimated accurately under noisy conditions. Also, the use of a sigmoidal network to approximate optimal WNLR speeds parametric imaging by orders of magnitude (once the sigmoidal network is trained). In principle, NRE is more complex than a typical sigmoidal network approach, but it has several advantages that actually make training simpler. Recall that in conventional sigmoidal network training, the known parameter values are used for training rather than the optimal WNLR values. The known parameter values can be viewed as noisy point estimates, but the NRE training data exhibit the statistical properties of optimal WNLR. Therefore, NRE with even the simple training procedure used in this study produces images that are very similar to WNLR images. The training approach is robust to the stopping criteria, the number of iterations, the initial conditions, and even the number of basis functions. Although the traditional approach can likely be used for parametric imaging as well, successful training would require a more sophisticated strategy including optimization of several measures of bias and spread with respect to the stopping criteria, the initial conditions, and the number of basis functions. When using a simple training scheme, we found that sigmoidal networks trained with the conventional approach yield biases as high as 52% for low perfusion values (though for high perfusion values the biases are competitive with NRE performance) and seem to exhibit more trend in the bias at low perfusion values than NRE. Most importantly, images generated by sigmoidal networks trained with a simple version of the conventional approach are not as visually similar to WNLR images as are the NRE images. Another advantage of NRE over a traditional learning approach is that it allows multiple model parameters to be estimated by multiple networks having 1 output each. This is an easier problem than estimating multiple model parameters with 1 network having multiple outputs, as is necessary for Equation 7. Determining whether a sigmoidal network trained using the conventional approach can achieve the performance of NRE is an interesting direction for future work.

Two techniques were used to optimize the statistical performance of NRE. First, log-perfusion was estimated so that perfusion could be estimated accurately over several orders of magnitude. This property is necessary so that ischemic, resting, and hyperemic myocardium may exist in the same image. (The canine measurements are of ischemic and resting myocardium because the low perfusion case is harder.) Parameters such as spillover that are estimated over approximately 1 order of magnitude can be estimated directly. Second, the NRE method uses the WMSE criterion for sigmoidal network training; the weights are the variance estimates provided by optimal WNLR. The WMSE criterion is uncommon in sigmoidal network training because weights are not usually available.

To date, no parametric imaging method is completely general. WNLR methods are the most general because they directly estimate the microparameters of all linear and nonlinear tracer kinetic models. However, they are too expensive to compute and are ill posed for some problems. Among fast linearizing methods, GLLS is quite general, handling all linear compartmental models. For each new model, however, this method still requires model-specific nonlinear algebra to recover the microparameters. Also, it is not applicable to nonlinear tracer kinetic models (except for linearized perturbation experiments).

NRE has the potential to be a general nonlinear method that learns parametric imaging of microparameters from data generated by a user-supplied forward model. The method handles high noise levels and a wide parameter range. This study’s method handles a wide range of possible blood time–activity curves, and any blood time–activity curve model (with suitable parameter ranges) can be used. New parameter ranges, new sampling schedules, and new models of the blood time–activity curve, tissue time–activity curve, and noise can be accommodated by training new networks.

To apply NRE to other problems, it must be decided how to choose the number of basis functions and samples for any new network. For function learning in general, the curse of dimensionality can lead to optimization pathologies such as ill posed problems and intolerable run times. By contrast, this study’s learning problem is well sampled: The input space is only 20-variate, thousands of samples can be generated, and the target function is relatively smooth. This allows for adoption of a simple cross-validation approach to learning that is typical of successful applications. No regularization is needed, such as weight penalties or network pruning (18), though this is crucial in more challenging applications (40). Because most parametric imaging problems are of low dimensionality, the method used in this study will likely be generally useful.

A related question is whether the optimal WNLR method can be applied to all models to generate training data for NRE. This model has only 2 parameters, and the risk of ill posed WNLR problems increases with the number of parameters. However, simulation shows that ill posedness most often is caused by noise in the tissue time–activity curve variance estimate or in the blood time–activity curve estimate. Because optimal WNLR avoids these problems, it will likely be applicable to many models of interest.

## CONCLUSION

The criteria for any parametric imaging methodology are statistical performance, computational cost, and generality. Although Patlak analysis and other linearizing methods can be fast and accurate, they are not completely general and may not match WNLR methods as closely as possible. This study shows that NRE is statistically competitive with WNLR and superior to Patlak analysis for parametric imaging of myocardial perfusion by 13N-ammonia PET. NRE computes accurate myocardial perfusion images in 3 s.

NRE achieves the statistical performance of other refined WNLR methods by combining an optimal WNLR estimate with the ability of the sigmoidal network to approximate conditional expectation. NRE is a very fast method that learns parametric imaging of the microparameters from a forward model of data. Because the method is nonlinear, it has the potential to handle all nonlinear and linear tracer kinetic models.

## Acknowledgments

The authors thank Heiko Schoeder for providing the canine data and an analysis of the data, and the reviewers and editors for helpful suggestions. This study was supported by grant DE-FC03-87ER60615 from the Department of Energy, by the Ahmanson Foundation, and by Medical Scientist Training Program training grant from the National Institutes of Health.

## Footnotes

*   Received Sep. 5, 2000; revision accepted Jan. 11, 2001.
    
    For correspondence or reprints contact: Sanjiv S. Gambhir, MD, PhD, Crump Institute for Molecular Imaging, University of California, Los Angeles, School of Medicine, B3-399 BRI, 700 Westwood Plaza, Los Angeles, CA 90095-1770.

## REFERENCES

1.  Chen K, Lawson M, Reiman E, et al. Generalized linear least squares method for fast generation of myocardial blood flow parametric images with N-13 ammonia PET. IEEE Trans Med Imaging 1998;17:236–243.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=9688155&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

2.  Choi Y, Huang SC, Hawkins RA, et al. A simplified method for quantification of myocardial blood flow using nitrogen-13-ammonia and dynamic PET. J Nucl Med 1993;34:488–497.
    
    [Abstract/FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Njoiam51bWVkIjtzOjU6InJlc2lkIjtzOjg6IjM0LzMvNDg4IjtzOjQ6ImF0b20iO3M6MjE6Ii9qbnVtZWQvNDIvNi85MjQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

3.  Chen K, Huang SC, Yu D. The effects of measurement errors in the plasma radioactivity curve on parameter estimation in positron emission tomography. Phys Med Biol 1991;36:1183–1200.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=1946602&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

4.  Graham MM. Physiologic smoothing of blood TACs for PET data analysis. J Nucl Med 1997;38:1161–1168.
    
    [Abstract/FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Njoiam51bWVkIjtzOjU6InJlc2lkIjtzOjk6IjM4LzcvMTE2MSI7czo0OiJhdG9tIjtzOjIxOiIvam51bWVkLzQyLzYvOTI0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

5.  Chen K, Huang SC, Feng D. New estimation methods that directly use the time accumulated counts in the input function in quantitative dynamic PET studies. Phys Med Biol 1994;39:2073–2090.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=15560012&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

6.  Feng D, Li X, Huang SC. A new double modeling approach for dynamic cardiac PET studies using noise and spillover contaminated LV measurements. IEEE Trans Biomed Eng 1996;43:319–327.
    
    [CrossRef](http://jnm.snmjournals.org/lookup/external-ref?access_num=10.1109/10.486290&link_type=DOI) 
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=8682545&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 
    
    [Web of Science](http://jnm.snmjournals.org/lookup/external-ref?access_num=A1996TX67500011&link_type=ISI) 

7.  O’Sullivan F, Saha A. Use of ridge regression for improved estimation of kinetic constants from PET data. IEEE Trans Med Imaging 1999;18:115–125.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=10232668&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

8.  Patlak CS, Blasberg RG, Fenstermacher JD. Graphical evaluation of blood-to-brain transfer constants from multiple-time uptake data. J Cereb Blood Flow Metab 1983;3:1–7.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=6822610&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 
    
    [Web of Science](http://jnm.snmjournals.org/lookup/external-ref?access_num=A1983QD19900001&link_type=ISI) 

9.  Carson RE, Huang SC, Green MV. Weighted integration method for local cerebral blood flow measurements with positron emission tomography. J Cereb Blood Flow Metab 1986;6:245–258.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=3485644&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 
    
    [Web of Science](http://jnm.snmjournals.org/lookup/external-ref?access_num=A1986C046800013&link_type=ISI) 

10. Thie JA, Smith GT, Hubner KF. Linear least squares compartmental-model-independent parameter identification in PET. IEEE Trans Med Imaging 1997;16:11–16.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=9050404&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

11. Feng D, Huang SC, Wang Z, Ho D. An unbiased parametric imaging algorithm for nonuniformly sampled biomedical system parameter estimation. IEEE Trans Med Imaging 1996;15:512–518.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=18215932&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

12. Feng D, Ho D, Chen K, et al. An evaluation of the algorithms for determining local cerebral metabolic rates of glucose using positron emission tomography dynamic data. J Nucl Med 1995;14:697–710.
    
    

13. Feng D, Wang Z, Huang SC. A study on statistically reliable and computationally efficient algorithms for generating local cerebral blood flow parametric images with positron emission tomography. IEEE Trans Med Imaging 1993;12:182–188.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=18218406&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

14. Gambhir SS, Keppenne CL, Banerjee PK, Phelps ME. A new method to estimate parameters of linear compartmental models using artificial neural networks. Phys Med Biol 1998;43:1659–1678.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=9651032&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

15. Gambhir SS, Banerjee PK, Phelps ME. An investigation of the feasibility of using neural networks for model parameter estimation [abstract]. J Nucl Med 1995;35.(suppl):81P.
    
    

16. Graham MM, Gillispie SB, Muzi M, O’Sullivan F. Generation of parametric images with artificial neural networks [abstract]. J Nucl Med 1996;37.(suppl):220P.
    
    

17. Smith GT, Liu U, Upadhyaya BR. Quantitative myocardial blood flow determined by dynamic PET scanning and neural network analysis [abstract]. Radiology 1990;177:148.
    
    

18. Haykin S. Neural Networks 2nd ed. Upper Saddle River, NJ: Prentice Hall; 1999: 108, 136.
    
    

19. Kosko B. Fuzzy Engineering Upper Saddle River, NJ: Prentice Hall; 1997.
    
    

20. Krivokapich J, Smith GT, Huang SC, et al. N-13 ammonia myocardial imaging at rest and with exercise in normal volunteers: quantification of absolute myocardial perfusion with dynamic positron emission tomography. Circulation 1989;80:1328–1337.
    
    [Abstract/FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTQ6ImNpcmN1bGF0aW9uYWhhIjtzOjU6InJlc2lkIjtzOjk6IjgwLzUvMTMyOCI7czo0OiJhdG9tIjtzOjIxOiIvam51bWVkLzQyLzYvOTI0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

21. Smith GT, Huang SC, Nienaber CA, et al. Noninvasive quantification of regional myocardial blood flow with N-13 ammonia and dynamic PET. J Nucl Med 1988;28:940–947.
    
    

22. Thompson JA, Starmar F, Whalen RE, McIntosh HD. Indicator transit time considered as a gamma variate. Circ Res 1964;14:502–515.
    
    [FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6MzoiUERGIjtzOjExOiJqb3VybmFsQ29kZSI7czoxMDoiY2lyY3Jlc2FoYSI7czo1OiJyZXNpZCI7czo4OiIxNC82LzUwMiI7czo0OiJhdG9tIjtzOjIxOiIvam51bWVkLzQyLzYvOTI0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

23. Coxson PG, Huesman RH, Borland L. Consequences of using a simplified kinetic model for dynamic PET data. J Nucl Med 1997;38:660–667.
    
    [Abstract/FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Njoiam51bWVkIjtzOjU6InJlc2lkIjtzOjg6IjM4LzQvNjYwIjtzOjQ6ImF0b20iO3M6MjE6Ii9qbnVtZWQvNDIvNi85MjQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

24. Pajevic S, Daube-Witherspoon ME, Bacharach SL, Carson RE. Noise characteristics of 3-D and 2-D PET Images. IEEE Trans Med Imaging 1998;17:9–23.
    
    [CrossRef](http://jnm.snmjournals.org/lookup/external-ref?access_num=10.1109/42.668691&link_type=DOI) 
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=9617904&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 
    
    [Web of Science](http://jnm.snmjournals.org/lookup/external-ref?access_num=000073646700002&link_type=ISI) 

25. Gill PE, Murray W, Wright MH. Practical Optimization San Diego, CA: Academic Press; 1981.
    
    

26. Cybenko G. Approximations by superpositions of a sigmoidal function. Math Contr Signals Systems 1989;2:303–314.
    
    

27. Barron AR. Approximation and estimation bounds for artificial neural networks. In: Warmuth MK, Valiant LG, eds. Proceedings of the Fourth Annual Workshop on Computational Learning Theory Santa Cruz, CA: ACM Press; 1991:243–249.
    
    

28. Barron AR. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inform Theory 1993;39:930–945.
    
    [CrossRef](http://jnm.snmjournals.org/lookup/external-ref?access_num=10.1109/18.256500&link_type=DOI) 

29. Poggio T, Girosi F. Networks for approximation and learning. Proc IEEE 1990;78:1481–1497.
    
    [CrossRef](http://jnm.snmjournals.org/lookup/external-ref?access_num=10.1109/5.58326&link_type=DOI) 

30. Krzyzak A, Linder T. Radial basis function networks and complexity regularization in function learning. IEEE Trans Neural Networks 1998;9:247–256.
    
    

31. Vapnik VN. Statistical Learning Theory New York, NY: Wiley; 1998.
    
    

32. Papoulis A. Probability, Random Variables, and Stochastic Processes 2nd ed. New York, NY: McGraw-Hill; 1984.
    
    

33. Schoeder H, Knight RJ, Kofoed KF, et al. Regulation of pyruvate dehydrogenase activity and glucose metabolism in post-ischemic myocardium. Biochem Biophys Acta 1998;1406:62–72.
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=9545535&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

34. Kuhle WG, Porenta G, Huang SC, et al. Quantification of regional myocardial blood flow using N-13 ammonia and reoriented dynamic positron emission tomographic imaging. Circulation 1992;86:1004–1017.
    
    [Abstract/FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTQ6ImNpcmN1bGF0aW9uYWhhIjtzOjU6InJlc2lkIjtzOjk6Ijg2LzMvMTAwNCI7czo0OiJhdG9tIjtzOjIxOiIvam51bWVkLzQyLzYvOTI0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

35. Feng D, Huang SC, Wang X. Models for computer simulation studies of input functions for tracer kinetic modeling with positron emission tomography. Int J Biomed Comput 1993;32:95–110.
    
    [CrossRef](http://jnm.snmjournals.org/lookup/external-ref?access_num=10.1016/0020-7101(93)90049-C&link_type=DOI) 
    
    [PubMed](http://jnm.snmjournals.org/lookup/external-ref?access_num=8449593&link_type=MED&atom=%2Fjnumed%2F42%2F6%2F924.atom) 

36. Lin JW, Laine AF, Bergman SR. Use of a wavelet transform in analysis of TACs from PET [abstract]. J Nucl Med 1999;40.(suppl):77P.
    
    

37. Willemsen AT, Vaalberg WA, Paans AM. Noise reduction in functional PET imaging by temporal PCA prefiltering [abstract]. J Nucl Med 1999;40.(suppl):74P.
    
    

38. Wu HM, Hoh CK, Buxton DB, et al. Quantification of myocardial blood flow using dynamic nitrogen-13 ammonia PET studies and factor analysis of dynamic structures. J Nucl Med 1995;36:2087–2093.
    
    [Abstract/FREE Full Text](http://jnm.snmjournals.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Njoiam51bWVkIjtzOjU6InJlc2lkIjtzOjEwOiIzNi8xMS8yMDg3IjtzOjQ6ImF0b20iO3M6MjE6Ii9qbnVtZWQvNDIvNi85MjQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

39. Lee JS, Ahn JY, Lee DS, et al. Blind separation of cardiac components and extraction of input function from O-15 water dynamic myocardial PET using independent component analysis: a validation study [abstract]. J Nucl Med 2000;41.(suppl):98P.
    
    

40. Werbos PJ. Supervised learning: Can it escape its local minimum? In: Roychowdhury V, Siu K, Orlitsky A, eds. Theoretical Advances in Neural Computation and Learning New York, NY: Kluwer Academic; 1994:449–461.

 [1]: /embed/graphic-1.gif
 [2]: /embed/graphic-2.gif
 [3]: /embed/graphic-3.gif