## Abstract

The aim of this work was to assess the variability of total lesion glycolysis (TLG) measurements in lung cancer patients, obtained with thresholds set at fixed percentages of the maximum standardized uptake value (SUVmax). **Methods:** Thirteen lesions (10 patients) were analyzed in 10 successive 2.5-min frames of an ^{18}F-FDG PET dynamic acquisition obtained between 60 and 110 min after injection. ^{18}F-FDG–positive lesion volume, associated average SUV (SUVmean), and TLG (volume × SUVmean) were assessed in each frame using thresholds of 40%, 50%, 60%, 70%, and 80%. For each threshold, the average relative SD of TLG, leading to relative measurement error and repeatability, was calculated over the lesion series. The dependence of TLG variability on volume and SUVmean variability was also assessed. **Results:** The average relative SD of TLG (%) correlated strongly with threshold (%): 1.0866 × exp(0.0472 × threshold) (*r* = 0.999; *P* < 0.01). For the 40% threshold, average TLG over the series was 225.9 g (range, 41.7–1,086.3), relative measurement error and repeatability were 14.5% and 20.4%, respectively (95% confidence interval), and no significant difference was found between TLG and volume variability. For the other thresholds, TLG variability was significantly lower than volume variability and significantly greater than SUVmean variability. **Conclusion:** In current clinical practice, a formula allows quick estimation of TLG variability for any threshold expressed as a percentage of SUVmax: the higher the threshold, the greater the TLG variability.

PET/CT scans obtained with ^{18}F-FDG can provide several functional parameters of malignant tumors because of their increased glucose metabolism. The image-derived standardized uptake value (SUV) is a semiquantitative index of tumor ^{18}F-FDG uptake that is currently used in clinical practice. SUV can be obtained either from the voxel of maximal activity value, that is, SUVmax, or from the average SUV of voxels involved in the so-called metabolic active volume, that is, SUVmean. Estimating the metabolically active volume requires 3-dimensional outlining of the ^{18}F-FDG–positive tissue through various segmentation methods. In particular, the method of Erdi et al. delineates the tumor ^{18}F-FDG–positive volume using a threshold defined as a fixed percentage of SUVmax (1). Knowledge of volume and associated SUVmean provides a further functional parameter given by their product, that is, total lesion glycolysis (TLG = volume × SUVmean) (2).

The efficiency of these functional parameters for assessing treatment response or survival prognosis has recently been investigated in various types of malignancy, such as non-Hodgkin lymphoma (3), small cell and non–small cell lung cancer (4–7), esophageal and rectal cancer (8,9), and oropharyngeal squamous cell carcinoma (10). These studies found that, among the functional parameters, volume and TLG were usually better prognostic factors than SUVs. However, any parameter estimation is subject to variability, and repeated measurements of the same lesion obtained from different acquisitions (even performed with an identical procedure) will vary around an average “true” value. For reliable predictions, the estimates should vary as little as possible within an acceptable range of relative measurement error (the maximal difference expected between a single measurement and the average true value, with 95%–99% confidence interval [CI]) and an acceptable range of repeatability (the maximal difference expected between 2 successive measurements) (11,12). Variability of SUVmax and SUVmean was recently determined in a metaanalysis by de Langen et al. (13). Variability of volume (and SUVmax) has also recently been determined in lung cancer patients for various fixed percentages of the SUVmax thresholds (40%, 50%, 60%, and 70%), and a strong correlation was found between volume variability and threshold (14).

The current study on lung cancer patients was aimed at assessing TLG variability for fixed 40%, 50%, 60%, 70%, and 80% of SUVmax thresholds. Lung cancer lesions were selected according to the criteria of Erdi et al. (1). A single PET dynamic acquisition was performed involving 10 successive 2.5-min frames acquired within a typical window of 60–110 min after ^{18}F-FDG injection. For each threshold, the average relative SD of TLG was calculated over the lesion series, and we searched for a further correlation between average relative SD of TLG and threshold. Because TLG is volume × SUVmean, the dependence of TLG variability on volume and SUVmean variability was also assessed.

## MATERIALS AND METHODS

### Study Population

Ten patients (1 woman and 9 men; average age, 62 y; range, 54–78 y) with known malignant lung cancer were analyzed, and 13 lesions (10 lung lesions and 3 mediastinal lymph node lesions) were outlined. This study received the approval of the Ethics Committee of the teaching hospital, CHU de Bordeaux, and all patients gave their informed consent before their inclusion in the imaging procedure. The patients’ average weight and height were 70 kg (range, 44–95 kg) and 170 cm (range, 157–179 cm), respectively. All patients fasted for at least 6 h before the ^{18}F-FDG injection, and the preinjection plasma glucose concentration averaged 0.99 g⋅L^{−1} (range, 0.90–1.14 g⋅L^{−1}). Patient characteristics and pathologic features are shown in Table 1. According to the selection criteria of Erdi et al., the lesions had to be larger than 4 mL and the average SUV-to-background ratio had to be at least 5:1 (both obtained with a 40% threshold) (1).

### ^{18}F-FDG PET Imaging Procedure and TLG Assessment

^{18}F-FDG was administered intravenously over less than 1 min (average dose, 334 MBq [9.0 mCi]; range, 229–455 MBq [6.2–12.3 mCi]), and no tissue infiltration of the dose was revealed during a total-body scan that was performed for diagnostic purposes. Dynamic PET imaging was performed within 60–110 min after ^{18}F-FDG injection, that is, over a typical acquisition time window, using a Discovery ST PET/CT camera (GE Healthcare). A single bed position over the patient’s chest and centered on one or more ^{18}F-FDG–positive lesions was used, and 10 successive frames were acquired over 25 min (2.5 min each, during shallow breathing). All dynamic PET images were acquired in 3-dimensional mode with in-plane axial 2.73- to 3.27-mm spatial resolution (field of view, 700 × 700 mm; in-plane matrix, 256 × 256 pixels) and iterative reconstruction (Fourier rebinning plus ordered-subset expectation maximization using 32 subsets and 5 iterations; 3-dimensional Hann postprocessing filter with a cutoff frequency of 0.9 and order of 10.0). CT transmission imaging was performed before the PET imaging for attenuation correction (pitch, 1.675; slice thickness, 3.75 mm; field of view, 700 × 700 mm; matrix, 512 × 512 pixels). An Advantage 4.4 workstation (GE Healthcare) was used for automatic 3-dimensional outlining of each ^{18}F-FDG–positive lesion in 1 step, not slice by slice, with the 5 fixed thresholds. For each lesion and each fixed threshold, 3-dimensional outlining provided volume, SUVmean, and hence TLG.
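The fixed-threshold outlining described above can be illustrated with a minimal sketch. It is not the workstation's algorithm: it assumes the lesion's voxels have already been isolated in a flat list, ignores 3-dimensional connectivity, and uses a hypothetical function name (`segment_fixed_threshold`), hypothetical voxel values, and a hypothetical voxel volume.

```python
def segment_fixed_threshold(voxel_suvs, threshold_pct, voxel_volume_ml):
    """Return (volume in mL, SUVmean in g/mL, TLG in g) for one lesion.

    Simplifying assumption: every voxel at or above the cutoff belongs to
    the lesion; no connectivity check is performed.
    """
    if not voxel_suvs:
        raise ValueError("voxel_suvs must not be empty")
    suv_max = max(voxel_suvs)
    cutoff = suv_max * threshold_pct / 100.0  # fixed percentage of SUVmax
    selected = [suv for suv in voxel_suvs if suv >= cutoff]
    volume_ml = len(selected) * voxel_volume_ml
    suv_mean = sum(selected) / len(selected)
    tlg = volume_ml * suv_mean  # TLG = volume x SUVmean
    return volume_ml, suv_mean, tlg


# Hypothetical voxel SUVs (g/mL) and voxel volume (mL), for illustration only.
example_voxels = [10.0, 8.0, 6.0, 4.0, 2.0]
for pct in (40, 50, 60, 70, 80):
    volume, suv_mean, tlg = segment_fixed_threshold(example_voxels, pct, 0.5)
    print(f"{pct}%: volume={volume:.1f} mL, "
          f"SUVmean={suv_mean:.1f} g/mL, TLG={tlg:.1f} g")
```

In this toy example, raising the threshold shrinks the volume and raises SUVmean, so TLG falls, which is the qualitative behavior expected of a percentage-of-SUVmax threshold.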

### Statistical Analysis

For each lesion and each fixed threshold, the average TLG value and relative SD were computed for the 10 frames of the dynamic PET acquisition. Then, for each fixed threshold, the average relative SD (%), and hence relative measurement error (%) and repeatability (%) of TLG, were assessed over the series according to the method of Bland and Altman (11,12). Before this calculation, for each fixed threshold we verified over the series that relative SD did not depend on the size of the measurement (11). For each fixed threshold, relative measurement error was calculated as 1.96 × average relative SD (95% CI) to 2.58 × average relative SD (99% CI), and repeatability of TLG was calculated as 2^{1/2} × 1.96 × average relative SD (95% CI) to 2^{1/2} × 2.58 × average relative SD (99% CI). Average relative SD, relative measurement error, and repeatability of SUVmean and volume were assessed in a similar manner to that for TLG measurements (14).
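The calculation chain described above (per-lesion relative SD, average over the series, then measurement error and repeatability) might be sketched as follows. The function names are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, because the text does not state which SD estimator was used.

```python
from math import sqrt
from statistics import mean, stdev


def relative_sd_percent(frame_values):
    """Relative SD (%) of one lesion's values across the dynamic frames.

    Assumption: sample SD (n - 1 denominator).
    """
    return 100.0 * stdev(frame_values) / mean(frame_values)


def average_relative_sd(lesion_series):
    """Average of the per-lesion relative SDs (%) over the lesion series."""
    return mean(relative_sd_percent(frames) for frames in lesion_series)


def error_and_repeatability(avg_rel_sd):
    """Relative measurement error and repeatability (%) at 95% and 99% CI."""
    return {
        "error_95": 1.96 * avg_rel_sd,
        "error_99": 2.58 * avg_rel_sd,
        "repeatability_95": sqrt(2) * 1.96 * avg_rel_sd,
        "repeatability_99": sqrt(2) * 2.58 * avg_rel_sd,
    }


# Hypothetical TLG values (g) for two lesions over three frames each.
series = [[90.0, 100.0, 110.0], [95.0, 100.0, 105.0]]
summary = error_and_repeatability(average_relative_sd(series))
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

Small differences from published figures are to be expected when a rounded average relative SD is fed into `error_and_repeatability`.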

## RESULTS

The individual characteristics of the patients are presented in Table 1, including—for each lesion—average volume for the 40% threshold, average SUVmean and associated relative SD for the 40% threshold, average TLG for the 40% threshold, and TLG relative SD for the 5 thresholds. For the 40% threshold, average volume, average SUVmean, and average TLG over the series were 22.6 mL (range, 5.9–95.5 mL), 9.1 g/mL (range, 5.4–14.7 g/mL), and 225.9 g (range, 41.7–1,086.3 g), respectively (Table 1).

For all thresholds, relative SD of TLG, SUVmean, and volume were found to be unrelated to parameter magnitude over the series: the greatest correlation coefficients were *r* = 0.29, 0.37, and 0.27, respectively. Figure 1 shows this lack of correlation for TLG for the 40% threshold over the series. For each threshold, this result allowed us to calculate the average relative SD of each functional parameter over the series (Table 1). The average relative SD of TLG was exponentially correlated with threshold: 1.0866 × exp(0.0472 × threshold) (Fig. 2; *r* = 0.999; *P* < 0.01). Figure 2 shows the average relative SD of TLG, relative measurement error, and repeatability of TLG for all investigated thresholds. For the 40% threshold, the average relative SD of TLG was 7.4%, and relative measurement error and repeatability were 14.5%–19.0% and 20.4%–26.9%, respectively, with a 95%–99% CI.
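As a rough illustration of how the fitted relation could be used, the sketch below evaluates 1.0866 × exp(0.0472 × threshold) and derives the corresponding measurement error and repeatability. The function names are hypothetical, and restricting the input to 40%–80% is an assumption reflecting the range over which the fit was obtained.

```python
from math import exp, sqrt


def tlg_relative_sd(threshold_pct):
    """Fitted average relative SD of TLG (%) for a threshold given in % of SUVmax."""
    if not 40 <= threshold_pct <= 80:
        # Assumption: the fit is only used inside the investigated range.
        raise ValueError("fit was obtained for thresholds of 40%-80% only")
    return 1.0866 * exp(0.0472 * threshold_pct)


def tlg_variability(threshold_pct):
    """Fitted relative SD with 95% CI measurement error and repeatability (%)."""
    rel_sd = tlg_relative_sd(threshold_pct)
    return {
        "relative_sd": rel_sd,
        "error_95": 1.96 * rel_sd,
        "repeatability_95": sqrt(2) * 1.96 * rel_sd,
    }


for pct in (40, 50, 60, 70, 80):
    v = tlg_variability(pct)
    print(f"{pct}%: SD={v['relative_sd']:.1f}%, "
          f"error={v['error_95']:.1f}%, repeatability={v['repeatability_95']:.1f}%")
```

The fitted value at 40% is roughly 7.2%, slightly below the measured 7.4%, as expected for a regression curve that does not pass exactly through each data point.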

The average relative SD of SUVmean correlated exponentially with threshold: 3.3492 × exp(0.0087 × threshold) (Fig. 3; *r* = 0.950; *P* < 0.02). For the 40% threshold, the average relative SD of SUVmean was 5.0% and relative measurement error and repeatability were 9.7%–12.8% and 13.8%–18.1%, respectively, with a 95%–99% CI. For the 80% threshold, the average relative SD of SUVmean was 7.1% and did not significantly differ from the previously published value of 7.1% for average relative SD of SUVmax (14). Average relative SD of volume correlated exponentially with thresholds of 40%–80%: 1.3753 × exp(0.0453 × threshold) (Fig. 3; *r* = 0.999; *P* < 0.01). For the 40% threshold, the estimates of average relative SD of TLG and average relative SD of volume did not differ significantly (1-tailed paired *t* test: *P* = 0.081), whereas the estimate of average relative SD of TLG was significantly greater than that of average relative SD of SUVmean (1-tailed paired *t* test: *P* = 0.007). For the 50%, 60%, 70%, and 80% thresholds, the estimate of average relative SD of TLG was significantly lower than that of average relative SD of volume and greater than that of average relative SD of SUVmean (1-tailed paired *t* test, *P* ≤ 0.018).
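To compare the three fitted curves side by side, a short sketch such as the following may help. It only evaluates the regression formulas quoted above; the dictionary and function names are hypothetical, and the fitted values are not the measured ones.

```python
from math import exp

# (a, b) coefficients of relative SD (%) = a * exp(b * threshold), from the text.
FITS = {
    "SUVmean": (3.3492, 0.0087),
    "TLG": (1.0866, 0.0472),
    "volume": (1.3753, 0.0453),
}


def fitted_relative_sd(parameter, threshold_pct):
    """Fitted average relative SD (%) of a parameter at a threshold (% of SUVmax)."""
    a, b = FITS[parameter]
    return a * exp(b * threshold_pct)


for pct in (40, 50, 60, 70, 80):
    row = {name: fitted_relative_sd(name, pct) for name in FITS}
    gap = row["TLG"] - row["SUVmean"]
    print(f"{pct}%: SUVmean={row['SUVmean']:.1f}%, TLG={row['TLG']:.1f}%, "
          f"volume={row['volume']:.1f}%, TLG-SUVmean gap={gap:.1f}%")
```

Across the investigated range, the fitted curves keep the ordering SUVmean < TLG < volume, and the gap between TLG and SUVmean widens as the threshold rises.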

## DISCUSSION

### TLG Variability

Average relative SD of TLG obtained from ^{18}F-FDG–positive lung cancer lesions correlated strongly with threshold (*r* = 0.999; *P* < 0.01; Fig. 2). For a 40% threshold in particular, average relative SD of TLG was 7.4%, and relative measurement error and repeatability were 14.5%–19.0% and 20.4%–26.9%, respectively, with a 95%–99% CI. In clinical practice, TLG variability for the 40% threshold might be useful for assessing treatment effects or for predicting survival (2), for several reasons: it is lower than that assessed for higher thresholds; TLG estimates for the 40% threshold are larger than those for the 50%, 60%, 70%, and 80% thresholds, and overestimation is more clinically relevant than underestimation; and TLG computed using a fixed thresholding method based on SUVmax is simple to implement and avoids intra- and interobserver variability. Moreover, our study suggests that our formula or the graph displayed in Figure 2 might be helpful in quickly estimating the magnitude of the TLG variability of an arbitrary lesion for any threshold ranging from 40% to 80% of SUVmax.

Because TLG is the product of ^{18}F-FDG–positive volume and associated SUVmean, variability of TLG should be compared with that of SUVmean and volume. Average relative SD of SUVmean and of volume also correlated strongly with threshold (respectively, *r* = 0.950 and 0.999; *P* < 0.02 and 0.01). Figure 3 shows that the higher the threshold, the larger the difference between average relative SD of TLG and of SUVmean, whereas the difference between average relative SD of TLG and of volume remains almost constant. As an example, for a 40% threshold, average relative SD of SUVmean, TLG, and volume were 5.0%, 7.4%, and 8.6%, respectively, whereas for an 80% threshold they were 7.1%, 47.8%, and 51.8%, respectively. However, for a 40% threshold the estimates of average relative SD of TLG and of volume did not significantly differ, although the *P* value was low (0.081). As a consequence, TLG and volume for the 40% threshold could be used equivalently for assessing treatment effects or predicting survival. If a 50% threshold is implemented, the use of TLG rather than volume may be justified on the basis of a significant variability difference. This suggestion is supported by literature data on patients with small cell lung cancer and non–small cell lung cancer (4,7). For all thresholds, the fact that both TLG and volume variability were greater than SUVmean variability (Fig. 3) emphasizes the relevance of SUVmean if variability magnitude is considered.

Few studies have investigated TLG variability, and to the best of our knowledge, none has done so in lung cancer patients. Therefore, the order of magnitude of this TLG variability was compared with that obtained by Hatt et al. in esophageal cancer patients (15). Hatt et al. evaluated the repeatability of TLG estimates for a 50% SUVmax threshold, from two ^{18}F-FDG baseline scans acquired 60 min after injection within an average of 2–3 d of each other. The SD of the relative difference between the pairs of estimates was 23.1%, and a parameter equivalent to the average relative SD of TLG of our study can be calculated as 23.1/2^{1/2} = 16.3%. Although slightly higher, this value is comparable to the 11.4% average relative SD of TLG found in our study for a 50% threshold. In comparison with the study of Hatt et al., our study used a single dynamic acquisition, ruling out the influence of various factors such as changes in plasma glucose level that may play a role in test–retest studies (16).
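The division by 2^{1/2} used above follows from a standard argument: if two measurements of the same lesion carry independent errors with the same SD, the variance of their difference is twice the single-measurement variance. A minimal sketch of the conversion, with a hypothetical function name and assuming independent, equal-variance errors:

```python
from math import sqrt


def single_measurement_sd(sd_of_paired_difference):
    """Convert the SD of test-retest differences to a single-measurement SD.

    Assumption: the two measurements have independent errors of equal
    variance, so Var(a - b) = 2 * Var(a).
    """
    return sd_of_paired_difference / sqrt(2)


# Value reported by Hatt et al. for TLG at a 50% SUVmax threshold (15).
print(f"{single_measurement_sd(23.1):.1f}%")
```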

### SUVmean Variability

In locally advanced rectal cancer, Hatt et al. found that TLG was the best predictor of pathologic response; however, they also found that SUVmean had weaker but similar predictive power (8). This result is likely related to the low relative measurement error and repeatability of SUVmean, in comparison with those of SUVmax, as recently established in a metaanalysis by de Langen et al. (13). In particular, that metaanalysis used results for the 50% SUVmax threshold from previous test–retest studies (17–19) and a graph representing the repeatability of SUVmean (with a 95% CI) versus SUVmean (Fig. 3C of de Langen et al. (13)). For a 50% threshold, the current study found that average relative SD of SUVmean was 5.1%, leading to repeatability of 2^{1/2} × 1.96 × 5.1 = 14.1% (with a 95% CI). Furthermore, the current study also found a 10.0 g/mL average SUVmean for the 50% threshold over the series. In the study of de Langen et al., the repeatability associated with this 10.0 g/mL was about 14.0%, which agrees well with the 14.1% found in the current study.

Figure 3 shows that the higher the threshold, the larger the average relative SD of SUVmean. This result may be explained by the fact that SUVmean is calculated for a determined volume. A previous study demonstrated that the higher the threshold, the larger the average relative SD of volume; SUVmean and volume variabilities therefore have the same origin, namely SUVmax variability (14). Furthermore, for an 80% threshold, average relative SD of SUVmean did not significantly differ from that of SUVmax, showing that the better repeatability of SUVmean relative to SUVmax depends on the threshold.

### Study Design

The present study assessed TLG variability using a single 25-min dynamic PET acquisition, therefore providing data over a ±12.5-min (25/2) time window around an average injection-acquisition time delay, in comparison with test–retest studies involving 2 baseline scans repeated on 2 different days but acquired at the same injection-acquisition time delay. However, we suggest that temporal changes in TLG during 12.5 min will result in a limited increase of TLG variability. Moreover, an overestimation of TLG variability is clinically more acceptable than an underestimation.

Large lung tumors with relatively high ^{18}F-FDG uptake were investigated: for a 40% threshold, for example, the ranges of SUVmean and volume were 5.4–14.7 g/mL and 5.9–95.5 mL, respectively. These features met the selection criteria defined by Erdi et al. (1). However, TLG is the product of volume and SUVmean. As a consequence, the lowest TLG value of 41.7 g investigated in this study was actually the product of 6.4 × 6.5 (mL × g/mL), but as an example, the same TLG value might also result from a larger volume of 16.7 mL multiplied by a smaller SUV of 2.5 g/mL. In other words, a similar low 41.7-g TLG value might also be obtained for a 3.2-cm-diameter lesion showing a faint SUVmean of 2.5 g/mL, such as for alveolar cell carcinomas or well-differentiated carcinomas. Nevertheless, the metaanalysis by de Langen et al. showed that SUVmean repeatability increased for a low SUVmean (13). Therefore, further studies are warranted to assess TLG variability in lesions smaller than 2 cm showing faint ^{18}F-FDG uptake, in other words, for lesions with TLG values lower than 41.7 g. Those studies should involve correction for partial-volume effect and respiratory movement (20).
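The arithmetic behind this example can be checked with a short sketch. It assumes a spherical lesion when converting volume to diameter, which is a simplification introduced here for illustration; the function names are hypothetical.

```python
from math import pi


def tlg_g(volume_ml, suv_mean_g_per_ml):
    """TLG (g) as the product of volume (mL) and SUVmean (g/mL)."""
    return volume_ml * suv_mean_g_per_ml


def equivalent_sphere_diameter_cm(volume_ml):
    """Diameter (cm) of a sphere with the given volume (1 mL = 1 cm^3).

    Assumption: the lesion is treated as a perfect sphere, V = pi * d^3 / 6.
    """
    return (6.0 * volume_ml / pi) ** (1.0 / 3.0)


# Two volume/SUVmean combinations from the text giving nearly the same TLG.
print(f"{tlg_g(6.4, 6.5):.1f} g")    # smaller, more avid lesion
print(f"{tlg_g(16.7, 2.5):.1f} g")   # larger lesion with faint uptake
print(f"{equivalent_sphere_diameter_cm(16.7):.1f} cm")
```

The same TLG can therefore arise from quite different lesions, which is why variability measured in large, avid tumors may not transfer to small lesions with faint uptake.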

Recently published studies on patients with small cell lung cancer and non–small cell lung cancer demonstrated the prognostic value of whole-body TLG at pretreatment ^{18}F-FDG PET imaging (5–7). Whole-body TLG is computed as the sum of the TLG values of all malignant hypermetabolic lesions found over the whole body of a single patient. Although the TLG range investigated in the present study may be considered relatively high, summing of the TLG values of the primary tumor, nodal metastases, and distant metastases may result in a whole-body TLG value that falls within the TLG range of this study.

## CONCLUSION

This study investigated TLG variability for various fixed percentages of SUVmax thresholds and its dependence on volume and SUVmean variability. We demonstrated the possibility of using a formula, in clinical practice, to estimate the TLG variability for any percentage of the SUVmax threshold. Because TLG variability is greater at higher thresholds, we suggest that a low threshold should be suitable for evaluating treatment effects or predicting survival in lung cancer patients.

## DISCLOSURE

No potential conflict of interest relevant to this article was reported.

## Footnotes

- Received for publication March 11, 2013.
- Accepted for publication May 31, 2013.
- Published online Aug. 5, 2013.