Confidence intervals for performance assessment of linear observers

Figures

FIG. 1.

Percentage decrease of mean (95%) AUC confidence interval length, relative to the $n_1 = n_2$ case, plotted as a function of $n_1$, with $n_2$ and AUC held fixed. From top to bottom, the curves correspond to $n_2$ values of 25, 50, 75, 100, 125, 150, 175, and 200, respectively. The plots are for AUC values of 0.6 (left), 0.75 (center), and 0.9 (right).

FIG. 2.

Depiction of a small lesion embedded in a larger, uniform cylinder.

FIG. 3.

Mean images of the QRM phantom displayed with a grayscale window of [–200, 600] HU. Whole phantom (left) and reconstruction focused on the heart insert (right), with regions of interest marked with white boxes. ROI-1a, ROI-1b, and ROI-1c contain no lesion. ROI-2a, ROI-2b, and ROI-2c contain a low-contrast, a medium-contrast, and a high-contrast lesion, respectively.

FIG. 4.

Ninety-five percent confidence bands for the ROC curves corresponding to observer performance on short-scan and full-scan reconstructions. The band for short-scan reconstruction is shown in dark gray, delimited by dashed lines, and the band for full-scan reconstruction is shown in light gray, delimited by solid lines. Note that the bands overlap slightly.

FIG. 5.

Ninety percent confidence region for $(\mathrm{SNR}_{ss}, \mathrm{SNR}_{fs})$.

Tables

TABLE I.

Bounds on the variance ratio of the rating data for a lesion of diameter d mm with contrast C HU. (The other bound on the variance ratio is always 1.)

TABLE II.

Estimated coverage probabilities (in percent) for two-sided 95% AUC confidence intervals generated from normally distributed rating data with $n_1 = n_2 = n$. The tables correspond to two values of the variance ratio. In all cases, the upper and lower bounds of a conservative 95% confidence interval for the coverage probability may be obtained by adding 0.014% to, and subtracting 0.014% from, each point estimate, respectively.

TABLE III.

Estimated coverage probabilities (in percent) for two-sided 95% AUC confidence intervals generated from normally distributed rating data with $n_1 = 2n$ and $n_2 = n$. The tables correspond to two values of the variance ratio. In all cases, the upper and lower bounds of a conservative 95% confidence interval for the coverage probability may be obtained by adding 0.014% to, and subtracting 0.014% from, each point estimate, respectively.

TABLE IV.

Estimated p-values for the example. The p-values are for the Lilliefors normality test and for the two-sample F-test of equal variances.

TABLE V.

Comparison of 95% confidence intervals estimated for observer performance on short-scan and full-scan reconstructions. The intervals were estimated from $n_1 = 136$ class-1 ratings and $n_2 = 136$ class-2 ratings.

2011-07-20
2014-04-24

Abstract

Purpose:

This work seeks to develop exact confidence interval estimators for figures of merit that describe the performance of linear observers, and to demonstrate how these estimators can be used in the context of x-ray computed tomography (CT). The figures of merit are the receiver operating characteristic (ROC) curve and associated summary measures, such as the area under the ROC curve. Linear computerized observers are valuable for optimization of parameters associated with image reconstruction algorithms and data acquisition geometries. They provide a means to perform assessment of image quality with metrics that account not only for shift-variant resolution and nonstationary noise but that are also task-based.

Methods:

We suppose that a linear observer with fixed template has been defined and focus on the problem of assessing the performance of this observer for the task of deciding if an unknown lesion is present at a specific location. We introduce a point estimator for the observer signal-to-noise ratio (SNR) and identify its sampling distribution. Then, we show that exact confidence intervals can be constructed from this distribution. The sampling distribution of our SNR estimator is identified under the following hypotheses: (i) the observer ratings are normally distributed for each class of images and (ii) the variance of the observer ratings is the same for each class of images. These assumptions are, for example, appropriate in CT for ratings produced by linear observers applied to low-contrast lesion detection tasks.
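
As a rough illustration of the point-estimation step under hypotheses (i) and (ii), the sketch below computes a plug-in SNR estimate from two sets of ratings using a pooled variance, and converts it to an AUC with the equal-variance binormal relation AUC = Φ(SNR/√2). The function names and the sample ratings are hypothetical, and the estimator defined in the article may differ (for example, by a bias-correction factor), so this is only a minimal sketch.

```python
import math
from statistics import mean, variance


def snr_plugin(class1, class2):
    """Plug-in observer SNR under an equal-variance assumption (illustrative)."""
    n1, n2 = len(class1), len(class2)
    # Pooled sample variance of the ratings across the two classes.
    pooled_var = ((n1 - 1) * variance(class1)
                  + (n2 - 1) * variance(class2)) / (n1 + n2 - 2)
    return (mean(class2) - mean(class1)) / math.sqrt(pooled_var)


def auc_from_snr(snr):
    """AUC = Phi(SNR / sqrt(2)) for normal ratings with equal variances."""
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), hence Phi(snr / sqrt(2))
    # simplifies to 0.5 * (1 + erf(snr / 2)).
    return 0.5 * (1.0 + math.erf(snr / 2.0))


# Hypothetical ratings: class 1 = lesion absent, class 2 = lesion present.
ratings_absent = [0.1, -0.4, 0.3, 0.0, -0.2, 0.5]
ratings_present = [1.2, 0.7, 1.5, 0.9, 1.1, 0.4]
snr_hat = snr_plugin(ratings_absent, ratings_present)
auc_hat = auc_from_snr(snr_hat)
print(f"SNR estimate: {snr_hat:.3f}, AUC estimate: {auc_hat:.3f}")
```

An SNR of zero maps to an AUC of 0.5 (chance performance), and the AUC increases monotonically with the SNR, which is why an interval for the SNR translates directly into an interval for the AUC.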

Results:

Unlike existing approaches to the estimation of ROC confidence intervals, the new confidence intervals presented here have exactly known coverage probabilities when our data assumptions are satisfied. Furthermore, they are applicable to the most commonly used ROC summary measures, and they may be easily computed (a computer routine is supplied along with this article on the Medical Physics Website). The utility of our exact interval estimators is demonstrated through an image quality evaluation example using real x-ray CT images. Also, strong robustness is shown to potential deviations from the assumption that the ratings for the two classes of images have equal variance. Another aspect of our interval estimators is the fact that we can calculate their mean length exactly for fixed parameter values, which enables precise investigations of sampling effects. We demonstrate this aspect by exploring the potential reduction in statistical variability that can be gained by using additional images from one class, if such images are readily available. We find that when additional images from one class are used for an ROC study, the mean AUC confidence interval length for our estimator can decrease by as much as 35%.
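
One standard way to obtain exact intervals under the two stated hypotheses is to note that the pooled two-sample t statistic follows a noncentral t distribution whose noncentrality parameter is proportional to the SNR, and then to invert its cumulative distribution function. The sketch below does this with the Python standard library only. It is not the routine supplied with the article; the function names, the numerical-integration settings, and the example inputs are assumptions made for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def nct_cdf(t, df, delta, steps=2000):
    """P(T <= t) for a noncentral t variable, by Simpson integration.

    Uses T = (Z + delta) / S with S = sqrt(chi2_df / df), so that
    P(T <= t) = E[Phi(t * S - delta)].
    """
    width = 8.0 / math.sqrt(2.0 * df)  # roughly 8 standard deviations of S
    lo, hi = max(1e-12, 1.0 - width), 1.0 + width
    h = (hi - lo) / steps
    log_norm = (df / 2.0) * math.log(2.0) + math.lgamma(df / 2.0)

    def integrand(s):
        v = df * s * s  # corresponding chi-square value
        log_density = (math.log(2.0 * df * s)
                       + (df / 2.0 - 1.0) * math.log(v) - v / 2.0 - log_norm)
        return norm_cdf(t * s - delta) * math.exp(log_density)

    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0


def solve_delta(t_obs, df, target):
    """Bisection for delta with nct_cdf(t_obs, df, delta) == target.

    The cdf is decreasing in delta, which fixes the update direction.
    """
    span = 10.0 + abs(t_obs)
    lo, hi = t_obs - span, t_obs + span
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if nct_cdf(t_obs, df, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def auc_confidence_interval(snr_hat, n1, n2, alpha=0.05):
    """Two-sided (1 - alpha) AUC interval from a pooled-variance SNR estimate."""
    scale = math.sqrt(1.0 / n1 + 1.0 / n2)
    df = n1 + n2 - 2
    t_obs = snr_hat / scale  # pooled two-sample t statistic
    delta_lo = solve_delta(t_obs, df, 1.0 - alpha / 2.0)
    delta_hi = solve_delta(t_obs, df, alpha / 2.0)
    # delta = SNR / scale, and AUC = Phi(SNR / sqrt(2)) is increasing in SNR.
    return tuple(norm_cdf(d * scale / math.sqrt(2.0))
                 for d in (delta_lo, delta_hi))


# Hypothetical inputs: a plug-in SNR estimate of 1.0 from 136 ratings per class.
auc_lo, auc_hi = auc_confidence_interval(1.0, 136, 136)
print(f"95% AUC interval: [{auc_lo:.3f}, {auc_hi:.3f}]")
```

Because the interval comes from inverting an exact sampling distribution rather than from a large-sample approximation, its coverage does not depend on the sample sizes being large, provided the normality and equal-variance assumptions hold.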

Conclusions:

We have shown that exact confidence intervals can be constructed for ROC curves and for ROC summary measures associated with fixed linear computerized observers applied to binary discrimination tasks at a known location. Although our intervals only apply under specific conditions, we believe that they form a valuable tool for the important problem of optimizing parameters associated with image reconstruction algorithms and data acquisition geometries, particularly in x-ray CT.

10.1118/1.3577764