Elsevier

Academic Radiology

Volume 20, Issue 7, July 2013, Pages 915-919

Communication
A Brief History of Free-Response Receiver Operating Characteristic Paradigm Data Analysis

https://doi.org/10.1016/j.acra.2013.03.001

In the receiver operating characteristic paradigm the observer assigns a single rating to each image and the location of the perceived abnormality, if any, is ignored. In the free-response receiver operating characteristic paradigm the observer is free to mark and rate as many suspicious regions as are considered clinically reportable. Credit for a correct localization is given only if a mark is sufficiently close to an actual lesion; otherwise, the observer's mark is scored as a location-level false positive. Until fairly recently there existed no accepted method for analyzing the resulting relatively unstructured data containing random numbers of mark-rating pairs per image. This report reviews the history of work in this field, which has now spanned more than five decades. It introduces terminology used to describe the paradigm, proposed measures of performance (figures of merit), ways of visualizing the data (operating characteristics), and software for analyzing free-response receiver operating characteristic studies.

Section snippets

FROC data: Mark-Rating pairs

The mark is the location of the suspicious region and the rating is the confidence level that the region contains a lesion. The data analyst decides whether a mark is close enough to a real lesion to qualify as lesion localization (LL)—a location-level “true positive”—and otherwise the mark is classified as non-lesion localization (NL)—a location-level “false positive.” The quotes are intended to emphasize the confusion that can arise if one uses terminology developed for image-level ROC
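The scoring step described above can be illustrated with a minimal sketch. It assumes one particular proximity criterion, a fixed Euclidean acceptance radius around each lesion center; the source leaves the criterion to the data analyst, so the names `Mark`, `classify_marks`, and `acceptance_radius` are hypothetical and not taken from any published software.

```python
import math
from typing import NamedTuple


class Mark(NamedTuple):
    """One mark-rating pair: a location plus a confidence level."""
    x: float
    y: float
    rating: float


def classify_marks(marks, lesions, acceptance_radius):
    """Split marks into lesion localizations (LL) and non-lesion localizations (NL).

    Assumption of this sketch: a mark counts as an LL if its Euclidean distance
    to the nearest lesion center is at most `acceptance_radius`.

    Returns (ll, nl), where ll is a list of (lesion_index, rating) tuples and
    nl is a list of ratings.
    """
    ll, nl = [], []
    for mark in marks:
        # Distance to the nearest lesion, or None if the image has no lesions.
        nearest = min(
            ((math.hypot(mark.x - lx, mark.y - ly), i)
             for i, (lx, ly) in enumerate(lesions)),
            default=None,
        )
        if nearest is not None and nearest[0] <= acceptance_radius:
            ll.append((nearest[1], mark.rating))
        else:
            nl.append(mark.rating)
    return ll, nl
```

On an image with no lesions, every mark is necessarily an NL. The sketch does not resolve the case of several marks falling on the same lesion; an analyst would need a rule for that, such as keeping only the highest-rated mark.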

Operating Characteristics and Figures-of-Merit

Data analysis starts with the selection of a figure-of-merit (FOM) and a procedure for estimating it from the observed collection of NLs and LLs, each with an associated rating (the rating does not have to be a discrete integer). A valid FOM rewards the observer for correct decisions and penalizes for incorrect decisions. Finding a suitable FOM usually starts with a way of visualizing the data. For example, the ROC curve suggests the area under the curve as a suitable FOM for ROC data. Bunch

FOMs

The AFROC plot (LLF versus FPF) is amenable to defining a nonparametric FOM. One compares all pairings of LLs and highest rated NLs on normal images. If the LL rating is greater, one accumulates unity; if they are equal, one accumulates 0.5, and at the end of the process, one divides by the total number of comparisons. Except for some nuances, described in a document available at www.devchakraborty.com, this is the FOM used in jackknife alternative FROC (JAFROC) analysis (23), currently the most
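The pairwise comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the JAFROC implementation: the function name `afroc_fom` is hypothetical, and the handling of unmarked lesions and unmarked normal images (both assigned a rating of negative infinity, so that each lesion is compared with each normal image) is an assumption of this sketch about the "nuances" the text mentions.

```python
import math


def afroc_fom(lesion_ratings, normal_image_nl_ratings):
    """Nonparametric figure of merit from LL versus highest-rated-NL comparisons.

    lesion_ratings: one entry per lesion, the rating of its LL mark,
        or None if the lesion was not marked.
    normal_image_nl_ratings: one list per normal image holding the ratings
        of its NL marks (an empty list if the image has no marks).
    """
    unmarked = -math.inf  # assumption: unmarked items rank below every real rating
    ll = [unmarked if r is None else r for r in lesion_ratings]
    highest_nl = [max(marks, default=unmarked) for marks in normal_image_nl_ratings]
    if not ll or not highest_nl:
        raise ValueError("need at least one lesion and one normal image")

    total = 0.0
    for ll_rating in ll:
        for nl_rating in highest_nl:
            if ll_rating > nl_rating:
                total += 1.0   # lesion outranks the normal image's worst false positive
            elif ll_rating == nl_rating:
                total += 0.5   # tie
    return total / (len(ll) * len(highest_nl))
```

For example, `afroc_fom([4, 2, None], [[1, 3], [], [2]])` compares three lesions with three normal images (nine comparisons) and accumulates 5.0, giving 5/9. A value of 1 means every lesion was marked and rated above the highest-rated NL on every normal image.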

Discussion

This report has summarized the history of research in free-response data analysis. The history is essentially that of finding a good FOM and a method for testing the significance of the difference between two FOMs. The significance testing methodology has benefited immensely from work by Dorfman, Berbaum, and Metz (DBM) (25) and subsequent refinements by Hillis et al (26, 31-35).

Not discussed in this report are the modeling advances that have taken place in connection with FROC research (13-15, 36-38).

Acknowledgments

The author is grateful to Ms Kun-Wan Chen for proofing the manuscript. This work was supported by grants from the Department of Health and Human Services, National Institutes of Health (R01-EB005243 and R01-EB008688).

References (43)

  • J.P. Egan et al. Operating characteristics, signal detectability and the method of free response. J Acoust Soc Am (1961)
  • C.J. D'Orsi et al. Breast Imaging Reporting and Data System: ACR BI-RADS – Breast Imaging Atlas (2003)
  • H. Miller. The FROC curve: a representation of the observer's performance for the method of free response. J Acoust Soc Am (1969)
  • P.C. Bunch et al. A free-response approach to the measurement and characterization of radiographic-observer performance. J Appl Photogr Eng (1978)
  • D.P. Chakraborty et al. Digital and conventional chest imaging: a modified ROC study of observer performance using simulated nodules. Radiology (1986)
  • L.T. Niklason et al. Simulated pulmonary nodules: detection with dual-energy digital versus conventional radiography. Radiology (1986)
  • M. Kallergi et al. Evaluating the performance of detection algorithms in digital mammography. Med Phys (1999)
  • T.M. Haygood et al. On the choice of acceptance radius in free-response observer performance studies. Br J Radiol (2013)
  • D.P. Chakraborty et al. Free-response methodology: alternate analysis and a new observer-performance experiment. Radiology (1990)
  • D.P. Chakraborty. Maximum likelihood analysis of free-response receiver operating characteristic (FROC) data. Med Phys (1989)
  • R.G. Swensson. Unified measurement of observer performance in detecting and localizing target objects on images. Med Phys (1996)