Statistical evaluation of agreement between two methods for measuring a quantitative variable

https://doi.org/10.1016/0010-4825(89)90036-X

Abstract

Methodologic research is often concerned with determining whether two methods (procedures, laboratory instruments) can be used interchangeably for measuring some quantitative variable of interest. Logically, one method can serve as a surrogate for another only if the two methods show high agreement on the measured results. Although the product-moment correlation (r) is often used as an indicator of agreement, this index is inappropriate for the purpose: it measures linear association and can remain high even when one method is systematically biased relative to the other. The intraclass correlation (r_I) is the correct statistic for assessing agreement or consistency between two methods.
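
The numerical sketch below is not part of the original article; it simply illustrates the point. Paired readings are simulated in Python (NumPy) with one method carrying a constant additive bias: the product-moment correlation stays high, while an agreement-type intraclass correlation, here the ICC(2,1) estimator of Shrout and Fleiss computed from two-way ANOVA mean squares, is visibly penalized. The simulated data and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_level = rng.normal(100.0, 15.0, size=30)                  # latent values for 30 subjects
method_a = true_level + rng.normal(0.0, 3.0, size=30)           # method A: no bias
method_b = true_level + 10.0 + rng.normal(0.0, 3.0, size=30)    # method B: constant +10 bias

# Product-moment correlation is blind to the additive bias.
r = np.corrcoef(method_a, method_b)[0, 1]

# Agreement-type intraclass correlation, ICC(2,1), from two-way ANOVA mean squares.
x = np.column_stack([method_a, method_b])                       # n subjects x k methods
n, k = x.shape
grand = x.mean()
ms_subjects = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)
ms_methods = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)
resid = x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + grand
ms_error = (resid ** 2).sum() / ((n - 1) * (k - 1))
icc = (ms_subjects - ms_error) / (
    ms_subjects + (k - 1) * ms_error + k * (ms_methods - ms_error) / n
)

print(f"Pearson r = {r:.3f}")    # high despite the bias
print(f"ICC(2,1)  = {icc:.3f}")  # noticeably lower, reflecting the disagreement
```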

Another criterion sometimes used to support interchangeability is similarity of the mean measured results obtained by the two methods. However, similarity of means (aggregate agreement) does not necessarily indicate individual-subject agreement, and it is the latter that is the prerequisite for interchangeability. On the other hand, a marked difference between the two means (lack of aggregate agreement) does necessarily indicate lack of individual-subject agreement and therefore non-interchangeability.
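
A contrived example, not taken from the article, makes the asymmetry concrete: two sets of readings can share the same mean while disagreeing badly subject by subject.

```python
import numpy as np

# Hypothetical readings on five subjects (illustrative numbers only).
method_a = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
method_b = np.array([50.0, 40.0, 30.0, 20.0, 10.0])

print(method_a.mean(), method_b.mean())   # both 30.0: the means agree ...
print(np.abs(method_a - method_b))        # ... but individual differences reach 40 units
```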

Herein we suggest that two methods for measuring a quantitative variable can be judged interchangeable provided all of the following conditions are met: first, the methods must not exhibit marked additive or nonadditive systematic bias; second, the difference between the two mean readings must not be statistically significant; and third, the lower limit of the 95% confidence interval of the intraclass correlation must be at least 0.75.

Statistical procedures to evaluate these conditions of interchangeability are described in detail. A computer program coded in SAS to carry out the procedures is listed in the Appendix. A similar program coded in dBASE III PLUS for the microcomputer is available upon request.
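
The fragment below is an editorial sketch, not the authors' SAS or dBASE program: a hypothetical Python function (interchangeability_check) that screens the three conditions using the SciPy and pingouin libraries. The regression-based check for nonadditive (proportional) bias, the choice of the ICC(2,1) variant, and all names are assumptions introduced here for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats
import pingouin as pg


def interchangeability_check(a, b, icc_floor=0.75, alpha=0.05):
    """Rough screening of the three conditions described in the abstract.

    a, b : 1-D arrays of paired readings (method A and method B on the same subjects).
    The bias check and the ICC variant used here are assumptions, not necessarily
    the exact procedures of the original paper.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = len(a)

    # Condition 1 (nonadditive bias, one common proxy): the slope of the
    # between-method differences regressed on the subject means should be ~0.
    slope_test = stats.linregress((a + b) / 2, a - b)

    # Condition 2: difference between the mean readings not statistically significant.
    t_test = stats.ttest_rel(a, b)

    # Condition 3: lower 95% CI limit of the intraclass correlation >= 0.75.
    long = pd.DataFrame({
        "subject": np.tile(np.arange(n), 2),
        "method": np.repeat(["A", "B"], n),
        "reading": np.concatenate([a, b]),
    })
    icc = pg.intraclass_corr(data=long, targets="subject",
                             raters="method", ratings="reading")
    icc2 = icc.set_index("Type").loc["ICC2"]   # two-way random, absolute agreement
    icc_lower = icc2["CI95%"][0]

    return {
        "proportional_bias_p": slope_test.pvalue,
        "mean_difference_p": t_test.pvalue,
        "icc": icc2["ICC"],
        "icc_ci_lower": icc_lower,
        "interchangeable": (slope_test.pvalue > alpha
                            and t_test.pvalue > alpha
                            and icc_lower >= icc_floor),
    }
```

The bias check here regresses the between-method differences on the subject means, a Bland-Altman-style diagnostic; the original paper may partition additive and nonadditive bias differently, for example through the method and method-by-subject terms of the ANOVA.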
