Rank tests for ROC curves

Alan Girling

University of Birmingham

The diagnostic performance of a pair of competing markers may be compared by computing the area between their respective ROC curves. Where the ROC curves are estimated nonparametrically, the standard error of this area may be derived using the method of DeLong et al. (1988). This leads to a simple large-sample test of the null hypothesis that the ROC curves are identical. We describe an improvement to the large-sample test, and a small-sample version of the test based on resampling ideas.
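As a rough illustration of the standard large-sample test referred to above, the following Python sketch computes the difference between two empirical AUCs for paired markers, together with a standard error built from the placement values used in the DeLong et al. (1988) approach. It assumes the "area between the curves" is taken as the signed area, which equals the difference in AUC; the function names and argument layout are hypothetical and chosen only for this example. It does not implement the improved "null" standard error described in this talk.

```python
import math
import numpy as np


def placement_values(cases, controls):
    """Return per-case and per-control placement values for one marker."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    diff = cases[:, None] - controls[None, :]
    # Kernel: 1 if case > control, 0.5 if tied, 0 otherwise.
    psi = (diff > 0) + 0.5 * (diff == 0)
    return psi.mean(axis=1), psi.mean(axis=0)


def delong_paired_test(cases_a, controls_a, cases_b, controls_b):
    """Large-sample z-test for the AUC difference of two paired markers.

    Entry i of cases_a and cases_b must refer to the same diseased
    subject, and likewise for the controls (an assumption of this sketch).
    Returns (auc difference, standard error, z statistic, two-sided p).
    """
    v10_a, v01_a = placement_values(cases_a, controls_a)
    v10_b, v01_b = placement_values(cases_b, controls_b)
    m, n = len(v10_a), len(v01_a)

    diff = float(v10_a.mean() - v10_b.mean())

    # 2x2 sample covariance matrices of the placement values.
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    cov = s10 / m + s01 / n

    var_diff = float(cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1])
    if var_diff <= 0.0:
        # Degenerate case, e.g. two identical markers.
        return diff, 0.0, float("nan"), float("nan")

    se = math.sqrt(var_diff)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return diff, se, z, p
```

A call such as `delong_paired_test(cases_a, controls_a, cases_b, controls_b)` returns the AUC difference, its estimated standard error, the z statistic, and a two-sided p-value from the normal approximation.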

The large-sample improvement employs a "null" version of the standard error, which exploits the identity of the two curves under the null hypothesis. The corresponding modification to the statistical test is shown to have greater power whenever the sets of samples on which the ROC curves are based have exchangeable ranks within each diagnostic group. Extensions of the idea to the analysis of complex studies of "ratings" data are considered, with a corresponding exchangeability condition that is often met in practice.

Small-sample versions of these tests are proposed, based on bootstrap resampling from empirical copulas to generate distributions consistent with the null hypothesis. This procedure is feasible because of the invariance of the ROC curves under monotone transformations of the diagnostic markers.
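The invariance property mentioned above can be checked numerically. The sketch below is only an illustration with hypothetical function names: it shows that the empirical AUC is unchanged when a marker is passed through an increasing transformation, or replaced by its scaled ranks in the pooled sample. It does not attempt to reproduce the copula-based bootstrap itself, whose details are not given here.

```python
import numpy as np


def empirical_auc(cases, controls):
    """Empirical AUC: P(case > control) + 0.5 * P(case == control)."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    diff = cases[:, None] - controls[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def pooled_ranks(cases, controls):
    """Replace marker values by dense ranks in the pooled sample, scaled to (0, 1]."""
    cases = np.asarray(cases, dtype=float).ravel()
    controls = np.asarray(controls, dtype=float).ravel()
    pooled = np.concatenate([cases, controls])
    # Dense ranks preserve both the ordering and the ties of the raw values.
    _, dense = np.unique(pooled, return_inverse=True)
    ranks = (dense + 1) / (dense.max() + 1)
    return ranks[: len(cases)], ranks[len(cases):]


# Simulated marker values (rounded so that some ties occur).
rng = np.random.default_rng(0)
cases = np.round(rng.normal(1.0, 1.0, size=30), 1)
controls = np.round(rng.normal(0.0, 1.0, size=40), 1)

auc_raw = empirical_auc(cases, controls)
auc_exp = empirical_auc(np.exp(cases), np.exp(controls))  # increasing transform
auc_rank = empirical_auc(*pooled_ranks(cases, controls))  # rank transform
print(auc_raw, auc_exp, auc_rank)
```

Because the three values agree, a resampling scheme can work entirely on the rank scale of each marker without altering the ROC curves being compared.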
