Welcome to the first in a series of articles on receiver operator characteristic (ROC) statistics and ROC curves, along with odds ratios, likelihood ratios, relative risks and other statistics and graphics used to study outcomes. Our emphasis will be on understanding the concepts and interpreting these tools, using examples from the literature.
In certain areas of healthcare, receiver operator (or operating) characteristic curves and the area under the ROC curve (AUC) come up when a test or a group of tests is being evaluated for its clinical value rather than for its analytical performance.
When a clinician asks the laboratory staff to "run a test" on a patient, the clinician is seeking aid in making a diagnosis or guiding treatment, such as drug dosage. The test, or group of tests, was requested based on the signs and symptoms seen and heard by the clinician. These signs and symptoms suggested, but did not confirm, a certain disease or a cluster of similar diseases.
The physician is hoping the result of this diagnostic test will help in refining the differential diagnosis, and give some direction for the next move -- perhaps another laboratory test, a radiology test, a biopsy or perhaps the diagnosis. The clinician expects that not only is the test result analytically accurate and precise (within the limits of the assay), but also that the result will aid in the diagnosis or treatment of the patient.
The physician orders certain tests when there are certain symptoms and signs. These tests have been evaluated not only for their analytical value but also for their ability to aid in ruling in or ruling out diagnoses, thus raising the possibility of another diagnosis.
The ROC was first developed by electrical and radar engineers during World War II. In radar, the cathode ray tube was the receiver and would search the British skies for German planes. If the receiver was set too high, it found not only planes but also birds; if it was set too low, it could miss planes. Thus the receiver and its operator had certain characteristics.
The ROC is also known as a relative operating characteristic curve, because it is a comparison of two operating characteristics (the true-positive response and the false-positive response) as the criterion changes.
Accuracy and precision are analytical aspects of laboratory tests. Given acceptable analytical accuracy and precision, the clinical efficacy of the test is how well it separates patients with disease X from patients without disease X (Figures 1a and 1b).
In the figure, the x-axis represents the test result (such as blood sugar or blood pressure) and the y-axis represents the frequency with which that result appears in this study. The relative size of the two groups reflects the fact that, more often than not, there are many more patients without the disease than patients with it. This may have a significant effect when comparing data from one study to another. We return to this later.
Before a test becomes part of the diagnostic protocol that a clinician uses, studies are done to determine its diagnostic accuracy and to set the appropriate cutoff(s). When only one cutoff is used, a value below the cutoff is labeled "normal" or negative (troponin, for example); a value above the cutoff is "abnormal" or positive.
When two cutoffs are used (such as TSH [for hypothyroidism and hyperthyroidism]), a value below the lower cutoff is "low" or "abnormal, low" and a value above the higher cutoff is "high" or "abnormal, high." Values between the two cutoffs are "normal."
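The one-cutoff and two-cutoff labeling rules described above can be sketched in a few lines of Python. This is only an illustration: the function name label_result, its parameter names and the cutoff values are hypothetical, the numbers are arbitrary units rather than clinical reference values, and the treatment of a result exactly equal to a cutoff (called negative or normal here) is an assumption, because the text does not specify it.

```python
from typing import Optional


def label_result(value: float, upper_cutoff: float,
                 lower_cutoff: Optional[float] = None) -> str:
    """Label a test result against one cutoff or two (hypothetical helper).

    With one cutoff, a value above it is "positive", otherwise "negative".
    With two cutoffs, a value below the lower one is "abnormal, low", a value
    above the upper one is "abnormal, high", and anything else is "normal".
    Assumption: a value exactly equal to a cutoff is not flagged.
    """
    if lower_cutoff is None:
        # Single-cutoff test: only high values are abnormal.
        return "positive" if value > upper_cutoff else "negative"

    if lower_cutoff > upper_cutoff:
        raise ValueError("lower_cutoff must not exceed upper_cutoff")

    # Two-cutoff test: both tails are abnormal.
    if value < lower_cutoff:
        return "abnormal, low"
    if value > upper_cutoff:
        return "abnormal, high"
    return "normal"


# Illustrative cutoffs in arbitrary units, not real reference ranges.
print(label_result(12.0, upper_cutoff=10.0))
print(label_result(3.0, upper_cutoff=5.0, lower_cutoff=1.0))
```

In practice the cutoffs themselves come from the kind of clinical evaluation discussed in the rest of this article; the code only applies them.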
We stress that a clinical evaluation of the kind discussed in this section should be performed in addition to the standard protocol the laboratory uses to evaluate new tests before adding them to its roster. In these evaluations, which are uncommon in smaller hospitals, the clinicians are asked to label the patient "positive" or "negative" for the disease without first having access to the result of the test being studied.
In the blind study, the test result is withheld from the clinician until a diagnosis is made. The clinician may use any number of other tests (from the laboratory, the radiology department, a scan or a biopsy, etc.) as well as the signs and symptoms seen, but not the result of the test being evaluated. This makes it possible to ascertain the test's value to the diagnosis and to assign an appropriate cutoff. There are several reasons clinicians would like to have access to a new biomarker: current tests could be more invasive, more expensive, less clinically sensitive or clinically specific (or both), difficult to perform, slow to return results, etc.
In some situations, a test may be evaluated on a smaller scale, without blinding the clinician, to determine only the cutoff that best suits the clinical staff. All the data are available to the clinician when the diagnosis is made. Once there are enough data points, the best cutoff for that population and medical staff is determined.
In either case (blinded or not), a receiver operator characteristic (ROC) curve analysis is helpful. The ROC analysis is a cooperative effort among multiple disciplines, including the laboratory, clinicians, nurse managers, specialists from diagnostic imaging, perhaps the pharmacy, and someone to discuss the financial impact.
We can think of ROC curves as a means to trade off the rates of false negative results (FN: patients who truly have the disease yet have a test result below the cutoff value studied) and false positive results (FP: patients without the disease who have a test result above the cutoff).
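To make the trade-off concrete, here is a minimal Python sketch that counts false negatives and false positives at several candidate cutoffs. The data are invented for illustration and do not come from any study, and the function names confusion_counts and roc_points are hypothetical. The sketch assumes a single cutoff with results above it called positive, and it reports the true-positive rate, TP / (TP + FN), and the false-positive rate, FP / (FP + TN), which are the two quantities plotted against each other on an ROC curve.

```python
def confusion_counts(results, has_disease, cutoff):
    """Return (TP, FP, TN, FN) when results above `cutoff` are called positive."""
    tp = fp = tn = fn = 0
    for value, diseased in zip(results, has_disease):
        positive = value > cutoff
        if positive and diseased:
            tp += 1   # true positive
        elif positive and not diseased:
            fp += 1   # false positive: no disease, result above cutoff
        elif diseased:
            fn += 1   # false negative: disease, result not above cutoff
        else:
            tn += 1   # true negative
    return tp, fp, tn, fn


def roc_points(results, has_disease, cutoffs):
    """Return (cutoff, false-positive rate, true-positive rate) per cutoff."""
    if not any(has_disease) or all(has_disease):
        raise ValueError("need patients both with and without the disease")
    points = []
    for cutoff in cutoffs:
        tp, fp, tn, fn = confusion_counts(results, has_disease, cutoff)
        points.append((cutoff, fp / (fp + tn), tp / (tp + fn)))
    return points


# Invented data: six patients without the disease, four with it.
results = [1, 2, 3, 4, 5, 6, 4, 6, 7, 8]
has_disease = [False] * 6 + [True] * 4

for cutoff in (3, 5, 7):
    tp, fp, tn, fn = confusion_counts(results, has_disease, cutoff)
    print(f"cutoff {cutoff}: FN={fn}, FP={fp}")
```

With these made-up numbers, raising the cutoff from 3 to 5 to 7 lowers the false positives from 3 to 1 to 0 while raising the false negatives from 0 to 1 to 3. Neither error can be driven down without paying for it in the other.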
(Editor's Note: this is the first part of a multi-part series. Check back for Part 2.)