Welcome to the second installment of our series on Receiver Operating Characteristic (ROC) curves and related statistics and graphics.
The ideal way to evaluate a diagnostic test or panel of tests from a clinical viewpoint is to withhold the test data from the clinician until a diagnosis has been made. [This is called a blind study.] The diagnosis is then compared with the test results. This comparison (Table 1) shows whether the test is of value to the staff and, if the test is a quantitative laboratory test, which cut-off should be chosen. [The table illustrates the format only and in no way suggests that you can do a study with as few as 6 patients!]
A diagnostic test may also be evaluated on a smaller scale to determine the cut-off that best suits the clinical staff of a smaller, general hospital (unlike a large teaching hospital, where the variety and number of patients make a full-blown 'blind' study possible). In this smaller study the clinician is not blinded: all the data are available to the clinician when the diagnosis is made. Once there are enough data points, the best cut-off for that population and medical staff may be determined using ROC data and the ROC curve.
The ROC analysis is a cooperative effort among multiple disciplines, which should include the laboratory, clinicians, nurse managers, specialists from diagnostic imaging, perhaps the pharmacy, and someone to discuss the financial impact of incorporating a new test (and perhaps discarding one or more other tests).
We can think of the ROC study as a means to trade off the rates of false negative results ("FN", those patients who truly have the disease yet have a test result below the cut-off value studied) and false positive results ("FP", those patients without the disease who have a test result above the cut-off).
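The trade-off can be seen by counting both kinds of error at several candidate cut-offs. The sketch below is only an illustration: the data are invented, the function name `count_errors` is our own, and we assume that a value above the cut-off is called positive while a value at or below it is called negative.

```python
# Hypothetical (test value, diagnosis) pairs, invented for illustration only.
# diagnosis True means the clinician diagnosed the disease.
results = [
    (2.1, False), (3.4, False), (4.0, False), (5.2, False), (6.1, False),
    (4.8, True), (5.9, True), (6.5, True), (7.7, True), (9.0, True),
]


def count_errors(results, cutoff):
    """Return (FN, FP), assuming values above the cut-off are called positive."""
    fn = sum(1 for value, diseased in results if diseased and value <= cutoff)
    fp = sum(1 for value, diseased in results if not diseased and value > cutoff)
    return fn, fp


for cutoff in (3.0, 5.0, 7.0):
    fn, fp = count_errors(results, cutoff)
    print(f"cut-off {cutoff}: FN={fn}, FP={fp}")
# cut-off 3.0: FN=0, FP=4
# cut-off 5.0: FN=1, FP=2
# cut-off 7.0: FN=3, FP=0
```

With these made-up numbers, raising the cut-off removes false positives but adds false negatives, and lowering it does the reverse.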
Determination of Clinical Sensitivity and Specificity - Essential Statistics in the Design of ROC Curves
In a nutshell, the ROC study is carried out to help select a cut-off for a test that gives the highest sensitivity (the number of patients with the disease who give a positive result divided by the number of patients diagnosed with the disease) OR the highest specificity (the number of patients without the disease who give a negative result divided by the number of patients without the disease) OR the highest efficiency (the most 'correct' answers, that is, the proportion of all results that agree with the diagnosis). We emphasize OR to stress that it is not always possible to achieve more than one of these three criteria with a single cut-off. Fig. 1 illustrates this problem.
This is because there is NO perfect diagnostic test, just as there are no perfect clinicians. In addition to the two erroneous results described above (false negative and false positive), there are two others: a true negative ("TN", the test result matches a diagnosis of negative) and a true positive ("TP", the test result matches a diagnosis of positive).
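As a minimal sketch, the three statistics can be computed from the counts of true positives (TP), false negatives (FN), true negatives (TN) and false positives (FP) at one cut-off. The function name `roc_statistics` and the counts in the example are hypothetical, chosen only to show the arithmetic.

```python
def roc_statistics(tp, fn, tn, fp):
    """Sensitivity, specificity and efficiency from the four outcome counts."""
    diseased = tp + fn       # patients diagnosed with the disease
    not_diseased = tn + fp   # patients diagnosed without the disease
    if diseased == 0 or not_diseased == 0:
        raise ValueError("need at least one patient with and one without the disease")
    return {
        "sensitivity": tp / diseased,
        "specificity": tn / not_diseased,
        "efficiency": (tp + tn) / (diseased + not_diseased),
    }


# Invented counts for illustration: 50 diseased and 50 non-diseased patients.
stats = roc_statistics(tp=40, fn=10, tn=45, fp=5)
print(stats)  # sensitivity 0.8, specificity 0.9, efficiency 0.85
```

Repeating the calculation for each candidate cut-off gives the numbers from which the three criteria can be compared.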
Let's move to a 2 x 2 grid to illustrate the four partitions in ROC studies and the statistics that can be calculated from the table. We will begin our discussion of the 2 x 2 table with definitions of the four possible outcomes for any predetermined value, or cut-off:
- "TN" - true negative: the patient does not have the disease [according to the physician - the patient is labeled negative] and the test value is below the cut-off.
- "FN" - false negative: the patient does have the disease [according to the physician - the patient is labeled positive], but the test value is below the cut-off (IF the cut-off is set to detect abnormal patients whose test 'should be' higher than the cut-off).
- "FP" - false positive: the patient does not have the disease [according to the physician - the patient is labeled negative], but the test value is above the cut-off (IF the cut-off is set to detect abnormal patients whose test 'should be' higher than the cut-off).
- "TP" - true positive: the patient does have the disease [according to the physician - the patient is labeled positive], and the test value is above [OR below] the appropriate cut-off.
The labels in the grid are intentionally in quotes to remind us that these are subject to errors in the diagnosis and in the test results.
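The four definitions can be sketched as a small classification routine. This is an illustration under stated assumptions, not a validated tool: the names `classify`, `two_by_two` and `positive_above` are our own, the six patients are invented (and, as noted earlier, far too few for a real study), and a value exactly equal to the cut-off is treated as a negative test.

```python
from collections import Counter


def classify(value, diseased, cutoff, positive_above=True):
    """Label one patient "TP", "FP", "TN" or "FN" for a given cut-off.

    positive_above=True: abnormal patients 'should be' above the cut-off.
    positive_above=False: abnormal patients 'should be' below the cut-off.
    """
    if positive_above:
        test_positive = value > cutoff
    else:
        test_positive = value < cutoff
    if diseased:
        return "TP" if test_positive else "FN"
    return "FP" if test_positive else "TN"


def two_by_two(results, cutoff, positive_above=True):
    """Count the four outcomes over (test value, diagnosis) pairs."""
    counts = Counter(classify(v, d, cutoff, positive_above) for v, d in results)
    return {label: counts[label] for label in ("TP", "FP", "FN", "TN")}


# Invented data: (test value, physician's diagnosis of disease).
patients = [(2.0, False), (3.5, False), (6.0, False),
            (4.5, True), (7.0, True), (8.5, True)]
table = two_by_two(patients, cutoff=5.0)
print(table)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```

The diagnosis column is taken as given here, so any error in the diagnosis carries straight through to the four counts, which is why the labels stay in quotes.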
Below is a sketch of an ROC worksheet for Ceramic. The worksheet might have a column for comments (and, in reality, a considerably larger number of data points).