The seven-point scale used in the rating exercise for the listening tests as seen on the computer screen.
ROC curves representing ideal performance, an example of realistic performance, and chance performance.
Relationship between a detection threshold in a binormal model and its corresponding operating point on a ROC curve. The coordinates of the operating point, (PFA, PD), on a ROC curve are the areas under the clutter and target-echo distributions, respectively, to the right of a given detection threshold on the x axis of the binormal model. The full binormal ROC curve is obtained by smoothly varying the position of the threshold across the breadth of the Gaussians, so that the curve asymptotically reaches its (0,0) and (1,1) extremities.
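The relationship between a threshold and its operating point can be sketched in a few lines of Python. This is only an illustration, not the software used in the study: the function names are hypothetical, the target-echo mean and standard deviation (1.5 and 1.2) are arbitrary values chosen for the example, and the clutter distribution is assumed to be scaled to zero mean and unit variance.

```python
import math


def upper_tail(z):
    """Area under a standard normal density to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def operating_point(threshold, target_mean, target_sd):
    """Return (PFA, PD) for one detection threshold in a binormal model.

    Assumption: clutter is N(0, 1) and target echoes are
    N(target_mean, target_sd**2) on the same decision axis.
    """
    pfa = upper_tail(threshold)                              # clutter area right of threshold
    pd = upper_tail((threshold - target_mean) / target_sd)   # target-echo area right of threshold
    return pfa, pd


# Sweep the threshold from -6 to +6 in steps of 0.1; the resulting points
# run from near (1, 1) down to near (0, 0). Parameter values are illustrative.
curve = [operating_point(t / 10.0, 1.5, 1.2) for t in range(-60, 61)]
```

Each entry of `curve` is one (PFA, PD) pair; plotting PD against PFA would give the modeled ROC curve for these assumed parameters.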
When modeling ROC performance, it is commonly assumed that the distributions of the two perceived classes of objects are Gaussian, and the Gaussians are scaled such that the mean and variance of the clutter distribution are 0 and 1, respectively. The two unknown parameters that determine the sensitivity of a detector (the shape of the ROC curve) are the mean and variance of the Gaussian representing the distribution of the perceived target echoes. The model that is fitted to the measured operating points (discussed further in Sec. III C) will also include a set of predicted thresholds, t1, t2, ..., tn.
Graphical result from the web-based program ROC analysis for subject s14 in rating exercise A. The plot has been recreated using a slightly different format so as to be legible in black and white print. The modeled ROC curve and asymmetric 95% confidence interval are depicted as well as the measured operating points obtained from binning and lumping together the listener’s ratings of the echoes (the points marked with squares).
A goodness-of-fit test showed that the use of a binormal ROC curve as a model was not valid for subject s03 in the first occurrence of the rating exercise in the full-bandwidth test. The inadequate fit could also be seen in the ROC plot: two of the measured operating points are situated outside the error bounds of the modeled ROC curve. Similar results were obtained for subject s09.
Measures of performance of the individual human subjects for the full-bandwidth test. Shown are the Az estimates and 95% confidence limits. Included in the ranking is the performance of the automatic classifier, which is discussed in Sec. IV.
Measures of performance of the individual human subjects for the reduced-bandwidth test. Shown are the Az estimates and 95% confidence limits. Included in the ranking is the performance of the automatic classifier, which is discussed in Sec. IV.
The automatic classifier computes a ratio of a posteriori probabilities for each echo tested. Ranking the echoes according to this ratio effectively ranks them from most clutter-like to most like a target echo. A subset of ranked echoes is depicted here. The criterion used to bin the echoes is to proceed through the sequence from right to left and define a boundary when at least n target echoes and n cases of clutter have been counted. From the right-most boundary shown here, 14 target echoes were counted before the number of clutter cases satisfied the criterion of n = 6. A boundary is set at this point, and a new count begins to define the next boundary.
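The binning criterion described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name and the "T"/"C" labels are hypothetical, the sequence is assumed to be ordered from most clutter-like (left) to most target-like (right), and how any echoes left over at the left end are treated is not specified here.

```python
def bin_boundaries(labels, n):
    """Return indices at which bin boundaries fall in a ranked echo sequence.

    labels: sequence of "T" (target echo) or "C" (clutter), ranked from
            most clutter-like (index 0) to most target-like (last index).
    n:      minimum number of target echoes AND clutter cases per bin.

    A returned index i means a boundary lies immediately to the left of
    labels[i]. Boundaries are listed in the order found (right to left).
    """
    boundaries = []
    targets = clutter = 0
    # Proceed through the sequence from right to left.
    for i in range(len(labels) - 1, -1, -1):
        if labels[i] == "T":
            targets += 1
        else:
            clutter += 1
        # Set a boundary once both counts have reached n, then restart the count.
        if targets >= n and clutter >= n:
            boundaries.append(i)
            targets = clutter = 0
    return boundaries


# Hypothetical ranked sequence, n = 2.
print(bin_boundaries("CCTCCTCTTCTT", 2))  # [6, 2]
```

In the hypothetical sequence, four target echoes are counted from the right before the second clutter case is reached, which is the same situation the caption describes with 14 target echoes and n = 6.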
Modeled ROC curve representing the automatic classifier’s performance in the reduced-bandwidth test.
Modeled ROC curves of one of the high-performing listeners (subject s08) in the reduced-bandwidth test in rating exercises A and B. Some of the listeners performed better in the first rating exercise. Others did better in the second. For subject s08, one can see that the performances were nearly identical.
The estimate of the performance of listener s15 was closest to the estimate of mean performance of all the listeners in the reduced-bandwidth test. Shown are the modeled ROC curves for subject s15.
The input data for the ROC analysis consists of two lines of n integers, where n is the number of rating categories. Each line represents how the human subject rated one class of echoes; the n integers on a line are the number of times each rating category was used to rate the sample of echoes forming that class. In the example shown here, the listener rated 4 of the clutter cases as "definitely, or almost definitely, clutter," 57 as "probably clutter," 7 as "could equally well be clutter or a target echo," and 5 as "possibly a target echo." The line representing the target echoes is interpreted in the same way. The sum of the n integers on the first line is therefore the total number of clutter cases used in the listening test, and the sum on the second line is the total number of target echoes.
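One way to turn such a pair of count lines into measured operating points is sketched below. It is an illustration only: the function names are hypothetical, whitespace-separated integers are an assumed layout (the exact format expected by the web-based program is not given here), and the counts in the usage example are invented for the sketch, not taken from any listener. Each boundary between adjacent rating categories is treated as a threshold, and the fraction of each class rated above that boundary gives PFA and PD.

```python
def parse_counts(text):
    """Parse two lines of integers: clutter counts first, target-echo counts second.

    Assumption: integers are separated by whitespace.
    """
    rows = [line.split() for line in text.strip().splitlines()]
    clutter = [int(x) for x in rows[0]]
    target = [int(x) for x in rows[1]]
    if len(clutter) != len(target):
        raise ValueError("both lines must contain the same number of categories")
    return clutter, target


def operating_points(clutter, target):
    """Return the n - 1 measured (PFA, PD) points for n rating categories.

    Categories are assumed to run from most clutter-like (first) to most
    target-like (last).
    """
    n_clutter, n_target = sum(clutter), sum(target)
    points = []
    for k in range(1, len(clutter)):
        pfa = sum(clutter[k:]) / n_clutter  # clutter rated in category k or higher
        pd = sum(target[k:]) / n_target     # target echoes rated in category k or higher
        points.append((pfa, pd))
    return points


# Hypothetical counts for a seven-category scale (not data from the study).
example = """30 25 15 10 8 7 5
3 5 7 10 15 25 35"""
clutter, target = parse_counts(example)
points = operating_points(clutter, target)
```

With seven categories the sketch yields six operating points; the trivial points (0,0) and (1,1) are omitted.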
Summary of the results of the ROC analysis applied to the data from the two executions of the rating exercise (rating exercises A and B) in the full-bandwidth test. The codes representing the listeners are found in the first column. The tabulated numbers are the index Az, the standard deviation of Az, and the probability result of the goodness-of-fit test (Q). Comments on the irregular results are provided in Sec. ???.
Summary of the results of the ROC analysis applied to the data from the two executions of the rating exercise (rating exercises A and B) in the reduced-bandwidth test. The codes representing the listeners are found in the first column. The tabulated numbers are the index Az, the standard deviation of Az, and the probability result of the goodness-of-fit test (Q). Comments on the irregular results are provided in Sec. ???.
Comparison of the mean performance of the human listeners and the performance of the automatic classifier for the full-bandwidth and reduced-bandwidth tests. Performance is stated as the area under the binormal curve, Az, and its associated standard error.