Background: Poor inter-rater reliability in chest radiograph interpretation has been reported in the context of acute respiratory distress syndrome (ARDS), although not for the Berlin definition of ARDS. We sought to examine the effect of training material on the accuracy and consistency of intensivists' chest radiograph interpretations for ARDS diagnosis.
Methods: We conducted a rater agreement study in which 286 intensivists (residents 41.3%, junior attending physicians 35.3%, and senior attending physicians 23.4%) independently reviewed the same 12 chest radiographs developed by the ARDS Definition Task Force ("the panel") before and after training. The panel's radiographic diagnoses, classified as consistent (n = 4), equivocal (n = 4), or inconsistent (n = 4) with ARDS, were used as the reference standard. The 1.5-hour training course, attended by all 286 intensivists, included an introduction to the diagnostic rationale and a subsequent in-depth discussion to reach consensus on all 12 radiographs.
Results: Overall diagnostic accuracy, which was defined as the percentage of chest radiographs that were interpreted correctly, improved but remained poor after training (42.0 +/- 14.8% before training vs. 55.3 +/- 23.4% after training, p < 0.001). Diagnostic sensitivity and specificity improved after training for all diagnostic categories (p < 0.001), with the exception of specificity for the equivocal category (p = 0.883). Diagnostic accuracy was higher for the consistent category than for the inconsistent and equivocal categories (p < 0.001). Comparisons of pre-training and post-training results revealed that inter-rater agreement was poor and did not improve after training, as assessed by overall agreement (0.450 +/- 0.406 vs. 0.461 +/- 0.575, p = 0.792), Fleiss's kappa (0.133 +/- 0.575 vs. 0.178 +/- 0.710, p = 0.405), and intraclass correlation coefficient (ICC; 0.219 vs. 0.276, p = 0.470).
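For readers unfamiliar with the agreement statistics named in the Results, the following is a minimal sketch of how Fleiss's kappa and the mean pairwise observed agreement can be computed for several raters assigning each radiograph to one of three categories. It is not the authors' analysis code. The function names and the toy ratings are hypothetical, and the study's "overall agreement" measure may have been defined differently from the pairwise agreement shown here.

```python
from collections import Counter

CATEGORIES = ("consistent", "equivocal", "inconsistent")


def _count_table(ratings, categories):
    """Turn per-radiograph label lists into a subjects x categories count table."""
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every radiograph needs the same number (>= 2) of ratings")
    return [[Counter(row)[c] for c in categories] for row in ratings], n_raters


def observed_agreement(ratings, categories=CATEGORIES):
    """Mean proportion of agreeing rater pairs per radiograph (P-bar in Fleiss's notation)."""
    counts, n = _count_table(ratings, categories)
    per_subject = [(sum(k * k for k in row) - n) / (n * (n - 1)) for row in counts]
    return sum(per_subject) / len(per_subject)


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss's kappa: observed agreement corrected for chance agreement."""
    counts, n = _count_table(ratings, categories)
    total = len(counts) * n
    # Share of all ratings that fall into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)          # agreement expected by chance
    p_bar = observed_agreement(ratings, categories)
    if p_e == 1.0:                          # all ratings in one category: kappa undefined
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical toy data: 4 radiographs, 3 raters each (not study data).
    toy = [
        ["consistent", "consistent", "equivocal"],
        ["equivocal", "inconsistent", "equivocal"],
        ["inconsistent", "inconsistent", "inconsistent"],
        ["consistent", "equivocal", "inconsistent"],
    ]
    print("observed agreement:", observed_agreement(toy))
    print("Fleiss's kappa:", fleiss_kappa(toy))
```

Kappa near 0 means agreement is close to what chance alone would produce, which is why values such as 0.133 and 0.178 are read as poor agreement even when raw agreement is around 0.45.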
Conclusions: The radiographic diagnostic accuracy and inter-rater agreement were poor when the Berlin radiographic definition was used, and were not significantly improved by the training set of chest radiographs developed by the ARDS Definition Task Force.
Key words: ARDS; CARE
Cite this article: . . 华西虚拟期刊, 2000, 1(1): -. doi: 10.1186/s13054-017-1606-4