Faculté des lettres et sciences humaines

Reproducibility of Vertebral Fracture Assessment Readings From Dual-energy X-ray Absorptiometry in Both a Population-based and Clinical Cohort: Cohen's and Uniform Kappa

Aubry-Rozier, Bérengère ; Chapurlat, Roland ; Duboeuf, François ; Iglesias, Katia ; Krieg, Marc-Antoine ; Lamy, Olivier ; Burnand, Bernard ; Hans, Didier

In: Journal of Clinical Densitometry, 2015, vol. 18, no. 2, p. 233-238

    Summary
    Vertebral fracture assessments (VFAs) using dual-energy X-ray absorptiometry increase vertebral fracture detection in clinical practice and are highly reproducible. Measures of reproducibility are dependent on the frequency and distribution of the event. The aim of this study was to compare 2 reproducibility measures, reliability and agreement, in VFA readings in both a population-based and a clinical cohort. We measured agreement and reliability by uniform kappa and Cohen's kappa for vertebral reading and fracture identification: 360 VFAs from a population-based cohort and 85 from a clinical cohort. In the population-based cohort, 12% of vertebrae were unreadable. Vertebral fracture prevalence ranged from 3% to 4%. Inter-reader and intrareader reliability with Cohen's kappa was fair to good (0.35–0.71 and 0.36–0.74, respectively), with good inter-reader and intrareader agreement by uniform kappa (0.74–0.98 and 0.76–0.99, respectively). In the clinical cohort, 15% of vertebrae were unreadable, and vertebral fracture prevalence ranged from 7.6% to 8.1%. Inter-reader reliability was moderate to good (0.43–0.71), and the agreement was good (0.68–0.91). In clinical situations, the levels of reproducibility measured by the 2 kappa statistics are concordant, so that either could be used to measure agreement and reliability. However, if events are rare, as in a population-based cohort, we recommend evaluating reproducibility using the uniform kappa, as Cohen's kappa may be less accurate.
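    As a rough illustration of why the two statistics can diverge when events are rare, the sketch below computes both from a 2x2 table of paired readings. It assumes that uniform kappa sets chance agreement to 1/k for k categories (the record does not give the formula), and the counts are hypothetical, not taken from the study. The function names are illustrative only.

```python
def _observed_agreement(table):
    """Return (proportion of items on the diagonal, total number of items)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_o = sum(table[i][i] for i in range(k)) / n
    return p_o, n


def cohen_kappa(table):
    """Cohen's kappa: chance agreement is derived from each reader's marginals.

    table[i][j] = number of vertebrae rated category i by reader A
    and category j by reader B. Undefined (division by zero) when
    expected chance agreement equals 1.
    """
    p_o, n = _observed_agreement(table)
    k = len(table)
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_e = sum(row[i] * col[i] for i in range(k))
    return (p_o - p_e) / (1 - p_e)


def uniform_kappa(table):
    """Uniform kappa (assumed definition): chance agreement fixed at 1/k."""
    p_o, _ = _observed_agreement(table)
    p_e = 1 / len(table)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical readings of 100 vertebrae, categories: [fractured, not fractured].
# Both readers call about 7% of vertebrae fractured, so the event is rare.
rare = [[4, 3],
        [3, 90]]

print(f"Cohen's kappa: {cohen_kappa(rare):.2f}")
print(f"Uniform kappa: {uniform_kappa(rare):.2f}")
```

    With these made-up counts the readers agree on 94 of 100 vertebrae, yet Cohen's kappa is much lower than uniform kappa because the skewed marginals push its expected chance agreement close to 1.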