Inter-observer reliability of preoperative cardiopulmonary exercise test interpretation: a cross-sectional study.

Abbott TEF, Gooneratne M, McNeill J, Lee A, Levett DZH, Grocott
MPW, Swart M, MacDonald N; ARCTIC study investigators.

Br J Anaesth. 2018 Mar;120(3):475-483. doi: 10.1016/j.bja.2017.11.071. Epub 2017
Nov 29.

BACKGROUND: Despite the increasing importance of cardiopulmonary exercise testing
(CPET) for preoperative risk assessment, the reliability of CPET interpretation
is unclear. We aimed to assess inter-observer reliability of preoperative CPET.
METHODS: We conducted a prospective, multi-centre, observational study of
preoperative CPET interpretation. Participants were professionals with previous
experience or training in CPET, assessed by a standardized questionnaire. Each
participant interpreted 100 tests using standardized software. The CPET variables
of interest were oxygen consumption at the anaerobic threshold (AT) and peak
oxygen consumption (VO2 peak). Inter-observer reliability was measured using the
intra-class correlation coefficient (ICC) with a random effects model. Results
are presented as ICC with 95% confidence interval, where an ICC of 1 represents
perfect agreement and an ICC of 0 represents no agreement.
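
As an illustration of how an ICC of this kind can be computed, the following is a minimal sketch only. It assumes a two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1); the abstract states only that a random effects model was used, so the exact form, the function name icc_two_way_random, and the example values are all assumptions and not the study's method or data.

```python
def icc_two_way_random(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per test, one value per observer.
    Assumes a complete table (no missing values). Illustrative only;
    not necessarily the model used in the study.
    """
    n = len(ratings)       # number of tests (subjects)
    k = len(ratings[0])    # number of observers (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for a two-way layout without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-test mean square
    msc = ss_cols / (k - 1)              # between-observer mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical AT values (ml/kg/min): 4 tests, each read by 3 observers
example = [
    [10.5, 11.0, 10.8],
    [14.2, 13.9, 14.5],
    [9.1, 9.6, 9.0],
    [12.3, 12.0, 12.8],
]
print(round(icc_two_way_random(example), 2))
```

Under this form, systematic differences between observers lower the ICC as well as random disagreement, which suits a question about whether different readers report the same numerical value for the same test.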
RESULTS: The 28 participants comprised 8 (28.6%) clinical physiologists, 10 (35.7%)
junior doctors, and 10 (35.7%) consultant doctors. The median previous experience
was 140 (inter-quartile range 55-700) CPETs. After excluding the first 10 tests
(acclimatization) for each participant and missing data, the primary analysis of
AT and VO2 peak included 2125 and 2414 tests, respectively. Inter-observer
agreement for numerical values of AT [ICC 0.83 (0.75-0.90)] and VO2 peak [ICC
0.88 (0.84-0.92)] was good. In a post hoc analysis, inter-observer agreement for
identification of the presence of a reportable AT was excellent [ICC 0.93
(0.91-0.95)], whereas agreement for a reportable VO2 peak was moderate [ICC 0.73
(0.64-0.80)].
CONCLUSIONS: Inter-observer reliability of interpretation of numerical values of
two commonly used CPET variables was good (ICC >0.80). However, inter-observer
agreement regarding the presence of a reportable value was less consistent.