Inter-observer reliability of preoperative cardiopulmonary exercise test interpretation: a cross-sectional study.

Abbott TEF; William Harvey Research Institute, Queen Mary University of London, London, UK; Barts Health NHS Trust, London, UK. Electronic address: t.abbott@qmul.ac.uk.
Gooneratne M; Barts Health NHS Trust, London, UK.
McNeill J; Barts Health NHS Trust, London, UK.
Lee A; William Harvey Research Institute, Queen Mary University of London, London, UK.
Levett DZH; Critical Care Research Group, Southampton NIHR Biomedical Research Centre, University Hospital Southampton-University of Southampton, Southampton, UK.
Grocott MPW; Critical Care Research Group, Southampton NIHR Biomedical Research Centre, University Hospital Southampton-University of Southampton, Southampton, UK.
Swart M; South Devon Healthcare NHS Trust, Torbay, UK.
MacDonald N; Barts Health NHS Trust, London, UK.

British Journal of Anaesthesia [Br J Anaesth] 2018 Mar; Vol. 120 (3), pp. 475-483. Date of Electronic Publication: 2017 Nov 29.

Background: Despite the increasing importance of cardiopulmonary exercise testing (CPET) for preoperative risk assessment, the reliability of CPET interpretation is unclear. We aimed to assess inter-observer reliability of preoperative CPET.
Methods: We conducted a prospective, multi-centre, observational study of preoperative CPET interpretation. Participants were professionals with previous experience or training in CPET, which was assessed by a standardized questionnaire. Each participant interpreted 100 tests using standardized software. The CPET variables of interest were oxygen consumption at the anaerobic threshold (AT) and peak oxygen consumption (VO2 peak). Inter-observer reliability was measured using the intra-class correlation coefficient (ICC) with a random effects model. Results are presented as ICC with 95% confidence interval, where an ICC of 1 represents perfect agreement and an ICC of 0 represents no agreement.
Results: The 28 participants included 8 (28.6%) clinical physiologists, 10 (35.7%) junior doctors, and 10 (35.7%) consultant doctors. The median previous experience was 140 (inter-quartile range 55-700) CPETs. After excluding the first 10 tests (acclimatization) for each participant and missing data, the primary analysis of AT and VO2 peak included 2125 and 2414 tests, respectively. Inter-observer agreement for numerical values of AT [ICC 0.83 (0.75-0.90)] and VO2 peak [ICC 0.88 (0.84-0.92)] was good. In a post hoc analysis, inter-observer agreement for identification of the presence of a reportable AT was excellent [ICC 0.93 (0.91-0.95)], whereas agreement for the presence of a reportable VO2 peak was moderate [ICC 0.73 (0.64-0.80)].
Conclusions: Inter-observer reliability of interpretation of numerical values of two commonly used CPET variables was good (ICC >0.80). However, inter-observer agreement regarding the presence of a reportable value was less consistent.
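
Illustration: The Methods describe an ICC from a random effects model but do not state which ICC form was used. As a minimal sketch, the Python below assumes the two-way random effects, absolute agreement, single-observer form, often written ICC(2,1). The function name, the complete-data requirement, and the toy values are assumptions for illustration only; they are not the study's code or data.

```python
def icc_two_way_random(ratings):
    """Return ICC(2,1) for a complete tests-by-observers table.

    ratings[i][j] is the value observer j reported for test i.
    Assumption: no missing values (the study itself excluded missing data).
    """
    n = len(ratings)        # number of tests (subjects)
    k = len(ratings[0])     # number of observers (raters)
    if n < 2 or k < 2:
        raise ValueError("need at least 2 tests and 2 observers")
    if any(len(row) != k for row in ratings):
        raise ValueError("every test needs one value per observer")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    test_means = [sum(row) / k for row in ratings]
    observer_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_tests = k * sum((m - grand_mean) ** 2 for m in test_means)
    ss_observers = n * sum((m - grand_mean) ** 2 for m in observer_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_tests - ss_observers

    # Mean squares
    ms_tests = ss_tests / (n - 1)
    ms_observers = ss_observers / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Systematic differences between observers count against agreement
    return (ms_tests - ms_error) / (
        ms_tests + (k - 1) * ms_error + k * (ms_observers - ms_error) / n
    )


# Toy example: three observers reporting a value for four tests
# (invented numbers, not taken from the study).
toy = [
    [10.5, 11.0, 10.8],
    [14.2, 13.9, 14.5],
    [9.1, 9.6, 9.0],
    [12.3, 12.0, 12.8],
]
print(round(icc_two_way_random(toy), 2))
```

When every observer reports the same value for each test, the error and observer mean squares are zero and the function returns 1, matching the interpretation of perfect agreement given in the Methods. The sketch does not compute the 95% confidence interval reported in the Results.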