
Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data

Venn Diagram showing overlap of code classes among patients with the 4CE Severe Phenotype

Abstract

Objective
The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity.

Materials and Methods
Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site.
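
To illustrate how a code-class phenotype of this kind can be computed, here is a minimal sketch in Python. It is not the 4CE implementation: the class names and codes are hypothetical placeholders (the abstract does not list the 6 classes), and the rule that a patient is flagged as severe when any code from any class is present is an assumption made for the example.

```python
from typing import Dict, Set

# Hypothetical placeholder classes and codes, NOT the actual 4CE definitions.
CODE_CLASSES: Dict[str, Set[str]] = {
    "class_a": {"A1", "A2"},
    "class_b": {"B1"},
    "class_c": {"C1", "C2"},
    "class_d": {"D1"},
    "class_e": {"E1"},
    "class_f": {"F1", "F2"},
}


def matched_classes(patient_codes: Set[str]) -> Set[str]:
    """Return the names of the code classes with at least one code in the record."""
    return {name for name, codes in CODE_CLASSES.items() if codes & patient_codes}


def is_severe(patient_codes: Set[str]) -> bool:
    """Assumed rule: a patient is flagged severe if any code class matches."""
    return bool(matched_classes(patient_codes))


# Example records for one hospitalization each (codes are made up).
record_1 = {"B1", "F2", "X9"}   # matches class_b and class_f
record_2 = {"X9", "Y3"}         # matches no class

print(sorted(matched_classes(record_1)), is_severe(record_1))
print(sorted(matched_classes(record_2)), is_severe(record_2))
```

Keeping the per-class matches, rather than only the final flag, makes it possible to study how individual code classes behave at each site.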
        
        
Results
The full 4CE severity phenotype had a pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability (up to 0.65 across sites). At one pilot site, the expert-derived phenotype had a mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with precision and recall as low as 49% compared with chart review.
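
For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity are obtained when a binary phenotype flag is compared with a binary outcome such as ICU admission and/or death. The function names and the toy data are illustrative assumptions; they are not the study's data or code, and the sketch covers a single site rather than a pooled estimate.

```python
from typing import Sequence, Tuple


def confusion_counts(predicted: Sequence[int], observed: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    tp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 1)
    fp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 0)
    tn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 0)
    fn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 1)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of patients with the outcome who were flagged by the phenotype."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of patients without the outcome who were not flagged."""
    return tn / (tn + fp)


# Toy data (made up): 1 = flagged severe / outcome occurred, 0 = otherwise.
phenotype_flag = [1, 1, 0, 0, 1, 0, 0, 0]
icu_or_death = [1, 1, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(phenotype_flag, icu_or_death)
print(f"sensitivity={sensitivity(tp, fn):.2f}, specificity={specificity(tn, fp):.2f}")
```

Precision, reported above for billing codes against chart review, follows the same pattern as tp / (tp + fp); recall is another name for sensitivity.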
        
        
Discussion
We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions.
        
        
Conclusions
We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.

Citation

JG Klann, H Estiri, GM Weber, B Moal, P Avillach, C Hong, ALM Tan, BK Beaulieu-Jones, V Castro, T Maulhardt, A Geva, A Malovini, AM South, S Visweswaran, M Morris, MJ Samayamuthu, GS Omenn, KY Ngiam, KD Mandl, M Boeker, KL Olson, DL Mowery, RW Follett, DA Hanauer, R Bellazzi, JH Moore, NHW Loh, DS Bell, KB Wagholikar, L Chiovato, V Tibollo, S Rieg, ALLJ Li, V Jouhet, E Schriver, Z Xia, M Hutch, Y Luo, IS Kohane, The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) (CONSORTIA AUTHOR), GA Brat, SN Murphy. “Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data”, Journal of the American Medical Informatics Association 28(7):1411-1420 (2021). doi:10.1093/jamia/ocab018