BMC Res Notes. 2023 Sep 11;16(1):208. doi: 10.1186/s13104-023-06469-x.
OBJECTIVE: To assess whether text mining of electronic health record (EHR) data can be used to improve register-based heart failure (HF) subtyping. EHR data of 43,405 individuals from two Finnish hospital biobanks were mined for unstructured text mentions of ejection fraction (EF), and the extracted values were validated against clinical assessment in two sets of 100 randomly selected individuals. Structured laboratory data were then incorporated to categorize individuals by HF subtype (HF with mildly reduced EF, HFmrEF; HF with preserved EF, HFpEF; HF with reduced EF, HFrEF; and no HF).
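As a rough illustration of the text-mining step described above, the following minimal Python sketch pulls an EF percentage out of free text and maps it to an EF range. It is not the study's algorithm: the regular expression, the English keywords (the source records are Finnish), the function names extract_ef and ef_range_label, and the cut-offs (EF of 40% or less for HFrEF, above 40% and below 50% for HFmrEF, 50% or more for HFpEF, following common guideline definitions) are all assumptions made for illustration. The laboratory-data step and the "no HF" category are deliberately left out.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not the study's rule set):
# a keyword, a short gap, then a value or range followed by a percent sign.
EF_PATTERN = re.compile(
    r"(?:\bLVEF\b|\bEF\b|\bejection\s+fraction\b)"  # keyword
    r"[^\d%]{0,20}?"                                 # short gap such as ": " or " was "
    r"(\d{1,2})(?:\s*-\s*(\d{1,2}))?\s*%",           # value or range, then "%"
    re.IGNORECASE,
)


def extract_ef(text: str) -> Optional[float]:
    """Return the first EF mention in percent, or None if none is found.

    A range such as "40-45%" is reduced to its midpoint (an assumption).
    """
    match = EF_PATTERN.search(text)
    if match is None:
        return None
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    return (low + high) / 2


def ef_range_label(ef: float) -> str:
    """Map an EF value to the subtype whose EF range it falls in.

    Cut-offs are assumed from common guideline definitions. EF alone does
    not establish HF, so this labels the EF range only.
    """
    if ef <= 40:
        return "HFrEF"
    if ef < 50:
        return "HFmrEF"
    return "HFpEF"


if __name__ == "__main__":
    for note in ["LVEF 35%", "ejection fraction was 40-45 %", "no echo performed"]:
        ef = extract_ef(note)
        print(note, "->", ef, ef_range_label(ef) if ef is not None else "no EF found")
```

A production pipeline would also need to handle negations, historical values, multiple mentions per record, and language-specific phrasing, none of which this sketch attempts.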
RESULTS: In 86% of the cases, the algorithm-identified EF belonged to the correct HF subtype range. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the algorithm ranged from 94% to 100% for HFrEF and from 85% to 100% for HFmrEF, and were 96%, 67%, 53% and 98%, respectively, for HFpEF. Survival analyses using the traditional diagnosis of HF were in concordance with the algorithm-based ones. Compared to healthy individuals, mortality increased from the HFmrEF group (hazard ratio [HR], 1.91; 95% confidence interval [CI], 1.24-2.95) to the HFpEF group (HR, 2.28; 95% CI, 1.80-2.88) to the HFrEF group (HR, 2.63; 95% CI, 1.97-3.50) over a follow-up of 1.5 years. We conclude that quantitative EF data can be efficiently extracted from EHRs and used with laboratory data to subtype HF with reasonable accuracy, especially for HFrEF.
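For readers less familiar with the four validation measures reported above, this short Python sketch shows how they follow from a 2x2 confusion table for one subtype versus all others. The function name diagnostic_metrics and the counts in the usage example are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, PPV and NPV from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for one subtype against the clinical assessment.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the algorithm finds
        "specificity": tn / (tn + fp),  # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # share of algorithm positives that are true
        "npv": tn / (tn + fn),          # share of algorithm negatives that are true
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data).
    metrics = diagnostic_metrics(tp=8, fp=2, fn=1, tn=89)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

A low PPV alongside a high NPV, as reported for HFpEF, arises when false positives are numerous relative to true positives while false negatives remain rare.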