medRxiv. 2023 Sep 5:2023.09.05.23295061. doi: 10.1101/2023.09.05.23295061. Preprint.


OBJECTIVE: The study aimed to develop and validate algorithms for identifying people with type 1 and type 2 diabetes in the All of Us Research Program (AoU) cohort, using electronic health record (EHR) and survey data.

RESEARCH DESIGN AND METHODS: Two sets of algorithms were developed, one using only EHR data (EHR), and the other using a combination of EHR and survey data (EHR+). Their performance was evaluated by testing their association with polygenic scores for both type 1 and type 2 diabetes.
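
The validation idea above, checking how strongly a case definition tracks a polygenic score, can be illustrated with a minimal sketch. It assumes that the association is summarized as an area under the ROC curve (AUC), which is one plausible reading given that the results report a DeLong test; the abstract does not specify the exact model. The function name `auc_from_scores`, the scores, and the case labels are all hypothetical toy values, not AoU data.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], is_case: Sequence[bool]) -> float:
    """Rank-based AUC: probability that a randomly chosen case has a higher
    score than a randomly chosen control (ties count as 0.5)."""
    case_scores = [s for s, c in zip(scores, is_case) if c]
    control_scores = [s for s, c in zip(scores, is_case) if not c]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for cs in case_scores:
        for ct in control_scores:
            if cs > ct:
                wins += 1.0
            elif cs == ct:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical polygenic scores for eight participants (toy values).
polygenic_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5]

# Hypothetical case labels from two competing definitions (assumption:
# the EHR+ definition flags one extra participant as a case).
ehr_only_cases = [False, False, False, True, True, False, True, False]
ehr_plus_cases = [False, True, False, True, True, False, True, False]

auc_ehr_only = auc_from_scores(polygenic_scores, ehr_only_cases)
auc_ehr_plus = auc_from_scores(polygenic_scores, ehr_plus_cases)

print(f"EHR-only AUC: {auc_ehr_only:.4f}")
print(f"EHR+ AUC:     {auc_ehr_plus:.4f}")
```

A definition whose cases are better separated from controls by the polygenic score yields a higher AUC. Deciding whether two such AUCs differ significantly requires a formal test such as DeLong's, which this sketch does not implement.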

RESULTS: For type 1 diabetes, the EHR-only algorithm showed a stronger association with the T1D polygenic score (p = 3×10⁻⁵) than the EHR+ algorithm. For type 2 diabetes, the EHR+ algorithm outperformed both the EHR-only algorithm and the existing AoU definition, identifying additional cases (25.79% and 22.57% more, respectively) and showing a stronger association with the T2D polygenic score (DeLong p = 0.03 and 1×10⁻⁴, respectively).

CONCLUSIONS: We provide new validated definitions of type 1 and type 2 diabetes in AoU, and make them available for researchers. These algorithms, by ensuring consistent diabetes definitions, pave the way for high-quality diabetes research and future clinical discoveries.

ARTICLE HIGHLIGHTS:
a. Why did we undertake this study? This study was conducted to develop and validate algorithms for identifying type 1 and type 2 diabetes cases in the All of Us Research Program (AoU).
b. What is the specific question(s) we wanted to answer? Can accurate algorithms for type 1 and type 2 diabetes identification be developed and validated using AoU cohort electronic health record (EHR) and survey data? Do the identified diabetes cases show association with polygenic scores in diverse populations?
c. What did we find? We developed a new validated type 1 diabetes definition and expanded upon the existing type 2 diabetes definition.
d. What are the implications of our findings? The developed algorithms can be universally implemented in AoU for identifying study participants for well-defined case-control diabetes studies.

PMID:37732265 | PMC:PMC10508798 | DOI:10.1101/2023.09.05.23295061