Polygenic Scores Help Reduce Racial Disparities in Predictive Accuracy of Automated Type 1 Diabetes Classification Algorithms
Automated algorithms that identify individuals with type 1 diabetes from electronic health records are increasingly used in biomedical research, but it is not known whether their accuracy differs by self-reported race. This manuscript by CGM investigators Miriam Udler and Jose Florez, CGM associate member Alisa Manning, and colleagues investigates whether polygenic scores improve identification of individuals with type 1 diabetes. Using two large hospital-based biobanks (Mass General Brigham [MGB] and BioMe), the group analyzed an established automated algorithm for identifying type 1 diabetes and compared it to two published polygenic scores for type 1 diabetes. Importantly, the automated algorithm was more likely to incorrectly assign a diagnosis of type 1 diabetes to self-reported non-White individuals than to self-reported White individuals. When polygenic scores were added to the algorithm in the MGB Biobank, its positive predictive value increased from 70% to 97% for self-reported White individuals (meaning that 97% of those predicted to have type 1 diabetes indeed had type 1 diabetes) and from 53% to 100% for self-reported non-White individuals. Similar results were found in BioMe. This work highlights the inherent problems with automated phenotyping algorithms and the risk that they exacerbate health disparities by more often misclassifying individuals from underrepresented populations. Polygenic scores may be used to improve the performance of phenotyping algorithms and potentially reduce this disparity.
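
For readers less familiar with the metric, positive predictive value (PPV) is the number of true positives divided by the total number of individuals the algorithm labels as positive. The short Python sketch below illustrates only that arithmetic. The function name and the counts are hypothetical, chosen for illustration, and are not taken from the study.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return PPV = TP / (TP + FP) for a binary classifier."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("PPV is undefined when no one is predicted positive")
    return true_positives / predicted_positive


# Hypothetical counts (NOT the study's data): of 100 people the algorithm
# labels as having type 1 diabetes, 97 are confirmed and 3 are not.
ppv = positive_predictive_value(true_positives=97, false_positives=3)
print(f"{ppv:.0%}")  # prints "97%"
```

A higher PPV means fewer people are incorrectly assigned a type 1 diabetes diagnosis, which is why a gap in PPV between groups translates directly into unequal misclassification.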