bioRxiv [Preprint]. 2024 Apr 9:2024.04.03.587209. doi: 10.1101/2024.04.03.587209.

ABSTRACT

Genomic scientists have long been promised cheaper DNA sequencing, but deep whole genome sequencing (WGS) remains costly, especially for large cohorts in population-level studies. More affordable options include microarrays + imputation, whole exome sequencing (WES), or low-pass WGS + imputation. WES + array + imputation has recently been shown to yield 99% of the association signals detected by WGS. However, no existing method avoids both the ascertainment biases of arrays and the need to merge different data types while still benefiting from deeper exome coverage to enhance novel coding variant detection. We developed a new, combined "Blended Genome Exome" (BGE) in which a whole genome library is generated, an aliquot of that library is amplified by PCR, the exome regions are selected and enriched, and the genome and exome libraries are combined back into a single tube for sequencing (33% exome, 67% genome). This yields a single CRAM containing a low-coverage whole genome (2-3x) combined with a higher-coverage exome (30-40x). BGE data can be used both for imputing common variants throughout the genome and for calling rare coding variants. We tested this new method and observed >99% r² concordance between imputed BGE data and existing 30x WGS data for exome and genome variants. BGE can serve as a useful and cost-efficient alternative sequencing product for genomic researchers, requiring ten-fold less sequencing than 30x WGS without the need for complicated harmonization of array and sequencing data.
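
The relationship between the 33%/67% blend, the ten-fold reduction in sequencing, and the resulting depths can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' method. The genome size, exome target size, and capture efficiency are illustrative placeholders that do not come from the study, and `blended_coverage` is a hypothetical helper introduced only for this example.

```python
def blended_coverage(total_bases, exome_fraction, genome_size,
                     target_size, capture_efficiency):
    """Estimate mean (genome_depth, exome_depth) for a blended library.

    capture_efficiency is an assumed fraction of exome-library bases that
    end up as usable coverage on the exome targets.
    """
    genome_bases = total_bases * (1.0 - exome_fraction)
    exome_bases = total_bases * exome_fraction
    genome_depth = genome_bases / genome_size
    # Exome targets receive the captured reads plus the genome-wide background.
    exome_depth = exome_bases * capture_efficiency / target_size + genome_depth
    return genome_depth, exome_depth


# Assumed values for illustration only (not reported in the abstract).
GENOME_SIZE = 3.1e9         # approximate human genome size in bases
TARGET_SIZE = 37e6          # placeholder exome target size in bases
CAPTURE_EFFICIENCY = 0.45   # placeholder usable on-target fraction

# From the abstract: ten-fold less sequencing than 30x WGS, 33% exome library.
wgs_30x_bases = 30 * GENOME_SIZE
bge_bases = wgs_30x_bases / 10

genome_depth, exome_depth = blended_coverage(
    bge_bases, 0.33, GENOME_SIZE, TARGET_SIZE, CAPTURE_EFFICIENCY
)
print(f"genome depth ~{genome_depth:.1f}x, exome depth ~{exome_depth:.1f}x")
```

Under these assumed inputs the estimates land near the 2-3x genome and 30-40x exome depths described above. The exome figure is sensitive to the placeholder target size and capture efficiency, so it should be read as a consistency check rather than a prediction.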

PMID:38645052 | PMC:PMC11030253 | DOI:10.1101/2024.04.03.587209