2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 8th, 2025
Version: 2
Broad Institute of MIT and Harvard
genomics
biorxiv

All of Us diversity and scale improve polygenic prediction contextually with greatest improvements for underrepresented populations

Tsuo, K. • Shi, Z. • Ge, T. • Mandla, R. • Hou, K. • Ding, Y. • Pasaniuc, B. • Wang, Y. • Martin, A. R.

Recent studies have demonstrated that polygenic risk scores (PRS) trained on multi-ancestry data can improve prediction accuracy in groups historically underrepresented in genomic studies, but linked health and genetic data from large-scale cohorts that represent a wide spectrum of human diversity remain limited. To address this need, the All of Us Research Program (AoU) generated whole-genome sequences of 245,388 individuals (release v7) who collectively reflect the diversity of the USA. Leveraging this resource and another widely used population-scale biobank, the UK Biobank (UKB), with half a million participants, we developed PRS trained on multi-ancestry and multi-biobank data with up to ~750,000 participants for 32 common, complex traits and diseases across a range of genetic architectures. We then evaluated the effects of ancestry, PRS methodology, and genetic architecture on PRS accuracy in a held-out subset of ancestrally diverse AoU participants. Overall, we found that the increased diversity of AoU significantly improved PRS performance in some participants in AoU, especially underrepresented individuals, across multiple phenotypes. Notably, maximizing sample size by combining discovery data across AoU and UKB was not the optimal approach for predicting some phenotypes, particularly in African ancestry populations; rather, using data from only AoU for these traits resulted in the greatest accuracy. This was especially true for less polygenic traits with large ancestry-enriched effects and larger heritability estimates in African ancestry populations, such as neutrophil count (R²: 0.055 using AoU alone vs. 0.035 using the cross-biobank meta-analysis, for example because of DARC). Lastly, we calculated individual-level PRS accuracies rather than grouping by continental ancestry, a critical step towards interpretability in precision medicine. Individualized PRS accuracy decayed linearly as a function of ancestry divergence, but the slope was smaller using multi-ancestry GWAS than using European GWAS. Our results highlight the potential of biobanks with more balanced representations of human diversity to facilitate more accurate PRS for the individuals least represented in genomic studies.
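The quantities named in the abstract (a PRS, its R² against a phenotype, and the slope of accuracy against ancestry divergence) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the effect-size weights, the genotype dosages, and the accuracy values below are all invented for illustration, and a real analysis would use a dedicated PRS method and adjust for covariates.

```python
from math import fsum


def prs(dosages, weights):
    """Polygenic score for one individual: weighted sum of allele dosages."""
    return fsum(d * w for d, w in zip(dosages, weights))


def _centered_sums(x, y):
    """Return (sxy, sxx, syy, mean_x, mean_y) for two equal-length sequences."""
    n = len(x)
    mx, my = fsum(x) / n, fsum(y) / n
    sxy = fsum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = fsum((a - mx) ** 2 for a in x)
    syy = fsum((b - my) ** 2 for b in y)
    return sxy, sxx, syy, mx, my


def r_squared(x, y):
    """Squared Pearson correlation between scores and phenotype values."""
    sxy, sxx, syy, _, _ = _centered_sums(x, y)
    return (sxy * sxy) / (sxx * syy)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    sxy, sxx, _, mx, my = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical per-variant effect sizes and allele dosages (0, 1, or 2).
weights = [0.2, -0.1, 0.3]
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
scores = [prs(g, weights) for g in genotypes]

# Hypothetical phenotype, constructed as exactly twice the score.
phenotypes = [1.0, 0.2, 1.4, 0.8]
accuracy = r_squared(scores, phenotypes)

# Hypothetical accuracy at increasing ancestry divergence from the
# training data, for two made-up sets of GWAS weights.
distance = [0.0, 0.1, 0.2, 0.3]
acc_single = [0.10, 0.08, 0.06, 0.04]
acc_multi = [0.10, 0.09, 0.08, 0.07]
slope_single, _ = fit_line(distance, acc_single)
slope_multi, _ = fit_line(distance, acc_multi)
```

In this toy setup both fitted slopes are negative and the second is shallower, which is the shape of comparison the abstract describes; the numbers themselves carry no empirical meaning.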
