2025 Hyper Recent • CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 23rd, 2025
Version: 1
University of California San Francisco
genomics
biorxiv

A catalog of ancient proxies for modern genetic variants

Brand, C. M. • Capra, J. A.

The ability to observe the genomes of past human populations using ancient DNA provides an extraordinary perspective on many fundamental questions in human genetics, including the evolutionary history of variants that underlie human disease and other phenotypes. However, ancient DNA is often damaged and degraded, yielding low coverage of most nucleotides. Further, many publicly available genotypes for ancient humans are limited to ~1.23 million specific loci. Variants of interest therefore often fall outside these positions, limiting the ability of ancient DNA to shed light on many loci. Here, we address this challenge by quantifying linkage disequilibrium (LD) between modern variants and ancient genotyped variants (AGVs) to generate a catalog that enables rapid identification of proxy variants. We identified 260,732,675 pairs of AGVs and modern variants at a minimum LD threshold of R² ≥ 0.2. Even at R² ≥ 0.9, at least 60% of common variants were linked to an AGV in non-African ancestry groups, as were 34% of common variants in Africans. We evaluated the accuracy of the genotypes inferred from proxy variants in two high-coverage ancient genomes, finding that more than 90% of genotypes were correctly predicted, even in a 45,000-year-old individual. We also find that AGVs are significantly older than expected and that many are likely evolving neutrally. We integrate these results into a database that researchers can easily query to identify ancient proxy variants if their variant of interest is not directly genotyped in ancient humans.
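The proxy idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes unphased genotype dosages (0, 1, or 2 copies of an allele per individual) and approximates R² as the squared Pearson correlation between two dosage vectors. The function names `r_squared` and `best_proxy`, the variant labels, and the toy genotypes are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors.

    Assumption: dosages are 0/1/2 per individual; this genotype-based
    value only approximates haplotype-based LD R^2.
    """
    if len(x) != len(y) or not x:
        raise ValueError("genotype vectors must be non-empty and equal length")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic variant carries no LD information; treat as unlinked.
        return 0.0
    return cov * cov / (var_x * var_y)


def best_proxy(target, candidates, min_r2=0.2):
    """Return (name, r2) of the candidate AGV most strongly linked to target.

    Returns None when no candidate reaches min_r2. The default of 0.2
    mirrors the minimum threshold quoted in the abstract.
    """
    best = None
    for name, genotypes in candidates.items():
        r2 = r_squared(target, genotypes)
        if r2 >= min_r2 and (best is None or r2 > best[1]):
            best = (name, r2)
    return best


# Hypothetical toy data: one modern variant and three candidate AGVs,
# genotyped in the same eight modern individuals.
modern_variant = [0, 1, 2, 1, 0, 2, 1, 0]
candidate_agvs = {
    "agv_identical": [0, 1, 2, 1, 0, 2, 1, 0],
    "agv_partial": [0, 1, 2, 1, 0, 1, 1, 1],
    "agv_monomorphic": [0, 0, 0, 0, 0, 0, 0, 0],
}

if __name__ == "__main__":
    print(best_proxy(modern_variant, candidate_agvs, min_r2=0.9))
```

Raising `min_r2` from 0.2 to 0.9 trades coverage for accuracy: fewer modern variants find a proxy, but the proxies that remain predict the unobserved genotype more reliably.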
