January 22nd, 2025
Version: 1
Department of Plant Biology, University of Georgia, 2502 Miller Plant Sciences, Athens GA 30602, U.S.A
genomics
biorxiv

Low level contamination confounds population genomic analysis

Genome sequence contamination has a variety of causes and can originate from within or between species. Previous research focused primarily on cross-species contamination or on prokaryotes. This paper visualizes B-allele frequency to test for intra-species contamination, and measures the effects of such contamination on phylogenetic and admixture analyses in two fungal species. Using a standard base calling pipeline, we found that contaminated genomes superficially appeared to produce good-quality genome data. Yet as little as 5-10% genome contamination was enough to change phylogenetic tree topologies and to make contaminated strains appear as hybrids between lineages (genetically admixed). We recommend the use of B-allele frequency plots to screen genome resequencing data for intra-species contamination.
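The screening idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes haploid samples (so an uncontaminated strain should show B-allele frequencies near 0 or 1 at every variant site), it assumes per-site reference and alternate read depths have already been extracted from a variant call file, and the function names and the 0.05/0.95 cut-offs are hypothetical choices made for illustration only.

```python
def b_allele_frequency(ref_depth, alt_depth):
    """Fraction of reads at a site that support the alternate (B) allele."""
    total = ref_depth + alt_depth
    if total == 0:
        return None  # no coverage, so the frequency is undefined
    return alt_depth / total


def baf_histogram(sites, bins=10):
    """Count sites per B-allele-frequency bin; a text stand-in for a plot."""
    counts = [0] * bins
    for ref_depth, alt_depth in sites:
        baf = b_allele_frequency(ref_depth, alt_depth)
        if baf is None:
            continue
        # Clamp a frequency of exactly 1.0 into the last bin.
        index = min(int(baf * bins), bins - 1)
        counts[index] += 1
    return counts


def intermediate_fraction(sites, low=0.05, high=0.95):
    """Share of covered sites whose frequency lies strictly between the cut-offs.

    The cut-offs are assumptions for this sketch, not values from the paper.
    """
    bafs = [b_allele_frequency(r, a) for r, a in sites]
    bafs = [b for b in bafs if b is not None]
    if not bafs:
        return 0.0
    return sum(low < b < high for b in bafs) / len(bafs)


# Hypothetical (ref_depth, alt_depth) pairs, invented for illustration.
clean_sites = [(40, 0), (0, 38), (42, 0), (0, 41)]
mixed_sites = [(36, 4), (4, 37), (45, 0), (3, 35)]

clean_share = intermediate_fraction(clean_sites)
mixed_share = intermediate_fraction(mixed_sites)
clean_hist = baf_histogram(clean_sites)
```

Under these assumptions, a clean haploid sample piles up in the outermost histogram bins, whereas a sample carrying a small share of reads from a second strain shows extra mass near the contamination fraction and its complement (for example, near 0.1 and 0.9). In practice the same frequencies would be plotted along the genome rather than summarized with a single number.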
