2025 Hyper Recent • CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

January 21st, 2025
Version: 4
Department of Biological Sciences, Smith College, Northampton, MA, USA
evolutionary biology
biorxiv

Rethinking large scale phylogenomics with EukPhylo v1.0, a flexible toolkit to enable phylogeny-informed data curation and analyses of diverse eukaryotic lineages

Katz, L. A. • Cote-L'Heureux, A. E. • Leleu, M. • Ani, G. • Gawron, R.

Eukaryotic diversity is largely microbial, with macroscopic lineages (plants, animals and fungi) nesting among a plethora of diverse protists. Understanding of the evolutionary relationships among eukaryotes is rapidly advancing through omics analyses, but phylogenomic analyses are challenging for microeukaryotes, particularly uncultivable lineages, as single-cell sequencing approaches generate a mixture of sequences from hosts, associated microbiomes, and contaminants. Moreover, many analyses of eukaryotic gene families and phylogenies rely on boutique datasets and methods that are challenging for other research groups to replicate. To address these challenges, we present EukPhylo v1.0, a modular, user-friendly pipeline that enables effective data curation through phylogeny-informed contamination removal, estimation of homologous gene families (GFs), and generation of both multiple sequence alignments and gene trees. Analyses can use a hook database of ~15k ancient GFs, or users can easily replace this hook with a set of gene families of interest. We demonstrate the power of EukPhylo, including a suite of stand-alone utilities, through analyses of 500 conserved GFs sampled from 1,000 diverse species of eukaryotes, bacteria and archaea. We show improvements in estimates of the eukaryotic tree of life, recovering clades that are well established in the literature, through successive rounds of curation using the EukPhylo contamination loop. The final trees corroborate numerous hypotheses in the literature (e.g. Opisthokonta, Rhizaria, Amoebozoa) while challenging others (e.g. CRuMs, Obazoa, Diaphoretickes). We believe that the flexibility and transparency of EukPhylo set standards for curation of omics data for future studies.
