2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 7th, 2025
Version: 1
CSIRO
systems biology
biorxiv

EMMAi: fast enzyme-allocation constraints in GEMs for improved biomass prediction across carbon sources

Molina Ortiz, J. P. • Benson, D. • Watts, J. • Velasque, M. • Warden, A. • Morgan, M.

Genome-scale metabolic models (GEMs) predict emergent phenotypes by modeling the metabolic networks encoded in genomes. While GEMs have significantly advanced systems biology, metabolic engineering, biomedicine, and environmental science, they require extensive time and resources for manual curation, which can limit their utility in rapidly evolving research landscapes. Recent findings suggest that manually curated reactions can sometimes reduce prediction accuracy, indicating that integrating additional biologically grounded constraints may better capture emergent phenotypes. One promising approach is the incorporation of enzyme allocation constraints, which has been shown to enhance the predictive accuracy of metabolic models. Enzymatically constrained GEMs (ecGEMs) use enzyme turnover rates (kcat) and protein molecular weights (MWs) to account for intracellular resource limitations: they introduce an enzyme pool variable and assign each reaction an enzyme cost, thereby simulating enzymatic resource constraints. Tools such as GECKO, AutoPACMEN, and ECMpy provide computational pipelines for ecGEM generation. However, these pipelines are often limited by their reliance on experimentally measured kcat values or on deep learning-predicted values, such as those generated by DLKcat, which struggles to predict kinetics for enzymes dissimilar to its training data. Additionally, these methods frequently require extensive manual curation of kcat values based on empirical data, a time-intensive process that hampers scalability and applicability to non-model organisms. To address these limitations, we introduce EMMAi (Enzyme-constrained Metabolic Models with AI), a pipeline that fully automates the incorporation of enzyme constraints into GEMs. Unlike existing pipelines, EMMAi exclusively utilizes kcat values predicted by UniKP, an AI framework with improved accuracy over DLKcat, particularly for enzymes not present in training datasets.
UniKP achieves a 13% improvement in correlation for unseen enzymes, enabling EMMAi to deliver ecGEMs with enhanced prediction accuracy without manual curation requirements. We evaluated EMMAi by applying it to three GEMs: two manually curated models, iJO1366 (Escherichia coli str. K-12 substr. MG1655) and iMO1056 (Pseudomonas aeruginosa PAO1), and one draft GEM constructed and gap-filled using CarveMe. EMMAi-generated ecGEMs showed an average Pearson Correlation Coefficient (PCC) improvement of 0.27 for manually curated GEMs when compared to predicted and experimentally measured growth rates and Biolog readings. Notably, for the draft GEM of Pseudomonas aeruginosa PAO1, the PCC improved dramatically from -0.3 to 0.6. EMMAi demonstrates that automating the integration of enzyme allocation constraints using AI-predicted kinetic parameters significantly enhances the prediction accuracy of GEMs, even in the absence of manual curation. These results underscore EMMAi's potential as a scalable, efficient, and accurate tool for advancing GEM-based research in systems biology, metabolic engineering, and beyond.
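
The enzyme pool idea described in the abstract can be illustrated with a toy example. The sketch below is not EMMAi's implementation. It assumes a hypothetical two-pathway network in which each flux v_i costs MW_i / kcat_i units of protein and the total cost must stay within a fixed pool. All function names, yields, molecular weights, kcat values, and the pool size are illustrative assumptions, and the two-variable problem is solved by checking candidate vertices rather than with a real LP solver.

```python
def enzyme_cost(mw_kda, kcat_per_s):
    """Protein mass (g/gDW) needed per unit flux (mmol/gDW/h).

    1 kDa equals 1 g/mmol, and kcat is converted from 1/s to 1/h.
    """
    return mw_kda / (kcat_per_s * 3600.0)


def solve_toy_ec_model(yields, costs, uptake_max, pool):
    """Maximize y1*v1 + y2*v2 subject to
    v1 + v2 <= uptake_max, c1*v1 + c2*v2 <= pool, v1 >= 0, v2 >= 0.

    The optimum of a two-variable LP lies at a vertex of the feasible
    region, so we enumerate candidate vertices and keep the best one.
    """
    y1, y2 = yields
    c1, c2 = costs
    candidates = [
        (0.0, 0.0),
        (uptake_max, 0.0),
        (0.0, uptake_max),
        (pool / c1, 0.0),
        (0.0, pool / c2),
    ]
    if c1 != c2:
        # Point where both the uptake and the pool constraints are tight.
        v1 = (pool - c2 * uptake_max) / (c1 - c2)
        candidates.append((v1, uptake_max - v1))
    eps = 1e-9
    feasible = [
        (v1, v2)
        for v1, v2 in candidates
        if v1 >= -eps
        and v2 >= -eps
        and v1 + v2 <= uptake_max + eps
        and c1 * v1 + c2 * v2 <= pool + eps
    ]
    return max(feasible, key=lambda v: y1 * v[0] + y2 * v[1])


def growth(yields, fluxes):
    """Biomass production rate for the given pathway fluxes."""
    return yields[0] * fluxes[0] + yields[1] * fluxes[1]


# Hypothetical pathways: one high-yield but enzyme-expensive,
# one low-yield but enzyme-cheap.
YIELDS = (0.1, 0.03)            # gDW per mmol substrate (assumed)
COSTS = (
    enzyme_cost(100.0, 10.0),   # 100 kDa enzyme, kcat 10 1/s (assumed)
    enzyme_cost(50.0, 100.0),   # 50 kDa enzyme, kcat 100 1/s (assumed)
)

unconstrained = solve_toy_ec_model(YIELDS, COSTS, uptake_max=10.0, pool=1.0)
constrained = solve_toy_ec_model(YIELDS, COSTS, uptake_max=10.0, pool=0.01)

print("loose pool:", unconstrained, growth(YIELDS, unconstrained))
print("tight pool:", constrained, growth(YIELDS, constrained))
```

With a loose pool the toy model routes all flux through the high-yield pathway. With a tight pool part of the flux shifts to the cheaper pathway and predicted growth drops, which is the kind of resource limitation the enzyme pool variable is meant to capture.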
