2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 8th, 2025
Version: 3
Uppsala University and Helmholtz Centre for Infection Research
bioinformatics
biorxiv

FlashFold: a standalone command-line tool for accelerated protein structure and stoichiometry prediction

Saha, C. K. • Roghanian, M. • Häussler, S. • Guy, L.

ABSTRACT

AlphaFold has transformed the decades-old problem of accurately predicting protein structures. However, its high accuracy relies on a computationally intensive step: searching vast databases for sequences homologous to the query protein. Additionally, predicting the quaternary structure of protein complexes requires prior knowledge of subunit counts, a prerequisite rarely met. To address these limitations, we introduce FlashFold, a fast, user-friendly tool for protein structure prediction. It accelerates homology searches using a compact built-in database, enabling structure predictions up to 3-fold faster than AlphaFold3 with little or no loss of accuracy. Unlike other tools, FlashFold features adaptable built-in databases that allow users to easily incorporate their own privately sequenced data, an option that can improve prediction accuracy. Moreover, it allows users to estimate the stoichiometry of protein complexes directly from sequence information. To support high-throughput workflows and streamline downstream decision-making, it generates interactive, filterable summary reports that enable users to efficiently visualize protein structures and interpret large volumes of prediction results. FlashFold runs locally on Linux and Mac, eliminating reliance on third-party servers. FlashFold is available at https://github.com/chayan7/flashfold.
