2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

June 4th, 2025
Version: 2
Bioinformatics Solutions Inc.
bioinformatics
biorxiv

De Novo sequencing-assisted homology search for DIA data analysis enables low abundance peptide variants discovery

Qiao, R. • Li, H. • Bian, H. • Xin, L. • Shan, B.

Data-independent acquisition mass spectrometry (DIA-MS) has emerged as a powerful approach for comprehensive proteome profiling. Spectral library search and library-free search are the two major approaches to DIA data analysis. Spectral library search requires high-quality spectral libraries derived from the search results of data-dependent acquisition (DDA) experiments, while library-free approaches rely on prediction models to generate in silico libraries. Both approaches restrict the search space to the peptides listed in the database, which limits the discovery of variant peptides arising from genetic variations or mutations. We present DIAVariant, a novel computational method designed to identify peptide sequence variants directly and solely from complex DIA spectra while rigorously controlling the false discovery rate. Our experimental results demonstrate that DIAVariant identifies sequence variants previously detected through proteogenomic approaches while maintaining high specificity across multiple datasets. When integrated with existing DIA database search solutions, our approach forms a comprehensive analytical workflow that identifies both peptides represented in reference protein databases and peptides arising from sequence variations not captured in standard databases.
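
To make the search-space limitation concrete, the following is a minimal sketch of why a variant peptide is missed by exact database lookup but can be recovered by a homology comparison against a de novo sequence. It is not DIAVariant's algorithm, which the abstract does not describe; the function names, the example peptides, and the restriction to single amino-acid substitutions of equal length are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical result type: (reference peptide, position, reference residue, observed residue)
Hit = Tuple[str, int, str, str]


def find_single_substitution_hits(de_novo: str, database: List[str]) -> List[Hit]:
    """Return database peptides that differ from `de_novo` by exactly one residue.

    Assumption: only same-length peptides are compared, so insertions and
    deletions are ignored. An exact match is not a variant and is skipped.
    """
    hits: List[Hit] = []
    for reference in database:
        if len(reference) != len(de_novo):
            continue
        # Collect every position where the two sequences disagree.
        diffs = [
            (i, ref_aa, obs_aa)
            for i, (ref_aa, obs_aa) in enumerate(zip(reference, de_novo))
            if ref_aa != obs_aa
        ]
        if len(diffs) == 1:
            position, ref_aa, obs_aa = diffs[0]
            hits.append((reference, position, ref_aa, obs_aa))
    return hits


# Illustrative peptide list standing in for a reference database.
database = ["LVNELTEFAK", "AEFVEVTK"]

# A de novo sequence carrying one substitution is absent from the database...
variant = "LVNEVTEFAK"
print(variant in database)

# ...but a homology comparison links it to its reference peptide.
print(find_single_substitution_hits(variant, database))
```

A real workflow would also need to score the supporting spectra and estimate the false discovery rate of the reported variants, which this sketch leaves out.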
