2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 8th, 2025
Version: 1
School of Artificial Intelligence, Wuhan University
bioinformatics
biorxiv

BioMedTools: a language model-powered community for biomedical computational tools

Liu, S. • Xing, H. • Han, M. • Zhang, D. • Gong, L. • Liu, D. • Chen, J. • Cai, P. • Hu, Q.-N.

The large number of biomedical computational tools has given rise to several tool registries. However, existing registries, which are manually curated or community-driven, struggle to stay up to date as the number of tools grows rapidly, which leaves their tool data inadequate. In this paper, we show that language models (LMs) can aid in building a community of tools. We introduce BioMedTools (https://biomed.tools), a community of biomedical computational tools whose main features are LM-based tool identification and a chat assistant. Compared with existing tool registries, BioMedTools compares favorably in the number of tools, the frequency of data updates, and functionality. In addition, the Model Context Protocol (MCP) server hub in BioMedTools may promote the building of agents in the biomedical field. BioMedTools enables the efficient collection of tools and enhances their findability and accessibility.
