2025 Hyper Recent • CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 8th, 2025
Version: 1
Institut Curie
bioinformatics
biorxiv

Next generation statistical framework for next generation spatial transcriptomics data

Mangane, F. • Bost, P.

The rapid advancement of spatial transcriptomic technologies, particularly in situ hybridization-based methods, has enabled the profiling of gene expression at subcellular resolution across large tissue sections. Commercial platforms such as Xenium and CosMx now routinely generate high-quality datasets of increasing size and complexity. However, existing analytical approaches, often adapted from single-cell genomics, fall short in addressing the specific challenges posed by spatial data, especially at scale. In this work, we present TranspaceR, a new R package that introduces computational and statistical methods tailored to the analysis of next-generation spatial transcriptomic datasets. Our framework includes novel quality control procedures, scalable gene selection strategies (especially for spatially variable genes), and optimized normalization and dimensionality reduction techniques based on in-depth statistical characterization of spatial data. We also demonstrate how single-cell annotation tools can be leveraged for automated cell-type labeling within spatial contexts. Together, these tools enable the efficient and robust analysis of imaging-based spatial transcriptomics datasets comprising millions of cells, paving the way for deeper insights into tissue organization.

Similar Papers

biorxiv
Fri May 09 2025
Streamlining Multiplexed Tissue Image Analysis with PIPΣX: An Integrated Automated Pipeline for Image Processing and EXploration for Diverse Tissue Types
Spatial proteomics via multiplexed tissue imaging is transforming how we study biology, enabling researchers to investigate dozens of markers in a single tissue section and explore how cells behave in their native habitat. While imaging technologies have advanced rapidly, data analyses remain a bottleneck. To address this, we developed PIPΣX (Pipeline for Image Processing and EXploration), a...
Mardamshina, M.
•
Ballllosera Navarro, F.
•
Martinez Casals, A.
•
Avenel, C.
...•
Lundberg, E.
biorxiv
Fri May 09 2025
IsoBayes: a Bayesian approach for single-isoform proteomics inference
Studying protein isoforms is an essential step in biomedical research; at present, the main approach for analyzing proteins is bottom-up mass spectrometry proteomics, which returns peptide identifications that are indirectly used to infer the presence of protein isoforms. However, the detection and quantification processes are noisy; in particular, peptides may be erroneously detected, and mos...
Bollon, J.
•
Shortreed, M. R.
•
Jordan, B. T.
•
Miller, R.
...•
Tiberi, S.
biorxiv
Fri May 09 2025
Application of spatial transcriptomics across organoids: a high-resolution spatial whole-transcriptome benchmarking dataset
Stem cell-derived organoid models hold great promise to model tissue-specific disease. To enable this, it is crucial to determine how their composition compares to endogenous organs. However, technologies such as spatial transcriptomics (STs) that can inform on regional molecular identity have been challenging to apply to organoids. Here we present the first systematic profiling of multiple organo...
Nucera, M. R. R.
•
Charitakis, N.
•
Leung, R.
•
Leichter, A.
...•
Ramialison, M.
biorxiv
Fri May 09 2025
RNAcompare: Integrating machine learning algorithms to unveil the similarities of phenotypes based on clinical, multi-omics using Rheumatoid Arthritis and Heart Failure as Case Studies
Background: Gene expression analysis is crucial for understanding the biological mechanisms underlying patient subgroup differences. However, most existing studies focus primarily on transcriptomic data while neglecting the integration of clinical heterogeneity. Although batch correction methods are commonly used, challenges remain when integrating data across different tissues, omics layers, and ...
Tang, M.
biorxiv
Fri May 09 2025
multideconv - an integrative pipeline for combining first and second generation cell type deconvolution results
Summary: The number of computational methods for cell type deconvolution from bulk RNA-seq data has been increasing in recent years, but their high feature complexity and variability of results across methods and signatures limit their utility and effectiveness for patient stratification. Applying multiple combinations of deconvolution methods and signatures often results in hundreds of redundan...
Hurtado, M.
•
Essabbar, A.
•
Khajavi, L.
•
Pancaldi, V.
biorxiv
Thu May 08 2025
Predicting Molecular Taste: Multi-Label and Multi-Class Classification
Predicting the taste of chemical compounds is a complex task and has been a challenge for decades. This study explores the application of machine learning to predict taste profiles of chemical compounds using the ChemTastesDB dataset, comprising 2,944 tastants categorized into 44 taste labels and 9 taste classes. Addressing the challenges of label imbalance and correlation, the dataset was preproc...
Ramanathan, V.
•
DN, S. S.
biorxiv
Thu May 08 2025
INLAomics for Scalable and Interpretable Spatial Multiomic Data Integration
Integrating spatial transcriptomics with antibody-based proteomics enables the investigation of biological regulation within intact tissue architecture. However, current approaches for spatial multi-omics integration often depend on dimensionality reduction or autoencoders, which disregard spatial context, limit interpretability, and face challenges with scalability. To address these limitations, ...
Arnroth, L.
•
Vickovic, S.
biorxiv
Thu May 08 2025
Not All Saliva Samples Are Equal: The Role of Cellular Heterogeneity in DNA methylation and Epigenetic Age Analyses with Biological and Psychosocial Factors
Saliva is widely used in biomedical population research, including epigenetic analyses to investigate gene-environment interplay and identify biomarkers. Its minimally invasive collection procedure makes it ideal for studies in pediatric populations. Saliva is a heterogeneous tissue composed of immune and buccal epithelial cells (BEC). Amongst the many epigenetic marks, DNA methylation (DNAm) is th...
Chan, M. H.-M.
•
Meijer, M.
•
Merrill, S. M.
•
Fu, M. P. Y.
...•
Kobor, M. S.
biorxiv
Thu May 08 2025
GeneFix-AI: AI-Powered CRISPR-Cas9 System for Real-Time Detection and Correction of Mutations in Non-Human Species
The evolution of genome engineering technologies has transformed biomedical research, enabling precise and efficient modification of genetic material (Doudna and Charpentier, 2014). Among these, CRISPR-Cas9 stands out as a revolutionary gene-editing tool, though it often requires extensive expertise and technical knowledge (Cong et al., 2013; J. G. Doench et al., 2016). We propose GeneFix-AI, an Artif...
Ali, M.
biorxiv
Thu May 08 2025
Surforama: interactive exploration of volumetric data by leveraging 3D surfaces
Motivation: Visualization and annotation of segmented surfaces is of paramount importance for studying membrane proteins in their native cellular environment by cryogenic electron tomography (cryo-ET). Yet, analyzing membrane proteins and their organization is challenging due to their small sizes and the need to consider local context constrained to the membrane surface. Results: To interactively ...
Yamauchi, K. A.
•
Lamm, L.
•
Gaifas, L.
•
Righetto, R. D.
...•
Harrington, K.