GatorST: A Versatile Contrastive Meta-Learning Framework for Spatial Transcriptomic Data Analysis

Wang, S. • Liu, Y. • Zhang, Z. • Song, Q. • Bian, J.

Introduction: Recent advances in spatial transcriptomics (ST) technologies have revolutionized our understanding of cellular functions by providing gene expression profiles with rich spatial context. Effectively learning spatial representations is crucial for downstream analyses and requires robust integration of spatial information with transcriptomic data. While existing methods have shown promise, they often fail to adequately capture both local (neighbor-level) and global (tissue-wide) spatial contexts. Moreover, they tend to rely heavily on augmentation strategies, which can introduce noise and instability.

Objectives: This study aims to introduce and demonstrate a novel, versatile framework called GatorST, which explicitly combines graph-based modeling with advanced learning strategies to generate spatially informed representations of ST data. GatorST is designed to improve various downstream tasks, including identification of spatial domains, gene expression imputation, batch effect removal, and trajectory inference.

Methods: GatorST constructs a spot-spot graph by connecting each node to its k nearest spatial neighbors and extracts two-hop neighborhood subgraphs to capture local context. At the global level, gene expression profiles are clustered using soft K-means to generate pseudo-labels, which serve as weak supervision signals within a contrastive learning framework. This process encourages the alignment of embeddings with shared pseudo-labels while separating those with different labels. GatorST further adopts an episodic training strategy inspired by meta-learning, wherein each episode consists of a support set for contrastive optimization and a disjoint query set for embedding classification, guided by the pseudo-labeled data. This design enables the model to classify unseen samples based on learned embeddings, thereby enhancing its generalization to new spatial contexts.
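
The local-context step described in the Methods (a k-nearest-neighbor spot-spot graph followed by two-hop neighborhood extraction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `knn_graph` and `two_hop_nodes`, the Euclidean distance, the directed adjacency lists, and the toy coordinates are all assumptions made for illustration.

```python
import math


def knn_graph(coords, k):
    """Hypothetical helper: link each spot to its k nearest spatial neighbors.

    Assumes Euclidean distance on spot coordinates and returns directed
    adjacency lists (spot index -> list of neighbor indices).
    """
    neighbors = {}
    for i, p in enumerate(coords):
        # Sort all other spots by distance; ties are broken by spot index.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords) if j != i
        )
        neighbors[i] = [j for _, j in dists[:k]]
    return neighbors


def two_hop_nodes(neighbors, center):
    """Hypothetical helper: the center spot plus all spots within two hops."""
    one_hop = set(neighbors[center])
    two_hop = set()
    for n in one_hop:
        two_hop.update(neighbors[n])
    return {center} | one_hop | two_hop


# Toy example (assumed data): five spots on a line, k = 2.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
graph = knn_graph(coords, k=2)
subgraph = two_hop_nodes(graph, center=0)
print(graph[0])          # neighbors of spot 0
print(sorted(subgraph))  # spots in the two-hop neighborhood of spot 0
```

In this toy layout, the two-hop neighborhood of spot 0 reaches spot 3 but not spot 4, which shows how the subgraph stays local even when the full tissue graph is connected. Whether the graph is symmetrized before subgraph extraction is a design choice the abstract does not specify.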
Results: Comprehensive comparisons with fifteen state-of-the-art methods across fourteen spatial transcriptomics datasets demonstrate that GatorST consistently achieves superior performance in identifying spatial domains, imputing gene expression, and removing batch effects. The results showcase the versatility and strong generalization capabilities of GatorST across diverse tissue types and experimental settings.

Conclusion: GatorST effectively integrates spatial topology and global gene expression through graph-based modeling, pseudo-labeling, and contrastive meta-learning. This framework generates biologically meaningful representations and significantly improves key downstream tasks, including spatial domain identification, gene expression imputation, batch effect removal, and trajectory inference.
