2025 Hyper Recent • CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

July 17th, 2025
Version: 1
Oregon Health & Science University
bioinformatics
biorxiv

MiroSCOPE: An AI-driven digital pathology platform for annotating functional tissue units

Fenner, M. R. • Sevim, S. • Wu, G. • Beavers, D. • Guo, P. • Tang, Y. • Eddy, C. Z. • Ait-Ahmad, K. • Rice-Stitt, T. • Thomas, G. et al.

Cancer tissue analysis in digital pathology is typically conducted across different spatial scales, ranging from high-resolution cell-level modeling to lower-resolution tile-based assessments. However, these perspectives often overlook the structural organization of functional tissue units (FTUs), the small, repeating structures that are crucial to tissue function and are key factors in pathological assessment. Incorporating FTU information is hindered by the need for detailed manual annotations, which are costly and time-consuming to obtain. While artificial intelligence (AI)-based solutions hold great promise to accelerate this process, there is currently no comprehensive workflow for building the large annotated cohorts required. To remove these roadblocks and advance the development of more interpretable approaches, we developed MiroSCOPE, an end-to-end AI-assisted platform, built on QuPath, for annotating FTUs at scale. MiroSCOPE integrates a fine-tunable multiclass segmentation model with curation-specific usability features, enabling a human-in-the-loop system that accelerates a pathologist's curation of AI-generated annotations. We used the system to efficiently annotate over 71,900 FTUs on 184 prostate cancer hematoxylin and eosin (H&E)-stained tissue samples and demonstrated that it translates readily to breast cancer. Furthermore, we publicly release a dataset named Miro-120, consisting of 120 prostate cancer H&E-stained samples with 30,568 annotations, which the community can use as a high-quality resource for FTU-level machine learning. In summary, MiroSCOPE provides an adaptable AI-driven platform for annotating functional tissue units, facilitating the use of structural information in digital pathology analyses.
