2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

June 5th, 2025
Version: 1
Barcelona Supercomputing Center (BSC)
bioinformatics
biorxiv

10 Years of Variational Autoencoder: Insights from Cancer Temporal Progression Studies, a Systematic Literature Review

Prol-Castelo, G. • Cirillo, D. • Valencia, A.

Deep Learning methods such as Deep Representation Learning (DRL) and, specifically, the Variational Autoencoder (VAE) have been widely used to handle the high dimensionality of available datasets. Hence, these methods have been applied to study cancer through omics data. Cancer is one of the leading causes of death worldwide, and its complex and dynamic nature makes it especially difficult to study. However, the temporal dimension of cancer progression is a target that has not been tackled. In this systematic literature review, we explore the use of DRL, particularly the VAE, in cancer studies with omics data in a temporal context. Our research reveals that the most common uses of the VAE in cancer studies are related to subtyping, diagnosis, and prognosis. Meanwhile, cancer's temporal aspect is often overlooked, mainly because of the lack of longitudinal data. We propose that applying the VAE as a generative model to study cancer in time, for example by focusing on cancer staging, will lead to meaningful advancements in our understanding of cancer.
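
For readers unfamiliar with the VAE named in the abstract, the following is a minimal sketch of the two ingredients of its standard training objective: the reparameterized latent sample and the KL divergence to a standard normal prior. It is not taken from the review. The function names are hypothetical, the squared-error reconstruction term assumes a Gaussian likelihood (up to a constant and scale), and the inputs stand in for the outputs of an encoder and decoder that are not shown.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def negative_elbo(x, x_recon, mu, log_var):
    """Squared-error reconstruction term (assumed Gaussian likelihood) plus the KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.2], [0.0, -1.0]  # hypothetical encoder outputs
    z = reparameterize(mu, log_var, rng)    # latent sample that would be passed to a decoder
    print(len(z), kl_to_standard_normal(mu, log_var))
```

Sampling new latent vectors from the prior and decoding them is what makes the VAE usable as a generative model, which is the use the abstract proposes for studying cancer staging.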
