2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

May 8th, 2025
Version: 1
Emory University
bioinformatics
biorxiv

A novel machine learning-based algorithm for eQTL identification reveals complex pleiotropic effects in the MHC region

Li, R. Y. • Su, C. • Qin, Z. S.

Expression quantitative trait loci (eQTLs) are regulatory variants that affect the expression level of their target genes and have a significant impact on disease biology. However, eQTL mapping has been done mostly in one tissue at a time, despite the known prevalence of correlations among tissues. Multivariate analyses incorporating multiple phenotypes are available, but they emphasize linear combinations of phenotypes. We present MTClass, a machine learning framework that attempts to classify an individual's genotype based on a vector of multi-phenotype expression levels of a given gene. We conduct simulation studies and multiple case studies using real and imputed data, and we demonstrate that MTClass detects more functionally relevant variants and genes than existing single-tissue approaches and multi-phenotype association tests. Our results suggest that the importance of expression regulation at the MHC region may have been underestimated, and they provide fresh biological insights into genetic variants that have pleiotropic effects, influencing gene expression in a complex manner.
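
The core idea in the abstract, predicting a genotype class from a gene's expression vector across tissues, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from MTClass: the nearest-centroid classifier, the leave-one-out evaluation, the simulated Gaussian data, and the function names `nearest_centroid_loo_accuracy` and `simulate` are all hypothetical. The sketch only shows why held-out classification accuracy can serve as a signal of association: when genotype shifts expression in several tissues, accuracy rises well above chance.

```python
import random
from statistics import mean


def nearest_centroid_loo_accuracy(expression, genotypes):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    expression: list of per-individual vectors (one value per tissue).
    genotypes:  list of class labels (e.g. 0 = no alt allele, 1 = carrier).
    Hypothetical helper; MTClass may use a different classifier and metric.
    """
    n = len(expression)
    correct = 0
    for held_out in range(n):
        # Group the training individuals (all but the held-out one) by class.
        by_class = {}
        for i in range(n):
            if i != held_out:
                by_class.setdefault(genotypes[i], []).append(expression[i])
        # Per-class centroid: the mean expression in each tissue.
        centroids = {
            label: [mean(col) for col in zip(*rows)]
            for label, rows in by_class.items()
        }
        # Predict the class with the closest centroid (squared Euclidean).
        x = expression[held_out]
        predicted = min(
            centroids,
            key=lambda label: sum(
                (a - b) ** 2 for a, b in zip(x, centroids[label])
            ),
        )
        correct += predicted == genotypes[held_out]
    return correct / n


def simulate(n_per_class, n_tissues, shift, seed):
    """Simulated data: class 1 has its mean shifted by `shift` in every tissue."""
    rng = random.Random(seed)
    expression, genotypes = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            expression.append(
                [rng.gauss(label * shift, 1.0) for _ in range(n_tissues)]
            )
            genotypes.append(label)
    return expression, genotypes


# A variant with a multi-tissue effect versus a variant with no effect.
acc_associated = nearest_centroid_loo_accuracy(*simulate(50, 5, 1.5, seed=0))
acc_null = nearest_centroid_loo_accuracy(*simulate(50, 5, 0.0, seed=0))
print(f"associated variant: {acc_associated:.2f}, null variant: {acc_null:.2f}")
```

In this toy setting the null variant should score near chance (about 0.5 for two balanced classes), while the associated variant should score much higher. A nonlinear classifier could be substituted for the centroid rule without changing the evaluation loop.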

Similar Papers

biorxiv
Fri May 09 2025
Streamlining Multiplexed Tissue Image Analysis with PIPΣX: An Integrated Automated Pipeline for Image Processing and EXploration for Diverse Tissue Types
Spatial proteomics via multiplexed tissue imaging is transforming how we study biology, enabling researchers to investigate dozens of markers in a single tissue section and explore how cells behave in their native habitat. While imaging technologies have advanced rapidly, data analyses remain a bottleneck. To address this, we developed PIPΣX (Pipeline for Image Processing and EXploration), a...
Mardamshina, M.
•
Ballllosera Navarro, F.
•
Martinez Casals, A.
•
Avenel, C.
...•
Lundberg, E.
biorxiv
Fri May 09 2025
IsoBayes: a Bayesian approach for single-isoform proteomics inference
Studying protein isoforms is an essential step in biomedical research; at present, the main approach for analyzing proteins is via bottom-up mass spectrometry proteomics, which returns peptide identifications that are indirectly used to infer the presence of protein isoforms. However, the detection and quantification processes are noisy; in particular, peptides may be erroneously detected, and mos...
BOLLON, J.
•
SHORTREED, M. R.
•
JORDAN, B. T.
•
MILLER, R.
...•
Tiberi, S.
biorxiv
Fri May 09 2025
Application of spatial transcriptomics across organoids: a high-resolution spatial whole-transcriptome benchmarking dataset
Stem cell-derived organoid models hold great promise to model tissue-specific disease. To enable this, it is crucial to determine how their composition compares to endogenous organs. However, technologies such as spatial transcriptomics (STs) that can inform on regional molecular identity have been challenging to apply to organoids. Here we present the first systematic profiling of multiple organo...
Nucera, M. R. R.
•
Charitakis, N.
•
Leung, R.
•
Leichter, A.
...•
Ramialison, M.
biorxiv
Fri May 09 2025
RNAcompare: Integrating machine learning algorithms to unveil the similarities of phenotypes based on clinical, multi-omics using Rheumatoid Arthritis and Heart Failure as Case Studies
Background: Gene expression analysis is crucial for understanding the biological mechanisms underlying patient subgroup differences. However, most existing studies focus primarily on transcriptomic data while neglecting the integration of clinical heterogeneity. Although batch correction methods are commonly used, challenges remain when integrating data across different tissues, omics layers, and ...
Tang, M.
biorxiv
Fri May 09 2025
multideconv - an integrative pipeline for combining first and second generation cell type deconvolution results
Summary: The number of computational methods for cell type deconvolution from bulk RNA-seq data has been increasing in recent years, but their high feature complexity and variability of results across methods and signatures limit their utility and effectiveness for patient stratification. Applying multiple combinations of deconvolution methods and signatures often results in hundreds of redundan...
Hurtado, M.
•
Essabbar, A.
•
Khajavi, L.
•
Pancaldi, V.
biorxiv
Thu May 08 2025
Predicting Molecular Taste: Multi-Label and Multi-Class Classification
Predicting the taste of chemical compounds is a complex task and has been a challenge for decades. This study explores the application of machine learning to predict taste profiles of chemical compounds using the ChemTastesDB dataset, comprising 2,944 tastants categorized into 44 taste labels and 9 taste classes. Addressing the challenges of label imbalance and correlation, the dataset was preproc...
Ramanathan, V.
•
DN, S. S.
biorxiv
Thu May 08 2025
INLAomics for Scalable and Interpretable Spatial Multiomic Data Integration
Integrating spatial transcriptomics with antibody-based proteomics enables the investigation of biological regulation within intact tissue architecture. However, current approaches for spatial multi-omics integration often depend on dimensionality reduction or autoencoders, which disregard spatial context, limit interpretability, and face challenges with scalability. To address these limitations, ...
Arnroth, L.
•
Vickovic, S.
biorxiv
Thu May 08 2025
Not All Saliva Samples Are Equal: The Role of Cellular Heterogeneity in DNA methylation and Epigenetic Age Analyses with Biological and Psychosocial Factors
Saliva is widely used in biomedical population research, including epigenetic analyses to investigate gene-environment interplay and identify biomarkers. Its minimally invasive collection procedure makes it ideal for studies in pediatric populations. Saliva is a heterogeneous tissue composed of immune and buccal epithelial cells (BEC). Amongst the many epigenetic marks, DNA methylation (DNAm) is th...
Chan, M. H.-M.
•
Meijer, M.
•
Merrill, S. M.
•
Fu, M. P. Y.
...•
Kobor, M. S.
biorxiv
Thu May 08 2025
GeneFix-AI: AI-Powered CRISPR-Cas9 System for Real-Time Detection and Correction of Mutations in Non-Human Species
The evolution of genome engineering technologies has transformed biomedical research, enabling precise and efficient modification of genetic material (Doudna and Charpentier, 2014). Among these, CRISPR-Cas9 stands out as a revolutionary gene-editing tool, though it often requires extensive expertise and technical knowledge (Cong et al., 2013; J. G. Doench et al., 2016). We propose GeneFix-AI, an Artif...
Ali, M.
biorxiv
Thu May 08 2025
Surforama: interactive exploration of volumetric data by leveraging 3D surfaces
Motivation: Visualization and annotation of segmented surfaces is of paramount importance for studying membrane proteins in their native cellular environment by cryogenic electron tomography (cryo-ET). Yet, analyzing membrane proteins and their organization is challenging due to their small sizes and the need to consider local context constrained to the membrane surface. Results: To interactively ...
Yamauchi, K. A.
•
Lamm, L.
•
Gaifas, L.
•
Righetto, R. D.
...•
Harrington, K.