2025 Hyper Recent •CC0 1.0 Universal

This work is dedicated to the public domain. No rights reserved.

February 1st, 2025
Version: 1
Bernhard-Nocht Institute for Tropical Medicine
bioinformatics
biorxiv

Potentials and limitations in the application of Convolutional Neural Networks for mosquito species identification using wing images

Nolte, K. • Baumbach, J. • Lins, C. • Lohmann, J. J. G. • Kollmannsberger, P. • Sauer, F. G. • Luehken, R.

1. This study addresses the global health burden of mosquito-borne diseases by investigating the application of Convolutional Neural Networks (CNNs) to mosquito species identification from wing images. Conventional identification methods require significant expertise and resources, and CNNs offer a promising alternative. Our research aimed to develop a reliable classification system that can be used under real-world conditions, with a focus on improving model adaptability to previously unseen devices, mitigating dataset biases, and ensuring usability across different users without standardized protocols.

2. We used a large, diverse dataset of mosquito wing images covering 21 taxa and three image-capturing devices, together with an optimized preprocessing pipeline that standardizes images and removes undesirable image features.

3. The developed CNN models demonstrated high performance, with an average balanced accuracy of 98.3% and a macro F1-score of 97.6%, effectively distinguishing between the 21 mosquito taxa, including morphologically similar pairs. The preprocessing pipeline improved the model's robustness, reducing performance drops on unfamiliar devices. However, the study also highlights the persistence of inherent dataset biases, which the preprocessing steps could only partially mitigate. The classification system's practical usability was demonstrated through a feasibility study, which showed high inter-rater reliability.

4. The results underscore the potential of the proposed workflow to enhance vector surveillance, especially in resource-constrained settings, and suggest its applicability to other winged insect species. The classification system developed in this study is available for public use, providing a tool for vector surveillance and research and supporting efforts to mitigate the spread of mosquito-borne diseases.
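The two metrics reported in the abstract, balanced accuracy and macro F1-score, both average per-class scores so that rare taxa count as much as common ones. The following is a minimal sketch of their standard definitions in plain Python; it is not the authors' evaluation code, and the function names and the three taxon labels in the usage example are hypothetical placeholders.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # the true class was missed
            fp[p] += 1  # the predicted class was wrongly assigned
    return labels, tp, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    labels, tp, fp, fn = per_class_counts(y_true, y_pred)
    recalls = [tp[c] / (tp[c] + fn[c]) for c in labels if tp[c] + fn[c] > 0]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    labels, tp, fp, fn = per_class_counts(y_true, y_pred)
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only (not data from the study).
y_true = ["aedes", "aedes", "aedes", "culex", "culex", "anopheles"]
y_pred = ["aedes", "aedes", "culex", "culex", "culex", "anopheles"]

print(round(balanced_accuracy(y_true, y_pred), 4))
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy example one of three "aedes" images is misclassified as "culex", so the per-class recalls are 2/3, 1 and 1, and the per-class F1 scores are 0.8, 0.8 and 1. Because every class carries equal weight, a single error in a small class moves both metrics more than it would move plain accuracy.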
