Accurately identifying development- and disease-associated DNA methylation features from single-cell DNA methylation data remains challenging due to the genome-wide scale and the sparse, stochastic nature of CpG coverage. We present scDNAm-GPT, a novel framework that integrates CpG token design, a Mamba backbone, and a cross-attention head to efficiently process ultra-long sequences while preserving both local CpG interactions and broader genomic context. Pretrained on over one million single cells from 28 human and mouse tissues, scDNAm-GPT effectively reconstructs sparse methylation landscapes, enhancing the resolution and accuracy of epigenetic analyses. It outperforms existing methods across key biomedical applications, including improved cell clustering, enhanced trajectory inference for precise mapping of differentiation pathways, identification of disease-relevant DNA methylation features, and robust, reference-free cell type deconvolution from cfDNA data. scDNAm-GPT learns regulatory features in a hierarchical manner, and its attention scores exhibit high biological interpretability by highlighting functionally relevant CpG regions. These advancements establish scDNAm-GPT as a scalable and generalizable solution for single-cell epigenomic analysis, paving the way for broader applications in single-cell DNA methylation profiling and uncovering novel insights into the epigenetic mechanisms underlying health and disease.