Genomic variation underlies human phenotypic diversity, disease susceptibility, and evolutionary adaptation. Although large-scale genomic sequencing has transformed our ability to map genetic variation, accurately modeling and interpreting these data remains challenging because of fundamental limitations in existing genomic foundation models (GFMs). Current genomic models typically treat DNA as a plain linear text sequence, overlooking biological context that is central to genomic interpretation, such as structural annotations, regulatory elements, and functional context. As a result, these models are prone to positional memorization of common sequences, which severely limits their generalization to biologically meaningful tasks. Here, we introduce BioToken, a modular and extensible tokenization framework that encodes genomic variants and biologically relevant structural annotations directly into genomic representations. By exploiting these intrinsic inductive biases, BioToken facilitates meaningful representation learning and generalization across diverse molecular phenotypes and prediction tasks, including gene expression, alternative splicing, and variant pathogenicity. Built on BioToken, our genomic foundation model, BioFM, achieves competitive or superior results relative to specialized models (e.g., Enformer, SpliceTransformer) across a comprehensive suite of genomic benchmarks, including noncoding pathogenicity, expression modulation, sQTL prediction, and long-range genomic interactions. Notably, BioFM achieves state-of-the-art performance with significantly fewer parameters (265M), substantially reducing training costs and computational requirements. Our findings highlight the substantial advantages of integrating biologically informed inductive biases into genomic foundation modeling, providing a robust and accessible path forward in genomics. We provide our code and model checkpoints to support further research in this direction.