Shifting from hand-crafted to learned data representations has revolutionized fields such as natural language processing and computer vision. Despite this, current approaches to bacterial phenotype prediction from the genome rely on training machine learning models on hand-crafted features, often binary presence indicators or counts of conserved genomic elements and protein domains. Defining these shared elements and domains as our "genomic element vocabulary", we tokenize entire bacterial genomes as sequences of these conserved elements and leverage advances in long-context language modeling to perform self-supervised whole-genome representation learning (WGRL). Through multi-task pretraining on a phylogenetically diverse dataset of hundreds of thousands of bacterial genomes, we present a genomic language model that produces representations of input genomes whose features are predictive of a broad range of phenotypes. We assess the quality of the learned representations through k-nearest-neighbours prediction of 25 bacterial phenotypes, finding our WGRL representations more predictive than standard protein domain presence/absence representations for 23 of the 25 phenotypes. We additionally find that the WGRL representations are robust to both poor assembly quality and genome incompleteness. By learning the relationships between evolutionarily conserved genomic elements with self-supervised long-context language modeling, we demonstrate the first approach for extracting general-purpose whole-genome representations while preserving gene order.
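To make the evaluation protocol concrete, the sketch below illustrates k-nearest-neighbours phenotype prediction from whole-genome representations. It is a minimal illustration, not the paper's implementation: the arrays are synthetic placeholders standing in for WGRL embeddings, domain presence/absence vectors, and a single binary phenotype, and the choice of scikit-learn, cosine distance, and k=5 are assumptions for the example.

```python
# Hedged sketch of kNN phenotype prediction from genome representations.
# All data below is randomly generated; in practice X_wgrl would hold pooled
# WGRL genome embeddings and X_domains binary protein-domain presence/absence
# vectors for the same set of genomes.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_genomes, d_wgrl, n_domains = 500, 256, 1024

X_wgrl = rng.normal(size=(n_genomes, d_wgrl))                  # placeholder learned embeddings
X_domains = rng.integers(0, 2, size=(n_genomes, n_domains))    # placeholder domain presence/absence
y = rng.integers(0, 2, size=n_genomes)                         # placeholder binary phenotype labels

for name, X in [("WGRL embeddings", X_wgrl), ("domain presence/absence", X_domains)]:
    knn = KNeighborsClassifier(n_neighbors=5, metric="cosine")
    score = cross_val_score(knn, X, y, cv=5, scoring="balanced_accuracy").mean()
    print(f"{name}: mean balanced accuracy = {score:.3f}")
```

On real data, repeating this comparison across each of the 25 phenotypes yields the per-phenotype head-to-head comparison summarized above.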