Motor control is essential for organisms to interact efficiently with the environment by maintaining accurate actions and adjusting to future changes. Speech production, one of the most complex motor behaviors, relies on a feedback control process that detects sensory errors and triggers updates in a feedforward control process that implements compensations. However, the specific contributions of these two processes to sensorimotor learning during continuous vocal production remain debated. Here, we used two experimental designs across five experiments to dissociate these mechanisms. First, we employed a serial-dependence design with randomized pitch perturbations, which dissociated the influences of sensory errors and motor compensation on subsequent vocalizations on a trial-by-trial basis. We found that motor compensation, rather than sensory error, predicted the compensatory responses in subsequent trials, suggesting instantaneous serial learning mediated by updates in the feedforward process. This compensation-driven serial learning generalized across productions of different vowel categories. Second, we implemented a serial-dependence adaptation design in a sentential context, in which the auditory perturbation occurred only on a preceding syllable. Any learning effect in the subsequent, unperturbed syllable would therefore reflect changes in the speech motor representation. Our results consistently revealed that compensation in the preceding syllable predicted pitch changes in the subsequent syllable, but only when the two adjacent syllables were embedded within the same word. Collectively, the study provides ecologically valid evidence that error-based motor compensation, subject to cognitive and linguistic constraints, directly regulates the speech motor representation and mediates instantaneous serial learning across successive actions.
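The logic of the trial-by-trial dissociation can be illustrated with a minimal sketch. It is not the study's analysis pipeline and uses no data from the study: the trials are simulated, the perturbation sizes, `gain`, `carryover`, and `noise_sd` values are arbitrary assumptions, and all function names are hypothetical. Because the perturbation is randomized from trial to trial, the previous trial's perturbation (a proxy for the sensory error) and the previous trial's compensation are only partly correlated, so an ordinary least-squares regression can assign each its own weight.

```python
import random


def simulate_trials(n_trials=5000, gain=0.2, carryover=0.3, noise_sd=5.0, seed=0):
    """Generate SYNTHETIC trials in cents (illustrative assumption, not real data).

    Each compensation opposes the current perturbation and inherits a
    fraction (`carryover`) of the previous trial's compensation.
    """
    rng = random.Random(seed)
    perturbation, compensation = [], []
    prev_comp = 0.0
    for _ in range(n_trials):
        p = rng.choice([-100.0, -50.0, 50.0, 100.0])  # randomized pitch shift
        c = -gain * p + carryover * prev_comp + rng.gauss(0.0, noise_sd)
        perturbation.append(p)
        compensation.append(c)
        prev_comp = c
    return perturbation, compensation


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_serial_dependence(perturbation, compensation):
    """Regress compensation on trial n against the current perturbation,
    the previous perturbation, and the previous compensation (OLS)."""
    names = ["intercept", "current_perturbation",
             "prev_perturbation", "prev_compensation"]
    rows = [[1.0, perturbation[n], perturbation[n - 1], compensation[n - 1]]
            for n in range(1, len(compensation))]
    y = compensation[1:]
    k = len(names)
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return dict(zip(names, solve_linear_system(xtx, xty)))


if __name__ == "__main__":
    p, c = simulate_trials()
    for name, value in fit_serial_dependence(p, c).items():
        print(f"{name}: {value:.3f}")
```

Since the simulation builds in a compensation-driven carryover, recovering a nonzero weight on `prev_compensation` and a near-zero weight on `prev_perturbation` only shows that the regression can separate the two influences under randomized perturbations; it does not reproduce the reported finding.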