The cerebral cortex, cerebellum and basal ganglia are essential for flexible learning in mammals. Although these structures were traditionally thought to operate under different learning rules, recent evidence suggests that both the basal ganglia and the cerebellum may employ reinforcement learning mechanisms. This raises the question of how the two structures coordinate when a common reward prediction error mechanism is active. To address this issue, we first examined the output signals of the basal ganglia and cerebellum that follow activity in the cerebral cortex. We recorded single-neuron activity from the output regions of the cerebellum and basal ganglia, namely the cerebellar nuclei (CN) and the substantia nigra pars reticulata (SNr), in both male and female ChR2 transgenic rats. Neurons in the CN and SNr exhibited distinct temporal response patterns; notably, the fast excitatory response in the CN, driven by mossy fiber input, was synchronized with the inhibitory response in the SNr, mediated via the direct pathway. Using these experimental findings together with connectome data, we developed both a semi-realistic spiking network model and a reservoir-based reinforcement learning model. In the latter model, successful learning depended on synaptic plasticity in both the cerebellum and the basal ganglia, with a temporal precision on the order of 10 ms. Furthermore, cortical β-oscillations enhanced learning, and reinforcement learning was optimal when the output signals of the cerebellum and basal ganglia were phase-locked at the frequency of the cortical oscillation. Taken together, our results suggest that the coordinated output of the cerebellum and basal ganglia, driven by tightly tuned cortical input, underlies brain-wide synergistic reinforcement learning.