Many real-life decisions involve a tension between short-term and long-term outcomes, which requires forward-looking abilities. In reinforcement learning, this tension arises at the initial stage of multi-step learning and decision tasks, where forward-looking decisions yield smaller immediate rewards but govern the transition to more advantageous second-stage states. Here, we investigated the neural mechanisms underlying such forward-looking decisions in a cohort of healthy participants undergoing fMRI scanning (N = 28). Behavioral results confirmed that participants were able to learn reward values and state-transition probabilities concurrently. By contrasting the BOLD signal elicited by first-stage versus second-stage choices at the time of decision, we isolated a brain network, with central nodes in the bilateral parahippocampal cortex (BA37), whose activity correlates with forward-looking choices. BOLD activity in another network, including the bilateral parietal cortex, correlated with a structure-learning signal at the time of outcome processing. Our results shed new light on the neural bases of model-based reinforcement learning by suggesting a specific role for the parahippocampal cortex in forward planning and for the parietal cortex in learning the Markovian structure of the task.
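The concurrent learning of reward values and state-transition probabilities described above follows the standard logic of model-based reinforcement learning in a two-step task. The sketch below is a minimal illustration of that logic, assuming a delta-rule update for second-stage reward values, a running estimate of first-to-second-stage transition probabilities, and softmax action selection; the task layout, parameter values, and variable names are hypothetical and do not correspond to the authors' fitted computational model or task parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-step task: 2 first-stage actions, 2 second-stage states,
# 2 actions per second-stage state (layout and parameters are assumptions).
N_ACTIONS_1, N_STATES_2, N_ACTIONS_2 = 2, 2, 2
ALPHA_R, ALPHA_T, BETA = 0.2, 0.2, 5.0   # learning rates and softmax inverse temperature

# Learned quantities: second-stage action values and transition estimates.
Q2 = np.zeros((N_STATES_2, N_ACTIONS_2))      # reward values for second-stage actions
T = np.full((N_ACTIONS_1, N_STATES_2), 0.5)   # P(second-stage state | first-stage action)

# Hidden generative parameters of the illustrative environment.
TRUE_T = np.array([[0.7, 0.3], [0.3, 0.7]])         # true transition probabilities
TRUE_P_REWARD = np.array([[0.8, 0.2], [0.2, 0.8]])  # true reward probabilities

def softmax(values, beta):
    z = np.exp(beta * (values - values.max()))
    return z / z.sum()

for trial in range(200):
    # Forward-looking (model-based) first-stage values: expected value of the
    # best second-stage action under the current transition estimates.
    q1 = T @ Q2.max(axis=1)
    a1 = rng.choice(N_ACTIONS_1, p=softmax(q1, BETA))

    # Environment: sample the second-stage state, choose an action, observe reward.
    s2 = rng.choice(N_STATES_2, p=TRUE_T[a1])
    a2 = rng.choice(N_ACTIONS_2, p=softmax(Q2[s2], BETA))
    r = float(rng.random() < TRUE_P_REWARD[s2, a2])

    # Concurrent updates: delta rule on reward values, and a state-prediction-
    # error-like update on the estimated transition probabilities.
    Q2[s2, a2] += ALPHA_R * (r - Q2[s2, a2])
    observed_state = np.eye(N_STATES_2)[s2]
    T[a1] += ALPHA_T * (observed_state - T[a1])

print("Estimated transition probabilities:\n", T)
print("Estimated second-stage values:\n", Q2)
```

In this sketch, the transition update (the term `observed_state - T[a1]`) plays the role of a structure-learning signal at outcome time, while the first-stage valuation `T @ Q2.max(axis=1)` captures the forward-looking component of choice; both are generic illustrations of these constructs rather than the regressors used in the reported fMRI analyses.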