Attentional engagement with target and distractor streams predicts speech comprehension in multitalker environments
Understanding speech while ignoring competing speech streams in the surrounding environment is challenging. Previous studies have demonstrated that attention shapes the neural representation of speech features: attended streams are typically represented more strongly than unattended ones, suggesting either enhancement of the attended stream or suppression of the unattended one. However, it remains unclear how these complementary processes support attentional filtering and speech comprehension at different hierarchical levels. In this study, we used multivariate temporal response functions (mTRFs) to analyze the EEG signals of 43 young adults, examining the relationship between the neural tracking of acoustic and higher-level linguistic features and a fine-grained measure of speech comprehension. We show that neural tracking of word and phoneme onsets and of word-level linguistic features in the attended stream predicted comprehension at the single-trial level within individuals. Moreover, acoustic tracking of the ignored speech stream was positively correlated with comprehension performance, whereas word-level linguistic tracking of the ignored stream was negatively correlated with comprehension. Collectively, our results suggest that attentional filtering during speech comprehension involves both target enhancement and distractor suppression at different hierarchical levels.
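To illustrate the kind of analysis referred to above, the following is a minimal sketch of how a multivariate temporal response function can be estimated as a time-lagged ridge regression from stimulus features onto EEG. The sampling rate, lag window, feature names, and regularization value are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of mTRF estimation via time-lagged ridge regression.
# Assumptions (not from the paper): 128 Hz EEG, lags from 0 to 500 ms,
# two illustrative stimulus features (envelope, word onsets), lambda = 1e2.
import numpy as np

def lag_matrix(features, sfreq, tmin=0.0, tmax=0.5):
    """Build a design matrix of time-lagged copies of each stimulus feature."""
    n_samples, n_feat = features.shape
    lags = np.arange(int(tmin * sfreq), int(tmax * sfreq) + 1)
    X = np.zeros((n_samples, n_feat * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.roll(features, lag, axis=0)  # stimulus precedes response
        shifted[:lag] = 0.0                       # zero out wrapped samples
        X[:, i * n_feat:(i + 1) * n_feat] = shifted
    return X, lags

def fit_mtrf(features, eeg, sfreq, lam=1e2):
    """Ridge regression mapping lagged stimulus features to each EEG channel."""
    X, lags = lag_matrix(features, sfreq)
    XtX = X.T @ X + lam * np.eye(X.shape[1])      # regularized covariance
    weights = np.linalg.solve(XtX, X.T @ eeg)     # (n_lags * n_feat, n_channels)
    return weights.reshape(len(lags), features.shape[1], eeg.shape[1]), lags

# Hypothetical data: 60 s of 64-channel EEG and 2 stimulus features at 128 Hz.
sfreq = 128
rng = np.random.default_rng(0)
features = rng.standard_normal((60 * sfreq, 2))   # e.g. envelope, word onsets
eeg = rng.standard_normal((60 * sfreq, 64))
trf, lags = fit_mtrf(features, eeg, sfreq)
print(trf.shape)  # (n_lags, n_features, n_channels)
```

In analyses of this kind, neural tracking strength is typically quantified as the correlation between EEG predicted by such a model and held-out EEG, and those per-trial prediction accuracies are then related to behavioral comprehension scores.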