
A Corpus Study on the Lexical Coverage of TED Talks: Implication for Vocabulary Teaching and Material Selection

Shao-wei Liu
National Taiwan Normal University, Taipei, Taiwan


While the lexical coverage of TED Talks has been extensively examined in previous studies, most researchers have tended to analyze the main topics (e.g., Global Issues, Science). Little is known, however, about the lexical coverage of TED Talks in other disciplines. To address this gap, the present study analyzes the lexical coverage of TED Talks in four registers that are not listed as main topics: Art, Education, Economics, and Medicine. A total of 120 TED Talk transcripts (255,450 tokens) were collected and analyzed using three word lists: the British National Corpus and Corpus of Contemporary American English (BNC/COCA) word lists, the Academic Word List (AWL), and the Academic Spoken Word List (ASWL). The results suggest that the most frequent 3,000 word families plus the additional word lists cover 96.53% of the TED Talks corpus. To reach 98.35% coverage, 5,000 word families plus the additional word lists are necessary. Coverage differs across registers, with Medicine carrying the heaviest lexical load. Another important finding is that the ASWL covers 90.02% of the TED Talks corpus with only 1,741 word families, far more than the AWL, which covers only 4.45%. It is hoped that the findings of the present study will offer instructors of EMI (English as a medium of instruction) courses at local universities some insight into how to select TED Talks that help learners develop sufficient language ability for EMI courses.


TED Talk, Lexical coverage, Corpus study, Vocabulary teaching, EMI

International Joint Conference of APLX, ETRA40, and TESPA 2023