EN-SPEAK: English Speech and Pronunciation Enhancement AI Kit

This project builds on a recent collaboration between the Centre for Global English (CGE) and the Computational Linguistics & Text Mining Lab (CLTL).

The CGE runs the successful MOOC English Pronunciation in a Global World, which has attracted over 130,000 students from 191 countries. MOOCs offer many benefits, but some challenges remain, such as the lack of personal feedback on student work. EN-SPEAK will help tackle this issue by developing an automated pronunciation checker capable of providing personal, adequate and timely feedback on the quality of a student's pronunciation.

The project comprises two components: A) the creation of an open English Learner’s Pronunciation Dataset, and B) the development of an automated pronunciation checker to aid in the diagnosis of English pronunciation mistakes. Component A) will address the lack of annotated learner speech data. Learner data is valuable and rare, especially when annotated, and it is crucial for the development of education technology because it helps us understand students' actual needs (e.g., where they struggle and what kind of help they need). The existing MOOC provides a unique opportunity to analyze large amounts of data from L2 speakers across the world and to identify areas of struggle in English pronunciation. We will then use the results of this analysis, together with speech synthesis technology, to generate high-quality synthetic learner data that reflects students' challenges. In this way, we will derive two unique and complementary datasets.

Component B) will use the data analyzed and generated in A) to inform the development of an automated pronunciation checker. A proof of concept of this tool is currently under development. We will use state-of-the-art speech recognition and transcription models to assess the quality of pronunciation, and we will test different methods (e.g., ensemble models and fine-tuning) to create an automated pronunciation checker suitable for integration into CGE’s MOOC.
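To illustrate the general idea of using a speech recognizer's output as a signal of pronunciation quality, here is a minimal sketch in Python. It is not the project's method: it simply assumes that a student reads a known prompt aloud, that some speech recognition model (not shown) returns a transcript, and that the word-level mismatch between prompt and transcript serves as a rough proxy for intelligibility. The function names `word_edit_distance` and `pronunciation_score` are hypothetical.

```python
from typing import List


def word_edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Levenshtein distance computed over word tokens."""
    # previous[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # word missing from transcript
                               current[j - 1] + 1,       # extra word in transcript
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def pronunciation_score(prompt: str, asr_transcript: str) -> float:
    """Return a score in [0, 1]; 1.0 means the transcript matches the prompt.

    Assumption: asr_transcript comes from a speech recognizer run on the
    student's recording of the prompt (the recognizer itself is not shown).
    """
    reference = prompt.lower().split()
    hypothesis = asr_transcript.lower().split()
    if not reference:
        raise ValueError("prompt must contain at least one word")
    error_rate = word_edit_distance(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - error_rate)


if __name__ == "__main__":
    # A learner says "ship" but the recognizer hears "sheep".
    print(pronunciation_score("the ship is big", "the sheep is big"))
```

A word-level score like this only flags which words were misrecognized. Diagnosing specific pronunciation mistakes, as the project aims to do, would require finer-grained analysis (for example at the level of individual sounds), which this sketch does not attempt.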

Supervisors:

Laura Rupp

Luis Morgado da Costa

Academy Assistants: tba

Network Institute

The hub for interdisciplinary research on the digital society @VU
