EN-SPEAK: English Speech and Pronunciation Enhancement AI Kit

This project builds on a recent collaboration between the Centre for Global English (CGE) and the Computational Linguistics & Text Mining Lab (CLTL).

The CGE runs a successful MOOC, English Pronunciation in a Global World, which has attracted over 130,000 students from 191 countries. MOOCs offer many benefits, but some challenges remain, such as the lack of personal feedback on student work. EN-SPEAK will help tackle this issue by developing an automated pronunciation checker capable of providing personal, adequate and timely feedback on the quality of pronunciation.

The project comprises two components: A) the creation of an open English Learner’s Pronunciation Dataset, and B) the development of an automated pronunciation checker to aid in the diagnosis of English pronunciation mistakes.

A) will address the lack of annotated spoken Learner Data. Learner Data is valuable and rare, especially when annotated, and crucial for the development of education technology, as it helps us understand the actual needs of students (e.g., where they struggle and what kind of help they need). The existing MOOC provides a unique opportunity to analyze large amounts of data from L2 speakers across the world and to identify areas of struggle in English pronunciation. We will then use the results of this analysis, together with speech synthesis technology, to generate high-quality artificial learner data that reflects students’ challenges. In this way, we will derive two unique and complementary datasets.

B) will use the data analyzed and generated in A) to inform the development of an automated pronunciation checker. A proof of concept of this tool is currently under development. We will use state-of-the-art models for speech recognition/transcription to determine the quality of pronunciation. We will test different methods (e.g. ensemble models and fine-tuning) to create an automated pronunciation checker suitable for integration into CGE’s MOOC.
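As a rough illustration of how a transcription-based checker could turn recognizer output into feedback, the sketch below aligns an expected phoneme sequence with a recognized one using edit distance and reports the mismatches. This is only a minimal sketch under stated assumptions, not the project's method: it assumes some phoneme-level recognizer already exists, the function names (align_phonemes, pronunciation_feedback) are hypothetical, and the ARPAbet-style symbols are illustrative.

```python
from typing import List, Tuple

# Hypothetical helper: aligns two phoneme sequences with Levenshtein distance.
# Each result is (operation, expected_phoneme, recognized_phoneme).
def align_phonemes(expected: List[str], recognized: List[str]) -> List[Tuple[str, str, str]]:
    n, m = len(expected), len(recognized)
    # dist[i][j] = edit distance between expected[:i] and recognized[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if expected[i - 1] == recognized[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # learner dropped a phoneme
                dist[i][j - 1] + 1,         # learner added a phoneme
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back from the bottom-right corner to recover the operations.
    ops: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if expected[i - 1] == recognized[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                op = "match" if cost == 0 else "substitution"
                ops.append((op, expected[i - 1], recognized[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("deletion", expected[i - 1], ""))
            i -= 1
        else:
            ops.append(("insertion", "", recognized[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# Hypothetical helper: keeps only the operations a learner needs to hear about.
def pronunciation_feedback(expected: List[str], recognized: List[str]) -> List[Tuple[str, str, str]]:
    return [op for op in align_phonemes(expected, recognized) if op[0] != "match"]


# Illustrative input: "think" pronounced with /t/ instead of /th/.
print(pronunciation_feedback(["TH", "IH", "NG", "K"], ["T", "IH", "NG", "K"]))
```

In a real system the recognized sequence would come from a speech model, and the raw operations would still need to be mapped to learner-friendly explanations.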

Supervisors:

Academy Assistants: tba