Pronunciation Assessment
A major obstacle in computer-aided language learning is the lack of feedback from a human teacher, especially when it comes to learning correct pronunciation. Automatic pronunciation assessment aims to bridge that gap. Even when a teacher is present, it can help students by providing more intensive pronunciation training than is possible in class. It can also help learners who tend to avoid practicing their pronunciation aloud in class.
Data Collection
To train and test an automatic pronunciation scoring system, a comprehensive corpus of annotated examples has to be collected. Annotation is very time-consuming and has to be performed by experienced labellers.
Automatic Pronunciation Scoring
Modern approaches to automatic pronunciation scoring build on an HMM-based speech recognizer. A number of parameters are computed from the recognition result and from phoneme alignments, e.g. the likelihood that the spoken utterance is a realization of the given target sentence. These parameters can be used to classify whether the uttered words were pronounced correctly. As a complement to recognizer-based techniques, the spoken utterance can be compared directly to reference speakers. This captures subtleties that a standard triphone HMM recognizer cannot model.
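To illustrate how parameters derived from a phoneme alignment can feed a correctness decision, here is a minimal sketch. It is not the system described above: it assumes that a recognizer has already produced, for each aligned phoneme, a log-likelihood under the target phoneme (forced alignment) and a log-likelihood under an unconstrained phoneme loop over the same frames. The class PhonemeSegment, its field names, the duration-normalized log-likelihood ratio, and the threshold value are all hypothetical choices made for this example; a real system would train a classifier on annotated data rather than use a fixed threshold.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PhonemeSegment:
    """One aligned phoneme (hypothetical structure for this sketch)."""
    phoneme: str
    n_frames: int          # duration of the segment in frames
    forced_loglik: float   # log-likelihood under the target phoneme
    free_loglik: float     # log-likelihood under an unconstrained phoneme loop


def phoneme_score(seg: PhonemeSegment) -> float:
    """Duration-normalized log-likelihood ratio; values near 0 mean a good match."""
    if seg.n_frames <= 0:
        raise ValueError("segment must span at least one frame")
    return (seg.forced_loglik - seg.free_loglik) / seg.n_frames


def utterance_features(segments: List[PhonemeSegment]) -> Dict[str, float]:
    """Collect a few utterance-level parameters from the alignment."""
    if not segments:
        raise ValueError("no segments given")
    scores = [phoneme_score(s) for s in segments]
    total_frames = sum(s.n_frames for s in segments)
    return {
        "mean_score": sum(scores) / len(scores),
        "min_score": min(scores),
        # Likelihood of the utterance given the target sentence, per frame.
        "loglik_per_frame": sum(s.forced_loglik for s in segments) / total_frames,
    }


def flag_mispronounced(segments: List[PhonemeSegment],
                       threshold: float = -1.0) -> List[str]:
    """Return phonemes whose score falls below an (assumed) fixed threshold."""
    return [s.phoneme for s in segments if phoneme_score(s) < threshold]


if __name__ == "__main__":
    # Made-up numbers, purely for demonstration.
    example = [
        PhonemeSegment("h", 10, -50.0, -48.0),
        PhonemeSegment("e", 20, -100.0, -60.0),
    ]
    print(utterance_features(example))
    print(flag_mispronounced(example))
```

The features returned by utterance_features could serve as input to any standard classifier. The comparison against reference speakers mentioned above is a separate technique and is not covered by this sketch.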
