

AI-powered Grammar Checker & Pronunciation to be a Cost-effective Tutor for English Language Learners

The lingua franca of modern times, English connects people across the globe. Every fifth person in the world today can speak English. It is the first language of over 400 million people around the globe, and the official language of over 50 countries. Besides being the global and digital language, English also connects people belonging to different parts of India who may not be well versed in Hindi.

Although elementary English is sufficient to engage in a somewhat intelligible conversation, language fluency enables an individual to be an effective communicator and also excel in academics and professional life. It is thus essential to hone one’s listening, speaking, reading and writing (LSRW) skills in the language.

Resorting to conventional tutoring on language skills might turn out to be a difficult and costly affair. In fact, it may be very difficult to find a tutor who is skilled enough to train students on the nitty-gritty of the language. Pronunciation, an important aspect of effective speaking, is a complex phenomenon and it is difficult to pin down pronunciation errors with a single approach. Moreover, it is an arduous task for teachers to correct grammatical errors of each and every student.

Sophisticated machine-learning technologies can address these concerns effectively and at low cost. AI-based tools are programmed to correct grammatical errors and fine-tune a student’s pronunciation by pointing out errors, providing corrections and offering explanations. These technologies also recommend practice on the concepts a student is weak in.

A skilled English teacher charges around $3000 a month, that is, $36,000 a year. It can be assumed that grammar and pronunciation constitute 45% of the total agenda that the teacher covers, the rest being literature, writing skills etc. So, it would cost around $16,200 to hone the pronunciation and grammar skills of a child. An AI-powered pronunciation app and a grammar tool available in the market, on the contrary, will each cost around $2000 a year. That amounts to $4000, roughly a quarter of the cost of a tutor.

There are various nuances to be perfected to attain proficiency in a language. In this article, we will discuss how sophisticated machine-learning techniques can help a learner improve their pronunciation and grammar.


Comprising various components such as accent, intonation, modulation, diction etc., pronunciation is the way in which a word (narrowly) and a language (broadly) is spoken.

Automatic pronunciation teaching has been a focus of research in artificial intelligence in the last couple of years. Bringing together multiple disciplines and bodies of work such as linguistics, socio- and psycho-linguistics, speech recognition, pedagogy and auditory research, automated error detection in pronunciation training is truly interdisciplinary in its approach and practice. Research on computer-assisted pronunciation teaching (CAPT) dates back to the last decade of the previous century. However, the research gained momentum only in the last couple of years with an increase in computing power and improved speech recognition techniques.

Pronunciation errors can be broadly divided into two categories—prosodic and phonemic.

As against the sentence, word and syllable, the phoneme is the smallest unit of sound that distinguishes one word from another. Errors on the phonemic side can be serious: the replacement, omission or insertion of a phoneme could result in a change of meaning. For instance, ‘eat’ and ‘it’ are close-sounding words. A non-native speaker may have difficulty distinguishing between the phonemes /ɪ/ and /iː/; the first is used in ‘it’, the second in ‘eat’. Replacing one with the other could alter the meaning completely.
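The three phonemic error types can be illustrated by aligning the expected phoneme sequence of a word with the sequence a learner actually produced. The sketch below uses a plain edit-distance alignment; the function name `phoneme_errors` and the input format (lists of IPA symbols, assumed to come from a speech recogniser) are illustrative assumptions, not a description of any particular product.

```python
from typing import List, Tuple


def phoneme_errors(expected: List[str], spoken: List[str]) -> List[Tuple[str, str, str]]:
    """Align two phoneme sequences and list the differences.

    Returns (kind, expected_phoneme, spoken_phoneme) tuples, where kind is
    'replacement', 'omission' or 'insertion'.
    """
    n, m = len(expected), len(spoken)
    # cost[i][j] = minimum number of edits turning expected[:i] into spoken[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if expected[i - 1] == spoken[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + diff,  # match or replacement
                cost[i - 1][j] + 1,         # omission
                cost[i][j - 1] + 1,         # insertion
            )

    # Walk back through the table to recover which edits were made.
    errors: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diff = 0 if expected[i - 1] == spoken[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + diff:
                if diff:
                    errors.append(("replacement", expected[i - 1], spoken[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("omission", expected[i - 1], ""))
            i -= 1
        else:
            errors.append(("insertion", "", spoken[j - 1]))
            j -= 1
    errors.reverse()
    return errors


# A learner saying 'it' when 'eat' was expected.
print(phoneme_errors(["iː", "t"], ["ɪ", "t"]))
```

A real system would also need to weigh how confusable two phonemes are, which a uniform edit cost ignores.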

On the prosodic side, a non-native accent can be characterised in terms of stress, rhythm and intonation, i.e., energy and pitch, in simple terms. Errors can arise from the spacing between stressed and unstressed syllables, the duration of inter-word pauses, the rate of speech (words per second or minute) etc.
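Two of these timing measures, the rate of speech and the inter-word pauses, are simple to compute once each word has a start and end time. The following is a minimal sketch; it assumes the timestamps are already available (for example from a speech recogniser), and the names `Word` and `prosody_timing` are hypothetical.

```python
from typing import Dict, List, Tuple

# (text, start time in seconds, end time in seconds)
Word = Tuple[str, float, float]


def prosody_timing(words: List[Word]) -> Dict[str, object]:
    """Compute speech rate and inter-word pauses from word timestamps."""
    if not words:
        raise ValueError("at least one word is required")
    total = words[-1][2] - words[0][1]
    if total <= 0:
        raise ValueError("timestamps must span a positive duration")
    # Pause = gap between the end of one word and the start of the next.
    pauses = [max(0.0, nxt[1] - cur[2]) for cur, nxt in zip(words, words[1:])]
    return {
        "words_per_minute": 60.0 * len(words) / total,
        "pauses": pauses,
        "longest_pause": max(pauses, default=0.0),
    }


sample = [("I", 0.0, 0.2), ("like", 0.5, 0.9), ("tea", 1.0, 1.5)]
print(prosody_timing(sample))
```

Comparing these numbers with those of a reference recording would give one rough signal of a rhythm problem; stress and intonation need pitch and energy data as well.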

In recent years, academics have debated the goal of pronunciation teaching: whether it is to sound like a native speaker or to be intelligible. The consensus is that intelligibility is more important than sounding like a native speaker. AI-powered feedback should be provided keeping this goal in mind.

Recent research in this area has focused on designing audio-visual articulatory guides that would demonstrate the correct movement of the tongue with computer animations. Also, if the feedback showcases an approximation of intonation contours with information focussed on pitch, ascents and syllable boundary, it could help students correct their prosodic errors. These features have helped students improve tongue awareness, pitch and movement, eventually improving their pronunciation and articulation.

When it comes to prosodic errors, a one-size-fits-all approach cannot be adopted, as pitch and energy vary from speaker to speaker; females typically have a higher pitch than males, and children have a higher pitch than adults. Thus, a sample of each of these cases has to be made available, which the software uses as a reference both to flag errors and to show a corrective reference.
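One way to picture the per-group reference is a lookup of reference pitch contours keyed by speaker group, against which the learner's contour is compared. In the sketch below every value in `REFERENCE_CONTOURS` is made up for illustration, the contours are assumed to be time-aligned and of equal length, and the conversion to semitones relative to each speaker's median pitch is a swapped-in normalisation chosen for simplicity rather than anything the article prescribes.

```python
import math
from statistics import median
from typing import Dict, List

# Hypothetical reference contours (Hz) for the same phrase, one per speaker group.
REFERENCE_CONTOURS: Dict[str, List[float]] = {
    "adult_male": [110.0, 120.0, 140.0, 115.0],
    "adult_female": [200.0, 220.0, 260.0, 210.0],
    "child": [280.0, 300.0, 350.0, 290.0],
}


def to_semitones(contour: List[float]) -> List[float]:
    """Express each pitch value in semitones relative to the contour's median."""
    base = median(contour)
    return [12.0 * math.log2(f / base) for f in contour]


def pitch_deviation(learner: List[float], group: str) -> List[float]:
    """Per-frame difference (in semitones) between learner and group reference."""
    reference = REFERENCE_CONTOURS[group]
    if len(learner) != len(reference):
        raise ValueError("contours must be time-aligned and of equal length")
    return [l - r for l, r in zip(to_semitones(learner), to_semitones(reference))]


# A learner who speaks in a monotone misses the rise on the third frame.
print(pitch_deviation([200.0, 200.0, 200.0, 200.0], "adult_female"))
```

Picking the reference by group keeps the comparison within a similar pitch range, which is the point the paragraph above makes.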


Grammar is a set of rules specific to each language. For any sentence, phrase or even a clause to be intelligible, grammar rules should be followed. Grammar checking is one of the most common applications of natural language processing (NLP). An interdisciplinary field of study involving artificial intelligence, linguistics and computer science, NLP could be said to be an interaction between a computer and human (natural) language.

As everyone is aware, language is contextual. Humans follow grammar rules but bring in their creativity and social and cultural influence to a language to make it their own; using it as a means of communication, they share their thoughts, ideas and feelings. Thus, language is natural to humans. Even when it involves communicating in a language that is foreign, humans can demonstrate intuitiveness towards it, but demonstrating such a kind of behaviour is not possible for computers. Dealing with human language is thus a new and exciting field for language technology, and to correct human mistakes is quite a challenge.

In most cases, a grammar checker checks both grammar and spelling. A spelling checker can limit itself to individual words, but a grammar checker has to take context into account. The initial steps in natural language processing are morphological processing, i.e., analysis of words and non-words such as punctuation marks, and syntactic parsing, i.e., assigning a syntactic structure to a sentence.
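The first of these steps, separating words from non-words such as punctuation, can be sketched with a regular expression. This is a deliberately small illustration for English text written in ASCII; the pattern and the function name `tokenize` are assumptions, and real morphological analysers do much more (stemming, handling of affixes and so on). Syntactic parsing is not shown.

```python
import re
from typing import List, Tuple

# Words (with an optional apostrophe part, e.g. "doesn't"), numbers,
# or any single character that is neither a word character nor whitespace.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+|[^\w\s]")


def tokenize(sentence: str) -> List[Tuple[str, str]]:
    """Split a sentence into (token, kind) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(sentence):
        text = match.group()
        if text[0].isalpha():
            kind = "word"
        elif text.isdigit():
            kind = "number"
        else:
            kind = "punctuation"
        tokens.append((text, kind))
    return tokens


print(tokenize("She doesn't like tea, does she?"))
```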


The three common approaches adopted to check grammar are:

  • Machine learning-based checking, where the part-of-speech (POS) tagging of each sentence is checked against an annotated corpus,
  • Rule-based checking, where the sentence is instead checked against rules drafted by humans, and
  • Hybrid checking, which is a combination of the first two methods.
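To give a feel for the rule-based approach, here is a minimal sketch with two hand-written rules: one for an accidentally repeated word and one for 'a' or 'an' before the wrong kind of word. The function name `check_grammar` and both rules are illustrative assumptions. The article rule is simplified to look at spelling only, so it would misjudge cases such as 'an hour' or 'a university', where the choice depends on sound.

```python
import re
from typing import List


def check_grammar(sentence: str) -> List[str]:
    """Apply a couple of hand-written rules and return any issues found."""
    words = re.findall(r"[A-Za-z']+", sentence)
    issues = []
    for prev, cur in zip(words, words[1:]):
        starts_with_vowel = cur[0].lower() in "aeiou"
        # Rule 1: the same word twice in a row is usually a slip.
        if prev.lower() == cur.lower():
            issues.append(f"repeated word: '{cur}'")
        # Rule 2: 'a' before a vowel letter, 'an' before a consonant letter.
        elif prev.lower() == "a" and starts_with_vowel:
            issues.append(f"use 'an' before '{cur}'")
        elif prev.lower() == "an" and not starts_with_vowel:
            issues.append(f"use 'a' before '{cur}'")
    return issues


print(check_grammar("She ate a apple"))
print(check_grammar("He is is happy"))
```

A hybrid checker would combine rules like these with a statistical model, so that cases the rules cannot anticipate are still caught.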

Artificial intelligence-powered pronunciation and grammar tools are not only cheaper than human tutors, but they also allow students to learn at their own pace and in their own space. A student struggling with one section can take as much time as they need over it. Similarly, a student who makes rapid progress can move on to new topics faster. This allows students to take the reins of learning in their own hands.

Sai Charan Thirandas

Sai Charan Thirandas completed his Bachelors in Electronics & Communication Engineering from IIT-Guwahati. He joined Next Education India Pvt Ltd. as an R & D Engineer. He is currently working on the AI-based EdTech products – Adaptive Tests and Pronunciation Tool.

