
Technology in Bharathiya Bhasha for Schools


Before I put forward my views on what we should be doing in terms of technology in Bharatiya Bhasha (Indian languages), I would like to quickly skim through the current status and the work that has already happened, especially in using technology for Bharatiya languages.

If you look specifically at the topics for discussion on Day 2, Session IV, namely the development of Indic keyboards, handwriting recognition for Bharatiya scripts and voice-to-text conversion for Bharatiya languages, they broadly fall into the realm of Human Machine Interaction or Human Computer Interaction.

A significant amount of work has happened, and is happening as we speak, in several academic institutes, startups and established industries, but in isolation and in pockets, and in many cases the same work (same languages, same use cases) is being duplicated. Fortunately, several academic institutes have been liberally funded by the Department of Science and Technology, and the progress in terms of outputs and performance has been immense. So the good thing is that some of the best minds in India have been working on these cutting-edge technologies for quite some time now. Add to this the technological improvements of the recent past, namely (a) deep learning architectures, especially for language processing, and (b) improvements in computing, quantum computing included, and there has been a significant jump in the performance of most of these technologies and working systems.

Our own work in Bharatiya Bhasha and the recent advances in deep architectures and computing are an advantage that we can ride on.
Most of these technologies have been demonstrated for certain types of data and for specific languages that are popular around the globe, but this is not a disadvantage for us. The architectures that enable this enhanced performance can be used quite effectively for Bharatiya languages, provided we have access to the data required to train them. As an example, if you want to build a voice-to-text system, you need speech data spoken in a Bharatiya language and the corresponding text, in that language, that you want the system to generate.
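To make the voice-to-text example concrete, here is a minimal sketch of what "speech plus corresponding text" could look like as a data record. The `Utterance` class, the file paths and the helper names are all assumptions made for illustration; they do not describe any existing corpus or toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One training pair: a recording and the text that was spoken."""
    audio_path: str   # hypothetical path to a recording
    transcript: str   # text in the same language as the audio
    language: str     # ISO 639-1 code, e.g. "te" (Telugu), "or" (Odia)


def usable_pairs(utterances):
    """Keep only pairs that have both an audio path and a non-empty transcript."""
    return [u for u in utterances
            if u.audio_path.strip() and u.transcript.strip()]


def count_by_language(utterances):
    """Count usable pairs per language."""
    counts = {}
    for u in usable_pairs(utterances):
        counts[u.language] = counts.get(u.language, 0) + 1
    return counts


# Illustrative records only; the paths are made up.
SAMPLE = [
    Utterance("clips/te_0001.wav", "నమస్కారం", "te"),
    Utterance("clips/te_0002.wav", "", "te"),   # no transcript, so it is dropped
    Utterance("clips/or_0001.wav", "ନମସ୍କାର", "or"),
]
```

A recording without its transcript cannot be used to train a voice-to-text system, which is why the sketch drops such records before counting.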

Technological effectiveness has been demonstrated for English, for which a significant amount of data is readily available.

Fortunately, we are not far behind. We have agencies like the National Language Translation Mission, which have started collecting speech data for different Indian languages (called Bhasha Daan) and are actually building demonstrable systems, which as of now work in offline mode and not necessarily in real time. Apart from real-time performance, the technology demonstrated has matured significantly. I wouldn't claim it to be as good as for English, but we are definitely on the path to developing high-quality language systems. However, these are generic systems.

From the education angle, though, there is a need for certain specific types of speech and text data. For example, you might build a speech recognition engine for an Indian language, but making it work for children of different age groups would take a lot of effort. What is probably required is to collate speech data from school-going children. This is not an easy task: not only are there a large number of Indian languages (plus dialects and accents), but children's speech is also highly variable. A child's voice changes dramatically with age. All this has to be kept in mind if we want to make a serious effort to build something that can be used by our school and college children. As the saying goes, "data is the new oil"; we need to collect data from the people for whom we want to build functional systems.
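One way to keep this variability in view during data collection is to track how much speech has been gathered for each language and age band, and to flag the combinations that are still thin. The sketch below is only an illustration: the age bands, the record layout and the threshold are assumptions, not recommendations.

```python
from collections import Counter

# Hypothetical age bands (inclusive, in years), chosen only for illustration.
AGE_BANDS = [(5, 8), (9, 12), (13, 17)]


def age_band(age):
    """Return the label of the band containing `age`, or None if outside all bands."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None


def coverage(records):
    """Sum seconds of speech per (language, age band).

    `records` is an iterable of (language, speaker_age, seconds) tuples.
    """
    totals = Counter()
    for language, age, seconds in records:
        band = age_band(age)
        if band is not None:
            totals[(language, band)] += seconds
    return totals


def gaps(records, languages, min_seconds):
    """List the (language, age band) pairs with less than `min_seconds` of speech."""
    totals = coverage(records)
    return [(lang, f"{low}-{high}")
            for lang in languages
            for low, high in AGE_BANDS
            if totals[(lang, f"{low}-{high}")] < min_seconds]


# Made-up numbers, purely to show the shape of the data.
RECORDS = [("te", 6, 120.0), ("te", 10, 30.0), ("or", 15, 200.0)]
```

With these made-up records and a threshold of 60 seconds, `gaps(RECORDS, ["te", "or"], 60.0)` would point at Telugu speakers aged 9-12 and 13-17 and Odia speakers aged 5-8 and 9-12 as the groups still needing data.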

Building systems for children is a mammoth task requiring government agencies, academic institutions and industries to work together very closely and collect data.
Let's face it. Most of us have been influenced by our teachers to choose a certain specialization or career path. We have had our share of motivated teachers who make sure that children develop an interest in their subject. So, from the school and children angle, why can't we have motivated teachers for all children?

Say a teacher teaching in Telugu in Andhra Pradesh has the rare ability to motivate children. Why not have this teacher teach in Odiya in an Orissa school? What we probably need is a speech-to-speech translation technology that can translate from Telugu to Odiya. Language translation technology exists. However, most of these systems have been trained using generic data. For example, they may be built by scraping the data (available in abundance) from newspapers or Wikipedia articles, which carry a very specific kind of information. So what is really required, to enable a Telugu teacher to teach in Odiya, is for us to systematically collect or scrape data that is specific to the school curriculum. By doing this, we should be able to get a good teacher in language A to deliver the same technical knowledge in language B without the student actually realizing that the teacher cannot speak Odiya.
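One common way to think about speech-to-speech translation is as a cascade: recognize the source speech as text, translate the text, then synthesize speech in the target language. The sketch below shows only that wiring. The three stages are toy stand-ins written for this illustration (the "audio" is just UTF-8 bytes and the glossary entries are placeholder strings, not real Telugu or Odiya), so nothing here is a real recognition, translation or synthesis engine.

```python
from typing import Callable, Dict


def make_speech_translator(
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Callable[[bytes], bytes]:
    """Chain the three stages into one speech-to-speech function."""
    def speech_to_speech(audio: bytes) -> bytes:
        source_text = recognize(audio)        # speech -> text (language A)
        target_text = translate(source_text)  # text A -> text B
        return synthesize(target_text)        # text -> speech (language B)
    return speech_to_speech


# Toy stand-ins so the sketch runs end to end.
def toy_recognize(audio: bytes) -> str:
    """Pretend recognizer: treats the bytes as UTF-8 text."""
    return audio.decode("utf-8")


def make_glossary_translator(glossary: Dict[str, str]) -> Callable[[str], str]:
    """Word-by-word lookup; unknown words pass through unchanged."""
    def translate(text: str) -> str:
        return " ".join(glossary.get(word, word) for word in text.split())
    return translate


def toy_synthesize(text: str) -> bytes:
    """Pretend synthesizer: returns the text as UTF-8 bytes."""
    return text.encode("utf-8")


# Placeholder curriculum glossary; the entries are labels, not real words.
CURRICULUM_GLOSSARY = {"TE:triangle": "OR:triangle", "TE:angle": "OR:angle"}

telugu_to_odiya = make_speech_translator(
    toy_recognize,
    make_glossary_translator(CURRICULUM_GLOSSARY),
    toy_synthesize,
)
```

The point of keeping the translation stage separate is that it is the stage where curriculum-specific data would matter most: a model trained on newspaper text may not know classroom vocabulary, and that stage could be retrained or swapped without touching the other two.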

An additional possibility is to have the teaching material spoken in the voice of a chosen personality. How about learning mathematics from Shakuntala Devi, or history from Amitabh Bachchan? Personalized text-to-speech exists as a technology. We need data that is specific to school curricula and in a form that can be used to train speech generating (text-to-speech) systems.

For varying reasons, Indian students do not ask too many questions, though they are inquisitive. Technology can help build self-help systems, such as conversational interfaces, where children can seek answers to their queries without any inhibitions. This is possible using technologies like large language models (LLMs). To build LLMs for schools, you need to be very careful about the authenticity of the data. You cannot have children learn incorrect information. Luckily, LLMs don't play much havoc with syntax, but they do hallucinate, which is a huge problem when it comes to stating true facts. The curation of data is very important if these systems are to be adopted in schools and colleges.

We need a collaborative effort, with several different organizations coming together, to make it happen.
To summarize, the important things that I think we should be working on are:

  • systems that can listen to children's voices and respond interactively, with empathy, in the language of their choice.
  • technology that lets children listen to their lessons in the language of their choice and in the voice of their choice.
  • tools that enable children to assess themselves, again in the language of their choice, by interacting with a voice of their choice.


Exams are every student's nightmare. Every year our Prime Minister holds a session with children on how to face exams! Self-assessment conversational interfaces might help ease the exam nightmare.

Exams are every student's nightmare. Technology can help.

Most of these things can be enabled by current or near-future technology, but we need to put very serious and concentrated effort into collecting data, curating it and making sure it is in a form that can be digested by machine learning architectures.

Before I end, there is one small catch. Technologies that cater to human machine interaction have cultural aspects built into them as well. Today we use design aspects that are broadly conceptualized in the western world to cater to western cultures, and as a result they are not the right fit. One way of overcoming this is to build people's strength in higher colleges.

To wrap up: we are a huge country. We have loads of challenges in terms of the number of languages, dialects and accents. However, we have the intellectual energy in terms of ideas. We have access to usable technology. We have the ability to build working solutions. We have people who are motivated. We have the belief.

We just need a collaborative and a focused effort to move the needle.

Thank you.

Comments

Usha said…
Insightful rendering, Sunilgaru. Neat concept!
