Machine learning model for Indian languages revealed by Google
Last updated on December 21st, 2020 at 05:35 am
In a bid to promote local Indian languages, global tech giant Google has unveiled a machine learning model for Indian languages. The move has been welcomed, as a growing number of non-English-speaking Indians are using the internet.
Sanjay Gupta, Country Head and VP, Google India, said, “India has added over 100 million new internet users from rural India in the last two years. Every new user coming online is an Indian language user, and we are committed to play a part. Today, we are calling up the industry to take a Bharat-first approach and build an internet that works for every Indian.”
Google unveiled the tool on Thursday. It is intended to help researchers, students and startups that are building technologies based on local languages.
The tool was developed at Google India’s research unit and currently supports English and 16 Indian languages. It is built on BERT (Bidirectional Encoder Representations from Transformers), Google’s own language model, which the company currently uses to analyze English queries on its search engine.
The model is called Multilingual Representations for Indian Languages (MuRIL). It is designed to address limitations in how computer systems understand Indian languages, including spelling variations, mixed languages and specific use cases. The model will also support transliterated text, for instance, Hindi written in Roman script.
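To make the transliteration and mixed-language problem concrete, here is a minimal Python sketch that labels each word in a sentence by the script it is written in. It is not part of MuRIL and says nothing about how the model works internally; the function names `script_of` and `label_tokens` are hypothetical and used only for illustration. It simply shows the kind of input a model like this has to handle, where the same Hindi word may appear in Devanagari or in Roman script within a single sentence.

```python
import unicodedata


def script_of(token: str) -> str:
    """Return a coarse script label for one word.

    Hypothetical helper for illustration only: it looks at Unicode
    character names and recognises just Devanagari and Latin.
    """
    scripts = set()
    for ch in token:
        name = unicodedata.name(ch, "")
        if name.startswith("DEVANAGARI"):
            scripts.add("devanagari")
        elif name.startswith("LATIN"):
            scripts.add("latin")
        elif ch.isalpha():
            # A letter from some other script (Tamil, Bengali, ...).
            scripts.add("other")
        # Digits, punctuation and spaces are ignored.
    if not scripts:
        return "other"
    if len(scripts) == 1:
        return scripts.pop()
    return "mixed"


def label_tokens(text: str) -> list[tuple[str, str]]:
    """Split text on whitespace and pair each word with its script label."""
    return [(token, script_of(token)) for token in text.split()]


if __name__ == "__main__":
    # A code-mixed sentence: one Hindi word in Devanagari, two in Roman script.
    for token, script in label_tokens("मुझे khana chahiye"):
        print(token, script)
```

A rule like this can only tell which script a word uses. Recognising that "khana" and "खाना" are the same Hindi word is the harder problem that a model trained on transliterated text is meant to address.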
Partha Talukdar, research scientist at Google Research India, said, “MuRIL is a starting point of what we believe can be the next evolution for Indian language understanding. We hope it will prove to be a better foundation for researchers, startups, students, and anyone else interested in building Indian language technologies.”