Preserving Indigenous Languages in the Digital Age
Indigenous languages, rich repositories of cultural heritage, are under the looming threat of extinction. The United Nations Educational, Scientific and Cultural Organization (UNESCO) warns that without intervention, 40% to 50% of the world’s 6,000 Indigenous languages may vanish by the end of this century, leading to the loss of not just words but entire worldviews and cultural know-how.
In response to this crisis, Indigenous language speakers and experts across the Americas are using technology to safeguard their endangered tongues. Linguistics professor Roberto Zariquiey of the Pontifical Catholic University of Peru (PUCP) told Axios Latino that losing a language means losing more than just words: it means losing knowledge, cultural nuances, and unique perspectives on the environment.
Zariquiey, along with linguist Mariana Poblete, is spearheading an initiative in the Peruvian Amazon. Collaborating with the Max Planck Institute in Germany and the University of Zurich, their team is using high-tech cameras in psycholinguistic studies to understand the role of vision in language. The researchers have also developed two apps that teach the Iskonawa language, integrating flashcards and a pronunciation guide. These apps provide an interactive learning experience and also contribute to documenting and preserving the language.
This is not the only case of technology being used to preserve Indigenous languages. The Axios article also noted that, in Mexico, a group of undergraduates has launched a similar initiative with the Miyotl app, a free language learning platform that began as a bilingual dictionary for the Maya, Mixe, and Hñahñu languages. Now covering 15 of Mexico’s 68 Indigenous languages, Miyotl includes annotated texts that aid reading comprehension and help learners understand grammatical constructions.
These efforts are not exclusive to Latin America; they are happening in the U.S. too. The Language Conservancy, a U.S.-based nonprofit, collaborates with Native American linguists and uses a computer program for rapid word collection in languages like Apache, Lakota, and Cree to create online dictionaries and educational materials. In Alaska, there are efforts to integrate technology into the preservation of the Tlingit language. However, large language models (LLMs) struggle with the complexity of its verb tenses and tend to produce text that makes little sense.
Native North American languages, characterized by their polysynthetic nature and distinctive phonetics, present a formidable challenge for existing software and algorithms built on European languages. Because these languages can convey in a single word an idea that would take a whole sentence in English, adapting existing language models is a difficult and labor-intensive process.
While technology offers immense potential for preserving Indigenous languages, concerns about intellectual property rights and potential biases in data collection linger. Rina Chandran explored these concerns in an article in the Japan Times. There, Peter-Lucas Jones, CEO of Te Hiku Media, a Māori-owned organization in New Zealand focusing on language conservation, highlights the importance of ensuring that AI models using Indigenous languages are developed with respect for cultural sensitivities and without indiscriminately scraping data from the internet, a practice that has also raised concerns in other fields.