06/23/2026 | News release | Distributed by Public on 06/23/2026 16:13
Jared Coleman, assistant professor of computer science at Loyola Marymount University, is working on documenting, preserving, and revitalizing endangered indigenous languages using artificial intelligence (AI) and computational linguistics. A member of the Big Pine Paiute Tribe of the Owens Valley in California, Coleman got into this work because he wanted to support his community's ongoing efforts to revitalize their endangered language.
Through his work in this field, Coleman has observed that computer science scholars, linguists, and language educators often work in silos and may not involve members of indigenous communities early in the process. To facilitate a more collaborative approach, Coleman conceived and hosted an "AI and Indigenous Language Revitalization" workshop in June 2026 to explore how AI can be developed and applied responsibly in endangered language contexts.
This workshop was funded through a $50,000 grant from the National Science Foundation awarded to Coleman in fall 2025 to further investigate endangered language revitalization.
Held on campus at LMU, the three-day workshop convened more than 60 invited guests from across the country, including indigenous community members, computer scientists, AI experts, linguists, archivists, language teachers, nonprofit leaders, education experts, LMU students, students from other universities, and staff from AI and computer science companies.
One of the main goals of this type of work is to combine advanced natural language processing (NLP) with community-driven data stewardship to create accessible translation tools, speech recognition software, and immersive learning platforms. The workshop included panel discussions, expert presentations, talks, tutorials, and breakout sessions.
"Through my research, I learned how far the traditional language documentation field has come in moving from a largely extractive process to a more community-driven one," said Coleman. "AI, on the other hand, is still in a largely extractive phase of its research lifespan. My hope with this workshop was to bring together people from very different backgrounds to surface shared concerns and move toward defining what responsible AI looks like within this space."
During one of the panel discussions, Kiana Maillet, a member of the Lone Pine Paiute-Shoshone Tribe, raised a common concern about outsiders extracting knowledge from marginalized communities. "Our community has many talented and knowledgeable people, but we lack the resources needed to do this work," said Maillet, assistant professor of American Indian studies at California State University San Marcos.
"Right now, outside people and organizations often come into our communities, extract knowledge and benefit from this work because they have the resources," Maillet explained. "What we need instead is for them to provide the technology, training, and support our community needs, then allow us to lead and carry out the work ourselves."
Over the three-day workshop, several key conversations and pain points like this came to the surface. For example, participants discussed the mistrust felt by indigenous communities, which have a long history of being subjected to extractive research and being taken advantage of, paired with a lack of understanding of that history among AI and NLP researchers.
On the technical side, participants also brought up how AI jargon can lead to misunderstandings and create barriers to real conversations among stakeholders. In addition, indigenous communities' goals and needs vary widely, so there is no one-size-fits-all solution.
"My primary takeaway from this workshop is that conversations about this topic are genuinely difficult, for many reasons," said Coleman, who specializes in a machine translation approach called Large Language Model-Assisted Rule-Based Machine Translation (LLM-RBMT). In this method, the large language model (LLM) guides rule-based translators, which rely on grammatical and vocabulary rules to translate between languages.
"In the age of AI, more people should be paying attention to this topic," said Coleman. "Nearly every major question about AI society is asking shows up in language revitalization work: environmental impact, privacy, data sovereignty, consent, ownership, misrepresentation, and misinformation. AI has significant potential for positive solutions. If you want to understand what responsible AI looks like in practice, not just in principle, this is one of the best places to look."
As part of the grant, Coleman is working with workshop participants to build a living, example-driven document that offers researchers, communities, and institutions concrete guidance on responsible AI in practice. Another outcome, aimed at supporting sustained collaboration, will be the formation of an interdisciplinary working group spanning many of the disciplines represented at the workshop. This working group will be formed in partnership with the San Diego Supercomputer Center's CORE Institute Fellowship, which will put out a call for fellows soon.