The Facebook Papers have revealed how ill-equipped the platform’s automated moderation systems are, especially in non-English languages. One document stated that Facebook had not developed hate-speech classifiers in Hindi and Bengali until 2018 and 2020, respectively.

One curious fact: there isn’t much data on the internet in languages other than English. Hindi, for instance, is the third most spoken language in the world but only the 32nd most used online, accounting for just 0.1% of all internet data.

Misinformation researcher Tarunima Prabhakar knows this shortcoming firsthand. While peers in the U.S. trained machine learning models with a few lines of code, she spent six months listening to the names of vegetables in Gujarati. “That’s what you do when you have an under-resourced language,” she said.

She briefly worked as a policy researcher at Facebook in 2018. Now, Prabhakar’s day job involves curating social media datasets in multiple Indian languages and opening them up for the public to study misinformation in the Global South. I spoke to her about the complexities of building hate speech classifiers for “resource-poor” languages.

Many believe that Facebook is unable to build effective AI hate-speech classifiers in non-English languages because the internet is skewed toward English. Do you think that’s the case?

I think that’s part of the problem. But then, if you were to devote resources to it, you would find ways to collect the data. If Google could go out and digitize books to create the Google Books project, then if you prioritize language representation, you will collect that data as well. So it’s not as if [lack of local language data is] the only reason. I really think it’s just prioritization. If anyone has the resources to do this, it is Big Tech.

Unfortunately, the state of machine learning is that it’s led by Big Tech. No one has the resources they have. So, yeah, I think if they decided to actually devote resources, even based on their markets and not based on where regulation is going to hurt most, the world would look a lot different.

Can you share examples of tools that are easily available for building hate speech classifiers in English but difficult to access in local languages?

A very basic thing would be a list of slurs. For example, there’s something called the Racial Slur Database. You’ll find a bunch of slur lists in English, but if you need even basic slurs in Hindi, these lists don’t exist. Indian English deserves unique attention because some of these platforms are doing a really bad job of capturing slurs in Indian English. For instance: <3day [spelled, styled, and pronounced in a variety of different ways].

This is basic infrastructure because, once you have that list, you need to keep adding to it, watching out for new variants and trying to match weird spelling variations against the terms that are on the list. So what we’re doing currently is really just asking activists to help us develop a list of slurs that are used online. If we get some of these resources built, it will hopefully nudge platforms to actually start acting on it.
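To make the matching step concrete, here is a minimal sketch in Python of how spelling variations might be checked against a curated slur list. The placeholder entries, normalization rules, and similarity threshold are assumptions for illustration, not the actual resource or tooling described in the interview.

```python
# Minimal sketch: fuzzy-matching tokens against a curated slur list.
# SLUR_LIST entries, the normalization rules, and the 0.85 threshold are
# illustrative placeholders, not the real resource discussed above.
import re
from difflib import SequenceMatcher

SLUR_LIST = ["exampleslur", "anotherslur"]  # placeholder entries

def normalize(token: str) -> str:
    """Lowercase, undo simple leetspeak-style substitutions, strip
    punctuation, and collapse repeated letters ("soooo" -> "so")."""
    token = token.lower()
    token = token.translate(str.maketrans("013457@$", "oleastas"))
    token = re.sub(r"[^a-z\u0900-\u097f]", "", token)  # keep Latin + Devanagari letters
    return re.sub(r"(.)\1{2,}", r"\1", token)

def matches_slur(token: str, threshold: float = 0.85) -> bool:
    """True if the normalized token is close enough to any listed slur."""
    cleaned = normalize(token)
    if not cleaned:
        return False
    return any(
        SequenceMatcher(None, cleaned, slur).ratio() >= threshold
        for slur in SLUR_LIST
    )

if __name__ == "__main__":
    for word in ["Ex@mpleslur!!", "anotherrrslur", "harmless"]:
        print(word, "->", matches_slur(word))
```

A real-world list would also need to handle transliteration, since the same Hindi slur can be written in Devanagari or in Latin script, which is part of why building this kind of resource is slow, manual work.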

What will be your next step after the slur list?

The way we’re piloting it is to use the list to train a model that detects gender-based violence, hate speech, and harassment on a person’s Twitter profile. Then you can either mute the tweet or replace the slur. It’s a way to give users more agency over what’s happening on their own feeds.
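As a rough illustration of that mute-or-replace step, the sketch below assumes a trained abuse classifier and a slur list already exist; the stub model, the function names, and the thresholds are placeholders, not the pilot tool itself.

```python
# Rough sketch of the mute-or-replace step: a stub abuse classifier plus
# slur redaction. All names, scores, and thresholds are illustrative.
import re
from dataclasses import dataclass

SLUR_LIST = ["exampleslur", "anotherslur"]  # placeholder entries

@dataclass
class Decision:
    mute: bool          # hide the tweet entirely
    redacted_text: str  # text shown otherwise, with slurs masked

def predict_abuse_score(text: str) -> float:
    """Stub standing in for a trained hate/harassment model; returns a score in [0, 1]."""
    return 0.9 if any(s in text.lower() for s in SLUR_LIST) else 0.1

def moderate_tweet(text: str, mute_threshold: float = 0.8) -> Decision:
    score = predict_abuse_score(text)
    # Mask any listed slur even when the tweet is not muted outright.
    pattern = re.compile("|".join(map(re.escape, SLUR_LIST)), re.IGNORECASE)
    redacted = pattern.sub(lambda m: "*" * len(m.group()), text)
    return Decision(mute=score >= mute_threshold, redacted_text=redacted)

if __name__ == "__main__":
    print(moderate_tweet("you are an Exampleslur"))
```

Because this runs over the reader’s own feed rather than at the platform level, the user, not the classifier, gets the final say on what is hidden.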