This New Way to Train AI Could Curb Online Harassment

For about six months last year, Nina Nørgaard met weekly for an hour with seven people to talk about sexism and violent language used to target women in social media. Nørgaard, a PhD candidate at IT University of Copenhagen, and her discussion group were taking part in an unusual effort to better identify misogyny online. Researchers paid the seven to examine thousands of Facebook, Reddit, and Twitter posts and decide whether they evidenced sexism, stereotypes, or harassment. Once a week, the researchers brought the group together, with Nørgaard as a mediator, to discuss the tough calls where they disagreed.

Misogyny is a scourge that shapes how women are represented online. A 2020 Plan International study, one of the largest ever conducted, found that more than half of women surveyed in 22 countries said they had been harassed or abused online. One in five women who encountered abuse said they changed their behavior, cutting back on or stopping their use of the internet, as a result.

Social media companies use artificial intelligence to identify and remove posts that demean, harass, or threaten violence against women, but it’s a tough problem. Among researchers, there’s no standard for identifying sexist or misogynist posts; one recent paper proposed four categories of troublesome content, while another identified 23 categories. Most research is in English, leaving people working in other languages and cultures with even less of a guide for difficult and often subjective decisions.

So the researchers in Denmark tried a new approach, hiring Nørgaard and the seven people full-time to review and label posts, instead of relying on part-time contractors often paid by the post. They deliberately chose people of different ages and nationalities, with varied political views, to reduce the chance of bias from a single worldview. The labelers included a software designer, a climate activist, an actress, and a health care worker. Nørgaard’s task was to bring them to a consensus.

“The great thing is that they don’t agree. We don’t want tunnel vision. We don’t want everyone to think the same,” says Nørgaard. She says her goal was “making them discuss between themselves or between the group.”

Nørgaard viewed her job as helping the labelers “find the answers themselves.” With time, she got to know each of the seven as individuals, learning, for example, who talked more than others. She tried to make sure no individual dominated the conversation, because it was meant to be a discussion, not a debate.

The toughest calls involved posts with irony, jokes, or sarcasm; they became big topics of conversation. Over time, though, “the meetings became shorter and people discussed less, so I saw that as a good thing,” Nørgaard says.

The researchers behind the project call it a success. They say the conversations led to more accurately labeled data to train an AI algorithm. The researchers say AI fine-tuned with the data set can recognize misogyny on popular social media platforms 85 percent of the time. A year earlier, a state-of-the-art misogyny detection algorithm was accurate about 75 percent of the time. In all, the team reviewed nearly 30,000 posts, 7,500 of which were deemed abusive.

The posts were written in Danish, but the researchers say their approach can be applied to any language. “I think if you’re going to annotate misogyny, you have to follow an approach that has at least most of the elements of ours. Otherwise, you’re risking low-quality data, and that undermines everything,” says Leon Derczynski, a coauthor of the study and an associate professor at IT University of Copenhagen.

The findings could be useful beyond social media. Businesses are beginning to use AI to screen job listings and public-facing text such as press releases for sexism. The stakes are civic, too: if women exclude themselves from online conversations to avoid harassment, that will stifle democratic processes.

“If you’re going to turn a blind eye to threats and aggression against half the population, then you won’t have as good democratic online spaces as you could have,” Derczynski says.
