Can anti-bias efforts help women get their voices back?
Ellyn Winters-Robinson just can't get anywhere with Siri. "I have tried time and again," she says, but invariably Apple's voice assistant simply can't understand her. "I don't know whether I talk fast or I mumble or it's my Canadian accent, but every time I ask Siri to call my husband, she dials someone else. Inevitably it ends up with me giving up, losing my temper, and swearing at her."
Winters-Robinson isn't alone. Women have struggled for decades with a range of technologies designed to understand, transcribe, or reproduce their voices. As artist, composer, and engineer Tina Tallon wrote in a 2019 New Yorker article, the trouble dates all the way back to the 1920s and the dawn of AM radio, when government regulations throttled the amount of bandwidth assigned to each station. Limited to a narrow voiceband of just 300 to 3,400 hertz, AM radio tended to (and still does) sound tinny and hollow, particularly for women broadcasters, whose voices naturally register at a higher pitch. As a result, broadcasting stations reportedly gravitated to deeper-voiced (and overwhelmingly male) on-air talent, claiming that women's voices didn't sound appealing. Similar problems emerged at the dawn of digital music, when lossy compression trimmed high-frequency content, again degrading higher-pitched audio and again affecting women more than men.
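To put rough numbers on that bandwidth limit, a short sketch can count how many harmonics of a voice survive the 300-to-3,400-hertz voiceband. The fundamental frequencies used here are typical textbook values, not figures from Tallon's article:

```python
# Illustrative sketch (assumed typical fundamentals: ~120 Hz for a male voice,
# ~210 Hz for a female voice). Counts which harmonics of each fundamental
# fall inside the AM voiceband of 300-3,400 Hz.

def harmonics_in_band(f0, lo=300.0, hi=3400.0):
    """Return the harmonics of fundamental f0 that fall inside [lo, hi] Hz."""
    return [n * f0 for n in range(1, int(hi // f0) + 1) if lo <= n * f0 <= hi]

male = harmonics_in_band(120.0)    # the 120 Hz fundamental itself is cut off
female = harmonics_in_band(210.0)  # the 210 Hz fundamental is also cut off

# The lower-pitched voice retains noticeably more harmonics, and therefore
# more spectral detail, inside the same narrow band.
print(len(male), len(female))
```

Both fundamentals sit below the 300 Hz floor, but the lower-pitched voice packs far more harmonics into the band, which is one way to see why higher-pitched voices come through sounding thinner.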
More men in the audio industry meant more men's voices recorded for posterity – at higher quality – and the result was a cascading bias effect whose consequences persist today. The Register reported in 2018 that voice recognition algorithms were "naturally sexist," favoring men's voices over women's both because of their lower average pitch and because of insufficient inclusion of women's voices in training data – the latter an issue directly brought about by years of women's underrepresentation in archived voice recordings. Research published by Rachel Tatman at the North American Chapter of the Association for Computational Linguistics (NAACL) previously found that Google's speech recognition systems were 13 percent more accurate for men than for women.
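Gaps like that 13 percent figure come from audits that compare recognition accuracy across speaker groups. Here is a minimal sketch of that kind of audit; the helper names and per-group averaging are my own illustration, not Tatman's code:

```python
# Word error rate (WER): word-level edit distance divided by reference length.
# Grouping WER by speaker demographic surfaces accuracy gaps between groups.

def wer(reference, hypothesis):
    """Word error rate between a reference transcript and a hypothesis."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(ref)

def group_wer(samples):
    """samples: (group, reference, hypothesis) triples -> mean WER per group."""
    totals = {}
    for group, ref, hyp in samples:
        totals.setdefault(group, []).append(wer(ref, hyp))
    return {g: sum(v) / len(v) for g, v in totals.items()}
```

Feeding such a function matched recordings from men and women and comparing the per-group means is, in essence, how studies like Tatman's quantify the disparity.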
Garbage in one ear, garbage out the other
The bias problem here turns out to be a layered, complex one. "So many of these tools in AI are built on large language models," says Tallon, author of the New Yorker piece, "where all these different types of biases are hard coded in, and a lot of those biases are very clearly demonstrated. When you look at what GPT-3 is trained on, a lot of it is literally upvoted Reddit posts. It's basically just a giant cybernetic human centipede of garbage."
"Part of the issue is that there's this feedback mechanism of things that become increasingly polarized based upon the engagement that it gets," adds Tallon. "So we're training models to create things that get more engagement by using things that have already gotten more engagement. Everybody complains about hyperpolarization in the political sphere and in society in general. But it's our tools that are literally doing that."
Voice recognition models also suffer from a related bias problem that is endemic across the burgeoning AI industry, says Tallon. In language processing, the training data is biased, and it's used to generate transcripts that are then used to retrain the AI models. Poor quality content is thrown out, "so you're further enriching the biased content," says Tallon. "It's like a feedback loop, where these biases become embedded in it in different stages."
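The loop Tallon describes can be made concrete with a toy model. This is entirely my own construction with invented retention rates, not data from the article: if each retraining round keeps 95 percent of transcripts from the group the model already handles well but only 70 percent from everyone else, the favored group's share of the training data climbs round after round:

```python
# Toy simulation of a biased retraining feedback loop. Transcripts the model
# handles poorly are discarded before retraining, so the favored group makes
# up a growing share of each subsequent round's training data.

def retrain_round(share_favored, keep_favored=0.95, keep_other=0.70):
    """One filter-and-retrain round: return the favored group's new data share."""
    kept_f = share_favored * keep_favored
    kept_o = (1 - share_favored) * keep_other
    return kept_f / (kept_f + kept_o)

share = 0.60  # favored group starts at 60% of the training data
for _ in range(5):
    share = retrain_round(share)

# After five rounds the favored group's share has grown substantially,
# without anyone ever deciding to exclude the other group outright.
print(round(share, 2))
```

No single step looks like deliberate exclusion; the drift toward the favored group is an emergent property of filtering on the model's own performance.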
Lastly, the issue is compounded by compression algorithms common in the audio space. "The vast majority of people are engaging with the lowest common denominator: MP3s and streaming audio which is being listened to on a little phone speaker," says Tallon. "You're dealing with highly compressed, highly degraded data. That ends up amplifying some of these biases further."
An algorithm you can trust
Solving these myriad issues won't be easy, but there is hope.
"There's a lot of research going on here, but it's still not really transparent," says Rohini Chavakula, data scientist with Hewlett Packard Enterprise's AI & Security divisions. "I would say it's still a black box. I'm trying to make it a glass box, or at least a white box."
Rohini doesn't just work on AI bias issues for HPE. She's experienced them firsthand. "We got a Google Home Mini about a year ago," she says. "When my husband would say something, it detected it very nicely. When I was talking, it was always misspelling things and giving me the wrong output."
Part of the solution, she hopes, can be found in her research into Trustworthy AI, an initiative designed to identify ethical principles, including security, privacy and inclusion, to guide the development of AI platforms. When it comes to voice recognition, Rohini says her primary focus is on input bias. "It's really hard to understand the algorithmic bias," she says, "so we are looking at what we are feeding into the algorithm. Can we look at the error rates due to bias and then see how we could tweak the input based on that?" Rohini notes that this involves a series of tradeoffs: Limiting data input because of perceptions of bias restricts the ability of the AI to learn anything at all. The tighter the grip over the input provided, the less accurate the AI may become on the whole.
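One common input-side tweak of the kind Rohini describes is rebalancing the training pool so every speaker group is equally represented. The sketch below is hypothetical, not HPE's method, and it illustrates the tradeoff she mentions: oversampling duplicates existing recordings rather than adding new information for the model to learn from.

```python
# Hypothetical input rebalancing: oversample under-represented speaker groups
# so each group appears equally often in the training pool.
import random

def balance_by_group(samples, key=lambda s: s["group"], seed=0):
    """Return a pool where every group has as many samples as the largest group."""
    rng = random.Random(seed)  # fixed seed keeps the rebalancing reproducible
    groups = {}
    for s in samples:
        groups.setdefault(key(s), []).append(s)
    target = max(len(v) for v in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Pad minority groups by resampling their own members (duplicates).
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced
```

The duplicated minority samples equalize group counts but carry no new acoustic variety, which is exactly the accuracy-versus-fairness tension Rohini points to.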
A second piece of the puzzle, says Rohini, is much more difficult to control: the human bias that unconsciously creeps into this domain. "The human element is still present in AI," she says, "and that human element is inducing social bias, which has to be reduced. It's a human who is deciding, in training and validation and the testing phase for an AI solution, if the model is good enough to move into a live environment. That's where Trustworthy AI can come into play, using policies and regulations to formalize steps that can help improve and control bias."
Another goal that could help solve this problem organically: Get more women into the AI field. According to Rohini, of those working in voice AI research and development, only 12 percent are women.
Can you hear me now?
Tallon agrees that some of the tenets of Trustworthy AI have promise but that we have a long way to go. "There are companies out there which do AI audits and equity audits," she says, "and I think that that'll probably become more commonplace over time. But then Google goes and fires its equitable AI team. It doesn't inspire a lot of confidence." Tallon adds that some form of governmental regulation may ultimately be the best and fastest way forward.
The good news is that something seems to be working, and things are improving, if slowly. A follow-up study by Tatman at the NAACL found that bias in voice recognition had decreased considerably – at least for women – though significant performance differences remained along racial and regional dialect lines.
Rohini says her Google Mini is getting better too. "I still see errors," she says, "but the accuracy is little bit improved. It's like now it finally started learning, understanding my voice."
This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.