More and more tech companies have initiatives in place to support Diversity, Equity & Inclusion (DEI) work. But even as Chief Diversity Officers get hired and diversity statements make their way onto company websites, diverse representation in tech is still lagging. This representation deficit, particularly in product and engineering departments, has huge implications. With the current population of software engineers comprising 25% women, 7.3% Latinos and 4.7% Black people, the teams building technology are not adequately representing the people using it.
Artificial Intelligence (AI) is an area of computer science that focuses on enabling computers to perform tasks that have traditionally required human intelligence. The innovations leveraging AI can be incredibly powerful, but they are as prone to biases as the humans that made them. Representation in this case needs to go well beyond “diversity of thought.” When the right perspectives, identities and experiences don’t go into building, training and testing AI, the outputs can range from embarrassing to life-threatening.
New cases of biased AI are constantly surfacing. For anyone looking to avoid these missteps and to make the case for diverse representation on AI engineering teams, here are a few of the many examples to learn from.
AI can be used to predict a future event or outcome based on past data and pattern-matching. While this can provide powerful insights, it is also an area fraught with bias, depending on the data used and how the model is trained.
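To see how this happens in the simplest possible case, consider a naive predictor trained on historical decisions. This is an illustrative sketch with made-up data, not any real system: the model just learns each group's historical rate of positive outcomes, so whatever disparity exists in the past data is reproduced wholesale in its predictions.

```python
from collections import defaultdict

def train(history):
    """history: list of (group, outcome) pairs from past decisions.
    Returns each group's historical rate of positive outcomes, which the
    naive model then uses as its 'prediction' for that group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, outcome in history:
        counts[group][0] += outcome
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

# Hypothetical past data: group B received positive outcomes far less often.
history = ([("A", 1)] * 80 + [("A", 0)] * 20 +
           [("B", 1)] * 30 + [("B", 0)] * 70)
model = train(history)
print(model)  # {'A': 0.8, 'B': 0.3} -- the historical disparity, learned as-is
```

Real systems are far more sophisticated, but the underlying failure mode is the same: a model optimized to match past data will faithfully reproduce the biases embedded in that data.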
For example, an analysis of a health AI system that was used to predict which patients should get additional medical care found that the racial bias introduced by the algorithm “reduces the number of Black patients identified for extra care by more than half” and that fixing this disparity “would increase the percentage of Black patients receiving additional help from 17.7 to 46.5%.”
An investigation by ProPublica into an AI criminal scoring system found similarly life-impacting results. Its analysis of a risk assessment tool used in courtrooms to inform decisions about who can be set free found that “the formula was particularly likely to falsely flag Black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants,” while “White defendants were mislabeled as low risk more often than black defendants.” These are cases where errors in AI outcomes aren’t just numbers — they’re human lives.
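The “twice the rate” finding is a comparison of false positive rates: among defendants who did not go on to reoffend, what fraction were nonetheless flagged as high-risk, broken down by group? A minimal sketch of that metric, using hypothetical counts rather than the actual study data:

```python
def false_positive_rate(flagged, did_not_reoffend):
    """Fraction of non-reoffending defendants who were wrongly
    flagged as high-risk."""
    return flagged / did_not_reoffend

# Hypothetical counts among defendants who did not reoffend:
fpr_group_1 = false_positive_rate(45, 100)  # 45% wrongly flagged high-risk
fpr_group_2 = false_positive_rate(23, 100)  # 23% wrongly flagged high-risk
print(fpr_group_1 / fpr_group_2)  # roughly a 2x disparity
```

A tool can look “accurate” overall while its errors fall disproportionately on one group, which is why disaggregating error rates like this matters.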
Language processing for both voice and text has been a leading scenario in AI research, and reports of bias continue to emerge from work in this space.
One example that surfaced a few days ago involved Google translations from Hungarian, a language with gender-neutral pronouns. When translating into English, Google Translate assigned gendered pronouns to the gender-neutral phrases, revealing strong gender biases. These included, “He’s a professor. She’s an assistant.” and “He makes a lot of money. She is baking a cake.”
Voice recognition is another area of language processing that has long performed worse for non-male, non-white voices. With the rising ubiquity of voice assistants, such as Siri, Alexa and Google Home, this has a broad-ranging impact. A study found that for Americans with English as a first language who spoke to a voice assistant, the accuracy rate for a white man was 92%, for a white woman was 79%, and for a mixed-race woman was 69%. As more of our systems rely on voice technology, from medical communications to licensing and authorizations, these biased results can have significant consequences.
AI is used to understand and make decisions about imagery as well. This is an area where machine learning biases come up frequently.
Many people remember the story several years ago when Google Photos labeled the faces of Black people as gorillas — and the even stranger outcome that Google fixed the issue by removing gorilla images from its library rather than doing a better job of recognizing Black faces. Another example of imaging bias emerged in recent months, when people discovered that the image previews on Twitter favor white faces over Black faces, regardless of where the face appears in the image.
While these examples yield offensive results, it’s not difficult to imagine how the results of faulty image analysis can lead to more life-impacting situations as well. As identities and roles become more closely tied to image verification, the unbiased accuracy of these algorithms becomes increasingly crucial.
AI impacts the physical world as well, thanks to innovations in IoT (Internet of Things), advanced sensors and manufacturing automation. Because of this, AI biases can have significant physical implications.
There have been several cases where automatic sinks and soap dispensers did not recognize hands with darker skin, due to the way they were calibrated and tested. A much more life-threatening example has emerged with self-driving cars, with recent studies showing that pedestrians with lighter skin were more detectable, and thus less likely to be hit by the car, than pedestrians with darker skin.
Every person has biases. The key is to pay attention to what those biases are, and then actively retrain our thinking around the ones that prove harmful. Likewise, AI will always be as biased as the humans that created it. That’s why having diverse representation in the people who are programming, calibrating and testing AI algorithms is of utmost importance. The best way to avoid mistakes that range from awkward to dangerous is to ensure that the people building products represent the people using them.