It's time for AI to explain itself

AI models get more accurate all the time, but even the data scientists who built them can't always explain how they reach their decisions. That's a problem.

AI-driven algorithms are now a daily part of nearly everyone's lives. We've all grown used to machines suggesting a new series to binge on Netflix, another person to follow on Facebook, or the next thing we need to order from Amazon.

They're also driving far more important decisions, like what stocks to invest in, which medical procedures to consider, or whether you qualify for a mortgage. In some cases, an AI system might offer better advice than a human financial adviser, doctor, or banker.

But if you end up on the wrong side of an algorithm, there's no person you can buttonhole to explain why your loan application was rejected or your resume discarded.

And all too often, the companies that created and deployed these algorithms can't explain them either.

According to The State of Responsible AI: 2021, a survey sponsored by financial services firm FICO, two-thirds of companies are unable to explain how the AI models they've deployed arrive at decisions. And only one in five actively monitor these models to ensure decisions are made ethically and fairly.

AI has a transparency problem. Organizations that want to earn the trust of consumers, avoid the ire of regulators, or simply determine how well their machine learning models are actually working will need to adopt explainable AI (XAI) moving forward.

What could go wrong?

While no one is suggesting that AI is about to become self-aware and start waging war on humanity, the negative impacts of automated decision-making are well documented.

The most common problem is bias introduced while the AI model is being trained. For example, facial recognition algorithms are notoriously less accurate at identifying darker-skinned individuals, most likely because their training data included fewer people of color. Predictive policing models rely on existing criminal records, reinforcing decades of racial injustice. Resume-screening algorithms based on a company's historical hiring patterns can discriminate against women, older applicants, or people of color.

Skewed or insufficient data can also lead to inaccurate predictions, making a model useless and potentially dangerous. For example, when a Florida hospital acquired an IBM Watson system designed to help oncologists treat cancer patients, the AI recommended treatments that could have made patients worse. The reason? It relied on data from hypothetical patients, not real ones. The $62 million system was scrapped.

But eliminating bias from AI models involves more than merely excluding data from protected categories such as race, gender, and age. You also need to account for data that acts as a proxy for these categories. For example, if you live in Florida and have an AOL email address, a model might assume you're more likely to be a senior citizen.
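To make the proxy problem concrete, here is a minimal sketch, using entirely invented data and feature names, of a simple screen some teams run: checking how strongly each supposedly neutral input correlates with a protected attribute that was deliberately left out of the model.

```python
# Hypothetical sketch: flag features that act as proxies for a protected
# attribute by measuring how well each one tracks it on its own.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic applicants: age is the protected attribute excluded from the model.
age = rng.integers(18, 90, size=n)

# "Neutral" features that nonetheless correlate with age (invented for illustration).
has_legacy_email = (age > 60) & (rng.random(n) < 0.7)        # e.g., an old email domain
years_at_address = np.clip(rng.normal(age / 4, 3, n), 0, None)
loan_amount = rng.normal(200_000, 50_000, n)                 # genuinely unrelated here

features = {
    "has_legacy_email": has_legacy_email.astype(float),
    "years_at_address": years_at_address,
    "loan_amount": loan_amount,
}

# A feature strongly correlated with the protected attribute is a proxy
# candidate, even though age itself never enters the model.
for name, values in features.items():
    r = np.corrcoef(values, age)[0, 1]
    flag = "POSSIBLE PROXY" if abs(r) > 0.3 else "ok"
    print(f"{name:>18}: corr with age = {r:+.2f}  [{flag}]")
```

A flagged feature isn't automatically disqualifying, but it signals that the model may be learning the protected attribute through the back door.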

And you need the ability to explain how an AI model arrived at a particular decision, especially when it has a significant impact on people's lives, notes Kirk Bresniker, a fellow at Hewlett Packard Enterprise who was involved in the creation of the company's AI Ethical Principles.

Most human-driven decisions can be traced back to their origins, adds Bresniker. With AI, that's usually not the case.

"You can go back and ask the lawmaker what they were thinking, or you can look at lines of code," he says. "But with AI systems, you can't go in and audit all the way back to the source data. If we can get to explainable AI technologies you can audit, there may be more areas where we can employ them."

Solving the ‘black box’ problem

Many current AI models were built using deep learning neural networks, which devise their own methods for interpreting data and whose internal operations are a mystery even to the people who designed them—hence the term "black box AI."

These networks may contain dozens of layers of mathematical functions. Data enters one layer, is analyzed, and is passed on to the next layer, which passes it on to a subsequent layer, and so on. Eventually, the network generates a prediction based on its internal calculations. Data scientists evaluate the results, adjust the network's settings as needed, and start the training process all over again.

Feed a neural network a million images of different cats labeled "cat" as well as a million pictures of other objects labeled "not cat," and it will learn how to identify a photo of a feline it has never seen before. But the route it took to arrive at the correct prediction may be opaque.
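To ground that description, here is a minimal sketch, with made-up weights and no training at all, of data flowing through a few stacked layers in the way described above. It is the intermediate values, not the final answer, that resist human interpretation.

```python
# Minimal sketch (not any vendor's model): data passing through stacked layers
# of a small feed-forward network.
import numpy as np

rng = np.random.default_rng(42)

def layer(inputs, weights, bias):
    """One layer: a linear transform followed by a ReLU nonlinearity."""
    return np.maximum(0.0, inputs @ weights + bias)

# Three layers; each layer's output feeds the next layer's input.
x = rng.normal(size=(1, 8))                      # one input example with 8 features
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 16)), np.zeros(16)
w3, b3 = rng.normal(size=(16, 1)), np.zeros(1)

h1 = layer(x, w1, b1)
h2 = layer(h1, w2, b2)
logit = h2 @ w3 + b3                             # final layer produces the raw score
prediction = 1 / (1 + np.exp(-logit))            # squash to a probability

print(f"prediction: {prediction.item():.3f}")
# The intermediate values h1 and h2 are exactly the quantities a human
# can't readily interpret: hence the "black box" label.
```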

So-called glass box AI, on the other hand, reveals the inner workings of the model as it's being trained. A handful of glass box methodologies have emerged over the past few years.

FICO, which has been using algorithms to generate consumer credit scores for decades, is building shallow, sparse neural networks with just one or two layers, says Scott Zoldi, chief analytics officer at the $1.3 billion company. That makes it easier to see which combinations of data inputs have the greatest impact on a prediction.

So if a mortgage application is denied, glass box AI makes it possible to identify the factors that influenced the decision, such as the relationship between the loan amount and the applicant's credit history.

"Glass box models put a priority on interpretability and can perform just as well as black box," says Zoldi. "But you need to take time to build the models carefully. It's not something you can do in four hours."

Explaining the unexplainable

Glass box methodologies may represent a viable path for building new AI systems, but what about all the existing black box algorithms? XAI platforms like Fiddler.ai have emerged to help unravel some of the mystery by reverse engineering which inputs lead to specific predictions.

Attributing inputs to outcomes requires a bit of detective work, says CEO and founder Krishna Gade, who previously worked at Facebook, helping to explain, monitor, and debug its News Feed algorithms. In the simplest terms, Fiddler works by plugging different variables into an AI model, seeing what results the model generates, and correlating the two. Do it enough times, and you can say with a high level of confidence that input A leads to outcome B.

"Say you've got an underwriting model that's trying to predict if a customer is likely to default on a loan," Gade says. "Which factors have the greatest impact? Is it their income? The loan amount? By probing the model with different inputs, we can draw statistically viable conclusions."

The point is not merely to explain how the models are working, says Gade, but to make sure they're working as intended. That in turn increases trust in the model among companies, employees, and customers.

Ultimately, however, systems like these are simply building new models to explain the old ones, says Zoldi. They may provide more plausible reasons for particular predictions, but they'll never fully explain how the original model arrived at them.

"We can't take a complicated, multilayered neural network and say with absolute certainty, 'This is why you didn't get the outcome you wanted,'" says Zoldi. "All we can say is, 'Here are the reasons why people who are similar to you got that outcome.'"

Legal, humble, and smart

Ultimately, organizations may not have much choice but to adopt XAI. Regulators have taken notice. The European Union's General Data Protection Regulation (GDPR) demands that decisions based on AI be explainable. Last year, the U.S. Federal Trade Commission issued stringent guidelines around how such technology should be used.

Companies found to have bias embedded in their decision-making algorithms risk violating multiple federal statutes, including the Fair Credit Reporting Act, the Equal Credit Opportunity Act, and antitrust laws.

"It is critical for businesses to ensure that the AI algorithms they rely on are explainable to regulators, particularly in the antitrust and consumer protection space," says Dee Bansal, a partner at Cooley LLP, which specializes in antitrust litigation. "If a company can't explain how its algorithms work [and] the contours of the data on which they rely … it risks being unable to adequately defend against claims regulators may assert that [its] algorithms are unfair, deceptive, or harm competition."

It's also just a good idea, notes James Hodson, CEO of the nonprofit organization AI for Good.

"It makes business sense to know what decisions you're making and why," says Hodson. "But most machine learning models aren't audited or retrained on a regular basis. The data isn't updated after it's been deployed. Models I wrote and trained 15 years ago are still in production at companies today. The world has changed a lot since then."

It's equally important to document the process by which a machine learning model is built, including which decisions were made, who made them, and when, says Zoldi, who has patented a system that relies on blockchain technology to create AI audit trails.
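Zoldi's patented system isn't described in detail here, but the core idea of a tamper-evident audit trail can be sketched with a simple hash chain: each recorded decision includes the hash of the previous record, so any later alteration breaks the chain. The record fields and decisions below are assumptions for illustration.

```python
# Hedged sketch of the audit-trail idea (not Zoldi's patented system):
# chain each model-development decision to the previous record's hash so
# the history can't be quietly rewritten.
import hashlib, json, time

def append_record(chain, decision, author):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {
        "timestamp": time.time(),
        "decision": decision,        # e.g., "excluded ZIP code as a proxy feature"
        "author": author,
        "prev_hash": prev_hash,
    }
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    chain.append(record)
    return record

audit_trail = []
append_record(audit_trail, "selected training window 2018-2023", "data scientist A")
append_record(audit_trail, "excluded ZIP code as an age/race proxy", "model reviewer B")

# Verification: recomputing each hash confirms no record was altered after the fact.
for rec in audit_trail:
    body = {k: v for k, v in rec.items() if k != "hash"}
    assert hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() == rec["hash"]
print("audit trail intact:", len(audit_trail), "records")
```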

Business leaders should understand the full implications of the technology their companies are deploying, he adds. They may need to corral the "data science cowboys" in their organizations and adopt so-called humble AI models that default to simpler algorithms when confidence in a prediction falls below a certain threshold.
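The article doesn't spell out how a humble AI model is wired, but the routing logic it implies can be sketched as follows: consult the complex model first, and when its confidence falls below a threshold, defer to a simpler, auditable rule. Both models and the threshold below are toy stand-ins.

```python
# Illustrative routing logic for a "humble AI" setup: use the complex model
# only when it is confident; otherwise fall back to a simpler, auditable rule.

def complex_model(features):
    """Pretend deep model: returns (approve_probability, confidence)."""
    score = 0.7 * features["credit_score_norm"] - 0.4 * features["loan_to_income_norm"]
    prob = max(0.0, min(1.0, 0.5 + score))
    confidence = abs(prob - 0.5) * 2          # crude proxy: distance from a coin flip
    return prob, confidence

def simple_scorecard(features):
    """Transparent fallback: a hand-auditable rule."""
    return 1.0 if features["credit_score_norm"] > 0.6 else 0.0

CONFIDENCE_THRESHOLD = 0.5

def humble_decision(features):
    prob, confidence = complex_model(features)
    if confidence >= CONFIDENCE_THRESHOLD:
        return ("approve" if prob >= 0.5 else "decline"), "complex model"
    # Low confidence: defer to the simpler, explainable scorecard.
    decision = "approve" if simple_scorecard(features) >= 0.5 else "decline"
    return decision, "fallback scorecard"

print(humble_decision({"credit_score_norm": 0.9, "loan_to_income_norm": 0.2}))
print(humble_decision({"credit_score_norm": 0.55, "loan_to_income_norm": 0.6}))
```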

In some cases, organizations may need to reconsider the use of AI entirely, says Bresniker.

"As interesting and beneficial as this technology might be, we may have to exclude it in areas where the potential harm outweighs the potential benefit," he says. "Using machines to augment the decision-making process is very powerful, and we want to apply it to the most human-centric challenges we face. But we need to do it in a way that's consistent with our ethical standards."
