
Does Your Machine-Learning Model Have To Be A Black Box To Work Well?

CEO and Co-Founder of DataVisor, the leading fraud detection company with solutions powered by transformational AI technology.

Can you tell the difference between a husky and a wolf? Both are large canines with shaggy, dense fur. Both have longer snouts and pointy ears. Both look huggable — but one definitely isn’t. And while all dogs share a huge portion of their DNA with wolves, huskies are even more closely related, because they’re descended from wolf-dog hybrids.

As humans, we’ve been trained through experience to distinguish between a domestic dog like a husky and a potentially dangerous wild animal like a wolf. If asked, we can easily explain our reasoning. But what happens when a machine-learning algorithm is put to the same test?

Depending on the training data and the weight the model assigns to particular features, you may get different results. Even if the results are highly accurate, how the model reached its decision is a mystery. Did the model really learn the physical differences between the animals, or did something else trigger the decision?

In 2016, a researcher from the University of California, Irvine, highlighted the black-box nature of complex machine-learning models. He demonstrated how a student's highly accurate algorithm for distinguishing between huskies and wolves was making decisions based on ancillary data: the presence of snow in the image background. Labeling an animal "wolf" or "husky" had nothing to do with the animal itself and everything to do with the surrounding environment.
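The underlying probing technique, perturbing inputs one at a time and watching which changes flip the prediction, can be sketched in a few lines. The toy classifier below is invented to mimic the flawed model: it keys on a background feature rather than the animal.

```python
# Hypothetical sketch: probing what a classifier actually keyed on.
# This toy "model" deliberately mimics the flawed one from the study:
# it looks only at a background feature (snow), not the animal itself.
def flawed_classifier(features):
    # features: a dict of invented image attributes
    return "wolf" if features["snow_in_background"] else "husky"

animal = {"snout_length": 0.9, "fur_density": 0.8, "snow_in_background": True}

# Perturb one feature at a time and see which change flips the prediction.
baseline = flawed_classifier(animal)
for key in animal:
    flipped = not animal[key] if isinstance(animal[key], bool) else 0.0
    perturbed = {**animal, key: flipped}
    if flawed_classifier(perturbed) != baseline:
        print(f"prediction depends on: {key}")  # only snow_in_background prints
```

This is the same intuition behind perturbation-based explanation tools: if zeroing out the snout or fur changes nothing but removing the snow flips the label, the model never learned the animal at all.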

The discovery led to an important question: How can we trust machine-learning algorithms if we can’t explain how they work?

Trust Issues

Examining the structure of simple linear models or decision trees can provide clues, but as models increase in complexity, they become "black boxes." The more complex the algorithm is, the harder the results are to explain.
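For those simpler models, the "clues" are the learned weights themselves. A minimal sketch with invented feature names and synthetic data:

```python
# Hypothetical sketch: inspecting a simple linear model's learned weights.
# Feature names and data are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                   # 200 samples, 3 features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # label driven by first two features

model = LogisticRegression().fit(X, y)

# Each coefficient tells us how strongly a feature pushes the prediction --
# exactly the kind of clue a black-box model doesn't surface.
for name, coef in zip(["snout_length", "ear_shape", "fur_density"], model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```

Here the third coefficient comes out near zero, telling us at a glance that the model ignores that feature; a deep network offers no such direct readout.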

For most applications, explainability isn’t that critical as long as the model works well. When an AI-based model makes a prediction, its accuracy can be 70-80% at first, and as it learns from new data, it becomes more accurate. The more data, the better — and the more confident we are in its ability to make accurate predictions.

Who cares how it works, as long as it works, right?

But explainability becomes critical when the results can have an impact on someone’s security or safety or put an individual at risk for legal or financial ramifications. Say you’re accused of fraud because an algorithm detects money-laundering activity. That’s a serious crime. And organizations can face devastating consequences if their models are found to discriminate based on certain attributes, such as race or gender. Explainability is needed to comply with regulations that exist to protect consumers and is equally critical for justifying decisions related to health or safety.

Take the recent Apple Card scandal, for example. Goldman Sachs’ credit card practices were scrutinized after a well-known entrepreneur called out the financial services giant for discriminatory practices when his wife was denied a credit line increase despite having a higher credit score than him. The complaint led to an investigation by the New York State Department of Financial Services (DFS) to determine if the algorithm used to make the credit limit decision was discriminating based on sex — a violation of state laws.

Such incidents underscore the impact of explainability on trust and, in turn, a brand’s reputation. According to PwC’s 2017 Global CEO Survey, 67% of business leaders believe that AI and automation will negatively impact stakeholder trust levels over the next five years.

You’d Better Have A Good Explanation

As Microsoft points out, explainability is needed to support initiatives for transparency, particularly when customer trust is at stake. In one demonstration, they showed how a retailer can leverage explainability to support AI-driven product recommendations, thereby building transparency and trust in AI among customers.

As lack of trust in AI has been cited by 39% of IT decision-makers as the biggest hurdle to adoption, the efforts of all of us AI practitioners could have a big impact on the industry.

When AI is used to make critical decisions in finance, healthcare and safety-critical applications such as autonomous vehicles, explainability isn't optional; it's a requirement. But as many advanced AI-driven algorithms provide limited explainability, an uphill battle lies ahead.

While some complex algorithms, such as deep learning, may not offer the explainability necessary for scenarios that require it, such as finance or healthcare, other machine-learning algorithms are highly effective and also explainable. For example, decision-tree-based learning algorithms offer better explainability by nature, because the learned tree paths resemble rule-based workflows with variables, which are familiar to analysts and easy to explain.
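A minimal sketch of that idea, using scikit-learn's decision tree with invented fraud-style features: the learned tree prints as nested if/else rules an analyst can audit.

```python
# Hypothetical sketch: a decision tree's learned paths read like analyst rules.
# The fraud-style features and thresholds here are invented for illustration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(300, 2))                  # [amount_norm, velocity_norm]
y = ((X[:, 0] > 0.8) & (X[:, 1] > 0.5)).astype(int)   # "fraud" when both are high

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# export_text renders the fitted tree as human-readable branching rules.
print(export_text(tree, feature_names=["amount_norm", "velocity_norm"]))
```

The printed output is a small rule set ("if amount is high and velocity is high, then fraud"), which is exactly the workflow shape analysts already work with.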

One common misconception is that unsupervised machine learning (UML) — where you don’t need labeled data to discover new patterns — works more like a black-box approach. 

However, depending on the structure of the algorithm, a UML model can actually be fully explainable. Clustering- or linkage-analysis-based UML can usually provide very specific reason codes for why a certain transaction or activity is flagged as fraudulent, by describing the feature dimensions (such as activities, behaviors and timing) along which suspicious patterns group and emerge. Furthermore, these features can be visually clustered together in a graph, helping fraud teams link suspicious activities, pinpoint connections and connect the dots. In this way, fraud teams achieve highly accurate results and are fully equipped to explain how they reached their conclusions.
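One way such reason codes can be derived, sketched here with invented features and synthetic data: cluster the activity, then report the feature dimensions on which a cluster deviates sharply from the overall population.

```python
# Hypothetical sketch: deriving "reason codes" from a clustering-based model.
# Feature names and the 2x-deviation rule are invented; real systems use
# far richer signals and statistics.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
# Two behavior groups: normal users and a tight "burst" group (bot-like accounts).
normal = rng.normal([1.0, 0.2], 0.3, size=(95, 2))   # [logins/day, accounts/IP]
burst = rng.normal([9.0, 6.0], 0.2, size=(5, 2))
X = np.vstack([normal, burst])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Reason codes: the features on which a cluster deviates from the population.
names = ["logins_per_day", "accounts_per_ip"]
overall = X.mean(axis=0)
for c in range(2):
    center = km.cluster_centers_[c]
    reasons = [n for n, v, m in zip(names, center, overall) if v > 2 * m]
    print(f"cluster {c}: size={np.sum(km.labels_ == c)}, reasons={reasons}")
```

The small cluster comes back with concrete reasons attached ("abnormally high logins per day and accounts per IP"), which is the kind of specific, defensible explanation a fraud analyst can act on.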

Finding The Right Balance

All in all, not all advanced machine-learning models are black boxes, and for most applications, a degree of explainability is sufficient to meet legal and regulatory requirements. For fraud detection, choosing algorithms that provide both accuracy and explainability is always your safest choice.

And remember, if you’re running from a wolf, never stop to explain why!


Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.

