Are AI models explainable in a way that humans can understand them?

Aug 14, 2020

A reality-rooted perspective on "explainable AI" and what this means for the future of the field.


In recent years, artificial intelligence (AI) has gained widespread interest in academia and industry for providing groundbreaking solutions to difficult learning problems. For instance, deep learning algorithms for image and audio processing have improved considerably upon previous state-of-the-art methods such as support vector machines (SVMs) or random forests. Further applications include self-driving cars, medical diagnosis systems, and game-playing computers (AlphaGo). These success stories brought back the original spirit of AI, which was established as a field in 1956 with a dream to create thinking machines.

The downside is that the computational methods underlying AI are complex in two ways: they are implemented in lengthy and nested computer code, and the methods themselves are intricate, with a network-based structure that can likewise be large and deeply nested.

In this context, to what extent are AI methods explainable in a way that humans can understand? In the literature, this problem is known as explainable AI. In a recent study published in WIREs Data Mining and Knowledge Discovery, we sought to investigate this question.

Previous studies centered on this question addressed the problem from the perspective of what explainable AI should be. However, such a perspective neglects constraints and limitations inherent in any system, including AI. For this reason, much work in this area is based on wishful thinking and potentially unattainable goals.

To put this into perspective, in physics, the term theory is generally associated with a mathematical framework derived from a set of basic axioms or postulates which allows one to generate experimentally testable predictions. Typically, these systems are highly idealized, i.e., the theories describe only certain aspects of reality. With respect to the stringency with which theories have been quantitatively tested, theories in physics, such as general relativity or quantum electrodynamics, are certainly what can be considered our best working scientific theories.

In general, a theory provides an explanation of the obtained results. However, these explanations do not come in the form of a natural language, like English, but are formulated using mathematics. This can create hurdles in understanding for non-expert users, as natural languages can only insufficiently capture mathematical formalism. For example, there is to date no generally accepted interpretation of quantum mechanics, but there are several schools of thought with regard to what its results actually mean in the context of the real world.

If one considers that the applications of AI lie beyond physics, one cannot expect to have a simpler universal theory for AI in medicine, sociology, or management than what we have for physics. For this reason, the interpretation of such a theory can be expected to be even more complex than the interpretation of quantum mechanics, because the fields studied by AI, which can be considered complex adaptive systems, are at a higher level of complexity: they exhibit emergent properties and nonequilibrium behavior.

To visualize the nature of the problem, suppose a theory of cancer were known, in the sense of a theory in physics discussed above. This cancer theory would be highly mathematical in nature, which would not permit a simple one-to-one interpretation in any natural language. Hence, only highly trained theoretical cancer mathematicians (in analogy to theoretical physicists) would be able to derive and interpret meaningful statements from this theory. Regardless of the potential success of such a cancer theory, this implies that medical doctors, who are not trained as theoretical cancer mathematicians, could neither understand nor interpret such a theory properly, and from their perspective the cancer theory would appear opaque or non-explainable.

Despite these inherent limitations of AI with respect to our ability to understand it in a natural language, the outlook is optimistic. Specifically, an AI system may not need to be explainable as long as its prediction error (formally, the generalization error) does not exceed an acceptable level. An example of such a pragmatic approach is given by nuclear power plants.

Since nuclear power plants are based on physical laws, the functioning mechanisms of such plants are not explainable to members of the regulatory management or to politicians responsible for governing energy policy. Nevertheless, nuclear power plants have been operating for many decades, contributing to our energy supply. One could envision a similar approach for self-driving cars, medical diagnosis systems, or managerial decision-making using AI.
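
The pragmatic criterion described above, accepting an opaque system as long as its prediction error stays below an acceptable level, can be illustrated with a minimal Python sketch. It is not taken from the study: the names holdout_error, is_acceptable, and black_box, the toy data, and the thresholds are all hypothetical, and the error measured on held-out data is only an estimate of the true generalization error.

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, int]  # (input, true label); a toy format assumed for this sketch


def holdout_error(predict: Callable[[float], int], samples: Sequence[Sample]) -> float:
    """Fraction of held-out samples that the model labels incorrectly (0-1 loss)."""
    if not samples:
        raise ValueError("need at least one held-out sample")
    wrong = sum(1 for x, y in samples if predict(x) != y)
    return wrong / len(samples)


def is_acceptable(predict: Callable[[float], int],
                  samples: Sequence[Sample],
                  max_error: float) -> bool:
    """Accept the model if its estimated error does not exceed max_error."""
    return holdout_error(predict, samples) <= max_error


def black_box(x: float) -> int:
    """Stand-in for an opaque model: only its outputs are inspected."""
    return 1 if x > 0.5 else 0


# Hypothetical held-out data that the model never saw during training.
held_out = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.55, 0)]

if __name__ == "__main__":
    print("estimated error:", holdout_error(black_box, held_out))
    print("acceptable at 25%:", is_acceptable(black_box, held_out, 0.25))
```

Note that the check never looks inside black_box. The decision rests only on observed predictions, which mirrors how a system can be judged by its performance without being explained.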

It is important to realize that while natural languages are insufficient to understand or describe complex mathematical formalisms, such as the ones behind AI, this won’t limit the function and utility of AI. If one wants to know what the future of explainable AI might look like, it is helpful to look back into the past and learn from our best theories in physics or from our practical experience in operating nuclear power plants.

Written by: Frank Emmert‐Streib, Olli Yli‐Harja, and Matthias Dehmer

Reference: Frank Emmert‐Streib, Olli Yli‐Harja, Matthias Dehmer. Explainable artificial intelligence and machine learning: A reality rooted perspective, WIREs Data Mining and Knowledge Discovery (2020). DOI: 10.1002/widm.1368