Interpretable machine learning – what to consider

Well-designed machine learning models can be powerful tools for driving better decision making and growing your business. But what if your models are inadvertently creating problems for your business or clients? For ‘black box’ models, it can be difficult to ‘look under the hood’ and find out what’s driving model performance. In this article, we look at what you should consider to create explainable and interpretable machine learning models that perform ethically.

What does my model really do?

Algorithms and machine learning are affecting us as individuals more and more. Day to day, they sit behind seemingly routine applications such as the books or movies we’re recommended, or the search results returned by Google. But they also feature in more significant decisions: whether our loans are approved and how much our insurance costs, through to life-changing decisions made in healthcare and criminal justice settings. And even routine applications such as YouTube or Google search results sometimes have far-reaching consequences, including creating echo chambers that show us only what the algorithm thinks we want to see. As a harsh light is shone on discrimination and biases within our society, the ethics around some of these uses of machine learning (ML) are becoming increasingly topical.

In particular, the machine learning community is paying close attention to topics of fairness, interpretability, transparency and accountability, and new data protection and privacy laws attempt to address some of these concerns. As we’ve discussed previously, much current ML research centres on developing tools for explainable ML. Many ML models are considered ‘black box’ models – models that produce decisions based on various inputs, but where the process by which the decisions are made is opaque, usually due to model complexity. Explainable ML essentially involves a post-processing stage, which takes the output of the black box model and overlays a mechanism to explain its results. This focus on ‘explainability’, however, can be problematic. Many such explanations are poor, incomplete and not particularly meaningful, merely justifying what the model does. Cynthia Rudin, Professor of Computer Science, Electrical and Computer Engineering, and Statistical Science at Duke University, has advocated for a push towards ‘interpretability’ over ‘explainability’.

In contrast to explainability, interpretability actually requires a full understanding of the path of computations leading to a prediction. In an article in Nature Machine Intelligence, appropriately titled Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Rudin and her collaborators argue:

There’s often a false dichotomy between interpretability and accuracy, and interpretable machine learning models can often perform just as well as black box models. Rudin and her team entered a competition to build an explainable black box model, and found the interpretable model they constructed instead actually had similar predictive accuracy to the best black box model.
It’s highly likely an interpretable model of comparable performance to the best black box models exists for many data sets. Rudin and her team use a theoretical argument and anecdotal experience to argue for this. Significantly, they call into question the widely-held belief there is always a trade-off between accuracy and interpretability. So, although it may be easier and less time-consuming to create a black-box model, chances are we could find an interpretable model if we tried.
Interpretable models should be the default type for any high-stakes decision. This is particularly relevant for decisions that deeply affect people’s lives, because not only is their performance often comparable to that of black box models, but the basis underlying a model’s decision can be fully understood. Also, interpretable models are less prone to error, and any errors that do occur can be detected much more easily and sooner.

Why do we use black box models?

Before we drill down further into interpretable ML, it’s helpful to remember there are several reasons why we might favour black box models. Essentially, it’s because such models can be fitted efficiently and still perform well. Unlike an interpretable model, which often requires considerable analyst skill, domain knowledge and feature engineering to fit, a black box model can process a lot of data, discover the key patterns and generate accurate predictions in a fraction of the time.

In many settings, fitting a good model quickly is important:

Where customer behaviour changes rapidly and frequent model updates are required.
In automating the labour-intensive parts of a process, which frees up skilled analysts to spend more time understanding and exploring the results.
To enable more regular updating of key financial information.
To increase a model’s speed to market, which can lead to better results for a company.

In the insurance world, for example, black box models can be important in maintaining market position in personal lines pricing, and also provide a mechanism for monitoring outstanding claims more closely. They’re useful in marketing models for detecting patterns in customer behaviour or determining products of interest.

In short, black box type algorithms such as gradient boosting machines or neural networks are extremely flexible and high-performing tools for fitting models to large data sets. It’s no accident that algorithms such as XGBoost are top performers on competition websites such as Kaggle.


So, we should always build black box models?

If only it were that simple. A problem with black box models is that we don’t know exactly why they do what they do (if we did, they would be interpretable). This is true even if we add an explainability overlay on these models – by definition, the explanations must be wrong some of the time, otherwise they would be the original model. Explainability techniques can identify trends in how the predictions change with various factors, but there’s no guarantee this reflects what the black box model is actually doing. Troubleshooting black box models can be difficult. If we have an explainability overlay, in effect, we have two models to troubleshoot!
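To make the ‘wrong some of the time’ point concrete, here is a minimal sketch. Both functions are our own hypothetical stand-ins: `black_box` plays the role of an opaque fitted model and `surrogate` plays the role of a simple explanation laid over it. Neither comes from a real system, and the numbers are invented for illustration.

```python
# Hypothetical stand-in for an opaque model: approves an applicant when a
# nonlinear score of income and debt (both in thousands) clears a threshold.
def black_box(income, debt):
    score = income - 0.5 * debt - 0.002 * debt * debt
    return score >= 40

# Hypothetical one-rule "explanation" overlaid on the black box.
def surrogate(income, debt):
    return income - debt >= 30

# Compare the two on a grid of made-up applicants.
grid = [(income, debt) for income in range(20, 121, 5) for debt in range(0, 101, 5)]
disagreements = [(i, d) for i, d in grid if black_box(i, d) != surrogate(i, d)]

# Fidelity: the share of applicants where the explanation matches the model.
fidelity = 1 - len(disagreements) / len(grid)

print(f"fidelity: {fidelity:.2%}, disagreements: {len(disagreements)}")
```

Any fidelity below 100% means there are applicants for whom the explanation describes a decision the black box did not actually make, and each disagreement then has to be investigated in two models rather than one.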

Let’s consider two types of situation in which we build models:

Models that have very significant individual impacts (we’ll call these SII models). This could be a decision to give an individual a mortgage, or a medical diagnosis relating to a serious illness.
Those that don’t (non-SII models). This might include things like marketing, pricing or recommender systems.

A simple way to think of a model is that it categorises people into different groups and develops prediction rules for each of these groups. Then, for a new person not in the model, it works out what groups this person is most similar to and forms a prediction on that basis. If the model is good, this prediction will be accurate more often than not. But no matter how good the model, it will be wrong sometimes because people are individuals, not averages.
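As a rough sketch of this ‘groups and averages’ view, the snippet below fits the simplest possible model: one average per group. The data, the group names and the new driver’s outcome are all hypothetical.

```python
from statistics import mean

# Hypothetical history: (age_band, annual_claim_cost)
history = [
    ("young", 900), ("young", 1100), ("young", 400),
    ("older", 300), ("older", 500), ("older", 400),
]

def fit_group_means(rows):
    """Group the outcomes and use each group's average as its prediction."""
    groups = {}
    for group, outcome in rows:
        groups.setdefault(group, []).append(outcome)
    return {group: mean(values) for group, values in groups.items()}

model = fit_group_means(history)

# A new young driver is assigned the average of the group they most resemble.
prediction = model["young"]
actual = 0  # hypothetical: this particular driver never claims
error = actual - prediction

print(model, error)
```

The prediction is reasonable for the group as a whole, yet it is a long way off for this one driver, which is the sense in which people are individuals and not averages.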

For a non-SII model, being wrong does not usually lead to highly negative individual outcomes, and a lot of the time the cost of being wrong is borne by a company and not the individual. For example, if an online entertainment streaming service starts recommending horror movies to me, I’m not going to watch them and it’s possible that I might cancel my subscription. In this case, the cost of the error is mainly borne by the streaming service responsible for developing the model that made the sub-par recommendation.

However, suppose a model is used to diagnose a serious medical condition. This would be an example of an SII model. Here, the wrong diagnosis could be fatal where someone has a disease or lead to unnecessary testing and distress where they don’t. A particular issue in medical problems is leakage, where inappropriate data is used in developing a model. ‘Inappropriate’ here means something that would not be available in a true prediction problem. The concept of leakage is most easily explained with an example. Claudia Perlich recently won a Kaggle competition where she built a model to detect breast cancer from images. Her model was highly successful at predicting cancer, but she discovered this was because the patient ID label carried predictive information, which may have come from different machines being used for different severities of cancer. While the ID labels were highly predictive in the competition setting, they are unfortunately useless in real life. Identifying leakage can be tricky, particularly in a black box model where we don’t know exactly what’s going on.
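The effect of leakage can be sketched with a toy example. Everything below is made up: the ID ranges, the labels and the `leaky_rule` function are hypothetical and are not taken from the competition described above. The point is only that a rule built on an identifier can look perfect on the data it was developed on and fail once the identifier stops carrying information.

```python
# Hypothetical competition data: (patient_id, has_cancer). In this made-up
# setup, severe cases were scanned on a machine that issued IDs >= 1000.
competition = [(pid, False) for pid in range(100, 180)] + [
    (pid, True) for pid in range(1000, 1020)
]

# Hypothetical real-world data: one machine issues every ID, so the ID
# carries no information about the diagnosis.
deployment = [(pid, pid % 5 == 0) for pid in range(5000, 5100)]

def leaky_rule(patient_id):
    """Predict cancer from the ID alone, which is the leak."""
    return patient_id >= 1000

def accuracy(rule, rows):
    """Share of rows where the rule's prediction matches the label."""
    return sum(rule(pid) == label for pid, label in rows) / len(rows)

competition_accuracy = accuracy(leaky_rule, competition)
deployment_accuracy = accuracy(leaky_rule, deployment)

print(competition_accuracy, deployment_accuracy)
```

In an interpretable model, a rule that keys on an ID is visible the moment the model is read. In a black box, the same dependence can sit unnoticed until the model is used on new data.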

Another troubling example is the influence of typographical errors in a model such as COMPAS. This proprietary model combines 137 different characteristics to produce a recidivism score and is widely used in the United States to make parole decisions throughout the country’s criminal justice system. Because the model is proprietary, it is a black box irrespective of its actual structure, although various analyses suggest age and number of past criminal offences are important factors. Consider, then, the life-altering impact of a clerical error in one of these fields. Rudin and her team cite one such incident where a typographical error on one of the factors wasn’t discovered until after a person was denied parole based on a high COMPAS score. Rudin maintains it’s much easier to make errors with complex black box models than with simpler, interpretable models, and that any errors in the simpler models are easier to pick up because the process is transparent and can be double or even triple checked.

These examples caution against the use of black box models in high stakes settings, but it’s also not the case that black box models can always be safely used for non-SII problems. Take the example of models used by companies to set prices. Frequently, these include an element of price elasticity modelling, which looks to estimate the trade-off between volume of sales and price of each unit. Typically, the lower the price, the higher the sales volume and vice versa, and somewhere in between the two extremes is where companies operate, balancing profitability and sales target figures. Sometimes, black box models get this relationship wrong when applied to out-of-sample data and predict that sales increase as prices increase. Deploying this type of model would likely lead to poor commercial outcomes. An interpretable model helps avoid this – its transparency means that misfits like this should be apparent to analysts.
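One way to catch this kind of misfit before deployment is a simple sanity check on the fitted price response. The sketch below is an assumption on our part, not a description of any particular pricing system: `monotonicity_violations`, `sensible_model` and `misfit_model` are hypothetical names, and the two models are hand-written stand-ins for fitted ones.

```python
def monotonicity_violations(predict_volume, prices):
    """Return consecutive price pairs where predicted volume rises with price."""
    ordered = sorted(prices)
    volumes = [predict_volume(p) for p in ordered]
    return [
        (ordered[k], ordered[k + 1])
        for k in range(len(ordered) - 1)
        if volumes[k + 1] > volumes[k]
    ]

# Hypothetical fitted models, for illustration only.
def sensible_model(price):
    return 1000 / price  # volume falls as price rises

def misfit_model(price):
    # Sensible inside the training range (price <= 20), then bends upward.
    return 1000 / price if price <= 20 else 50 + 2 * (price - 20)

test_prices = [5, 10, 15, 20, 25, 30]
sensible_violations = monotonicity_violations(sensible_model, test_prices)
misfit_violations = monotonicity_violations(misfit_model, test_prices)

print(sensible_violations)
print(misfit_violations)
```

The check only covers the prices we think to test, so it complements rather than replaces a model whose price response can be read directly.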

This is an example of something that’s often a problem for many models, but particularly black boxes: they don’t extrapolate well to new regions of data. An infamous example is the husky vs wolf classifier, which seemed to have very impressive accuracy rates at distinguishing between wolves and huskies, until it was realised that the classifier was just identifying snow in images, because the wolves in the training images tended to be photographed against snowy backgrounds. The model was essentially: if snow, then wolf, else husky.


Black boxes bad. Interpretable machine learning models good?

Yes and no.

In an ideal world, we would always build interpretable models, because we could then fully understand the basis for every prediction. However, constructing interpretable models is often computationally hard (although Rudin and her team as well as others are seeking to improve the interpretable model toolbox). Even when it’s possible to construct an interpretable model, there’s usually a requirement to spend much more time on feature engineering and selection – these models require much more skilled analyst time and domain expertise and are therefore considerably more expensive to construct.

There are many settings, particularly in the commercial world, where rapid reactivity and speed of deployment are critical. For instance, the UK motor insurance industry is dominated by aggregator websites and operates on tight margins. Being able to react to changes and deploy models quickly is vital in that environment.

On the other hand, particularly for SII models, the expense of constructing an interpretable model should be weighed against the consequences of getting things wrong for the individual affected by the decision. Take a model like COMPAS for example. Suppose the model assigns a high risk score for reoffending largely because of a data entry error where the number of previous convictions was recorded as ‘seven’ instead of ‘one’. An interpretable model, which shows that the seven previous convictions were a major factor contributing to the high score, offers some hope that this error could be corrected, unlike a black box model with no explanation. Encouragingly, Rudin and her team were able to construct a very simple rules-based model using the CORELS algorithm that depends only on sex, age and prior offences, and which showed similar performance to the COMPAS model (the proprietary model of 137 factors mentioned earlier). As well as being simpler to apply and less prone to error, it’s a much more useful starting point to consider how best to neutralise some of the inherent biases in the data underlying the model (for example, the evidence that, as a population group, young African American men are over-policed relative to their white counterparts).
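To show what a rules-based model of this kind can look like, here is a minimal sketch of a short, ordered rule list over sex, age and prior offences. The function name is hypothetical and the specific age bands and thresholds are illustrative assumptions on our part; they should not be read as a reproduction of the published model. What matters is that every prediction comes with the rule that produced it.

```python
def predict_rearrest(age, sex, priors):
    """Return (prediction, reason) from a short, ordered rule list."""
    if 18 <= age <= 20 and sex == "male":
        return True, "age 18-20 and male"
    if 21 <= age <= 23 and 2 <= priors <= 3:
        return True, "age 21-23 with 2-3 prior offences"
    if priors > 3:
        return True, "more than 3 prior offences"
    return False, "no rule fired"

# The same hypothetical person, with and without the data entry error
# described above (prior convictions keyed as seven instead of one).
with_error = predict_rearrest(age=35, sex="male", priors=7)
corrected = predict_rearrest(age=35, sex="male", priors=1)

print(with_error)
print(corrected)
```

Because the reason names the prior offences count directly, the person affected, or anyone reviewing the decision, can see which field to check against the record.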

So why aren’t all models with significant impacts on individuals interpretable?

Rudin and her team highlight several reasons why not all SII models are interpretable. These include the costs of development, and possible difficulties in recouping model development costs for models that end up being a simple list of rules, or a scoring system, based on a small number of factors. From our experience, there’s no denying the considerably greater cost associated with this process.

Furthermore, for something like the case of medical misdiagnosis, or the COMPAS example, there’s a system flaw in that the costs of being wrong are misaligned – in the case of getting a medical diagnosis wrong, the individual bears the cost of the mistake, not the company providing the algorithm. Dealing with this may require policy changes to encourage or demand greater interpretability in SII models, and greater accountability. Recently, the New Zealand Government launched its Algorithm Charter for Aotearoa New Zealand which emphasises ethics, fairness and transparency but is restricted to government bodies. The EU’s General Data Protection Regulation has broader application and aims to give more control to individuals over their personal data, and a ‘right to information about the logic involved’ in automated decisions (i.e. it’s a ‘right to an explanation’). This is a step in the right direction, but, as noted above, explainable ML models and interpretable ML models are two different things and there is no guarantee that any explanation would be accurate. Furthermore, you have to know you’ve actually received an automated decision before you can seek your explanation. For example, if you were a woman not being shown job ads from Amazon because an ML algorithm was trained on data with only men in similar jobs, then chances are you would never know.

What can we do when creating an interpretable machine learning model proves difficult?

Difficulties in the creation of interpretable machine learning models may sometimes be mitigated by using black box models as part of an iterative process towards developing an interpretable model. For example, one model-building process (which we often use at Taylor Fry) involves using a black box model at a preliminary stage in an analysis to identify key features of interest. Based on this, we refine our selection of features to then build an interpretable model. This is frequently an iterative process, since black box models are useful for identifying un-modelled patterns in the data and, at each step, our understanding of the underlying data grows. Like Rudin and her team, our experience suggests interpretable machine learning models are frequently more useful than black boxes as the ultimate products.
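The feature-screening step of such a process can be sketched as follows. This is an illustrative assumption, not a description of our production tooling: `black_box` is a hypothetical stand-in for a fitted model, the feature names and data are invented, and `permutation_sensitivity` is a simplified relative of permutation importance that measures how much predictions move when one feature is shuffled, rather than the change in a loss metric.

```python
import random

# Hypothetical stand-in for a fitted black box; it ignores the third feature.
def black_box(row):
    age, claims, postcode_digit = row
    return 0.1 * age + 2.0 * claims * claims

def permutation_sensitivity(model, rows, column, seed=0):
    """Mean absolute change in predictions when one column is shuffled."""
    rng = random.Random(seed)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    total = 0.0
    for row, value in zip(rows, shuffled):
        permuted = list(row)
        permuted[column] = value
        total += abs(model(permuted) - model(row))
    return total / len(rows)

# Made-up data: (age, number of past claims, last digit of postcode).
rows = [(20 + 3 * k, k % 4, k % 10) for k in range(40)]
names = ["age", "claims", "postcode_digit"]

scores = {
    name: permutation_sensitivity(black_box, rows, column)
    for column, name in enumerate(names)
}

# Features the black box actually responds to become candidates for the
# interpretable model; the rest can be set aside for now.
shortlist = [name for name, score in scores.items() if score > 0]

print(scores)
print(shortlist)
```

The shortlist is only a starting point. The interpretable model built from it still needs analyst judgement on how each feature should enter, and the loop can be repeated whenever the black box picks up a pattern the interpretable model misses.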
