We are often asked about different Artificial Intelligence and Machine Learning algorithms, and how accurately we might be able to predict future outcomes based on past data.
The typical answer to that question is, perhaps rather uninspiringly, “it depends”. And although it’s true that the level of prediction accuracy possible can vary depending on the type of data, there is usually a more subtle reason that leads us to that response: in Artificial Intelligence, there is often a tradeoff between ‘explainability’ and accuracy (we’ll expand on what we mean by both of those terms shortly).
Artificial Intelligence can be a pretty complex and impenetrable topic at the best of times, so it is with that in mind that over this blog post and its subsequent Part 2, we will try to walk through the impact that the tradeoff between explainability and accuracy can have on organisations without going too deep into the technical details.
To do this, we will work through an example of trying to predict the quality of red wine (!) using a real-life dataset. It contains 1,600 different red wines and their attributes, including Alcohol Level, pH Level and Volatile Acidity (whatever that is?!), and gives each wine a ‘quality’ score out of 10 as rated by professional tasters.
We’ll ignore the fact that measuring the ‘quality’ of wine is arguably pretty subjective, and instead set ourselves the challenge of trying to predict which wines would score 7 out of 10 or more. We’ll call the wines that score at least 7 out of 10 ‘Good Wines’ and (perhaps rather harshly), those that score below 7 as ‘Bad Wines’. If you’re interested, you can download the same dataset we’re using which is provided courtesy of The University of California.
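To make that setup a little more concrete, here’s how the ‘Good Wine’ / ‘Bad Wine’ labelling might look in code. This is purely an illustrative sketch: the handful of sample wines and their attribute values below are made up to mirror the shape of the real dataset, which has around 1,600 rows and several attributes per wine.

```python
# A few hypothetical wines, mirroring the shape of the real dataset
# (the names, attribute values and scores here are invented for illustration).
wines = [
    {"name": "wine A", "alcohol": 9.4,  "pH": 3.51, "volatile_acidity": 0.70, "quality": 5},
    {"name": "wine B", "alcohol": 12.8, "pH": 3.20, "volatile_acidity": 0.28, "quality": 8},
    {"name": "wine C", "alcohol": 10.5, "pH": 3.35, "volatile_acidity": 0.55, "quality": 7},
]

# Our challenge: wines scoring 7 out of 10 or more are 'Good Wines',
# and everything below that is (perhaps rather harshly) a 'Bad Wine'.
def label(wine):
    return "Good Wine" if wine["quality"] >= 7 else "Bad Wine"

for wine in wines:
    print(wine["name"], "->", label(wine))
```

A model’s job, then, would be to predict that label from the other attributes alone, without ever seeing the tasters’ quality score.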
So where do ethics come into this? Well, within Artificial Intelligence and Machine Learning, there is a range of different approaches and tools that can be used to make predictions. In general, we would call something that makes a prediction (such as the expected quality of red wine given its other attributes) a ‘model’. And in very basic terms, a good model would be one that usually makes accurate predictions.
Coming up with a good model is often quite a complicated process, and people who describe themselves as ‘Data Scientists’ are (amongst other things) specialists in manipulating data and selecting the right tools to create models that produce good results.
But, becoming a Data Scientist isn’t everybody’s cup of tea (to say the least!), and the models that they create can range from incredibly simple to completely impenetrable even to the experts. Therefore, if a non-Data Scientist had the task of explaining a prediction that one of these models has made, that task could range from being pretty straightforward to almost impossible depending on the type of model chosen.
The ethical aspects of this come about because although there might not be too much of an ethical challenge in selecting a decent bottle of red wine, what if you’re applying for a loan or a University place and you get rejected? In many of those cases, an answer of “The computer says no…” probably isn’t going to cut it.
It is in these types of examples (where a model is making a decision that has a direct impact on someone’s life) that a decision which can’t be explained and justified sits at best in an ethical grey area, and at worst is simply unethical. Furthermore, even when there isn’t a direct impact on someone’s life, or where the impact could only ever be considered a positive one, care still needs to be taken, because unintended side-effects of well-meaning actions can still raise ethical concerns.
In a nutshell, we would say that ‘explainability’ is how easy it would be for a non-Data Scientist to understand and explain why a particular prediction has been made. And, it is often the case that there is a tradeoff between how explainable a particular model is, and how accurate its predictions are.
So, to try and bring this to life, we’re going to look at a handful of Artificial Intelligence models that might be used with our red wine dataset to make predictions. We’ll keep the technical details to an absolute minimum, as the point of this post isn’t to go into lots of detail on specific models; rather, it’s intended to highlight how using different models might require a tradeoff between accuracy and the ability to easily explain how a particular model has made its predictions. We’ll cover one model in this post, and a further two models in Part 2.
Technical Disclaimer (skip this if you’re not a Data Scientist!): Before we begin, we just wanted to say that we do realise we’ve massively oversimplified all this, and that there is no inherent link between a model’s ability to make accurate predictions and that model’s complexity or explainability. It’s certainly true that some of the simplest models can produce excellent results in particular situations.
However, we do find in many of our real-world interactions with clients that the more sophisticated models we use (which are typically more difficult to understand and explain) often do make better predictions, and so in many cases this correlation does exist even if there is no causal link.
One final point: we have used terms like ‘accuracy’ in quite a ‘non-data-sciencey’ way, to try and make the post more understandable to the average reader. We tried using more technical measures of each model’s performance, but found that they took away from the point of this article, so we went with much looser terms – for which we ask your forgiveness! 🙂 Right, now that’s out of the way…