As AI techniques for modeling big data (especially deep learning) have exploded in interest and use, concerns over bias and negative impact have been raised. There are many articles and general references on the topic (e.g., Cathy O'Neil's Weapons of Math Destruction). These concerns only become more acute as AI increasingly affects our lives.
We've lived with the impact of statistical models for many years. We know that we are assigned credit scores based on our financial histories. We pay more or less for car insurance depending on characteristics like age and driving history. But AI has changed the landscape in at least three ways:
1. Increased Impact
Not too long ago, we encountered models rarely: when we applied for insurance, or when we applied for credit. Now we are 'scored' many times a day as we move around the web. These scores influence the offers we receive, the content we see, and so on.
2. More Data
The amount of data collected, stored, and used is orders of magnitude bigger. Again, not too long ago, businesses knew only a little about us; now we are visible to them broadly and deeply.
3. Black Boxes
Old-style hand-crafted statistical models are comprehensible. It’s clear what goes in, what variables are used and how they are combined to produce a score. Many AI models, particularly deep learning ones, are opaque.
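To make the contrast concrete, here is a minimal sketch of what "comprehensible" means for an old-style hand-crafted model. The function name and weights below are purely illustrative inventions, not any real scoring formula:

```python
# A hand-crafted scoring model is transparent: the inputs, the weights,
# and the rule for combining them are all visible on the page.
# (Illustrative weights only -- not a real credit formula.)

def credit_score(income, years_employed, missed_payments):
    """Linear score: each input's contribution to the result is explicit."""
    return (
        300
        + 0.002 * income          # more income -> higher score
        + 15 * years_employed     # stable employment helps
        - 40 * missed_payments    # each missed payment costs 40 points
    )

score = credit_score(income=50_000, years_employed=5, missed_payments=1)
# We can explain exactly why the score is what it is:
# 300 + 100 + 75 - 40 = 435
```

A deep network with millions of learned weights offers no such line-by-line reading: the inputs are known, but the way they combine into the output is not humanly inspectable.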
There are many implications:
1. Negative consequences
Facebook's recommendation engines are widely criticized for enabling the viral spread of 'bad' information. Generally, AI does a particular job; it is not values-driven. When a recommendation engine is engineered and deployed, the code does not look at the emergent effects and make changes. This is a long-term issue for AI, as we are increasingly influenced and affected by its decisions. It is an active area of research, and books are already appearing about what is called the Alignment Problem.
2. Sampling bias
Models depend on samples. They might be applied to populations that have different characteristics from the sample they were trained on. This can lead to the model treating out-of-sample individuals in peculiar or biased ways. In a later view we will discuss how this is possible in insurance. Many examples have been cited.
3. Unfair social practices
As a society we have decided that disparate treatment of different groups is something we want to work against. In earlier days, credit (arguably a social good) was denied to certain groups even when the models purported not to use group identifiers. For example, a model might use zip code as a proxy. This practice was called redlining, and it served to doubly penalize the targeted groups. Given that AI can use many more features, the possibilities for proxy (or alias) variables go up. And since many of these models are black boxes, such proxies can be hard to identify.
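A small sketch shows why dropping a protected attribute is not enough. The data here is entirely synthetic (the 90%/10% correlation is an invented assumption, not a real demographic figure): a proxy feature that merely correlates with group membership lets a model recover the "removed" attribute anyway.

```python
import random

random.seed(1)

# Synthetic population: each person has a hidden group membership and
# a proxy feature (think: zip code) correlated with it.
people = []
for _ in range(1000):
    group = random.random() < 0.5
    # 90% of group members share proxy value True; only 10% of others do.
    # (Invented numbers for illustration.)
    proxy = random.random() < (0.9 if group else 0.1)
    people.append((group, proxy))

# How often does the proxy alone match the protected attribute?
agreement = sum(g == p for g, p in people) / len(people)
print(agreement)  # close to 0.9: the "removed" attribute is still visible
```

Any model trained on the proxy is, in effect, trained on the protected attribute, even though that attribute never appears in the feature list.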
What do we do?
This is a complex issue, and there is no single answer. In a follow-up view, we will discuss how Lingonautics is working with insurance regulators to address the problem. In other views, we will discuss further aspects, such as the alignment problem.