Machine Learning and Fair Lending: Risks & Opportunities

As a lending institution, you are probably using data and models to make credit decisions. And because your business is subject to fair lending standards, you probably have rigorous practices in place to review those models for compliance.

Well known and widely adopted fair lending best practices include:
- Ensure that the data doesn’t contain any variables that may be a proxy for protected class status.
- Conduct disparate impact analysis (DIA) to determine whether the model has a disparate impact on members of any protected class (a minimal illustration of such a check follows this list).
- If the DIA finds that the model does have a disparate impact, conduct disparate impact alternatives analysis (DIAA) to find a less discriminatory alternative (LDA).
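To make the DIA step concrete, here is a minimal sketch of one commonly used disparate impact check: compute approval rates by group and the adverse impact ratio (each group's approval rate divided by the most favored group's rate), which is often assessed against the four-fifths rule borrowed from employment contexts. The column names, toy data, and 0.8 threshold are illustrative assumptions, not part of any particular product or regulation.

```python
import pandas as pd

def adverse_impact_ratio(decisions: pd.DataFrame,
                         group_col: str = "group",        # hypothetical column name
                         approved_col: str = "approved") -> pd.Series:
    """Each group's approval rate divided by the highest group's approval rate."""
    rates = decisions.groupby(group_col)[approved_col].mean()
    return rates / rates.max()

# Toy data for illustration only.
decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   1,   1,   0,   1,   0,   0,   0],
})

print(adverse_impact_ratio(decisions))
# group A: 1.00, group B: 0.33 -- ratios below ~0.8 (the "four-fifths rule")
# are often flagged for further review, though tests vary by context.
```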
However, over the past few years your business has likely seen a significant increase in the data available to develop models and a corresponding increase in the complexity and opaqueness of the models used.
This increase in dataset size and model sophistication can create significant value and business opportunities, but it also creates problems. In particular, the fair lending standards and best practices don't change with the introduction of big data and new machine learning models, but carrying them out can become a lot more difficult. Consider two scenarios.

1. The Problem of Searching for Less Discriminatory Alternatives
Conducting DIAA traditionally meant dropping or adding variables one at a time to look for a LDA. This is a reasonable approach when there are a handful of variables. But as the number of potential variables increases, the number of potential subsets of variables that could be used to train a model grows exponentially.
A dataset with 40 variables has over a trillion subsets of variables that could be used; a dataset with a few hundred variables has more potential subsets of variables than there are atoms in the known visible universe. Conducting a robust DIAA has become practically impossible as datasets have grown in size.
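The arithmetic behind this claim is simple: a dataset with n candidate variables has 2^n - 1 non-empty subsets that could be used to train a model. A quick, purely illustrative calculation:

```python
# Number of non-empty variable subsets for a dataset with n candidate variables.
def n_subsets(n: int) -> int:
    return 2 ** n - 1

print(f"{n_subsets(40):,}")     # 1,099,511,627,775 -> over a trillion subsets
print(f"{n_subsets(300):.2e}")  # ~2e90, far more than the ~1e80 atoms commonly
                                # estimated for the observable universe
```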
2. The Problem of Proxies
Much of the power of newer machine learning models is derived from the model’s ability to identify nuanced interactions between variables in ways that human analysts and traditional models never could. While this will likely increase the model’s performance, it also increases the risk that the model will learn a proxy for race through the interaction of variables.
Checking for proxies has historically been a relatively straightforward matter of examining variables one by one; how to check for proxies in a more sophisticated machine learning model is far less clear, especially when alternative data not directly related to creditworthiness is considered.
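For contrast, the traditional one-variable-at-a-time screen might look something like the sketch below, which flags variables whose individual correlation with protected class status exceeds some cutoff. The data, column names, and threshold are hypothetical; the point is that a screen like this cannot see proxies that only emerge from interactions between variables.

```python
import pandas as pd

def screen_for_proxies(X: pd.DataFrame, protected: pd.Series,
                       threshold: float = 0.3) -> pd.Series:
    """Flag variables whose individual correlation with protected status is high.

    Assumes numeric features. A one-variable-at-a-time screen like this misses
    proxies that arise only from interactions between variables, which is the
    gap described above.
    """
    corr = X.apply(lambda col: col.corr(protected.astype(float))).abs()
    return corr[corr > threshold].sort_values(ascending=False)
```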
Increased Scrutiny
At the same time that advances in machine learning have made compliance with fair lending standards more difficult, those same advances have resulted in increased scrutiny. Getting the implementation of models right is more important than ever because the costs of getting it wrong are increasing.

- Regulators and legislators are increasing their fair lending oversight of machine learning models.
- Journalists, academics, and consumer groups are scrutinizing algorithmic decision making and raising public awareness of the issue.
- The number of artificial intelligence incidents is on the rise. Notably, the leading causes of these incidents are algorithmic bias and opacity, both of which are significant issues for lenders.

SolasAI: Compliance by Design

SolasAI is a first-of-its-kind solution to this problem. Developed by BLDS, LLC, a consultancy with decades of experience advising on fair lending, together with cutting-edge data scientists, SolasAI provides innovative solutions to both the problem of unacceptable proxies and the problem of searching for less discriminatory models.
The Solution to the Less Discriminatory Alternatives Search

SolasAI starts with your existing model and evaluates it according to accepted fair lending standards. If disparate impact is found, SolasAI trains a new model using a subset of the original features and uses artificial intelligence to learn which variables (or combinations of variables) are driving the disparity. The process is then repeated to search for better models with less disparate impact. At the end of the process, you are presented with an array of alternative models that clearly demonstrates the tradeoff between fairness and accuracy. This allows businesses to choose the model that meets their performance requirements while maximizing fairness.
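SolasAI's actual search is more sophisticated than anything that fits in a few lines, but a highly simplified sketch of this style of fairness/accuracy search, dropping one candidate variable at a time, retraining, and recording the tradeoff, might look like the following. The model class, approval cutoff, and metrics here are illustrative assumptions, not SolasAI's algorithm.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def search_alternatives(X_train, y_train, X_val, y_val, protected_val, features):
    """Drop one feature at a time, retrain, and record accuracy vs. fairness.

    Returns (dropped_feature, validation_auc, worst_group_air) tuples, sorted so
    the fairest candidates come first. A real search iterates this process and
    considers combinations of variables, not just single drops.
    """
    results = []
    for dropped in features:
        keep = [f for f in features if f != dropped]
        model = LogisticRegression(max_iter=1000).fit(X_train[keep], y_train)
        scores = model.predict_proba(X_val[keep])[:, 1]
        approvals = pd.Series((scores >= 0.5).astype(int))   # hypothetical cutoff
        rates = approvals.groupby(protected_val.to_numpy()).mean()
        worst_air = (rates / rates.max()).min()
        results.append((dropped, roc_auc_score(y_val, scores), worst_air))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Each tuple makes the tradeoff explicit: a reviewer can scan for candidates that keep accuracy close to the original model while improving the worst group's adverse impact ratio.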
Traditional solutions to these problems require costly teams of data scientists to wade through piles of academic literature looking for practical implementations and to conduct time-intensive manual searches for alternative models that are likely insufficient in the face of such large datasets and sophisticated machine learning models. SolasAI automates all of this in a software package that plugs right into a business's existing pipeline.
The Solution to Proxies
SolasAI implements the latest research in explainable artificial intelligence (XAI) to illuminate the “black-box” inner workings of sophisticated models, including machine learning models. These tools allow modelers to determine which variables, or combinations of variables, are driving disparate impact. They also allow modelers to explore whether their model is learning a proxy for race through the interaction between multiple variables.
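One common building block for this kind of analysis is SHAP values. The sketch below, an illustration of the general technique rather than SolasAI's implementation, compares each feature's average SHAP contribution between protected and reference groups; features whose contributions differ sharply between groups are candidates for driving disparity, possibly through interactions with other variables.

```python
import pandas as pd
import shap

def disparity_drivers(model, X: pd.DataFrame, protected: pd.Series) -> pd.Series:
    """Rank features by how differently they contribute to scores across groups.

    Assumes a tree-based model (e.g., XGBoost or LightGBM) whose SHAP values come
    back as a single (n_rows, n_features) array. A large gap in mean contribution
    between groups suggests the feature, or its interactions with other features,
    may be acting as a proxy and deserves a closer look.
    """
    explainer = shap.TreeExplainer(model)
    shap_vals = pd.DataFrame(explainer.shap_values(X), columns=X.columns)
    gap = (shap_vals[protected.to_numpy() == 1].mean()
           - shap_vals[protected.to_numpy() == 0].mean())
    return gap.abs().sort_values(ascending=False)
```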
Tradition of Excellence
SolasAI has been developed by BLDS: the industry-leading employment discrimination and fair lending consultancy.

- BLDS developed fairness and discrimination analyses based on statistical methods that test for evidence of discrimination.
- These techniques are now widely used and generally accepted by regulators and in courts.
- BLDS has provided expert testimony in many pivotal employment discrimination cases.
- BLDS advises numerous regulatory agencies, including the Department of Justice, Federal Trade Commission (FTC), Department of Labor, and the Consumer Financial Protection Bureau (CFPB).
Although BLDS developed these tools in the context of assessing discrimination in labor markets, many of the leading lending institutions turned to BLDS in the mid-1990s to ascertain how to comply with fair lending laws when using statistical models. Today, nearly every large lending institution in America uses some version of these methods to assess and mitigate disparate impact risk.
Now all of this expertise is available in a powerful and easy-to-use software product: SolasAI.