Building an accurate, fast, and performant model founded upon strong Data Quality standards is no easy task. Taking the model into production with governance workflows and monitoring for sustainability is even more challenging. Finally, ensuring the model is explainable, transparent, and fair based on your organization’s ethics and values is the most difficult aspect of trusted AI.
We have identified three pillars of trust: performance, operations, and ethics. In our previous articles, we covered performance and operations. In this article, we will look at our third and final pillar of trust, ethics.
Ethics relates to the question: “How well does my model align with my organization’s ethics and values?” This pillar focuses primarily on understanding and explaining model predictions, as well as identifying and neutralizing any hidden sources of bias. There are four primary components to ethics:
- Privacy
- Bias and fairness
- Explainability and transparency
- Impact on the organization
In this article, we will focus on two in particular: bias and fairness, and explainability and transparency.
Bias and Fairness
Examples of algorithmic bias are everywhere today, often involving protected attributes such as gender or race, and they appear across almost every vertical, including health care, housing, and human resources. As AI becomes more prevalent and accepted in society, the number of incidents of AI bias will only increase without standardized responsible AI practices.
Let’s define bias and fairness before moving on. Bias refers to situations in which, mathematically, the model performs differently (better or worse) for distinct groups in the data. Fairness, on the other hand, is a social construct, and what counts as fair is subjective, shaped by stakeholders, legal regulations, and values. The intersection between the two lies in context and the interpretation of test results.
At the highest level, measuring bias can be split into two categories: fairness by representation and fairness by error. The former measures fairness based on the model’s predictions across groups, while the latter measures fairness based on the model’s error rates across groups. In other words, fairness by representation asks whether the model predicts favorable outcomes at a significantly higher rate for a particular group, while fairness by error asks whether the model is wrong more often for a particular group. Within these two families, there are individual metrics that can be applied. Let’s look at a couple of examples to demonstrate this point.
In a hiring use case where we are predicting whether an applicant will be hired, we would measure bias within a protected attribute such as gender. In this case, we may use a metric like proportional parity, which satisfies fairness by representation by requiring each group to receive the same percentage of favorable predictions (e.g., the model predicts “hired” at the same rate for male and female applicants).
Next, consider a medical diagnosis use case for a life-threatening disease. This time, we may use a metric like favorable predictive value parity, which satisfies fairness by error by requiring each group to have the same precision: the probability that a favorable prediction is actually correct.
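To make the two families concrete, here is a minimal sketch in Python. It assumes a hypothetical DataFrame with a protected attribute (`gender`), actual outcomes (`y_true`), and model predictions (`y_pred`), where 1 represents the favorable outcome; the data shown is purely illustrative.

```python
# Minimal sketch: per-group fairness checks for a binary classifier.
# "gender" is the protected attribute, "y_true" the actual outcome, and
# "y_pred" the model's prediction, where 1 is the favorable outcome ("hired").
import pandas as pd

df = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 0, 0, 0, 1, 1, 1, 0],
})

# Fairness by representation (e.g., proportional parity): the rate of
# favorable predictions should be similar across groups.
favorable_rate = df.groupby("gender")["y_pred"].mean()
print("Favorable prediction rate by group:\n", favorable_rate)

# Fairness by error (e.g., predictive value parity): precision -- how often a
# favorable prediction turns out to be correct -- should be similar across groups.
precision_by_group = df[df["y_pred"] == 1].groupby("gender")["y_true"].mean()
print("Precision by group:\n", precision_by_group)

# A simple disparity check: ratio of the worst-off group to the best-off group.
print("Representation ratio:", favorable_rate.min() / favorable_rate.max())
```

In practice, you would compare these per-group rates against a tolerance agreed upon with your stakeholders rather than expecting exact equality.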
Once bias is identified, there are several ways to mitigate it and push the model toward fairness. Start by analyzing the underlying data and determining whether any steps in data curation or feature engineering could reduce the bias. If a more algorithmic approach is required, a variety of techniques have emerged to help. At a high level, those techniques can be classified by the stage of the machine learning pipeline in which they are applied:
- Pre-processing
- In-processing
- Post-processing
Pre-processing mitigation happens before any modeling takes place, directly on the training data. In-processing techniques relate to actions taken during the modeling process (i.e., training). Finally, post-processing techniques occur after the modeling process and operate on the model predictions to mitigate bias.
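As an illustration of the pre-processing stage, the sketch below implements one well-known technique, reweighing (Kamiran and Calders), which assigns each training row a weight so that the protected attribute and the label appear statistically independent to the learner. The column names reuse the hypothetical `gender` and `y_true` fields from the earlier sketch.

```python
# Minimal sketch of one common pre-processing technique: reweighing.
# Each row gets weight = P(group) * P(label) / P(group, label), so that
# over-represented (group, label) combinations are down-weighted and
# under-represented ones are up-weighted.
import pandas as pd

def reweighing_weights(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    n = len(df)
    group_counts = df[group_col].value_counts()
    label_counts = df[label_col].value_counts()
    joint_counts = df.groupby([group_col, label_col]).size()

    def weight(row: pd.Series) -> float:
        g, y = row[group_col], row[label_col]
        # expected share under independence divided by the observed share
        return (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])

    return df.apply(weight, axis=1)

# Example: weights = reweighing_weights(df, "gender", "y_true")
```

The resulting weights can then be passed to most scikit-learn estimators through the `sample_weight` argument of `fit`, leaving the training data itself untouched.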
Explainability and Transparency
All Data Science practitioners have been in a meeting where they were caught off guard trying to explain the inner workings of a model or its predictions. From experience, I know that isn’t a pleasant feeling, but those stakeholders had a point. Trust in ethics also means being able to interpret, or explain, the model and its results as well as possible.
Explainability should be part of the conversation when selecting which model to put into production. Choosing a more explainable model is a great way to build trust between the model and all stakeholders. Certain models are more easily explained and more transparent than others: models that use coefficients (e.g., linear regression) or that are tree-based (e.g., random forest) are far more intuitive than deep learning models. The question becomes: should we sacrifice a bit of model performance for a model that we can explain?
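As a small illustration of what an explainable-by-design model looks like, the sketch below fits a logistic regression on hypothetical hiring features and reads its coefficients directly; the feature names and data are placeholders.

```python
# Minimal sketch of a directly explainable model: a logistic regression whose
# coefficients give each feature a direction and magnitude of influence.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # hypothetical: years_experience, test_score, referrals
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Each coefficient is a global, human-readable statement about the model:
# a positive value pushes predictions toward the favorable class.
for name, coef in zip(["years_experience", "test_score", "referrals"], model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```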
At the model prediction level, we can leverage explanation techniques like XEMP or SHAP to understand why a particular prediction was assigned the favorable or unfavorable outcome. Both methods show which features contributed most, positively or negatively, to an individual prediction.
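Here is a minimal sketch of that prediction-level view using the open-source `shap` package with a tree-based model; the data and model are hypothetical placeholders.

```python
# Minimal sketch of a prediction-level explanation with SHAP.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# TreeExplainer estimates, for a single row, how much each feature pushed the
# prediction up or down relative to the model's average output.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])

# Depending on the shap version, the result for a classifier is either a list
# (one array per class) or a single array; either way, each entry is a
# per-feature contribution to this one prediction.
print(shap_values)
```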
Conclusion
In this series, we have covered the three pillars of trust in AI: performance, operations, and ethics. Each plays a significant role in the lifecycle of an AI project. While we’ve covered them in separate articles, fully trusting an AI system requires all three; there are no trade-offs between the pillars. Enacting trusted AI requires buy-in at all levels and a commitment to each of these pillars. It won’t be an easy journey, but it is a necessity if we want to maximize the benefit and minimize the potential for harm from AI.