Bias and variance are two of the most fundamental and important terms in AI. Understanding them helps in building useful models, since the two concepts represent the challenges of model fit. Let’s discuss bias in AI and ML models.
What is Bias in AI or ML?
In an AI environment, any model we create analyzes the data and finds patterns within it to make predictions. The model learns these patterns and applies them to unseen, similar data to predict values. Depending on the data-gathering process and data quality, bias can creep into these predictions. The goal of any AI model is to minimize bias and variance, which are categorised as the reducible errors of the model, in order to increase the accuracy of its predictions.
In its simplest definition, bias is the difference between actual and predicted values. It occurs due to the inability of the model (the algorithm, such as Linear Regression) to capture the true relationship between the data points. Every model begins with some amount of bias, because bias comes from the simplifying assumptions the model makes so that the target function is easier to learn.
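To make this concrete, here is a minimal sketch of a linear model failing to capture a quadratic relationship; the data-generating function, sample size, and noise level are illustrative assumptions, not from any real dataset:

```python
# Minimal sketch: fit a straight line to data whose true relationship is
# quadratic, then measure the gap between predictions and actual values.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))
y_true = X[:, 0] ** 2                       # the true relationship is quadratic
y = y_true + rng.normal(0, 0.5, size=200)   # observed values with some noise

model = LinearRegression().fit(X, y)        # a linear model assumes a straight line
y_pred = model.predict(X)

# The systematic gap between predictions and true values is the bias.
print("Mean squared error:", np.mean((y_true - y_pred) ** 2))
```

Because the straight line cannot bend, it overestimates in the middle of the range and underestimates at the edges; no amount of extra data removes that systematic gap.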
High Bias:
When the bias is high, the assumptions made by the model are too simplistic, or the model lacks an appropriate set of features. As a result, the model cannot capture the underlying patterns of the dataset during training. In this case, the model becomes inefficient and unable to perform well on the test or new dataset. This condition leads to Underfitting, a concept we will discuss in another post.
A lack of sufficient data could also result in high bias, even if the dataset has the appropriate set of features. Extra care should be taken while collecting the data and deciding on the volume of data.
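As a rough illustration, the sketch below shows the signature of high bias: the too-simple model scores noticeably worse than a more flexible one even on its own training data. The sine-wave data and the degree-5 polynomial are arbitrary choices for this example:

```python
# Sketch: a high-bias (too simple) model underfits, so it scores poorly
# even on the data it was trained on, not just on the test set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = 3 * np.sin(X[:, 0]) + rng.normal(0, 0.3, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

simple = LinearRegression().fit(X_train, y_train)
flexible = make_pipeline(PolynomialFeatures(5), LinearRegression()).fit(X_train, y_train)

# The simple model's noticeably lower training score reveals high bias.
print("Linear  - train R^2:", simple.score(X_train, y_train),
      "test R^2:", simple.score(X_test, y_test))
print("Poly(5) - train R^2:", flexible.score(X_train, y_train),
      "test R^2:", flexible.score(X_test, y_test))
```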
How does Bias occur in a Predictive model?
As discussed before, bias can occur during the design of the project or during the data collection process, where the population is too small or the sample does not represent the whole population.
Suppose you run an online survey and, after processing the responses, conclude that 85% of city residents feel positive about the work done by the current city government. This statement is flawed. The online survey reaches only those people who have internet access and who use the platform you posted the survey on. The respondents could include people who are not located in that particular city, or who lean towards a particular political ideology. The sample could also overrepresent a particular age group (e.g., users aged 20-40 years), race, gender, or occupational class.
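A small simulation makes the effect visible. All the numbers below (the share of young residents, their approval rates, and how strongly the survey reaches them) are invented purely for illustration:

```python
# Sketch of sampling bias: the true approval rate is about 55%, but the
# online survey over-samples young residents, who approve at a higher rate.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
young = rng.random(n) < 0.30                  # assume 30% of residents are 20-40
approves = np.where(young,
                    rng.random(n) < 0.85,     # assumed approval among the young
                    rng.random(n) < 0.42)     # assumed approval among the rest

print("True approval rate:", approves.mean())           # ~0.55

# The online survey reaches young residents far more often than others.
reach = np.where(young, 0.9, 0.1)
sampled = rng.random(n) < reach
print("Online-survey estimate:", approves[sampled].mean())  # ~0.76, inflated
```

Even with a huge number of responses, the estimate stays inflated, because the error comes from who gets sampled, not from how many.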
Impact of AI Bias across Industries
Bias, intentional or unintentional, can arise in various use cases across industries. Here are the industry-wise use cases along with real-world evidence of AI bias:
- Banking
Imagine a scenario where a valid applicant’s loan request is not approved. This can happen as a result of bias introduced into the system through the features and related data used for model training, such as gender, education, race, and location.
In another example, imagine an applicant whose loan got approved even though they did not meet the criteria. In yet another example, imagine an applicant’s credit card application getting rejected even though the applicant satisfied all the requirements for getting the credit card. It may happen that the model used to classify credit card applications as approved or rejected had an underlying bias owing to the educational qualifications of the applicants. One simple check for this kind of bias is sketched below.
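A widely used first check is to compare the model’s approval rates across a sensitive attribute. The data, column names, and the 0.8 threshold (the common “four-fifths rule”) in this sketch are assumptions for illustration:

```python
# Hypothetical sketch: compare a loan model's approval rates across a
# sensitive attribute and flag large gaps for review.
import pandas as pd

results = pd.DataFrame({
    "gender":   ["F", "F", "F", "F", "M", "M", "M", "M"],
    "approved": [0,    1,   0,   0,   1,   1,   0,   1],
})

rates = results.groupby("gender")["approved"].mean()
print(rates)  # approval rate per group

ratio = rates.min() / rates.max()
print("Disparate impact ratio:", round(ratio, 2))
if ratio < 0.8:
    print("Warning: approval rates differ enough to warrant a bias review.")
```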
- Insurance
Imagine a person being asked to pay a higher premium based on the predictions of a model that took attributes such as gender and race into account when making its predictions.
- Employment
Imagine a machine learning model inappropriately filtering candidates’ resumes based on attributes such as the race or colour of the candidates.
This could not only impact the employability of the right candidates but also result in the company missing the opportunity to hire a great candidate.
In 2018, Amazon stopped using its hiring tool after it was found to be biased against women.
- Housing
Imagine a model with high bias making incorrect predictions of house prices. This may result in both the house owner and the end user (the buyer) missing out on a sale or purchase opportunity.
The bias may be introduced through data related to location, community, geography, etc.
AI bias has also contributed to racial bias in housing: one investigation reported that lenders’ algorithms were 80% more likely to reject Black mortgage applicants than comparable white applicants.
- Fraud (Criminal/Terrorist)
Imagine a model incorrectly classifying a person as a potential offender, leading to them being questioned for an offence they did not commit.
That could be the outcome of a model that is biased with respect to race, religion, national origin, etc. For example, in certain countries or regions, a person of a specific religion or national origin may be suspected of a certain kind of crime, such as terrorism. This individual bias then gets reflected in the model’s predictions.
Related headlines:
- IBM abandons ‘biased’ facial recognition tech
- Facial recognition fails on race, study says
- Passport facial checks fail to work with dark skin
- Government
Imagine government schemes intended for a certain section of people, with machine learning models being used to classify which people will receive benefits from these schemes.
A bias would result in either some eligible people not getting the benefits (false negatives) or some ineligible people getting the benefits (false positives), as illustrated in the sketch below.
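The two error types can be counted directly from the model’s outputs. In this hypothetical sketch, “positive” means “predicted eligible”, and the labels are made up:

```python
# Hypothetical sketch: counting the two error types for a benefits model.
import numpy as np

actual_eligible    = np.array([1, 1, 1, 0, 0, 0, 1, 0])
predicted_eligible = np.array([1, 0, 1, 1, 0, 0, 0, 0])

# False negatives: eligible people the model wrongly denies benefits to.
fn = np.sum((actual_eligible == 1) & (predicted_eligible == 0))
# False positives: ineligible people the model wrongly grants benefits to.
fp = np.sum((actual_eligible == 0) & (predicted_eligible == 1))

print("False negatives (eligible, denied):", fn)     # 2
print("False positives (ineligible, granted):", fp)  # 1
```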
- Education
Imagine an applicant’s admission application getting rejected due to bias in the underlying machine learning model. The bias may have resulted from the data on which the model was trained.
- Finance
In the financial industry, a model built with biased data may make predictions that violate the Equal Credit Opportunity Act (fair lending) by not approving the credit requests of deserving applicants.
End users could challenge such decisions, requiring the company to explain why a credit request was not approved. The law, enacted in 1974, prohibits credit discrimination based on attributes such as race, colour, religion, and gender. While building models, product managers (business analysts) and data scientists do take steps to ensure that correct and representative data (covering different aspects) related to the features mentioned above is used to build (train and test) the model. Even so, the unintentional exclusion of some important features or data sets could result in bias.
CFPB warns of the gender and racial bias in AI for lending services
Let me know your views on the state of AI in today’s world, along with your suggestions on the facts presented in this first article and how it could be improved.
Also, don’t forget to subscribe to the newsletter for updates on new articles and upcoming exciting sections.