What constitutes an unacceptable real-world bias? We might argue that any perspective which unfairly disadvantages someone based on non-material criteria is an unfair bias. For example:
- Choosing one job candidate over another due to skin color or “ethnic-sounding names”
- Setting different pricing for one customer over another because of religious views
- Paying one person less than another despite equal performance due to gender
- Wrongfully terminating someone without cause due to sexual orientation
All of the above scenarios are generally regarded as illegal and unacceptable in modern business. Companies which willfully implement policies which enable the above behaviors face severe consequences, as they should.
What if our machines are learning these behaviors from us in an unconscious way?
How Machines Learn
The basic process of training a machine learning system goes something like this:
- Gather data
- Clean data
- Engineer features
- Choose algorithms
- Test algorithms
- Select model
- Test model
- Refine model
- Operationalize model
What’s happening in the process is that we give machines the data we want them to learn from (steps 1-2), tell them what data to use (3), then help them decide how they’ll learn (4-8). Once the machine has learned and is generating good results, we release it into production (9).
When data scientists execute the machine learning process above, they spend the vast majority – 60-80% – of their time on steps 1 and 2 (according to data scientist David Langer). They spend a minority of time on step 3 (~20%), and invest their remaining time on steps 4-9.
Consider the process we just outlined. Is it any surprise that companies rush to step 9 as quickly as possible in order to start achieving ROI?
Is it any surprise that the crops of brand new data scientists, fresh out of university classes or professional development courses, spend most of their time and energy studying algorithms and modeling?
These are natural human tendencies – to want to do the cool stuff, to want to achieve results as quickly as possible for maximum gain.
Where Bias Creeps Into Data Science
Where bias creeps in, however, is in feature engineering. During feature engineering, we choose and shape the data for the algorithms we’ll expose it to. Bias creeps into data science because we breeze past feature engineering as quickly as possible to “get to the good stuff”.
Consider this simple dataset for a fictional HR recruiting database:
- College or University Attended
- Last Company Employer
- Last Company Employer Separation Date
- Number of LinkedIn Recommendations
- Number of LinkedIn Endorsements
- Last Job Applied For
- Last Job Applied For Date
- Last Website Visit Date
- Last Email Opened Date
- Number of Clicks in Last Email
Suppose our mission as data scientists was to develop a machine learning model that could predict who we should hire.
An inexperienced data scientist might look through the database to find missing or corrupted data, then load the entire dataset, as is, and start testing algorithms. They’d select some of the best-known algorithms and dig right into building a model, find a model that generates what looks like statistically accurate, relevant results, and hand it off to the business user triumphantly.
Do you see the problem?
What the machine might learn from this dataset is that a predictor of who to hire might be white men, aged 31-36, who have more than 20 LinkedIn endorsements. The model would then recommend only job candidates who fit that criteria.
While that model might be statistically valid, it’s also illegal. Age, ethnicity, and gender should not be considerations in a hiring model. Yet the inexperienced or rushed data scientist skipped past feature engineering, the critical stage at which those invalid fields would have been removed. That data would not and should not be a part of the machine learning model.
What Should Have Happened
The experienced data scientist would know to invest lots of time in feature engineering to explicitly screen out potential bias from our training data. If our hiring data to date has a past human bias of not hiring women at the same rate as men, our machine learning model would learn to emulate that behavior unless we explicitly removed gender from consideration.
What should have happened is that we should have removed any data which could have led to an illegal outcome, an illegal model.
The important part here is that we did not intentionally create bias. We did not set out to willfully discriminate against one group or another. However, historically we may have, especially if we use large longitudinal datasets that span decades. Our inexperience, our haste, or our inability to recognize situations involving potential bias caused the problem.
Now, the dataset example above is just a handful of criteria. Imagine a dataset with thousands of columns and millions of rows, something we cannot physically remember. It’s easy to see how bias could creep in if inexperienced or rushed data scientists are building models from massive datasets.
The great danger here is that in many machine learning applications, the end user never sees the model, never sees the code, never sees the training data. Thus, we may be working with biased models and not know it until months or years later when we start seeing unexpected trends in our results.
Protecting the Future of AI, Protecting Our Future
If AI is to have a permanent, valuable place in our society in the years and decades to come, we must rigorously oversee the creation of its models. We must tell it what is explicitly forbidden, and train it to recognize biases conscious and unconscious. If we do that well, we will create a more fair, more just, and more pleasant society as our machines guide us away from our baser instincts.
Want to read more like this from Christopher Penn? Get updates here:
Get your copy of Marketing Blue Belt!