Introduction to AI Bias

By Nidhi Singh, CCG

Note: This article is adapted from an op-ed published in The Hindu Business Line.

A recent report by Nasscom estimates that the integrated adoption of artificial intelligence (AI) and a data utilisation strategy could add USD 500 billion to the Indian economy. In June 2022, MeitY published the Draft National Data Governance Framework Policy, which aims to enhance the access, quality, and use of non-personal data in ‘line with the current emerging technology needs of the decade.’ This is another step in the worldwide push by governments to bring machine learning and AI models, which are trained on individuals’ data, into the sphere of governance. 

While India is still considering the legislative and regulatory safeguards that must accompany the use of this data in AI systems, many countries have already begun deploying such systems. For example, in January 2021, the Dutch government resigned en masse in response to a child welfare fraud scandal involving the alleged misuse of benefit schemes. 

The Dutch tax authorities used a ‘self-learning’ algorithm to assess benefit claims and classify them according to their potential risk for fraud. The algorithm flagged certain applications as being at a higher risk of fraud, and these applications were then forwarded to an official for manual scrutiny. The officials would receive applications that the system stated had a higher likelihood of containing false claims, but they were not told why the system had flagged these applications as high-risk. 

Following the adoption of an overly strict interpretation of government policy on identifying fraudulent claims, the AI system used by the tax authorities began to flag every data inconsistency, including actions as minor as failing to sign a page of the form, as an act of fraud. Additionally, the Dutch government’s zero-tolerance policy on tax fraud meant that erroneously flagged families had to return benefits not only from the period in which the fraud was alleged to have been committed but from up to five years before that as well. Finally, the algorithm also learnt to systematically identify claims filed by parents with dual citizenship as high-risk, and these were subsequently marked as potentially fraudulent. As a result, a disproportionately high number of the people labelled as fraudsters by the algorithm had an immigrant background. 

What makes the situation more complicated is that it is difficult to narrow down a single factor that caused the ‘self-learning’ algorithm to arrive at a biased output, owing to the ‘black box effect’: the lack of transparency about how an AI system makes its decisions. This biased output delivered by the AI system is an example of AI bias.

The problems of AI Bias

AI bias is said to occur when there is an anomaly in the output produced by a machine learning algorithm. This may be caused by prejudiced assumptions made during the algorithm’s development process or by prejudices in the training data. The concerns surrounding potential AI bias in the deployment of algorithms are not new. For almost a decade, researchers, journalists, activists, and even tech workers have repeatedly warned about the consequences of bias in AI. The process of creating a machine learning algorithm is based upon the concept of ‘training’. In a machine learning process, the computer is exposed to vast amounts of data, from which it learns how to make judgements or predictions. For example, an algorithm designed to judge a beauty contest would be trained on pictures and data relating to beauty pageants from the past. AI systems use algorithms made by human researchers, and if they are trained on flawed data sets, they may end up hardcoding bias into the system. In the beauty contest example, the algorithm failed its desired objective: it eventually chose its winners based solely on skin colour, thereby excluding contestants who were not light-skinned.

This brings us to one of the most fundamental problems in AI systems – ‘Garbage in – Garbage out’. AI systems are heavily dependent on accurate, clean, and well-labelled training data to learn from, which will, in turn, produce accurate and functional results. A vast majority of the time in the deployment of AI systems is spent preparing the data through processes like data collection, cleaning, preparation, and labelling, some of which tend to be very human-intensive. Additionally, AI systems are usually designed and operationalised by teams that tend to be homogeneous in their composition, that is to say, they are generally composed of white men. 

There are several factors that make AI bias hard to counter. One of the main problems is that the very foundations of these systems are often flawed. Recent research has shown that ten key data sets widely used for machine learning and data science, including ImageNet (a large dataset of annotated photographs intended for use as training data), are in fact riddled with errors. These errors can be traced to the quality of the data the system was trained on or, for instance, to biases introduced by the labellers themselves, such as labelling more men as doctors and more women as nurses in pictures. 
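Labelling skews of this kind can often be caught with a simple audit of the label distribution before training. The sketch below is a minimal, hypothetical illustration (the records and field names are invented for this example, not drawn from any real data set):

```python
# Minimal sketch: audit a labelled data set for skewed label distributions
# before training. The records and field names below are hypothetical.
from collections import Counter

records = [
    {"gender": "male", "occupation": "doctor"},
    {"gender": "male", "occupation": "doctor"},
    {"gender": "male", "occupation": "doctor"},
    {"gender": "female", "occupation": "doctor"},
    {"gender": "female", "occupation": "nurse"},
    {"gender": "female", "occupation": "nurse"},
    {"gender": "female", "occupation": "nurse"},
    {"gender": "male", "occupation": "nurse"},
]

# Count how often each (gender, occupation) pair appears in the labels.
counts = Counter((r["gender"], r["occupation"]) for r in records)

# For each occupation, report the gender split; a heavy skew in the
# training labels will be learnt and reproduced by the model.
for occupation in sorted({r["occupation"] for r in records}):
    split = {g: counts[(g, occupation)] for g in ("male", "female")}
    print(occupation, split)
```

A real audit would run over the full data set and compare the splits against a reference distribution, but even this rough tally makes an imbalance visible before it is baked into a model.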

How do we fix bias in AI systems?

This is a question that many researchers, technologists, and activists are trying to answer. Some of the more common approaches include inclusivity – both in the context of data collection and in the design of the system. There have also been calls for increased transparency and explainability, which would allow people to understand how AI systems make their decisions. For example, in the case of the Dutch algorithm, while the officials received an assessment from the algorithm stating that an application was likely to be fraudulent, it did not provide any reasons as to why the algorithm detected fraud. If the officials in charge of the second round of review had had more transparency about what the system would flag as an error, including missed signatures or dual citizenship, it is possible that they may have been able to mitigate the damage.

One possible mechanism to address the problem of bias is the ‘blind taste test’. This mechanism checks whether the results produced by an AI system depend upon a specific variable, such as sex, race, economic status, or sexual orientation. Simply put, it tries to ensure that protected characteristics like gender, skin colour, or race do not play a role in decision-making processes.

The mechanism involves testing the algorithm twice: the first time with the variable, such as race, and the second time without it. In the first run, the model is trained on all the variables, including race; in the second, it is trained on all variables excluding race. If the model returns the same results, the AI system can be said to make predictions that are blind to that factor. If, however, the predictions change with the inclusion of a variable, such as dual citizenship status in the case of the Dutch algorithm or skin colour in the beauty contest, the AI system would have to be investigated for bias. This is just one of the potential mitigation tests. States are also experimenting with other technical interventions, such as the use of synthetic data to create less biased data sets. 
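The two-run comparison above can be sketched in a few lines of code. What follows is a minimal, hypothetical illustration using a tiny hand-rolled logistic regression on synthetic data (the feature names and the data are invented for this sketch; this is not the actual Dutch system or any production fairness test):

```python
# Minimal sketch of the "blind taste test": train the same model twice,
# once with a protected attribute and once without, then compare outputs.
# The data and feature names are hypothetical illustrations.
import math
import random

def train_logistic(X, y, lr=0.5, epochs=100):
    """Tiny logistic regression trained with per-sample gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, X):
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            for xi in X]

# Hypothetical features per applicant: [form_inconsistency, dual_citizenship].
# The labels deliberately follow dual citizenship only, mimicking a biased
# training set.
random.seed(0)
X_full = [[random.randint(0, 1), random.randint(0, 1)] for _ in range(200)]
y = [x[1] for x in X_full]
X_blind = [[x[0]] for x in X_full]  # same data, protected column removed

w1, b1 = train_logistic(X_full, y)    # run 1: with the protected attribute
w2, b2 = train_logistic(X_blind, y)   # run 2: without it

matches = sum(a == b for a, b in zip(predict(w1, b1, X_full),
                                     predict(w2, b2, X_blind)))
agreement = matches / len(y)
print(f"agreement with vs. without the protected attribute: {agreement:.2f}")
# Low agreement means the attribute drives the decisions: investigate for bias.
```

Because the synthetic labels here track the protected attribute exactly, the two runs disagree heavily, which is precisely the signal that would trigger a bias investigation under this test.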

Where do we go from here?

The Dutch case is merely one in a long line of instances that warrant higher transparency and accountability requirements for the deployment of AI systems. Many approaches have been, and are still being, developed to counter bias in AI systems. However, the crux remains that it may be impossible to fully eradicate bias from AI systems, since the biases of human developers and engineers are bound to be reflected in the technological systems they build. The effects of these biases can be devastating, depending upon the context and the scale at which the systems are deployed. 

While new and emerging technical measures can be used as stopgaps, in order to comprehensively deal with bias in AI systems, we must address the biases of those who design and operationalise them. In the interim, regulators and states must step up to carefully scrutinise, regulate, or in some cases halt the use of AI systems that provide essential services to people. One example of such regulation could be the framing and adoption of risk-based assessment frameworks for the adoption of AI systems, wherein the regulatory requirements depend upon the level of risk the systems pose to individuals. This could include permanently banning the deployment of AI systems in areas where they may pose a threat to people’s safety, livelihood, or rights, such as credit scoring systems or other systems that could manipulate human behaviour. For AI systems assessed as lower-risk, such as AI chatbots used for customer service, there may be a lower threshold for the prescribed safeguards. The question of whether AI systems can ever truly be free from bias may never be fully answered; however, we can say that the harms these biases cause can be mitigated with proper regulatory and technical measures. 

