Algorithms of Classification
This blog post is part of a series that critically addresses the use of algorithms and machine learning in our society. Highlighting the issues that emerge, it discusses what can be done to address them from a technological and societal point of view. The series also aims to report periodically on the interim findings of the research project FairAlgos. This episode focuses on a fundamental function of algorithms – classification – which has become increasingly prevalent and is the basis for many of the discriminatory and unfair results of algorithm and AI use.
In my introductory blog post, I broadly highlighted the many instances where AI applications produced inaccurate results, leading to biased and unfair decision-making. The list of examples can be extended over and over again, each time leading to similar conclusions: the AI reproduces the biases of our society that are represented in the data used to train the algorithms; the data is flawed and inaccurate in general – “garbage in, garbage out”; the bias only emerged after the technology was in operation, because designers did not foresee the context in which it occurred; and so on. The questions we have to ask when critically investigating algorithms are: why do these biases and discriminations emerge again and again, and what is the core issue we are dealing with? One answer to these questions revolves around the issue of classifications and profiling, and the resulting actions and non-actions that follow from these classifications.
Classifications are a normal part of our everyday life: we calculate and compare on a regular basis “such potentially incomparable values as career and family, [… or] freedom and commitment in love”[1]. Because they are omnipresent and found in almost every situation of life, the processes behind the creation of these classifications are often not visible and run in the background. Classifications appear natural; however, if we look more closely at how they are created and how individuals are ordered and grouped, we can see that this is a practice that involves politics and power.[2] Decisions are actively taken on how these classifications are made and which characteristics should be used to build them, even if those characteristics seem like natural traits. Like all classifications, they feed back onto individuals and groups, shaping how they ought to be by requiring constant adaptation to the definitions of the classifications. These practices are increasingly hidden in classification algorithms, which have taken on a significant role in the classification of individuals, as they are deemed a neutral technology able to perform this task objectively and accurately.
To examine the issue of the politics of (automated) classification and profiling, I will use an example that has been discussed prominently in the Austrian media and which I have also addressed in the previous post: the algorithm that is supposed to classify the employment chances of job seekers in Austria, more commonly known as the “AMS-Algorithm”. The AMS-Algorithm was first introduced in late 2018 with the goal of classifying job seekers into three categories based on their prospects on the job market, with the intention of distributing support efforts more efficiently:
Group A: High prospects in the short term
Group C: Low prospects in the long term
Group B: Mediocre prospects, meaning job seekers not part of groups A or C.[3]
The distribution of support by the Public Employment Service Austria (AMS) is thus defined by each candidate’s likelihood of finding a new job. This means that job seekers classified in Group A will receive less support, as their likelihood of finding a new job in the next months is already high, irrespective of further training or other support measures. Job seekers classified in Group C will also receive less support from the AMS and will instead be referred to other institutions, as this is a more cost-effective approach. The aim is thus for the AMS to focus its efforts fully on job seekers classified in Group B – the “mediocre prospects” – as they promise a better Return on Investment (ROI) than the low-prospect Group C. Although the AMS-Algorithm is meant as a semi-automated decision support system, it is to be expected that its outcome will rarely be questioned and will in most instances be implemented as is.
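To make this triage logic concrete, here is a minimal Python sketch of how two predicted re-employment probabilities could be mapped onto the three groups and the corresponding level of support. The function name, the probability inputs and the 0.66/0.25 cut-offs are illustrative assumptions for this sketch, not the actual AMS implementation.

```python
def classify_job_seeker(p_short_term: float, p_long_term: float) -> str:
    """Map predicted re-employment probabilities to an AMS-style group.

    p_short_term: estimated chance of finding a job within a few months
    p_long_term:  estimated chance of finding a job within roughly two years
    The 0.66 and 0.25 cut-offs are illustrative, not official values.
    """
    if p_short_term >= 0.66:
        return "A"  # high short-term prospects: less support deemed necessary
    if p_long_term <= 0.25:
        return "C"  # low long-term prospects: referral to external institutions
    return "B"      # everyone else: main target of AMS support


SUPPORT = {
    "A": "reduced support (expected to find work anyway)",
    "B": "full AMS support (highest expected return on investment)",
    "C": "reduced support, referral to other institutions",
}

group = classify_job_seeker(p_short_term=0.40, p_long_term=0.55)
print(group, "->", SUPPORT[group])  # B -> full AMS support (...)
```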
This classification of individuals into prospects, and the tying of support to these classifications, is problematic because of the characteristics used to classify job seekers, which appear to be driven entirely by a managerial politics and ideology that assigns a monetary value to individuals, so that only those with the highest ROI are deemed ‘suitable for investment’. The characteristics considered for the classification of job seekers in the AMS-Algorithm effectively introduce pre-existing biases and thus reinforce these biases on the job market. The data used for the system’s predictions are the individual’s employment records matched against wider job market data. This means that the variables used include attributes that are historically discriminated against on the labour market, such as gender, age, citizenship, health and care obligations (the latter, however, only considered for women). Many of these variables are hard-coded in such a way as to ‘best reflect the current job market’. As a result, if you are a female non-EU citizen in your late 40s with care obligations, you will almost always be classified in the low-prospect Group C, irrespective of other characteristics such as previous employment or education. And this is without even taking into account that erroneous data or a wrong attribution of variables is always possible, which may lead to individuals being misclassified.
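To illustrate how such hard-coded attributes can dominate a score, consider the following toy logistic model in Python. The variable names and coefficient values are invented for this sketch and are not the real AMS model weights; they merely show how demographic penalties can outweigh favourable attributes such as education or recent employment.

```python
import math

# Purely illustrative weights, invented for this sketch - NOT the real AMS model.
WEIGHTS = {
    "intercept":          0.8,
    "female":            -0.5,
    "age_over_45":       -0.7,
    "non_eu_citizen":    -0.8,
    "health_impairment": -0.5,
    "care_obligations":  -0.6,  # in the real system only counted for women
    "higher_education":   0.3,
    "recent_employment":  0.4,
}

def long_term_prospect(person: dict) -> float:
    """Toy logistic score: estimated probability of good long-term prospects."""
    z = WEIGHTS["intercept"] + sum(
        WEIGHTS[attr] for attr, present in person.items() if present
    )
    return 1 / (1 + math.exp(-z))

person = {
    "female": True,
    "age_over_45": True,
    "non_eu_citizen": True,
    "care_obligations": True,
    "higher_education": True,    # even favourable attributes ...
    "recent_employment": True,   # ... cannot offset the hard-coded penalties
}
print(round(long_term_prospect(person), 2))  # ~0.25: lands in the low-prospect group
```

Even with these made-up numbers, the pattern described above is visible: the favourable attributes cannot compensate for the fixed demographic penalties built into the model.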
This example shows that automated classification through algorithms, which is currently responsible for most profiling, bears a significant risk of reinforcing societal inequalities and is certainly not a neutral technology. While it may be true that the algorithm is designed to accurately represent current society, or in this case the conditions of the labour market, this also means that all its flaws – known and unknown – are represented in the algorithm as well. It is known that in most situations on the labour market, male job seekers are favoured over female job seekers, younger job seekers over older ones, and nationals over foreigners – for a variety of reasons. The inclusion of all these variables will ultimately lead to these inequalities being inscribed into the system, which will make it even more difficult to address and mitigate them in the future. Hence, if the goal is to improve the support for job seekers, simply copying the state of the labour market into the algorithm won’t help. Even if the idea is to be more cost-efficient, as the head of the AMS, Johannes Kopf, has argued, the introduction of score-based classifications and the profiling of individuals can nonetheless result in wrong classifications and hence an “inefficient” allocation of support. This is quite apart from the fact that it produces a very specific politics of classification, one that groups individuals into an efficient and an inefficient workforce.
The AMS-Algorithm is just one of many examples where such automated classification mechanisms are applied on a large scale and with significant consequences, performing politics on society through technology. In marketing, finance and the insurance industry, this type of automated profiling and classifying is the norm. Policing and crime fighting are increasingly assisted by predictive technologies, which function in a similar way. And, as I have also discussed in the last blog post, in the US such a system is in place to classify potential reoffenders in the court system. All these systems are introduced with the same rationale of having more efficient systems in place that are less prone to subjective decision-making by individuals, as they are performed by – supposedly – neutral technology. What these systems have in common is that they are not necessarily more efficient, but also not necessarily more objective. While the AMS-Algorithm may introduce certain Key Performance Indicators (KPIs) that appear to be objective, it tends to ignore the entire biographies of individuals, which are an important aspect of their prospects on the labour market, are difficult to translate into a numerical value, and are often only established through in-person consultation.
And while the proponents of such automated classification systems constantly repeat that their algorithm is only meant as a decision-support system and can easily be overruled by the end-users, once in use, end-users tend to (blindly) trust the outcome of such a system. As overruling requires a justification by the end-user, it will only happen in the rare instances where the apparent error is significantly larger than in normal cases. Furthermore, the presence of such a system also leads to the score becoming the primary source of information, as it is readily available, and it will hence dominate the decision-making regardless.
It should thus be noted that these automated classification systems have the potential to significantly impact individuals: on the one hand, by inscribing existing inequalities into algorithmic classification technologies and thus reinforcing and exacerbating them in practice; on the other hand, by introducing another layer of technological errors and biases, which can always be present in the data or the algorithmic models. And while human oversight is guaranteed in most instances, there will rarely be much divergence between the algorithmic outcome and the human decision. The reasons for this, however, will be the topic of the next blog post in my series on critical algorithm studies, in which I will discuss the epistemic authority of algorithmic knowledge.
[1] Espeland, W. N. and Stevens, M. L. (1998) ‘Commensuration as a Social Process’, Annual Review of Sociology, vol. 24, no. 1, p. 316.
[2] Bowker, G. C. and Star, S. L. (2008) Sorting things out: Classification and its consequences, Cambridge, Mass., MIT Press.
[3] Allhutter, D., Cech, F., Fischer, F., Grill, G. and Mager, A. (2020) ‘Algorithmic Profiling of Job Seekers in Austria: How Austerity Politics Are Made Effective’, Frontiers in Big Data, vol. 3. DOI: 10.3389/fdata.2020.00005.