AI ethics is not an optimization problem

Often, AI researchers and engineers think of themselves as neutral and “objective”, operating in a framework of strict formalization. Fairness and absence of bias, however, are social constructs; there is no objectivity, no LaTeX-typesettable remedies, no algorithmic way out. AI models are developed based on a history and deployed in a context. In AI as in data science, the very absence of action can be of political significance.

Why this post, why now

When you work in a field as intellectually satisfying, challenging and inspiring as software design for machine learning, it is easy to focus on the technical and keep the broader context out of sight. Some would even say it is required. How else can you keep up the necessary level of concentration?

But even for someone who hasn’t been in the field that long, it is evident that, with deep-learning-based technologies progressing faster and faster over ever-shorter time spans, their misuse has increased as well with every year that passes, not just in selected countries but all over the world.

To eliminate one possible misunderstanding right from the start: When I’m talking about “faster and faster” progress, I’m not over-hyping things. I’m far from thinking that AI is close to “solving” problems like language understanding, concept learning and their likes — the kind of problems some would argue hybrid models were needed for. The thing is that it doesn’t matter. It is exactly the kinds of things AI does do so well that lend themselves to misuse. It is people we should be afraid of, not machines.

Back to the why. Over time, it increasingly appeared to me that writing regularly about AI in a technical way, but never writing about its misuse, was ethically questionable in itself. However, I also became increasingly conscious of the fact that with a topic like this, once you enter the political realm (and that we have to is the main point of this text), the likelihood rises that people will disagree, or worse, feel offended for reasons the writer did not anticipate. Some will find this too radical, some not radical (or explicit) enough.

But when the alternative is to stay silent, it seems better to try and do one’s best.

Let’s start with two terms whose recent rise in popularity matches that of AI as a whole: bias and fairness.

But we are doing a lot to improve fairness and remove bias, aren’t we?

Search for “algorithmic fairness”, “deep learning fairness” or something similar, and you’ll find lots of papers, tools, guidelines … more than you have time to read. And bias: Haven’t we all heard about image recognition models failing on black people, women, and black women in particular; about machine translation incorporating gender stereotypes; about search engine results reflecting racism and discrimination? Given enough media attention, the respective algorithms get “fixed”; what’s more, we may also safely assume that overall, researchers developing new models will probe for these exact kinds of failures. So as a community, we do care about bias, don’t we?

Re: fairness. It’s true that there are many attempts to increase algorithmic fairness. But as Ben Green, who I’ll cite a lot in this text, points out, most of this work does so via rigid formalization, as if fairness — its conflicting definitions notwithstanding — were an objective quantity, a metric that could be optimized just like the metrics we usually optimize in machine learning.

Assume we were willing to stay within the formalist frame. Even then, there is no satisfying solution to this optimization problem. Take the perspective of group fairness, where a fair algorithm is one that results in equal outcomes between groups. Chouldechova shows that when an algorithm achieves predictive parity (equal precision, i.e., positive predictive value, across groups), but prevalence differs between groups, it is not possible to also have both equal false positive rates and equal false negative rates.
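One way to see this is via the relation that ties the quantities together within each group (a sketch of the argument; here p is a group’s prevalence, PPV the positive predictive value, FPR and FNR the false positive and false negative rates):

$$
\mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot\bigl(1-\mathrm{FNR}\bigr)
$$

If PPV is held equal across groups while p differs, the identity can only hold in both groups if FPR, FNR, or both differ between them; equalizing all three at once is impossible (short of a perfect classifier).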

That said, here I mainly want to focus on why pure formalism is not enough.

What about bias? That datasets can be (or rather: are) biased is hardly something anyone working in AI would object to. How much bias people are willing to admit exists in the other components and stages of the machine learning process varies, though.

Suresh and Guttag map sources of bias to the model development and deployment process. Forms of bias they distinguish include representation (think: dataset), measurement (think: metrics), aggregation (think: model), and evaluation (similar to the two preceding ones, but at test time) bias. The paper is written in a technical style; in fact, there is even a diagram where the authors attempt to formalize the process in a mathematical way. (Personally, I find it hard to see what value is added by this diagram; its existence can probably best be explained by [perceived] requirements of the genre, namely, the research paper.)

Now, so far I’ve left out the remaining two sources of bias they name. Mapped to the end of the process is deployment bias, when a system “is used or interpreted in inappropriate ways”. That raises a question. Is this about the end of a process, or is what happens now a completely different process (or: processes)? For an in-house data scientist, it may well be the end of a process; for a scientist, or for a consultant, it is not. Once a scientist has developed and published a model, they have no control over who uses it and how. From the collection of humankind’s empirical truths: Once a technology exists, and it is useful, it will be used.

Things get further out of control at the other end of the timeline. Here we have historical bias:

Historical bias arises when there is a misalignment between the world as it is and the values or objectives to be encoded and propagated in a model. It is a normative concern with the state of the world, and exists even given perfect sampling and feature selection.

I’ll get back to why this is called “historical bias” later; the definition is a lot broader, though. Here we are definitely beyond the realm of formalization: we’re entering the realm of normativity, of ways of viewing the world, of ethics.

Please don’t get me wrong. I’m not criticizing the paper; in fact, it provides a useful categorization that practitioners may use as a check list. But given the formalist style, a reader is likely to focus on the readily formalizable parts. It is then easy to dismiss the other two as not really being relevant to one’s work, and certainly not something one could exert influence on. (I’ll get back to that later.)

One little aside before we leave formalism: I do believe that a lot of the formalist work on AI ethics is done with good intentions; however, work in this area also benefits the organizations that undertake it. Citing Tom Slee,

Standards set public benchmarks and provide protection from future accusations. Auditable criteria incorporated into product development and release processes can confirm compliance. There are also financial incentives to adopt a technical approach: standards that demand expertise and investment create barriers to entry by smaller firms, just as risk management regulations create barriers to entry in the financial and healthcare industries.

Not everything in life is an optimization problem

Stephen Boyd, who teaches convex optimization at Stanford, is said to often start the introductory lecture with the phrase “everything is an optimization problem”. It sounds intriguing at first; certainly a lot of things in my life can be thought of like that. It may become awkward once you start to compare, say, time spent with loved ones and time spent working; it becomes completely unfeasible when comparing across people.

We saw that even under formalist thinking, there is no impartial way to optimize for fairness. But it’s not just about choosing between different types of fairness. How do you weigh fairness against stakeholder interests? No algorithm will be deployed that does not serve an organization’s purpose.

There is thus an intimate link between metrics, objectives and power.

Metrics and power

Even though, terminologically, deep learning distinguishes between optimization and metrics, the metrics really are what we are optimizing for: Goodhart’s law (“when a measure becomes a target, it ceases to be a good measure”) does not seem to apply. Still, they deserve the same questioning and inspection as metrics in other contexts.

In AI as elsewhere, optimization objectives are proxies; they “stand in” for the thing we’re really interested in. That proxying can fail in many ways, but failure is not the only category to think in here.

For one, objectives are chosen according to the dominant paradigm. Dotan and Milli show how a technology’s perceived success feeds back into the criteria used to evaluate other technologies (as well as future instances of itself). Imagine a world where models were ranked not just for classification accuracy, but also for, say, climate friendliness, robustness to adversarial attacks, or reliable quantification of uncertainty.
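As a toy illustration of how different evaluation criteria reshuffle a ranking (all models, numbers and weights below are invented, not taken from any real benchmark):

```python
# Three fictional models, ranked once by accuracy alone and once by a
# composite score that also rewards low energy use and robustness to
# adversarial examples. All numbers are invented.

models = {
    "model_A": {"accuracy": 0.95, "energy_kwh": 800, "robust_acc": 0.55},
    "model_B": {"accuracy": 0.92, "energy_kwh": 120, "robust_acc": 0.70},
    "model_C": {"accuracy": 0.90, "energy_kwh": 40,  "robust_acc": 0.75},
}

def composite(m, w_acc=0.5, w_energy=0.2, w_robust=0.3):
    # Normalize energy so that lower consumption scores higher.
    max_energy = max(v["energy_kwh"] for v in models.values())
    energy_score = 1 - m["energy_kwh"] / max_energy
    return w_acc * m["accuracy"] + w_energy * energy_score + w_robust * m["robust_acc"]

by_accuracy = sorted(models, key=lambda k: models[k]["accuracy"], reverse=True)
by_composite = sorted(models, key=lambda k: composite(models[k]), reverse=True)

print("Ranked by accuracy alone:", by_accuracy)   # ['model_A', 'model_B', 'model_C']
print("Ranked by composite:    ", by_composite)   # ['model_C', 'model_B', 'model_A']
```

Which weighting is “right” is, of course, exactly the kind of question no leaderboard can answer by itself.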

Objectives thus do not emerge out of nothing. That they reflect paradigms may still sound abstract; that they serve existing power structures less so. Power structures are complex; they reflect more than just who “has the say”. As pointed out in various “classics” of critical race theory, intersectionalist feminism and related areas⁶, we should ask ourselves: Who profits?

This is anything but a simple question, comparable in difficulty, it seems to me, to the request that we question the unspoken premises underlying our theories. The main point about such premises is that we aren’t aware of them: if we were, we would have stated them explicitly. Similarly, if I noticed I was profiting from someone, I would, hopefully, take some action. Hard as the task may be, though, we have to do our best.

If it’s hard to see how one is privileged, can’t we just be neutral? Objective? Isn’t this getting too political?

There is no objectivity, and all is politics

To some people, the assertion that objectivity cannot exist is too self-evident to require much dwelling on. Are numbers objective? Maybe in some formal way; but once I use them in communication, they convey a message, independently of my intention. 

Let’s say I want to convey the fact, taken from the IPCC website, that between 2030 and 2052, global warming is likely to reach 1.5°C. Let’s also assume that I look up the pre-industrial average for the place where I happen to live (15°C, say), and that I want to show off my impressive meteorological knowledge. Thus I say, “… so here, guys, temperature will rise from 288.15 Kelvin to 289.65 Kelvin, on average”. This surely is an objective statement. But what if the people I’m talking to don’t know that, even though the absolute values are so much higher when expressed in Kelvin than in degrees Celsius, the differences are exactly the same: a rise of 1.5 Kelvin is a rise of 1.5°C? They might get the impression, totally unintended, that not much warming is going to happen.
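A quick back-of-the-envelope check makes the point (a sketch; the 15°C baseline is the invented local average from above): the absolute rise is identical in both scales, but the relative change looks negligible in Kelvin.

```python
# The same 1.5-degree rise, expressed in degrees Celsius and in Kelvin.
# The 15 degree C baseline is the invented local pre-industrial average.

baseline_c = 15.0
warming = 1.5

baseline_k = baseline_c + 273.15               # 288.15 K
new_c, new_k = baseline_c + warming, baseline_k + warming

print(f"Celsius: {baseline_c:.2f} -> {new_c:.2f} "
      f"(absolute +{new_c - baseline_c:.1f}, relative +{(new_c - baseline_c) / baseline_c:.1%})")
print(f"Kelvin:  {baseline_k:.2f} -> {new_k:.2f} "
      f"(absolute +{new_k - baseline_k:.1f}, relative +{(new_k - baseline_k) / baseline_k:.2%})")
# Celsius: 15.00 -> 16.50 (absolute +1.5, relative +10.0%)
# Kelvin:  288.15 -> 289.65 (absolute +1.5, relative +0.52%)
```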

If even numbers, once used in communication, lose their objectivity, this must hold even more for anything that involves more design choices: visualizations, APIs, and, of course, written text. For visualizations, this is nicely illustrated in d’Ignazio and Klein’s Data Feminism.

That book is also an exemplary exercise in what appears to be the only way to “deal with” the fact that objectivity is impossible: trying to be as clear as possible about where we stand, who we are, and what circumstances have influenced our way of thinking. Of course, as with the unspoken premises and assumptions discussed above, this is not easy; in fact, it’s impossible to do perfectly. But one can try.

In fact, the above “trying to be as clear as possible …” is deliberately ambiguous. It refers to two things: first, to me striving to analyze how I’m privileged; and second, to me giving that information to others. The first alone is laudable but necessarily limited; the second, as exercised by d’Ignazio and Klein, opens the door not just to better mutual understanding, but also to feedback and learning. The person I’m talking to might lead me to insights I wouldn’t have arrived at otherwise.

Putting things slightly differently, there is no objectivity because there is always a context. A focus on metrics and formalization detracts from that context. Green relates an interesting parallel from American legal history. Until the early twentieth century, US law was dominated by a formalist ethos: ideally, all rules should be traceable to a small number of universal principles, derived from natural rights. Autonomy being such a right, in a famous 1905 case the U.S. Supreme Court concluded that a law limiting the working hours of employees represented “unreasonable, unnecessary and arbitrary interference with the right and liberty of the individual to contract”. However,

In his dissent, Justice Oliver Wendell Holmes argued that the Court failed to consider the context of the case, noting, “General propositions do not decide concrete cases”.

Thus, the context matters. Autonomy is good; but it can’t be used as an excuse for exploitation.

The same context dependence holds in AI. It is always developed and deployed in a context, a context shaped by history. History determines what datasets we work with, what we optimize for, and who tells us what to optimize for. A daunting example is so-called “predictive policing”. Datasets used to train prediction algorithms incorporate a history of racial injustice: the very definition of “crime” they rely on was shaped by racist and classist practice. The new method then perpetuates (more than that: exacerbates) the current system, creating a vicious cycle of positive feedback that makes it look like the algorithm was successful.
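To see how such a feedback loop can play out, here is a deliberately crude toy simulation, not a model of any real system (the districts, rates and allocation rule are all invented): two districts with identical true offense rates, skewed historical records, and patrols sent to whichever district the records mark as the “hotspot”.

```python
# Toy feedback loop, not a model of any real system; all numbers invented.
# Two districts have identical true offense rates, but district 0 starts
# out with more historically recorded crime. Each year, patrols go to the
# predicted "hotspot" (the district with the most recorded crime), and
# crime only enters the data where police are present to record it.

true_rate = [50, 50]          # identical underlying offenses per year
recorded = [60, 40]           # but the historical records are skewed

for year in range(10):
    hotspot = 0 if recorded[0] >= recorded[1] else 1   # the "prediction"
    recorded[hotspot] += true_rate[hotspot]            # only policed crime is recorded

share_0 = recorded[0] / sum(recorded)
print(f"Recorded crime after 10 years: {recorded}, district 0 share: {share_0:.0%}")
# -> Recorded crime after 10 years: [560, 40], district 0 share: 93%
```

The records confirm the prediction, the prediction steers the records, and the initial skew hardens into apparent fact.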

Summing up: When there is no neutrality, everything is politics. Not acting is acting. Citing Green,

But efforts for reform are no more political than efforts to resist reform or even the choice simply to not act, both of which preserve existing systems.

But I’m just an engineer

When machine learning people, or computer science people in general, are asked about their views on the societal impact of modern AI, an often-heard answer is: “But I’m just an engineer…”. This is completely understandable. Most of us are just tiny cogs in those big machines, wired together in a complex network, that run the world. We’re not in control; how could we be accountable?

For the AI scientist, though normally all but sitting in an “ivory tower”, it’s nevertheless the ML engineers who are responsible for how a model gets deployed, and with what consequences⁷. The ML engineer may delegate to the head of IT, who in turn had no choice but to implement what was requested “by business”. And so on and so forth, ad infinitum.

That said, it is hard to come up with a moral imperative here. In line with the thinking exercised above: there is always a context. In some parts of the world, you have more choices than in others. Options vary with race, gender, abledness, and more. Maybe you can just quit and get another job; maybe you can’t.

There is another — related — point though, on which I’d like to dwell a little longer.

Technology optimism

Sometimes it’s not that the people working in AI are fully aware of the harm being done in their field but don’t see how to counter it. On the contrary: they are convinced that they’re doing good. Just do a quick search for “AI for good”, and you’ll be presented with a great number of projects and initiatives. But how, actually, is it decided what is good? Who decides? Who profits?

Ben Green, again, relates an instructive example:

USC’s Center for Artificial Intelligence in Society (CAIS) is emblematic of how computer science projects labeled as promoting “social good” can cause harm by wading into hotly contested political territory with a regressive perspective. One of the group’s projects involved deploying game theory and machine learning to predict and prevent behavior from “adversarial groups.” Although CAIS motivated the project by discussing “extremist organizations such as ISIS and Jabhat al-Nusra,” it quickly slipped into focusing on “criminal street gangs” [43]. In fact, the project’s only publication was a controversial paper that used neural networks to classify crimes in Los Angeles as gang-related [28, 41].

Predictive policing, already mentioned above, can also be seen in this category. At first thought, isn’t it a good thing? Wouldn’t it be nice if we could make our world a bit more secure?

Phillip Rogaway, who I’ll mention again in the concluding section, talks a lot about how technology optimism dominates among his students; he seems to be as intrigued by it as I am. Personally, I think that whether someone “intrinsically” tends to be a technology optimist or a pessimist is a persistent trait; it seems to be a matter of personality and socialization (or just call it fate: I don’t want to go into any nature-nurture debates here). That glass, is it half full or half empty?

All I could say to a hardcore technology optimist is that their utopia may be another person’s dystopia. Especially if that other person is poor, or black, or poor and a black woman … and so on.

Let me just end this section with a citation from a blog post by Ali Alkhatib centered on the launch of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Referring to the director’s accompanying blog post, he writes

James opens with a story of an office that senses you slouching, registers that you’re fatigued, intuits that your mood has shifted, and alters the ambiance accordingly to keep you alert throughout the day.

You can head to Alkhatib’s post (very worth reading) and read the continuation, expanding on the scenario. But for some people, this single sentence may already be enough.

And now?

At some point, an author is expected to wrap up and present ideas for improvement. With most of the topics touched upon here, this is yet another intimidating task. I’ll give it a try anyway. The best synthesis I can come up with at the moment looks about like this.

First, some approaches filed by Green under “formalist” can still make for a good start, or rather, can constitute a set of default measures, to be taken routinely. Most prominently, these include dataset and model documentation.
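To make this a bit more tangible, here is a minimal sketch of the questions such routine documentation might answer. The structure below is only an illustration, loosely inspired by the “datasheets for datasets” and “model cards” proposals; the field names are made up, not a standard.

```python
from dataclasses import dataclass, field

# An illustrative record of what one might document, as a matter of
# routine, for every dataset and model. Field names are invented, loosely
# following the "datasheets for datasets" and "model cards" ideas.

@dataclass
class DatasetSheet:
    name: str
    collected_by: str
    collection_period: str
    known_gaps_and_skews: list[str] = field(default_factory=list)
    intended_uses: list[str] = field(default_factory=list)
    uses_to_avoid: list[str] = field(default_factory=list)

@dataclass
class ModelCard:
    name: str
    trained_on: DatasetSheet
    performance_by_group: dict[str, float]
    out_of_scope_uses: list[str]

sheet = DatasetSheet(
    name="loan_applications_2015_2019",
    collected_by="in-house CRM export",
    collection_period="2015-2019",
    known_gaps_and_skews=["applicants rejected at the branch never enter the data"],
    intended_uses=["exploratory analysis", "model prototyping"],
    uses_to_avoid=["fully automated credit decisions"],
)
```

None of this settles any normative question, of course; it just makes the choices, and their context, visible.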

Beyond the technical, I like the advice given in Baumer and Silberman’s When the Implication is Not to Design. If the consequences of a technological approach, especially for socially marginalized groups, are unforeseeable, consider whether the problem can be solved in a “low-tech” way. And by all means, do not start with a solution and then go looking for a problem.

In some cases, even that may not be enough. With some goals, don’t look for alternative ways to achieve them. Sometimes the goal itself has to be questioned.

The same can be said for the other direction. With some technologies, there is no goal that could justify their application. This is because such a technology is certain to get misused. Facial recognition is one example.

Lastly, let me finish on a speculative note. Rogaway, already referred to above for his comments on technology optimism, calls on his colleagues, fellow cryptographers, to devise protocols in such a way that private communication stays private, that breaches are, in plain terms, impossible. While I personally can’t see how to port the analogy to AI, maybe others will be able to, drawing inspiration from his text. Until then, changes in politics and legislation seem to be the only recourse.


1: For brevity, I’ll be subsuming deep learning and other contemporary machine learning methods under AI, following common(-ish) usage.

2:  I can’t hope to express this better than Maciej Cegłowski did here, so I won’t elaborate on that topic any further.

3: More on that below. Metrics used in machine learning mostly are proxies for things we really care about; there are lots of ways this can go wrong.

4: Meaning, a one-model-fits-all approach puts some groups at a disadvantage.

5: Cited after Thomas and Uminsky.

6: See, e.g., Data Feminism and Race After Technology.