Thoughts and some criticism on "Re-imagining Algorithmic Fairness in India and Beyond".

Yoav Goldberg, Jan 30, 2021

This new paper from the Google Research Ethics Team (by Sambasivan, Arnesen, Hutchinson, Doshi, and Prabhakaran) touches on a very important topic: research (and supposedly also applied) work on algorithmic fairness---and more broadly AI ethics---is US-centric[*], reflecting US subgroups, values, and methods. But AI is also applied elsewhere (for example, in India). Do the methods and results developed for/in the US transfer? The answer is, of course, no, and the paper does a good job of showing it. If you are the kind of person who is impressed by the number of citations, this one has 220, a much higher number than another paper (not) from Google Research that became popular recently and which boasts many citations. I think this current paper (let's call it "the India Paper") is substantially more important, given that it raises a very serious issue that is at the center of AI ethics/fairness/etc, and which, unlike other papers, did not really receive any attention before. I hope it bootstraps many works along these lines; we certainly need more like these! Please do go read this paper.

Where the paper falls short

I really enjoyed reading this paper, and I learned a lot from it. But at the same time I felt it misses the mark. Simply put, I think they could have gone much further, as I will explain below. This is very understandable for a first paper on a topic, and it also might be related to all authors---including those of Indian origin---living and working around Mountain View, California. The paper is based on interviews, and of the 36 interviewees, 9 are from the US. Many of the interviewees seem to be academics in fields related to fairness---these are also likely influenced by current AI ethics research (which, as the authors point out, and I agree, is US-centric). So that's the "why", and why it is understandable (especially for a first paper). Now let's talk about the "what".

This could be anywhere.

The feeling I got when reading this paper was: "I really liked the intro, but it is really not about India at all; you could substitute any country here, even European ones, and many things would be pretty much the same". The problems it raises (different subgroups, and different markers for these subgroups, which translate to different axes of unfairness; data distortions; lack of local resources; lack of journalistic interest; opacity; overfitting to the privileged group; a different legal system; and many others) are universal.

Choose a random country in the world (yes, also in Europe), and it won't have the same protected groups as the US (race, gender, and age may recur, of course, but with different levels of importance, and you will almost surely have other divisions and social/ethnic/other classes that are not present in the US-centric discourse). Similarly, "zipcode is correlated with race" will not behave like that elsewhere, and so on. Data will be distorted (there is an implicit claim that this is less the case in the US, which I am doubtful about), the issue may not be the center of journalistic attention, apps will be built by and for the privileged group, etc, etc. Taking datasets and methods from the US as-is will not work well, and the ecosystem and media attention around AI are very different. These things are universal.

This culminates in Figure 2, titled "Research pathways for Algorithmic Fairness in India", which is in some ways the "core" contribution of the paper. It is universal. Its message applies to any continent, country or region in the world in which you want to apply AI ethics/fairness. In a sense this actually makes the point of the paper stronger: it is a universal problem. I wish it were stated this way, for the benefit of US-centric readers, who may not realize how different the world is.

It is still US-centric / FAccT-centric.

[disclaimer: Though my Mom was born in India, I am truly not an expert on Indian society, culture, or anything else. The authors know much more than me. These are just some observations. I hope an interesting debate will arise.]

Moving to my bigger issue with the paper: while it argues against US-centrism (which it calls West-centrism), it feels that, overall, the approach taken is still very much US/West/FAccT-centric, with only slight adaptations for India. In particular, the ethics and fairness framework remains the same; only the technical details (e.g., who is marginalized and how, what the correlated features are, etc.) are adapted. There are many frameworks of thought about ethics, fairness, and scales of harm, and these vary across cultures, populations, and social systems. The paper touches on this in the Introduction and in Section 2, and in particular in 2.2 ("fairness perception across cultures"), where it mentions research into differences in what is perceived as fair, different notions of justice, and different moral foundations. However, sadly, it stops there. I would have loved more discussion of this issue as it applies to India.

The paper does touch on what it calls "Indic justice models", where it discusses the "reservation system" ("quotas") enacted in India. This is an affirmative action system by which different bodies allocate quotas of between 30% and 80% of positions to people from certain classes. The paper rightly points out that most of the developed fairness algorithms are not compatible with such a quota system (such quotas are illegal in the US, and so there isn't much incentive to develop algorithms that support them). This is an important issue for people who want to apply "fair ML" algorithms within a system that requires quotas, and important information for people who develop such models.
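
To make the incompatibility concrete, here is a minimal, purely illustrative Python sketch (my own, not from the paper): a reservation rule hard-allocates a minimum share of slots to a designated group, a constraint that a plain score-based selector (and many parity-style fairness formulations) simply does not express. The candidates, group labels, and the 50% share are all hypothetical.

```python
import numpy as np

# Hypothetical data: 10 candidates, 4 slots, group 1 standing in for a reserved category.
rng = np.random.default_rng(0)
scores = rng.normal(size=10)
groups = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 1])

def select_top_k(scores, k):
    # Plain merit-style selection: the k highest scores, group-blind.
    return list(np.argsort(scores)[::-1][:k])

def select_with_reservation(scores, groups, k, reserved_group, reserved_share):
    # Reservation-style selection: at least `reserved_share` of the k slots
    # are hard-allocated to `reserved_group`; the remaining slots go by score.
    order = list(np.argsort(scores)[::-1])
    n_reserved = int(np.ceil(reserved_share * k))
    reserved = [i for i in order if groups[i] == reserved_group][:n_reserved]
    rest = [i for i in order if i not in reserved][: k - len(reserved)]
    return reserved + rest

print(select_top_k(scores, k=4))
print(select_with_reservation(scores, groups, k=4, reserved_group=1, reserved_share=0.5))
```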

But note that this is still only a technical issue. The legal system in India allows/requires quotas for historically oppressed populations, while the US system forbids them, but the quotas are fully compatible with the Rawlsian / systemic-oppression / equity-driven framework that governs US/FAccT AI-fairness work. And they may not be fully compatible with how people in India perceive fairness and justice. As pointed out to me in a DM by an Indian person (who identifies as "admittedly from a priveleged caste"): "that ignores the fact that there is opposition to it here. The opposition goes from fundamental opposition from the principles of individual equality to pragmatic arguments on how it ends up further entrenching the caste divisions. Even Dr. Ambedkar who proposed it advocates it as a temporary measure". And in any case, the private sector in India, to my understanding, does not have such "quotas", or at least is not required to have them. So is this really an "Indic framework of fairness"? Or is it just a legal situation that happens to be in place, and which coincides with the underlying justice framework that is already assumed by the discourse on AI ethics?

If some institutions have quotas and others do not, what should algorithms (and algorithm developers) optimize for? Are there other fairness frameworks that apply to India which could be considered? (And maybe also in the US?) Maybe scales of harm differ? Maybe different tradeoffs are in order? I wish the authors had ventured further into these topics, and I seriously hope followup works will go there.

At the end of page 1, the authors state very explicitly: "We use feminist, decolonial, and anti-caste lenses to analyze our data". I'm not an expert on India by any means, but at least the first two sound like rather Western-influenced lenses to me, not necessarily Indian ones. I'm sure India has its share of decolonial and feminist thought, but to what extent is it compatible with "Western" thought on these concepts? Likewise, a lot of the discourse revolves around oppression and marginalization. Given my cultural background, I find all of these very natural, but are these the Indian frameworks of choice, or is it a subtle case of Western (academic) colonialism? I truly don't know, and I'd love to hear from Indian people or scholars of India about these topics.

Transparency about Ethics / Fairness Frameworks

This relates to a criticism that was also directed at the Stochastic Parrots / Large-LMs paper, and which I think applies to many "non-mathy" FAccT papers: not being explicit about your fairness/ethics framework. I think we should embrace a culture of transparency around ethics frameworks, even the default ones. While "mathy" fairness works are always explicit about their fairness assumptions (they optimize a loss, and so they encode the fairness objective mathematically), this is not the case for many of the "policy papers" (or "opinion papers"). I think it should be. In NLP, Emily Bender enacted the "Bender Rule", by which if you talk about "Language" in your research paper, you should state which language it is, even, and especially, if it is the default English. I would like the same norm for people using ethics/fairness frameworks, even if they are the FAccT default. Are we optimizing for the equality gap? For total utility? Are we optimizing maxmin welfare? The average? How do we rank harms against each other? Or do we believe that all harms are equally bad? Etc. These are important questions, and I'd love to see them made explicit.
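
As a toy illustration of why stating the objective matters (my own example, not from any paper): two hypothetical policies with made-up per-group utilities, where the verdict on which is "fairer" flips depending on whether you score by average utility, maxmin welfare, or the equality gap.

```python
# Two hypothetical policies, each described by its per-group utilities (made-up numbers).
policy_a = [0.9, 0.9, 0.2]   # higher average, one group left far behind
policy_b = [0.6, 0.6, 0.5]   # lower average, much smaller gap

def average_utility(u):
    return sum(u) / len(u)

def maxmin_welfare(u):
    # Rawlsian flavor: how well off is the worst-off group?
    return min(u)

def equality_gap(u):
    # Gap between the best- and worst-off groups (lower is better).
    return max(u) - min(u)

for name, u in [("A", policy_a), ("B", policy_b)]:
    print(name, round(average_utility(u), 2), maxmin_welfare(u), round(equality_gap(u), 2))
# Average utility prefers A; maxmin welfare and the equality gap prefer B.
```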

A note about "Google Research Ethics Team"

Let us take a step back from this paper and its authors, and discuss the "Google Research Ethics Team" as an institution. I stated before that I think the Ethics Team is ethics-washing for Google, and that it is a missed opportunity. I maintain that this is the case, and that this paper is a prime example.

This paper has a very nice level of scholarship; it asks an important question; it is interesting, well executed, well researched, and well written (sans my reservations above). It does good for the world. It also has nothing to do with Google or with Google's business or products. It could just as well have been written by academics, in any university. Why does it need to be done at Google?

It's nice that Google is investing in "general ethics" and contributes to the community, but is it a diversion from considering Google's own important ethics issues? Given the outcomes of the (very, very mild) large-LMs paper story, it seems that Google really does not want to publicly touch any topic which may conflict with its business in any way.

This is very understandable as far as Google is concerned. Surely they don't want to "wash their dirty laundry" in public. But this is, I think, where the missed opportunity is. What if the ethics team was not public facing, and had no publishing objectives? What if it was more like a security team, auditing actual Google applications for issues and working with the teams to address them, without making things public, and without discussing failures outside the company? Sure, it would not be as ethically pure, and it may not be nearly as lucrative, but it could make a very significant change for the good in the world. (This proposal does not mean that things should not be regulated outside of the companies. They certainly should.)

[*] The paper calls this West-centric, but almost all occurrences of the word "West" can be safely replaced with "US".
