Measuring Gender Bias in Spanish Language Models
Nick Doiron, Tufts University / Independent Research
Neural networks in natural language processing (NLP) are frequently built on large pre-trained language models and word embeddings. It is well established that the proximity of words' vectors can introduce gender bias into the final model (e.g., assuming doctors are male). Researchers have developed tools to measure bias in English pre-trained models, but how can these be adapted to Spanish, a language with grammatical gender?
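As a minimal illustration of how vector proximity encodes this bias, the sketch below compares a profession's cosine similarity to masculine and feminine pronouns. The spaCy model (es_core_news_md, which ships with static Spanish vectors) and the specific words are illustrative assumptions; any embedding source works the same way.

```python
import numpy as np
import spacy

# Assumption: es_core_news_md is installed; it provides static Spanish vectors.
nlp = spacy.load("es_core_news_md")

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

doctor = nlp("doctor")[0].vector
masc = nlp("él")[0].vector
fem = nlp("ella")[0].vector

# A positive gap means "doctor" sits closer to the masculine pronoun.
print(cosine(doctor, masc) - cosine(doctor, fem))
```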
I translated the word lists for the Word Embedding Association Test (WEAT), creating parallel columns of masculine and feminine forms where appropriate (gendered professions, family relationships, and adjectives). This made it easier to measure bias in mBERT and in a monolingual Spanish model (BETO).
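A condensed sketch of the WEAT effect size over parallel gendered lists is below. The word lists are abbreviated examples, not the full translated sets, and static spaCy vectors stand in for the BETO/mBERT embeddings used in the study.

```python
import numpy as np
import spacy

nlp = spacy.load("es_core_news_md")  # stand-in for BETO/mBERT embeddings

def vec(word):
    return nlp(word)[0].vector

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to B."""
    return (np.mean([cosine(vec(w), vec(a)) for a in A])
            - np.mean([cosine(vec(w), vec(b)) for b in B]))

def weat_effect_size(X, Y, A, B):
    """Difference of mean target associations, normalized by the pooled
    standard deviation of all association scores."""
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy)

# Abbreviated parallel columns (masculine / feminine forms of the same words).
career_m = ["doctor", "ingeniero", "abogado"]
career_f = ["doctora", "ingeniera", "abogada"]
masc_attrs = ["él", "hombre", "padre"]
fem_attrs = ["ella", "mujer", "madre"]
print(weat_effect_size(career_m, career_f, masc_attrs, fem_attrs))
```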
Later, I used spaCy and the BETO word vectors to "flip" the grammatical gender of full sentences. This makes it possible to train a model on less biased examples, or to test whether a model would produce a different outcome for a different gender.
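A much-simplified sketch of the rule-based flipping idea follows: use spaCy's morphological analysis to find gendered tokens and swap them for counterparts from a lookup table. The swap table here is a tiny illustrative stub; the actual generator needs far broader coverage, with open-class words (professions, adjectives) resolved via word vectors rather than a fixed dictionary.

```python
import spacy

nlp = spacy.load("es_core_news_md")

# Illustrative stub only; the real system covers many more word pairs.
SWAPS = {"el": "la", "la": "el", "él": "ella", "ella": "él",
         "doctor": "doctora", "doctora": "doctor"}

def flip_gender(text):
    doc = nlp(text)
    out = []
    for tok in doc:
        # Only touch tokens spaCy marks with grammatical gender.
        if tok.morph.get("Gender") and tok.lower_ in SWAPS:
            swapped = SWAPS[tok.lower_]
            # Preserve the original capitalization.
            if tok.text[0].isupper():
                swapped = swapped.capitalize()
            out.append(swapped + tok.whitespace_)
        else:
            out.append(tok.text_with_ws)
    return "".join(out)

print(flip_gender("El doctor dijo que él vendría."))
# -> "La doctora dijo que ella vendría."
```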
Results, Outcomes, Conclusions
- Compared results on BETO (a monolingual Spanish model) and mBERT (a 100-language model)
- Generated counterfactual sentences from Wikipedia articles
- Data augmentation on a Spanish movie-reviews dataset greatly improved accuracy (see the sketch after this list)
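A hedged sketch of the augmentation step: pair each review with its gender-flipped counterfactual under the same label, so sentiment is not tied to gender. The `flip_gender` helper comes from the sketch above, and the two example reviews are invented placeholders, not rows from the actual dataset.

```python
def augment_with_counterfactuals(examples):
    """Double the training set: each review plus its gender-flipped copy,
    keeping the original label."""
    augmented = []
    for text, label in examples:
        augmented.append((text, label))
        augmented.append((flip_gender(text), label))  # from the sketch above
    return augmented

train = [("El actor estuvo brillante.", 1),
         ("La película fue aburrida.", 0)]
print(augment_with_counterfactuals(train))
```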
Python libraries used
NumPy, HuggingFace/Transformers, spaCy, WEAT
Future work
- Could a seq2seq model replace the programmatic, spaCy-based counterfactual generator?