
Our fine-tuned BERT model is a sophisticated tool designed to detect sexism in texts. By leveraging advanced NLP techniques, contextual embeddings, and extensive pre-training, we aim to contribute to the ongoing efforts in addressing biases in language. By Arnaud Blanchard, Carina Prunkl, Marion Gagnadre & Elizabeth den Dulk


Esmedd/detecting-sexism


detecting-sexism

You're Not Sexist

As our final project for the Le Wagon Data Science & AI course, we built an NLP model that detects sexism in text. The initial classification is binary: 'sexist' or 'not sexist'.

Unlike previous research and models, we used a dataset composed of six text corpora, not limited to social media.

Because the datasets use different annotation rules, we hoped to give the model richness and nuance, at the risk of making it 'overly accusatory'.
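Merging corpora that were annotated under different rules requires mapping every label onto the binary target. The sketch below shows one possible way to do that. The scheme names (`binary`, `category`, `score`), the 0.5 cutoff, and the sample rows are all assumptions made for illustration; they do not describe the annotation schemes of the six corpora used in this project.

```python
def to_binary(label, scheme: str) -> int:
    """Map a source-specific label to 1 (sexist) or 0 (not sexist).

    The three schemes handled here are hypothetical examples.
    """
    if scheme == "binary":
        # Corpus already labels texts as sexist / not sexist.
        return int(label in (1, "1", "sexist", True))
    if scheme == "category":
        # Corpus assigns a fine-grained category; any category counts as sexist.
        return int(str(label).lower() not in ("none", "not sexist", ""))
    if scheme == "score":
        # Corpus gives a continuous score; the 0.5 cutoff is an assumption.
        return int(float(label) >= 0.5)
    raise ValueError(f"unknown annotation scheme: {scheme}")


# Illustrative rows: (text, original label, annotation scheme)
rows = [
    ("text a", "sexist", "binary"),
    ("text b", "none", "category"),
    ("text c", 0.8, "score"),
    ("text d", "hostile", "category"),
]

merged = [(text, to_binary(label, scheme)) for text, label, scheme in rows]
print(merged)
```

Collapsing every positive category or score into a single 'sexist' label is a broad mapping, which is one way a merged dataset could push a model toward flagging more text.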

yourenotsexist Initial Models

Our final model, which can be tested here, is a fine-tuned BERT model optimized for precision.

yourenotsexist Final Model
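Precision is the share of texts flagged as sexist that really are sexist: TP / (TP + FP). As a minimal sketch of what favoring precision can look like in practice, the example below computes precision at two decision thresholds. The labels, scores, and thresholds are made up for illustration, and raising the threshold is only one possible approach, not necessarily how this model was tuned.

```python
def precision(y_true, y_pred):
    """Precision = true positives / all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def predict(scores, threshold=0.5):
    """Flag a text as sexist (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical gold labels and model scores for five texts.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.6, 0.7, 0.2, 0.4]

for threshold in (0.5, 0.65):
    p = precision(y_true, predict(scores, threshold))
    print(f"threshold={threshold}: precision={p:.2f}")
```

On this toy data the stricter threshold removes the one false positive, so precision rises, while the sexist text scored 0.4 is still missed at both thresholds. That trade-off against recall is the cost of favoring precision.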

The options for further exploration include, but are not limited to:

  • Developing a French model
  • Developing a multilingual model
  • Augmenting our dataset(s)
    • Translation
    • Scraping Reddit/Instagram/YouTube/Twitch
    • Generative AI
    • Text templates (e.g., "I hate women, they're all (bitch/slut/whore)s")
  • Annotating our own dataset based on gender theory and language theory
  • Multi-class classification
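The text-template idea in the list above can be sketched as a small slot-filling routine that expands one template into several labeled examples. This is only an illustration: the function name `expand_template`, the slot names, and the placeholder terms are assumptions, and a real lexicon would need to be curated and reviewed by annotators.

```python
from itertools import product


def expand_template(template: str, slots: dict[str, list[str]]) -> list[str]:
    """Fill every combination of slot values into the template."""
    names = list(slots)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(slots[name] for name in names))
    ]


# Placeholder slot values stand in for a curated lexicon (hypothetical).
template = "I hate {target}, they're all {term}s"
slots = {
    "target": ["women"],
    "term": ["<TERM_A>", "<TERM_B>", "<TERM_C>"],
}

# Each generated text is labeled 1 (sexist) by construction.
examples = [(text, 1) for text in expand_template(template, slots)]
for text, label in examples:
    print(label, text)
```

Template-generated text is repetitive, so such examples would probably work best as a small supplement to naturally occurring data rather than a replacement for it.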
