Body-Shaming Detection and Classification in Italian Social Media

Valese, Alberto;
2024

Abstract

In recent decades, the Natural Language Processing (NLP) community has shown a sustained commitment to addressing societal challenges, particularly hate-speech detection. Despite these advances, hate speech persists, especially online, where users of social network platforms often find themselves in unsafe and potentially harmful environments. Among the many manifestations of hate speech and offensive language, one that the NLP community has largely overlooked is body-shaming. Although it is prevalent among hateful users and can harm a diverse range of individuals, from women to people with disabilities, efforts to counteract it remain limited. In this work, we first introduce a novel taxonomy that distinguishes and classifies instances of body-shaming by the targeted group. We then present a dataset of Instagram comments for body-shaming detection and classification in Italian, manually annotated according to the taxonomy. After detailing the data-gathering and annotation process, we present a classification benchmark using three BERT-based models to showcase the dataset's classification potential. Results show good performance in detecting body-shaming instances across several categories of the proposed taxonomy.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
29th International Conference on Natural Language and Information Systems, NLDB 2024
9783031702389
9783031702396


Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3540789