Integrating Content Moderation Systems with Large Language Models

Franco, Mirko; Gaggi, Ombretta; Palazzi, Claudio E.
2024

Abstract

Online Social Networks (OSNs) rely on content moderation systems to ensure platform and user safety by preventing malicious activities, such as the spread of harmful content. However, there is a growing consensus that such systems are unfair to historically marginalized individuals, fragile users, and minorities. Additionally, OSN policies are often hardcoded in AI-based violation classifiers, making personalized content moderation challenging. There is also a need for more communication between users and platform administrators, especially when they disagree about a moderation decision. To address these issues, we propose integrating content moderation systems with Large Language Models (LLMs) to enhance support for personal content moderation and improve user-platform communication. We also evaluate the content moderation capabilities of GPT 3.5 and LLaMa 2, comparing them to commercial products, and discuss the limitations of our approach and open research directions.
Files in this item:
File: 3700789.pdf
Access: open access
Type: Published (Publisher's Version of Record)
License: Creative Commons
Size: 519.26 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3540876
Citations
  • PubMed Central: n/a
  • Scopus: 7
  • Web of Science: n/a
  • OpenAlex: 6