A study on lemma vs stem for legal information retrieval using R tidyverse. IMS UniPD @ AILA 2020 Task 1
Di Nunzio G. M.
2020
Abstract
In this paper, we describe the results of the participation of the Information Management Systems (IMS) group at AILA 2020 Task 1, precedents and statutes retrieval. In particular, we participated in both subtasks: precedents retrieval (task a) and statutes retrieval (task b). The goal of our work was to compare and evaluate the efficacy of a simple reproducible approach based on the use of either lemmas or stems with a tf-idf vector space model and a plain BM25 model. The results vary significantly across subtasks and evaluation measures. For the subtask of statutes retrieval, our approach performed well, ranking second only to a participant that used BERT to represent documents.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
T1-10.pdf | Open access | Published (publisher's version) | Creative Commons | 470.49 kB | Adobe PDF