Staging Cancer Through Text Mining of Pathology Records

Belloni, Pietro; Boccuzzo, Giovanna; Guzzinati, Stefano; Italiano, Irene; Rossi, Carlo R.; Rugge, Massimo; Zorzi, Manuel

doi:10.1007/978-3-030-51222-4_4

Healthcare record systems store valuable information, and over 40% of it is estimated to be unstructured, in the form of free clinical text. The Veneto Cancer Registry provided a collection of pathology records: these medical records refer to cases of melanoma and contain free text, in particular the diagnosis. The aim of this research is to extract from the free text the size of the primary tumour, the involvement of lymph nodes, the presence of metastasis, and the cancer stage. This goal is pursued with text mining techniques based on a supervised statistical approach. Since extracting information from free text can be framed as a statistical classification problem, we apply several machine learning models to extract the variables mentioned above from the text. A gold standard for these variables is available: the clinical records have already been assessed case by case by an expert. Among the estimated models, gradient boosting performs best. Despite its good performance, the classification error is not low enough to allow this kind of text mining procedure to be used in a Cancer Registry as proposed.
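The framing in the abstract, extraction as supervised classification, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six toy reports and their labels are invented (they are not Veneto Cancer Registry data), the features are plain word counts, and the classifier is a small hand-written gradient boosting of one-word decision stumps with squared-error loss. The abstract does not state which features, loss, or tuning the authors used.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_vocab(docs):
    # Sorted list of every distinct token seen in the training texts.
    return sorted({tok for d in docs for tok in tokenize(d)})

def vectorize(doc, vocab):
    # Bag-of-words counts; tokens outside the vocabulary are ignored.
    counts = Counter(tokenize(doc))
    return [counts[w] for w in vocab]

def fit_stump(X, residuals):
    # Pick the word whose presence/absence split best fits the residuals.
    best = None
    for j in range(len(X[0])):
        absent = [r for x, r in zip(X, residuals) if x[j] == 0]
        present = [r for x, r in zip(X, residuals) if x[j] > 0]
        if not absent or not present:
            continue
        a_mean = sum(absent) / len(absent)
        p_mean = sum(present) / len(present)
        err = (sum((r - a_mean) ** 2 for r in absent)
               + sum((r - p_mean) ** 2 for r in present))
        if best is None or err < best[0]:
            best = (err, j, a_mean, p_mean)
    return None if best is None else best[1:]

def stump_predict(stump, x):
    j, a_mean, p_mean = stump
    return p_mean if x[j] > 0 else a_mean

def fit_boosting(X, y, n_rounds=10, lr=0.5):
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        preds = [p + lr * stump_predict(stump, x) for p, x in zip(preds, X)]
    return base, stumps, lr

def predict(model, x):
    base, stumps, lr = model
    score = base + lr * sum(stump_predict(s, x) for s in stumps)
    return 1 if score >= 0.5 else 0

# Invented toy reports; label 1 = lymph node involvement, 0 = none.
docs = [
    "melanoma with metastasis in sentinel lymph node",
    "lymph node positive for metastasis",
    "metastasis found in two lymph nodes",
    "melanoma no lymph node involvement",
    "sentinel lymph node negative",
    "melanoma lymph nodes free of disease",
]
labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(docs)
X = [vectorize(d, vocab) for d in docs]
model = fit_boosting(X, labels)
```

Calling `predict(model, vectorize(text, vocab))` on a new report returns the predicted class. In a setting like the one the abstract describes, the expert-assessed gold standard would supply the labels, and the error would be estimated on held-out records rather than on the training texts.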

Year: 2021
Book title: Data Science and Social Research – II. Methods, Technologies and Applications
Book series: Studies in Classification, Data Analysis, and Knowledge Organization
DOI: https://dx.doi.org/10.1007/978-3-030-51222-4_4
Scopus ID: 2-s2.0-85097646488
OpenAlex ID: W3109393126
ISBN: 9783030512217
Publication type: 02.01 - Contribution in volume (Chapter or Essay)

Files in this record:

File	Type	Access	Size	Format
Belloni2021_Chapter_StagingCancerThroughTextMining.pdf	Published (Publisher's Version of Record)	Restricted (private, not public)	327.7 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3353104
