MACHINE LEARNING AND ADVANCED STATISTICS IN ASTRONOMY:
TWO APPLICATIONS

De Pascale, Marco

In spectroscopy and photometry, the amount of data produced by surveys is increasing rapidly, and upcoming surveys will continue this trend. To extract information from these data on a useful time scale, the analysis can rely on techniques from statistics and computer science. This work presents the development and application of two automatic methods. The thesis is in two parts.

The first part describes the use of MATISSE, a parameterisation algorithm for stellar spectra developed at the Observatoire de la Côte d'Azur and part of the AMBRE project. It has been applied to ~126 000 spectra observed by the ESO:HARPS spectrograph. The parameters extracted by MATISSE are effective temperature, gravity, metallicity and α-element abundance, each with its associated error. Quality selection criteria have been defined. The accepted subsample of parameters has been compared with results from independent works, showing very good agreement. Additionally, these parameters identify the great majority of the stars as spectral types G and K, in agreement with the type of targets observed by HARPS. This confirms MATISSE as an excellent parameterisation algorithm.

The second part concerns the analysis of large amounts of photometric observations. It describes the development of a supernova classifier and its application to a set of simulated light curves. The method follows a "data-driven" approach: the aim is to extract from the data all the information necessary to solve the problem, using as few assumptions as possible. For this purpose, techniques from machine learning are exploited, which let a computer learn the rule that transforms input into output from example observations. The algorithms used are Gaussian processes for light-curve interpolation, diffusion maps for parameter extraction, and random forests to build the classification model.

The goal is to reproduce the spectroscopy-based classification into the three classes Ia, Ib/c and II using only light curves. In this respect the method fails, since it does not classify type Ib/c reliably. The main cause of this failure lies in the set of example light curves, which is not representative of the observed population of supernovae. On the other hand, when compared with independent results, the method is competitive in identifying type Ia supernovae.
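
The first two steps of the photometric pipeline named in the abstract (Gaussian-process interpolation of each light curve onto a common time grid, then a diffusion map over the interpolated curves) can be illustrated with a minimal NumPy sketch. This is not the thesis code: the squared-exponential kernel, the fixed hyperparameters, the median heuristic for the diffusion-map scale, the toy bell-shaped light curves and all function names are assumptions made for illustration. The random-forest step is omitted; in practice the resulting coordinates would be passed to a classifier.

```python
import numpy as np


def sq_exp_kernel(t1, t2, amplitude, length_scale):
    """Squared-exponential covariance between two sets of epochs (assumed kernel)."""
    dt = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (dt / length_scale) ** 2)


def gp_interpolate(t_obs, flux, flux_err, t_grid, amplitude=1.0, length_scale=10.0):
    """Posterior mean and standard deviation of a zero-mean GP evaluated on t_grid."""
    K = sq_exp_kernel(t_obs, t_obs, amplitude, length_scale) + np.diag(flux_err**2)
    K_s = sq_exp_kernel(t_grid, t_obs, amplitude, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, flux))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = amplitude**2 - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def diffusion_map(X, epsilon=None, n_coords=2, t=1):
    """Diffusion-map coordinates of the rows of X, using a Gaussian kernel."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    if epsilon is None:
        epsilon = np.median(sq_dist)  # assumed heuristic for the kernel scale
    W = np.exp(-sq_dist / epsilon)
    d = W.sum(axis=1)
    # Symmetric conjugate of the Markov matrix P = D^-1 W; it shares P's eigenvalues.
    S = W / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    psi = eigvecs / np.sqrt(d)[:, None]  # right eigenvectors of P
    # Skip the trivial first eigenvector (eigenvalue 1, constant over the sample).
    coords = psi[:, 1:n_coords + 1] * eigvals[1:n_coords + 1] ** t
    return coords, eigvals


# Toy demonstration: 20 synthetic bell-shaped light curves with two widths,
# irregularly sampled with noise (hypothetical data, not supernova templates).
rng = np.random.default_rng(0)
t_grid = np.linspace(0.0, 60.0, 30)
curves = []
for width in [6.0] * 10 + [12.0] * 10:
    t_obs = np.sort(rng.uniform(0.0, 60.0, size=15))
    err = np.full_like(t_obs, 0.05)
    flux = np.exp(-0.5 * ((t_obs - 25.0) / width) ** 2) + rng.normal(0.0, err)
    mean, std = gp_interpolate(t_obs, flux, err, t_grid)
    curves.append(mean)

X = np.vstack(curves)               # one interpolated light curve per row
coords, eigvals = diffusion_map(X)  # low-dimensional parameters per light curve
```

Each row of `coords` is a small set of parameters describing one light curve, which is the kind of input a random-forest classifier could then be trained on with spectroscopically confirmed types as labels.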

In spectroscopy and photometry, the volume of data produced by surveys is growing very quickly and will do so even more in the coming years. An analysis that extracts information on a useful time scale can be entrusted to automatic methods developed with techniques from statistics and computer science. This work presents the development and application of two automatic methods. The thesis is divided into two parts. The first part reports the use of the MATISSE algorithm, developed at the Observatoire de la Côte d'Azur, and of the AMBRE pipeline to parameterise ~126 000 spectra produced by the ESO:HARPS spectrograph. The parameters extracted by MATISSE are effective temperature, gravity, metallicity and α-element abundance, complete with errors. The subset of parameters that passed the quality criteria defined for the sample was compared with the results of independent works, showing excellent agreement. Moreover, the results identify the great majority of the stars as spectral types G and K, in agreement with the type of objects observed by HARPS. This confirms MATISSE as an excellent parameterisation algorithm. The second part is devoted to the analysis of large quantities of photometric data. It describes the development of a supernova classifier and its application to simulated light curves. The method follows a so-called "data-driven" approach, which seeks to extract from the data all the information needed to solve the problem while relying on as few assumptions as possible. To this end, the method relies on "machine learning" techniques, which let a computer learn the rule that transforms input into output from example samples.
Specifically, Gaussian processes are used to interpolate the light curves, "diffusion maps" for the parameterisation, and "random forests" to build the classifier itself. The aim is to replicate the spectroscopic classification into the three types Ia, Ib/c and II using only light curves. Here the method fails, as it cannot classify type Ib/c satisfactorily. The main cause lies in the set of available examples, which is not representative of the observed supernova population. By contrast, when compared with independent results, the method presented is competitive in identifying type Ia supernovae.

MACHINE LEARNING AND ADVANCED STATISTICS IN ASTRONOMY: TWO APPLICATIONS / De Pascale, Marco. - (2015 Jul 30).

	Date of defence
	
				30 Jul 2015
			
	Keywords
	
				supernova, photometric classification, machine learning, statistics, big data
			
	Appears in collections:
	
				08.01 - Tesi di Dottorato UNIPD (Deposito Legale)

Files in this item:

File	Size	Format
depascale_phd_thesis.pdf (open access; type: doctoral thesis; licence: Creative Commons)	12.33 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3424205
