Machine Learning for Phishing Website Detection / Yuan, Ying. - (2024 Mar 07).

Machine Learning for Phishing Website Detection

YUAN, YING
2024

Abstract

Phishing attacks are on the rise and phishing websites are everywhere, exposing the brittleness of security mechanisms that rely on blocklists. Prior work proposed enhancing Phishing Website Detectors (PWD) to mitigate this threat with data-driven techniques powered by Machine Learning (ML). The main advantage of ML models is their intrinsic ability to notice weak patterns in the data that humans overlook, and then to leverage such patterns to devise ‘flexible’ detectors that can counter even adaptive attackers. This dissertation addresses three significant aspects arising from the interaction between machine learning and phishing website detection: (i) adversarial attacks against machine learning-based phishing website detection (ML-PWD), (ii) user perceptions of phishing webpages, and (iii) phishing website detection in a multi-language environment (i.e., Chinese and Western).

The first part examines the security of ML-based phishing website detection. Existing literature on adversarial ML focuses either on showing attacks that break every ML model, or on defenses that withstand most attacks. Unfortunately, little consideration is given to the actual cost of the attack or the defense. We formalize the “evasion-space” in which an adversarial perturbation can be introduced to fool an ML-PWD, and propose a realistic threat model describing evasion attacks against ML-PWD that are cheap to stage. Our contribution paves the way for a much-needed re-assessment of adversarial attacks against ML systems for cybersecurity.

The second part of the dissertation presents a study of how users perceive phishing and adversarial phishing webpages. Adversarial phishing webpages containing perturbations can easily fool ML-based PWD, but it remains uncertain whether these perturbations enhance individuals’ ability to identify phishing webpages. Our study indicates that adversarial phishing webpages containing typos are more likely to be noticed by users.

The third and last part of the dissertation reveals the gap between Chinese and Western ML-based PWD. It urges future work on PWD to take into account applicability to multilingual environments, and paves the way for PWD systems that can protect users with different backgrounds.
Files in this item:
File: tesi_Ying_Yuan.pdf
Access: open access
Description: tesi_Ying_Yuan
Type: Doctoral thesis
Size: 7.21 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3511376