Abstract
Phishing URLs are designed to harvest confidential data such as user
identities, passwords, and online payment details. Phishing techniques
have advanced rapidly alongside technology, and anti-phishing tools that detect
such sites can help counter them. Machine learning techniques for
identifying fraudulent websites have previously been proposed and put into practice. This
project's primary goal is to develop a detection system that is efficient,
accurate, and economical. A dataset of legitimate and phishing URLs is supplied to
the system and pre-processed into a format suitable for analysis. Each feature
category comprises distinct, well-defined phishing indicators, which are extracted
from the dataset of legitimate and phishing URLs. After training the classifier, we
evaluated its performance on a separate test set of labeled phishing and legitimate
URLs. Compared against seven different machine learning classifiers, the proposed
model achieved the highest test accuracy, up to 97.5%, with the Gradient Boosting
Classifier.
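
As an illustration of the pre-processing step described above, the following is a minimal sketch of how a raw URL might be turned into numeric features for a classifier. The function name `extract_features` and the five features shown are assumptions chosen for illustration; the abstract does not list the features actually used.

```python
import re
from urllib.parse import urlparse

# Hypothetical lexical features, chosen for illustration only;
# they are not taken from the paper.
IPV4_HOST = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def extract_features(url: str) -> dict:
    """Map a URL string to a small dictionary of numeric features."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Total number of characters in the URL.
        "url_length": len(url),
        # 1 if an '@' appears anywhere in the URL, else 0.
        "has_at_symbol": int("@" in url),
        # 1 if the host looks like a dotted IPv4 address rather than a domain name.
        "host_is_ip": int(IPV4_HOST.fullmatch(host) is not None),
        # Number of dots in the host, a rough proxy for subdomain depth.
        "num_dots_in_host": host.count("."),
        # 1 if the scheme is https, else 0.
        "uses_https": int(parsed.scheme == "https"),
    }
```

Feature dictionaries of this kind could then be assembled into a table and passed to any of the compared classifiers, including a gradient boosting model.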