PhishShield: URL Phishing Detection Tool using Machine Learning Algorithm
Keywords:
Phishing, Machine Learning Algorithm, Uniform Resource Locator (URL), Random Forest, Security
Abstract
Phishing attacks are a significant threat in cybersecurity. Phishing uses social engineering to steal customers' identity and financial credentials. Existing phishing detection tools rely on blacklists and heuristics-based methods. However, blacklists are ineffective against zero-day phishing attacks, while heuristics produce many false positives. Addressing challenges such as rapid Uniform Resource Locator (URL) growth, false positives, and the inability to adapt to new phishing attacks requires a proactive and adaptable solution. The proposed tool, PhishShield, is designed using Object-Oriented Analysis and Design (OOAD) and implemented in the Python programming language. PhishShield consists of nine modules accessible to both admin and user roles, each according to its privileges. URL prediction uses machine learning: features are extracted from the address bar of the URL, and the Random Forest algorithm performs the classification. Although the machine learning algorithm is powerful, it still produces false positives and false negatives. To mitigate this, a scoring mechanism is employed. Of the five phishing URLs tested, 40% were flagged as phishing, 40% as suspicious, and 20% as legitimate. The Random Forest algorithm alone identified 80% of these as malicious and 20% as non-malicious. Of the five legitimate URLs tested, 80% were correctly classified as legitimate, while 20% were marked as suspicious. The Random Forest algorithm alone flagged 20% of these as malicious and 80% as non-malicious. In summary, PhishShield's detection capabilities, role-based access control, adaptability, and continuous learning offer a significant improvement over traditional phishing detection tools, making it a powerful solution for combating phishing threats.
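To illustrate how address bar-based features and a scoring mechanism might fit together, the following is a minimal Python sketch and not PhishShield's actual implementation. The feature set, the 75-character length threshold, the weight given to the classifier, and the score cut-offs are all assumptions chosen for illustration, as are the function names `extract_address_bar_features` and `score_url`. The Random Forest verdict is passed in as a plain boolean, since the abstract does not specify how the model is trained or invoked.

```python
import ipaddress
from urllib.parse import urlparse


def extract_address_bar_features(url: str) -> dict:
    """Return illustrative address-bar features (1 = suspicious trait present)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A host given as a raw IP address instead of a domain name.
    try:
        ipaddress.ip_address(host)
        has_ip = 1
    except ValueError:
        has_ip = 0

    # Look for '//' anywhere after the 'scheme://' prefix (possible redirect).
    after_scheme = len(parsed.scheme) + 3

    return {
        "has_ip": has_ip,
        "long_url": int(len(url) >= 75),  # assumed threshold
        "has_at": int("@" in url),
        "double_slash_redirect": int(url.find("//", after_scheme) != -1),
        "dash_in_host": int("-" in host),
        "many_subdomains": int(not has_ip and host.count(".") >= 3),
        "no_https": int(parsed.scheme != "https"),
    }


def score_url(features: dict, model_says_malicious: bool) -> str:
    """Combine heuristic hits with the classifier verdict (assumed weights)."""
    score = sum(features.values()) + (3 if model_says_malicious else 0)
    if score >= 5:
        return "phishing"
    if score >= 2:
        return "suspicious"
    return "legitimate"


if __name__ == "__main__":
    feats = extract_address_bar_features("http://192.168.0.1/login@secure-update")
    # model_says_malicious stands in for a Random Forest prediction.
    print(feats, score_url(feats, model_says_malicious=True))
```

With a three-level verdict of this kind, a URL that the classifier flags but that shows no other suspicious traits lands in "suspicious" and is not labelled phishing outright, which is one way a scoring layer can soften the classifier's false positives.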



