Adnan-commits/phishguard


🛡️ PhishGuard: Hybrid Phishing Detection Platform

PhishGuard is a hybrid phishing detection platform that combines rule-based heuristics and machine learning to detect malicious URLs and phishing emails, with explainable risk scoring so users understand why something was flagged.


📊 Results

Target                      Accuracy
Malicious URL Detection     87.8%
Phishing Email Detection    99.2%

πŸ—οΈ Architecture

PhishGuard runs two independent detection pipelines β€” one for URLs, one for emails β€” each combining a rule engine with a trained ML model.

         ┌─────────────────────────────────────────┐
         │              Flask Web App              │
         │          app/app.py + templates/        │
         └───────────────┬─────────────────────────┘
                         │
          ┌──────────────┴──────────────┐
          │                             │
   ┌──────▼──────┐               ┌──────▼──────┐
   │  URL Input  │               │ Email Input │
   └──────┬──────┘               └──────┬──────┘
          │                             │
   ┌──────▼──────┐               ┌──────▼──────┐
   │  url_rules  │               │ email_rules │   ← Rule-based heuristics
   └──────┬──────┘               └──────┬──────┘     (rules/)
          │                             │
   ┌──────▼──────┐               ┌──────▼──────┐
   │url_features │               │email_features│  ← Feature extraction
   └──────┬──────┘               └──────┬──────┘     (features/)
          │                             │
   ┌──────▼────────────┐    ┌───────────▼─────────────┐
   │phishing_url_model │    │ phishing_email_model    │  ← Trained ML models
   │      .pkl         │    │ + email_tfidf_vectorizer│    (model/)
   └──────┬────────────┘    └───────────┬─────────────┘
          │                             │
          └──────────────┬──────────────┘
                         │
                  ┌──────▼──────┐
                  │ Risk Score  │
                  │  + Verdict  │
                  └─────────────┘
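
The final box in the diagram merges the rule engine's findings with the model's output. The README does not state how the two are weighted, so the following is only a minimal sketch under assumed values: the 40/60 weighting, the 0.3 and 0.7 thresholds, and the names `RiskResult` and `combine_scores` are all hypothetical and not taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class RiskResult:
    score: float                                  # blended risk in [0, 1]
    verdict: str                                  # "safe", "suspicious", or "phishing"
    reasons: list = field(default_factory=list)   # human-readable explanations


def combine_scores(rule_hits, ml_probability, rule_weight=0.4):
    """Blend rule hits and an ML probability into one explainable result.

    rule_hits: list of (reason, severity) pairs, severity in [0, 1].
    ml_probability: the model's phishing probability in [0, 1].
    rule_weight: assumed share of the final score given to the rules.
    """
    if not 0.0 <= ml_probability <= 1.0:
        raise ValueError("ml_probability must be in [0, 1]")

    # Cap the summed severities so many weak hits cannot exceed 1.0.
    rule_score = min(1.0, sum(severity for _, severity in rule_hits))
    score = rule_weight * rule_score + (1.0 - rule_weight) * ml_probability

    # Assumed thresholds; real values would be tuned on held-out data.
    if score >= 0.7:
        verdict = "phishing"
    elif score >= 0.3:
        verdict = "suspicious"
    else:
        verdict = "safe"

    # Keep every fired rule as an explanation, plus the model's contribution.
    reasons = [reason for reason, _ in rule_hits]
    reasons.append(f"ML model phishing probability: {ml_probability:.2f}")
    return RiskResult(round(score, 3), verdict, reasons)


result = combine_scores(
    [("URL uses a raw IP address", 0.5), ("No HTTPS", 0.2)],
    ml_probability=0.9,
)
print(result.verdict, result.score)
for reason in result.reasons:
    print(" -", reason)
```

Returning the list of reasons alongside the score is what makes the verdict explainable: the user sees which rules fired, not just a number.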

How Each Pipeline Works

URL Detection

  1. url_rules.py applies rule-based heuristic checks on the URL structure
  2. url_features.py extracts numerical features from the URL
  3. phishing_url_model.pkl (trained ML model) classifies based on those features
  4. A combined risk score and verdict are returned
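
The first two URL steps can be sketched as follows. This is an illustration only: the actual checks in url_rules.py and the feature set in url_features.py are not listed in this README, so the keyword list, the feature names, and the functions `apply_url_rules` and `extract_url_features` are assumptions chosen to show the general shape.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list; not taken from the project.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account", "update")


def is_ip_host(host):
    """Return True if the hostname is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def apply_url_rules(url):
    """Return a human-readable reason for each heuristic rule that fires."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    reasons = []
    if is_ip_host(host):
        reasons.append("Host is a raw IP address")
    if "@" in parsed.netloc:
        reasons.append("'@' in the authority part can hide the real host")
    if parsed.scheme != "https":
        reasons.append("Connection is not HTTPS")
    if any(word in url.lower() for word in SUSPICIOUS_WORDS):
        reasons.append("URL contains a credential-related keyword")
    return reasons


def extract_url_features(url):
    """Turn a URL into a dict of numeric features (illustrative set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(is_ip_host(host)),
    }


url = "http://192.168.10.5/secure-login/update.php"
print(apply_url_rules(url))
print(extract_url_features(url))
```

The rules return explanations, while the features return numbers; the numbers would be what a classifier such as phishing_url_model.pkl consumes.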

Email Detection

  1. email_rules.py applies heuristic checks on email headers and content
  2. email_features.py extracts features from the email body and metadata
  3. email_tfidf_vectorizer.pkl vectorizes the text content
  4. phishing_email_model.pkl classifies the vectorized input
  5. A combined risk score and verdict are returned
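
To make the vectorization step concrete, here is a small pure-Python stand-in for TF-IDF. It is not the project's code: the project loads a pickled vectorizer (email_tfidf_vectorizer.pkl) whose settings are not stated here. The sketch uses raw term counts, the smoothed IDF `ln((1 + n) / (1 + df)) + 1`, and L2 normalisation, which is the variant scikit-learn documents for its default settings; the function names and the three-message corpus are made up for illustration.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"\b\w\w+\b")  # words of two or more characters


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def fit_idf(documents):
    """Compute smoothed IDF weights: ln((1 + n) / (1 + df)) + 1."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def transform(text, idf):
    """Map text to an L2-normalised {term: weight} vector.

    Terms not seen during fitting are dropped, as with a fixed vocabulary.
    """
    counts = Counter(t for t in tokenize(text) if t in idf)
    weights = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0.0:
        return {}
    return {term: w / norm for term, w in weights.items()}


# Toy corpus, invented for this example.
corpus = [
    "Your account has been suspended, verify your password now",
    "Meeting notes attached, see you on Monday",
    "Verify your invoice for the account renewal",
]
idf = fit_idf(corpus)
vector = transform("Please verify your account password", idf)
for term, weight in sorted(vector.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.3f}")
```

Terms that appear in fewer training messages, such as "password" here, receive a larger weight than terms shared across messages.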

🧰 Tech Stack

Layer                 Technology
Web Framework         Python, Flask
Frontend              HTML, CSS (served via Flask templates)
ML Models             scikit-learn (.pkl; separate models for URL and email)
Text Vectorization    TF-IDF (email_tfidf_vectorizer.pkl)
Feature Engineering   Custom Python modules (url_features.py, email_features.py)
Rule Engine           Custom Python modules (url_rules.py, email_rules.py)

πŸ“ Project Structure

phishguard/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ app.py               # Flask application β€” routes & detection logic
β”‚   β”œβ”€β”€ static/              # CSS, JS, and static assets
β”‚   └── templates/           # HTML templates
β”œβ”€β”€ features/
β”‚   β”œβ”€β”€ url_features.py      # Feature extraction for URLs
β”‚   β”œβ”€β”€ email_features.py    # Feature extraction for emails
β”‚   └── feature_order.txt    # Defines feature vector ordering
β”œβ”€β”€ model/
β”‚   β”œβ”€β”€ phishing_url_model.pkl        # Trained URL classifier
β”‚   β”œβ”€β”€ phishing_email_model.pkl      # Trained email classifier
β”‚   β”œβ”€β”€ email_tfidf_vectorizer.pkl    # TF-IDF vectorizer for email text
β”‚   β”œβ”€β”€ load_model.py                 # URL model loader
β”‚   └── load_email_model.py           # Email model loader
β”œβ”€β”€ rules/
β”‚   β”œβ”€β”€ url_rules.py         # Heuristic rules for URL detection
β”‚   └── email_rules.py       # Heuristic rules for email detection
β”œβ”€β”€ utils/                   # Shared utility functions
β”œβ”€β”€ requirements.txt
└── runtime.txt
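
A file like feature_order.txt matters because a trained model expects its inputs in the same column order used during training. The sketch below shows one way such a file could be applied. The format of the real file is not shown in this README, so the one-name-per-line layout, the feature names, and the helpers `parse_feature_order` and `to_vector` are assumptions.

```python
def parse_feature_order(text):
    """Read one feature name per line, skipping blanks and '#' comments."""
    names = []
    for line in text.splitlines():
        name = line.strip()
        if name and not name.startswith("#"):
            names.append(name)
    return names


def to_vector(features, order):
    """Arrange a feature dict into a list that follows `order` exactly."""
    missing = [name for name in order if name not in features]
    if missing:
        # Fail loudly rather than feed the model a misaligned vector.
        raise KeyError(f"missing features: {missing}")
    return [float(features[name]) for name in order]


# Hypothetical file contents; the real feature_order.txt is not shown here.
order = parse_feature_order("url_length\nnum_dots\nuses_https\n")
features = {"uses_https": 0, "url_length": 43, "num_dots": 3}
print(to_vector(features, order))  # [43.0, 3.0, 0.0]
```

Raising on a missing feature, instead of silently filling a default, keeps a change in the extraction code from shifting columns unnoticed.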

🚀 Getting Started

Prerequisites

  • Python 3.x
  • pip

Installation

git clone https://github.com/Adnan-commits/phishguard.git
cd phishguard
pip install -r requirements.txt

Running the App

cd app
python app.py

Then open http://localhost:5000 in your browser.


👤 Author

Adnan Bardgujar LinkedIn · GitHub


⚠️ PhishGuard is an academic project. It is not intended as a production-grade security tool without further dataset validation and hardening.
