ML Malicious URL

The document discusses the application of machine learning algorithms in detecting malicious URLs, highlighting a project aimed at achieving 98% accuracy. The author gathered a dataset of approximately 400,000 URLs, including 80,000 malicious ones, and plans to use Logistic Regression for analysis. The initial step involved creating a custom tokenizer to process the unique structure of URLs.

Uploaded by

davychuinz


With the growth of machine learning in the past few years, many tasks are now done with
the help of machine learning algorithms. Unfortunately or fortunately, little work has been
done on combining machine learning and cyber security, so I thought of presenting some
at Fsecurify.

A few days ago, I had an idea: what if we could tell malicious URLs apart from
non-malicious ones using a machine learning algorithm? There has been some research
on the topic, so I thought I should give it a go and implement something from scratch.
So let’s start.

Machine Learning and Security | Using Machine Learning to detect Malicious URLs with 98% accuracy

Gathering Data

The first task was gathering data. I did some surfing and found some websites offering
malicious links. I set up a little crawler and crawled a lot of malicious links from various
websites. The next task was finding clean URLs. Fortunately, I did not have to crawl any:
a data set was already available. Don’t worry that I am not mentioning the sources of the
data here. You’ll get the data at the end of this post.

So, I gathered around 400,000 URLs, of which around 80,000 were malicious and the rest
were clean. There we have it, our data set. Let’s move on.
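
To put those numbers in context, here is a small sketch of the class balance. The counts are the approximate figures quoted above; the "always predict clean" baseline is my own illustration and is not something the post computes.

```python
# Approximate counts quoted in the post.
total_urls = 400_000
malicious_urls = 80_000

clean_urls = total_urls - malicious_urls

# Share of the data set that is malicious.
malicious_share = malicious_urls / total_urls

# Accuracy of a trivial model that labels every URL as clean.
# Illustrative baseline only; any real classifier should beat this.
baseline_accuracy = clean_urls / total_urls

print(f"clean URLs: {clean_urls}")
print(f"malicious share: {malicious_share:.0%}")
print(f"always-clean baseline accuracy: {baseline_accuracy:.0%}")
```

Because the classes are imbalanced, accuracy alone can look better than it is, so it is worth keeping this baseline in mind when reading any accuracy figure.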

Analysis

We’ll be using Logistic Regression since it is fast. The first step was tokenizing the URLs. I
wrote my own tokenizer function for this, since URLs are not like ordinary document text.
Some of the tokens we get are ‘virus’, ‘exe’, ‘php’, ‘wp’, ‘dat’, etc.
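
The tokenizer itself is not shown in this part of the post, so the following is only a minimal sketch of what such a function might look like. The name `tokenize_url`, the rule of splitting on every non-alphanumeric character, and the `COMMON_TOKENS` stop list are all assumptions made for illustration, as is the example URL.

```python
import re

# Hypothetical stop list: tokens that appear in almost every URL
# and so are assumed to carry little signal.
COMMON_TOKENS = {"http", "https", "www", "com"}


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens, dropping common ones."""
    # Split on any run of characters that is not a letter or a digit,
    # which covers '/', '.', '-', ':', '?', '=' and similar separators.
    parts = re.split(r"[^a-z0-9]+", url.lower())
    # re.split can yield empty strings at the edges; filter them out
    # along with the stop-list tokens.
    return [p for p in parts if p and p not in COMMON_TOKENS]


# Made-up URL, used only to show the kind of tokens that come out.
print(tokenize_url("http://example.com/wp-content/virus.exe"))
# prints ['example', 'wp', 'content', 'virus', 'exe']
```

A function like this could then be handed to a text vectorizer so that each URL becomes a bag of tokens for the classifier; the exact splitting rules are a design choice and may well differ from the ones used for the results in this post.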
