Available online at www.sciencedirect.com

ScienceDirect
AASRI Procedia 5 (2013) 220 – 227

2013 AASRI Conference on Parallel and Distributed Computing Systems

A Reputation-based Collaborative Approach for Spam Filtering

Wenxuan Shi (a), Maoqiang Xie (b)
(a) College of Software, Nankai University, Tianjin 300071, China
(b) College of Software, Nankai University, Tianjin 300071, China

Abstract

Spam and spam filters are opposing components of a complex, interdependent social ecosystem. Traditional spam
filtering techniques and systems are usually designed and deployed individually, neglecting the distributed and bulk
characteristics of spam. This paper proposes a reputation-based collaborative anti-spam approach. By adopting a
fingerprinting technique and evaluating reporters' trust, the approach achieved better performance and robustness than
state-of-the-art filters in comparison experiments on several known email corpora.

2212-6716 © 2013 The Authors. Published by Elsevier B.V. Open access under CC BY-NC-ND license.
Selection and/or peer review under responsibility of American Applied Science Research Institute
doi:10.1016/j.aasri.2013.10.082

Keywords: Spam filtering; reputation evaluation; collaborative; fingerprinting

1. Introduction

People and data on the web interact frequently and closely, and this relationship
gives some lawbreakers the opportunity to send all kinds of spam, including spam email,
spam posts, spam invitations, spam advertisements, etc. Depending on the characteristics of the spam, there
are many definitions of spam in practice, such as unsolicited and unwanted email, indiscriminate bulk
email sent directly or indirectly, unsolicited commercial email, etc. In addition to wasting the time recipients
spend dealing with it, spam consumes a lot of network bandwidth. Moreover, spam occupies vast
computation and storage resources, and it is profoundly annoying to clients and email service providers (ESPs).
Consequently, many anti-spam techniques and systems, known as spam filters, have appeared
alongside spam. This situation is called the spam ecosystem: spam and spam filters are
components of a complex interdependent system of social and technical structures. The anti-spam techniques
can be categorized into two types: techniques based on expert knowledge (EK) and
techniques based on machine learning (ML). EK-based techniques include rule-based filtering,
such as whitelists, blacklists, challenge-response, enhanced protocols, etc. Many spam filters are
based on EK, such as Sender Policy Framework (SPF) [1], Sender-ID [2], and DomainKeys Identified Mail
(DKIM) [3]. ML-based techniques include probability-based filtering, linear classifiers,
the Rocchio method, the nearest neighbor method, logic-based methods, data compression models, etc.
Most recent anti-spam systems are individual spam filters, designed and deployed separately
for particular clients and ESPs. Considering the distributed and bulk characteristics of spam, a better
anti-spam scheme should depend on multi-client or multi-ESP collaboration, with
individuals sharing their judgments of legitimate email and spam. This paper proposes a reputation-based
collaborative anti-spam approach that adopts a fingerprinting technique and evaluates reporters' trust; it
outperformed the state of the art in comparison experiments on several known email corpora.

2. Related work

In the spam ecosystem, spam filters should pay close attention to two issues: how to protect recipients'
privacy and information safety, and how to exploit the distributed and bulk characteristics of spam.

2.1. Fingerprinting

Fingerprints and fingerprint recognition are concepts that originated in biometric identity
authentication. In information retrieval, a fingerprint is defined as a short tag for a large
object. Fingerprinting has the advantage that it can identify identical or near-duplicate
documents with small partial variations by calculating and matching their hash values.
In the spam ecosystem, spam wastes massive network resources and people's time,
for example through mass data transmission on the Internet and mass storage on servers. Spam also
seriously damages web information safety and personal privacy. The fingerprinting
scheme is derived from the traditional research domains of data encryption and digital signatures. It
applies a hash algorithm to generate short content digests for large original
messages, which take the place of the original information in storage and transmission. Fingerprint functions
may be seen as high-performance hash functions, and there are two known kinds of algorithms: Rabin's
algorithm and cryptographic hash functions [4].
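
As a minimal sketch of the cryptographic-hash variant, the Python snippet below maps message content of any size to a short fixed-length tag. The choice of SHA-256 and the whitespace/case normalization step are illustrative assumptions, not the algorithm used in this paper.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a short fixed-length tag (hex digest) for arbitrarily large content."""
    return hashlib.sha256(content).hexdigest()


def normalized_fingerprint(text: str) -> str:
    """Fingerprint text after collapsing case and whitespace.

    The normalization is a hypothetical step so that trivially varied
    copies of the same message map to the same tag.
    """
    canonical = " ".join(text.lower().split())
    return fingerprint(canonical.encode("utf-8"))


a = normalized_fingerprint("Buy   CHEAP watches\nnow!")
b = normalized_fingerprint("buy cheap watches now!")
c = normalized_fingerprint("Meeting moved to 3 pm")
print(a == b, a == c)
```

Only the digest needs to be stored or transmitted, so the original message content never has to leave the recipient's machine.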

2.2. Collaborative spam filtering

Collaborative spam filtering is a more efficient strategy for content filtering: rather than employing
dedicated people or computers to attract and analyze spam, or having each user train his or her own
individual filter, the whole collaborative community works together with shared spam knowledge. Therefore
a collaborative spam filter needs an efficient shared database that stores different users' judgments
about which emails are spam and which are not. At present there are several collaborative spam filters on the web, such as
DCC, Vipul's Razor, Pyzor, Cloudmark, etc. [5, 6, 7]. These methods share a similar strategy of using shared
knowledge. The DCC (Distributed Checksum Clearinghouse) system detects spam by computing checksums of messages
and querying a database of checksums for matches. Vipul's Razor filters out
known spam by maintaining a catalogue server of signatures fed back by spam receivers. Pyzor and
Cloudmark use different protocols for spam filtering, similar to Vipul's Razor.
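
The shared-knowledge idea common to these systems can be sketched as a checksum store that several users report to and query. The class below is a hypothetical in-memory stand-in; its name, its methods and the rule "known spam once two distinct reporters agree" are assumptions for illustration and do not reproduce the actual DCC, Razor or Pyzor protocols.

```python
import hashlib
from collections import defaultdict


class SharedChecksumStore:
    """In-memory stand-in for a shared database of spam checksums (hypothetical)."""

    def __init__(self, spam_threshold: int = 2):
        # Number of distinct reporters needed before a checksum counts as spam.
        self.spam_threshold = spam_threshold
        self._reporters = defaultdict(set)  # checksum -> set of reporter ids

    @staticmethod
    def checksum(message: str) -> str:
        return hashlib.sha256(message.encode("utf-8")).hexdigest()

    def report_spam(self, reporter_id: str, message: str) -> None:
        """Record that reporter_id judged this message to be spam."""
        self._reporters[self.checksum(message)].add(reporter_id)

    def is_known_spam(self, message: str) -> bool:
        """Query whether enough distinct reporters flagged the same checksum."""
        reporters = self._reporters.get(self.checksum(message), set())
        return len(reporters) >= self.spam_threshold


store = SharedChecksumStore(spam_threshold=2)
store.report_spam("alice", "Win a free prize")
store.report_spam("bob", "Win a free prize")
store.report_spam("alice", "Lunch on Friday?")
store.report_spam("alice", "Lunch on Friday?")  # duplicate report counts once
print(store.is_known_spam("Win a free prize"), store.is_known_spam("Lunch on Friday?"))
```

A store like this treats every reporter equally, which is the weakness that reputation evaluation is meant to address.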

3. Reputation Evaluation

This paper proposes a collaborative approach for spam filtering based on reputation evaluation with
weighted fingerprints and a shared fingerprints database, through which we can record, query, log, report
and amend shared fingerprints, as shown in Figure 1.

Fig. 1. Reputation evaluation with weighted fingerprints

3.1. Fingerprinting based on MIME-division

Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of email to
support a variety of content formats, such as text in multiple character sets, non-text attachments, and message
bodies with multiple parts. However, traditional spam filtering techniques commonly consider only the
plain text content of email and ignore MIME features, and some publicly
released email corpora contain only the text content extracted from the original email body. In our
work, the anti-spam system handles spam as follows:
1) Divide each incoming email into five subparts: email header, text/plain content of email body, text/html
content of email body, embedded resources and attachments.
2) Generate weighted fingerprints for each MIME subpart.
3) Compute an indicator score for each fingerprint set to indicate how spammy the fingerprint set is.
4) Calculate a compound weighted score based on the individual indicator scores.
5) Decide whether the incoming email is spam or non-spam by comparing the compound weighted
score with a predefined threshold value.
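
Step 1 of the procedure above can be sketched with Python's standard `email` package. The function name `divide_mime`, the dictionary keys and the rule that any non-text, non-attachment part counts as an embedded resource are assumptions made for this illustration, not the authors' implementation.

```python
from email import message_from_string


def divide_mime(raw: str) -> dict:
    """Split a raw email into the five subparts used for fingerprinting."""
    msg = message_from_string(raw)
    parts = {
        "header": "\n".join(f"{k}: {v}" for k, v in msg.items()),
        "text_plain": [],
        "text_html": [],
        "embedded": [],
        "attachments": [],
    }
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        payload = part.get_payload(decode=True) or b""
        if part.get_content_disposition() == "attachment":
            parts["attachments"].append(payload)
        elif part.get_content_type() == "text/plain":
            parts["text_plain"].append(payload)
        elif part.get_content_type() == "text/html":
            parts["text_html"].append(payload)
        else:
            parts["embedded"].append(payload)  # assumption: inline images etc.
    return parts


RAW = (
    "From: a@example.com\n"
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "plain body\n"
    "--XX\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>html body</p>\n"
    "--XX\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="f.bin"\n'
    "\n"
    "data\n"
    "--XX--\n"
)
parts = divide_mime(RAW)
print({name: len(value) for name, value in parts.items()})
```

Each of the five entries can then be fingerprinted separately, so a spammer who changes only one subpart still leaves the others matching.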
Denote an incoming email by m. After MIME-division, the email is turned into
a set of five subparts, each of which is assigned a weight by a weighting vector. Each subpart
can be expressed as a feature vector, from which a set of fingerprints is
generated. The compound weighted score of m is
calculated as follows:

(1)

where the indicator score of each fingerprint can be trained, maintained
and queried from a fingerprints database. In the fingerprints database, we store the information for every fingerprint
as a triplet (hash value, indicator score, life cycle), where
the hash value is a globally unique value calculated by the fingerprinting algorithm and the life cycle is that of the
generated fingerprint.
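
To make the idea of a compound weighted score concrete, the sketch below combines per-subpart indicator scores with a weighting vector. It assumes the compound score is a weighted sum of the mean indicator score of each subpart; this is one plausible reading, and the exact form of Eq. (1) may differ. The weights and scores are made-up values.

```python
from statistics import mean

SUBPARTS = ("header", "text_plain", "text_html", "embedded", "attachments")


def compound_score(indicator_scores: dict, weights: dict) -> float:
    """Weighted sum over subparts of the mean fingerprint indicator score.

    indicator_scores maps a subpart name to the indicator scores of its
    fingerprints; subparts without fingerprints contribute nothing.
    """
    total = 0.0
    for name in SUBPARTS:
        scores = indicator_scores.get(name, [])
        if scores:
            total += weights[name] * mean(scores)
    return total


# Hypothetical weighting vector and indicator scores for one email.
weights = {"header": 0.1, "text_plain": 0.3, "text_html": 0.3,
           "embedded": 0.1, "attachments": 0.2}
scores = {"header": [0.2], "text_plain": [0.9, 0.7]}
score = compound_score(scores, weights)
print(round(score, 2))  # 0.1 * 0.2 + 0.3 * 0.8 = 0.26
```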

3.2. Reputation evaluation

In our spam filtering system, we build a reputation evaluation method with weighted fingerprints and
a shared fingerprints database. The shared fingerprints database lets us record newly arrived
fingerprints, query stored fingerprints, log user operations, report search results and amend outdated
fingerprints. For a reporter of the collaborative spam filtering system, we calculate the reporter's reputation
value as follows:

(2)

where the current reputation value of the reporter is calculated from the last reputation value and the current
feedback result multiplied by a weighting factor. The reporter's feedback result for a given email m is
calculated as follows:

(3)

The weighting factor, which balances the reporter feedback result across different kinds of feedback roles, is:

(4)

where a pre-defined adjusting parameter is set for each feedback role, as defined by the source type
of the reporter.
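
The update described above (last reputation plus the current feedback result multiplied by a role-dependent weighting factor) can be sketched as follows. The role names, the parameter values, the +1/-1 feedback encoding and the clamping to [0, 1] are all assumptions for illustration; they are not taken from Eqs. (2) to (4).

```python
# Hypothetical adjusting parameters per reporter role (source type).
ROLE_WEIGHT = {"end_user": 0.05, "administrator": 0.10, "honeypot": 0.20}


def feedback_result(reported_spam: bool, final_spam: bool) -> int:
    """+1 if the report agrees with the final label of the email, -1 otherwise."""
    return 1 if reported_spam == final_spam else -1


def update_reputation(last: float, reported_spam: bool,
                      final_spam: bool, role: str) -> float:
    """Last reputation plus weighted feedback, clamped to the range [0, 1]."""
    updated = last + ROLE_WEIGHT[role] * feedback_result(reported_spam, final_spam)
    return min(1.0, max(0.0, updated))


r_good = update_reputation(0.5, True, True, "end_user")    # correct report
r_bad = update_reputation(0.5, True, False, "honeypot")    # wrong report
r_cap = update_reputation(0.95, True, True, "honeypot")    # capped at 1.0
print(r_good, r_bad, r_cap)
```

Under these assumptions a reporter who repeatedly mislabels email loses influence over time, which limits the damage a malicious reporter can do to the shared database.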

3.3. Indicator score calculation

To judge which email is spam and which is not on the basis of feedback, the reputation-based collaborative
anti-spam approach calculates an indicator score for each generated fingerprint after MIME-division and
weighting-assignment as follows:

(5)

where the indicator score of a fingerprint is the second element
of the triplet stored in the fingerprints database; the reputation value of a reporter can be configured
and adjusted over time based on the reporter roles defined above; and the probabilistic value of a feature is
calculated based on IDF (inverse document
frequency).
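
As a rough sketch of how reporter reputation and an IDF-based feature value could be combined into an indicator score, the code below scales the reputation-weighted share of spam reports by an IDF weight. Both the combination rule and the common log(N / df) form of IDF are assumptions; the exact form of Eq. (5) may differ.

```python
import math


def idf(num_documents: int, documents_with_feature: int) -> float:
    """Inverse document frequency in its common log(N / df) form."""
    return math.log(num_documents / documents_with_feature)


def indicator_score(reports, reputation, idf_weight: float) -> float:
    """Reputation-weighted share of spam reports, scaled by the IDF weight.

    reports is a list of (reporter_id, said_spam) pairs for one fingerprint;
    reputation maps reporter_id to that reporter's reputation value.
    """
    total = sum(reputation[r] for r, _ in reports)
    if total == 0:
        return 0.0
    spam_mass = sum(reputation[r] for r, said_spam in reports if said_spam)
    return idf_weight * spam_mass / total


reputation = {"a": 0.9, "b": 0.1}
reports = [("a", True), ("b", False)]
ind = indicator_score(reports, reputation, idf(100, 10))
print(ind)
```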

3.4. Spam detection

Spammers often control puppet PCs and engaged servers to send mass spam. The purpose of
spam filters is to recognize these machines and senders. Spam filtering is the automatic process of distinguishing
spam from non-spam among incoming emails according to specified criteria. After
fingerprinting based on MIME-division and indicator score calculation based on reputation evaluation, we
can build a diverter for spam and non-spam, as shown in Figure 2.

Fig. 2. Diverter for spam and non-spam


To predict the label of an incoming email and place it in the appropriate email folder, we
apply the following judgment:

(6)

where t is a pre-defined judgment threshold separating legitimate email from spam.
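
A minimal sketch of this threshold judgment is shown below. It assumes that a higher compound weighted score means a more spam-like email and that a score equal to t is treated as spam; the folder names are hypothetical.

```python
def classify(score: float, t: float) -> str:
    """Label an email by comparing its compound weighted score with threshold t."""
    return "spam" if score >= t else "non-spam"


def target_folder(score: float, t: float) -> str:
    """Route the email to a folder according to its predicted label."""
    return "Junk" if classify(score, t) == "spam" else "Inbox"


print(classify(0.7, 0.5), target_folder(0.2, 0.5))
```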

4. Experiments

To examine the performance of the approach proposed in this paper (Collaborative Anti-Spam Scheme, CAS3 for
short), we performed three groups of experiments on three different corpora.

4.1. Experiment evaluation measure

The three corpora used in the experiments are listed in Table 1. The Handwork
corpus was accumulated in our previous work and contains the whole email content, including attachments,
embedded resources, etc.

Table 1. Corpora used in comparison experiments

Corpus              Spam   Non-Spam   MIME

Ling-Spam [8]        481       2412   Subject, Text/Plain
Spam-Assassin [9]   1897       4150   Five Subparts
Handwork            4059        831   Five Subparts

In each group of experiments, we compare CAS3 with two known spam filters: SpamAssassin and Vipul's
Razor. For each comparison, we plot a Receiver Operating Characteristic (ROC) curve to show the difference in
filtering performance between these methods.
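
The points of such a curve can be obtained by sweeping the judgment threshold and counting both kinds of misclassification at each setting. The sketch below assumes each email has a numeric score where higher means more spam-like; the function name and the sample data are made up for illustration.

```python
def misclassification_curve(scores, labels, thresholds):
    """For each threshold, count misclassified non-spams and misclassified spams.

    scores: one score per email; labels: True for spam, False for non-spam.
    Returns a list of (threshold, misclassified_non_spams, misclassified_spams).
    """
    points = []
    for t in thresholds:
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        false_neg = sum(1 for s, y in zip(scores, labels) if s < t and y)
        points.append((t, false_pos, false_neg))
    return points


scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False]
curve = misclassification_curve(scores, labels, [0.2, 0.5, 0.85])
print(curve)
```

Lowering the threshold trades misclassified spams for misclassified non-spams, and a filter whose curve lies closer to the origin performs better.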

4.2. Experiment evaluation results

The first group of experiments was run on the Ling-Spam corpus; the comparison results are
shown in Figure 3.
[Figure 3 plot: Ling-Spam Corpus; axes: Misclassified Non-Spams (of 2412) and Misclassified Spams (of 481)]

Fig. 3. ROC curve graph on Ling-Spam corpus


The second group of experiments was run on the Spam-Assassin corpus; the comparison results
are shown in Figure 4.
[Figure 4 plot: Spam-Assassin Corpus; axes: Misclassified Non-Spams (of 4150) and Misclassified Spams (of 1897)]

Fig. 4. ROC curve graph on Spam-Assassin corpus

The third group of experiments was run on the Handwork corpus; the comparison results are
shown in Figure 5.
[Figure 5 plot: Handwork Corpus; axes: Misclassified Non-Spams (of 831) and Misclassified Spams (of 4059)]

Fig. 5. ROC curve graph on Handwork corpus

5. Conclusions

In this paper, we proposed a reputation-based collaborative approach for spam filtering that uses the
MIME features of email and applies a fingerprinting scheme to the different subparts of an email. Our
approach achieved better performance and robustness than current popular filtering methods on several email
corpora.

Corresponding Author:

Wenxuan Shi, shiwx@nankai.edu.cn, 086-13920561100

References

[1] M. Wong and W. Schlitt. 2006. Sender Policy Framework (SPF) for Authorizing Use of Domains in E-mail,
RFC 4408.
[2] J. Lyon and M. Wong. 2006. Sender-ID: Authenticating E-mail RFC 4406, Internet Engineering Task
Force.
[3] B. Leiba and J. Fenton. 2007. DomainKeys Identified Mail (DKIM): Using digital signatures for domain
verification. In CEAS 2007: The Third Conference on Email and Anti-Spam.
[4] wikipedia.org. 2012. Fingerprint (computing). http://en.wikipedia.org/wiki/Fingerprint_(computing).
Andrew G. West, Avantika Agrawal, et al. 2011. Autonomous link spam detection in purely collaborative
environments. In WikiSym 2011: The 7th International Symposium on Wikis and Open Collaboration.
Network Security, pp. 15–17.
[5] Wenxuan Shi, Maoqiang Xie, et al. 2011. Collaborative Spam Filtering Technique Based on MIME
Fingerprints. In WCICA 2011: The 9th World Congress on Intelligent Control and Automation. Washington,
DC, USA: IEEE Computer Society.
[6] Wenxuan Shi, Maoqiang Xie, et al. 2011. Cooperative Anti-Spam System Based on Multilayer Agents. In
WWW 2011: The 20th International World Wide Web Conference. NY, USA: ACM, pp. 415-420.
[7] Stason.org. 2006. Anti-SPAM Techniques: Collaborative Content Filtering.
http://stason.org/articles/technology/email/junk-mail/collaborative_content_filtering.html.
[8] I. Androutsopoulos, J. Koutsias, K.V. Chandrinos, George Paliouras, and C.D. Spyropoulos. 2000. An
Evaluation of Naive Bayesian Anti-Spam Filtering. in Proceedings of the Workshop on Machine Learning in
the New Information Age, 11th European Conference on Machine Learning, Barcelona, Spain, pp. 9-17.
[9] Spamassassin.org. 2003. The Spamassassin Public Mail Corpus.
http://spamassassin.apache.org/publiccorpus.
