On feature extraction for spam e-mail detection

Günal, Serkan; Ergin, Semih; Gülmezoğlu, M. Bilginer; Gerek, Ömer Nezih

Access

info:eu-repo/semantics/closedAccess

Date

2006

Abstract

Electronic mail is an important communication method for most computer users. Spam e-mails, however, consume bandwidth, fill up server storage, and waste the time of the users who must deal with them. The usual way to label an e-mail as spam or non-spam is to define a finite set of discriminative features and apply a classifier to them. In most cases, the choice of these features is verified only empirically. In this paper, two different methods are proposed to select the most discriminative features from a set of reasonably arbitrary features for spam e-mail detection. The selection methods are developed using the Common Vector Approach (CVA), a subspace-based pattern classifier. Experimental results indicate that the proposed feature selection methods considerably reduce the number of features without affecting recognition rates.
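
For readers unfamiliar with CVA, the following is a minimal sketch of the classifier in its insufficient-data form (no more training samples per class than features), as it is commonly described: each class is summarised by a "common vector", which is what remains of any training sample once the directions along which the class samples differ are removed. It does not reproduce the two feature selection methods proposed in the paper, which the abstract does not detail. The feature vectors, the labels and all function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def sub(u: Sequence[float], v: Sequence[float]) -> Vector:
    return [a - b for a, b in zip(u, v)]


def scale(u: Sequence[float], c: float) -> Vector:
    return [a * c for a in u]


def difference_basis(samples: Sequence[Vector], tol: float = 1e-10) -> List[Vector]:
    """Orthonormal basis (Gram-Schmidt) of the span of a_i - a_1."""
    ref = samples[0]
    basis: List[Vector] = []
    for s in samples[1:]:
        v = sub(s, ref)
        for b in basis:
            v = sub(v, scale(b, dot(v, b)))
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip differences that add no new direction
            basis.append(scale(v, 1.0 / norm))
    return basis


def remove_projection(x: Sequence[float], basis: Sequence[Vector]) -> Vector:
    """Component of x orthogonal to the difference subspace."""
    r = list(x)
    for b in basis:
        r = sub(r, scale(b, dot(r, b)))
    return r


def fit_class(samples: Sequence[Vector]) -> Tuple[List[Vector], Vector]:
    """Return the difference-subspace basis and the common vector of one class."""
    basis = difference_basis(samples)
    common = remove_projection(samples[0], basis)  # same for every sample
    return basis, common


def classify(x: Sequence[float], models: Dict[str, Tuple[List[Vector], Vector]]) -> str:
    """Assign x to the class whose common vector is nearest to x's remainder."""
    best_label, best_dist = "", float("inf")
    for label, (basis, common) in models.items():
        residual = sub(remove_projection(x, basis), common)
        dist = dot(residual, residual)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical 4-feature vectors (e.g. counts of selected tokens per e-mail).
training = {
    "spam": [[5.0, 3.0, 0.0, 1.0], [4.0, 4.0, 0.0, 1.0]],
    "non-spam": [[0.0, 1.0, 3.0, 2.0], [0.0, 1.0, 4.0, 1.0]],
}
models = {label: fit_class(samples) for label, samples in training.items()}

for message in ([6.0, 2.0, 0.0, 1.0], [0.0, 2.0, 3.0, 1.0]):
    print(message, "->", classify(message, models))
```

In this form every training sample of a class yields the same common vector, so the decision depends only on how far a new e-mail lies from that vector once the within-class variation has been projected out.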

Source

Multimedia Content Representation, Classification and Security

Volume

4105

URI

https://hdl.handle.net/11421/20608
https://dx.doi.org/10.1007/11848035_84

Collections

Conference Papers Collection [355]
Scopus Indexed Publications Collection [8325]
WoS Indexed Publications Collection [7605]