On feature extraction for spam e-mail detection
Abstract
Electronic mail is an important communication method for most computer users. Spam e-mails, however, consume bandwidth, fill up server storage, and waste the time of those who must deal with them. The usual way to label an e-mail as spam or non-spam is to define a finite set of discriminative features and pass them to a classifier. In most cases, the selection of such features is verified empirically. In this paper, two different methods are proposed to select the most discriminative features from a set of reasonably arbitrary features for spam e-mail detection. The selection methods are developed using the Common Vector Approach (CVA), a subspace-based pattern classifier. Experimental results indicate that the proposed feature selection methods considerably reduce the number of features without affecting recognition rates.
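To make the classifier mentioned in the abstract more concrete, the following is a minimal sketch of a CVA-style classifier as it is commonly described for the case where each class has fewer training vectors than feature dimensions. It is not the authors' implementation, and it does not reproduce the two feature selection methods, which the abstract does not detail. The function names, the four-dimensional feature vectors, and the toy values are all hypothetical and chosen only for illustration.

```python
# Sketch of a Common Vector Approach (CVA) classifier, insufficient-data case.
# Assumption: every name and every number below is illustrative only.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis

def remove_projection(x, basis):
    """Subtract from x its projection onto the span of an orthonormal basis."""
    r = list(x)
    for b in basis:
        c = dot(x, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def fit_class(samples):
    """Return (difference-subspace basis, common vector) for one class."""
    diffs = [sub(s, samples[0]) for s in samples[1:]]
    basis = orthonormal_basis(diffs)
    # The common vector is what remains of any training vector once its
    # component in the difference subspace has been removed.
    common = remove_projection(samples[0], basis)
    return basis, common

def classify(x, models):
    """Pick the class whose common vector is nearest to x after removing
    the component of x that lies in that class's difference subspace."""
    best_label, best_dist = None, float("inf")
    for label, (basis, common) in models.items():
        residual = sub(remove_projection(x, basis), common)
        dist = dot(residual, residual)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical 4-dimensional feature vectors (for example, keyword counts).
training = {
    "spam": [[5.0, 1.0, 0.0, 2.0], [6.0, 1.0, 0.0, 3.0]],
    "ham": [[0.0, 3.0, 4.0, 1.0], [1.0, 3.0, 5.0, 1.0]],
}
models = {label: fit_class(samples) for label, samples in training.items()}
print(classify([5.5, 1.0, 0.0, 2.5], models))
```

In this sketch each class is summarized by a single common vector, so a feature that contributes little to separating the common vectors is a natural candidate for removal; how the paper actually ranks features is not stated in the abstract.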
Source
Multimedia Content Representation, Classification and Security
Volume
4105
Collections
- Conference Paper Collection [355]
- Scopus-Indexed Publications Collection [8325]
- WoS-Indexed Publications Collection [7605]