dc.contributor.author | Agun, Hayri Volkan | |
dc.contributor.author | Yılmazel, Özgür | |
dc.date.accessioned | 2019-10-21T19:44:08Z | |
dc.date.available | 2019-10-21T19:44:08Z | |
dc.date.issued | 2019 | |
dc.identifier.issn | 0165-5515 | |
dc.identifier.issn | 1741-6485 | |
dc.identifier.uri | https://dx.doi.org/10.1177/0165551519863350 | |
dc.identifier.uri | https://hdl.handle.net/11421/19815 | |
dc.description | WOS: 000476683200001 | en_US |
dc.description.abstract | Domain, genre and topic influences on author style adversely affect the performance of authorship attribution (AA) in multi-genre and multi-domain data sets. Although recent approaches to AA tasks focus on suggesting new feature sets and sampling techniques to improve the robustness of a classification system, they do not incorporate domain-specific properties to reduce the negative impact of irrelevant features on AA. This study presents a novel scaling approach, namely bucketed common vector scaling, that efficiently reduces negative domain influence without reducing the dimensionality of existing features; the approach is therefore easily transferable and applicable in a classification system. Classification results on English-language competition data sets consisting of emails and articles, and on Turkish-language web documents consisting of blogs, articles and tweets, indicate that our approach is highly competitive with top-performing approaches on the English competition data sets and significantly improves the top classification performance in mixed-domain experiments on blogs, articles and tweets. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | SAGE Publications LTD | en_US |
dc.relation.isversionof | 10.1177/0165551519863350 | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Authorship Attribution | en_US |
dc.subject | Common Vector Approach | en_US |
dc.subject | Domain Scaling | en_US |
dc.subject | Text Classification | en_US |
dc.title | Bucketed common vector scaling for authorship attribution in heterogeneous web collections: A scaling approach for authorship attribution | en_US |
dc.type | article | en_US |
dc.relation.journal | Journal of Information Science | en_US |
dc.contributor.department | Anadolu University, Faculty of Engineering, Department of Computer Engineering | en_US |
dc.relation.publicationcategory | Article - International Peer-Reviewed Journal - Institutional Faculty Member | en_US |
dc.contributor.institutionauthor | Yılmazel, Özgür | |