Website Category Classification Using Fine-Tuned BERT Language Model

dc.contributor.author Demirkıran, Ferhat
dc.contributor.author Dağ, Hasan
dc.contributor.author Çayır, Aykut
dc.contributor.author Ünal, Uğur
dc.contributor.other Management Information Systems
dc.date.accessioned 2020-12-17T18:36:21Z
dc.date.available 2020-12-17T18:36:21Z
dc.date.issued 2020
dc.department Faculties, Faculty of Management, Department of Management Information Systems en_US
dc.description.abstract Content on the World Wide Web expands every second, giving web users a wealth of material. However, some of this content is harmful or misleading and may do users more harm than good. Harmful content can be text, audio, video, or images depicting violence, adult material, or other harmful information. Young people in particular may be readily affected psychologically by such content. To protect youth from it, various web filtering techniques are used, such as keyword filtering, Uniform Resource Locator (URL) based filtering, intelligent analysis, and semantic analysis. We propose an algorithm that classifies websites, which may contain adult content, into 32 unique categories with 67.81% accuracy using BERT. We also show that a BERT model gives higher accuracy than both the Sequential and Functional API models when used for text classification. en_US
dc.identifier.citationcount 3
dc.identifier.doi 10.1109/UBMK50275.2020.9219384 en_US
dc.identifier.endpage 336 en_US
dc.identifier.isbn 978-172817565-2 en_US
dc.identifier.scopus 2-s2.0-85095717414 en_US
dc.identifier.startpage 333 en_US
dc.identifier.uri https://hdl.handle.net/20.500.12469/3562
dc.identifier.uri https://doi.org/10.1109/UBMK50275.2020.9219384
dc.identifier.wos WOS:000629055500065 en_US
dc.institutionauthor Demirkıran, Ferhat en_US
dc.institutionauthor Çayır, Aykut en_US
dc.institutionauthor Ünal, Uğur en_US
dc.institutionauthor Dağ, Hasan en_US
dc.language.iso en en_US
dc.publisher Institute of Electrical and Electronics Engineers Inc. en_US
dc.relation.journal 5th International Conference on Computer Science and Engineering, UBMK 2020 en_US
dc.relation.publicationcategory Conference Item - International - Institutional Faculty Member en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.scopus.citedbyCount 9
dc.subject BERT en_US
dc.subject Functional API en_US
dc.subject Sequential API en_US
dc.subject Text classification en_US
dc.subject Web filtering en_US
dc.title Website Category Classification Using Fine-Tuned BERT Language Model en_US
dc.type Conference Object en_US
dc.wos.citedbyCount 5
dspace.entity.type Publication
relation.isAuthorOfPublication e02bc683-b72e-4da4-a5db-ddebeb21e8e7
relation.isAuthorOfPublication 695a8adc-2330-4d32-ab37-8b781716d609
relation.isAuthorOfPublication.latestForDiscovery e02bc683-b72e-4da4-a5db-ddebeb21e8e7
relation.isOrgUnitOfPublication ff62e329-217b-4857-88f0-1dae00646b8c
relation.isOrgUnitOfPublication.latestForDiscovery ff62e329-217b-4857-88f0-1dae00646b8c
