Is cross-linguistic advert flaw detection in Wikipedia feasible? A multilingual-BERT-based transfer learning approach
Authors:
Highlights:
• Introduce transfer learning for cross-linguistic Wikipedia advert detection.
• English Wikipedia samples can be used to detect adverts in non-English Wikipedias.
• Multilingual BERT is a suitable encoder for cross-linguistic transfer learning.
• The proposed fine-tuning transfer performs best across different dataset scales.
Keywords: Wikipedia quality flaw, Cross-lingual transfer learning, Pretraining language model, Text classification
Review history: Received 18 January 2022, Revised 7 June 2022, Accepted 24 June 2022, Available online 30 June 2022, Version of Record 8 July 2022.
Official paper URL: https://doi.org/10.1016/j.knosys.2022.109330