2024
Journal article  Open Access

Forging the Forger: an attempt to improve authorship verification via data augmentation

Corbara S., Moreo Fernandez A.

Authorship verification  Authorship identification  Data augmentation  Text classification  Machine Learning (cs.LG)  Computation and Language (cs.CL)  Artificial Intelligence (cs.AI)  FOS: Computer and information sciences  Electrical engineering. Electronics. Nuclear engineering  TK1-9971

Authorship Verification (AV) is a text classification task concerned with inferring whether a candidate text has been written by one specific author (A) or by someone else (Ā). It has been shown that many AV systems are vulnerable to adversarial attacks, where a malicious author actively tries to fool the classifier by either concealing their writing style, or by imitating the style of another author. In this paper, we investigate the potential benefits of augmenting the classifier training set with (negative) synthetic examples. These synthetic examples are generated to imitate the style of A. We analyze the improvements in the classifier predictions that this augmentation brings to the task of AV in an adversarial setting. In particular, we experiment with three different generator architectures (one based on Recurrent Neural Networks, another based on small-scale transformers, and another based on the popular GPT model) and with two training strategies (one inspired by standard Language Models, and another inspired by Wasserstein Generative Adversarial Networks). We evaluate our hypothesis on five datasets (three of which have been specifically collected to represent an adversarial setting) and using two learning algorithms for the AV classifier (Support Vector Machines and Convolutional Neural Networks). This experimentation yields negative results, revealing that, although our methodology proves effective in many adversarial settings, its benefits are too sporadic for practical application.
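The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a character-level Markov chain stands in for the neural generators (RNN, transformer, GPT) used in the paper, and every name below (`train_char_markov`, `forge`, `augment`) is hypothetical. The only point it shows is that texts generated to imitate A are added to the training set with the negative label.

```python
import random
from collections import defaultdict


def train_char_markov(texts, order=3):
    """Map every `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for text in texts:
        for i in range(len(text) - order):
            model[text[i:i + order]].append(text[i + order])
    return model


def forge(model, seed, length, order, rng):
    """Sample a synthetic text from the model, starting from `seed`."""
    out = seed
    while len(out) < length:
        options = model.get(out[-order:])
        if not options:
            break  # context never seen in A's texts: stop early
        out += rng.choice(options)
    return out


def augment(texts_by_a, texts_by_others, n_forgeries, rng, order=3, length=80):
    """Return (texts, labels); forgeries of A are labelled 0 (not by A)."""
    # Assumption: a Markov chain replaces the paper's neural generators.
    model = train_char_markov(texts_by_a, order)
    forgeries = []
    for _ in range(n_forgeries):
        seed = rng.choice(texts_by_a)[:order]
        forgeries.append(forge(model, seed, length, order, rng))
    texts = texts_by_a + texts_by_others + forgeries
    labels = [1] * len(texts_by_a) + [0] * (len(texts_by_others) + len(forgeries))
    return texts, labels


if __name__ == "__main__":
    rng = random.Random(0)
    by_a = ["the quick brown fox jumps over the lazy dog"]
    by_others = ["lorem ipsum dolor sit amet"]
    texts, labels = augment(by_a, by_others, n_forgeries=2, rng=rng)
    for text, label in zip(texts, labels):
        print(label, repr(text))
```

The augmented `texts` and `labels` would then be passed to whichever AV classifier is being trained (the paper uses Support Vector Machines and Convolutional Neural Networks); that step is omitted here.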

Source: IEEE ACCESS, vol. 12, pp. 171911-171925



BibTeX entry
@article{oai:iris.cnr.it:20.500.14243/552086,
	title = {Forging the Forger: an attempt to improve authorship verification via data augmentation},
	author = {Corbara S. and Moreo Fernandez A.},
	doi = {10.1109/access.2024.3481161},
	eprint = {2403.11265},
	archiveprefix = {arXiv},
	journal = {IEEE Access},
	volume = {12},
	pages = {171911--171925},
	year = {2024}
}

SoBigData-PlusPlus
SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics


OpenAIRE