Constructing a Gold Corpus of Annotated Youtube Comments for Discursive Strategies Span Classification

Open Access | Jul 2025

References

  1. Al-Khatib, K., Wachsmuth, H., Kiesel, J. et al., 2016. A News Editorial Corpus for Mining Argumentation Strategies. In Y. Matsumoto & R. Prasad (Eds.), Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, pp. 3433–3443.
  2. Bou-Franch, P., Lorenzo-Dus, N., & Blitvich, P. G., 2012. Social Interaction in YouTube Text-Based Polylogues: A Study of Coherence. Journal of Computer-Mediated Communication, 17, 501–521.
  3. Card, D., Boydstun, A. E., Gross, J. H. et al., 2015. The Media Frames Corpus: Annotations of Frames Across Issues. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers), Beijing, China, pp. 438–444.
  4. Ceci, L., 2021a. Distribution of video comments removed from YouTube worldwide Q3 2021, by reason. Edited by Google. Statista, available at: < https://www.statista.com/statistics/1133165/share-removed-youtube-video-comments-worldwide-by-reason/>.
  5. Ceci, L., 2021b. YouTube - Statistics & Facts. Statista, available at: < https://www.statista.com/topics/2019/youtube/#dossierKeyfigures>.
  6. Da San Martino, G., Yu, S., Barrón-Cedeño, A. et al., 2019. Fine-Grained Analysis of Propaganda in News Articles. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, China, pp. 5636–5646.
  7. Da San Martino, G., Barrón-Cedeño, A., Wachsmuth, H. et al., 2020. SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. In Proceedings of the 14th International Workshop on Semantic Evaluation, Barcelona, Spain, pp. 1377–1414.
  8. Del Vicario, M., Bessi, A., Zollo, F. et al., 2016. The spreading of misinformation online. Proceedings of the National Academy of Sciences of the United States of America, 113(3), 554–559.
  9. Demata, M., Heaney, D., & Herring, S. C. (Eds.)., 2018. Language and Discourse in Social Media: New Challenges, New Approaches. Altre Modernità: Università degli Studi di Milano.
  10. Deutsches Institut für Menschenrechte (DIM)., 2017. Gutachten: Geschlechtervielfalt im Recht. Status quo und Entwicklung von Regelungsmodellen zur Anerkennung und zum Schutz von Geschlechtervielfalt. In Bundesministerium für Familie, Frauen, Senioren und Jugend (Eds.), Begleitmaterial zur Interministeriellen Arbeitsgruppe Inter- und Transsexualität – Band 8, available at: < https://www.bmfsfj.de/resource/blob/114066/8a02a557eab695bf7179ff2e92d0ab28/imag-band-8-geschlechtervielfalt-im-rechtdata.pdf>.
  11. Devlin, J., Chang, M.-W., Lee, K. et al., 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT 2019, Minneapolis, Minnesota, USA, pp. 4171–4186.
  12. Dynel, M., 2014. Participation framework underlying YouTube interaction. Journal of Pragmatics, 73, 37–52.
  13. Eger, S., Daxenberger, J., & Gurevych, I., 2017. Neural End-to-End Learning for Computational Argumentation Mining. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, pp. 11–22.
  14. European Union Agency for Fundamental Rights (FRA)., 2015. Being Trans in the EU. Comparative analysis of the EU LGBT survey data. Summary. European Union Agency for Fundamental Rights (FRA)., 2020. EU-LGBTI II: A long way to go for LGBTI equality. Luxembourg.
  15. Europäische Kommission, Generaldirektion Justiz (EK)., 2011. Trans- und intersexuelle Menschen. Diskriminierung von trans- und intersexuellen Menschen aufgrund des Geschlechts, der Geschlechtsidentität und des Geschlechtsausdrucks. Amt für amtliche Veröffentlichungen der Europäischen Gemeinschaften, available at: < https://op.europa.eu/o/opportal-service/download-handler?identifier=9b338479-c1b5-4d88-a1f8a248a19466f1&format=pdf&language=de&productionSystem=cellar&part=>
  16. Fairclough, N., 1989. Language and power. Longman Group.
  17. Fairclough, N., 1992. Discourse and Social Change. Polity Press.
  18. Fleiss, J. L., 1971. Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378–382.
  19. Franzen, J., & Sauer, A., 2010. Benachteiligung von Trans*Personen, insbesondere im Arbeitsleben. Antidiskriminierungsstelle des Bundes, available at: < https://www.antidiskriminierungsstelle.de/SharedDocs/downloads/DE/publikationen/Expertisen/expertise_benachteiligung_von_trans_personen.pdf?__blob=publicationFile&v=3>.
  20. Herring, S. C., 2004. Computer-Mediated Discourse Analysis: An Approach to Researching Online Behavior. In S. A. Barab, R. Kling, & J. H. Gray (Eds.), Designing for Virtual Communities in the Service of Learning, Cambridge University Press, pp. 338–376.
  21. Herring, S. C., & Androutsopoulos, J., 2015. Computer-mediated discourse 2.0. In D. Tannen, H. E. Hamilton, & D. Schiffrin (Eds.), The handbook of discourse analysis (2nd ed.), John Wiley & Sons, pp. 127–151.
  22. Herring, S. C., & Stoerger, S., 2014. Gender and (A)nonymity in Computer-Mediated Communication. In J. Holmes, M. Meyerhoff, & S. Ehrlich (Eds.), Handbook of Language, Gender, and Sexuality (2nd ed.), John Wiley & Sons, pp. 567–586.
  23. Jeong, A. C., 2003. The Sequential Analysis of Group Interaction and Critical Thinking in Online Threaded Discussions. The American Journal of Distance Education, 17(1), 25–43.
  24. Jurkiewicz, D., Borchmann, Ł., Kosmala, I. et al., 2020. ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether Self-Training Helps Them. In Proceedings of the 14th International Workshop on Semantic Evaluation, Barcelona, Spain (Online), pp. 1415–1424.
  25. Krippendorff, K., 2004. Content Analysis. An Introduction to Its Methodology (2nd ed.). Sage.
  26. Li, T., Lin, L., Choi, M. et al., 2018. YouTube AV 50K: An Annotated Corpus for Comments in Autonomous Vehicles. In iSAI-NLP 2018 Proceedings, IEEE, pp. 1–5.
  27. Liu, Y., Ott, M., Goyal, N. et al., 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach, available at: < https://arxiv.org/pdf/1907.11692.pdf>.
  28. Macgilchrist, F., 2007. Positive Discourse Analysis: Contesting Dominant Discourses by Reframing the Issues. Critical Approaches to Discourse Analysis Across Disciplines, 1(1), 74–94.
  29. Madden, A., Ruthven, I., & McMenemy, D., 2013. A classification scheme for content analyses of YouTube video comments. Journal of Documentation, 69(5), 693–714.
  30. Mathet, Y., Widlöcher, A., & Métivier, J.-P., 2015. The Unified and Holistic Method Gamma (γ) for Inter-Annotator Agreement Measure and Alignment. Computational Linguistics, 41(3), 437–479.
  31. Maurya, P., Jafari, O., Thatte et al., 2022. Building a comprehensive NER model for Satellite Domain. SN Computer Science, 3(3), 199.
  32. Mochales, R., & Moens, M.-F., 2011. Argumentation mining. Artificial Intelligence and Law, 19, 1–22.
  33. Morio, G., Morishita, T., Ozaki, H. et al., 2020. Hitachi at SemEval-2020 Task 11: An Empirical Study of Pre-Trained Transformer Family for Propaganda Detection. In Proceedings of the 14th International Workshop on Semantic Evaluation, Barcelona, Spain (Online), pp. 1739–1748.
  34. Naim, J., Hossain, T., Tasneem, F. et al., 2022. Leveraging fusion of sequence tagging models for toxic spans detection. Neurocomputing, 500, 688–702.
  35. Park, J., Katiyar, A., & Yang, B., 2015. Conditional Random Fields for Identifying Appropriate Types of Support for Propositions in Online User Comments. In C. Cardie (Ed.), Proceedings of the 2nd Workshop on Argumentation Mining. 2nd Workshop on Argumentation Mining, Denver, Association for Computational Linguistics, pp. 39–44.
  36. Peldszus, A., 2017. Automatic recognition of argumentation structure in short monological texts. PhD thesis, Institutional Repository of the University of Potsdam, Potsdam, Germany.
  37. Persing, I., & Ng, V., 2016. End-to-End Argumentation Mining in Student Essays. In Proceedings of NAACL-HLT 2016, San Diego, California, pp. 1384–1394.
  38. Reisigl, M., & Wodak, R., 2009. The Discourse-Historical Approach (DHA). In R. Wodak & M. Meyer (Eds.), Methods of Critical Discourse Analysis, Sage, pp. 87–121.
  39. Rushton, A., Gray, L., Canty, J. et al., 2019. Review. Beyond Binary: (Re)Defining “Gender” for 21st Century Disaster Risk Reduction Research, Policy, and Practice. International Journal of Environmental Research and Public Health, 16(3984), 1–14.
  40. Schilt, K., & Westbrook, L., 2009. Doing Gender, Doing Heteronormativity. “Gender Normals,” Transgender People, and the Social Maintenance of Heterosexuality. Gender & Society, 23(4), 440–464.
  41. Schultes, P., Dorner, V., & Lehner, F., 2013. Leave a Comment! An In-Depth Analysis of User Comments on YouTube. In R. Alt & B. Franczyk (Eds.), Proceedings of the 11th International Conference on Wirtschaftsinformatik (WI2013), pp. 659–674.
  42. Stab, C., & Gurevych, I., 2017. Parsing Argumentation Structures in Persuasive Essays. Computational Linguistics, 43(3), 619–660.
  43. Stab, C., Miller, T., Schiller, B. et al., 2018. Cross-topic Argument Mining from Heterogeneous Sources. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium.
  44. Taylor, E., 2010. Cisgender Privilege: On the Privileges of Performing Normative Gender. In K. Bornstein & S. B. Bergmann (Eds.), Gender outlaws: The next generation, Seal Press, pp. 268–272.
  45. Thelwall, M., Sud, P., & Vis, F., 2012. Commenting on YouTube Videos: From Guatemalan Rock to El Big Bang. Journal of the American Society for Information Science and Technology, 63(3), 616–629.
  46. van Dijk, T. A., 1992. Discourse and the denial of racism. Discourse & Society, 3(1), 87–118.
  47. van Dijk, T. A., 1993a. Analyzing Racism Through Discourse Analysis. Some Methodological Reflections. In J. H. Stanfield & R. M. Dennis (Eds.), Race and Ethnicity in Research Methods, Sage, pp. 92–134.
  48. van Dijk, T. A., 1993b. Principles of critical discourse analysis. Discourse & Society, 4(2), 249–283.
  49. van Dijk, T. A., 1995. Discourse, power and access. In C. R. Caldas-Coulthard & M. Coulthard (Eds.), Texts and Practices. Readings in Critical Discourse Analysis, Routledge, pp. 84–104.
  50. van Dijk, T. A., 2001. Critical Discourse Analysis. In D. Schiffrin, D. Tannen, & H. E. Hamilton (Eds.), The Handbook of Discourse Analysis, Blackwell, pp. 352–371.
  51. van Dijk, T. A., 2011. Discourse, knowledge, power and politics. Towards critical epistemic discourse analysis. In C. Hart (Ed.), Critical Discourse Studies in Context and Cognition, John Benjamins, pp. 27–63.
  52. van Dijk, T. A., 2012. A note on epistemics and discourse analysis. British Journal of Social Psychology, 51, 478–485.
  53. van Leeuwen, T., 2008. Discourse and Practice. New Tools for Critical Discourse Analysis. Oxford University Press.
  54. Wilce, J. M., 2009. Language and emotion. Cambridge University Press.
  55. Wodak, R., 2001. What CDA is about – a summary of its history, important concepts and its developments. In R. Wodak & M. Meyer (Eds.), Methods of Critical Discourse Analysis, Sage, pp. 1–13.
  56. Worthen, M. G. F., 2016. Hetero-cis-normativity and the gendering of transphobia. International Journal of Transgenderism, 17(1), 31–57.
  57. Worthen, M. G. F., 2021. Why Can’t You Just Pick One? The Stigmatization of Non-binary/Genderqueer People by Cis and Trans Men and Women: An Empirical Test of Norm-Centered Stigma Theory. Sex Roles, 85, 343–356.
  58. YouTube. (n.d.). YouTube by the Numbers, available at: < https://blog.youtube/press/>.
  59. Zhang, A. X., Culbertson, B., & Paritosh, P., 2017. Characterizing Online Discussion Using Coarse Discourse Sequences. In Proceedings of the International AAAI Conference on Web and Social Media. International AAAI Conference on Web and Social Media, Montréal, pp. 357–366.
  60. Ziegele, M., 2016. Nutzerkommentare als Anschlusskommunikation. Theorie und qualitative Analyse des Diskussionswertes von Online-Nachrichten. Springer VS.
  61. Zhu, X., Cao, J., Tang, D. et al., 2023. Text as Image: Learning Transferable Adapter for Multi-Label Classification. arXiv preprint arXiv:2312.04160.
Language: English
Page range: 1 - 16
Submitted on: Oct 11, 2024
Accepted on: May 8, 2024
Published on: Jul 11, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 3 issues per year

© 2025 Linda Feld, Lidiia Wegert-Melnyk, published by Palacký University Olomouc
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.