Abstract
Benchmarking for literary analysis is complicated by a persistent mismatch between the fixed context windows of classification models and the emergent properties of literary forms. Here, I approach this challenge by reconsidering semantic parallelism in Chinese regulated verse (lüshi 律詩) as a problem of scale. I first employ a “teacher” model to label parallelism at the couplet (meso) level and then test which “student” model architecture—micro (character), meso (couplet), or macro (poem)—most effectively recovers this labeling rule. The experiment points to a Goldilocks hypothesis: performance is maximized when the classifier is structurally aligned with the scale at which the feature has been encoded. This finding yields further practical insights: (1) bottom-up aggregation of local predictions sacrifices raw performance but offers greater interpretability by exposing the specific decisions of a misaligned model; (2) top-down inference requires additional training compute to compensate for global noise and achieve performance comparable to that of aligned models; (3) if the goal is to better understand how artificial intelligence represents specific literary phenomena internally (“vector poetics”), aligned classifiers afford the most direct and promising access. By examining different forms of (mis)alignment between texts and models, the study invites discussion of whether meaningful benchmarking requires matching the computational “unit of analysis” with the humanistic “unit of inquiry.”
