
A Topic Detection Method Based on Word-attention Networks

By: Zheng Xie
Open Access | Aug 2021

Abstract

Purpose

We propose a method that represents a scientific paper as a complex network, combining approaches from neural networks and complex networks.

Design/methodology/approach

Its novelty lies in representing a paper by word branches, which carry the sequential structure of words in sentences. The branches are generated by the attention mechanism of deep learning models. We connect these branches at the positions of their common words to form networks, called word-attention networks, and then detect their communities, which are defined as topics.
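As a rough illustration of this construction (not the authors' implementation), the sketch below assumes the word branches have already been extracted by an attention mechanism and shows only how joining branches at their shared words yields a network whose communities can be read as topics. The example branches and the community-detection algorithm are illustrative assumptions.

```python
# Minimal sketch: build a word-attention-style network from word branches
# and treat its communities as topics. Branch extraction via attention is
# assumed to have happened already; the branches below are made up.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

branches = [
    ["protein", "folding", "simulation"],
    ["molecular", "simulation", "method"],
    ["protein", "structure", "prediction"],
]

# Link consecutive words within each branch; branches that share a word
# are automatically connected at that word's node.
G = nx.Graph()
for branch in branches:
    for u, v in zip(branch, branch[1:]):
        G.add_edge(u, v)

# Community detection (greedy modularity here is a generic stand-in,
# not necessarily the method used in the paper); each community is a topic.
topics = [set(c) for c in greedy_modularity_communities(G)]
print(topics)
```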

Findings

The detected topics carry the sequential structure of words in sentences, represent the intra- and inter-sentential dependencies among words, and reveal the roles that words play in them through network indexes.
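To make "network indexes" concrete, here is a hedged example of how word roles within a detected topic could be scored with standard centrality measures; degree and betweenness centrality are generic stand-ins, not necessarily the indexes used in the paper.

```python
import networkx as nx

def word_roles(G: nx.Graph, topic_words: set) -> dict:
    """Score the words of one detected topic by simple centrality indexes
    computed on the topic's induced subgraph (illustrative choice only)."""
    sub = G.subgraph(topic_words)
    degree = nx.degree_centrality(sub)
    betweenness = nx.betweenness_centrality(sub)
    return {w: {"degree": degree[w], "betweenness": betweenness[w]}
            for w in sub.nodes}
```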

Research limitations

The parameter settings of our method may depend on the data at hand, so human experience is needed to find proper settings.

Practical implications

Our method is applied to papers from PNAS, where the discipline designations provided by the authors are used as the gold-standard labels of the papers' topics.

Originality/value

This empirical study shows that the proposed method outperforms Latent Dirichlet Allocation and is more stable.

DOI: https://doi.org/10.2478/jdis-2021-0032 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 139 - 163
Submitted on: Jun 19, 2021
Accepted on: Jul 23, 2021
Published on: Aug 18, 2021
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2021 Zheng Xie, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.