Coreference Resolution for Anaphoric Pronouns in Texts on Medical Products

Jerzy Krawczuk; Mariusz Ferenc

doi:10.2478/slgr-2018-0050


Studies in Logic, Grammar and Rhetoric

Volume 56 (2018): Issue 1 (December 2018)

By: Jerzy Krawczuk and Mariusz Ferenc

Open Access

Mar 2019

Abstract

Coreference resolution is the task of finding all expressions in a text that refer to the same entity. It is one of the higher-level NLP (Natural Language Processing) tasks. It makes it possible, for example, to extract more information about medical products from larger texts. A product such as ‘ambidextrous gloves’ may appear in a text in many different forms. For example, they could be referred to by the pronoun ‘they’, as in this sentence. The algorithm presented in this paper finds pronouns and, for each of them (except the pleonastic ‘it’), creates coreference candidates by pairing it with entities that appeared earlier in the same sentence or in the previous sentence. Each candidate (a pair of mentions) is described by 48 binary features that represent its grammatical and location properties. In the training set, each pair is labeled as coreferent or not, and a decision tree classifier is trained on these labels. The resulting classifier achieved a high precision of 0.94 and a decent recall of 0.61 on the training set, and still a good out-of-sample precision of 0.64.
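The candidate-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Mention` class, the `is_pleonastic` flag (assumed to come from some upstream detector), and the two example features are hypothetical, since the abstract does not list the 48 features actually used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; field names are assumptions for illustration."""
    text: str
    sentence: int                # index of the sentence containing the mention
    position: int                # token index of the mention inside its sentence
    is_pleonastic: bool = False  # assumed to be set by an upstream detector


def candidate_pairs(pronouns: List[Mention],
                    entities: List[Mention]) -> List[Tuple[Mention, Mention]]:
    """Pair each non-pleonastic pronoun with entities that appear earlier in
    the same sentence or anywhere in the previous sentence."""
    pairs = []
    for pronoun in pronouns:
        # The pleonastic 'it' refers to nothing, so it yields no candidates.
        if pronoun.text.lower() == "it" and pronoun.is_pleonastic:
            continue
        for entity in entities:
            earlier_in_sentence = (entity.sentence == pronoun.sentence
                                   and entity.position < pronoun.position)
            in_previous_sentence = entity.sentence == pronoun.sentence - 1
            if earlier_in_sentence or in_previous_sentence:
                pairs.append((entity, pronoun))
    return pairs


def binary_features(entity: Mention, pronoun: Mention) -> List[int]:
    """Two illustrative binary features (hypothetical, not from the paper):
    a location property and a grammatical property of the pair."""
    same_sentence = int(entity.sentence == pronoun.sentence)
    plural_pronoun = int(pronoun.text.lower() in {"they", "them", "their"})
    return [same_sentence, plural_pronoun]


entities = [Mention("ambidextrous gloves", sentence=0, position=1),
            Mention("box", sentence=1, position=5)]
pronouns = [Mention("They", sentence=1, position=0),
            Mention("It", sentence=2, position=0, is_pleonastic=True)]

pairs = candidate_pairs(pronouns, entities)
vectors = [binary_features(entity, pronoun) for entity, pronoun in pairs]
```

In this toy input, only ‘ambidextrous gloves’ is paired with ‘They’: ‘box’ occurs later in the same sentence, and the pleonastic ‘It’ is skipped. Feature vectors built this way, together with a coreferent or not-coreferent label for each pair, are the kind of input a decision tree learner would be trained on.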


DOI: https://doi.org/10.2478/slgr-2018-0050 | Journal eISSN: 2199-6059 | Journal ISSN: 0860-150X

Language: English

Page range: 205 - 216

Published on: Mar 16, 2019

Published by: University of Białystok

In partnership with: Paradigm Publishing Services

Related subjects: Philosophy; Philosophy, other

© 2019 Jerzy Krawczuk, Mariusz Ferenc, published by University of Białystok
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
