Linking Datasets Using Semantic Textual Similarity
Open Access | Mar 2018

Abstract

Linked data has been widely recognized as an important paradigm for representing data, and one of the most important aspects of supporting its use is the discovery of links between datasets. Many datasets contain a significant amount of textual information in the form of labels, descriptions and documentation about their elements, and precise linking depends on applying semantic textual similarity to this text. However, most linking tools so far rely only on simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply these to linking existing datasets.

DOI: https://doi.org/10.2478/cait-2018-0010 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 109-123
Submitted on: Sep 29, 2017
Accepted on: Nov 30, 2017
Published on: Mar 30, 2018
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2018 John P. McCrae, Paul Buitelaar, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.