A Column Styled Composable Schema Matcher for Semantic Data-Types

Xiaofeng Liao; Jordy Bottelier; Zhiming Zhao

doi:10.5334/dsj-2019-025

Data Science Journal

Volume 18 (2019): Issue 1

Open Access

Jun 2019

Figures & Tables

Classifying a Column Using the Match Tree.

Table 1

Column mapping between CKAN and CERIF.

A	is_source_of_has_classification_has_term
B	is_destination_of_has_source_is_source_of_has_destination_has_URI
C	is_destination_of_has_source_is_source_of_has_destination_type
D	is_source_of_has_destination_type
E	is_destination_of_has_source_is_source_of_has_endDate
F	has_identifier_is_source_of_has_endDate
H	has_identifier_has_id_value
I	is_destination_of_has_source_has_identifier_has_URI
J	is_destination_of_has_source_has_identifier_type
K	is_destination_of_type
M	is_destination_of_has_source_is_source_of_has_destination_has_name
N	is_destination_of_has_source_type
O	is_destination_of_has_classification_type
P	has_identifier_has_URI
Q	is_source_of_has_classification_type
R	has_identifier_is_source_of_has_classification_type
S	is_destination_of_has_endDate
T	is_destination_of_has_startDate
V	is_destination_of_has_source_has_identifier_has_id_value
X	is_source_of_has_endDate
Y	has_identifier_is_source_of_has_startDate
a	is_destination_of_has_source_is_source_of_has_classification_type
b	is_destination_of_has_source_is_source_of_type
c	is_destination_of_has_source_is_source_of_has_startDate
d	has_identifier_is_source_of_type
e	is_source_of_has_startDate
has_description	has_description
has_identifier_label	has_identifier_label
has_identifier_type	has_identifier_type
has_name	has_name
is_source_of_type	is_source_of_type
label	label
type	type
unknown	unknown

Number of instances per class in the CERIF learn set.

Number of instances per class in the CKAN test set.

Inlier test on the CKAN-CERIF dataset. Accuracy was averaged over 5 tests with 31 classes. Number of simulated columns per class: 15.

Outlier test on the CKAN-CERIF dataset. Scores were averaged over 5 tests with 31 classes. Number of simulated columns per class: 15.

Finalized Fitted Pipeline for the CKAN-CERIF dataset.

Inlier test on the CKAN-CERIF dataset. Accuracy was averaged over 3 tests.

Outlier test on the CKAN-CERIF dataset. Scores were averaged over 3 tests.

DOI: https://doi.org/10.5334/dsj-2019-025 | Journal eISSN: 1683-1470

Language: English

Page range: 25 - 25

Submitted on: Feb 23, 2019

Accepted on: May 14, 2019

Published on: Jun 24, 2019

Published by: Ubiquity Press

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: Schema Matching, Semantic Data-types, XML, RDF

© 2019 Xiaofeng Liao, Jordy Bottelier, Zhiming Zhao, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.
