A Method for Extending Ontologies with Application to the Materials Science Domain

Open Access | Oct 2019

Figures & Tables

Figure 1

Example from NanoParticle Ontology.

Figure 2

Example from NanoParticle Ontology opened in Protégé.

Figure 3

Example from NanoParticle Ontology – OWL/XML Syntax Format.

Figure 4

Approach: The upper part of the figure shows the creation of a phrase-based topic model, which takes unstructured text as input and produces phrases and topics as output. The lower part shows the formal topical concept analysis, which takes topics as input and produces a topical concept lattice as output. In both parts, a domain expert validates and interprets the results.

Figure 5

Examples of (a) phrase occurrences in topics, (b) Formal Topical Concept Lattice and (c) Formal Topical Concepts.
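
The relationship between phrase occurrences, formal concepts and the lattice can be illustrated with a small sketch of standard formal concept analysis. The topic names, the phrase assignments and the choice of topics as objects and phrases as attributes are assumptions made for illustration only; they are not taken from the figure.

```python
from itertools import combinations

# Hypothetical toy incidence relation: which phrases occur in which topics.
occurrences = {
    "topic_1": {"thin film", "band gap"},
    "topic_2": {"thin film", "pore size"},
    "topic_3": {"thin film", "band gap", "pore size"},
}

def common_phrases(topics):
    """Phrases shared by every topic in `topics` (all phrases if `topics` is empty)."""
    if not topics:
        return set().union(*occurrences.values())
    return set.intersection(*(occurrences[t] for t in topics))

def topics_containing(phrases):
    """Topics in which every phrase in `phrases` occurs."""
    return {t for t, ps in occurrences.items() if phrases <= ps}

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of topics.

    Brute force over all subsets, so only suitable for tiny examples.
    """
    concepts = set()
    names = sorted(occurrences)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_phrases(set(subset))
            extent = topics_containing(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

For this toy relation the enumeration yields four concepts. Ordering them by inclusion of their extents gives the lattice, with the concept covering all topics at the top.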

Table 1

Result of interpreting phrases. The first column defines the case using the number of topics, low or high mining threshold, and ontology. The precision is truncated.

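Case | ADD | ADD-m | EXIST | EXIST-m | No-g | No | precision
20, low, NanoParticle | 32 | 4 | 26 | 19 | 16 | 9 | 0.91
20, low, eNanoMapper | 29 | 3 | 24 | 25 | 14 | 12 | 0.88
30, low, NanoParticle | 30 | 4 | 26 | 18 | 16 | 9 | 0.91
30, low, eNanoMapper | 28 | 3 | 24 | 26 | 12 | 11 | 0.89
40, low, NanoParticle | 32 | 4 | 26 | 15 | 16 | 10 | 0.90
40, low, eNanoMapper | 29 | 3 | 24 | 22 | 14 | 12 | 0.88
20, high, NanoParticle | 9 | 1 | 14 | 7 | 4 | 0 | 1.00
20, high, eNanoMapper | 8 | 2 | 12 | 10 | 3 | 0 | 1.00
30, high, NanoParticle | 8 | 2 | 14 | 8 | 0 | 1 | 0.96
30, high, eNanoMapper | 7 | 1 | 12 | 10 | 0 | 1 | 0.96
40, high, NanoParticle | 9 | 2 | 14 | 12 | 4 | 4 | 0.91
40, high, eNanoMapper | 9 | 2 | 12 | 14 | 2 | 4 | 0.90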

[i] For the meanings of ADD(-m), EXIST(-m) and No(-g), see Section 3.3.

For ADD and ADD-m, a new concept is defined in the ontology and one or more subsumption axioms are added.
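
The truncated precision values in Table 1 can be reproduced with a short calculation. The sketch below assumes that precision is the share of interpreted phrases that do not fall in the No category; this formula is inferred from the tabulated numbers rather than quoted from the paper, and the function name is hypothetical.

```python
def truncated_precision(add, add_m, exist, exist_m, no_g, no, digits=2):
    """Share of phrases outside the No category, truncated (not rounded).

    Assumption: every category except No counts towards the numerator.
    """
    accepted = add + add_m + exist + exist_m + no_g
    total = accepted + no
    if total == 0:
        raise ValueError("no interpreted phrases")
    scale = 10 ** digits
    # Integer floor division truncates without floating-point surprises.
    return (accepted * scale // total) / scale

# Case "20, low, NanoParticle": ADD=32, ADD-m=4, EXIST=26, EXIST-m=19, No-g=16, No=9
print(truncated_precision(32, 4, 26, 19, 16, 9))  # 0.91
```

Here 97 of 106 phrases fall outside the No category, which gives 0.915 and is reported as 0.91 after truncation.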

Table 2

The number (and truncated percentage in parentheses) of topics that contribute to extending the ontologies. The first column defines the case using the number of topics, low or high mining threshold, and ontology.

Case | Contribute to ADD and ADD-m | Contribute to EXIST and EXIST-m | Contribute to No-g
20, low, NanoParticle | 18 (90.0%) | 16 (80.0%) | 6 (30.0%)
20, low, eNanoMapper | 18 (90.0%) | 16 (80.0%) | 5 (40.0%)
20, high, NanoParticle | 11 (55.0%) | 13 (65.0%) | 3 (15.0%)
20, high, eNanoMapper | 11 (55.0%) | 13 (65.0%) | 2 (10.0%)
30, low, NanoParticle | 19 (63.0%) | 19 (63.0%) | 11 (36.6%)
30, low, eNanoMapper | 18 (60.0%) | 20 (66.6%) | 11 (36.6%)
30, high, NanoParticle | 10 (33.3%) | 19 (63.3%) | 3 (10.0%)
30, high, eNanoMapper | 9 (30.0%) | 20 (66.6%) | 2 (6.6%)
40, low, NanoParticle | 22 (55.0%) | 21 (52.5%) | 12 (30.0%)
40, low, eNanoMapper | 21 (52.5%) | 23 (57.5%) | 9 (22.5%)
40, high, NanoParticle | 13 (32.5%) | 16 (40.0%) | 4 (10.0%)
40, high, eNanoMapper | 12 (30.0%) | 18 (45.0%) | 3 (7.5%)

Table 3

Result of interpreting topics. The first column defines the case using the number of topics, low or high mining threshold, and ontology. Note that some topics may be empty and some topics may require several concepts. The values in parentheses show the number of added concepts that were not found in the phrase interpretation phase.

Case | ADD | ADD-m | EXIST | EXIST-m | No-g | Q | No | precision
20, low, both3(1)02011301.00
30, low, both8(2)04011301.00
40, low, both16(1)011121050.88
20, high, both8(1)0320701.00
30, high, both3(2)01020701.00
40, high, NanoParticle10(2)01032320.93
40, high, eNanoMapper10(2)0942320.93

[i] For the meanings of ADD(-m), EXIST(-m), No(-g) and Q, see Section 3.3.

For ADD and ADD-m, a new concept is defined in the ontology and one or more subsumption axioms are added.

Figure 6

Part of the lattice for the setting with 40 topics and a low mining threshold. Nodes that contain one topic/one phrase and have the bottom node as their child and the top node as their parent are not shown.

Table 4

Result of interpreting lattice nodes. The first column defines the case using the number of topics, low or high mining threshold, and ontology. The values in parentheses show the number of added concepts that were not found in the phrase or topic interpretation phases.

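Case | ADD | ADD-m | EXIST | EXIST-m | No-g | Q | No | precision
20, low, both | 1(0) | 0 | 1 | 0 | 2 | 0 | 0 | 1.00
30, low, NanoParticle | 4(2) | 0 | 3 | 0 | 1 | 0 | 0 | 1.00
30, low, eNanoMapper | 3(2) | 0 | 4 | 0 | 1 | 0 | 0 | 1.00
40, low, both | 3(0) | 0 | 1 | 0 | 0 | 0 | 0 | 1.00
20, high, both | 0(0) | 0 | 1 | 0 | 1 | 1 | 0 | 1.00
30, high, both | 1(1) | 0 | 1 | 0 | 0 | 0 | 0 | 1.00
40, high, both | 0(0) | 0 | 0 | 0 | 0 | 0 | 0 | 1.00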

[i] For the meanings of ADD(-m), EXIST(-m), No(-g) and Q, see Section 3.3.

For ADD a new concept is defined in the ontology and one or more subsumption axioms are added.

Table 5

New concepts for the NanoParticle and eNanoMapper ontologies.

Concepts | NanoParticle | eNanoMapper
amorphous silicon
band gap
Barium Titanate
block copolymer
copolymer
polymer
CdSe nanocrystal
CdTe nanoparticle
copper nanoparticle
conductivity
electrical
gold nanorod
growth mechanism
resolution
layer by layer growth
liquid solid
pressure
MCM 41
mechanical property
viscosity
melt spin
mesoporous silica nanoparticle
mesoporous silica nanosphere
microcrystalline silicon
optical property
polymorphous silicon
pore size
porous silicon
quantum confinement
reverse micelle-type quantum dot
semiconductor nanocrystal
nanocrystal
silicon thin film
thin film
crystallinity
thermal conductivity
tunnel spectroscopy
ZnO nanowire
Total | 35 | 32

Table 6

New axioms for the NanoParticle and eNanoMapper ontologies.

Axioms | NanoParticle | eNanoMapper
amorphous silicon is a silicon
band gap is a quality
Barium Titanate is an inorganic compound or molecule
Barium Titanate is a chemical substance
block copolymer is a copolymer
copolymer is a polymer
polymer is an organic material
CdSe nanocrystal is a nanocrystal
CdTe nanoparticle is a nanoparticle
copper nanoparticle is a metal nanoparticle
conductivity is an independent general individual quality
conductivity is a quality
electrical conductivity is a conductivity
gold nanorod is a nanorod
growth mechanism is a process
resolution is an independent general individual quality
resolution is a quality
layer by layer growth is a mechanism process
liquid solid is a liquid solid interface
pressure is an independent general individual quality
MCM 41 is a mesoporous silica nanoparticle
mechanical property is a realizable entity
mechanical property is a quality
viscosity is a mechanical property
melt spin is a technique
mesoporous silica nanoparticle is a nanoparticle
mesoporous silica nanosphere is a nanosphere
microcrystalline silicon is a silicon
microcrystalline silicon is a chemical substance
nanotube array has part nanotube
optical property is a property
polymorphous silicon is a silicon
polymorphous silicon is a chemical substance
pore size is a nanoparticle property
porous silicon is a silicon
porous silicon is a chemical substance
raman scatter is a synonym of raman spectroscopy
quantum confinement
reverse micelle-type quantum dot is a quantum dot
semiconductor nanocrystal is a semiconductor and is a nanocrystal
nanocrystal is a nano-object and is a crystal
silicon thin film is a thin film
thin film is a fiat material part and one-dimensional nano-object
crystallinity is an independent general individual quality
crystallinity is a quality
transition metal is a synonym of transition element
thermal conductivity is a conductivity
tunnel spectroscopy is a spectroscopy
scanning tunneling spectroscopy is same as tunnel spectroscopy
chemical vapor deposition is a vapor deposition
physical vapor deposition is a vapor deposition
ZnO nanowire is a nanowire
Total | 42 | 37

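
Simple "X is a Y" statements such as those in Table 6 correspond to subsumption axioms, which could be written in OWL 2 functional-style syntax roughly as sketched below. The helper names, the CamelCase class names and the empty `:` prefix are hypothetical; the actual ontologies identify classes by their own IRIs. Statements about synonyms, parts or several parents need other axiom types and are deliberately rejected.

```python
import re

def to_class_name(label):
    """Turn a label such as 'amorphous silicon' into 'AmorphousSilicon' (hypothetical naming)."""
    words = re.split(r"[^0-9A-Za-z]+", label)
    return "".join(w[:1].upper() + w[1:] for w in words if w)

def subclass_axiom(statement, prefix=":"):
    """Render a single-parent 'X is a Y' statement as a SubClassOf axiom."""
    match = re.fullmatch(r"(.+?) is an? (?!synonym of )(.+)", statement.strip())
    if match is None:
        raise ValueError(f"not a simple subsumption statement: {statement!r}")
    child_label, parent_label = match.groups()
    if " and " in parent_label:
        # Conjunctions such as 'is a A and is a B' are outside this sketch.
        raise ValueError(f"more than one parent in: {statement!r}")
    child = to_class_name(child_label)
    parent = to_class_name(parent_label)
    return f"SubClassOf({prefix}{child} {prefix}{parent})"

print(subclass_axiom("amorphous silicon is a silicon"))
# SubClassOf(:AmorphousSilicon :Silicon)
```

A statement with two parents could be handled by emitting one SubClassOf axiom per parent, and a synonym would more naturally become an annotation on the existing class.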
Table 7

Performance of ontology learning systems in different domains (Wong et al. 2012). (Precision is truncated).

System | Domain | Precision
ASIUM | French journal Le Monde | 0.86
CRCTOL | Patterns of Global Terrorism | 0.92
OntoGain | Computer Science corpus | 0.86
OntoGain | Medical corpus | 0.89
OntoLearn | Tourism | 0.85
Text2Onto | Text from the paper (Navigli & Velardi 2004) | 0.61
Text2Onto | Patterns of Global Terrorism | 0.74

Table 8

The results of Text2Onto with different algorithms and different number of returned candidates. (Precision is truncated).

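# of elements | Algorithm | ADD | ADD-m | EXIST | EXIST-m | No-g | No | precision
100 | Entropy | 5 | 0 | 39 | 19 | 4 | 33 | 0.67
100 | C-value/NC-value | 5 | 0 | 39 | 19 | 4 | 33 | 0.67
100 | Relative term frequency | 5 | 0 | 39 | 20 | 4 | 32 | 0.68
100 | TF-IDF | 17 | 0 | 22 | 12 | 6 | 43 | 0.57
200 | Entropy | 7 | 1 | 63 | 43 | 8 | 79 | 0.60
200 | C-value/NC-value | 7 | 1 | 63 | 43 | 7 | 79 | 0.60
200 | Relative term frequency | 7 | 1 | 63 | 42 | 8 | 79 | 0.60
200 | TF-IDF | 24 | 1 | 38 | 19 | 19 | 99 | 0.50
300 | Entropy | 12 | 1 | 80 | 52 | 16 | 139 | 0.53
300 | C-value/NC-value | 12 | 1 | 80 | 52 | 16 | 139 | 0.53
300 | Relative term frequency | 13 | 1 | 78 | 52 | 16 | 140 | 0.53
300 | TF-IDF | 28 | 1 | 58 | 36 | 29 | 148 | 0.50
400 | Entropy | 18 | 1 | 98 | 62 | 20 | 199 | 0.50
400 | C-value/NC-value | 18 | 1 | 98 | 62 | 20 | 199 | 0.50
400 | Relative term frequency | 19 | 1 | 100 | 61 | 20 | 199 | 0.50
400 | TF-IDF | 36 | 1 | 70 | 44 | 38 | 211 | 0.47
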
Table 9

Results for Text2Onto using all algorithms per setting and our method for extending NanoParticle Ontology. (Precision is truncated).

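System | ADD | ADD-m | EXIST | EXIST-m | No-g | No | precision
Text2Onto-100 | 20 | 0 | 51 | 27 | 11 | 71 | 0.60
Text2Onto-200 | 29 | 1 | 84 | 55 | 26 | 164 | 0.54
Text2Onto-300 | 39 | 1 | 118 | 78 | 44 | 266 | 0.51
Text2Onto-400 | 41 | 1 | 120 | 73 | 47 | 313 | 0.47
Our Method | 32 | 3 | 25 | 18 | 14 | 22 | 0.80
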
Table 10

New concepts found by our method and Text2Onto for the NanoParticle Ontology.

Concepts | Our method | Text2Onto
acid group
activation energy
amorphous silicon
band gap
Barium Titanate
Barium Titanate nanowire
block copolymer
boron nanowire
catalyst
cluster
copolymer
crystallite
crystallinity
CdSe nanocrystal
CdTe nanoparticle
copper nanoparticle
conductivity
diblock copolymer
electrical conductivity
esterification
ethylene oxide
gold nanorod
growth mechanism
intensity
resolution
layer by layer growth
liquid solid
pressure
MCM 41
mechanical property
melting
melt spin
mesoporous silica nanoparticle
mesoporous silica nanosphere
microcrystalline silicon
nano colloid
nano composite
nanocrystal
nano crystalline silicon particle
nanogrid
nano ribbon
nanotube array
nanowire array
oxidation
photo activity
polyelectrolyte
polymorphous silicon
pore size
porous silicon
pressureP
quantum confinement
reverse micelle-type quantum dot
semiconductor nanocrystal
silicon thin film
silica nanosphere
silicon nanowire
silicon nanowire array
superlattice nanowire
thin film
titanium nanotube
thermal conductivity
tunnel spectroscopy
ZnO nanowire
Total | 35 | 42

Language: English
Submitted on: Jun 22, 2019
Accepted on: Sep 23, 2019
Published on: Oct 3, 2019
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2019 Huanyu Li, Rickard Armiento, Patrick Lambrix, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.