Extracting and Measuring Uncertain Biomedical Knowledge from Scientific Statements

By: Xin Guo, Yuming Chen, Jian Du and Erdan Dong
Open Access | Apr 2022

Figures & Tables

Figure 1

Research framework for extracting and measuring computable biomedical knowledge.

Figure 2

The evolution of structured biomedical knowledge in cardiovascular research in China.

Figure 3

Network visualization of co-occurrence of SO pairs.

Figure 4

Distribution of Unknown/Hedging/Conflicting cue words in different parts of scientific statements.

Figure 5

Trends in IE of SO pairs by number of supporting sentences.

Examples of SPO triples extracted from scientific statements

| # | PMID | Year | Sentence | Subject | Predicate | Object |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 11748351 | 2001 | 95232561. The high level of low-density lipoprotein (LDL) is a risk factor for cardiovascular disease. | Low-Density Lipoproteins (Chem. & Drugs) | PREDISPOSES (others) | Cardiovascular Diseases (Disorders) |
| 2 | 16541193 | 2006 | 62140. Our finding suggests that the CYP2C9*3 gene variant significantly alters the plasma concentration and acute DBP response at the 6-h point following irbesartan treatment in Chinese hypertensive patients | Hypertensive (Disorders) | PROCESS_OF (functionally_related_to) | Patients (Living Beings) |
| | | | | Therapeutic procedure (Procedures) | USES (functionally_related_to) | irbesartan (Chem. & Drugs) |
| | | | | irbesartan (Chem. & Drugs) | TREATS (functionally_related_to) | Patients (Living Beings) |
| | | | | irbesartan (Chem. & Drugs) | TREATS (functionally_related_to) | Hypertensive (Disorders) |
| 3 | 27733220 | 2016 | 7853995. Subgroup analysis for each outcome measure was performed for the observing time point after the transplantation of MSCs. | Stem cells (Anatomy) | PART_OF (physically_related_to) | Marrow (Anatomy) |
| 4 | 32421381 | 2020 | 345332630. Our data seem to suggest that COVID-19 is probably an additional risk factor for DVT in hospitalized patients. | COVID-19 (Disorders) | PREDISPOSES (others) | Deep Vein Thrombosis (Physiology) |
| 5 | 32493073 | 2020 | 345298871. COVID-19 presented with deep vein thrombosis: an unusual presentation. | COVID-19 (Disorders) | COEXISTS_WITH (others) | Deep Vein Thrombosis (Physiology) |
| 6 | 32351121 | 2020 | 345367243. In this brief review, we will elaborate on the role of RAS and ACE2 in pathogenesis of COVID-19. | Angiotensin converting enzyme 2 (Chem. & Drugs) | CAUSES (functionally_related_to) | COVID-19 (Disorders) |

Frequencies of the uncertain cue words

Values in parentheses are rates per 10,000 sentences (‱).

| Cue words | Frequency in all SemMedDB sentences | Frequency in our Triples Store sentences |
| --- | --- | --- |
| Unknown lexicon | | |
| uncertain* | 227,014 (10.57‱) | 191 (12.87‱) |
| unknown | 525,536 (24.48‱) | 499 (33.62‱) |
| Hedging lexicon | | |
| maybe | 10,286 (0.48‱) | 142 (9.57‱) |
| may | 5,946,955 (276.96‱) | 10,050 (677.19‱) |
| might | 949,536 (44.22‱) | 2,639 (177.82‱) |
| possible | 1,751,994 (81.59‱) | 1,611 (108.55‱) |
| potential | 2,879,336 (134.10‱) | 2,675 (180.25‱) |
| seems | 333,677 (15.54‱) | 154 (10.38‱) |
| perhaps | 84,058 (3.91‱) | 32 (2.16‱) |
| likely | 1,052,986 (49.04‱) | 1,248 (84.09‱) |
| sometimes | 119,942 (5.59‱) | 26 (1.75‱) |
| Conflicting lexicon | | |
| conflict* | 175,516 (8.17‱) | 59 (3.98‱) |
| contradict* | 46,639 (2.17‱) | 10 (0.67‱) |
| controvers* | 208,264 (9.70‱) | 308 (20.75‱) |
| debat* | 122,332 (5.70‱) | 54 (3.64‱) |
| no consensus | 17,907 (0.83‱) | 9 (0.61‱) |
| questionable* | 21,159 (0.99‱) | 5 (0.34‱) |
| refut* | 9,710 (0.45‱) | 9 (0.61‱) |
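
The lexicon above lends itself to a simple matcher. The following is a minimal sketch, assuming cues are matched case-insensitively as whole words, that a trailing `*` stands for any word suffix, and that a sentence is counted at most once per lexicon. The function names are hypothetical, and the source does not state how the counts in the table were actually produced.

```python
import re
from collections import Counter

# Cue words copied from the table; a trailing * marks a stem (assumption:
# it matches any run of word characters after the stem).
LEXICONS = {
    "Unknown": ["uncertain*", "unknown"],
    "Hedging": ["maybe", "may", "might", "possible", "potential", "seems",
                "perhaps", "likely", "sometimes"],
    "Conflicting": ["conflict*", "contradict*", "controvers*", "debat*",
                    "no consensus", "questionable*", "refut*"],
}


def cue_pattern(cue):
    """Compile one cue into a case-insensitive whole-word regex."""
    if cue.endswith("*"):
        body = re.escape(cue[:-1]) + r"\w*"
    else:
        body = re.escape(cue)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


PATTERNS = {name: [cue_pattern(c) for c in cues]
            for name, cues in LEXICONS.items()}


def lexicons_in(sentence):
    """Return the names of all lexicons with at least one cue in the sentence."""
    return {name for name, patterns in PATTERNS.items()
            if any(p.search(sentence) for p in patterns)}


def count_sentences(sentences):
    """Count, per lexicon, the sentences containing at least one of its cues."""
    counts = Counter()
    for sentence in sentences:
        counts.update(lexicons_in(sentence))
    return counts
```

Because of the word boundaries, `may` does not fire on `maybe` or `mayor`, so the two hedging cues are counted separately, as they are in the table.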

Examples of uncertain sentences and triples from different parts of scientific statements

| Statement Location | Supporting Sentences | SPO Triples |
| --- | --- | --- |
| Premise (Background) | Liver X receptors (LXRs) play a central role in atherosclerosis; however, LXR activity of organic pollutants and associated potential risk of atherosclerosis have not yet been characterized. | liver X receptor_AFFECTS_Atherosclerosis |
| | Lipoprotein-associated phospholipase A2 (Lp-PLA2) is considered to be a risk factor for acute coronary syndrome (ACS), but this remains controversial. | Phospholipase A2_PREDISPOSES_Acute coronary syndrome |
| Hypothesis (Objective) | We therefore performed a case-control study investigating the possible relation between ACE gene polymorphisms and MVPS in Taiwan Chinese. | Mitral Valve Prolapse_ASSOCIATED_WITH_gene polymorphism |
| | Given the uncertainty regarding the relationship of C-reactive protein (CRP) and homocysteine (Hcy) to atherosclerotic burden, our aim was to determine whether CRP and Hcy are related to the presence of subclinical coronary plaque and stenosis. | Stenosis_ASSOCIATED_WITH_C-reactive protein; Stenosis_ASSOCIATED_WITH_homocysteine |
| Evidence (Results) | Ischemic heart disease was identified as the possible etiology of HF in a greater proportion of non-Chinese patients (47.7% vs. 35.3%; p < 0.001) whereas hypertension (26.1% vs. 16.1%; p < 0.001) and valvular heart disease (11.6% vs. 7.2%; p < 0.001) were relatively more common in Chinese patients. | Myocardial Ischemia_CAUSES_Heart failure |
| | Genetic polymorphisms of four genes, methylenetetrahydrofolate reductase (MTHFR) and apolipoprotein E (ApoE) have been demonstrated to associate with the increased risk for both MDD and stroke, while the association between identified polymorphisms in angiotensin-converting enzyme (ACE) and serum paraoxonase (PON1) with depression is still under debate, for the existing studies are insufficient in sample size. | Peptidyl-Dipeptidase A_PREDISPOSES_Cerebrovascular accident; Peptidyl-Dipeptidase A_PREDISPOSES_Major Depressive Disorder; Arylesterase_PREDISPOSES_Cerebrovascular accident; Arylesterase_PREDISPOSES_Major Depressive Disorder |
| Claims (Conclusions) | This study shows a significant association of hypertension susceptibility loci only in obese Chinese children, suggesting a likely influence of childhood obesity on the risk of hypertension. | Hypertensive disease_AFFECTS_Obesity |
| | Our data demonstrate that TrkB protects endothelial integrity during atherogenesis by promoting Ets1-mediated VE-cadherin expression and plays a previously unknown protective role in the development of CAD | ETS1 gene, ETS1_INTERACTS_WITH_cadherin 5 |
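
The triples in this table are written as `Subject_PREDICATE_Object` strings. As a minimal sketch, they can be split back into their three parts, assuming the predicate is the only run of upper-case tokens joined by underscores (as in `ASSOCIATED_WITH`). The `Triple` type and `parse_triple` function are hypothetical names introduced for illustration, not part of the authors' pipeline.

```python
import re
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumption: predicates consist of upper-case words joined by underscores,
# e.g. AFFECTS, NEG_AFFECTS, ASSOCIATED_WITH, INTERACTS_WITH.
_TRIPLE_RE = re.compile(r"^(?P<s>.+?)_(?P<p>[A-Z]+(?:_[A-Z]+)*)_(?P<o>.+)$")


def parse_triple(text: str) -> Optional[Triple]:
    """Split 'Subject_PREDICATE_Object' into a Triple, or return None."""
    match = _TRIPLE_RE.match(text.strip())
    if match is None:
        return None
    return Triple(match.group("s"), match.group("p"), match.group("o"))


print(parse_triple("Stenosis_ASSOCIATED_WITH_C-reactive protein"))
```

The pattern would misparse an object that begins with an upper-case word immediately followed by an underscore; none of the examples in the table has that shape.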

Top 10 SO pairs with the highest IE value

| # | Subject_Object Pair | Start year | End year | # Sentence | IE |
| --- | --- | --- | --- | --- | --- |
| 1 | Polymorphism, Genetic (Genetic Function / Physiology)_Coronary Arteriosclerosis (Disease or Syndrome / Disorders) | 2010 | 2019 | 9 | 1.837 |
| 2 | Fibrinogen (AAPP / Chem. & Drugs)_Ischemic stroke (Disease or Syndrome / Disorders) | 2006 | 2015 | 5 | 1.522 |
| 3 | Vascular Diseases (Disease or Syndrome / Disorders)_Human (Human / Living Beings) | 2003 | 2013 | 4 | 1.500 |
| 4 | Epinephrine (Hormone / Chem. & Drugs)_Cardiopulmonary Arrest (Pathologic Function / Disorders) | 2007 | 2007 | 4 | 1.500 |
| 5 | Ischemic stroke (Disease or Syndrome / Disorders)_Variation (Genetics) (NPOP / Phenomena) | 2008 | 2016 | 4 | 1.500 |
| 6 | Gene Expression (Genetic Function / Physiology)_Population Group (Human / Living Beings) | 2016 | 2018 | 4 | 1.500 |
| 7 | Reactive Oxygen Species (BACS / Chem. & Drugs)_Apoptosis (Cell Function / Physiology) | 2017 | 2020 | 4 | 1.500 |
| 8 | Basal Ganglia (BPOC / Anatomy)_Hematoma (Pathologic Function / Disorders) | 2012 | 2017 | 4 | 1.500 |
| 9 | HMG-CoA Reductase Inhibitors (Organic Chemical / Chem. & Drugs)_Acute coronary Syndrome (Disease or Syndrome / Disorders) | 2009 | 2019 | 6 | 1.459 |
| 10 | Hyperuricemia (Disease or Syndrome / Disorders)_Hypertensive Disease (Disease or Syndrome / Disorders) | 2012 | 2019 | 14 | 1.430 |
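
IE is not defined in this section. Assuming it denotes information entropy, computed as base-2 Shannon entropy over category labels assigned to a pair's supporting sentences, a minimal sketch looks like this. The function name and the label values are hypothetical, and the labelling scheme the authors used is not stated here.

```python
import math
from collections import Counter


def information_entropy(labels):
    """Base-2 Shannon entropy of the label distribution, in bits."""
    counts = Counter(labels)
    if len(counts) <= 1:
        # Zero or one distinct label carries no uncertainty.
        return 0.0
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical labels for four supporting sentences, split 2/1/1.
example = ["label_a", "label_a", "label_b", "label_c"]
print(round(information_entropy(example), 3))
```

Under this assumption, four sentences split 2/1/1 across three labels give 1.500, and five sentences split 2/2/1 give about 1.522. Both figures are consistent with values in the table, though that agreement does not confirm the exact definition.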

Sample sentences from SO pairs with high IE

Polymorphism, Genetic_Coronary Arteriosclerosis

| # | PMID | Year | Predicate | Sentence |
| --- | --- | --- | --- | --- |
| 1.1 | 20668462 | 2010 | AFFECTS | To clarify whether polymorphisms of the RAGE gene were related to CAD, we performed a case-control study in Chinese Han patients. |
| 1.2 | 22363637 | 2012 | AFFECTS | Our findings failed to demonstrate a correlation between (CAG)(n) polymorphism with CAD; however, we concluded that the rare 21bp deletion might have a more compelling effect on CAD than the common (CAG)(n) polymorphism, and MEF2A genetic variant might be a rare but specific cause of CAD/MI. |
| 1.3 | 22345093 | 2012 | NEG_AFFECTS | The three other polymorphisms of the RAS do not seem to influence the development of CAD in type 2 diabetes. |
| 1.4 | 23583798 | 2013 | AFFECTS | However, given the limited number of studies and the potential biases, the influence of this (myeloperoxidase (MPO) G463A) polymorphism on CAD risk needs further investigation. |
| 1.5 | 24155913 | 2013 | CAUSES | Our findings provided strong evidence for the potentially contributory roles of RAGE multiple genetic polymorphisms, especially in the context of locus-to-locus interaction, in the pathogenesis of CAD among northeastern Han Chinese. |
| 1.6 | 24239227 | 2014 | AFFECTS | Some polymorphisms in the fibroblast growth factor receptor 4 gene (FGFR-4) have been correlated with coronary artery disease, however, the role of polymorphisms in the FGFR-4 gene in ischemic stroke remain unknown. |
| 1.7 | 27323132 | 2016 | AFFECTS | There is growing evidence that polymorphisms in NOS3 influence the progression of CAD; however, there is also a controversy regarding the association of polymorphisms in the gene encoding NOS3 and CAD. |
| 1.8 | 30826813 | 2019 | AFFECTS | Studies have reported that inflammatory cytokine interleukin-8 (IL-8) gene −251 A/T (rs4073) polymorphism is correlated with CAD susceptibility, but the result remains controversial. |
| 1.9 | 31770200 | 2019 | AFFECTS | Thus, a meta-analysis was conducted to reassess the effects of this (interleukin-8 gene) polymorphism on CAD risks. |

DOI: https://doi.org/10.2478/jdis-2022-0008 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 6 - 30
Submitted on: Oct 25, 2021
Accepted on: Mar 5, 2022
Published on: Apr 25, 2022
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2022 Xin Guo, Yuming Chen, Jian Du, Erdan Dong, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution 4.0 License.