
Sentence, Phrase, and Triple Annotations to Build a Knowledge Graph of Natural Language Processing Contributions—A Trial Dataset

Open Access | May 2021

Figures & Tables

Figure 1

Structured Model information as part of the research contribution highlights of a scholarly article (Lample et al., 2016) in the NlpContributionGraph scheme.

Figure 2

Functional workflow of the annotation process to obtain the NlpContributionGraph data.

Figure 3

Illustration of annotation guideline 5 on forming triples without incorrect repetitions of the extracted phrases. This Results IU is modeled from the research paper by Wang et al. (2018). If the phrases “in terms of” and “F1 measure” were modeled by sentence word order, they would need to be repeated under both the “ACE datasets” and “GENIA dataset” scientific terms. To avoid this incorrect repetition, despite appearing at the end of the sentence, they are annotated at the top of the triples hierarchy.

Figure 4

Annotated data from the paper “Sentence similarity learning by lexical decomposition and composition” under the Results Information Unit by the NlpContributionGraph scheme.

Figure 5

An Open Research Knowledge Graph paper view. The NlpContributionGraph scheme is employed to model the ResearchProblem and the Results information units of the paper.

Figure 6

A Results graph branch traversal in the ORKG down to the last level.

Figure 7

A NlpContributionGraph scheme data integration use case in the Open Research Knowledge Graph digital library. An automatically generated survey over part of a knowledge graph of scholarly contributions from four articles, using the NlpContributionGraph scheme proposed in this work. This comparison was customized in the Open Research Knowledge Graph framework to focus only on the Results information unit (the comparison is accessible online at https://www.orkg.org/orkg/c/kM2tUq).

Figure 8

Illustration of a parent node named ‘character-level LSTM’ serving as a conceptual reference selected from the article's running text, as opposed to a section name. The figure is part of the contribution from the article by B. Wang et al. (2018). Essentially, where such encapsulation exists, coreference is applied for the child-node nesting (consider the coreference between ‘we incorporate a character-level LSTM to capture’ in sentence 1 and ‘this character-level component can also help’ in sentence 2).

Figure 9

Figures (a) and (b) depict the modeling of part of a Results information unit from a scholarly article (Ghaddar & Langlais, 2018) in the pilot and the adjudication stages, respectively.

Intra-Annotation Evaluation Results. The NlpContributionGraph scheme pilot-stage annotations evaluated against the adjudicated gold-standard annotations made on the trial dataset.

Tasks | Information Units (P / R / F1) | Sentences (P / R / F1) | Phrases (P / R / F1) | Triples (P / R / F1)
1 MT | 66.66 / 73.68 / 70.0 | 66.67 / 54.55 / 60.0 | 37.47 / 30.96 / 33.91 | 19.73 / 17.46 / 18.53
2 NER | 79.55 / 81.40 / 80.46 | 60.89 / 69.43 / 64.88 | 44.09 / 42.60 / 43.34 | 22.34 / 21.63 / 21.98
3 QA | 93.18 / 93.18 / 93.18 | 67.96 / 79.55 / 73.30 | 54.04 / 45.21 / 49.23 | 37.50 / 32.0 / 34.52
4 RC | 70.21 / 73.33 / 71.74 | 64.64 / 60.31 / 62.40 | 35.31 / 29.24 / 32.0 | 12.59 / 11.45 / 11.99
5 TC | 86.67 / 84.78 / 85.71 | 75.44 / 78.66 / 77.01 | 54.77 / 45.38 / 49.63 | 27.41 / 22.41 / 24.66
Cum. (micro) | 78.83 / 80.65 / 79.73 | 67.25 / 67.63 / 67.44 | 45.36 / 38.83 / 41.84 | 23.76 / 20.97 / 22.28
Cum. (macro) | 78.8 / 80.49 / 79.64 | 67.33 / 68.51 / 67.92 | 45.2 / 38.91 / 41.82 | 23.87 / 20.95 / 22.31
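
For readers checking the table, each F1 cell is the harmonic mean of the corresponding precision (P) and recall (R) cells; the cumulative rows aggregate over the five tasks, with the micro variant pooling counts and the macro variant averaging per-task scores (our reading of the row labels; the paper's exact pooling is not reproduced here). A minimal sketch of the per-cell relationship:

    # Minimal sketch: F1 as the harmonic mean of precision and recall.
    # The P and R values below are taken from the table above.

    def f1(p: float, r: float) -> float:
        """Harmonic mean of precision and recall (both in percent)."""
        return 2 * p * r / (p + r) if (p + r) else 0.0

    # Information Units column, task 2 (NER): P = 79.55, R = 81.40
    print(round(f1(79.55, 81.40), 2))  # 80.46, matching the F1 reported above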

Annotated corpus statistics for the 12 Information Units in the NlpContributionGraph scheme.

Information Unit | No. of triples | No. of papers | Ratio of triples to papers
Experiments | 168 | 3 | 56
Tasks | 277 | 8 | 34.63
ExperimentalSetup | 300 | 16 | 18.75
Model | 561 | 32 | 17.53
Hyperparameters | 254 | 15 | 16.93
Results | 688 | 42 | 16.38
Approach | 283 | 18 | 15.72
Baselines | 148 | 10 | 14.8
AblationAnalysis | 155 | 13 | 11.92
Dataset | 8 | 1 | 8
ResearchProblem | 169 | 50 | 3.38
Code | 9 | 9 | 1
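
The last column is derived arithmetic: the number of triples divided by the number of papers containing that information unit, rounded to two decimal places where needed. A small check of that arithmetic:

    # "Ratio of triples to papers" is simply triples / papers,
    # rounded to two decimal places where needed.

    def triples_per_paper(n_triples: int, n_papers: int) -> float:
        return round(n_triples / n_papers, 2)

    print(triples_per_paper(688, 42))  # 16.38 (Results row)
    print(triples_per_paper(561, 32))  # 17.53 (Model row)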

Two examples illustrating the three different granularities for NlpContributionGraph data instances (viz., a. sentences, b. phrases, and c. triples) modeled for the Results information unit from a scholarly article (Cho et al., 2014).

  • [1a. sentence 159] As expected, adding features computed by neural networks consistently improves the performance over the baseline performance.

  • [1b. phrases from sentence 159] {adding features, computed by, neural networks, improves the performance, over baseline performance}

  • [1c. triples from entities above] {(Contribution, has, Results), (Results, improves the performance, adding features), (adding features, computed by, neural networks), (Results, improves the performance, over baseline performance)}

  • [2a. sentence 160] The best performance was achieved when we used both CSLM and the phrase scores from the RNN Encoder – Decoder.

  • [2b. phrases from sentence 160] {best performance was achieved, used both CSLM and the phrase scores, from, RNN Encoder – Decoder}

  • [2c. triples from entities above] {(Contribution, has, Results), (Results, best performance was achieved, used both CSLM and the phrase scores), (used both CSLM and the phrase scores, from, RNN Encoder – Decoder)}
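
Programmatically, the three granularities map naturally onto a small nested record per information unit. The sketch below re-encodes example 1 (sentence 159) in Python; the class and field names are our own illustrative choices, not the released format of the trial dataset.

    # Illustrative sketch of the three annotation granularities for example 1
    # (sentence 159 of Cho et al., 2014) under the Results information unit.
    # NOTE: the class and field names are assumptions made for illustration,
    # not the official release format of the NlpContributionGraph trial dataset.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Triple:
        subject: str
        predicate: str
        obj: str

    # (a) sentence-level annotation: the contribution-bearing sentence
    sentence = {
        "id": 159,
        "text": ("As expected, adding features computed by neural networks "
                 "consistently improves the performance over the baseline performance."),
    }

    # (b) phrase-level annotation: the phrases extracted from the sentence
    phrases = [
        "adding features", "computed by", "neural networks",
        "improves the performance", "over baseline performance",
    ]

    # (c) triple-level annotation: the phrases organized as subject-predicate-object
    triples = [
        Triple("Contribution", "has", "Results"),
        Triple("Results", "improves the performance", "adding features"),
        Triple("adding features", "computed by", "neural networks"),
        Triple("Results", "improves the performance", "over baseline performance"),
    ]

    for t in triples:
        print(f"({t.subject}, {t.predicate}, {t.obj})")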

Annotated corpus characteristics for our trial dataset containing a total of 50 NLP articles using the NlpContributionGraph model. “ann” stands for annotated and IU for information unit. The 50 articles are uniformly distributed across five different NLP subfields, characterized at sentence- and token-level granularities as follows: machine translation (MT): 2,596 total sentences, 9,581 total overall tokens; named entity recognition (NER): 2,295 sentences, 8,703 overall tokens; question answering (QA): 2,511 sentences, 10,305 overall tokens; relation classification (RC): 1,937 sentences, 10,020 overall tokens; text classification (TC): 2,071 sentences, 8,345 overall tokens.

 | MT | NER | QA | RC | TC | Overall
total IUs | 38 | 43 | 44 | 45 | 46 | 216
ann Sentences | 209 | 157 | 176 | 194 | 164 | 900
avg ann Sentences | 0.081 | 0.068 | 0.07 | 0.1 | 0.079 | -
ann Phrases | 956 | 770 | 960 | 978 | 1038 | 4,702
avg Toks per Phrase | 2.81 | 2.87 | 2.76 | 2.91 | 2.7 | -
avg ann Phrase Toks | 0.28 | 0.25 | 0.26 | 0.28 | 0.34 | -
ann Triples | 590 | 504 | 619 | 620 | 647 | 2,980
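
Read this way, the two “avg” rows are proportions derived from the per-task totals listed in the caption: “avg ann Sentences” is the share of a task's sentences that were annotated, and “avg ann Phrase Toks” is the share of a task's tokens covered by annotated phrases. This is our interpretation of the row labels; a minimal sketch for the MT column:

    # Our reading of the derived rows in the table above (an interpretation of
    # the row labels, using the per-task totals given in the caption).

    mt_total_sentences = 2596      # MT: total sentences (from the caption)
    mt_total_tokens = 9581         # MT: total overall tokens (from the caption)

    mt_ann_sentences = 209         # "ann Sentences" row, MT column
    mt_ann_phrases = 956           # "ann Phrases" row, MT column
    mt_avg_toks_per_phrase = 2.81  # "avg Toks per Phrase" row, MT column

    # "avg ann Sentences": fraction of the task's sentences that were annotated
    print(round(mt_ann_sentences / mt_total_sentences, 3))  # 0.081

    # "avg ann Phrase Toks": fraction of the task's tokens inside annotated phrases
    print(round(mt_ann_phrases * mt_avg_toks_per_phrase / mt_total_tokens, 2))  # 0.28
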
DOI: https://doi.org/10.2478/jdis-2021-0023 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 6 - 34
Submitted on: Oct 28, 2020
Accepted on: Apr 14, 2021
Published on: May 9, 2021
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2021 Jennifer D’Souza, Sören Auer, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.