Mining an English-Chinese parallel Dataset of Financial News

Open Access | Mar 2022

Figures & Tables

Table 1

Linguistic features of the text collection (‘Lang.’ is language, ‘NP’ is noun phrases, ‘MultiWD’ is multiwords, ‘Sent.’ is sentences, ‘NE’ is named entities, ‘Hanzi’ is Chinese characters).

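| LANG. | TOKEN | NP | MULTIWD | PARAG.S | SENT. | NE | HANZI |
|---|---|---|---|---|---|---|---|
| English | 2,598,309 | 1,672,577 | 2,376,424 | 272,756 | 597,372 | 1,190,682 | 0 |
| Chinese | 7,480,139 | 1,491,790 | 3,466,453 | 258,213 | 572,185 | 1,268,674 | 21,679,815 |

Table 2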

20 most frequently used financial terms.

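| Term | Frequency | Term | Frequency |
|---|---|---|---|
| Capital | 9383 | Net Worth | 195 |
| Asset | 3086 | Liability | 141 |
| Liquidity | 1704 | Business Plan | 126 |
| Interest Rate | 1036 | Fixed Asset | 101 |
| Bankruptcy | 616 | Debt Financing | 97 |
| Balance Sheet | 522 | Working Capital | 83 |
| Principle | 382 | Financial Statements | 72 |
| Collateral | 371 | Equity Financing | 64 |
| Depreciation | 368 | Line of Credit | 46 |
| Cash Flow | 209 | Appraisal | 42 |
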
DOI: https://doi.org/10.5334/johd.62 | Journal eISSN: 2059-481X
Language: English
Published on: Mar 18, 2022
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2022 Nicolas Turenne, Ziwei Chen, Guitao Fan, Jianlong Li, Yiwen Li, Siyuan Wang, Jiaqi Zhou, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.