Chinese Text Auto-Categorization on Petro-Chemical Industrial Processes

By: Jing Ni, Ge Gao and Pengyu Chen
Open Access | Jan 2017

Abstract

The volume of corporate documents has grown rapidly in recent years. In this paper we aim to improve classification performance and to support the effective management of large collections of technical material in a domain-specific field. Taking petro-chemical processes as a case study, we examine in detail how parameter settings influence classification accuracy for two text auto-classification algorithms, Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). We present the advantages and disadvantages of each algorithm in the field of petro-chemical processes. Our tests also show that paying more attention to professional vocabulary can significantly improve the F1 value of both algorithms. These results have reference value for future information classification in related industry fields.
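As a rough illustration of the kind of experiment the abstract describes, the sketch below implements a small K-Nearest Neighbor text classifier and an F1 computation using only the Python standard library. It is not the authors' implementation: the cosine-similarity bag-of-words representation, the function names (`cosine`, `knn_predict`, `f1_score`), the pre-tokenized English toy documents, and the category labels are all assumptions made for illustration. Real Chinese text would first need word segmentation, which is omitted here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, tokens, k):
    """Label `tokens` by majority vote among the k most similar training docs.

    `train` is a list of (token_list, label) pairs; `k` is the parameter
    whose influence on accuracy one would vary in an experiment.
    """
    query = Counter(tokens)
    ranked = sorted(
        train,
        key=lambda example: cosine(Counter(example[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus: already-tokenized documents with made-up labels.
TRAIN = [
    (["distillation", "column", "reflux"], "distillation"),
    (["distillation", "tray", "reboiler"], "distillation"),
    (["catalytic", "cracking", "catalyst"], "cracking"),
    (["cracking", "riser", "regenerator"], "cracking"),
]

if __name__ == "__main__":
    test_docs = [
        (["reflux", "column", "tray"], "distillation"),
        (["catalyst", "riser"], "cracking"),
    ]
    for k in (1, 3):
        predicted = [knn_predict(TRAIN, doc, k) for doc, _ in test_docs]
        actual = [label for _, label in test_docs]
        print(k, f1_score(actual, predicted, "distillation"))
```

Varying `k` in the loop mirrors, in miniature, a parameter study of the sort the abstract mentions. The abstract's point about professional vocabulary could be explored in such a sketch by giving domain terms a larger weight in the term-frequency vectors, though how the paper itself does this is not stated on this page.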

DOI: https://doi.org/10.1515/cait-2016-0078 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 69 - 82
Published on: Jan 25, 2017
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2017 Jing Ni, Ge Gao, Pengyu Chen, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.