论文部分内容阅读
Purpose:The thrust of this paper is to present a method for improving the accuracy of automatic indexing of Chinese-English mixed documents.Design/methodology/approach:Based on the inherent characteristics of Chinese-English mixed texts and the cybeetics theory,we proposed an integrated control method for indexing documents.It consists of "feed-forward control","in-progress control" and "feed-back control",aiming at improving the accuracy of automatic indexing of Chinese-English mixed documents.An experiment was conducted to investigate the effect of our proposed method.Findings:This method distinguishes Chinese and English documents in grammatical structures and word formation rules.Through the implementation of this method in the three phases of automatic indexing for the Chinese-English mixed documents,the results were encouraging.The precision increased from 88.54% to 97.10% and recall improved from 97.37% to 99.47%.Research limitations:The indexing method is relatively complicated and the whole indexing process requires substantial human intervention.Due to patte matching based on a bruteforce (BF) approach,the indexing efficiency has been reduced to some extent.Practical implications:The research is of both theoretical significance and practical value in improving the accuracy of automatic indexing of multilingual documents (not confined to Chinese-English mixed documents).The proposed method will benefit not only the indexing of life science documents but also the indexing of documents in other subject areas.Originality/value:So far,few studies have been published about the method for increasing the accuracy of multilingual automatic indexing.This study will provide insights into the automatic indexing ofmultilingual documents,especially Chinese-English mixed documents.