A key problem in automatic summarization systems is identifying the important sentences that can make up the summary. Many methods exist for finding these sentences, but few of them use machine learning. This paper proposes an automatic learning method for summary-sentence patterns. The method takes a set of sentences that have undergone simple preprocessing as its training samples and, starting from the positive examples, performs bottom-up generalization to abstract general concepts of sentence patterns. The result is a set of sentence-pattern rules that serves as an effective means of judging which sentences in a text can be used as summary sentences. This is the core part of the summarization system's implementation.
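
As a rough illustration of bottom-up generalization from positive examples, the Python sketch below starts from each positive sentence as a maximally specific pattern and relaxes it only as far as needed to cover further positives. It is not the paper's algorithm, which the abstract does not spell out. The representation of a preprocessed sentence as a tuple of tokens, the `*` wildcard, the restriction to equal-length sentences, the use of negative examples to block over-generalization, and the names `matches`, `generalize`, and `learn_rules` are all assumptions made for this example.

```python
WILDCARD = "*"  # assumed notation for "any token in this slot"


def matches(pattern, sentence):
    """True if the sentence has the pattern's length and agrees on every non-wildcard slot."""
    return len(pattern) == len(sentence) and all(
        p == WILDCARD or p == s for p, s in zip(pattern, sentence)
    )


def generalize(pattern, sentence):
    """Least general pattern covering both inputs, or None if their lengths differ."""
    if len(pattern) != len(sentence):
        return None
    return tuple(p if p == s else WILDCARD for p, s in zip(pattern, sentence))


def learn_rules(positives, negatives):
    """Build a rule set bottom-up: each positive example either fits an existing
    rule, relaxes one (if the relaxed rule covers no negative example), or
    becomes a new, fully specific rule."""
    rules = []
    for sentence in positives:
        sentence = tuple(sentence)
        if any(matches(rule, sentence) for rule in rules):
            continue  # already covered, nothing to learn
        for i, rule in enumerate(rules):
            candidate = generalize(rule, sentence)
            if candidate is not None and not any(
                matches(candidate, neg) for neg in negatives
            ):
                rules[i] = candidate  # accept the more general rule
                break
        else:
            rules.append(sentence)  # no safe generalization, keep as its own rule
    return rules


# Hypothetical, hand-made training data (upper-case items stand for preprocessed slots).
positives = [
    ("this", "paper", "proposes", "METHOD"),
    ("this", "paper", "presents", "METHOD"),
    ("we", "conclude", "that", "CLAIM"),
]
negatives = [("see", "figure", "NUM", "below")]

rules = learn_rules(positives, negatives)
# rules == [("this", "paper", "*", "METHOD"), ("we", "conclude", "that", "CLAIM")]
```

In this toy run the first two positives merge into one rule with a wildcard in the verb slot, while the third stays separate because merging it would yield an all-wildcard pattern that also accepts the negative example. A candidate sentence would then be treated as a summary sentence if any learned rule matches it.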